
Introduction

This module implements functionality for exchanging data between a Mosel model and R and for calling R functions from a Mosel model.

The r module also defines an I/O driver for exchanging data using the initializations from and initializations to Mosel constructs.

It is the Mosel run-time library that loads and runs R, not vice versa.

The purpose of the module is to make the extensive data processing capabilities of R available within Mosel. The interactive and graphing features of R are beyond the scope of this module as it does not implement a full interactive R GUI. However, it is possible to use some of these to a limited extent.

Prerequisites

This module does not include R binaries. In order to use R you need a working installation of R, version 3.0 or newer and targeting the same platform as Mosel (you won't be able to use, e.g., the Windows 32-bit version of R from the Windows 64-bit version of Mosel). The most recent supported R version is 3.3.x. To download R, please visit the R Project web site at www.r-project.org.

This module will try to load R from the directory specified by the R_HOME environment variable, if set, or from the default R installation location otherwise.

More specifically, Mosel looks for a file named R.dll on Windows, libR.so on Linux, and libR.dylib on Mac OS X.

For Windows platforms, the default location is retrieved from the registry (from registry key HKEY_LOCAL_MACHINE\Software\R-core\R\InstallationPath). The default location is /Library/Frameworks/R.framework/Resources for Mac OS X, /usr/lib64/R for 64-bit Linux, and /usr/lib/R for 32-bit Linux.

If the R_HOME and R_ARCH environment variables are defined, they are used to construct a path like R_HOME/lib on Linux and like R_HOME\bin\R_ARCH on Windows (the default for R_ARCH is x64 on 64-bit Windows and i386 on 32-bit Windows).

If you have multiple installations of R, or if R is installed in a different location or not automatically found, you will need to set the environment variable R_HOME to point to your R installation directory.

Note that the loading of R is not influenced by Mosel statements such as setparam('workdir',...) or setenv('R_HOME',...), as these do not affect the process environment used for loading R. Any environment variables or the current path must be set before launching Mosel in order to influence the loading of R.

As an example, if R 3.2.3 is installed in "C:\Program Files\R\R-3.2.3\bin\..." on 64-bit Windows, then the correct value for the R_HOME environment variable (or registry key) is C:\Program Files\R\R-3.2.3 (that is, without the bin subdirectory), and Mosel would try to load R.dll from C:\Program Files\R\R-3.2.3\bin\x64\R.dll.
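The path construction described above can be sketched in a few lines. The following Python snippet is only an illustration of the rule as stated in this section, not the module's actual loading code: the function name r_library_path and its parameters are hypothetical, the Linux branch assumes the library sits directly under R_HOME/lib, and Mac OS X is left out because the text does not give its subdirectory.

```python
import ntpath
import posixpath

# Library file names and default R_ARCH values as stated in the text above.
LIB_NAME = {"windows": "R.dll", "linux": "libR.so"}
DEFAULT_WINDOWS_ARCH = {64: "x64", 32: "i386"}


def r_library_path(platform, bits, r_home, r_arch=None):
    """Return the path where the R shared library is expected (illustrative).

    platform: "windows" or "linux"; bits: 32 or 64
    r_home:   value of R_HOME, or the default installation location
    r_arch:   value of R_ARCH if set, otherwise None
    """
    if platform == "windows":
        # R_HOME\bin\R_ARCH\R.dll, with R_ARCH defaulting to x64 or i386
        arch = r_arch or DEFAULT_WINDOWS_ARCH[bits]
        return ntpath.join(r_home, "bin", arch, LIB_NAME["windows"])
    if platform == "linux":
        # R_HOME/lib/libR.so (assumption: no architecture subdirectory)
        return posixpath.join(r_home, "lib", LIB_NAME["linux"])
    raise ValueError("platform not covered by this sketch: " + platform)


print(r_library_path("windows", 64, r"C:\Program Files\R\R-3.2.3"))
print(r_library_path("linux", 64, "/usr/lib64/R"))
```

With the Windows example above, the sketch yields the same R.dll location as quoted in the text.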

R initialization

The R environment is automatically initialized the first time a Mosel model uses any function that requires it. The following small example just prints the R version (it prints the same output as if you typed R.version.string on an R console):

model "r version example"
  uses 'r';
  Rprint('R.version.string')
end-model

Alternatively it is possible to explicitly initialize R using the Rinit function. This can be useful in order to retrieve a status code or to specify non-default initialization options.

By default, R is initialized with the options "--slave --vanilla", so no site or user environment, profile, history and workspace files are processed. Please refer to the R documentation for more details on these and other options (http://cran.r-project.org/doc/manuals/r-release/R-intro.html#Invoking-R).

Upon startup, only the "utils", "stats", and "methods" R packages are loaded by default. Other packages can be loaded via R statements (using for example the library or require R functions) or a different initial package list can be specified by setting the R_DEFAULT_PACKAGES environment variable (prior to running Mosel).

As R is single-threaded, it is not possible to create more than one R session per model, nor to execute two models in parallel if both use R.

Data types

The types of data that can be exchanged with R are the four Mosel elementary types boolean, integer, real and string, plus arrays, lists and sets of these (nested compositions are not supported). Both static and dynamic Mosel arrays are supported and mapped into R atomic vectors of the corresponding type. Mosel lists and sets can also be exported into R vectors.

There is no direct mapping of more complex R types such as factors or data.frames; however, these can be exchanged after conversion to basic types. For example, a factor can be loaded into a Mosel array as an array of integers with:

  Rgetarr("unclass(f)", intarray)

or as an array of strings with:

  Rgetarr("levels(f)[f]", strarray)

Note that the first is also equivalent to this simpler form:

  Rgetarr("f", intarray)

since this module ignores the factor's "class" and "levels" attributes; similarly the second is equivalent to the simpler:

  Rgetarr("f", strarray)

since the casting to string, performed within R, automatically takes into account the "levels" attribute.
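To make the two views of a factor concrete, here is a small Python sketch of the underlying idea: an R factor holds 1-based integer codes plus a "levels" attribute, and the string view is obtained by looking up each code in the levels. The sample levels and codes, and both helper names, are made up for illustration; the snippet does not call R or Mosel.

```python
# Hypothetical factor contents: what levels(f) and unclass(f) would return in R.
levels = ["high", "low", "medium"]
codes = [2, 1, 3, 2]


def factor_as_integers(codes):
    # Integer view: the codes themselves, with the class and levels ignored.
    return list(codes)


def factor_as_strings(codes, levels):
    # String view, like levels(f)[f] in R; R indexing is 1-based.
    return [levels[c - 1] for c in codes]


print(factor_as_integers(codes))
print(factor_as_strings(codes, levels))
```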

To load a data.frame into Mosel, it should be converted to a matrix (for instance using as.matrix if the column types allow that) or split into individual column vectors.

For the opposite operation, that is, exporting a Mosel array to R, note that Mosel arrays are always exported as R (dense) atomic vectors. Any index that is not a 1-based integer range is created in R as a named index, and index names in R are always strings. For example, when the Mosel array in the following model is converted to R, the index set J is kept as an unnamed integer index, while I (which does not start at 1) and K (which has holes) are created as named indices.

model "array to r"
  uses "r"

  declarations
    I= 2..3
    J= {1,2}
    K= {1,3}
    a: array(I,J,K) of integer
  end-declarations

  a(2,1,1):=4                       ! Define some test data entry
  Rset('aR',a)                      ! Copy data to R
  writeln("Array in R:")
  Rprint("aR")                      ! Display data held in R
  writeln("dimnames(aR):")
  Rprint("dimnames(aR)")            ! Display R indices
end-model

Executing this model generates the following output:

Array in R:
, , 1

  [,1] [,2]
2    4    0
3    0    0

, , 3

  [,1] [,2]
2    0    0
3    0    0

dimnames(aR):
[[1]]
[1] "2" "3"

[[2]]
NULL

[[3]]
[1] "1" "3"

Note how the first and last entries of dimnames, which correspond to indices I and K respectively, are set to the list of index elements converted to strings, while the second entry is left as NULL since the index set J is a 1-based integer index with contiguous elements.
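The naming rule illustrated by this output can be restated as a short Python sketch. This only mirrors the rule as described in this section and is not the module's implementation; the function name r_dimnames is hypothetical, and None stands for R's NULL.

```python
def r_dimnames(index_sets):
    """Return one dimnames entry per index set (illustrative only).

    An index set that is exactly the integer range 1..n stays unnamed (None);
    any other index set is turned into a list of strings.
    """
    names = []
    for elems in index_sets:
        elems = list(elems)
        is_one_based_range = (
            all(type(e) is int for e in elems)
            and elems == list(range(1, len(elems) + 1))
        )
        names.append(None if is_one_based_range else [str(e) for e in elems])
    return names


# Index sets of the example model: I = 2..3, J = {1,2}, K = {1,3}
print(r_dimnames([range(2, 4), [1, 2], [1, 3]]))
# prints [['2', '3'], None, ['1', '3']]
```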

Conversion to R data frames or other complex R data structures should be done in the R realm and is outside of the scope of this guide. A few examples are shown below, but please refer to the R documentation for further information.

Some common and useful R functions for converting vectors into data frames are data.frame(), as.data.frame(), and the functions from the reshape or reshape2 packages, to name just a few. Also, the functions names() (for 1-dimensional vectors) and dimnames() (for vectors of any dimension) can be used to retrieve the index names of a vector.

In the following example, a Mosel single-indexed demand array is converted to a 2-column R data frame, with the first column holding the index and the second column holding the value:

model "dataframe"
  uses "r"

  declarations
    Locations = {12,34,56}
    demand: dynamic array(Locations) of real
  end-declarations

  forall(l in Locations) demand(l):=l*100      ! Fill array with some data
  Rset("demand", demand)
  Rprint("table <- data.frame(Loc=names(demand),Dem=demand,row.names=NULL)")
end-model

This is the resulting output:

  Loc  Dem
1  12 1200
2  34 3400
3  56 5600

Alternatively, calling data.frame just as data.frame(demand) without any other parameters would create a data frame with a single column (for demand) and named rows, thus yielding:

   demand
12    1200
34    3400
56    5600

A two-dimensional demand array such as:

  declarations
    Locations = {12,34,56}
    C={"A","B"}
    demand: dynamic array(Locations,C) of real
  end-declarations
  demand(12,"A"):=1234
  demand(56,"B"):=6789
  Rset("demand", demand)

could be converted to a data frame via data.frame(Loc=dimnames(demand)[[1]],demand, row.names=NULL) which results in the following form:

  Loc    A    B
1  12 1234   NA
2  34   NA   NA
3  56   NA 6789

Finally, by calling for instance the function melt from the reshape2 package, as in melt(demand, varnames = c('Loc','Prod'), value.name = 'Demand'), it is possible to obtain a data frame with a column for each index plus a column with the array values, like the following:

  Loc Prod Demand
1  12    A   1234
2  34    A     NA
3  56    A     NA
4  12    B     NA
5  34    B     NA
6  56    B   6789
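The row layout of this long format can be reproduced with a small Python sketch. It is a plain illustration of the layout shown above, not of how reshape2 works internally: the function name to_long_rows is hypothetical, None stands for R's NA, and the first index is assumed to vary fastest, as in the table.

```python
# Data of the two-dimensional example: only two entries are defined.
locations = [12, 34, 56]
products = ["A", "B"]
demand = {(12, "A"): 1234, (56, "B"): 6789}


def to_long_rows(values, first_index, second_index):
    """List one (first, second, value) row per index pair.

    The first index varies fastest; undefined entries are reported as None.
    """
    return [
        (i, j, values.get((i, j)))
        for j in second_index
        for i in first_index
    ]


for row in to_long_rows(demand, locations, products):
    print(row)
```

Every combination of index values gets a row, which reflects the fact that the array is exported to R as a dense vector even when the Mosel array is dynamic.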