R
The module r makes it possible to easily exchange data with R and execute R scripts or evaluate expressions in the R language.
R is a free software environment for statistical computing and graphics. R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License.
To use this module, the following line must be included in the header of the Mosel model file:
uses 'r';
Introduction
This module implements functionality for exchanging data between a Mosel model and R and for calling R functions from a Mosel model.
The r module also defines an I/O driver for exchanging data using the initializations from and initializations to Mosel constructs.
It is the Mosel run-time library that loads and runs R, not vice versa.
The purpose of the module is to make the extensive data processing capabilities of R available within Mosel. The interactive and graphing features of R are beyond the scope of this module as it does not implement a full interactive R GUI. However, it is possible to use some of these to a limited extent.
Prerequisites
This module does not include R binaries. In order to use R you need a working installation of R, version 3.0 or newer and targeting the same platform as Mosel (you won't be able to use, e.g., the Windows 32-bit version of R from the Windows 64-bit version of Mosel). The most recent supported R version is 4.1.x. To download R, please visit the R Project web site at www.r-project.org.
This module will try to load R from the directory specified by the R_HOME environment variable, if set, or from the default R installation locations otherwise.
More specifically, Mosel looks for a file named R.dll on Windows, libR.so on Linux, and libR.dylib on Mac OS X.
For Windows platforms, the default location is retrieved from the registry (from registry key HKEY_LOCAL_MACHINE\Software\R-core\R\InstallationPath); it is /Library/Frameworks/R.framework/Resources for Mac OS X, /usr/lib/R for 32-bit Linux, and either /usr/lib64/R or /usr/lib/R for 64-bit Linux.
If the R_HOME and R_ARCH environment variables are defined, they are used to construct a path of the form R_HOME/lib on Linux and R_HOME\bin\R_ARCH on Windows (the default for R_ARCH is x64 for Windows 64-bit and i386 for Windows 32-bit).
If you have multiple installations of R, or if R is installed in a different location or not automatically found, you will need to set the environment variable R_HOME to point to your R installation directory.
Note that the loading of R is not influenced by Mosel statements such as setparam('workdir',...) or setenv('R_HOME',...), as these do not affect the process environment used for loading R. For environment variables or the current path to influence R loading, they must be set before launching Mosel.
As an example, if R 3.2.3 is installed in "C:\Program Files\R\R-3.2.3\bin\..." on Windows 64-bit, then the correct value for the R_HOME environment variable (or registry key) is C:\Program Files\R\R-3.2.3 (that is, without the bin subdirectory) and Mosel would try to load R.dll from C:\Program Files\R\R-3.2.3\bin\x64\R.dll.
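To check which installation has actually been picked up, a small diagnostic model along the following lines may help. This is a minimal sketch, not part of the module documentation: it assumes that the getenv function of the mmsystem module reports the value inherited from the process environment (no setenv call is made in the model), and it relies on the standard R function R.home() to report the home directory of the loaded R installation.

```mosel
model "r location check"
  uses "r", "mmsystem"

  ! R_HOME as seen by Mosel (an empty string means the variable is not set)
  writeln("R_HOME seen by Mosel: '", getenv("R_HOME"), "'")

  ! R.home() is a standard R function returning the home directory
  ! of the R installation that has actually been loaded
  Rprint("R.home()")
end-model
```

If the two values differ, or R_HOME is empty, R was located through the default installation locations described above.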
R initialization
The R environment is initialized automatically the first time a Mosel model calls any function that requires it. The following small example therefore needs no explicit initialization; it just prints the R version (the same output as typing R.version.string on an R console):
model "r version example"
  uses 'r';

  Rprint('R.version.string')
end-model
Alternatively it is possible to explicitly initialize R using the Rinit function. This can be useful in order to retrieve a status code or to specify non-default initialization options.
By default, R is initialized with the options "--slave --vanilla", so no site or user environment, profile, history and workspace files are processed. Please refer to the R documentation for more details on these and other options (http://cran.r-project.org/doc/manuals/r-release/R-intro.html#Invoking-R).
Upon startup, only the "utils", "stats", and "methods" R packages are loaded by default. Other packages can be loaded via R statements (using for example the library or require R functions) or a different initial package list can be specified by setting the R_DEFAULT_PACKAGES environment variable (prior to running Mosel).
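As an illustration of loading a package via R statements, the following sketch lists the attached packages before and after a call to library. The choice of the tools package is only an assumption made for the example (it is distributed with base R); any installed package name could be used instead.

```mosel
model "r packages"
  uses "r"

  writeln("Attached after startup:")
  Rprint("search()")          ! search() lists the attached packages

  Reval("library(tools)")     ! Attach an additional package
  writeln("Attached after library(tools):")
  Rprint("search()")
end-model
```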
As R is single-threaded, it is not possible to create more than one R session per model, nor to execute two models in parallel if both use R.
Memory limit on Windows
On Windows platforms, R has an internal mechanism that can limit the maximum amount of memory it can use. The limit can be read or changed using the R memory.limit function; for example, the current limit can be printed from Mosel with
writeln('R memory limit is ',Rgetreal("memory.limit()"),' MB')
and the limit can be set, e.g. to 16 GB, with
Reval("memory.limit(16*1024)")
Note that in versions of R prior to 3.6 the default value for this limit is different when R is executed as a standalone application rather than embedded in another application (including Mosel): in the first case the limit is set to the amount of physical memory available whereas it is fixed to 2 GB for embedded mode. Therefore, in order to allow R to use more than 2 GB of memory from Mosel on Windows it is necessary to explicitly raise this limit as shown above. Starting with R version 3.6, by default there is no memory limit anymore when R is executed in embedded mode.
Data types
The types of data that can be exchanged with R are the four Mosel elementary types boolean, integer, real and string, plus arrays, lists and sets of these (nested compositions are not supported). Both static and dynamic Mosel arrays are supported and mapped into R atomic vectors of the corresponding type. Mosel lists and sets can also be exported into R vectors.
In general, there is no direct mapping of more complex R types such as factors or data.frames (the Rsetdf function being the one exception); however, these can be exchanged after conversion to basic types. For example, a factor can be loaded into a Mosel array as an array of integers with:
Rgetarr("unclass(f)", intarray)
or as an array of strings with:
Rgetarr("levels(f)[f]", strarray)
Note that the first is also equivalent to this simpler form:
Rgetarr("f", intarray)
since this module ignores the factor's "class" and "levels" attributes; similarly the second is equivalent to the simpler:
Rgetarr("f", strarray)
since the casting to string, performed within R, automatically takes into account the "levels" attribute.
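The two simpler forms can be tried out together in a complete model. This is a minimal sketch: the factor f and its contents are hypothetical test data created in the R workspace with Reval, and the Mosel arrays are assumed to be indexed 1..3 to match the length of the unnamed R vector.

```mosel
model "r factor"
  uses "r"

  declarations
    IDX = 1..3
    fcodes: array(IDX) of integer
    flabels: array(IDX) of string
  end-declarations

  ! Build a small factor in the R workspace (hypothetical test data)
  Reval("f <- factor(c('lo','hi','lo'))")

  Rgetarr("f", fcodes)      ! Integer codes, as with "unclass(f)"
  Rgetarr("f", flabels)     ! Level labels, as with "levels(f)[f]"

  writeln("codes:  ", fcodes)
  writeln("labels: ", flabels)
end-model
```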
To load a data.frame into Mosel, it should be converted to a matrix (for instance using as.matrix if the column types allow that) or split into individual column vectors.
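The column-splitting approach might look as follows. This is a sketch under stated assumptions: the data frame df and its columns id and val are hypothetical, created in the R workspace only for this example, and the target arrays are sized to match its three rows. Single-quoted Mosel strings are used for the R expressions so that the $ operator is passed to R literally.

```mosel
model "r data frame columns"
  uses "r"

  declarations
    ROWS = 1..3
    ids: array(ROWS) of integer
    vals: array(ROWS) of real
  end-declarations

  ! Create a small data frame in the R workspace (hypothetical test data)
  Reval("df <- data.frame(id=1:3, val=c(1.5,2.5,3.5))")

  ! Each column is an atomic vector and can be retrieved on its own
  Rgetarr('df$id', ids)
  Rgetarr('df$val', vals)

  forall(i in ROWS) writeln(ids(i), ": ", vals(i))
end-model
```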
For the opposite operation, that is, exporting a Mosel array to R, note that, except for the Rsetdf function, Mosel arrays are always exported as R (dense) atomic vectors. Any index that is not a 1-based integer range is created in R as a named index. Index names, in R, are always strings, so for example, when the Mosel array in the following example is converted to R, the index set J is kept as an unnamed integer index, while I (which does not start with 1) and K (which has holes) are created as named indices.
model "array to r"
  uses "r"

  declarations
    I= 2..3
    J= {1,2}
    K= {1,3}
    a: array(I,J,K) of integer
  end-declarations

  a(2,1,1):=4                ! Define some test data entry
  Rset('aR',a)               ! Copy data to R
  writeln("Array in R:")
  Rprint("aR")               ! Display data held in R
  writeln("dimnames(aR):")
  Rprint("dimnames(aR)")     ! Display R indices
end-model
Executing this model generates the following output:
Array in R:
, , 1

  [,1] [,2]
2    4    0
3    0    0

, , 3

  [,1] [,2]
2    0    0
3    0    0

dimnames(aR):
[[1]]
[1] "2" "3"

[[2]]
NULL

[[3]]
[1] "1" "3"
Note how the first and last entries of dimnames, which correspond to indices I and K respectively, are set to the list of index elements converted to strings, while the second entry is left as NULL since the index set J is a 1-based integer index with contiguous elements.
Conversion to R data frames can be done using function Rsetdf. If this does not provide the required data frame format or other complex R data structures are needed, then the conversion should be done in the R realm and is outside of the scope of this guide. A few examples are shown below, but please refer to the R documentation for further information.
Some common and useful R functions for converting vectors into data frames are data.frame(), as.data.frame(), and the functions from the reshape or reshape2 packages, to name just a few. In addition, the functions names() (for 1-dimensional vectors) and dimnames() (for vectors of any dimension) can be used to retrieve the index names of a vector.
In the following example, a Mosel single-indexed demand array is converted to a 2-column R data frame: the first column for the index and the second column for the value:
model "dataframe"
  uses "r"

  declarations
    Locations = {12,34,56}
    demand: dynamic array(Locations) of real
  end-declarations

  forall(l in Locations) demand(l):=l*100   ! Fill array with some data

  Rset("demand", demand)
  Rprint("table <- data.frame(Loc=names(demand),Dem=demand,row.names=NULL)")
end-model
This is the resulting output:
  Loc  Dem
1  12 1200
2  34 3400
3  56 5600
Note that this is the same result that you would get, more simply, with Rsetdf("table", demand, ["Loc","Dem"]).
Alternatively, calling data.frame just as data.frame(demand) without any other parameters would create a data frame with a single column (for demand) and named rows, thus yielding:
   demand
12   1200
34   3400
56   5600
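For comparison, the Rsetdf variant mentioned earlier can be written out as a complete model. This is a minimal sketch reusing the declarations of the "dataframe" example; the final call to the standard R function str is an addition made here only to display the column types produced by the conversion.

```mosel
model "dataframe with Rsetdf"
  uses "r"

  declarations
    Locations = {12,34,56}
    demand: dynamic array(Locations) of real
  end-declarations

  forall(l in Locations) demand(l):=l*100   ! Fill array with some data

  ! Create the data frame directly, naming the index and value columns
  Rsetdf("table", demand, ["Loc","Dem"])
  Rprint("table")
  Reval("str(table)")       ! Show the structure of the resulting data frame
end-model
```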
Consider a bidimensional demand array such as:
declarations
  Locations = {12,34,56}
  C={"A","B"}
  demand: dynamic array(Locations,C) of real
end-declarations

demand(12,"A"):=1234
demand(56,"B"):=6789
Rset("demand", demand)
For this array, Rsetdf("df", demand, ["Loc","C", "Value"]) would yield:
  Loc C Value
1  12 A  1234
2  56 B  6789
or it could be converted to a data frame via data.frame(Loc=dimnames(demand)[[1]],demand, row.names=NULL) which results in the following form:
  Loc    A    B
1  12 1234   NA
2  34   NA   NA
3  56   NA 6789
Finally, by calling, for instance, the function melt from the reshape2 package as in melt(demand, varnames = c('Loc','Prod'), value.name = 'Demand'), it is possible to obtain a data frame with a column for each index plus a column with the array values, like the following:
  Loc Prod Demand
1  12    A   1234
2  34    A     NA
3  56    A     NA
4  12    B     NA
5  34    B     NA
6  56    B   6789
Example
The following example shows how to execute R statements and exchange data with the R workspace.
model "r example"
  uses "r"

  declarations
    CITIES = {"LONDON", "PARIS", "ROME"}
    ZONES = 1..4
    mosarray, backarr, backarrio: array(ZONES, CITIES) of integer
    backnum: real
  end-declarations

  setparam("Rverbose",true)         ! Enable showing R error messages

  ! Reval evaluates arbitrary R statements
  Reval("t<-Sys.time();now<-format(t, '%H:%M')")
  ! Rprint also prints the result (via the R print function)
  Rprint("paste('Hello from R at',now)")

  ! Assign some Mosel scalars to R vars and show results
  Rset("a_num", 1.2)
  Reval("str(a_num)")
  Rset("a_chr", "word")
  Reval("str(a_chr)")

  ! The lvalue can be any R valid lvalue, e.g. the dim attribute
  Rset("a_vec", 1..6)               ! a_vec is an R vector
  Rset("dim(a_vec)", [2,3])         ! change its dimensions
  writeln("a_vec")
  Rprint("a_vec")                   ! now it is a 2x3 matrix

  ! Assign a Mosel array to an R variable
  forall(i in ZONES, c in CITIES) mosarray(i,c):=i*10+getsize(c)
  Rset("arr", mosarray)             ! The R vector keeps index names
  writeln("arr")
  Rprint("arr")

  ! Retrieve R variables
  writeln("a_num is ", Rgetreal("a_num"))
  writeln("a_chr is ", Rgetstr("a_chr"))
  Rgetarr("arr", backarr)
  writeln("arr is ", backarr)

  ! Data can also be exchanged via the I/O driver
  newnumber:=1.3
  mosarray(1,"LONDON"):=1

  ! Send data to R
  initializations to "r.rws:"
    newnumber as "a_num"
    mosarray as "arr"
  end-initializations

  ! Get data back from R
  initializations from "r.rws:"
    backnum as "a_num"
    backarrio as "arr"
  end-initializations

  writeln("backnum is ", backnum)
  writeln("backarrio is ", backarrio)
end-model
I/O drivers
In order to simplify access to R this module provides a driver that is designed to be used in initializations blocks for both reading and writing data, providing the same functionalities as the Rget and Rset functions.
Driver rws
rws:
The driver can only be used in initializations blocks. It does not take any argument and provides access to the R workspace.
In the block, each label entry is understood as one or more R statements. For 'from' blocks, if the label contains more than one statement, the value from the last one is returned. For 'to' blocks, the label must contain only one expression.
This driver requires an existing R session, therefore it is necessary to initialize R (either by calling function Rinit or any of the other module functions that create an R session) before using it.
Example:
initok:=Rinit                      ! Initialize R

initializations to "r.rws:"        ! Send data to R
  scalarvar as "val"
  arrayvar as "arr"
end-initializations

initializations from "r.rws:"      ! Get data from R
  backscalar as "val"
  backarr as "arr"
end-initializations
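Since each label is understood as one or more R statements, a 'from' block can also evaluate expressions rather than just read variables. The following is a minimal sketch of that behavior as described above, not a definitive usage pattern: the R variable names x, lo and hi and the test values are hypothetical, and the numeric R results are assumed to map onto Mosel real variables.

```mosel
model "r driver labels"
  uses "r"

  declarations
    vals: array(1..3) of real
    avg, spread: real
  end-declarations

  vals:: [2.0, 4.0, 9.0]            ! Hypothetical test data
  initok:=Rinit                     ! The driver needs an existing R session

  initializations to "r.rws:"
    vals as "x"                     ! 'to' label: a single expression
  end-initializations

  initializations from "r.rws:"
    avg as "mean(x)"                ! One statement
    spread as "lo <- min(x); hi <- max(x); hi - lo"   ! Value of the last statement
  end-initializations

  writeln("mean = ", avg, ", range = ", spread)
end-model
```

Keeping multi-statement labels short, and moving longer computations into a preceding Reval call, is likely to make such blocks easier to read.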
Troubleshooting
This section describes some known issues and possible solutions.
- When running a model in Windows, a dialog is shown with title 'Unable to locate component' and content 'The application has failed to start because Rlapack.dll was not found...'.
This may occur with Windows 2003. Please add your R binary directory (usually 'C:\Program Files\R\R-3.x.x\bin\i386' or 'C:\Program Files\R\R-3.x.x\bin\x64') to the system PATH environment variable.
- When an R session is initialized in Windows, R installs a console handler to detect Ctrl-C events, which may prevent Mosel from properly detecting these same events itself.
- In Linux, R may fail to load if the dynamic libraries in $R_HOME/lib cannot be found by the dynamic linker. In this case, please add $R_HOME/lib to the LD_LIBRARY_PATH environment variable.
- This module is not compatible with Mosel security restrictions, therefore it would fail to load if Mosel is run in restricted mode.
- On Mac OS X, if the R release being used links the Apple CoreFoundation library, then this module can only be successfully initialized from the main thread of the process in which Mosel is running (because the CoreFoundation library can only be loaded from the main thread of a process). So, for example, the module would fail to load R from an mmjobs submodel. In this case, the issue can be overcome by setting the environment variable DYLD_INSERT_LIBRARIES to /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (use the correct path to the CoreFoundation library on your system) before launching the Mosel process, thus forcing the CoreFoundation library to be loaded early, at process creation.
© 2001-2024 Fair Isaac Corporation. All rights reserved. This documentation is the property of Fair Isaac Corporation (“FICO”). Receipt or possession of this documentation does not convey rights to disclose, reproduce, make derivative works, use, or allow others to use it except solely for internal evaluation purposes to determine whether to purchase a license to the software described in this documentation, or as otherwise set forth in a written software license agreement between you and FICO (or a FICO affiliate). Use of this documentation and the software described in it must conform strictly to the foregoing permitted uses, and no other use is permitted.