Dense vs. sparse data format
All examples we have seen so far use sparse data format, i.e., every data entry in the database tables or spreadsheets is given with its complete index tuple. If the index set(s) of an array are defined in the model or fixed otherwise it is also possible to work with data in dense format, i.e., just the data entries without their index tuples.
The following example (file indexeg.mos) shows how data tables given in different formats may be read in by a Mosel model. We have enabled the `debug' option to see the SQL statements generated by Mosel.
model ODBCImpEx uses "mmodbc" declarations A: array(1..3, 1..2) of real B: array(1..2, 1..3) of real C: array(1..2, 1..3) of real CSTR: string end-declarations CSTR:= 'mmodbc.odbc:debug;indexeg.sqlite' ! Data must be dense - there are not enough columns to serve as index! initializations from CSTR A as 'Range3by2' end-initializations forall(i in 1..3) writeln("Row(",i,"): ", A(i,1), " ", A(i,2)) ! Dense data initializations from CSTR B as 'noindex;Range2by3' end-initializations forall(i in 1..2) writeln("Row(",i,"): ", B(i,1), " ", B(i,2), " ", B(i,3)) ! Indexed data initializations from CSTR C as 'Range2by3i' end-initializations forall(i in 1..2) writeln("Row(",i,"): ", C(i,1), " ", C(i,2), " ", C(i,3)) end-model
The first array, A is read from a table that helds data in dense format—just the data, no indices:
First | Second |
1.2 | 2.2 |
2.1 | 2.2 |
3.1 | 4.4 |
This table has two columns and three rows. Since data in sparse format would require at least three columns, there is no confusion possible between the two formats and mmodbc will deduce automatically the correct format to use. (However, if you wish to document clearly which data format is expected you may choose to add the option `noindex' as in the following case.)
If an input table has a sufficiently large number of columns to serve as indices as is the case for the table Range2by3 and array B, then the situation is ambiguous and we need to state explicitly that no indices are specified by using the option `noindex'. Otherwise mmodbc will use its default (namely sparse format), which will lead to an error message in the present example since the data type (real) of the first two columns does not correspond to the type of the indices (integer) indicated in the model:
First | Second | Third |
1.2 | 1.2 | 1.3 |
2.1 | 2.2 | 2.3 |
The third case, table Range2by3i and array C, corresponds to the (default) format that we have already seen in the previous examples—each data item is preceded by its index tuple. This table defines exactly the same data as the previous one:
Firsti | Secondi | Value |
1 | 2 | 1.1 |
1 | 2 | 1.2 |
1 | 3 | 1.3 |
2 | 2 | 2.1 |
2 | 2 | 2.2 |
2 | 3 | 2.3 |
The same model as above may be rewritten with SQL commands instead of initializations blocks (the driver option `noindex' is replaced by resetting the value of the mmodbc parameter SQLndxcol):
model ODBCImpEx2 uses "mmodbc" declarations A: array(1..3, 1..2) of real B: array(1..2, 1..3) of real C: array(1..2, 1..3) of real CSTR: string end-declarations CSTR:= 'indexeg.sqlite' SQLconnect(CSTR) setparam("SQLdebug",true) ! Data must be dense - there are not enough columns to serve as index! SQLexecute("select * from Range3by2 ", A) forall(i in 1..3) writeln("Row(",i,"): ", A(i,1), " ", A(i,2)) setparam("SQLndxcol", false) ! Dense data SQLexecute("select * from Range2by3 ", B) forall(i in 1..2) writeln("Row(",i,"): ", B(i,1), " ", B(i,2), " ", B(i,3)) setparam("SQLndxcol", true) ! Indexed data SQLexecute("select * from Range2by3i ", C) forall(i in 1..2) writeln("Row(",i,"): ", C(i,1), " ", C(i,2), " ", C(i,3)) SQLdisconnect end-model
If using the excel driver instead of an ODBC connection, we need to use the noindex option also with the first data table ('Range3by2') since this driver does not do any 'guessing' about the table format.
model ODBCImpEx3 uses "mmodbc" declarations A: array(1..3, 1..2) of real B: array(1..2, 1..3) of real C: array(1..2, 1..3) of real CSTR: string end-declarations CSTR:= 'mmsheet.excel:indexeg.xls' ! Dense data ('noindex'), skipping the header line ('skiph') initializations from CSTR A as 'skiph;noindex;Range3by2' end-initializations forall(i in 1..3) writeln("Row(",i,"): ", A(i,1), " ", A(i,2)) ! Dense data initializations from CSTR B as 'skiph;noindex;Range2by3' end-initializations forall(i in 1..2) writeln("Row(",i,"): ", B(i,1), " ", B(i,2), " ", B(i,3)) ! Indexed data initializations from CSTR C as 'skiph;Range2by3i' end-initializations forall(i in 1..2) writeln("Row(",i,"): ", C(i,1), " ", C(i,2), " ", C(i,3)) end-model
Multidimensional tables in rectangular format
Quite frequently, particularly when working with spreadsheets, multidimensional input data arrays are formatted in 2-dimensional (rectangular) form. For example, take a look at the following table: we want to populate an array Aijkwith 3 dimensions from the data held in seven columns, where the first 2 columns are indices (i and j) and the remaining columns are data values corresponding to different index values for the third and last index k of the array.
Firsti | Secondi | Value_1 | Value_2 | Value_3 | Value_4 | Value_5 |
2 | B | 22.1 | 22.2 | 22.3 | 22.4 | 22.5 |
1 | D | 14.1 | 14.2 | 14.3 | 14.4 | 14.5 |
This table can be read by the following Mosel code (see example file threedimarr.mos). Notice that the contents of the first two index sets is not defined in the model—their contents is read with the input data. However, the last index k that is written 'across the table columns' needs to be defined in the model and it has to be of type range (that is, using a set type that has an ordering) to make sure that the entries will be read in the same order as expected by the model. With this data format we need to use the partndx option of the I/O drivers to indicate that not all indices are to be read in with the input data. Furthermore, as the input data only defines values for a few index tuples, we use a dynamic array that will only contain those elements for which values are specified in the data file.
model "ThreeDimArr" uses "mmodbc" declarations I: range J: set of string K = 1..5 ! The last index set must be defined in the model A: dynamic array(I,J,K) of real end-declarations initializations from 'mmsheet.excel:partndx;threedim.xls' A as 'Tab_23' end-initializations writeln("A: ") forall(i in I, j in J, k in K | exists(A(i,j,k))) writeln("A(", i, ",", j, ",", k, "): ", A(i,j,k)) end-model
To read the same data with the odbc driver we can use
initializations from 'mmodbc.odbc:partndx;threedim.sqlite' A as 'Tab_23' end-initializations
and to obtain the corresponding version using SQL statements, the initializations block can be replaced by the following lines.
declarations Idx3: text end-declarations SQLconnect("threedim.sqlite") setparam("SQLndxcol", false) ! Partially indexed data forall(k in K) Idx3+= (", Value_"+k ) SQLexecute("select Firsti, Secondi" + Idx3 + " from Tab_23", A) SQLdisconnect