Character encoding of text files
Mosel uses UTF-8 for its internal representation of text strings, and this is also the default character encoding for text files. Text files can nevertheless be read and written in other encodings. For model source files and initialization block files, the encoding can be selected by means of a special comment (see sections Source file character encoding and Initialization block). Alternatively, the encoding can be specified when a file is opened by prefixing the file name with "enc:":
enc:encoding [+unix|+dos|+sys] [+bom|+nobom],filename
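To make the structure of such an extended file name concrete, the following Python sketch splits one into its parts. It is only an illustration of the syntax shown above, not Mosel's own parser: the helper `parse_enc_name` and the `EncSpec` type are hypothetical names, and the sketch assumes that the options are written without embedded spaces and that the first comma separates the specification from the file name.

```python
from typing import NamedTuple, Optional


class EncSpec(NamedTuple):
    encoding: str
    line_end: Optional[str]  # "unix", "dos", "sys", or None if not given
    bom: Optional[bool]      # True for +bom, False for +nobom, None if not given
    filename: str


def parse_enc_name(name: str) -> EncSpec:
    """Split 'enc:encoding[+option...],filename' (hypothetical helper)."""
    if not name.startswith("enc:"):
        raise ValueError("missing 'enc:' prefix")
    # Everything up to the first comma is the encoding specification.
    spec, sep, filename = name[len("enc:"):].partition(",")
    if not sep or not filename:
        raise ValueError("missing ',filename' part")
    encoding, *options = (part.strip() for part in spec.split("+"))
    line_end: Optional[str] = None
    bom: Optional[bool] = None
    for opt in options:
        if opt in ("unix", "dos", "sys"):
            line_end = opt
        elif opt == "bom":
            bom = True
        elif opt == "nobom":
            bom = False
        else:
            raise ValueError(f"unknown option '+{opt}'")
    return EncSpec(encoding, line_end, bom, filename)


print(parse_enc_name("enc:UTF-16LE+unix+nobom,data.txt"))
```

In this sketch an option that is not given is reported as `None`, leaving the choice of default to the caller.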
Mosel natively supports the encodings UTF-8, UTF-16, UTF-32, ISO-8859-1, ISO-8859-15, CP1252 and US-ASCII. For UTF-16 and UTF-32 the byte order defaults to that of the running system (e.g. Little Endian on an x86 processor), but it can also be set explicitly by appending LE (Little Endian) or BE (Big Endian) to the encoding name (e.g. UTF-16LE). The availability and names of other encodings depend on the operating system.
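The effect of the byte order on the stored data can be seen in a short Python sketch. It uses Python's own codec names (`utf-16-le`, `utf-16-be`), not Mosel's, and is meant only to show what "Little Endian" and "Big Endian" mean at the byte level; the variable names are illustrative.

```python
import sys

text = "A"  # U+0041, a single 16-bit code unit in UTF-16

little = text.encode("utf-16-le")  # least significant byte first
big = text.encode("utf-16-be")     # most significant byte first

# The byte order of the machine running this script.
native_codec = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
native = text.encode(native_codec)

print("little endian:", little.hex())
print("big endian:   ", big.hex())
print("native order: ", sys.byteorder, native.hex())
```

A file written with one byte order and read with the other yields different characters, which is why an explicit LE or BE suffix is useful when files are exchanged between systems.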
The following aliases may also be used in place of an encoding name: RAW (no encoding), SYS (default system encoding), WCHAR (wide character for the C library), FNAME (encoding used for file names), TTY (encoding of the output stream of the console), TTYIN (encoding of the input stream of the console), STDIN, STDOUT, STDERR (encoding of the default input/output/error stream).
In addition to the encoding name, several options may be applied. The options +unix and +dos select the line termination (note that +dos is used automatically when writing to a physical file on Windows). The options +bom and +nobom decide whether a Byte Order Mark (BOM) is inserted at the beginning of the file (these options apply only to UTF encodings and only when the file is not opened in append mode). By default a BOM is inserted when the encoding is UTF-16 or UTF-32; the option +nobom disables this insertion. The option +bom forces the insertion of a BOM in UTF-8 encoded files (a BOM is usually not required for this encoding but is often used on Windows systems). The option +sys selects the line termination and BOM convention of the running system (i.e. it is equivalent to +unix on a POSIX system and to +dos+bom on a Windows machine).
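The combined effect of the line termination and BOM options on the bytes of a file can be sketched in Python. This is an illustration under stated assumptions and not Mosel's implementation: `encode_lines` is a hypothetical helper, the codec names are Python's, and the default BOM rule in the code simply mirrors the description above (BOM for UTF-16 and UTF-32, none for UTF-8).

```python
import codecs

# BOM byte sequences for the UTF encodings (Python codec names).
BOMS = {
    "utf-8": codecs.BOM_UTF8,
    "utf-16-le": codecs.BOM_UTF16_LE,
    "utf-16-be": codecs.BOM_UTF16_BE,
    "utf-32-le": codecs.BOM_UTF32_LE,
    "utf-32-be": codecs.BOM_UTF32_BE,
}


def encode_lines(lines, encoding, dos=False, bom=None):
    """Encode text lines with a chosen line termination and BOM policy.

    dos=True uses CR LF, otherwise LF. bom=None applies the assumed
    default: a BOM for UTF-16/UTF-32 only.
    """
    if bom is None:
        bom = encoding.startswith(("utf-16", "utf-32"))
    eol = "\r\n" if dos else "\n"
    payload = "".join(line + eol for line in lines).encode(encoding)
    # Non-UTF encodings have no BOM, so the option has no effect on them.
    prefix = BOMS.get(encoding, b"") if bom else b""
    return prefix + payload


print(encode_lines(["a"], "utf-8"))                      # LF, no BOM
print(encode_lines(["a"], "utf-8", dos=True, bom=True))  # CR LF, with BOM
print(encode_lines(["a"], "utf-16-le"))                  # BOM by default
print(encode_lines(["a"], "utf-16-le", bom=False))       # BOM suppressed
```

The second call corresponds to the Windows convention described for +sys (CR LF line ends plus a BOM), and the first to the POSIX convention.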
© 2001-2019 Fair Isaac Corporation. All rights reserved. This documentation is the property of Fair Isaac Corporation (“FICO”). Receipt or possession of this documentation does not convey rights to disclose, reproduce, make derivative works, use, or allow others to use it except solely for internal evaluation purposes to determine whether to purchase a license to the software described in this documentation, or as otherwise set forth in a written software license agreement between you and FICO (or a FICO affiliate). Use of this documentation and the software described in it must conform strictly to the foregoing permitted uses, and no other use is permitted.