The source file is a text file that contains a table of mappings between Unicode and a foreign character set. The table consists of tab-separated columns, which cnvtool uses to create a Charconv converter.
The file is case-insensitive. Comments begin with a # and extend to the end of the line. Blank lines and leading and trailing whitespace are ignored.
Columns
The first column lists the foreign character code and the second lists the corresponding Unicode character code. Both codes are in hexadecimal. The third column is optional and contains a comment, prefixed with the comment sign #, to make the file more readable.
0x3E 0x003E #GREATER-THAN SIGN
0x3F 0x003F #QUESTION MARK
0x40 0x0040 #COMMERCIAL AT
0x41 0x0041 #LATIN CAPITAL LETTER A
0x42 0x0042 #LATIN CAPITAL LETTER B
0x43 0x0043 #LATIN CAPITAL LETTER C
Note: The table can contain other hexadecimal columns. In such cases, the columns for the foreign character code and the corresponding Unicode character code must be specified using the -columns option of cnvtool.
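As a rough illustration of the rules above, the following Python sketch reads lines in this format. It is not how cnvtool itself is implemented. The function name parse_mapping_line is hypothetical, and the sketch assumes the default layout, in which the first two columns hold the foreign and Unicode codes, and that columns may be separated by any run of whitespace.

```python
from typing import Optional, Tuple


def parse_mapping_line(line: str) -> Optional[Tuple[int, int]]:
    """Return (foreign_code, unicode_code), or None for a blank or comment-only line.

    Hypothetical helper for illustration; assumes the codes are in columns 1 and 2.
    """
    # A comment begins with '#' and extends to the end of the line.
    content = line.split("#", 1)[0].strip()
    if not content:
        return None
    columns = content.split()
    if len(columns) < 2:
        raise ValueError(f"expected at least two columns: {line!r}")
    # int(text, 16) accepts an optional 0x prefix and upper- or lower-case digits.
    return int(columns[0], 16), int(columns[1], 16)


sample = """\
0x3E 0x003E #GREATER-THAN SIGN
0x3F 0x003F #QUESTION MARK

   0x40 0x0040 #COMMERCIAL AT
"""

mappings = [m for m in map(parse_mapping_line, sample.splitlines()) if m is not None]
for foreign, uni in mappings:
    print(f"0x{foreign:02X} -> U+{uni:04X}")
```

The blank line and the leading whitespace in the sample are skipped, so three mappings are produced.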
SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE command
In some cases, the foreign character codes that appear in the source file need to be processed before being used in the binary output file. You can specify how they must be processed by including a SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE command line in the source file as follows:
SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE [Perl code]
All foreign character codes on the lines following the command are processed using the Perl code. To stop processing, use the SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE command with no parameter.
SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE return $foreignCharacterCode|0x00008080;
0x2121 0x3000 # IDEOGRAPHIC SPACE
0x2122 0x3001 # IDEOGRAPHIC COMMA
0x2123 0x3002 # IDEOGRAPHIC FULL STOP
...
SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE
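The on/off behaviour of the command can be sketched in Python as follows. This is only a model of the behaviour described here, not cnvtool's implementation: Python cannot run the Perl code, so the sketch assumes a hypothetical lookup table (processors) that maps the Perl text to an equivalent Python function. The sample is adapted from the example above, with the stop command moved up one line so that its effect is visible.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

COMMAND = "SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE"


def apply_processing(
    lines: Iterable[str],
    processors: Dict[str, Callable[[int], int]],
) -> List[Tuple[int, int]]:
    """Return (foreign, unicode) pairs, applying the active processing code.

    `processors` is a hypothetical stand-in for evaluating the Perl code.
    """
    current: Optional[Callable[[int], int]] = None
    result: List[Tuple[int, int]] = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(COMMAND):
            perl_code = line[len(COMMAND):].strip()
            # With no parameter, the command switches processing off.
            current = processors[perl_code] if perl_code else None
            continue
        columns = line.split("#", 1)[0].split()
        if len(columns) < 2:
            continue  # blank or comment-only line
        foreign, uni = int(columns[0], 16), int(columns[1], 16)
        if current is not None:
            foreign = current(foreign)
        result.append((foreign, uni))
    return result


sample = """\
SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE return $foreignCharacterCode|0x00008080;
0x2121 0x3000 # IDEOGRAPHIC SPACE
0x2122 0x3001 # IDEOGRAPHIC COMMA
SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE
0x2123 0x3002 # IDEOGRAPHIC FULL STOP
"""

# Python equivalent of the Perl expression used in the sample.
processors = {
    "return $foreignCharacterCode|0x00008080;": lambda code: code | 0x00008080,
}

pairs = apply_processing(sample.splitlines(), processors)
for foreign, uni in pairs:
    print(f"0x{foreign:04X} -> U+{uni:04X}")
```

The first two codes become 0xA1A1 and 0xA1A2, while 0x2123 follows the parameterless command and is left unchanged.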
$foreignCharacterCode variable
The $foreignCharacterCode variable holds the foreign character code (the first column). For example, if the high bit of each foreign character is off in the source file but must be on in the output file, the Perl code (assuming the foreign character set uses only one byte for each character) is:
return $foreignCharacterCode|0x80;
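To make the arithmetic concrete, the following Python sketch applies the same bitwise OR. The helper names high_bit_mask and set_high_bits are hypothetical. The sketch also shows that a mask such as 0x8080 is the two-byte form of the same idea, with 0x80 repeated once per byte.

```python
def high_bit_mask(num_bytes: int) -> int:
    """Build a mask with the high bit set in each of `num_bytes` bytes."""
    mask = 0
    for _ in range(num_bytes):
        mask = (mask << 8) | 0x80
    return mask


def set_high_bits(code: int, num_bytes: int = 1) -> int:
    """Python equivalent of `return $foreignCharacterCode|<mask>;`."""
    return code | high_bit_mask(num_bytes)


# One-byte codes: 0x3E becomes 0xBE and 0x41 becomes 0xC1.
for code in (0x3E, 0x41):
    print(f"0x{code:02X} -> 0x{set_high_bits(code):02X}")

# Two-byte code: 0x2121 becomes 0xA1A1.
print(f"0x{0x2121:04X} -> 0x{set_high_bits(0x2121, 2):04X}")
```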