ASNA WingsRPG™ Reference Manual
Wings Double Byte Support
This topic describes Wings' support for languages which use the Double Byte Character Set (DBCS).
New Support in 8.0
As of version 8.0, the following DBCS code pages are supported through the optional ASNA DBCS library. This library is a drop-in that the ASNA runtime discovers via the Managed Extensibility Framework (MEF). Users can add support for other CCSIDs via MEF by implementing the ASNA.Runtime.IConverterFactory interface. Converters added via MEF take precedence over existing converters. For example, if you use the ASNA DBCS library, the converter for CCSID 37 is taken from it instead of the default provided by .NET (37 is included in the DBCS library because it is needed by 937).
For reference, the CCSIDs defined on the IBM i are listed here. Bidirectional languages are NOT supported in ASNA BTerm, whether a converter is added for them or not.
CCSID | Description |
---|---|
37 | USA, Canada (S/370), Netherlands, Portugal, Brazil, Australia, New Zealand |
290 | Japan Katakana (extended) |
833 | Korea (extended) |
834 | Korea - including 1880 UDC |
835 | Traditional Chinese - including 6204 UDC |
836 | Simplified Chinese (extended) |
837 | Simplified Chinese - including 1880 UDC |
939 | Japan English/Kanji (extended) - including 4370 UDC |
1364 | Korea (extended) |
1388 | Traditional Chinese |
1399w | Japan English/Kanji |
4396 | Japan - including 1880 UDC |
4930 | Korea Windows |
5026 | Japan Katakana/Kanji (extended) - including 1880 UDC |
5035 | Japan English/Kanji (extended) - including 1880 UDC |
5123 | Japanese Latin Host Extended SBCS (includes euro) |
13124 | Traditional Chinese |
16684 | Japanese Latin Host Double-Byte including 4370 UDC (includes euro) |
The DBCS is the IBM i's support for languages requiring more than 256 characters. In it, each character is represented by 2 bytes (hence Double-Byte).
The DBCS supports four languages:
- Simplified Chinese
- Traditional Chinese
- Japanese
- Korean
There are multiple CCSIDs that can be used for each of the above languages; these have been introduced and updated over time to include additional characters like the Euro. IBM i and Wings both support Unicode as a special case (support for Unicode was added with Wings 6.0, release 6.1 completed the Double-Byte support).
A great deal of information on the topic can be found in PDF form at http://www-03.ibm.com/systems/i.software/globalization/codepages.html. It covers the current IBM code pages, many of which are supported by IBM i servers.
DBCS Types
There are four "types" of DBCS fields in DDS. For more detail on the types, see http://pic.dhe.ibm.com/infocenter/iseries/v6r1m0/index.jsp?topic=/rzakc/dbcdtype.htm:
- J (Only): accepts only DBCS characters. The field length must be an even number (of bytes). The display station automatically inserts shift-control characters in fields specified with this data type.
- E (Either): accepts either DBCS or alphanumeric (single byte) characters. The field length must be an even number (of bytes). DBCS or alphanumeric characters can be typed into the field. The type of data entered into the field's first position determines the type of data that the rest of the field will accept. If blank, the system assumes alphanumeric data will be entered. Positioning the cursor on the field and putting the keyboard in DBCS mode readies the field to accept DBCS data.
- O (Open): accepts a mixture of single- and double-byte characters. The length must be a multiple of 1 (bytes). If the field contains DBCS data, the system does not ensure that the data is enclosed between shift-control characters.
- G (Graphic): accepts exclusively DBCS data. The length specifies the number of characters, not of bytes. Data typed in this field does not contain shift-control characters.
A Unicode field is considered type "G" with an explicit CCSID value of 1200 for UTF-16 and 13488 for UCS-2.
For more information on Unicode fields in IBM i, check here http://pic.dhe.ibm.com/infocenter/iseries/v6r1m0/topic/rzakc/dspfil.htm
In Wings the special case for Unicode has been promoted as a separate DBCS type.
DdsCharField and DBCS
The Wings control for a character field (DdsCharField) is designated DBCS capable by the following properties:
Property | Value |
---|---|
DbcsType | None (not a DBCS type) |
Length | Length in characters |
DbcsByteLen | Length in Bytes on the RPG program |
DisplayPositions | Maximum number of characters to display and accept on the screen |
The Wings Handler converts any DBCS data to Unicode at the RPG program border and keeps it in that form, both in its internal buffer and in the transmission format shown below:
For these conversions to be handled correctly, it is imperative that the DbcsByteLen and Length properties be set correctly. DDS provides a single Length attribute for the data; it denotes bytes for types J, E and O, and characters for G (actually code units: one unit for the character and potentially another unit for the surrogate).
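The distinction between characters and code units for type G can be illustrated with a short Python sketch. This is not Wings code; the helper name `utf16_code_units` is hypothetical and only counts how many 16-bit units a string occupies in UTF-16.

```python
def utf16_code_units(text):
    """Count the UTF-16 code units (2 bytes each) needed to hold text."""
    return len(text.encode("utf-16-be")) // 2


bmp = "顯示"            # two characters, one code unit each
supp = "\U00020000"    # one supplementary character, stored as a surrogate pair

print(len(bmp), utf16_code_units(bmp))    # 2 2
print(len(supp), utf16_code_units(supp))  # 1 2
```

A field that must hold supplementary characters therefore needs more code units than the number of visible characters suggests.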
Importing DDS Definitions
Importing DDS into an ASPX page poses the challenge of determining how many characters are allowed based on the Length found in the DDS source. For Graphic and Only the relation is functional; for Either and Open it is not deterministic. This is what Wings does for each type:
DbcsType | Length (characters) | DbcsByteLen | Notes |
---|---|---|---|
Unicode, Graphic | DDS_Len | DDS_Len * 2 | None |
Only | (DDS_Len - 2) / 2 | DDS_Len | Account for 2 shift controls. |
Either | (DDS_Len - 2) / 2 | DDS_Len | This is correct if the data is DBCS; however, if the data were single byte, then Length should have been set to DDS_Len. |
Open | (DDS_Len - 2) / 2 | DDS_Len | The hardest case: at one extreme you could have all single-byte characters, which would mean that Length should be set to DDS_Len; at the other extreme you could have alternating single- and double-byte characters, which would require Length = DDS_Len / 2.5. |
Since the developer knows the actual usage of the field, they can reset the value of Length to accommodate the application's needs.
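The import rules in the table above can be sketched in Python as follows. This is an illustration, not the Wings importer: the function name `import_lengths` is hypothetical, the "Unicode Graphic" row is assumed to cover both types, and rounding up for odd byte lengths is an assumption (the source does not state the rounding rule; rounding up happens to match the CTRUCK sample later in this topic, where 35 bytes correspond to a Length of 17, although that value may also have been set by hand).

```python
import math


def import_lengths(dbcs_type, dds_len):
    """Return (Length, DbcsByteLen) derived from the DDS length."""
    if dbcs_type in ("Unicode", "Graphic"):
        # DDS length counts characters; each takes two bytes.
        return dds_len, dds_len * 2
    if dbcs_type in ("Only", "Either", "Open"):
        # DDS length counts bytes; reserve 2 bytes for the shift controls.
        # Rounding up is an assumption, not documented behavior.
        return math.ceil((dds_len - 2) / 2), dds_len
    raise ValueError("not a DBCS type: %r" % dbcs_type)


print(import_lengths("Graphic", 10))  # (10, 20)
print(import_lengths("Only", 12))     # (5, 12)
print(import_lengths("Open", 35))     # (17, 35)
```

For an Open field that in practice holds only single-byte data, the developer would override the computed Length with the full DDS length, as described above.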
At runtime, it's expected that Length and DbcsByteLen will be available; if DbcsByteLen is missing it will be computed as follows:
DbcsType | DbcsByteLen |
---|---|
Unicode, Graphic | Length * 2 |
Only, Either, Open | Length * 2 + 2 |
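The runtime fallback can be expressed as a small Python sketch. The function name `default_dbcs_byte_len` is hypothetical; only the two formulas come from the table above.

```python
def default_dbcs_byte_len(dbcs_type, length):
    """Compute DbcsByteLen from Length when the property is missing."""
    if dbcs_type in ("Unicode", "Graphic"):
        return length * 2          # two bytes per character, no shift controls
    if dbcs_type in ("Only", "Either", "Open"):
        return length * 2 + 2      # two bytes per character plus SO and SI
    raise ValueError("not a DBCS type: %r" % dbcs_type)


print(default_dbcs_byte_len("Graphic", 10))  # 20
print(default_dbcs_byte_len("Open", 17))     # 36
```

Note that the fallback yields 36 bytes for an Open field with Length 17, so a field whose RPG-side size is 35 bytes needs DbcsByteLen set explicitly.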
Sample Data
The following image shows the Traditional Chinese DBCS characters 顯示, whose hex values in CCSID 937 are 686F and 4D9D respectively.
The file MISUCH (library ERCHINA) has multiple fields. The first two are a packed number and a DBCS field of type Open, called CTRUCK, that starts at position 8.
The second record of the data file MISUCH includes the '顯示' characters in the CTRUCK field. The byte sequence 0E 686F4F9F 0F includes the DBCS characters plus the necessary Shift-Out/Shift-In.
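To make the role of the shift controls concrete, the following Python sketch splits a mixed field into single-byte and double-byte runs using Shift-Out (0x0E) and Shift-In (0x0F). It is only an illustration of the byte layout, not the Wings Handler's implementation; the function name `split_mixed` is hypothetical, and no character conversion is attempted.

```python
SHIFT_OUT = 0x0E  # switches the field to double-byte mode
SHIFT_IN = 0x0F   # switches the field back to single-byte mode


def split_mixed(data):
    """Split mixed data into ('sbcs' | 'dbcs', bytes) segments."""
    segments = []
    i, n = 0, len(data)
    while i < n:
        if data[i] == SHIFT_OUT:
            end = i + 1
            # Inside a DBCS run, advance two bytes (one character) at a time.
            while end < n and data[end] != SHIFT_IN:
                end += 2
            segments.append(("dbcs", data[i + 1:end]))
            i = end + 1  # skip the Shift-In
        else:
            end = i
            while end < n and data[end] != SHIFT_OUT:
                end += 1
            segments.append(("sbcs", data[i:end]))
            i = end
    return segments


field = bytes.fromhex("0E 686F4F9F 0F")
for kind, chunk in split_mixed(field):
    print(kind, chunk.hex().upper(), len(chunk) // 2 if kind == "dbcs" else len(chunk))
# dbcs 686F4F9F 2
```

The sample sequence occupies six bytes in the RPG buffer but carries only two characters, which is why Length and DbcsByteLen differ.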
Displaying the two fields in a Mobile RPG/Wings application would look like this:
The CTRUCK field is defined as follows:
<mdef:DdsCharField ID="DdsCharField1" runat="server" Alias="CTRUCK" CssClass="DdsCharField"
DbcsByteLen="35" DbcsType="Open" Length="17" Lower="True"
In order for the application to work correctly a DG connection must be established, with a user profile that sets the Job's CCSID to a value 'compatible' with the DBCS data that will be manipulated by the program. The sample above was created while running a CCSID of 937 (Mixed Traditional Chinese, NO Euro).
To deal with the single-byte characters, Wings uses the translators provided by Windows: the GetEncoding method of the Encoding class. These translators are not used for the actual DBCS translations. Because many CCSIDs are not recognized by Microsoft, Wings maps some of the CCSIDs used for mixed sets to the CCSID of the single-byte portion of the whole set. For example, CCSID 937 (mixed Traditional Chinese, no Euro) is mapped to 37, and CCSID 1371 (mixed Traditional Chinese with Euro) is mapped to 1149. Several single-byte CCSIDs are mapped as well, such as 28709, which is mapped to 1140.
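The mapping idea can be sketched in Python. Wings itself relies on the .NET Encoding class; here Python's codec registry stands in for it, and the names `SBCS_FALLBACK`, `single_byte_ccsid` and `decode_single_byte` are hypothetical. Only the three mappings mentioned above are included.

```python
# Mixed or unrecognized CCSID -> CCSID of a single-byte converter.
# Only the examples given in the text; the real table is larger.
SBCS_FALLBACK = {
    937: 37,      # mixed Traditional Chinese, no Euro
    1371: 1149,   # mixed Traditional Chinese with Euro
    28709: 1140,  # single-byte CCSID that is also remapped
}


def single_byte_ccsid(ccsid):
    """Return the CCSID whose converter handles the single-byte portion."""
    return SBCS_FALLBACK.get(ccsid, ccsid)


def decode_single_byte(data, ccsid):
    """Decode single-byte EBCDIC data using the mapped code page.

    Python ships only some EBCDIC codecs (cp037 and cp1140 among them),
    so a LookupError is possible for other code pages.
    """
    codec = "cp%03d" % single_byte_ccsid(ccsid)
    return data.decode(codec)


ebcdic = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])
print(single_byte_ccsid(937))           # 37
print(decode_single_byte(ebcdic, 937))  # HELLO
```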
ASNA will continue to map these sets as they become relevant to our products. Also remember that Wings provides a mechanism that enables the user to provide a custom encoding alternative via the ASNA.Monarch.Wings.ICodePageConverter interface.