RTC 4.0: zimport of an HTML PDS from z/OS - why does the encoding default to Cp1252, and why do brackets ( [ ] ) show up as funny symbols?
A customer brought in a PDS of HTML members from z/OS. When we do a Show Properties, it shows that the encoding is Cp1252. Within the HTML members there are some brackets ( [ ] ), and they all display as a funny-looking letter A. If we make any change to a member and save it, we get an error saying that some characters do not conform to Cp1252, and we are forced to save it as UTF-8. Then, after a build that copies the HTML member back to a z/OS PDS, the brackets look like funny symbols and are not translated back to brackets. What do we need to do so that they are zimported as brackets and stay as brackets when being "moved" back out to a PDS on z/OS?
Within RTC we can manually change the funny-looking letter A back to the specific bracket, and then when the member is moved back to the z/OS PDS things are OK: the brackets are still brackets. We have a lot of affected members and do not want to update them all manually. Any suggestions?
One answer
Hi Donald,
1) Why does encoding default to Cp1252?
I'm guessing this is because that's what your Eclipse settings specify. On Windows systems, Eclipse starts with Cp1252 set for the default text encoding. Check this by going to Window -> Preferences -> General -> Workspace and looking for the "Text file encoding" setting.
We do have a work item open to have zimport establish a project-level override for the text file encoding, so that even if your workbench preferences specify Cp1252 as the text file encoding, the files within your zComponent project will be opened as UTF-8.
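As a side note on the symptom itself, here is a small sketch of why text stored as UTF-8 tends to show a stray accented "A" when an editor opens it as Cp1252. This is only an illustration in Python, not something RTC or zimport runs; the helper name and the sample characters are my own picks, and I can't say for certain these are the exact characters you are seeing.

```python
def view_utf8_as_cp1252(text: str) -> str:
    """Return what an editor would show if it read UTF-8 bytes as Cp1252."""
    # errors="replace" covers the few byte values that Cp1252 leaves undefined.
    return text.encode("utf-8").decode("cp1252", errors="replace")


# Characters in the range U+0080..U+00FF become two bytes in UTF-8, and the
# first byte is 0xC2 or 0xC3, which Cp1252 draws as an accented capital A.
for ch in ("\u00a8", "\u00e9"):
    print(repr(ch), "->", repr(view_utf8_as_cp1252(ch)))

# Plain ASCII, including real brackets, is identical in both encodings.
print(repr(view_utf8_as_cp1252("[]")))
```

So if the brackets had been converted correctly on the way in, the Cp1252 workbench setting alone would not garble them; the accented "A" suggests the characters were already something other than brackets by the time they were stored as UTF-8.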
2) Why do brackets ( [ ] ) show up as funny symbols?
Every time I've seen this before, it's been caused by a mismatch between the z/OS system's default encoding and the actual encoding of the member in the PDS. Brackets are variant characters that have different hex values in different EBCDIC code pages. For example, in IBM-1047, their hex values are 0xAD and 0xBD, while in IBM-037, their hex values are 0xBA and 0xBB. If your default system encoding is set to IBM-1047, but the member being imported is really in IBM-037, you'll see exactly this behavior.
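To make the mismatch concrete, here is a rough Python sketch. The bracket byte values are the ones quoted above; the table and the `encode_brackets` helper are hypothetical names of mine. Python's standard library has a `cp037` codec for IBM-037, but I'm not relying on an IBM-1047 codec being available, so the sketch shows the opposite direction from the scenario above (IBM-1047 bytes read as IBM-037). The principle is the same either way.

```python
# Bracket byte values as quoted in the answer above.
BRACKET_BYTES = {
    "IBM-1047": {"[": 0xAD, "]": 0xBD},
    "IBM-037": {"[": 0xBA, "]": 0xBB},
}


def encode_brackets(text: str, code_page: str) -> bytes:
    """Encode a string made up only of brackets using the table above."""
    table = BRACKET_BYTES[code_page]
    return bytes(table[ch] for ch in text)


# Cross-check the IBM-037 column against Python's built-in cp037 codec.
ibm037_bytes = encode_brackets("[]", "IBM-037")
assert ibm037_bytes == "[]".encode("cp037")

# Bytes written as IBM-1047 brackets, then decoded with the wrong code page.
ibm1047_bytes = encode_brackets("[]", "IBM-1047")
misread = ibm1047_bytes.decode("cp037")
print(ibm1047_bytes.hex(), "read as IBM-037 ->", repr(misread))
```

The misread result contains no brackets at all, only accented and diacritic characters, which is the kind of thing that then gets carried into the UTF-8 copy in the repository.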
To determine your system's default encoding, you can run the following command from USS:
chcp -q
To determine the hex values of the characters in the member, you can open the member in the ISPF editor and use the HEX ON command to display the hex values (use HEX OFF to turn them off again).
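If you'd rather not eyeball every member in ISPF, something like the following could tally the candidate bracket bytes for you. This is a rough Python sketch under some assumptions: `count_bracket_bytes` is a made-up name, you already have a binary (untranslated) copy of the member's bytes available, and the byte values are the ones quoted above. Treat the result as a hint only, since each of those byte values is a different, legitimate character in the other code page.

```python
from collections import Counter

# (open bracket, close bracket) byte values per code page, as quoted above.
BRACKET_BYTES = {
    "IBM-1047": (0xAD, 0xBD),
    "IBM-037": (0xBA, 0xBB),
}


def count_bracket_bytes(data: bytes) -> dict:
    """Count how many bytes in data match each code page's bracket values."""
    counts = Counter(data)
    return {
        code_page: counts[open_byte] + counts[close_byte]
        for code_page, (open_byte, close_byte) in BRACKET_BYTES.items()
    }


# Stand-in for a member's raw bytes: an HTML-like fragment encoded as IBM-037.
sample = "<a href=[x]>".encode("cp037")
print(count_bracket_bytes(sample))
```

A member whose counts land almost entirely on one code page is a reasonable candidate for that encoding, which you can then confirm with HEX ON for a few samples.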
If it does turn out you have a mismatch, you can instruct zimport to use a specific EBCDIC encoding when doing the conversion to UTF-8 (rather than using the system default encoding) by setting the ZLANG environment variable immediately before running the zimport:
export ZLANG=IBM-037
or
export ZLANG=IBM-1047
Hope this helps!