RTC / Oracle: Characterset?
Hello
I am a complete N00b in Oracle, so please forgive me if this question is in any way obvious to answer...
-> The RTC instructions say to use "UTF-8" for the character set in Oracle.
Our Oracle guys came back with two questions:
- UTF8 or AL32UTF8?
- Which value do you want for the character set, and which for the national character set?
--> It seems that Oracle has two different UTF-8 implementations, and that there are two character sets to define in Oracle.
What are the correct answers for Oracle?
Thanks! Martin
P.S. It is Oracle 11g, and yes, I know that this version is only officially supported with RTC 3 (and that there is a workaround for RTC 2).
One answer
The main difference between the AL32UTF8 and UTF8 character sets is that AL32UTF8 stores characters beyond U+FFFF as four bytes (exactly as Unicode defines UTF-8). Oracle's UTF8 stores these characters as a sequence of two UTF-16 surrogates, each encoded using UTF-8 (three bytes each, so six bytes per character). As a consequence of this storage difference, AL32UTF8 also has better support for supplementary characters.
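To make the four-byte versus six-byte difference concrete, here is a small Python sketch. It does not talk to Oracle at all; it only reproduces the two byte layouts described above. The function names are made up for illustration, and the "Oracle UTF8 style" function is an assumption based on the surrogate-pair description in this answer, not Oracle's own code.

```python
def al32utf8_style(ch: str) -> bytes:
    """Standard UTF-8: a character beyond U+FFFF takes four bytes."""
    return ch.encode("utf-8")


def oracle_utf8_style(ch: str) -> bytes:
    """Illustrative only: split a character beyond U+FFFF into its two
    UTF-16 surrogates and encode each one as a three-byte sequence."""
    cp = ord(ch)
    if cp <= 0xFFFF:
        # Characters up to U+FFFF are encoded the same way in both schemes.
        return ch.encode("utf-8")
    cp -= 0x10000
    high = 0xD800 + (cp >> 10)    # high (leading) surrogate
    low = 0xDC00 + (cp & 0x3FF)   # low (trailing) surrogate
    # "surrogatepass" lets Python encode lone surrogates, which strict
    # UTF-8 would otherwise reject.
    return (chr(high).encode("utf-8", "surrogatepass")
            + chr(low).encode("utf-8", "surrogatepass"))


sample = "\U0001F600"  # a character beyond U+FFFF
print(len(al32utf8_style(sample)), al32utf8_style(sample).hex())
print(len(oracle_utf8_style(sample)), oracle_utf8_style(sample).hex())
```

For characters up to U+FFFF both functions return identical bytes, which is why the choice only matters for data containing supplementary characters.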
Since this storage difference affects only the database and not RTC, either should work fine. However, UTF8 is the one that was tested, so that is what we would recommend.