This wiki: The development wiki is a work area where Jazz development teams plan and discuss technical designs and operations for the projects at Jazz.net. Work items often link to documents here. You are welcome to browse, follow along, and participate. Participation is what Jazz.net is all about! But please keep in mind that information here is "as is", unsupported, and may be outdated or inaccurate. For information on released products, consult IBM Knowledge Center, support tech notes, and the Jazz.net library. See also the Jazz.net Terms of Use. Any documentation or reference material found in this wiki is not official product documentation, but it is primarily for the use of the development teams. For your end use, you should consult official product documentation (infocenters), IBM.com support artifacts (tech notes), and the jazz.net library as officially "stamped" resources.
The character Å can be represented as U+00C5 (LATIN CAPITAL LETTER A WITH RING ABOVE), as U+212B (ANGSTROM SIGN), or as U+0041 (LATIN CAPITAL LETTER A) followed by U+030A (COMBINING RING ABOVE).
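As a minimal sketch of the representations listed above, the following Java snippet compares the three encodings of Å before and after NFC conversion. The class name AngstromForms and the helper nfc are hypothetical names chosen for illustration; only java.text.Normalizer comes from the JDK.

```java
import java.text.Normalizer;

// Hypothetical demo class, not part of any Jazz API.
final class AngstromForms {

    // Illustrative helper: returns the NFC form of the given text.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String precomposed = "\u00C5";        // LATIN CAPITAL LETTER A WITH RING ABOVE
        String angstrom    = "\u212B";        // ANGSTROM SIGN
        String decomposed  = "\u0041\u030A";  // LATIN CAPITAL LETTER A, then COMBINING RING ABOVE

        // Compared code unit by code unit, the raw strings differ.
        System.out.println(precomposed.equals(angstrom));    // false
        System.out.println(precomposed.equals(decomposed));  // false

        // After NFC conversion, all three compare equal.
        System.out.println(nfc(angstrom).equals(precomposed));    // true
        System.out.println(nfc(decomposed).equals(precomposed));  // true
    }
}
```

A plain String.equals comparison behaves like an exact code point match, which is why the raw strings differ even though they render identically.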
The sequence U+0066 U+0069 represents the string "fi" (LATIN SMALL LETTER F followed by LATIN SMALL LETTER I), while U+FB01 represents the single character 'ﬁ' (LATIN SMALL LIGATURE FI). Converting Unicode text to Normalization Form KC (NFKC) turns the second representation into the first, so the information that a ligature was used is lost.
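The difference between the two forms for the ligature can be sketched in Java as follows. The class name LigatureForms and its two helpers are hypothetical; the snippet only assumes the standard java.text.Normalizer class.

```java
import java.text.Normalizer;

// Hypothetical demo class for the ligature example.
final class LigatureForms {

    // Illustrative helper: canonical composition only.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Illustrative helper: compatibility decomposition, then canonical composition.
    static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String ligature = "\uFB01"; // LATIN SMALL LIGATURE FI

        // NFC leaves the ligature untouched, so the distinction survives.
        System.out.println(nfc(ligature).equals(ligature)); // true

        // NFKC replaces it with the two letters f and i.
        System.out.println(nfkc(ligature).equals("fi"));    // true
        System.out.println(nfkc(ligature).length());        // 2
    }
}
```

Because the NFKC result is indistinguishable from text that was typed as two letters, the original ligature cannot be recovered afterwards.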
In summary, NFC removes the distinction between equivalent characters, while preserving the distinction between compatible characters or sequences; NFKC removes the distinction between both equivalent and compatible sequences. NFC conversion is not considered lossy, but NFKC conversion is.
SPARQL does not automatically compensate for these alternate representations. This may lead to some results being unintentionally omitted from a query result. It is therefore important to standardize on a normal form for Unicode encoding, and to write appropriate queries.
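One way to write an appropriate query is to normalize a search term before it is placed into the query text. The following is only a sketch under stated assumptions: the stored data is assumed to be in NFC already, the dcterms:title predicate is an arbitrary example, and the class SparqlLiteralSketch with its methods escape and titleQuery is hypothetical rather than part of any Jazz or SPARQL library.

```java
import java.text.Normalizer;

// Hypothetical helper; assumes the triple store holds NFC-normalized literals.
final class SparqlLiteralSketch {

    // Escapes the characters that cannot appear raw inside a SPARQL "..." literal.
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    // Builds a query that matches a title exactly, after converting the term to NFC.
    // The predicate is an illustrative assumption, not taken from the wiki page.
    static String titleQuery(String title) {
        String normalized = Normalizer.normalize(title, Normalizer.Form.NFC);
        return "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
             + "SELECT ?s WHERE { ?s dcterms:title \"" + escape(normalized) + "\" }";
    }

    public static void main(String[] args) {
        // Both inputs spell the same word with different encodings of the first letter.
        String a = titleQuery("\u00C5ngstrom");
        String b = titleQuery("\u0041\u030Angstrom");
        System.out.println(a.equals(b)); // true
    }
}
```

Normalizing only the query side helps solely when the data was normalized to the same form on the way in, which is why standardizing on one form matters.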
A Java string str is converted to NFC using the method call Normalizer.normalize(str, Normalizer.Form.NFC), or to NFKC using Normalizer.normalize(str, Normalizer.Form.NFKC).
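The calls above could be wrapped in small helpers, for example to normalize text once at the point where it enters the system. This is a sketch, not an established Jazz utility: the class NormalizeOnInput and the methods toNfc and toNfkc are hypothetical names, while Normalizer.normalize and Normalizer.isNormalized are the standard java.text.Normalizer methods.

```java
import java.text.Normalizer;

// Hypothetical utility class wrapping java.text.Normalizer.
final class NormalizeOnInput {

    // Returns str in NFC, skipping the conversion when str is already normalized.
    static String toNfc(String str) {
        if (Normalizer.isNormalized(str, Normalizer.Form.NFC)) {
            return str;
        }
        return Normalizer.normalize(str, Normalizer.Form.NFC);
    }

    // Returns str in NFKC; this conversion is lossy for compatibility characters.
    static String toNfkc(String str) {
        if (Normalizer.isNormalized(str, Normalizer.Form.NFKC)) {
            return str;
        }
        return Normalizer.normalize(str, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String decomposed = "e\u0301"; // e followed by COMBINING ACUTE ACCENT

        System.out.println(toNfc(decomposed).length());           // 1
        System.out.println(toNfc(decomposed).equals("\u00E9"));   // true
        System.out.println(toNfkc("\uFB01").equals("fi"));        // true
    }
}
```

The isNormalized check is optional; it only avoids allocating a new string when the input needs no change.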