Does anyone have a suggestion for extracting Chinese characters from a text string? I have some documents imported from Word that have Chinese translations in the same paragraph as the English text. I am trying to write a routine to split the Chinese into a separate attribute, but it is proving tricky to detect the Chinese characters reliably. Any ideas?
Re: Extract Chinese Characters from a string

I've attached a sample which splits the Object Text attribute into attributes called "Chinese" and "English". It's not perfect: handling of space and full-stop characters would still need to be added.

Attachment: attachment_14921119_findNonLatin.dxl
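
The general idea of splitting a string by script can be sketched as below. This is a minimal Python illustration, not the contents of the attached DXL script: the function names `is_cjk` and `split_chinese_english` are hypothetical, and the list of code point ranges is an assumption that covers the common CJK blocks rather than every possible Chinese character. Including the CJK punctuation block is one possible way to deal with the full-stop issue mentioned above, and collapsing leftover whitespace is one way to deal with the spaces.

```python
import re

# Code point ranges treated as "Chinese" in this sketch.
# Assumption: these common blocks are enough for the documents in question.
CJK_RANGES = (
    (0x3000, 0x303F),  # CJK Symbols and Punctuation (includes the ideographic full stop)
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
    (0xFF00, 0xFFEF),  # Halfwidth and Fullwidth Forms
)


def is_cjk(ch):
    """Return True if the character falls in one of the assumed CJK ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_RANGES)


def split_chinese_english(text):
    """Split text into (chinese, english) by testing each character's code point."""
    chinese = []
    english = []
    for ch in text:
        if is_cjk(ch):
            chinese.append(ch)
        else:
            english.append(ch)
    chinese_text = "".join(chinese)
    # Removing the Chinese run can leave doubled or trailing spaces behind,
    # so collapse whitespace in the remaining text.
    english_text = re.sub(r"\s+", " ", "".join(english)).strip()
    return chinese_text, english_text


if __name__ == "__main__":
    zh, en = split_chinese_english("The system shall start. 系统应启动。")
    print(zh)
    print(en)
```

A DXL version would need the same per-character range test, written with whatever character-code functions your DOORS version provides. Note that ASCII digits and punctuation embedded in the Chinese sentence would end up on the English side with this approach.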