Hi all, I have an object whose Object Text is a mixture of text and OLEs, all of which currently has strikethrough applied. I'd like to remove the strikethrough but keep all the remaining RTF, and obviously the OLEs too! I've managed it for Object Text and Object Headings without OLEs by using a Regexp to find the "\\strike" and remove it. The "\\strike0" is left in the string, but that appears to be OK: the strikethrough is removed and the remaining RTF is left as is. Where the Object Text is a mixture of text and OLEs, it is more complicated. The regexp identifies the beginning of the string OK but cuts it short midway, leaving me with an object containing a little text but nothing from the first OLE onwards. A snapshot of the code:
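Module m = current
Object o = current   // must be an Object (not a Module) for oleIsObject and o."Object Text"
string s = ""
string Text = ""
// Three groups so that match 1..3 below all exist:
// text before \strike, the character right after it (anything but 0), and the rest
Regexp StrikeOutText = regexp2 "^(.*)\\\\strike([^0])(.*)"
if (oleIsObject o) {
    Text = richTextFragment richTextWithOle(o."Object Text")
    s = richTextWithOle(o."Object Text")
    if (StrikeOutText Text) {
        print "Match start of line: " Text[match 1] "\n"
        print "Match strikethru: " Text[match 2] "\n"
        print "Match end of line: " Text[match 3] "\n"
        // Reassemble the text without the \strike control word itself
        o."Object Text" = richText "{\\rtf1" Text[match 1] Text[match 2] Text[match 3] "}"
    } else {
        print "doesn't match"
    }
}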
Does anyone have any ideas where I'm going wrong??
Thanks for your help lumish82 - Tue Dec 16 10:10:57 EST 2014 |
Re: Remove strikethrough RTF tag extraction is VERY tricky. The basic approach is to search for "\strike " (with a trailing space), or "\strike" followed by another valid RTF code, like "\strike\ul". But wait!
You also have the problem of needing to ignore any such \strike that is inside the body of an OLE object. Yuuuuck, good luck simulating an RTF code parser accurately. You could, I suppose, rebuild the entire string, something like this. I think you end up losing some clever markup not recognized by the DOORS rt."characteristics".
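This is not Louie's rebuild approach, whose code is not shown in the thread. It is a minimal Python sketch (not DXL) of the two points above: only a bare "\strike" control word should be touched, and anything inside an OLE body should be left alone. It assumes that OLE payloads and pictures sit in groups opened by "\object" or "\pict", which is an assumption about what richTextWithOle returns and should be checked against real data. The function name clear_strike and the sample string are hypothetical. Instead of deleting "\strike" it rewrites it to "\strike0", a swapped-in choice that avoids disturbing the delimiter after the control word. It does not handle raw "\bin" data.

```python
import re

# A control word is a backslash, letters, then an optional signed number.
CONTROL_WORD = re.compile(r"\\([A-Za-z]+)(-?\d+)?")
# Assumption: OLE payloads and pictures live in groups opened by these words.
OPAQUE = {"object", "pict"}


def clear_strike(rtf: str) -> str:
    """Turn bare \\strike into \\strike0, except inside opaque groups."""
    out = []
    depth = 0            # current brace nesting level
    opaque_depth = None  # nesting level of the opaque group we are inside, if any
    i = 0
    while i < len(rtf):
        ch = rtf[i]
        if ch == "\\" and rtf[i + 1:i + 2] in ("\\", "{", "}"):
            out.append(rtf[i:i + 2])  # escaped literal, not a brace or control word
            i += 2
        elif ch == "{":
            depth += 1
            out.append(ch)
            i += 1
        elif ch == "}":
            if opaque_depth == depth:
                opaque_depth = None   # the opaque group ends here
            depth -= 1
            out.append(ch)
            i += 1
        elif ch == "\\" and (m := CONTROL_WORD.match(rtf, i)):
            word, param = m.group(1), m.group(2)
            if opaque_depth is None and word in OPAQUE:
                opaque_depth = depth  # skip edits until this group closes
            if opaque_depth is None and word == "strike" and param is None:
                out.append("\\strike0")
            else:
                out.append(m.group(0))
            i = m.end()
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical sample: one struck run, one OLE-like group, one \strike\ul pair.
sample = (r"{\rtf1 \strike old {\object\objemb{\*\objdata 0105\strike 00}} "
          r"\b\strike\ul more\strike0 }")
print(clear_strike(sample))
```

Because the scan tracks brace depth, the "\strike" inside the "\object" group is copied through unchanged, while "\strike\ul" and "\strike " outside it are both rewritten. A single regular expression cannot make that distinction, since it has no notion of which group it is in.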
You may have better luck using OLE automation: send the entire text to an empty MS Word document, command Word to select all and remove the strike-outs, then copy the result and insert it back into the object. Maybe someone has some ideas. -Louie |
Re: Remove strikethrough But you asked a question. This is a crude response; I'm weak with RegExp. But I wonder:
-Louie |
Re: Remove strikethrough (in reply to llandale - Thu Dec 18 18:14:03 EST 2014)
Hi Louie, thanks for your reply. I'll read up on greedy regexp and hopefully that will help. From the quick look I did just now, I think you're right that I ideally want the regexp to be non-greedy. I'll definitely put a Buffer in there too; I keep forgetting that's a more efficient and safer way of doing regexp. I'll also look into the rt.characteristics… But for now, a quick reply to say thanks for giving me something to look into, and I'll hopefully be able to post some progress after the New Year. Thanks, Lumish. |
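To make the greedy versus non-greedy point concrete, here is a small Python sketch (not DXL; whether DXL's Regexp supports a lazy quantifier is not established in this thread). The sample string is hypothetical. The lookahead `(?![A-Za-z0-9])` is an addition that keeps "\strike0" and longer control words from matching.

```python
import re

# Hypothetical RTF fragment with two struck-through runs.
rtf = r"{\rtf1 \strike old\strike0  keep \strike gone\strike0 }"

# Same pattern twice; only the first group differs: (.*) is greedy, (.*?) is lazy.
greedy = re.compile(r"^(.*)\\strike(?![A-Za-z0-9])(.*)$", re.S)
lazy = re.compile(r"^(.*?)\\strike(?![A-Za-z0-9])(.*)$", re.S)

greedy_prefix = greedy.match(rtf).group(1)
lazy_prefix = lazy.match(rtf).group(1)

print(repr(greedy_prefix))  # everything up to the LAST bare \strike
print(repr(lazy_prefix))    # everything up to the FIRST bare \strike
```

With the greedy group, the prefix swallows the first "\strike" and only the last one is split off, so a single match-and-rebuild pass removes one strikethrough at most and the earlier ones stay in the text.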
Re: Remove strikethrough Hello, RTF is something like an HTML owned by Microsoft. Its structure and its use are similar. You cannot use regular expressions to parse XML or HTML, and so you cannot use regular expressions for RTF either (http://stackoverflow.com/questions/8577060/why-is-it-such-a-bad-idea-to-parse-xml-with-regex). If you want to manipulate RTF data you have to use something like an "RTF DOM parser", and no, I currently do not know of a usable product. Best regards, Wolfgang |