Hi, I want to parse an object's text and put each match in a new attribute. But when the text contains newlines, my regexp doesn't work. Any help?
/*My current object text is literally:
"Version 2 :
Add some links
with "Satisfy" link module"
I want to have :
"Add some links
with "Satisfy" link module"
in match [3]
*/
Object curro = current Object
string Text_Version = curro."Object Text"
string Text_Description
Regexp MyText = regexp2 ".*(Version [0-9]*:) [^\\n*](\\\\~| *)*(.*)"
if (MyText Text_Version){
Text_Description = Text_Version [match 3]
}
else Text_Description = "Nothing"
print Text_Version [match 0] "\n"
print Text_Version [match 1] "\n"
print Text_Version [match 2] "\n"
print Text_Version [match 3] "\n"
/*
match[0] = "ersion 2 :
Add some links
with "Satisfy" link module"
match[1] = "ersion 2 :
Add some links
with "Satisfy" link module"
match [2] = ""
match[3] = ""
*/
What is wrong? Why is the V of "Version" not matched? Estebell - Tue Apr 22 08:38:44 EDT 2014 |
Re: Regexp with new lines I don't quite follow your regular expression, but in DXL you can match newlines in several ways. Note, however, that the "." placeholder does not match newlines:
string sText = "\n";
Regexp re1 = regexp "\\\n" // match
Regexp re2 = regexp "\\n" // match
Regexp re3 = regexp "\n" // match
Regexp re4 = regexp "." // no match
if (re1 sText) print "Match re1\n";
if (re2 sText) print "Match re2\n";
if (re3 sText) print "Match re3\n";
if (re4 sText) print "Match re4\n";
I don't know what the (\\\\~| *) part of your regexp is supposed to match, but the following test at least matches in groups 1 and 3 (note that you have a space between 'Version 2' and the ':'):
string Text_Version = "Version 2 : Add some links with \"Satisfy\" link module"
Regexp MyText = regexp2 "(Version [0-9]*[ ]*:)[ \n]*(\\\\~| *)*(.*)"
// apply the regexp before reading the match groups
if (MyText Text_Version) {
print "Match 0: " Text_Version [match 0] "\n"
print "Match 1: " Text_Version [match 1] "\n"
print "Match 2: " Text_Version [match 2] "\n"
print "Match 3: " Text_Version [match 3] "\n"
}
Regards, Mathias |
Re: Regexp with new lines Not really following; but I will say:
Thus, parsing text with EOLs causes problems. I resolve that with this
That seems to handle EOLs in the text. -Louie |
Re: Regexp with new lines In reply to llandale - Tue Apr 22 13:04:16 EDT 2014: Well, I tried your const string cl_re_strAnyChar, but it doesn't work. I've simplified my object text:
// object text : "Version 2 : Add some links with link module" (without any EOL nor special characters)
Object curro = current Object
string Text_Version = curro."Object Text"
const string str_anychar = "["charOf(1)"-"charOf(255)"]"
Regexp MyText = regexp2 "([A-Z a-z 0-9]*:)(str_anychar)*"
if (MyText Text_Version)
{
print Text_Version [match(0)] "\n"
print Text_Version [match(1)] "\n"
print Text_Version [match(2)] "\n"
}
// match [0] = "Version 2 :"
// match [1] = "Version 2 :"
// match [2] = ""
Why are match(0) and match(2) wrong? I expected match(0) = "Version 2 : Add some links with link module" and match(2) = "Add some links with link module". Even without EOLs and with the const string, the regexp does not match what I expect! |
Re: Regexp with new lines In reply to Estebell - Wed Apr 23 03:26:18 EDT 2014: Your line 7 (the Regexp declaration) is wrong: str_anychar sits inside the string literal, so it is treated as literal text rather than as the variable. It should read:
Regexp MyText = regexp2 "([A-Za-z0-9 ]*:)(" str_anychar ")*"
Regards, Mathias
|
Re: Regexp with new lines In reply to Mathias Mamsch - Wed Apr 23 17:14:02 EDT 2014: Thanks so much! It works fine!
|
Re: Regexp with new lines In reply to Mathias Mamsch - Wed Apr 23 17:14:02 EDT 2014: I beat my head against the wall yesterday and missed that. Doh! However, I think you should move that last asterisk * inside the parens; this gives "match 2" the entire rest of the string. The way you have it, match 2 is just the last character, in this case "e".
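To make the point about the asterisk concrete, here is a minimal sketch that puts the two variants side by side. It assumes the simplified sample text from earlier in the thread; the names sText, sAnyChar, reOutside and reInside are made up for illustration.

```dxl
// Sample text taken from the simplified example earlier in the thread.
string sText = "Version 2 : Add some links with link module"
// Same character range as str_anychar in the posts above.
string sAnyChar = "[" charOf(1) "-" charOf(255) "]"

// Asterisk outside the parens: the group itself is repeated, one character per repetition.
Regexp reOutside = regexp2 "([A-Za-z0-9 ]*:)(" sAnyChar ")*"
// Asterisk inside the parens: the group covers the whole repeated run.
Regexp reInside = regexp2 "([A-Za-z0-9 ]*:)(" sAnyChar "*)"

// Print group 2 for each variant so the difference can be compared directly.
if (reOutside sText) print "outside: [" sText[match 2] "]\n"
if (reInside sText) print "inside:  [" sText[match 2] "]\n"
```

Running both against the same string should show whether group 2 holds only a single character or the whole remainder on your DOORS version.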
-Louie |
Re: Regexp with new lines Hello Louie, I'm continuing in this topic because I have a new problem. When a string contains special characters, my regexp does not work, and I don't know why. Here is my code. It prints "Objectif :\nVérifier que la sortie est capable d'effectuer 50000 man", although intOf "œ" = 156, which is between 1 and 255.
void Test (string s)
{
string anychar = "["charOf(1) "-" charOf(255)"]"
Regexp Text = regexp2 "(Objectif[ ]*:[ ]*\\n)("anychar"*)"
if(Text s)
{
s = s[match 2] ""
}
print s""
}
Test ("Objectif :\nVérifier que la sortie est capable d'effectuer 50000 manœuvres.")
|
Re: Regexp with new lines In reply to Estebell - Fri Mar 31 09:35:55 EDT 2017: DOORS internally uses UTF-8, which has a much wider set of characters. The char type can hold far more than 255 different characters, but IBM never bothered to correct the charOf and intOf functions.
So while your test suggests that intOf "œ" = 156, the code point is really 339 (see http://www.fileformat.info/info/unicode/char/0153/index.htm), and it therefore falls outside the range 1-255 (which regexp fully respects):
char c = addr_ 339
print c
c = charOf 339
print c
So the easiest way for your specific regexp is to remove the "anychar" part from the regexp and simply take the substring after the "end 0" match. If you really need anychar, you need to resort to something like "[^" + (charOf 1) + "]" (which assumes that you do not consider chr(1) a valid character in your text). Hope this helps, Regards, Mathias
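The substring approach suggested above could look roughly like the following. This is only a sketch: it assumes that end 0 returns the index of the last character of the whole match and that an open-ended range s[n:] yields the rest of the string. The names reHead and iEnd are made up for illustration.

```dxl
void Test(string s)
{
    // Match only the heading; no "anychar" class is needed any more.
    Regexp reHead = regexp2 "Objectif[ ]*:[ ]*\\n"
    if (reHead s)
    {
        // Assumption: end 0 is the index of the last character of the whole match.
        int iEnd = end 0
        // Keep everything after the heading, whatever characters it contains.
        s = s[iEnd + 1:]
    }
    print s "\n"
}

Test("Objectif :\nVérifier que la sortie est capable d'effectuer 50000 manœuvres.")
```

Because the body is taken by position rather than matched by a character class, characters outside the 1-255 range, such as "œ", no longer matter to the regexp.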
|
Re: Regexp with new lines Well, I do need anychar, so I solved it with a negated class that excludes the ñ character.
Thanks a lot! |