How to do a case-insensitive string comparison?

Can anyone help me with how to do a string comparison that ignores case?
SystemAdmin - Fri Nov 30 03:40:00 EST 2012

Re: How to do a case-insensitive string comparison?
Mathias Mamsch - Fri Nov 30 04:30:15 EST 2012

The easiest way is to convert both strings to lower (or upper) case before comparing them:

string s1 = "Hello"
string s2 = "hEllo"

print (lower s1 == lower s2) "\n"

Regards, Mathias

Mathias Mamsch, IT-QBase GmbH, Consultant for Requirement Engineering and D00RS
 

Re: How to do a case-insensitive string comparison?
hell_se - Mon Dec 03 10:32:20 EST 2012

This works nicely for plain ASCII characters (A-Z, a-z), but unfortunately not for non-Latin characters (ΓΔΘФЦЩ etc.) or accented characters (ÁÅÄÖÜÔ etc.).

I have written a few simple functions (toUpper, toLower and toTitle) which work as one would expect, as long as you are using codepage 1252. In case someone is interested:

// Codepage 1252 character codes used by the functions below.
// ASCII letters (capital and small are 0x20 apart)
const int CAPITAL_LETTER_A = 0x41
const int CAPITAL_LETTER_Z = 0x5a
const int SMALL_LETTER_a = 0x61
const int SMALL_LETTER_z = 0x7a
// Accented letters (capital and small are also 0x20 apart)
const int CAPITAL_LETTER_A_WITH_GRAVE_ACCENT = 0xc0
const int CAPITAL_ICELANDIC_LETTER_THORN = 0xde
const int SMALL_LETTER_a_WITH_GRAVE_ACCENT = 0xe0
const int SMALL_ICELANDIC_LETTER_THORN = 0xfe
// Non-letters inside the accented ranges, excluded from conversion
const int MULTIPLICATION_SIGN = 0xd7
const int DIVISION_SIGN = 0xf7
// Letter pairs outside the ranges above, mapped one by one
const int CAPITAL_LETTER_S_WITH_CARON = 0x8a
const int SMALL_LETTER_s_WITH_CARON = 0x9a
const int CAPITAL_DIGRAPH_OE = 0x8c
const int SMALL_DIGRAPH_oe = 0x9c
const int CAPITAL_LETTER_Z_WITH_CARON = 0x8e
const int SMALL_LETTER_z_WITH_CARON = 0x9e
const int CAPITAL_LETTER_Y_WITH_DIAERESIS = 0x9f
const int SMALL_LETTER_y_WITH_DIAERESIS = 0xff
 
bool whiteSpace (char c)
{
    return(c==' ' || c=='\t' || c=='\r' || c=='\n' || c=='\v' || c=='\f')
}
 
string toUpper (string s)
{
        int i,n
        string sUpper = null
        
        if (currentANSIcodepage != 1252) return s       // Unknown codepage...
 
        for (i = 0; i < length (s); i++)
        {
                n = intOf (s[i])
                if ((n >= SMALL_LETTER_a && n <= SMALL_LETTER_z) || (n >= SMALL_LETTER_a_WITH_GRAVE_ACCENT && n <= SMALL_ICELANDIC_LETTER_THORN && n != DIVISION_SIGN))
                        n = n - 0x20
                elseif (n == SMALL_LETTER_s_WITH_CARON)
                        n = CAPITAL_LETTER_S_WITH_CARON
                elseif (n == SMALL_DIGRAPH_oe)
                        n = CAPITAL_DIGRAPH_OE
                elseif (n == SMALL_LETTER_z_WITH_CARON)
                        n = CAPITAL_LETTER_Z_WITH_CARON
                elseif (n == SMALL_LETTER_y_WITH_DIAERESIS)
                        n = CAPITAL_LETTER_Y_WITH_DIAERESIS
                sUpper = sUpper charOf (n) ""
        }
        return sUpper
}
 
string toLower (string s)
{
        int i,n
        string sLower = null
        
        if (currentANSIcodepage != 1252) return s       // Unknown codepage...
 
        for (i = 0; i < length (s); i++)
        {
                n = intOf (s[i])
                if (( n >= CAPITAL_LETTER_A && n <= CAPITAL_LETTER_Z) || (n >= CAPITAL_LETTER_A_WITH_GRAVE_ACCENT && n <= CAPITAL_ICELANDIC_LETTER_THORN && n != MULTIPLICATION_SIGN))
                        n = n + 0x20
                elseif (n == CAPITAL_LETTER_S_WITH_CARON)
                        n = SMALL_LETTER_s_WITH_CARON
                elseif (n == CAPITAL_DIGRAPH_OE)
                        n = SMALL_DIGRAPH_oe
                elseif (n == CAPITAL_LETTER_Z_WITH_CARON)
                        n = SMALL_LETTER_z_WITH_CARON
                elseif (n == CAPITAL_LETTER_Y_WITH_DIAERESIS)
                        n = SMALL_LETTER_y_WITH_DIAERESIS
                sLower = sLower charOf(n) ""
        }
        return sLower
}
 
string toTitle (string s)
{
        int i,n
        string sTitle = null
        bool upper = true
        
        if (currentANSIcodepage != 1252) return s       // Unknown codepage...
 
        for (i = 0; i < length (s); i++)
        {
                n = intOf (s[i])
                if (whiteSpace (s[i]))
                        upper = true
                else
                {
                        if (upper)
                        {
                                if ((n >= SMALL_LETTER_a && n <= SMALL_LETTER_z) || (n >= SMALL_LETTER_a_WITH_GRAVE_ACCENT && n <= SMALL_ICELANDIC_LETTER_THORN && n != DIVISION_SIGN))
                                        n = n - 0x20
                                elseif (n == SMALL_LETTER_s_WITH_CARON)
                                        n = CAPITAL_LETTER_S_WITH_CARON
                                elseif (n == SMALL_DIGRAPH_oe)
                                        n = CAPITAL_DIGRAPH_OE
                                elseif (n == SMALL_LETTER_z_WITH_CARON)
                                        n = CAPITAL_LETTER_Z_WITH_CARON
                                elseif (n == SMALL_LETTER_y_WITH_DIAERESIS)
                                        n = CAPITAL_LETTER_Y_WITH_DIAERESIS
                        }
                        elseif ((n >= CAPITAL_LETTER_A && n <= CAPITAL_LETTER_Z) || (n >= CAPITAL_LETTER_A_WITH_GRAVE_ACCENT && n <= CAPITAL_ICELANDIC_LETTER_THORN && n != MULTIPLICATION_SIGN))
                                n = n + 0x20
                        elseif (n == CAPITAL_LETTER_S_WITH_CARON)
                                n = SMALL_LETTER_s_WITH_CARON
                        elseif (n == CAPITAL_DIGRAPH_OE)
                                n = SMALL_DIGRAPH_oe
                        elseif (n == CAPITAL_LETTER_Z_WITH_CARON)
                                n = SMALL_LETTER_z_WITH_CARON
                        elseif (n == CAPITAL_LETTER_Y_WITH_DIAERESIS)
                                n = SMALL_LETTER_y_WITH_DIAERESIS
                        upper = false
                }
                sTitle = sTitle charOf (n) ""
        }
        return sTitle
}

 


Regards, Lennart
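
For anyone who wants to double-check the codepage 1252 table used above outside DOORS, here is a minimal Python 3 sketch. It re-implements only the upper-casing rule from toUpper and compares it, byte by byte, with Python's own Unicode case mapping. The names to_upper_byte, python_upper_byte and SPECIAL_UPPER are made up for this illustration. Characters whose upper-case form does not fit into a single cp1252 byte (for example ß) are assumed to stay unchanged, which matches what the DXL functions do.

```python
# Lower -> upper pairs in codepage 1252 that are not 0x20 apart
# (s/z with caron, oe digraph, y with diaeresis).
SPECIAL_UPPER = {0x9A: 0x8A, 0x9C: 0x8C, 0x9E: 0x8E, 0xFF: 0x9F}


def to_upper_byte(n: int) -> int:
    """Upper-case one cp1252 character code using the rule from toUpper."""
    # a-z and the accented small letters (except the division sign 0xF7)
    # sit exactly 0x20 above their capital counterparts.
    if 0x61 <= n <= 0x7A or (0xE0 <= n <= 0xFE and n != 0xF7):
        return n - 0x20
    return SPECIAL_UPPER.get(n, n)


def python_upper_byte(n: int) -> int:
    """Upper-case the same code via Python's Unicode tables, staying in cp1252."""
    try:
        up = bytes([n]).decode("cp1252").upper().encode("cp1252")
    except UnicodeError:
        # Undefined byte, or the upper-case form does not exist in cp1252.
        return n
    # Multi-character results (such as "SS" for ß) are left unchanged.
    return up[0] if len(up) == 1 else n


# List every character code where the two approaches disagree.
mismatches = [hex(n) for n in range(256) if to_upper_byte(n) != python_upper_byte(n)]
print(mismatches)
```

An empty list means the hand-written table agrees with the Unicode mapping for every character that can be represented in codepage 1252.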

 

Re: How to do a case-insensitive string comparison?
OurGuest - Mon Dec 03 11:34:05 EST 2012

int cistrcmp(string s1,string s2)

Re: How to do a case-insensitive string comparison?
hell_se - Tue Dec 04 02:09:34 EST 2012

OurGuest - Mon Dec 03 11:34:05 EST 2012
int cistrcmp(string s1,string s2)

print cistrcmp ("qwérty", "QWÉRTY") // prints 1...

Re: How to do a case-insensitive string comparison?
OurGuest - Tue Dec 04 08:01:20 EST 2012

hell_se - Tue Dec 04 02:09:34 EST 2012
print cistrcmp ("qwérty", "QWÉRTY") // prints 1...

Which is what one would expect: there is no reason to expect that é is the lower case of É.

Re: How to do a case-insensitive string comparison?
Mathias Mamsch - Tue Dec 04 09:40:48 EST 2012

OurGuest - Tue Dec 04 08:01:20 EST 2012
Which is what one would expect: there is no reason to expect that é is the lower case of É

The Unicode Consortium says otherwise: it draws a clear distinction between capital, small and title case letters. See also:

http://unicode.org/faq/casemap_charprop.html

Why would you think that there is NO reason to assume that the upper case of é is É? It seems pretty natural, doesn't it?

For example, when you type this in Python 2:

print u"abcdé".upper()

you get:

ABCDÉ

Of course, when you type print "abcdé".upper() (without marking the literal as a unicode string), the é will not be upper-cased, because Python 2 then treats it as a plain byte string. Since DOORS should naturally handle UTF-8 strings, I would conclude that this is indeed a bug.

Regards, Mathias

Mathias Mamsch, IT-QBase GmbH, Consultant for Requirement Engineering and D00RS
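
To make the Python side of this comparison concrete, here is a minimal Python 3 sketch of a case-insensitive comparison that also handles accented letters. It is only an illustration of the behaviour discussed above, not something that runs inside DOORS. The helper name equals_ignore_case is made up for this example, and normalizing to NFC before folding is an assumption about what a sensible comparison should do.

```python
import unicodedata


def equals_ignore_case(a: str, b: str) -> bool:
    """Return True if a and b are equal when case is ignored."""

    def fold(s: str) -> str:
        # Normalize first so that "e" + combining accent and the
        # precomposed "é" compare equal, then apply Unicode case folding.
        return unicodedata.normalize("NFC", s).casefold()

    return fold(a) == fold(b)


print(equals_ignore_case("Hello", "hEllo"))
print(equals_ignore_case("qwérty", "QWÉRTY"))
print(equals_ignore_case("Hello", "Help"))
```

casefold() is used instead of lower() because it is the method Python provides specifically for caseless matching. For the strings in this thread both give the same answer.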

 

Re: How to do a case-insensitive string comparison?
OurGuest - Tue Dec 04 10:11:59 EST 2012

If you want to make the comparison in Python, you are a shoo-in.
If you want to make the comparison in DOORS, you are a shoo-out.