How would I find the first alpha character in a string?

I don't use DXL much, so maybe my question has an obvious answer to some of you power users. I have a string field that is set to a portion of a log file line containing a username. For our users logging in from China, a variable number of unprintable characters precedes the username. I've been looking for something similar to "firstNonSpace" that, instead of returning the index of the first non-space character in the buffer, would return the index of the first alpha character. If no such function exists, which seems to be the case, could someone suggest an elegant way of doing the same in code? Thanks!
Doug_Wilson - Fri Oct 26 10:21:09 EDT 2012

Re: How would I find the first alpha character in a string?
llandale - Mon Oct 29 11:52:04 EDT 2012

The following is technically what you asked for:

int  GetFirstAlpha(string in_String)
{    // find the first Alpha character in the string.  Return -1 if there are none
 
    int  i
    char c
    for (i=0; i<length(in_String); i++)
    {  c = in_String[i]
       if (isalpha(c)) then return(i)
    }
    return(-1)   // Not found at all?
}   // end GetFirstAlpha()

There are several such "ischaracter" functions, here are my notes:

bool function(char)   // these functions all look like this.
    isalpha     'a' - 'z' 'A' - 'Z'
    isupper     'A' - 'Z'
    islower     'a' - 'z'
    isdigit     '0' - '9'
    isxdigit    '0' - '9' 'a' - 'f' 'A' - 'F'
    isalnum     'a' - 'z' 'A' - 'Z' '0' - '9'
    isspace     ' ' '\t' '\n' '\v' '\f' '\r'      // manual is wrong, \m\j\k don't exist.
    ispunct     any character except <space> and alphanumeric characters   // Not space, not isalnum, not iscntrl
    isprint     a printing character              // i.e. != iscntrl, includes \t (#9)
    iscntrl     between 0 and 31, and code 127
    isascii     between 0 and 127
    isgraph     any visible character             // i.e. =isprint, except space

isascii() above has some potential.
I suggest you run the following on one of your Chinese names:

void PrintCharacters(string in_String)
{  
   int i
   char c
   print "<" in_String ">\n"
   for (i=0; i<length(in_String); i++)
   {  c = in_String[i]
      print "\t" (intOf(c)) "\t[" c "]\n"
   }
}  // end PrintCharacters()

Take note of the codes printed for those odd characters.

-Louie

 

Re: How would I find the first alpha character in a string?
Doug_Wilson - Mon Oct 29 17:46:55 EDT 2012

Louie,
Thank you for your reply. The list of "ischaracter" functions is good to know! I used your PrintCharacters function to see the codes of the unreadable characters preceding the user name. All of them are in the extended ASCII table.

I haven't yet gotten the GetFirstAlpha function to work. I added it to my script exactly as you wrote it, and here is how I'm calling the function:

if (GetFirstAlpha(strLogMessage)) > 0

where strLogMessage is the string I want to parse. I check for a return value of > 0 meaning success in finding an alpha character. Is this a valid way to call the function?

Thanks,
Doug

Re: How would I find the first alpha character in a string?
llandale - Wed Oct 31 11:46:44 EDT 2012

Parenthesis problem. Try:
  • if (GetFirstAlpha(strLogMessage) > 0) then print "1st character not alpha \n"
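
For reference, here is a minimal sketch of how the return value might be checked. GetFirstAlpha is repeated from earlier in the thread so the snippet stands alone; the sample string and the names pos and strUserName are made up for illustration. Note that 0 is a valid hit (the first character is alpha), so only -1 means nothing was found.

```
// GetFirstAlpha() as posted earlier in the thread.
int GetFirstAlpha(string in_String)
{
    int  i
    char c
    for (i = 0; i < length(in_String); i++)
    {
        c = in_String[i]
        if (isalpha(c)) then return(i)
    }
    return(-1)
}

// Hypothetical sample text; a real log line would be read from the file.
string strLogMessage = "  Doug_Wilson logged in"
int    pos           = GetFirstAlpha(strLogMessage)

if (pos < 0)
{
    // -1 is the only "not found" value
    print "no alpha character found\n"
}
else
{
    // pos == 0 means the string already starts with a letter
    string strUserName = strLogMessage[pos:]
    print "first alpha at index " pos ": [" strUserName "]\n"
}
```

Testing for > 0 instead would treat a string that already starts with a letter as a failure.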

Re: How would I find the first alpha character in a string?
llandale - Wed Oct 31 11:48:15 EDT 2012

  • if (GetFirstAlpha(strLogMessage) > 0) then print "1st character not alpha [" strLogMessage "]\n"

Re: How would I find the first alpha character in a string?
Doug_Wilson - Fri Nov 02 14:40:27 EDT 2012

I could not get the GetFirstAlpha function to work. My debugging showed that "isalpha" has the undesired feature of returning true for the unprintable characters I was trying to skip. I ended up using a regular expression of the form:

Regexp RegexpFoundAlpha = regexp "[a-zA-Z]"

I used the regular expression like this:

while (!null LoginMessage && RegexpFoundAlpha LoginMessage) {
    nameStart = start 0
    break
}
LoginUserName = LoginMessage[nameStart:]

It may not be very elegant, but it works. Louie, thanks for your help.
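
Since the loop body runs at most once, the same idea can probably be written with a plain if. The following is only a sketch under that assumption: the sample string and the variable name reFirstAlpha are invented for illustration, and the real log line would carry the stray bytes in front of the name.

```
// Sketch: find the first ASCII letter with a regular expression.
Regexp reFirstAlpha = regexp "[a-zA-Z]"

// Hypothetical input; in the real script this comes from the log file.
string LoginMessage  = "  Doug_Wilson"
string LoginUserName = ""

if (!null LoginMessage && reFirstAlpha LoginMessage)
{
    // start 0 is the offset at which the whole match begins
    int nameStart = start 0
    LoginUserName = LoginMessage[nameStart:]
}

print "[" LoginUserName "]\n"
```

If the pattern does not match, LoginUserName simply stays empty, which the caller can test for.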

Re: How would I find the first alpha character in a string?
llandale - Fri Nov 02 16:45:49 EDT 2012

Oops. I now recall a nagging memory that these functions roll over for bytes larger than 127, so byte 129 is treated the same as 1. Try this:
  • if (isascii(c) and isalpha(c)) then return(i)

I suggest that only because I really don't like regular expressions, though I've forgotten why.

-Louie
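
Putting the isascii() guard into the original function might look like the sketch below. The name GetFirstAsciiAlpha and the sample string are invented here, and the behaviour of the character-class functions for bytes above 127 is as reported in this thread, not independently verified.

```
// Sketch: index of the first ASCII letter in a string, or -1 if there is none.
int GetFirstAsciiAlpha(string in_String)
{
    int  i
    char c
    for (i = 0; i < length(in_String); i++)
    {
        c = in_String[i]
        // isascii() screens out bytes above 127 before isalpha() sees them
        if (isascii(c) && isalpha(c)) then return(i)
    }
    return(-1)   // no ASCII letter anywhere in the string
}

// Hypothetical sample; tabs and a space stand in for the unwanted prefix.
string sample = "\t\t Doug_Wilson"
int    idx    = GetFirstAsciiAlpha(sample)

if (idx >= 0)
{
    string userName = sample[idx:]
    print "user name: [" userName "]\n"
}
```

Running PrintCharacters on a real log line first would confirm whether the leading bytes are in fact above 127.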