Exporting to CSV and retain integrity of commas and quotes

I used to put "\"" before and after a string when streaming out data to a csv file. This retained all the commas in the text I was streaming out (output << "," "\"" tempStringOf b "\"" ",") . But when the customer has a bunch of quotes within the text, the streaming breaks-up the text into multiple columns instead of one column. For example:

Now, is, the, time "for" all good "men, to come," to the "aid", of their nation.

becomes:
_________________________________________________________________________________
Now, is, the, time for" all good "men to come to the aid" of their nation."


Is there a way to put delimiters before and after the stream to make sure it exports as a single string?

Thanks in advance!

Pete
pete.giorgianni - Wed Jun 22 14:14:14 EDT 2011

Re: Exporting to CSV and retain integrity of commas and quotes
llandale - Wed Jun 22 16:09:25 EDT 2011

No, you need to escape the embedded double quotes by turning them into two consecutive double quotes:

Now, is, the, time ""for"" all good ""men, to come,"" to the ""aid"", of their nation.

You also, of course, need to replace all the TAB and the various EOL characters in the text.

The original text needs to be raw without rich text formatting.

Its all quite messy.

I gave up on the notion of whether to surround cells with double quotes, when you really do not "need" to when the text contains no double quotes nor commas.

  • Louie

Re: Exporting to CSV and retain integrity of commas and quotes
llandale - Wed Jun 22 16:34:50 EDT 2011

OK, I stripped apart the code that I have that does this and here you go. You will need to tweak it a bit to make it work.

const string    cl_EOLs         = "\n\v\r"            // Any End of line Character
const string    cl_EOLsTAB      = "\n\v\r\t"  // EOL or TAB
 
 
Buffer  gl_fReplace_ResultsBuff         = create(1024), // Temp results                 -
        gl_fReplace_ToStngBuff          = create(1024)  // Houses ToString, allows quick Combine
 
//**********************
bool    fIsEmptyBuf(Buffer in_buf)
{     // Is the input buffer Empty or null.
        // That is, will retrieving the string value from the buffer be null?
        // Note that retrieving the value to see if its null would be wasteful,
        //      and checking the 'length' is misleading, in case the user
        //      has inserted a null character inside the buffer.
        return(null in_buf              or 
                 length(in_buf) == 0    or
                 null in_buf[0])
}     // end fIsEmptyBuf()
 
//***************************
void    fReplaceCharacters(Buffer &in_buf, string in_Characters, in_ReplaceString)
{     // Replace each character found in the input buffer, with the Replace String.
        if (null in_buf         or 
            fIsEmptyBuf(in_buf) or 
            null in_Characters) return  // Nothing to do
 
        int     i, j
        char    c
        bool    Found
 
        gl_fReplace_ToStngBuff  = in_ReplaceString      // Combining Buffers is more efficient than appending Strings.
        gl_fReplace_ResultsBuff = ""
 
        for (i=0; i<length(in_buf); i++)
        {  c  = in_buf[i]
           Found        = false
           for (j=0; j<length(in_Characters); j++)
           {  if (c == in_Characters[j])
              {  Found        = true
                 break
              }
           }  // end for each in_Characters
           if (Found)   //-
           then combine(gl_fReplace_ResultsBuff, gl_fReplace_ToStngBuff, 0)     // Replace the character
           else gl_fReplace_ResultsBuff += c                                            // Preserve the Character
        }     // end for every character in in_buf.
 
        in_buf  = ""                  // Replace the results
        combine(in_buf, gl_fReplace_ResultsBuff, 0)
 
}    // end fReplaceCharacters(Buffer)
 
Buffer  cl_bufCSVEntry = create()
 
//*******************
void    fCSVEntry(Buffer &inout_bufEntry, string &out_Comma)
{    // Turn the specified Entry into one suitable for CSV export.
        //   o Entries start and end with double quotes.
        //   o Entries are separated by commas.
        //   o Natural double quotes are converted to two double-quotes.
        //   o The user will want to manually end the line (the row) with a hard '\n'.
        // 'Entry' is equivalent to a cell in a column.
        // Checking to see which Entries must start/end with quotes doesn't seem
        //   realistic.  This function surrounds ALL non-null entries with quotes.
        // The Entry is preceded with the specified out_Comma.  After processing,
        //   out_Comma is set to an actual comma ",".  Users can set the variable
        //   null when starting a CSV line and let this function set it to a comma,
        //   which will actually separate the strings with commas.
        //   (the first entry (cell in the row) doesn't start with a comma)
        // EOLs and TABs are converted to spaces.  Scripts that find that unacceptable
        //      should deal with them first, perhaps via fReplaceEOLTAB()
        // Example.  The following code in a calling function will work:
        //   string out_Comma = ""
        //   string InfoMod =   fConCat(fCSVEntry(mod.name "", out_Comma),
        //                                        fCSVEntry(mod.description "", out_Comma), "\n")
 
        Buffer  bufTemp
 
        fReplaceCharacters(inout_bufEntry, cl_EOLsTAB, " ")   // Replace to spaces
 
        if (fIsEmptyBuf(inout_bufEntry))
        {  inout_bufEntry = out_Comma
        }
        else
        {  fReplaceCharacters(inout_bufEntry, "\"", "\"\"")       // One double-quote to two-double-quotes
 
           bufTemp      = create()
           bufTemp += out_Comma         // Optional comma
           bufTemp += "\""               // open quotes
           combine(bufTemp, inout_bufEntry, 0)  // Entry
           bufTemp += "\""               // close quotes
                                // Next two lines assign output parameter
           delete(inout_bufEntry)
           inout_bufEntry       = bufTemp               // This make Alias, effectively setting output
        }
 
        out_Comma = ","                               // Get ready for next call to this function
}    // end fCSVEntry(buffer)
 
//---------------------
string  fCSVEntry(string Entry, &out_Comma)
{    // Overloaded, accept and return a string
        cl_bufCSVEntry = Entry
        fCSVEntry(cl_bufCSVEntry, out_Comma)
        return(stringOf(cl_bufCSVEntry))
}    // end overloaded fCSVEntry(string)
 
 
 
Example of creating a line intended for CSV export:
You don't need a "comma" before the first CSV cell, but indeed do
before each of the next ones.  fCSVEntry() turns variable Comma into a comma
in preparation for the next cell.
 
 
           Line         = "tab\t"delimited,","\tline"
           Comma        = ""
           LenLine      = sizeof(Line)
           ThisTAB      = -1
           MoreEntries  = true
           CountEntries = 0
           while(MoreEntries)
           {  NextTAB = contains(Line, '\t', ThisTAB+1)
// set(dbeStatus, "#entri = " (CountEntries++) "  ")
              if (NextTAB < 0)
              {  MoreEntries = false
                 NextTAB        = LenLine
              }
              Entry     = Line[ThisTAB+1:NextTAB-1]
              Results   += fCSVEntry(Entry, Comma)
              ThisTAB   = NextTAB
           } // end while there are more entries to process in this Line
           Results      += "\n"       // Add EOL to the end of this line
 
           ThisEOL      = NextEOL


Now there are folks chuckling at the above, and I invite one to post something better.

 

 

  • Louie

 

 

Re: Exporting to CSV and retain integrity of commas and quotes
Mathias Mamsch - Thu Jun 23 13:05:11 EDT 2011

If you want to export to Excel you can find additional information in this post:

https://www.ibm.com/developerworks/forums/thread.jspa?threadID=373788&tstart=0

You can find a function for escaping strings in there, as well as a solution to a common problem with non-ascii characters.

Regards, Mathias

Mathias Mamsch, IT-QBase GmbH, Consultant for Requirement Engineering and D00RS

Re: Exporting to CSV and retain integrity of commas and quotes
SystemAdmin - Mon Apr 01 08:13:28 EDT 2013

Hi,
I have following dxl code for get modlues from folder and export some attributes into csv.
Attributes are export but format is not good cause of Commas and double quotes.
I want to skips commas and double quotes in object heading and object text attributes and i am not able to use this code.
Could you please tell me how can i do because i am begginer in dxl scripting.

void processModule(string mName)
{
Module m = current
//Module m = null

m = read(mName, false)

if (null m)
{
print("ERROR opening module " mName "\n")
return
}

print(mName " ok\n")
//processExport(m)
//*******
Object o
Stream outfile = write("D:\\Test\\"name(m)".csv")
string str1=null
string str2=null
string str3=null
string str4=null

for o in m do
{
int str5= o."Absolute Number"
str1= o."Object Heading"
str2= o."Object Text"
str3= o."PR_BR213-BCF-HELLA_Supplier_Responsible"
str4= o."PR_BR213-BCF-HELLA_Supplier_status"

if(str5 > 0 && str1 !="")
{

outfile <<str5"," str1","str3","str4"\n"
}
elseif(str5 > 0 && str2 !="")
{
outfile <<str5"," str2","str3","str4"\n"

}

}

close outfile
close(m)
}

void processFolder(Folder f)
{
Item i

for i in f do
{
if (isDeleted(i)) continue

if (type(i) == "Folder")
{
processFolder(folder(fullName(i)))
}
else if (type(i) == "Formal")
{
processModule(fullName(i))
}
}
}

Folder f = folder "/STAR2_BC_X_Gen2/00 Customer Requirements/Software Requirement Specification"
processFolder(Folder f)
Attachments

attachment_14971829_CountReq.dxl