Skip to main content
Participating Frequently
August 26, 2008
Question

Commas in List Data

  • August 26, 2008
  • 6 replies
  • 984 views
I have a form that uploads a CSV file and puts it into a DB. I am trying to make it as "idiot-proof" as I can with as little responsibility on the enduser as possible to provide "proper format" data. Is there a way to go about escaping certain commas in the text (and distinguishng them from the true delimiter)? I understand that putting the value in quotes works, but that again relies on the enduser doing so. Is there a way to do this upon upload or does the end user have no choice but to clean the file up first.

TIA
    This topic has been closed for replies.

    6 replies

    Inspiring
    August 27, 2008
    Just out of curiousity, other than solvable programming problems, is there anything else wrong with a comma in text data?
    davidsimms
    Inspiring
    August 27, 2008
    VBall:

    I do this exact thing all the time. Let me offer the following not so much as a direct answer to your query (since Adam and Dan have already done that), but rather, as advice about handling the user experience end of this. When I started it, I felt pretty sure asking end user's to submit data in such a strict format wouldn't work, but I gave it a shot anyway. Four years later I've been pleasantly stunned at how well it has worked.

    A HUGE part of the reason for this success is that I make extensive use of some VERY robust regular expressions to validate, validate, validate that which is submitted. When invalid data is detected, I don't simply reject the submission. Instead, I write to an array of structures three things:

    1. The line number on which the invalid data was found.
    2. The invalid data itself.
    3. The reason the data failed validation.

    If invalid data is detected, I output that array to the user so they know exactly what they need to do to correct the data and try again.

    David
    Mr_VballAuthor
    Participating Frequently
    August 27, 2008
    Thanks all for the input - The file data is line by line based so I don't need to reject the whole file if just 1 line is bad. I like the idea of actually spitting the bad data (and why) back to the user. This way if it happens enough they understand and can fix the problem and resubmit.
    Inspiring
    August 27, 2008
    > My suggestion is a variation of Adam's. It goes something like this.
    >
    > 1. read the file and do what you have to do to deal with empty elements.
    > 2. if the list length is the expected value, add the database record.
    > otherwise copy that line to a file somewhere.
    > 3. do something with the file of rejected records. This will involve a person
    > reading the file and changing the delimiters to preserve the commas.
    > 4. process the modified file with the new delimiters.

    Yep, I'd go along with this *on the proviso* that it's valid to process
    some lines and not others, as opposed to the whole file needing to be
    valid.

    --
    Adam
    Inspiring
    August 27, 2008
    My suggestion is a variation of Adam's. It goes something like this.

    1. read the file and do what you have to do to deal with empty elements.
    2. if the list length is the expected value, add the database record. otherwise copy that line to a file somewhere.
    3. do something with the file of rejected records. This will involve a person reading the file and changing the delimiters to preserve the commas.
    4. process the modified file with the new delimiters.
    Inspiring
    August 27, 2008
    > I have a form that uploads a CSV file and puts it into a DB. I am trying to
    > make it as "idiot-proof" as I can with as little responsibility on the enduser
    > as possible to provide "proper format" data. Is there a way to go about
    > escaping certain commas in the text (and distinguishng them from the true
    > delimiter)?

    In short: no. How can CF determine whether a comma is a delimiter or
    whetehr it's content? How would you expect that to work?


    > I understand that putting the value in quotes works, but that again
    > relies on the enduser doing so. Is there a way to do this upon upload or does
    > the end user have no choice but to clean the file up first.

    "Garbage in: garbage out". Don't try to process something that is not
    appropriate for the task. Upload the file, parse it, validating it as
    proper CSV data (there's no official spec, but this RFC is useful,
    http://rfc.net/rfc4180.html, and you can point your client to it). If the
    file validates: process it. If it doesn't, reject it.

    --
    Adam
    Participating Frequently
    August 26, 2008
    Instead of creating a strict CSV, could you instead use a different less commonly used character as your delimiter?

    Phil
    Mr_VballAuthor
    Participating Frequently
    August 27, 2008
    quote:

    Originally posted by: paross1
    Instead of creating a strict CSV, could you instead use a different less commonly used character as your delimiter?

    Phil


    I'm not the one that creates the CSV - the clients do. Then they upload it. I can tell clients don't use commas or if they must, then quote it. We all know how well that goes.....:)

    Obviously there's no guarentee that the data will be clean so I'm just trying to cover as many bases as I can.