Parsing a tab delimited file

Report · Apr 21, 2008

I have a tab-delimited text file that I'm trying to parse. My question is: how do you find the end of a row if it's tab delimited? Thanks

Report · Apr 21, 2008

Treat the tab-delimited text as a list with the newline character as the delimiter. See the CHR function.
http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=functions_c-d_04.html#4749523

Report · Apr 21, 2008

Bob,

Can you be more specific? How is the newline character used as a delimiter?

Also, another issue I forgot to mention: what do you do when a field is empty in a tab-delimited file? If a field is empty it messes up my parsing sequence and the fields are mismatched...

Report · Apr 22, 2008

You have a list within a list. The outer list is newline (Chr(10)) delimited; the inner list is tab (Chr(9)) delimited. So to get a row, use <CFSET foo = ListFirst(bar, Chr(10))>

Once you have the row in foo, you must deal with the empty list elements. You can accomplish that by finding consecutive tab characters and inserting a space between them.
Something like <CFSET foo = Replace(foo, "#Chr(9)##Chr(9)#", "#Chr(9)# #Chr(9)#", "ALL")>. You will want to do this twice to account for multiple empty list elements next to each other.

Now you have a tab-delimited list that will work for you.
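Putting the steps above together, a minimal sketch might look like the following. It assumes the whole file is already in a string called bar (here a made-up two-row sample) and loops over every row with cfloop rather than pulling only the first one; the carriage-return cleanup is an added assumption for files with Windows line endings.

```cfml
<!--- Hypothetical sample: two rows, the first with an empty middle field. --->
<cfset bar = "a#Chr(9)##Chr(9)#c#Chr(10)#d#Chr(9)#e#Chr(9)#f">

<cfoutput>
<cfloop list="#bar#" index="foo" delimiters="#Chr(10)#">
    <!--- Drop any carriage return left by Windows (CRLF) line endings. --->
    <cfset foo = Replace(foo, Chr(13), "", "ALL")>
    <!--- Pad empty fields; the second pass catches runs of three or more tabs. --->
    <cfset foo = Replace(foo, "#Chr(9)##Chr(9)#", "#Chr(9)# #Chr(9)#", "ALL")>
    <cfset foo = Replace(foo, "#Chr(9)##Chr(9)#", "#Chr(9)# #Chr(9)#", "ALL")>
    <!--- Trim turns a padded field (a single space) back into an empty string. --->
    Fields: #ListLen(foo, Chr(9))#, second field: [#Trim(ListGetAt(foo, 2, Chr(9)))#]<br>
</cfloop>
</cfoutput>
```

Note that this sketch only pads empty fields in the middle of a row; a row that begins or ends with a tab would still lose that first or last empty field.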

Report · Apr 22, 2008

quote:

Originally posted by: jdeline
Once you have the row in foo, you must deal with the empty list elements. You can accomplish that by finding consecutive tab characters and inserting a space between them.
Something like <CFSET foo = Replace(foo, "#Chr(9)##Chr(9)#", "#Chr(9)# #Chr(9)#", "ALL")>. You will want to do this twice to account for multiple empty list elements next to each other.

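I use a while loop (in cfscript) to do this sort of thing.

// keep padding until no two tabs are adjacent
while (find(chr(9) & chr(9), foo) gt 0) {
    foo = replace(foo, chr(9) & chr(9), chr(9) & " " & chr(9), "all");
}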
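As an alternative to padding, and assuming ColdFusion 8 or later, ListToArray accepts an optional third argument (includeEmptyFields) that keeps empty elements. This is a different technique from the Replace approach quoted above, and the sample row is made up for illustration.

```cfml
<!--- Hypothetical sample row: the second and third fields are empty. --->
<cfset foo = "a#Chr(9)##Chr(9)##Chr(9)#d">

<!--- Assumes ColdFusion 8+: the third argument (includeEmptyFields) keeps empty elements instead of skipping them. --->
<cfset fields = ListToArray(foo, Chr(9), true)>

<cfoutput>
<cfloop from="1" to="#ArrayLen(fields)#" index="i">
    Field #i#: [#fields[i]#]<br>
</cfloop>
</cfoutput>
```

With the empty fields preserved in the array, positions stay aligned from row to row and no padding or trimming is needed.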

Report · Apr 22, 2008

See attached sample

Report · Apr 22, 2008

Thank you for the detailed answer and code; however, the code you provided is not working on the tab-delimited file I have. I used your example line for line. First, your specified loop delimiter "Chr(13)" does NOT parse the tabs; however, when I replaced delimiters="#Chr(13)#" with delimiters="(six spaces)" in the cfloop, it worked. I don't know why.

Second, I get the error "The element at position 2 cannot be found." on #variables.lineSet[2]#, which tells us we're not identifying the end of line either. I've included a chunk of the actual file I'm trying to parse for your reference. The actual 7 MB file can be downloaded here: Tab delimited file

Again, thank you.

Report · Apr 22, 2008

quote:

Originally posted by: LionelR
Thank you for the detailed answer and code; however, the code you provided is not working on the tab-delimited file I have. I used your example line for line. First, your specified loop delimiter "Chr(13)" does NOT parse the tabs; however, when I replaced delimiters="#Chr(13)#" with delimiters="(six spaces)" in the cfloop, it worked. I don't know why.

Second, I get the error "The element at position 2 cannot be found." on #variables.lineSet[2]#, which tells us we're not identifying the end of line either. I've included a chunk of the actual file I'm trying to parse for your reference. Again, thank you.

Chr(13) is intended to match the carriage return at the end of a line, not tabs within a line. You would use Chr(9) to match tabs. Note that some text files use Chr(10), linefeed, for the end of lines. If six spaces works for you as a delimiter, I suspect your data is not delimited by tabs.
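If you are not sure which line ending a file uses, one hedged way around it is to pass both characters as delimiters. ColdFusion list functions treat each character in the delimiter string as a separate delimiter and skip empty elements, so CR, LF, and CRLF should all end a row. The variable name fileContent and the sample data below are assumptions for illustration.

```cfml
<!--- Hypothetical sample with mixed line endings: CRLF after row 1, LF after row 2. --->
<cfset fileContent = "a#Chr(9)#b#Chr(13)##Chr(10)#c#Chr(9)#d#Chr(10)#e#Chr(9)#f">

<!--- Chr(13) and Chr(10) each act as a row delimiter; the empty element between CR and LF is skipped. --->
<cfset rows = ListToArray(fileContent, Chr(13) & Chr(10))>

<cfoutput>
<cfloop from="1" to="#ArrayLen(rows)#" index="i">
    <!--- Chr(9) splits the fields within each row. --->
    Row #i# has #ListLen(rows[i], Chr(9))# fields<br>
</cfloop>
</cfoutput>
```

If every row reports a single field, the data is probably not tab delimited at all, which matches the suspicion above.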

Report · Apr 22, 2008

Bob, can you take a look at the actual file (I've included a link to it in my previous message) and tell me if it's even possible to parse this file? Thanks

Report · Apr 22, 2008

quote:

Originally posted by: LionelR
Bob, can you take a look at the actual file (I've included a link to it in my previous message) and tell me if it's even possible to parse this file? Thanks

Browsing URL http://www.wireworks.net/cashout.dbf returns HTTP Error 404.3 - File or directory not found: MIME map policy prevents this request.

You might try posting a zipped copy of the file or a copy with a different file extension. Also please verify cashout.dbf is a text file and not a database file.

Report · Apr 22, 2008

Sorry, try http://www.wireworks.net/cashout.zip

Report · Apr 22, 2008

quote:

Originally posted by: LionelR
Sorry, try http://www.wireworks.net/cashout.zip

The content of the zip file is not a text file. If you are trying to query a *.dbf file you might see if this file format supports JDBC or ODBC and use regular cfquery tags.

Report · Apr 22, 2008

Unfortunately, our ISP does not support .dbf databases; that's why I'm trying to parse the file into a query. So there's no way to parse this file, correct? Thanks again.

Report · Apr 22, 2008

You might try using the Access driver to connect to the file.
