Parsing a tab delimited file

New Here,
Apr 21, 2008
I have a tab-delimited text file that I'm trying to parse. How do you find the end of a row if the file is tab-delimited? Thanks.
TOPICS
Getting started
Advisor,
Apr 21, 2008
Treat the tab-delimited text as a list with the newline character as the delimiter. See the Chr function.
http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=functions_c-d_04.html#4749523
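As a rough illustration of that suggestion, the sketch below loops over a newline-delimited list. The variable name fileContent and the sample data are made up for this example; in practice the text would come from reading the file.

```cfm
<!--- Hypothetical sample standing in for the file contents; in practice
      the text would be read from disk into fileContent. --->
<cfset fileContent = "id#Chr(9)#name#Chr(10)#1#Chr(9)#Ann#Chr(10)#2#Chr(9)#Bob">

<!--- Outer list: one element per line, delimited by linefeed, Chr(10). --->
<cfloop list="#fileContent#" index="row" delimiters="#Chr(10)#">
    <!--- Inner list: fields within the row, delimited by tab, Chr(9). --->
    <cfoutput>#ListFirst(row, Chr(9))# | #ListLast(row, Chr(9))#<br></cfoutput>
</cfloop>
```

Each pass through the loop receives one line in row, so the end of a row is simply wherever the next linefeed occurs.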
New Here,
Apr 21, 2008
Bob,

Can you be more specific? How is the newline character used as a delimiter?

Another issue I forgot to mention: what do you do when a field is empty in a tab-delimited file? An empty field throws off my parsing sequence and the fields end up mismatched.
Guest
Apr 22, 2008
You have a list within a list. The outer list is newline (Chr(10)) delimited; the inner list is tab-delimited. So to get a row, use <CFSET foo = ListFirst(bar, Chr(10))>.

Once you have the row in foo, you must deal with the empty list elements, because ColdFusion list functions ignore them. You can accomplish that by finding consecutive tab characters and inserting a space between them.
Something like <CFSET foo = Replace(foo, "#Chr(9)##Chr(9)#", "#Chr(9)# #Chr(9)#", "ALL")>. You will want to do this twice, because a single pass misses overlapping matches when several empty elements sit next to each other.

Now you have a tab-delimited list that will work for you.
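An alternative to padding the empty fields, not mentioned in the post above, is to split the row into an array while keeping empty elements. This is only a sketch: it assumes ColdFusion 8 or later, where ListToArray accepts a third includeEmptyFields argument, and the sample row is made up.

```cfm
<cfscript>
// Hypothetical row: "A", two empty fields, then "D".
row = "A" & Chr(9) & Chr(9) & Chr(9) & "D";

// Default list handling skips empty elements, so only A and D are counted.
WriteOutput(ListLen(row, Chr(9)) & "<br>");   // 2

// Assumption: the third argument (includeEmptyFields) is available (CF8+).
fields = ListToArray(row, Chr(9), true);
WriteOutput(ArrayLen(fields) & "<br>");       // 4

// fields[2] and fields[3] are empty strings, so column positions stay aligned.
</cfscript>
```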
LEGEND,
Apr 22, 2008
quote:

Originally posted by: jdeline
Once you have the row in foo, you must deal with the empty list elements. You can accomplish that by finding consecutive tab characters and inserting a space between them.
Something like <CFSET foo = Replace(foo, "#Chr(9)##Chr(9)#", "#Chr(9)# #Chr(9)#", "ALL")>. You will want to do this twice to account for multiple empty list elements next to each other.


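I use a while loop to do this sort of thing, repeating the replace until no consecutive tabs remain (inside cfscript):

while (Find(Chr(9) & Chr(9), foo) gt 0) {
    foo = Replace(foo, Chr(9) & Chr(9), Chr(9) & " " & Chr(9), "ALL");
}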
Advisor,
Apr 22, 2008
See attached sample
New Here,
Apr 22, 2008
Thank you for the detailed answer and code; however, the code you provided is not working on the tab-delimited file I have. I used your example line for line. First, your specified loop delimiter "Chr(13)" does NOT parse the tabs. However, when I replaced delimiters="#Chr(13)#" with delimiters="(six spaces)" in the cfloop, it worked. I don't know why.

Second, I get the error "The element at position 2 cannot be found." for #variables.lineSet[2]#, which tells us we're not identifying the end of the line either. I've included a chunk of the actual file I'm trying to parse for your reference. The actual 7 MB file can be downloaded here: Tab delimited file

Again, thank you.

Advisor,
Apr 22, 2008
quote:

Originally posted by: LionelR
Thank you for the detailed answer and code however the code you provided is not working on the tab delimited file I have. I used your example line for line. First your specified loop delimiter "Chr(13)" does NOT parse the tabs, however when I replaced delimiters="#Chr(13)#" with delimiters "(six spaces)" in the cfloop, it worked. I don't know why.

Second, I get an error "The element at position 2 cannot be found." #variables.lineSet[2]# which tells us we're not identifying the end of line either. I've included a chunk of the actual file I'm trying to parse for your reference. Again, thank you.




Chr(13) is intended to match the carriage return at the end of a line, not tabs within a line. You would use Chr(9) to match tabs. Note that some text files use Chr(10), linefeed, for the end of lines. If six spaces work for you as a delimiter, I suspect your data is not delimited by tabs.
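One way to sidestep the Chr(13) versus Chr(10) question is to normalize line endings before splitting, and to check whether the data contains any tabs at all. The following is a sketch only; the sample string and variable names are hypothetical.

```cfm
<cfscript>
// Hypothetical sample mixing CRLF and lone CR line endings.
fileContent = "a" & Chr(9) & "b" & Chr(13) & Chr(10)
            & "c" & Chr(9) & "d" & Chr(13)
            & "e";

// Convert CRLF to LF first, then any remaining lone CR to LF.
normalized = Replace(fileContent, Chr(13) & Chr(10), Chr(10), "ALL");
normalized = Replace(normalized, Chr(13), Chr(10), "ALL");

// Every row now ends at a linefeed.
rows = ListToArray(normalized, Chr(10));
WriteOutput(ArrayLen(rows) & "<br>");   // 3

// Find returns 0 when there is no tab, which would suggest the file
// is not tab-delimited in the first place.
hasTabs = Find(Chr(9), fileContent) gt 0;
WriteOutput(hasTabs);
</cfscript>
```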

New Here,
Apr 22, 2008
Bob, can you take a look at the actual file (I've included a link to it in my previous message) and tell me if it's even possible to parse this file? Thanks.
Advisor,
Apr 22, 2008
quote:

Originally posted by: LionelR
Bob, Can you take a look at the actual file, (I've included a link to it in my previous message) and tell me if its even possible to parse this file. Thanks


Browsing URL http://www.wireworks.net/cashout.dbf returns HTTP Error 404.3 - File or directory not found: MIME map policy prevents this request.

You might try posting a zipped copy of the file or a copy with a different file extension. Also please verify cashout.dbf is a text file and not a database file.
New Here,
Apr 22, 2008
Sorry, try http://www.wireworks.net/cashout.zip
Advisor,
Apr 22, 2008
quote:

Originally posted by: LionelR
Sorry, try http://www.wireworks.net/cashout.zip


The content of the zip file is not a text file. If you are trying to query a *.dbf file you might see if this file format supports JDBC or ODBC and use regular cfquery tags.
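If a driver for the format is available, the query itself could look like the sketch below. The data source name cashoutDbf is hypothetical and would have to be configured first to point at the folder holding cashout.dbf; the table name is assumed to be the file name without its extension, which is how dBASE-style drivers typically expose files.

```cfm
<!--- Assumption: "cashoutDbf" is a hypothetical, already configured data
      source whose driver can read dBASE files. --->
<cfquery name="cashout" datasource="cashoutDbf">
    SELECT * FROM cashout
</cfquery>

<!--- Inspect the returned columns and rows. --->
<cfdump var="#cashout#">
```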
New Here,
Apr 22, 2008
Unfortunately, our ISP does not support .dbf databases; that's why I'm trying to parse the file into a query. So there's no way to parse this file, correct? Thanks again.
Advisor,
Apr 22, 2008
You might try using the Access driver to connect to the file.