
CFHTTP parsing troubles, plz help

New Here, Oct 19, 2009

My client is a retailer and publishes on his website the items & prices exactly as they appear on the supplier's website.

The problem occurred when the wholesaler recently changed the structure of their webpages, rendering the parsing code obsolete.

Now I'm coming into the picture, trying to rewrite everything, but I'm hitting some snags that I can't resolve.

Below I post part of the code, but first I will briefly explain what goes on.

On the supplier's website there is a product table with multiple <tr> rows, and 6 <td> columns for every line item.

My client, however, publishes only the contents of the first and third columns and skips everything else.

For that, I need to loop once through the table to determine how many rows it has. Then I need a nested loop to pick out and extract the contents of the table cells.

Here I first strip out the extra characters and markup:

<cfhttp url="#trim(show_url.address)#?#show_tour.sport#" method="GET"></cfhttp>
<cfset Gfile =REplacenocase(#CFHTTP.FileContent#,'<tr class="results1">', '|', 'ALL')>
<cfset gfile = replacenocase(gfile,'<tr class="results2">', '|', 'ALL')>

<cfif Gfile DOES NOT CONTAIN "There are currently no dates/rates for this tour"
AND Gfile DOES NOT CONTAIN "Searched tour is not available">

<cfset gfile =listdeleteat(gfile,1, "|")>

<cfset gfile = replacenocase(gfile,"<tr>", "|", "ALL")>
<cfset gfile = replacenocase(gfile,"</tr>", "", "ALL")>
<cfset gfile = replacenocase(gfile,"<td  >", "~", "ALL")>
<cfset gfile = replacenocase(gfile,"<TD class=", "~", "ALL")>
<cfset gfile = replacenocase(gfile,"</TD>", "", "ALL")>
<cfset gfile = replacenocase(gfile,"</TABLE>", "", "ALL")>
<cfset gfile = replacenocase(gfile,'<input type="button" name="btnResults" class="evSubmit" value="back to results"','')>

  The next step is to populate the prices into the input form fields on the admin page:

<cfset records = listlen(gfile,"|")> 
<cfif records GTE 1>

<cfloop index="rec" list="#gfile#" delimiters="|">
<cfset rec =listdeleteat(rec,3, "~")>

<cfset rec =replacenocase(rec, "'","","all")>
    <cfset rec =replacenocase(rec, "?","")>
    <cfset rec =replacenocase(rec, '"','','all')>
     <cfset rec =replacenocase(rec, "onclick=document.location=/tours/Search.aspxAffiliateCode=&TrackingCode=>","","all")>
  
  
    <cfoutput>
    <cfset rec = replacenocase(rec, "mon, ","")>
     <cfset rec = replacenocase(rec, "tue, ","")>
     <cfset rec = replacenocase(rec, "wed, ","")>
     <cfset rec = replacenocase(rec, "thu, ","")>
   <cfset rec = replacenocase(rec, "fri, ","")>
     <cfset rec = replacenocase(rec, "sat, ","")>
     <cfset rec = replacenocase(rec, "sun, ","")>
    <cfset rec = replacenocase(rec, "<p>","")>
 


    <cfset td1 = listdeleteat(rec,1, "~")>
    
    <cfset td2 = listdeleteat(rec,2, "~")> 

    <tr>

    <th> <B>#COUNTER#.</B>
  

  <cfloop index="td1" list="#trim(td1)#"   delimiters="~">
   <cfif isDate(trim(td1))><input type="Text" name="date#counter#" value="#dateFormat(td1,'dd/mmm/yyyy')#"></th><cfelse><th><input type="Text" name="price#counter#" value="#removeChars(trim(td1),1,1)#"></th> <th><select name="descript#counter#"><option value="1" SELECTED>Land Only<option value="2">Land and Air</select></th></tr></cfif>  
    </cfloop>

    <cfset counter = #counter# + 1>   
    <tr><th> <B>#COUNTER#.</B>
    <cfloop index="td2" list="#trim(td2)#"   delimiters="~">
   <cfif isDate(trim(td2))><input type="Text" name="date#counter#" value="#dateFormat(td2,'dd/mmm/yyyy')#"></th><cfelse><th><input type="Text" name="price#counter#" value="#removeChars(trim(td2),1,1)#"></th> <th><select name="descript#counter#"><option value="1">Land Only<option value="2" selected>Land and Air</select></th></tr></cfif>
  
    </cfloop>
    <cfset counter = #counter# + 1>    
  
    </cfoutput>
  
    </cfloop>
   </cfif>
  
   <cfset counter = #counter# - 1>    
   <cfoutput>
  
     <input type="hidden" name="PriceCount" value="#counter#">
     <input type="hidden" name="OldPriceCount" value="#show_prices.recordCount#">
     <input type="hidden" name="select_offer" value="#form.select_offer#">
   </cfoutput>
  


And now the error:

Error Occurred While Processing Request

Invalid list index 3.

In function ListDeleteAt(list, index [, delimiters]), the value of index, 3, is not a valid as the first argument (this list has 1 elements). Valid indexes are in the range 1 through the number of elements in the list.
Resources:
  • Enable Robust Exception Information to provide greater detail about the source of errors. In the Administrator, click Debugging & Logging > Debugging Settings, and select the Robust Exception Information option.
  • Check the ColdFusion documentation to verify that you are using the correct syntax.
  • Search the Knowledge Base to find a solution to your problem.

Browser  Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Remote Address  70.198.180.199
Referrer 
Date/Time  19-Oct-09 10:32 AM

Community Expert, Oct 19, 2009

> Invalid list index 3.

This tells us there are values of rec that contain one or no ~ delimiter. If you wish to delete just the last list element, you could do something like:

<cfset listLength = listLen(rec, "~")>
<cfset rec =listdeleteat(rec,listLength,"~")>
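
A guarded variant of the same idea, as a minimal sketch. ColdFusion list functions ignore empty elements by default, so a record whose cells are mostly empty can contain fewer items than expected. The sample value of rec below is hypothetical; in the posted code it is the index of the loop over gfile.

```cfml
<!--- Hypothetical sample record; in the original code rec is the loop index over gfile --->
<cfset rec = "Fri, Apr 16, 2010~Sat, Apr 24, 2010~US$1899.00">

<!--- Delete the third element only when the list really has one --->
<cfif listLen(rec, "~") GTE 3>
    <cfset rec = listDeleteAt(rec, 3, "~")>
</cfif>

<cfoutput>#rec#</cfoutput>
```

Records with fewer than three elements pass through unchanged instead of throwing "Invalid list index".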

New Here, Oct 20, 2009

I already tried doing that. Now it is all really messed up, because for some reason it falls into an infinite loop!

At this point I'm ready to rewrite the whole thing; I just need a little help on the way. This is no big deal, trust me.

My supplier's website has a tour product table with a dated item on each row. Every item has 6 columns: 2 dates (departure & return), a price, a blank cell, a "Get a Quote" link, and a submit button.

Each of these columns is in a <td>.
Now, I only need two of these columns (the first and third). But first I need to find out the number of rows, because I parse the data directly into input form fields, and only after review do I submit and publish it on my site.

Below I post 2 brief snippets of code: the first a snapshot of the supplier's webpage, and the second a snapshot of mine.

The supplier:

<table class="searchResults" cellpadding="10" cellspacing="1" border="0" width="634" ID="Table1">
<tr>
<th width="130">Departs On</th>
<th width="135">Returns On</th>
<th width="105" align="right">Land Price</th>
<th width="80">Preferred <br/> Departure</th>
<th width="95">Get a Quote</th>
<th width="95">Availability</th>
</tr>


<!-- Got Results Locally -->

<tr class="results1">
<td style="color: #BB0000;" >Fri, Apr 16, 2010</td>
<td style="color: #BB0000;" >Sat, Apr 24, 2010</td>
<td style="color: #BB0000;" >US$1899.00</td>
<td align="center"></td>
<td><a href="contentasp ?id=2080&tropicsProdID=9110&DepCode=16D10a&tourID=8444" title="Get a Quote" style="text-decoration:underline;">Get a Quote</a></td>
<td align="center"><td>
</tr>
<tr class="results2">
<td >Fri, Apr 23, 2010</td>
<td >Sat, May 01, 2010</td>
<td >US$1899.00</td>
<td align="center"></td>
<td>

</td>
<td align="center">
</td>
</tr>


These are only two sample rows, but at times there are 60 rows or even more.

Now for my website, all I need are 2 of the fields from each row.

<tr><th>#IDX#. <input type="Text" name="date#idx#" value="#tvl_date#"></th><th><input type="Text" name="price#idx#" value="#price#"></th> <th><select name="descript#idx#"><option value="1" <cfif descript eq 1>SELECTED</cfif>>Land Only<option value="2" <cfif descript eq 2>SELECTED</cfif>>Land and Air</select></th></tr>
<input type="hidden" name="dateID#idx#" value="#id#">



Thanks for your prompt response.





Community Expert, Oct 20, 2009

The delimiter ~ played a central role in your original posting. It is absent from your latest posting.

New Here, Oct 21, 2009

BKBK, that's true, because I'm trying to avoid the original code and start from scratch, as I indicated earlier. The original code was meant for the earlier version of the supplier's webpage. Now that they have rewritten their code, I have to either fix or rewrite mine too. I tried to fix it, and it went haywire.

Thanks for noticing that.

Also, please ignore the CF coding in my new posting, as it is just intended as a cfelse for when the supplier has nothing published (it relates to queried results from my db). The only reason I posted it is for you to see the table structure & design.

Community Expert, Oct 22, 2009

You can always treat an HTML table as XML, and then use ColdFusion's XML functionality to fish out the data. Here is an example to illustrate what I mean.


<cfsavecontent variable="table"><table class="searchResults" cellpadding="10" cellspacing="1" border="0" width="634" ID="Table1">
<tr>
<th width="130">Departs On</th>
<th width="135">Returns On</th>
<th width="105" align="right">Land Price</th>
<th width="80">Preferred <br/> Departure</th>
<th width="95">Get a Quote</th>
<th width="95">Availability</th>
</tr>
<tr class="results1">
<td style="color: #BB0000;" >Fri, Apr 16, 2010</td>
<td style="color: #BB0000;" >Sat, Apr 24, 2010</td>
<td style="color: #BB0000;" >US$1899.00</td>
<td align="center"></td>
<td><cfoutput>#xmlformat('<a href="contentasp?id=2080&tropicsProdID=9110&DepCode=16D10a&tourID=8444" title="Get a Quote" style="text-decoration:underline;">Get a Quote</a>')#</cfoutput></td>
<td align="center"></td>
</tr>
<tr class="results2">
<td >Fri, Apr 23, 2010</td>
<td >Sat, May 01, 2010</td>
<td >US$1899.00</td>
<td align="center"></td>
<td>
</td>
<td align="center">
</td>
</tr>
</table>
</cfsavecontent>

<cfxml variable="xmlDoc">
    <cfoutput>#table#</cfoutput>
</cfxml>

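<!--- child elements of the table element, i.e. the tr rows --->
<cfset row = xmlDoc["table"].XmlChildren>

<cfdump var="#row#">
<br>
<cfset lastIndex = arrayLen(row)>
<!--- XmlText returns the text inside the third td of the last row --->
Last value of land price: <strong><cfoutput>#row[lastIndex]["td"][3].XmlText#</cfoutput></strong>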
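
The same XML idea can also be sketched with an XPath query that selects only the data rows and then reads the first and third cells. This assumes the markup is well-formed XML, which live supplier pages often are not. The sample table and the variable names sampleTable, doc and dataRows are hypothetical.

```cfml
<!--- Hypothetical, well-formed sample markup standing in for the supplier's table --->
<cfsavecontent variable="sampleTable"><table>
<tr><th>Departs On</th><th>Returns On</th><th>Land Price</th></tr>
<tr><td>Fri, Apr 16, 2010</td><td>Sat, Apr 24, 2010</td><td>US$1899.00</td></tr>
<tr><td>Fri, Apr 23, 2010</td><td>Sat, May 01, 2010</td><td>US$1899.00</td></tr>
</table></cfsavecontent>

<cfset doc = xmlParse(trim(sampleTable))>
<!--- Select only rows that contain td cells, which skips the header row --->
<cfset dataRows = xmlSearch(doc, "/table/tr[td]")>

<cfoutput>
Rows found: #arrayLen(dataRows)#<br>
<cfloop from="1" to="#arrayLen(dataRows)#" index="i">
    <!--- First and third cells of each data row --->
    #i#. #dataRows[i].XmlChildren[1].XmlText# / #dataRows[i].XmlChildren[3].XmlText#<br>
</cfloop>
</cfoutput>
```

arrayLen(dataRows) gives the row count needed for numbering the form fields, so no separate counting pass is required.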

New Here, Oct 26, 2009

bkbk,

Although I'm extremely thankful for your detailed answer, I think the main issue is NOT resolved yet.

Yes, it is very simple to transfer data from one webpage to another via XML, and frankly I do that many times when I find it useful. However, in our case the most difficult task is to actually GET to the data on the supplier's website. Once I have clearly determined that all the rest of the code has been parsed out successfully, the task is almost accomplished.

In order for you to get a better understanding of my complication, I attach a sample link to the supplier's webpage.

Remember, I need to cut through everything up to the table contents. And even then, I need only the values of the first & third columns of each row.

http://www.trafalgar.com/USA/DisplayTour?RegionID=4&CountryID=0&TypeID=0&Keywords=&LengthID=0&Budget...

Community Expert, Oct 26, 2009

> Although I'm extremely thankful for your detailed answer, I think the main issue is NOT resolved yet.
>
> Yes, it is very simple to transfer data from one webpage to another via XML, and frankly I do that many times when I find it useful. However, in our case the most difficult task is to actually GET to the data on the supplier's website. Once I have clearly determined that all the rest of the code has been parsed out successfully, the task is almost accomplished.
>
> In order for you to get a better understanding of my complication, I attach a sample link to the supplier's webpage.

Sorry about that. I had assumed the HTML table was a given static variable.

I have another suggestion. It is written for ColdFusion 8. You could improve it by implementing a case-insensitive regular expression to extract the inner text of the td tags in one go. I did it in two steps, using REReplaceNoCase.

<cfhttp url="http://www.trafalgar.com/USA/DisplayTour?RegionID=4&CountryID=0&TypeID=0&Keywords=&LengthID=0&Budget&TourID=8444&Detail=6" method="GET" />
<!--- <cfdump var="#cfhttp.filecontent#"> --->

<!--- array of table tags --->
<cfset tableTags=REMatchNoCase("<table\b[^>]*>(.*?)</table>",cfhttp.FileContent)>

<!--- array of rows in first table --->
<cfset tableRows=REMatchNoCase("<tr\b[^>]*>(.*?)</tr>",tableTags[1])>

<!--- fetch first column and third column of each row --->
<cfset rowNumber = 1>
<cfloop array="#tableRows#" index="row">
    <!--- array of columns for each row --->
    <cfset columns = REMatchNoCase("<td\b[^>]*>(.*?)</td>",row)>
    <!--- consider only rows that have at least 3 <td> --->
    <cfif arrayLen(columns) GTE 3>
    <!--- remove td start tag --->
    <cfset contentFirstColumn = REReplaceNoCase(columns[1],"<td\b[^>]*>","")>
    <!--- remove td end tag --->
    <cfset contentFirstColumn = REReplaceNoCase(contentFirstColumn,"</td>","")>
    <!--- remove td start tag --->
    <cfset contentThirdColumn = REReplaceNoCase(columns[3],"<td\b[^>]*>","")>
    <!--- remove td end tag --->
    <cfset contentThirdColumn = REReplaceNoCase(contentThirdColumn,"</td>","")>
     <cfoutput>
    row: #rowNumber# <br>
    contentFirstColumn: #contentFirstColumn#<br>
    contentThirdColumn: #contentThirdColumn#<br> 
    <hr>
    </cfoutput>
    </cfif>
    <cfset rowNumber = rowNumber + 1>   
</cfloop>
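
The one-step improvement mentioned above could be sketched as follows: a single REReplaceNoCase call with a backreference strips both the start and end tag at once. This is only a sketch under stated assumptions. The sample row is hypothetical, and in the loop above row would come from tableRows.

```cfml
<!--- Hypothetical sample row; in the loop above, row comes from tableRows --->
<cfset row = '<tr class="results1"><td style="color: red;">Fri, Apr 16, 2010</td><td>Sat, Apr 24, 2010</td><td>US$1899.00</td></tr>'>

<cfset columns = REMatchNoCase("<td\b[^>]*>(.*?)</td>", row)>
<cfif arrayLen(columns) GTE 3>
    <!--- \1 refers to the first parenthesised group, i.e. the text between the tags --->
    <cfset contentFirstColumn = trim(REReplaceNoCase(columns[1], "<td\b[^>]*>(.*?)</td>", "\1"))>
    <cfset contentThirdColumn = trim(REReplaceNoCase(columns[3], "<td\b[^>]*>(.*?)</td>", "\1"))>
    <cfoutput>
    contentFirstColumn: #contentFirstColumn#<br>
    contentThirdColumn: #contentThirdColumn#<br>
    </cfoutput>
</cfif>
```

The trim() calls are an addition to cope with cells whose content is padded with whitespace or line breaks.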

