
starting and stopping a loop

Guest
Jan 11, 2007
I am importing a list of 10,000 emails into memory and then comparing them to a database table that holds over 400,000 records. It doesn't take long for the <cfquery> to time out. I know there must be a way to loop through 20 or so, stop, then loop through the next 20, and so on. I cannot lengthen the timeout, so I need to figure out a way to process the data in chunks. Thanks.

<cfhttp method="Get"
        url="http://192.168.14.51/matt/email.CSV"
        columns="Email"
        name="importCSV"
        delimiter="#chr(10)##chr(13)#"
        textqualifier="">

<cfset counter = 0>
<cfloop index="element" list="#CFHTTP.FileContent#" delimiters="#chr(10)##chr(13)#">
    <cfset counter = counter + 1>
</cfloop>

<cfset start = 1>
<cfset end = 20>
<cfset lastRow = counter>
<cfloop from="#start#" to="#end#" index="i">
    <cfset ThisEmail = ListGetAt(CFHTTP.FileContent, i, "#chr(10)##chr(13)#")>
    <cfquery name="CheckEmail" datasource="t">
        SELECT Email
        FROM Subscriber_Info_New
        WHERE Email = <cfqueryparam cfsqltype="CF_SQL_VARCHAR" value="#Trim(ThisEmail)#">
    </cfquery>
    <cfif CheckEmail.RecordCount EQ 0>
        <cfquery name="Insert" datasource="t">
            INSERT INTO Subscriber_Info_New (Email)
            VALUES (<cfqueryparam cfsqltype="CF_SQL_VARCHAR" value="#Trim(ThisEmail)#">)
        </cfquery>
    </cfif>
</cfloop>
Guest
Jan 11, 2007
I'd use a different approach. Query the database, getting all e-mail addresses, and place the results in a list called baseList. Then peel off each e-mail address from your CFHTTP.FileContent and do a ListFind on baseList. If you find a match, go to the top of the loop. If you don't find a match, do an INSERT INTO the database with the new e-mail address. If you have 50 new e-mail addresses, you have 51 trips to the database.

BTW, your initial loop, counter = counter + 1, is unnecessary. Do <CFSET counter = ListLen(CFHTTP.FileContent, delimiters="#chr(10)##chr(13)#">
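A minimal sketch of the approach as described, assuming the same datasource ("t") and table (Subscriber_Info_New) as the original post, with the CSV already fetched into CFHTTP.FileContent:

```cfml
<!--- One trip to the database to fetch every existing address --->
<cfquery name="allEmails" datasource="t">
    SELECT Email FROM Subscriber_Info_New
</cfquery>

<!--- ValueList() turns the query column into one comma-delimited list --->
<cfset baseList = ValueList(allEmails.Email)>

<!--- One pass over the CSV; insert only the addresses not already present --->
<cfloop index="thisEmail" list="#CFHTTP.FileContent#" delimiters="#chr(10)##chr(13)#">
    <cfif ListFind(baseList, Trim(thisEmail)) EQ 0>
        <cfquery name="insertEmail" datasource="t">
            INSERT INTO Subscriber_Info_New (Email)
            VALUES (<cfqueryparam cfsqltype="CF_SQL_VARCHAR" value="#Trim(thisEmail)#">)
        </cfquery>
    </cfif>
</cfloop>
```

ListFind is case-sensitive; ListFindNoCase would treat addresses differing only in case as duplicates.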
Guest
Jan 11, 2007
quote:

Originally posted by: jdeline
I'd use a different approach. Query the database, getting all e-mail addresses, and place the results in a list called baseList. Then peel off each e-mail address from your CFHTTP.FileContent and do a ListFind on baseList. If you find a match, go to the top of the loop. If you don't find a match, do an INSERT INTO the database with the new e-mail address. If you have 50 new e-mail addresses, you have 51 trips to the database.

BTW, your initial loop, counter = counter + 1, is unnecessary. Do <CFSET counter = ListLen(CFHTTP.FileContent, delimiters="#chr(10)##chr(13)#">



I tried the "<CFSET counter = ListLen(CFHTTP.FileContent, delimiters="#chr(10)##chr(13)#">" but that didn't work. What do you mean by "going to the top of the loop"?
Guest
Jan 12, 2007
What's the final value of counter when you loop through incrementing it? What's its value when you use the ListLen( ) function? By "top of the loop", I mean you should have <CFLOOP><CFIF stuff-matches> do the INSERT </CFIF></CFLOOP>
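For reference, ListLen() takes its delimiter set as a second positional argument rather than a delimiters= named attribute, which is likely why the one-liner from the earlier reply failed. A corrected sketch, assuming baseList holds the existing addresses as suggested above:

```cfml
<!--- Count the CSV rows without an explicit counting loop.
      The delimiter set is passed positionally, not as delimiters= --->
<cfset counter = ListLen(CFHTTP.FileContent, chr(10) & chr(13))>

<!--- The loop shape described above: test first, insert only on a miss --->
<cfloop index="thisEmail" list="#CFHTTP.FileContent#" delimiters="#chr(10)##chr(13)#">
    <cfif ListFind(baseList, Trim(thisEmail)) EQ 0>
        <!--- no match in baseList: do the INSERT here --->
    </cfif>
</cfloop>
```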
Guest
Jan 16, 2007
quote:

Originally posted by: jdeline
What's the final value of counter when you loop through incrementing it? What's its value when you use the ListLen( ) function? By "top of the loop", I mean you should have <CFLOOP><CFIF stuff-matches> do the INSERT </CFIF></CFLOOP>


The final value of the counter is 10212, which is the length of the .csv file. I tried your idea of putting the database records (400,000+) into a list. This method times out while trying to add so many records to a list. I see the problem: either I have a list of 400,000 records, which breaks, or I have to loop through 400,000 records to check for each email in my .csv file, which will inevitably time out.

LEGEND
Jan 11, 2007
I cannot lengthen the timeout so I need to figure out a way to process
the data in chunks. Thanks.

Are you sure you cannot lengthen the timeout? The <cfsetting requestTimeout="..."> tag allows one to define a specific timeout for a given page, as long as that is less than the maximum timeout defined in the administrator.

If this is not sufficient for your needs, you will need to break up your process over multiple requests. Just starting and stopping the loop in a single request does nothing to reduce the timeout threshold of the page. You can do this with scheduled tasks, web services, and/or the <cflocation ...> tag. You will need to pass through relevant data about which iterations to process, and use some kind of extra-request storage for the results, such as the server/application/session scopes, databases, or files.

Hope that gives you some help.

Ian
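A rough sketch of the multi-request idea, passing the next start row through the URL and re-triggering the page with <cflocation>. The file name import.cfm, the URL.startRow parameter, and the chunk size of 500 are illustrative assumptions, not from the thread:

```cfml
<!--- Each request processes one chunk, then redirects to itself for the
      next chunk, getting a fresh request-timeout clock every time. --->
<cfparam name="URL.startRow" default="1">
<cfset chunkSize = 500>

<!--- totalRows would be recomputed here each request (e.g. from the
      re-fetched CSV) or read from a persistent scope, per the reply above --->
<cfset totalRows = ListLen(CFHTTP.FileContent, chr(10) & chr(13))>
<cfset endRow = Min(URL.startRow + chunkSize - 1, totalRows)>

<cfloop from="#URL.startRow#" to="#endRow#" index="i">
    <!--- process row i here (the check-then-insert from the question) --->
</cfloop>

<!--- More rows left? Hand off to a fresh request --->
<cfif endRow LT totalRows>
    <cflocation url="import.cfm?startRow=#endRow + 1#" addtoken="no">
</cfif>
```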
Guest
Jan 11, 2007
quote:

Originally posted by: Newsgroup User
I cannot lengthen the timeout so I need to figure out a way to process
the data in chunks. Thanks.

Are you sure you cannot lengthen the timeout? The <cfsetting requestTimeout="..."> tag allows one to define a specific timeout for a given page, as long as that is less than the maximum timeout defined in the administrator.

If this is not sufficient for your needs, you will need to break up your process over multiple requests. Just starting and stopping the loop in a single request does nothing to reduce the timeout threshold of the page. You can do this with scheduled tasks, web services, and/or the <cflocation ...> tag. You will need to pass through relevant data about which iterations to process, and use some kind of extra-request storage for the results, such as the server/application/session scopes, databases, or files.

Hope that gives you some help.

Ian



Sounds like more than I know how to do. I can't change the timeout because it's set that way on purpose. I think I'll leave this alone for now. Thanks.
Explorer
Jan 11, 2007
You've tried adding requesttimeout=xxx to the URL I suppose...

Btw, "delimiters" refers to a set of individual characters, so entering two means your cfloop will split on both chr(10) and chr(13). And another btw, the end-of-line sequence on a PC is actually 13 then 10 (CRLF); classic Macs use only 13, and *nix systems use only 10. This might not be an issue if you know your source.

I'm going to assume the 'Email' column in your database is configured to allow only unique values.

If you continue to have timeouts, you could always take the easy way out and split your file in half...

Edit: forgot the #'s around list="#importCSV#"
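One way to sidestep the line-ending question above entirely is to normalize the content to a single delimiter before looping; a sketch:

```cfml
<!--- Collapse CR (13) into LF (10) so CRLF, CR-only, and LF-only files
      all split the same way; empty elements from the doubled LF in a
      converted CRLF pair are skipped by the list loop. --->
<cfset normalized = Replace(CFHTTP.FileContent, chr(13), chr(10), "ALL")>
<cfloop index="thisEmail" list="#normalized#" delimiters="#chr(10)#">
    <!--- process Trim(thisEmail) here --->
</cfloop>
```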
Guest
Jan 16, 2007
quote:

Originally posted by: mvierow
You've tried adding requesttimeout=xxx to the URL I suppose...

Btw, "delimiters" refers to a set of individual characters, so entering two means your cfloop will split on both chr(10) and chr(13). And another btw, the end-of-line sequence on a PC is actually 13 then 10 (CRLF); classic Macs use only 13, and *nix systems use only 10. This might not be an issue if you know your source.

I'm going to assume the 'Email' column in your database is configured to allow only unique values.

If you continue to have timeouts, you could always take the easy way out and split your file in half...

Edit: forgot the #'s around list="#importCSV#"



The database isn't set up to allow only unique values. I did try adding the request timeout, but the admin timeout values override it. I like your method, but it seems to rely on the database having the unique-values-only setup.