Question
data scraping
Hi guys/gals,
It's been a while, but I have a small ColdFusion project where I am authenticating to a website using cfhttp and trying to download files whose names are a date stamp followed by an extension. The way it works in the real world is that I log into the site and get redirected to a page that has hyperlinks to the files, and I have to click each one to download it. There are about 9-10 files and I have to do this every day, so naturally I want to automate it.
I use
<cfhttp method="Get"
url="https://samplesite.com/cehttp/servlet/MailboxServlet?operation=LOGON&user=something&password=something&mailbox_server=something"
resolveurl="Yes">
I get: "Logon is successful."
Which is good.
Then I build a list of the file names and loop through it like this:
<!--- Candidate file names: the date stamp plus each known extension.
      Kept on one line so no list item picks up a stray line break. --->
<cfset Possibilities = "#alterTime#.blah,#alterTime#.bleh,#alterTime#.tart,#alterTime#.high,#alterTime#.slip,#alterTime#.cord,#alterTime#.need,#alterTime#.very">
<cfloop list="#Possibilities#" index="i">
<cfoutput>
<cfhttp method="get" url="https://samplesite.com/cehttp/servlet/MailboxServlet?operation=DOWNLOAD&mailbox_id=something&batch_num=something&data_format=A&batch_id=#i#" path="c:\temp" file="#urlString##i#">
#urlString##i#<br />
</cfoutput>
</cfloop>
So I loop through and attempt to download these files.
As part of the real logon process I get redirected to this:
https://samplesite.com/cehttp/servlet/MailboxServlet?operation=something&mailbox_id=&Submit=something
I would like to just zip through the available files and download them to a directory on my computer...
But when I test what is actually produced as a hyperlink and click on it, it gives me:
"You are not logged on. Please logon first."
I am bypassing the redirect portion, but I think I should incorporate it somehow.
Or maybe I don't know what I am doing with cfhttp...
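One thing that might explain the "not logged on" message: each cfhttp call is a separate request and does not carry cookies over from the previous call, so the download requests may be arriving without whatever session the logon created. Below is a rough sketch of passing the logon cookies along by hand. It assumes the servlet tracks the session with a cookie sent in a Set-Cookie header on the logon response, which I have not confirmed. The names logonResult, rawCookies, cookieList, sessionCookies and the sample file name are made up for illustration.

```cfml
<!--- Log on without following the redirect, so the Set-Cookie header of
      the logon response itself is still visible in the result. --->
<cfhttp method="get"
        url="https://samplesite.com/cehttp/servlet/MailboxServlet?operation=LOGON&user=something&password=something&mailbox_server=something"
        redirect="no"
        result="logonResult">

<!--- Normalise Set-Cookie into an array: depending on the engine and the
      number of cookies it can arrive as a string, a struct, or an array. --->
<cfset cookieList = []>
<cfif structKeyExists(logonResult.responseHeader, "Set-Cookie")>
    <cfset rawCookies = logonResult.responseHeader["Set-Cookie"]>
    <cfif isSimpleValue(rawCookies)>
        <cfset arrayAppend(cookieList, rawCookies)>
    <cfelseif isArray(rawCookies)>
        <cfset cookieList = rawCookies>
    <cfelseif isStruct(rawCookies)>
        <cfloop collection="#rawCookies#" item="key">
            <cfset arrayAppend(cookieList, rawCookies[key])>
        </cfloop>
    </cfif>
</cfif>

<!--- Keep only the name=value part of each cookie (drop Path, Secure, etc.). --->
<cfset sessionCookies = {}>
<cfloop array="#cookieList#" index="c">
    <cfset pair = listFirst(c, ";")>
    <cfset sessionCookies[trim(listFirst(pair, "="))] = listRest(pair, "=")>
</cfloop>

<!--- Hypothetical file name; in the real loop this would be each list item. --->
<cfset fileName = "#alterTime#.blah">

<!--- Send the captured cookies back with the download request. --->
<cfhttp method="get"
        url="https://samplesite.com/cehttp/servlet/MailboxServlet?operation=DOWNLOAD&mailbox_id=something&batch_num=something&data_format=A&batch_id=#fileName#"
        path="c:\temp"
        file="#fileName#">
    <cfloop collection="#sessionCookies#" item="cookieName">
        <cfhttpparam type="cookie" name="#cookieName#" value="#sessionCookies[cookieName]#">
    </cfloop>
</cfhttp>
```

If the site only sets its cookie on the page it redirects to after logon, the same capture step would presumably have to be repeated against that redirect URL before attempting the downloads. Also note that cfhttpparam type="cookie" may URL-encode the value, which could matter if the session ID contains special characters.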
Does anyone have tips or clues to my problem?
Thanks everyone!