
Bots vs Browsers

Enthusiast, Oct 04, 2014

Hi,

Do bots send CFID/CFTOKEN in the request headers? Is that a reliable way to detect if a bot is visiting? User agent testing leads to hundreds of strings to test against, and is an ever-growing list. Is there a more reliable way of detecting bots in 2014 with CF?

Thanks,

Mark

1 Correct answer: the Oct 06, 2014 reply beginning "Which OS are you using?", which appears in full later in this thread.

Enthusiast, Oct 06, 2014

Sending CFID/CFToken in request headers?  Do you mean as a FORM or URL parameter?  If so, yes, some crawlers & bots will attempt to maintain sessions if required to perform additional requests.

Regarding question #2: you could always try the Browscap CFC. This CFC parses the user agent string, performs a lookup, and returns a struct of browser features (including "Crawler").

henrylearn2rock/BrowscapCFC · GitHub

Another method to detect bots would be to use the rules from the Bad Behavior PHP script and write your own ColdFusion filter:
http://bad-behavior.ioerror.us/

Enthusiast, Oct 06, 2014

Thanks Jamo. If I look at the headers of each incoming web request, say using FusionReactor, Googlebot does not seem to send CFID/CFTOKEN, so I was wondering whether other bots behave the same way. But if some bots maintain sessions, then clearly this is no good. Scanning user agents seems to be the most common way of identifying a bot, or perhaps counting the requests per IP in a given time frame. I'm not keen on either of those ideas. By the way, "2013 User Agent Blacklist | Perishable Press" has a really good rewrite rule that can be converted to a REFind(); the PHP code less so.
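
For anyone weighing the "requests per IP in a given time frame" idea, here is a minimal sketch of a sliding-window counter. It is written in Python rather than CFML purely for illustration; the class name `RequestRateTracker`, its parameters, and the thresholds are hypothetical and not from any ColdFusion API. The caller passes the current time in seconds, so the logic is easy to test.

```python
from collections import defaultdict, deque


class RequestRateTracker:
    """Counts requests per client IP inside a sliding time window.

    Hypothetical helper for illustration; thresholds are assumptions.
    """

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def is_over_limit(self, ip, now):
        """Record one request at time `now` (seconds) and report whether
        this IP has made more than max_requests within the window."""
        hits = self._hits[ip]
        hits.append(now)
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

In a real application the per-IP state would presumably live in a shared cache or application scope and be pruned periodically. A plain counter like this also cannot tell a busy proxy from a bot, which is one reason to be cautious about it.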

Enthusiast, Oct 06, 2014 (marked as correct answer)

Which OS are you using? How much traffic do you get? I recently installed a third-party IIS Web Application Firewall, Aqtronix WebKnight, for a client. It has lots of blocking rules/filters and provides protection before the request makes it to the ColdFusion layer.

https://www.aqtronix.com/?PageID=99

Session IDs are normally passed via FORM, URL or COOKIE parameters. Many vulnerability scanning services will generate their own and randomize the session variables in an attempt to make the web application give them an existing session or throw an error. Some bots will retain a session that they initiated in order to access multiple pages, but they can opt not to send the tokens at any time (or send bad tokens). If you ever passed CFTokens in the URL, Google and other search engines would inadvertently follow and index them. (I've seen many people share links on Facebook that contain their personal session URL... if you click on it fast enough, you can usurp their session.)

I don't provide application sessions to bots... it's a waste of resources. I block many of the default user agents used by scripts. It's not 100% effective since they can be changed, but it keeps out many of the script kiddies.
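
A rough sketch of that idea, in Python rather than CFML: match the user agent against a short blocklist and only create a session when it does not look automated. The pattern list below is an illustrative assumption (default agents of a few common HTTP libraries plus generic crawler keywords), not the list used in this reply, and the function names are hypothetical. Treating an empty user agent as automated is also an assumption.

```python
import re

# Illustrative patterns only: default user agents of common HTTP libraries
# and a few generic crawler keywords. Not a complete or authoritative list.
SCRIPT_AGENT_PATTERN = re.compile(
    r"^(curl|wget|python-requests|python-urllib|libwww-perl|java)/"
    r"|bot\b|crawler|spider",
    re.IGNORECASE,
)


def looks_automated(user_agent):
    """Return True for an empty user agent or one matching the blocklist."""
    if not user_agent or not user_agent.strip():
        return True
    return SCRIPT_AGENT_PATTERN.search(user_agent.strip()) is not None


def should_create_session(user_agent):
    """Only hand out an application session to agents that look like browsers."""
    return not looks_automated(user_agent)
```

As noted above, this is not 100% effective because the user agent is trivially changed; it only filters out clients that keep their defaults.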

Here's a technique I've documented regarding using ColdFusion to block fake Googlebots. This same method can be used to block fake BingBot & YahooSlurp user agents too.
http://gamesover2600.tumblr.com/post/93345023759/identify-block-fake-googlebots-using-coldfusion
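
The linked post is not reproduced here. As a hedged illustration, the commonly documented way to spot a fake Googlebot is a reverse DNS lookup on the client IP, a check that the hostname ends in a Google-owned domain, and then a forward lookup to confirm the hostname resolves back to the same IP. The Python sketch below assumes that approach; `is_verified_crawler` and its injectable `reverse_lookup` / `forward_lookup` parameters are hypothetical names added so the logic can be exercised without network access. The default forward lookup handles IPv4 only.

```python
import socket


def is_verified_crawler(ip, allowed_suffixes=(".googlebot.com", ".google.com"),
                        reverse_lookup=None, forward_lookup=None):
    """Reverse-then-forward DNS check for a client claiming to be a crawler.

    The resolver parameters are hypothetical hooks for testing; by default
    the standard library resolvers are used.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        # Step 1: the PTR record must point into an allowed domain.
        hostname = reverse_lookup(ip).rstrip(".").lower()
        if not hostname.endswith(allowed_suffixes):
            return False
        # Step 2: that hostname must resolve back to the original IP.
        return ip in forward_lookup(hostname)
    except OSError:
        # Lookup failures (no PTR record, unknown host) count as unverified.
        return False
```

DNS lookups are slow, so a real filter would presumably run this only for requests whose user agent claims to be a search engine and cache the result per IP. For other crawlers, the suffix tuple would need to be replaced with the domains those operators publish.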

Enthusiast, Oct 07, 2014

Thanks for the great tips. That IIS filter looks interesting; we are on IIS 8 and will give it a try. Best to nip the problem in the bud before it gets to ColdFusion, since that saves coding more and more anti-bot routines.
