Question
spiders, robots.txt and /includes folder
I am unsure whether I should list my /includes folder
in my robots.txt file. When search engine spiders crawl a
site, how do they deal with files that are in the /includes folder?
Do they only see those files when they are "cfincluded" by a
calling page, or do the spiders also see them as independent pages?
I don't want pages of mine showing up in a search engine's rankings devoid of their sibling content. (For example, I wouldn't want just the content from "column 1" without the page's header, footer, column 2 and sidebar also being displayed.) This could give users a poor (and obviously misleading) impression of my site and its content.
So, should I put my /includes folder into my robots.txt file (e.g. "Disallow: /includes/") or not? And would this prevent a spider from following a <cfinclude>? I definitely don't want that.
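To make the rule concrete, here is a minimal Python sketch of how a crawler that honours robots.txt would interpret it for direct URL requests, using the standard library's urllib.robotparser. The "User-agent: *" line and the file names /index.cfm and /includes/column1.cfm are assumptions made up for illustration; they are not taken from my actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content with the rule in question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /includes/
"""


def is_allowed(path, user_agent="*"):
    """Return True if a compliant crawler may request `path` directly."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Both paths are illustrative placeholders.
    for path in ("/index.cfm", "/includes/column1.cfm"):
        print(path, is_allowed(path))
```

This only models what a well-behaved spider would do when it requests a URL directly; it says nothing about crawlers that ignore robots.txt.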
But if spiders ONLY crawl files within the /includes folder when they are called from another file, then I wouldn't have to worry about page components showing up in rankings under the guise of complete pages.
Any information on this topic would be greatly appreciated.
PS. On a separate, but slightly related note, can search engine spiders crawl JavaScript and CSS files?
