robots.txt question

Question

Scenario: You have a CMS that lets you easily enable or disable pages from public view (i.e., so they cannot be found or crawled).

What I would like to know: if you create a separate, specifically named folder outside of the CMS for, say, a custom microsite, and use DNS to point to it, but you want to include a robots.txt so that the microsite does not get crawled or picked up by Google, will the robots.txt work in that structure?

Example: mysite.com/special-microsite/index.html (with robots.txt inside the folder called 'special-microsite')

I am concerned this would block the entire site from being crawled, including all the content in the CMS. Or not? Or would the code below need to be something different so that only the microsite is not crawled?

User-agent: *
Disallow: /

Any input would be helpful.

Nancy OShea · Accepted Answer

No, it won't block the rest of the site. I have lots of sub-folders on my domain that are disallowed in the robots.txt file, and that doesn't stop the rest of the site from being crawled.

User-Agent: *
Disallow: /cgi_bin/
Disallow: /js/
Disallow: /includes/
Disallow: /scripts/
Disallow: /styles/
Disallow: /less/
Disallow: /special_microsite/

Sitemap: http://domain_name.com/sitemap.xml
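
If you want to check rules like these before publishing them, here is a minimal sketch using Python's standard urllib.robotparser module. It assumes the rules live in the robots.txt at the root of the host, which is where crawlers look for the file. The mysite.com URLs and the is_allowed helper are made up for illustration, and this parser only approximates how a real crawler such as Googlebot interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Folder-level rules, modeled on the answer above (assumed to be served
# from the robots.txt at the root of the host).
FOLDER_RULES = """\
User-agent: *
Disallow: /cgi_bin/
Disallow: /special_microsite/
"""

# The rule from the question: a bare "/" matches every path on the host.
BLOCK_ALL_RULES = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Hypothetical helper: parse robots.txt text and test a single URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Only the listed folders are blocked; other pages stay crawlable.
print(is_allowed(FOLDER_RULES, "http://mysite.com/special_microsite/index.html"))  # False
print(is_allowed(FOLDER_RULES, "http://mysite.com/about.html"))  # True

# "Disallow: /" blocks the whole host, CMS content included.
print(is_allowed(BLOCK_ALL_RULES, "http://mysite.com/about.html"))  # False
```

The last check shows why the rule in the question would be a problem in the root robots.txt: Disallow: / covers every path, whereas Disallow: /special_microsite/ covers only that folder.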
