
robots.txt question

Enthusiast, Jan 25, 2017

Scenario:

You have a CMS that makes it easy to enable or disable pages from public view (i.e., so they cannot be found or crawled).

What I would like to know: if you create a separate, specifically named folder outside of the CMS for, say, a custom microsite, and use DNS to point to it, but you want to include a robots.txt so that the microsite does not get crawled or picked up by Google, will the robots.txt work in that structure?

Example: mysite.com/special-microsite/index.html (with robots.txt inside the folder called 'special-microsite')

I am concerned this would block the entire site from being crawled, including all the content in the CMS. Would it? Or would the code below need to be changed so that only the microsite is not crawled?

User-agent: *
Disallow: /

Any input would be helpful.


1 Correct answer

Community Expert, Jan 25, 2017

No, it won't block the rest of the site. I have lots of sub-folders on my domain that are disallowed in the robots.txt file, and that doesn't stop the rest of the site from being crawled.

User-Agent: *
Disallow: /cgi_bin/
Disallow: /js/
Disallow: /includes/
Disallow: /scripts/
Disallow: /styles/
Disallow: /less/
Disallow: /special_microsite/

Sitemap: http://domain_name.com/sitemap.xml
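
One way to sanity-check rules like these before relying on them is Python's standard-library `urllib.robotparser`. The sketch below is only an illustration: it uses a trimmed copy of the rules above, `about.html` is a hypothetical stand-in for a CMS page, and `can_crawl` is a helper name made up for this example. It assumes the file is served from the site root (`/robots.txt`), which is where crawlers look for it. Python's parser is not identical to Google's, so treat the result as a rough check.

```python
from urllib.robotparser import RobotFileParser

# Trimmed copy of the example rules (assumed to live at the site root).
ROBOTS_TXT = """\
User-Agent: *
Disallow: /js/
Disallow: /includes/
Disallow: /special_microsite/
"""


def can_crawl(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if these rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The microsite folder is disallowed; a page elsewhere on the site is not.
microsite_ok = can_crawl(
    ROBOTS_TXT, "http://domain_name.com/special_microsite/index.html"
)
cms_page_ok = can_crawl(ROBOTS_TXT, "http://domain_name.com/about.html")  # hypothetical page

print(microsite_ok, cms_page_ok)  # prints: False True
```

Only paths that start with a listed `Disallow` prefix are affected, so the rest of the site stays crawlable.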

Nancy O'Shea— Product User, Community Expert & Moderator
Enthusiast, Jan 26, 2017

Good to know. Can I just use my original code? Will that suffice? Or should I use the one you listed above, which covers different components?

Thank you!

Community Expert, Jan 26, 2017

My code is an example from my site. Your robots.txt file must be tailored to your own server setup and needs. It's always a good idea to include a link to your sitemap for the robots to follow.

Nancy

Nancy O'Shea— Product User, Community Expert & Moderator
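
On the follow-up question of whether the original two-line file would suffice, a quick comparison may help. The sketch below contrasts the blanket rule from the question with a folder-scoped rule, again using Python's standard-library `urllib.robotparser`. It assumes both files would be served from the root of mysite.com, since crawlers request `/robots.txt` from the host root and do not look for one inside a sub-folder. `blocked_paths` and `/blog/post.html` are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# The original rule from the question, as it would behave at the site root.
BLANKET = "User-agent: *\nDisallow: /\n"
# A rule scoped to the microsite folder named in the question.
SCOPED = "User-agent: *\nDisallow: /special-microsite/\n"

# Sample paths; /blog/post.html is a hypothetical CMS page.
PATHS = ["/", "/special-microsite/index.html", "/blog/post.html"]


def blocked_paths(robots_txt, paths, agent="*"):
    """Return the subset of `paths` that the rules disallow for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


print(blocked_paths(BLANKET, PATHS))  # every sample path is blocked
print(blocked_paths(SCOPED, PATHS))   # only the microsite page is blocked
```

Under these assumptions, `Disallow: /` at the root would shut compliant crawlers out of the whole site, CMS content included, while the scoped rule leaves everything outside the microsite folder crawlable.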