Known Participant
March 21, 2008
Question

Stopping search engine crawling in Connect

I have a robots.txt located in this location:
$BREEZE_DIR\appserv\web\common

And it says this:
# thanks, but no thanks
User-agent: *
Disallow: /

From what I understand, that should prevent all search engines from crawling my site, yet today I found a recorded meeting coming up in MSN Search results. Do I have my robots file in the wrong place?
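
For anyone who wants to confirm what those two rules mean to a well-behaved crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The host name `connect.example.com`, the path `/p12345/`, and the helper `is_allowed` are made up for illustration; only the robots.txt text comes from the post above. It checks the rules themselves, not whether the file is being served from the right place.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content quoted in the question.
ROBOTS_TXT = """\
# thanks, but no thanks
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler named user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical meeting URL; substitute a real one from your own server.
    print(is_allowed(ROBOTS_TXT, "msnbot", "http://connect.example.com/p12345/"))
```

A crawler that honors the file should treat every path as off limits under these rules. A page that was indexed before the file existed, or a crawler that ignores robots.txt, is a separate matter.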

    3 replies

    Known Participant
    April 9, 2008
    The Google search "link:(my site)" shows no results, so that's good, I guess. In fact, the indexed meeting is the only result at all when I search for everything on my site. Maybe this meeting was recorded before I put my robots file in place. There was a short period when I didn't have a robots file. Thanks for the help!
    Participating Frequently
    April 10, 2008
    Unless you removed it, the robots.txt should be there by default with the standard installation. So yes, the meeting may have been indexed during the period when the file was missing.
    Inspiring
    April 8, 2008
    Hi,

    Great to know. How is it on the hosted server? Does it contain a robots.txt?
    Participating Frequently
    April 9, 2008
    Good question. I just checked by appending /robots.txt to my account URL, and it comes up with this:
    User-agent: *
    Disallow: /

    So it looks like there is a robots.txt, but I'm not sure where it is placed.
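
The same check (appending /robots.txt to the account URL) can be scripted. This is only a rough sketch with the Python standard library; `myaccount.example.com` is a placeholder and the function names `robots_url` and `fetch_robots` are my own, not anything provided by Connect.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(account_url: str) -> str:
    """Build the robots.txt URL at the root of the account's host."""
    # A leading slash makes urljoin discard any path on account_url.
    return urljoin(account_url, "/robots.txt")


def fetch_robots(account_url: str, timeout: float = 10.0) -> str:
    """Download robots.txt and return its text (needs network access)."""
    with urlopen(robots_url(account_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder URL; replace it with your own account URL.
    print(robots_url("https://myaccount.example.com/some/meeting/"))
```

Crawlers only look for robots.txt at the root of the host, which is why the sketch drops the rest of the path before requesting the file.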
    Participating Frequently
    April 9, 2008
    Yup, the directory is correct; I just checked it on a licensed install. So if the robots.txt is not working, it might really be down to crawlers ignoring it. Your recorded meeting might also come up in a search because it is linked from another website. There is a Google search you can run to find out which websites link to your domain (at least those in the Google index). Might be worth a try.
    Participating Frequently
    March 21, 2008
    Hi, it might be in the wrong place. However, some crawlers don't follow the convention: they ignore the * in your robots.txt and want to be addressed by name.
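
If you want to try addressing crawlers by name, a robots.txt with one group per crawler plus the wildcard might look like the text in the sketch below. The tokens `msnbot` and `Googlebot` are examples; check each crawler's own documentation for the name it expects. `SomeOtherBot`, the host name, and the helper `blocked_agents` are made up for illustration. The Python around the file text just confirms, with the standard `urllib.robotparser`, that each agent ends up blocked.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt that names individual crawlers and keeps the wildcard.
NAMED_ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /
"""


def blocked_agents(robots_txt: str, agents: list[str], url: str) -> list[str]:
    """Return the agents from the list that may NOT fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    agents = ["msnbot", "Googlebot", "SomeOtherBot"]
    # Hypothetical URL; any path on the site is covered by "Disallow: /".
    print(blocked_agents(NAMED_ROBOTS_TXT, agents, "http://connect.example.com/"))
```

Naming a crawler only helps with crawlers that read robots.txt at all; one that ignores the file entirely will not be stopped by either form.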