Adding a robots.txt to SharePoint 2010

I manage a number of SharePoint farms, some of which are web-facing and configured to allow anonymous access. However, business requirements dictate that one of the environments should not be indexed by any search engine.

How do you prevent your web-facing farm from being indexed? If you want to block search engines from indexing your site, you need to create a robots.txt file and place it in the root of your root site.

What is a robots.txt?

Robots.txt is a text (not HTML) file placed in the root of your site to tell search robots which pages should and should not be visited/indexed. Search engines are not obliged to follow the instructions in robots.txt, but the major ones generally honour them.

It is important to note that a robots.txt does not completely prevent search engines from crawling your site (i.e. it is not a firewall). Having a robots.txt file on your site is something like putting a note saying "Please, do not enter" on your unlocked front door. Put simply, it will not prevent thieves from coming in, but the good guys will not open the door and enter.

It follows that if you have sensitive data, you cannot rely on a robots.txt alone to keep it from being indexed and displayed in search results.
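To illustrate how voluntary this is, here is a minimal Python sketch of what a well-behaved crawler does before requesting a page. It uses the standard library's urllib.robotparser. The rules, the "ExampleBot" user agent and the page URLs are made up for illustration and are not taken from a real site.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one folder, leave the rest of the site open.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def polite_filter(robots_txt: str, user_agent: str, urls: List[str]) -> List[str]:
    """Return only the URLs that the given robots.txt rules allow user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


candidates = [
    "http://www.sitename.com/Pages/default.aspx",
    "http://www.sitename.com/private/report.aspx",
]
# Prints a list containing only the first URL.
print(polite_filter(EXAMPLE_RULES, "ExampleBot", candidates))
```

Nothing forces a crawler to run a check like this. A crawler that skips it can still request every page, which is why robots.txt is no substitute for real access control.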

The location of robots.txt is very important. It must be in the main directory because otherwise user agents (search engines) will not be able to find it. They do not search the whole site for a file named robots.txt. Instead, they look first in the main directory (i.e. http://www.sitename.com/robots.txt) and if they don't find it there, they simply assume that this site does not have a robots.txt file and therefore they index everything they find along the way. So, if you don't put robots.txt in the right place, don't be surprised that search engines index your whole site.
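As a rough sketch of that lookup, the following Python function derives the one address a crawler will try from any page URL on the site. The function name and the sample page URL are illustrative assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.sitename.com/sites/team/Pages/default.aspx"))
# http://www.sitename.com/robots.txt
```

However deep the page sits in the site, the result is always the same root-level address, so a robots.txt stored in a subsite or document library is never consulted.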

Creating a robots.txt

1. Launch Notepad.
2. Put the following in your robots.txt file:

    User-agent: *
    Disallow: /

3. Save the file as: robots.txt

Adding a robots.txt file to the root of your public anonymous SharePoint site

1. Open your root site in SharePoint Designer.
2. Double-click the All Files folder.
3. Drag and drop the newly created robots.txt into the All Files folder.
4. Exit SharePoint Designer.

Alternatively, you can create the robots.txt from within SharePoint Designer itself.
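If you would rather script the file than type it in Notepad, the sketch below writes it with Python and first checks the rules with the standard library's urllib.robotparser. The helper names, the "ExampleBot" user agent and the temporary output folder are assumptions for illustration. The upload to SharePoint still happens through SharePoint Designer as described above.

```python
import tempfile
from pathlib import Path
from urllib.robotparser import RobotFileParser

# The two directives sit on adjacent lines so they form a single rule group.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"
SITE_ROOT = "http://www.sitename.com/"


def is_blocked(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """True if the given rules tell user_agent not to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def write_robots(directory: str, rules: str = ROBOTS_TXT) -> Path:
    """Write robots.txt into directory, refusing rules that leave the site root open."""
    if not is_blocked(rules, SITE_ROOT):
        raise ValueError("these rules do not block the site root")
    target = Path(directory) / "robots.txt"
    target.write_text(rules, encoding="ascii")
    return target


with tempfile.TemporaryDirectory() as folder:
    print(write_robots(folder).read_text(encoding="ascii"))
```

One detail worth checking this way: Python's parser treats a blank line as the end of a rule group, so with a blank line between "User-agent: *" and "Disallow: /" it reports the site as not blocked. Other parsers may be more forgiving, but keeping the two lines adjacent is the safe choice.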

To confirm the file is accessible to search engines, browse to your site URL with "/robots.txt" appended. Example: http://www.sitename.com/robots.txt
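The same check can be scripted. The following is a minimal sketch using only the Python standard library. The function names are illustrative, and the HTML test is a simple heuristic for spotting a sign-in or error page returned in place of the file, not something SharePoint itself defines.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import urlopen


def describe_response(status: int, body: str) -> str:
    """Summarise what a robots.txt response means (no network access needed)."""
    if status != 200:
        return f"not served (HTTP {status})"
    if "<html" in body.lower():
        return "served, but the body looks like an HTML page rather than plain text"
    return "served as plain text"


def check_site(site_url: str, timeout: float = 10.0) -> str:
    """Request <site_url>/robots.txt anonymously and describe the result."""
    url = urljoin(site_url, "/robots.txt")
    try:
        with urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return describe_response(response.status, body)
    except HTTPError as err:  # e.g. 401 or 404; must be caught before URLError
        return describe_response(err.code, "")
    except URLError as err:  # DNS failure, refused connection, timeout
        return f"unreachable ({err.reason})"


# Example call (performs a real HTTP request, so it is left commented out):
# print(check_site("http://www.sitename.com"))
```

Run it from a machine outside your network and without credentials, since that is how a search engine will see the site.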
