Creating a Robots.txt file

By: Aquo SEO Consultants

The Robots Exclusion Protocol, or robots.txt protocol, is a convention that asks cooperating web spiders and other web robots not to access all or part of a website that is otherwise publicly viewable. Robots are often used by search engines to categorize and archive web sites, or by webmasters to proofread source code.

A robots.txt file is a plain-text file placed in the root directory of a website. It functions as a request that specified robots ignore specified files or directories in their search. A site owner might want this out of a preference for privacy from search engine results, because the content of the selected directories might be misleading or irrelevant to the categorization of the site as a whole, or because an application should only operate on certain data.

The protocol, however, is purely advisory. It relies on the cooperation of the web robot, so that marking an area of a site out of bounds with robots.txt does not guarantee privacy. Some web site administrators have tried to use the robots file to make private parts of a website invisible to the rest of the world, but the file is necessarily publicly available and its content is easily checked by anyone with a web browser.
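Because the check is carried out by the robot itself, a cooperating robot has to read the rules and decide on its own whether to request a page. The following is a minimal Python sketch of that decision, assuming the standard library module urllib.robotparser. The robot name "ExampleBot", the www.example.com URLs, and the helper may_fetch are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to request url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite robot asks before each request; nothing forces it to do so.
for url in ("http://www.example.com/index.html",
            "http://www.example.com/private/report.html"):
    verdict = "fetch" if may_fetch(ROBOTS_TXT, "ExampleBot", url) else "skip"
    print(verdict, url)
```

A robot that never calls such a check can still request the disallowed page, which is why robots.txt should not be relied on to protect private content.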

Here are a few examples:

This example allows all robots to visit all files because the wildcard "*" specifies all robots and the Disallow is blank:

User-agent: *
Disallow:

This example keeps all robots out because the Disallow points to the root directory:

User-agent: *
Disallow: /
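The difference between a blank Disallow and a Disallow of "/" can be checked with a short Python sketch. It assumes the standard library module urllib.robotparser; the robot name "ExampleBot", the URL, and the helper allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = "User-agent: *\nDisallow:\n"    # blank Disallow: nothing is off limits
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # "/" matches every path on the site


def allowed(robots_txt: str, url: str) -> bool:
    """Check url against robots_txt for a hypothetical robot named ExampleBot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("ExampleBot", url)


page = "http://www.example.com/any/page.html"
print("blank Disallow:", allowed(ALLOW_ALL, page))
print("Disallow: /   :", allowed(BLOCK_ALL, page))
```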

The next example tells all crawlers not to enter four directories of a website:

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /tmp/
Disallow: /private/
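Each Disallow value is matched as a prefix of the requested path, so a directory rule covers everything beneath that directory but not a similarly named sibling. A small Python sketch of this, assuming the standard library module urllib.robotparser; the paths and the robot name "ExampleBot" are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /tmp/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths: the first two fall under a listed directory, the rest do not.
paths = ["/images/logo.png", "/private/notes/today.html",
         "/images-old/logo.png", "/index.html"]
results = {p: parser.can_fetch("ExampleBot", "http://www.example.com" + p)
           for p in paths}

for path, ok in results.items():
    print("allowed" if ok else "blocked", path)
```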

This example tells a specific crawler not to enter one specific directory:

User-agent: GoogleBot
Disallow: /private/
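A record that names one crawler applies only to that crawler. When the file contains no "*" record, every other robot is unrestricted. A minimal Python sketch, assuming the standard library module urllib.robotparser; "OtherBot" and the URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GoogleBot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "http://www.example.com/private/page.html"
# Python's parser compares robot names case-insensitively.
googlebot_ok = parser.can_fetch("Googlebot", url)
# "OtherBot" is a hypothetical robot that no record in the file names.
otherbot_ok = parser.can_fetch("OtherBot", url)

print("Googlebot:", googlebot_ok)
print("OtherBot: ", otherbot_ok)
```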

This example tells all crawlers not to fetch one specific file:

User-agent: *
Disallow: /directory/file.html

Note that all other files in that directory remain open to crawlers.
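The single-file rule can be checked the same way. This Python sketch assumes the standard library module urllib.robotparser; the sibling file other.html and the robot name "ExampleBot" are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /directory/file.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "http://www.example.com/directory/"
file_ok = parser.can_fetch("ExampleBot", base + "file.html")      # the listed file
sibling_ok = parser.can_fetch("ExampleBot", base + "other.html")  # hypothetical neighbor

print("file.html: ", file_ok)
print("other.html:", sibling_ok)
```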

This example demonstrates how comments can be used. Everything after the number sign (#) on a line is ignored by the robot:

# Comments appear after the "#" symbol at the start of a line, or after a directive
User-agent: * # match all bots
Disallow: / # keep them out
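To confirm that comments do not change the meaning of the rules, the commented file can be compared with the same rules written without comments. A minimal Python sketch, assuming the standard library module urllib.robotparser; the helper blocks_everything, the robot name "ExampleBot", and the URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

WITH_COMMENTS = """\
# Comments appear after the "#" symbol at the start of a line, or after a directive
User-agent: * # match all bots
Disallow: / # keep them out
"""

WITHOUT_COMMENTS = """\
User-agent: *
Disallow: /
"""


def blocks_everything(robots_txt: str) -> bool:
    """Return True if a hypothetical robot may not fetch a sample page."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("ExampleBot", "http://www.example.com/page.html")


same = blocks_everything(WITH_COMMENTS) == blocks_everything(WITHOUT_COMMENTS)
print("comments ignored:", same)
```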

Article Source: http://www.seoarticleexchange.com

Contributed by the Aquo Marketing Team.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.
