"Have more of your web pages spidered, indexed and increase your search engine positioning by using robots.txt files"
Having a robots.txt file in the root directory of your web site greatly increases its spiderability, especially when the pages are two or three levels deep.
Search engines, specialty robots, spiders, and crawlers will look in your root directory for a special file named "robots.txt"
(http://www.mydomain.com/robots.txt). The file tells the robot (spider) which files it may and may not spider (download). This text file must be placed in the root
of your web site.
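Because the file always lives at the root, its address can be worked out from any page URL on the same site. The sketch below uses Python's standard urllib.parse module; the function name robots_url and the sample page path are assumptions made for illustration, not part of any tool mentioned in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path with /robots.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page three levels deep still maps to the file in the root.
print(robots_url("http://www.mydomain.com/products/widgets/blue.html"))
# http://www.mydomain.com/robots.txt
```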
The easiest way to create a properly formatted robots.txt file is to use a program. I create my robots.txt files with
the "Pro" edition of Robot-Manager. This way I can also analyze my log files to see which robots visit and what they retrieve on their "crawl".
You can also create your robots.txt files manually in any plain text editor. When you upload the file to your web site, make sure your FTP program uses "ASCII" transfer mode.
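If you write the file by hand, one way to catch stray non-ASCII characters (curly quotes pasted from a word processor, for example) before uploading is to save it through a small script. This is only a minimal Python sketch; the helper name save_ascii is an assumption made for illustration.

```python
def save_ascii(path, text):
    """Save text to path as plain ASCII with simple line endings."""
    # Normalise Windows line endings so the file looks the same everywhere.
    normalised = text.replace("\r\n", "\n")
    # Encoding first means a bad character raises UnicodeEncodeError
    # before the target file is opened or overwritten.
    data = normalised.encode("ascii")
    with open(path, "wb") as handle:
        handle.write(data)
```

Calling save_ascii("robots.txt", text) with the contents of your file either writes a clean copy or stops with an error that points at the offending character.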
The two basic lines of the file are the "User-agent" and the "Disallow" statements. The "User-agent" line specifies the robot and the "Disallow" line(s)
specify files and/or directories. For example:
User-agent: googlebot
This specifies the User-agent for Google.
You may use the wildcard character "*" to specify all robots:
User-agent: *
To disallow access to a specific file (email.html) in the root directory:
Disallow: /email.html
You may also specify directories. To block spiders from your cgi-bin directory use:
Disallow: /cgi-bin/
To disallow access to the entire site use:
Disallow: /
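Each Disallow value is compared against the start of the requested path, which is why /cgi-bin/ covers everything inside that directory and / covers the whole site. The following Python sketch illustrates that prefix rule only; the function name is_blocked and the sample paths are hypothetical, and real robots may apply additional rules of their own.

```python
def is_blocked(path, disallow_values):
    """Return True if path starts with any non-empty Disallow value."""
    for value in disallow_values:
        # An empty Disallow value blocks nothing, so skip it.
        if value and path.startswith(value):
            return True
    return False


print(is_blocked("/email.html", ["/email.html"]))        # True
print(is_blocked("/cgi-bin/form.pl", ["/cgi-bin/"]))     # True
print(is_blocked("/about/contact.html", ["/cgi-bin/"]))  # False
print(is_blocked("/about/contact.html", ["/"]))          # True
```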
Here are a few simple examples:
To allow all robots to visit all files of your web site:
User-agent: *
Disallow:
To disallow all bots everywhere:
User-agent: *
Disallow: /
To bar all robots from the cgi-bin and images directories:
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Any line in the robots.txt that begins with # is considered to be a comment.
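The three example cases above, and the comment rule, can be checked with Python's standard urllib.robotparser module. Treat this as a sketch rather than a full validator; the helper name allowed, the robot name "examplebot", and the sample URLs are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = """\
# Let every robot fetch everything
User-agent: *
Disallow:
"""

BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

BLOCK_DIRS = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
"""


def allowed(rules, url, agent="examplebot"):
    """Parse robots.txt text and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


SITE = "http://www.mydomain.com"
print(allowed(ALLOW_ALL, SITE + "/cgi-bin/form.pl"))   # True, comment line is ignored
print(allowed(BLOCK_ALL, SITE + "/index.html"))        # False
print(allowed(BLOCK_DIRS, SITE + "/images/logo.gif"))  # False
print(allowed(BLOCK_DIRS, SITE + "/index.html"))       # True
```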
After you have created and uploaded your robots.txt file, you can validate it at:
About The Author:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(c) Copyright 2005. All rights reserved.
Publishing Guidelines: You may freely distribute or publish
this article provided you publish the whole article and
include the copyright notice and links in full. A courtesy
copy is requested upon publication.
This article provides useful information on robots.txt files.
2005 - Present © NetMarketingStrategies.com - All Rights Reserved Worldwide