Friday, March 30, 2007

Do You Need A Robots.txt File?

Here is an article I have written, called "Do You Need A Robots.txt File?" Sometimes people, myself included, tend to justify why they should not do, use or have something when they are not sure how it works. That is when you have to dig in and trudge through the learning process, see how far you can get and then look for help. I have done the trudging and the research to better understand what a robots.txt file is and whether it would benefit my website. In my opinion, a robots.txt file can be very helpful.

A robots.txt file can help you in several ways. To name a couple: it gives independent bots that follow the protocol an invitation to crawl your site, which can bring more traffic, and it lets you ask unwanted bots to stay away. The following article is pretty straightforward, and hopefully informative enough that you can create your own robots.txt file if you do not have one already.

Do you need a Robots.txt file?

By Vickie J. Scanlon

Do you need a robots.txt file? When you have a small site, you are probably under the false assumption that you really don't need one. In fact, you may be saying to yourself, "I don't need a robots.txt file because my site is small, it's simple for the search engines to find, and since I want all pages indexed anyway, why bother?" Those were my thoughts in the beginning too, when I was not aware of what a robots.txt file was or what it could do for my site. So I'll try to give you a little insight into what a robots.txt file is, how to use one, why you need one, and some basic instructions for creating one.

Define Robots.txt File

To begin, we need to know what a web robot is, and is not. Web robots are sometimes called spiders or web crawlers: programs that visit web pages automatically. They should not be confused with your normal web browser; a browser is not a web robot because a human being operates it manually.

The main use of a robots.txt file is to give robots instructions about what they may crawl and what they should not crawl. This gives you a little more control over the robots, which means you can also issue instructions to specific search engines.


Do you really need a Robots.txt file?

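Do you really need a robots.txt file even if you're not excluding any robots? It's a good idea. Why? First and foremost, well-behaved robots ask for it before they crawl, so having one answers that request instead of returning a "not found" error. Strictly speaking, robots that follow the protocol treat a missing file as permission to crawl everything, but an explicit file in the top level of your website removes any doubt about what you intend.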

Sometimes you may want to exclude some pages from the search engine's eye. What type of pages?

1. Pages that are still under construction

2. Directories that you would prefer not to have indexed

3. Or you may want to exclude robots whose sole purpose is to collect email addresses, or search engines in which you do not want your website to appear.
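
As a rough sketch, here is how a file covering the cases above could be put together with a few lines of Python. The function name build_robots_txt, the paths and the robot name are made up for illustration; they are assumptions, not part of any real site. Keep in mind that the file is only a request, so robots that ignore the protocol will ignore it too.

```python
def build_robots_txt(blocked_paths, banned_robots):
    """Return robots.txt text from a list of paths and a list of robot names."""
    lines = []
    # One record per banned robot: "Disallow: /" shuts it out of the whole site.
    for name in banned_robots:
        lines += [f"User-agent: {name}", "Disallow: /", ""]
    # A final catch-all record for every other robot.
    lines.append("User-agent: *")
    if blocked_paths:
        lines += [f"Disallow: {path}" for path in blocked_paths]
    else:
        # An empty Disallow value means nothing is off limits.
        lines.append("Disallow:")
    return "\n".join(lines) + "\n"


# Hypothetical paths and robot name, for illustration only.
print(build_robots_txt(["/under-construction/", "/private/"], ["specificbadbot"]))
```

The blank line between records is deliberate: it separates one record from the next, while the lines inside a record stay together.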


What does a Robots.txt file look like?

The robots.txt file is a simple text file, which can be created in Notepad. It needs to be saved to the root directory of your site, that is, the directory where your home page or index page is located.
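
If you are not sure where robots will look for the file, this small Python sketch works it out from any page address. The function name robots_txt_url and the example.com address are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the address where robots look for robots.txt for this page's site."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the file always lives at /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/articles/page.html"))
```

Whatever page you start from, the answer is the same address at the top level of the site, which is why a robots.txt file saved in a subdirectory will not be found.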

To create a simple robots.txt file that allows all robots to spider your site, put the following in it:

User-agent: *
Disallow:

That's it. An empty Disallow line means nothing is off limits, so this will allow all robots to index all your pages.
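
If you would like to see this for yourself, Python's standard library includes a robots.txt parser, urllib.robotparser. The sketch below feeds it the allow-all file and asks whether a robot may fetch a page. The helper name is_allowed and the example.com address are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_text, robot_name, url):
    """Ask Python's standard-library parser whether robot_name may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(robot_name, url)


print(is_allowed(ALLOW_ALL, "Googlebot", "http://www.example.com/any/page.html"))
```

This only shows how one parser reads the file; each search engine uses its own parser, so treat the result as a sanity check rather than a guarantee.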

If you don't want a specific robot to have access to any of your pages, you can do the following:

User-agent: specificbadbot
Disallow: /

Here you would have to name the robot or a specific substring of its name. You need the "/" because it matches every path on your site, which means "all directories".
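
A minimal sketch of checking this rule with Python's standard-library parser, urllib.robotparser, follows. The second robot name, SomeOtherBot, and the example.com address are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

BLOCK_ONE_ROBOT = """\
User-agent: specificbadbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(BLOCK_ONE_ROBOT.splitlines())

home_page = "http://www.example.com/index.html"
# The named robot is shut out of every page...
bad_bot_allowed = parser.can_fetch("specificbadbot", home_page)
# ...while a robot with no matching record is unaffected.
other_bot_allowed = parser.can_fetch("SomeOtherBot", home_page)
print(bad_bot_allowed, other_bot_allowed)
```

The rule only affects robots that read and obey the file, so a robot that ignores the protocol has to be blocked by other means.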

For example, let's say you do not want Googlebot to index a page called "donotenter.html" and your directory is "nogoprivate". In your robots.txt file you would put:

User-agent: Googlebot
Disallow: /nogoprivate/donotenter.html

Now if it's a complete directory you do not want indexed you would put:

User-agent: Googlebot
Disallow: /nogoprivate/

By putting a forward slash at the beginning and at the end, you tell the search engine not to index anything inside that directory.
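
The difference between the page rule and the directory rule can be sketched with Python's standard-library parser, urllib.robotparser. The helper name googlebot_may_fetch, the page other.html and the example.com address are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def googlebot_may_fetch(robots_text, path):
    """Return True if the rules in robots_text let Googlebot fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch("Googlebot", "http://www.example.com" + path)


ONE_PAGE = "User-agent: Googlebot\nDisallow: /nogoprivate/donotenter.html\n"
WHOLE_DIRECTORY = "User-agent: Googlebot\nDisallow: /nogoprivate/\n"

for path in ["/nogoprivate/donotenter.html", "/nogoprivate/other.html", "/index.html"]:
    # Print the path, then the answer under each of the two rule sets.
    print(path, googlebot_may_fetch(ONE_PAGE, path), googlebot_may_fetch(WHOLE_DIRECTORY, path))
```

With the page rule only donotenter.html is refused, while the directory rule refuses every page under nogoprivate; pages outside that directory stay open in both cases.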

Getting Your Code Right

If your robots.txt file is a more complex piece of code, then it's always wise to do a quick check of the syntax. There are some nice free online robots.txt checkers that you can use. One such checker is Robots Text Tester, which is free to use through Search Engine Promotion. Alternatively, go to ClockWatchers, which can help you create a robots.txt file and give you information on how to create a file that keeps out bad bots.
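
If you would rather run a quick check on your own computer first, the Python sketch below looks for a few common slips: a missing colon, a misspelled field name, and a Disallow line that is not attached to a User-agent line. It is not the checker from either site mentioned above. The function name check_robots_txt and the list of accepted field names are my assumptions; Allow, Sitemap and Crawl-delay are not covered in this article, and not every robot supports them.

```python
# Assumed list of field names; extend or trim it to suit the robots you care about.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def check_robots_txt(text):
    """Return a list of (line_number, message) problems found in text."""
    problems = []
    in_record = False  # True once a User-agent line has opened a record
    for number, raw in enumerate(text.splitlines(), start=1):
        stripped = raw.strip()
        if stripped.startswith("#"):
            continue  # comment-only lines do not end a record
        line = stripped.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            in_record = False  # a blank line ends the current record
            continue
        field, sep, _value = line.partition(":")
        field = field.strip().lower()
        if not sep:
            problems.append((number, "missing ':' between field and value"))
        elif field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
        elif field == "user-agent":
            in_record = True
        elif field in ("disallow", "allow") and not in_record:
            problems.append((number, f"'{field}' has no User-agent line above it"))
    return problems


# A blank line between the two lines splits the record, so line 3 is flagged.
print(check_robots_txt("User-agent: *\n\nDisallow: /\n"))
```

An empty list means none of these particular slips were found; it does not prove that every robot will read the file the way you intend.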

To conclude, a robots.txt file can help you to increase the number of search engines that spider your site, which can mean increased traffic and better indexing. This small file also helps you to control what is and is not indexed by search engines, and which search engines can spider your site. So, let me ask you now: is a robots.txt file an important asset to have for your website? I'm sure you have to admit that yes, it is important, even for a small website.

About the Author:

Vickie J Scanlon -- Visit her site at: My Affiliate Place for free tools, how-to info on affiliate marketing/internet marketing, tech accessories, software and computers for the online business.



For a ready guide/additional info on Robots.txt and .htaccess go to Website Protection on myaffiliateplace.biz
