crazyweblist.com


Search Engine Tips & Tricks: Create a Robots Text File for Your Web Site

 

Author: Sandra Waggett

Search engines index millions of web sites to generate the results they return for keyword searches. They do this using programs called spiders.

Most search engines have their own spider that crawls the web looking for pages. Spiders are also known as robots because they are small programs that run automatically, finding web pages and following the links on each page to discover and index more pages. Most robots look for a robots.txt file in the top-level directory of your web site, also known as the root, where your home page is located on the web server.
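Because the file always sits in the root, its address can be worked out from the address of any page on the site. Here is a minimal Python sketch of that idea. The function name robots_txt_url and the example.com address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; drop the page path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/portfolio/weddings.htm"))
# prints http://www.example.com/robots.txt
```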

The robots.txt file is a simple text file created in a basic text editor, like Notepad. It lets you tell spiders which parts of your site they may access and index and which parts they may not.

The format of the basic robots.txt file is pretty simple:
User-agent: [Spider Name]
Disallow: [File or Directory Path]

For example, to allow ALL robots complete access to your web site, your robots.txt file will look like this:
User-agent: *
Disallow:
The asterisk is a wildcard character that represents ALL robots. Leaving the Disallow line blank tells the robots that nothing on the site is disallowed.
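If you would like to confirm how a robot is likely to read these two lines, Python's standard urllib.robotparser module can parse the rules and answer "may this robot fetch this page?" The sketch below is only an illustration: www.example.com is a placeholder address, and real spiders may interpret edge cases differently from this module.

```python
from urllib.robotparser import RobotFileParser

# The allow-everything file from the example above.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ALLOW_ALL.splitlines())

# A blank Disallow line means no page is off limits to any robot.
print(parser.can_fetch("googlebot", "http://www.example.com/index.htm"))    # True
print(parser.can_fetch("somebot", "http://www.example.com/cgi-bin/form"))   # True
```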

The next example bars all robots from the cgi-bin directory (where your scripts are typically located), the images directory, and the portfolio directory:
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /portfolio/
Note: You should use a separate Disallow line for each directory or individual file.

In this example, you may wonder why you would want to disallow a robot from indexing your portfolio directory.

If you are a photographer and you have thumbnail images on a portfolio page that link to enlargement pages launched in a pop-up window, you may not want those pop-up pages indexed. These are called dead-end or orphaned pages because only the enlarged image appears on the page with no contact info or menu links back to the main site. If the visitor entered your site on one of these pages, they would have nowhere to go and no way to contact you.

For a live example, check out www.AnJPhotography.com and look at the wedding portfolio. When you click on an image, it opens in a new window. The page in the new window is a dead-end page. A robots.txt file can keep search engines from indexing these dead-end pages so you don't leave site visitors stranded.
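The three-directory rules shown earlier can be tried out in the same way with Python's standard urllib.robotparser module. This is a rough sketch rather than a guarantee of how any particular spider behaves. The page names (enlarge01.htm, logo.gif) and the www.example.com address are invented for the test.

```python
from urllib.robotparser import RobotFileParser

# The rules that bar all robots from three directories.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /portfolio/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

SITE = "http://www.example.com"  # placeholder address
for path in ("/index.htm", "/portfolio/enlarge01.htm", "/images/logo.gif"):
    # can_fetch returns False for any path inside a disallowed directory.
    print(path, parser.can_fetch("*", SITE + path))
# /index.htm True
# /portfolio/enlarge01.htm False
# /images/logo.gif False
```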

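This example keeps googlebot (the Google spider) from getting at the private.htm file:
User-agent: googlebot
Disallow: /private.htm
Note: The path on a Disallow line should begin with a forward slash, which stands for the root of your web site.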
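A rule aimed at one spider leaves every other robot unaffected. The following Python sketch, again using the standard urllib.robotparser module, illustrates this. It assumes the file path is written with a leading slash (/private.htm), and "otherbot" and www.example.com are made-up names.

```python
from urllib.robotparser import RobotFileParser

# A rule that applies to googlebot only (path assumed to start with a slash).
RULES = """\
User-agent: googlebot
Disallow: /private.htm
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

SITE = "http://www.example.com"  # placeholder address
print(parser.can_fetch("googlebot", SITE + "/private.htm"))  # False
print(parser.can_fetch("googlebot", SITE + "/index.htm"))    # True
# No rule names this robot, so nothing is disallowed for it.
print(parser.can_fetch("otherbot", SITE + "/private.htm"))   # True
```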

When you create your robots.txt file, it is extremely important that you use a basic text editor (like Notepad) and NOT a word processing application like Microsoft Word. Applications like Microsoft Word can insert hidden formatting characters that may make your robots.txt file unreadable to robots.

After you post your robots.txt file to the web server, you can validate it to make sure it is properly formatted. There are several free validators on the web. Here is one: http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
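For a quick check before you upload, a few lines of Python can flag the kind of hidden characters a word processor leaves behind. This is a simplified sketch and not a substitute for a full validator: the function name check_robots_bytes is hypothetical, and it recognizes only the two fields covered in this article (User-agent and Disallow).

```python
from typing import List

# Only the two fields discussed in this article are treated as valid.
KNOWN_FIELDS = {"user-agent", "disallow"}


def check_robots_bytes(data: bytes) -> List[str]:
    """Return a list of problems found in the raw bytes of a robots.txt file."""
    problems = []
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a hidden byte order mark")
        data = data[3:]
    for number, raw in enumerate(data.splitlines(), start=1):
        # Anything outside printable ASCII (tabs excepted) is suspicious.
        if any(b > 126 or (b < 32 and b != 9) for b in raw):
            problems.append(f"line {number}: hidden or non-ASCII characters")
            continue
        line = raw.decode("ascii").split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        field, separator, _value = line.partition(":")
        if not separator or field.strip().lower() not in KNOWN_FIELDS:
            problems.append(f"line {number}: not a 'Field: value' line")
    return problems


print(check_robots_bytes(b"User-agent: *\r\nDisallow: /cgi-bin/\r\n"))  # []
```

To check a real file, read it in binary mode (for example, open("robots.txt", "rb").read()) and pass the result to the function.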

Having a robots.txt file in your root directory has several advantages and one main disadvantage. Under the Standard for Robots Exclusion, a compliant robot requests the robots.txt file before it indexes anything else on your web site, so the file is the default entry point for robots when it is present. Compliance with the standard is voluntary, but the major search engines honor it. This is the primary reason the file should be there. Beyond that, it can help with your search engine rankings when used correctly, and it can keep dead-end pages on your web site from being indexed.

The primary disadvantage is that anyone on the web can view the robots.txt file, including nefarious individuals. Never use it to try to hide sensitive pages or directories (such as those containing passwords or private information), because listing them only advertises where they are. For more information about the robots.txt file and a complete list of robots, visit the following web site: http://www.robotstxt.org/wc/robots.html

Author Bio:

Sandra Waggett

Sandra Waggett is the founder of MSW Interactive Designs LLC (MSW-ID) and the principal designer of its major products and websites. MSW-ID provides custom website design, hosting, ecommerce and online marketing solutions to nearly 400 small business clients nationwide. MSW-ID helps small business professionals achieve an effective Internet presence.

Prior to founding MSW Interactive Designs LLC, she spent nearly 5 years working as a Senior Engineer for BAE Systems on the Lockheed Martin Mission Systems Team in Colorado Springs, CO. While with BAE, she was the training lead for the proposal phase of the Integrated Space Command and Control (ISC2) program. In this role, she authored the 10-year training plan for the proposal and developed web-based training prototypes for presentation to the Government decision makers. Sandy earned her Master of Arts degree in Curriculum and Instruction, Corporate Track, from the University of Colorado, Colorado Springs. Her specialties include web design, interface design, instructional design, and computer-based training development.

Sandy grew up in Las Vegas, NV and now resides on Capitol Hill in Washington DC.

© 2006 www.crazyweblist.com - All Rights Reserved