Search engines and robots.txt
Since the early days of internet search engines, the robots.txt file has been the way webmasters tell search engines like Google which content should be crawled and indexed. As part of Google Sitemaps, later named the XML Sitemaps Protocol, its use was expanded with Sitemaps Autodiscovery: webmasters can now direct search engines to the website's XML sitemap. As soon as a search engine has found your website and its robots.txt file, it also knows where to find your XML sitemap.
Create a robots.txt file
The robots.txt file instructs search engine robots which pages on your website should be crawled and consequently indexed. Most websites have files and folders that are not relevant to search engines (such as images or admin files), so creating a robots.txt file can improve how your website is indexed.

A robots.txt file is a simple text file that can be created with a plain-text editor such as Notepad. If you are using WordPress, a sample robots.txt file would be:

User-agent: *
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/

“User-agent: *” means that all search bots (from Google, Yahoo, MSN and so on) should use these instructions when crawling your website. Unless your website is complex, you will not need to set different instructions for different spiders.

“Disallow: /wp-” makes sure that search engines do not crawl the WordPress files. This line excludes every file and folder whose path starts with “/wp-” from crawling, which keeps duplicate content and admin files out of the index.
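
To see how these prefix rules behave, the sample file can be checked with Python's standard urllib.robotparser module. This is only a rough sketch: the test paths (/wp-admin/, /about/ and so on) are made up for illustration, and real search engines may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The sample WordPress rules from above, held in a string for testing.
SAMPLE_RULES = """\
User-agent: *
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_RULES.splitlines())


def is_allowed(path, agent="*"):
    """Return True if the given agent may crawl the path under SAMPLE_RULES."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to illustrate prefix matching.
    for path in ["/wp-admin/", "/wp-content/theme.css", "/feed/", "/about/"]:
        status = "allowed" if is_allowed(path) else "blocked"
        print(path, status)
```

Because “Disallow: /wp-” is matched as a prefix, both /wp-admin/ and /wp-content/theme.css are reported as blocked, while /about/ stays crawlable.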

If you are not using WordPress, just replace the Disallow lines with the files or folders on your website that should not be crawled, for instance:

User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /any other folder to be excluded/

After you have created the robots.txt file, just upload it to the root directory of your website and you are done!
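
If you would rather generate the file than type it by hand, a small script can do it. The following is a minimal sketch, not a required tool: the helper name build_robots_txt and the example folder list are assumptions made for illustration.

```python
def build_robots_txt(disallowed_paths, user_agent="*"):
    """Build robots.txt text that blocks the given paths for one user agent."""
    lines = [f"User-agent: {user_agent}"]
    for path in disallowed_paths:
        # Paths in robots.txt are relative to the site root, so they start with "/".
        if not path.startswith("/"):
            path = "/" + path
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical folders to exclude; replace them with your own.
    content = build_robots_txt(["/images/", "/cgi-bin/"])
    with open("robots.txt", "w", encoding="utf-8") as handle:
        handle.write(content)
    print(content, end="")
```

The resulting robots.txt is written to the current working directory and still has to be uploaded to the root directory of the website.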

Sitemap linking in the robots.txt file
One final directive that you can use, and that relates to the next section of this page, is the 'Sitemap' directive. It tells search engines and other robots where your sitemap is located. For example, the complete robots.txt could look like this:

User-agent: *
Disallow:
Sitemap: http://www.advancedhtml.co.uk/sitemap.txt

Just be sure to replace the generic domain name above with your own.
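
As a quick check that a robot can actually read the sitemap location, the example file can be parsed with Python's standard urllib.robotparser module. This is a sketch under stated assumptions: it needs Python 3.8 or later for site_maps(), it parses the example text from a string instead of fetching a live site, and the path /any-page.html is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from above: nothing is blocked, one sitemap is listed.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: http://www.advancedhtml.co.uk/sitemap.txt
"""

parser = RobotFileParser()
parser.parse(EXAMPLE_ROBOTS_TXT.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()

if __name__ == "__main__":
    print("Sitemaps:", sitemaps)
    # An empty "Disallow:" line means every path may be crawled.
    print("Crawlable:", parser.can_fetch("*", "/any-page.html"))
```

The empty “Disallow:” line blocks nothing, so the only effect of this file is to announce the sitemap.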

For this to work, the robots.txt file must be at a specific location on your server. When spiders visit your site (including the AutoMapIt spiders), they look for a file at the following location:

http://www.example.com/robots.txt

This file helps our spiders know how fast to crawl your site and which pages to avoid. Full documentation can be found at http://www.robotstxt.org. As long as the spiders can find the file in the right place and it contains the 'Sitemap:' line from above, the search engines will know where to find your sitemap! Be sure to use the correct name for the file, and don't put it into a sub-folder of your site.
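
The fixed location can be worked out from any page address on the site: keep the scheme and host, and replace everything after them with /robots.txt. A minimal sketch follows; the function name robots_txt_url and the sample page URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the root-level robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; drop the page path, query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical page deep inside a site.
    print(robots_txt_url("http://www.example.com/blog/2010/post.html?page=2"))
```

Whatever page is passed in, the result points at the root of the host, which is why a robots.txt placed in a sub-folder is never found.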

The best part about this method of submitting your sitemap is that it is universal among the search engines. They pretty much all support this method and I'm sure that more will be adopting this in the very near future.
