PageTraffic SEO Blog


Google Speaks On Robots Exclusion Protocol

February 23rd, 2007




A post on the official Google blog discusses the Robots Exclusion Protocol. Some time back we told you about an earlier post on the robots.txt file. That post gave web publishers important details on how they can control the way search engines, Google included, access and index their sites. The main tool for this purpose is the robots.txt file, which gives site owners powerful control over how search engines access their site.
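To make the idea concrete, here is a minimal sketch of how a well-behaved crawler might consult robots.txt rules before fetching a URL. It uses Python's standard `urllib.robotparser` module. The robots.txt content, the example.com URLs, and the bot name `SomeOtherBot` are made up for illustration and do not come from the Google post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and rules are illustrative only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried per URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Googlebot is blocked from /private/ but may fetch the home page.
googlebot_private = parser.can_fetch("Googlebot", "https://example.com/private/report.html")
googlebot_public = parser.can_fetch("Googlebot", "https://example.com/index.html")

# Any other crawler falls under the "*" group and is blocked from /tmp/.
other_tmp = parser.can_fetch("SomeOtherBot", "https://example.com/tmp/cache.html")

print(googlebot_private, googlebot_public, other_tmp)
```

Note that robots.txt is advisory: it only affects crawlers that choose to honor it.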

The more recent post on the Robots Exclusion Protocol provides further details and examples of the mechanisms for controlling how Google accesses and indexes your website.

The post explains how to prevent Googlebot from following links. “Usually when the Googlebot finds a page, it reads all the links on that page and then fetches those pages and indexes them. This is the basic process by which Googlebot ‘crawls’ the web. This is useful as it allows Google to include all the pages on your site, as long as they are linked together.” It goes on to say that you can add the NOFOLLOW meta tag to a page, which tells Googlebot not to follow any links it finds on that page.
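As a rough illustration of the mechanism, the sketch below shows how a crawler could read the robots meta tag of a page and decide whether to follow its links. It relies only on Python's standard `html.parser` module. The class name `RobotsMetaParser`, the helper `page_directives`, and the sample page are hypothetical, and this is in no way how Googlebot itself is implemented.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> or <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when an attribute has no value.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr_map.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def page_directives(html: str) -> set:
    """Return the lowercased set of robots directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that asks crawlers not to follow its links.
PAGE = (
    '<html><head><meta name="robots" content="NOFOLLOW"></head>'
    '<body><a href="/archive">Archive</a></body></html>'
)

directives = page_directives(PAGE)
should_follow_links = "nofollow" not in directives
print(directives, should_follow_links)
```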

Further on, the post explains in detail how to control caching and snippets. “Usually you want Google to display both the snippet and the cached link. However, there are some cases where you might want to disable one or both of these. For example, say you were a newspaper publisher, and you have a page whose content changes several times a day. It may take longer than a day for us to reindex a page, so users may have access to a cached copy of the page that is not the same as the one currently on your site. In this case, you probably don't want the cached link appearing in our results.”
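For the newspaper scenario in the quote, the usual approach is a robots meta tag carrying the `noarchive` directive (hide the cached link) or the `nosnippet` directive (hide the snippet). The following is a small sketch of a template helper that builds such a tag. The function name `robots_meta_tag` and its parameters are assumptions made for this example, not part of any Google tooling.

```python
from html import escape


def robots_meta_tag(show_cached_link: bool = True,
                    show_snippet: bool = True,
                    agent: str = "robots") -> str:
    """Build a robots meta tag that disables the cached link and/or the snippet.

    Returns an empty string when nothing needs to be disabled, because
    showing both is the default behavior and needs no tag.
    """
    directives = []
    if not show_cached_link:
        directives.append("noarchive")
    if not show_snippet:
        directives.append("nosnippet")
    if not directives:
        return ""
    content = ", ".join(directives)
    return '<meta name="{}" content="{}">'.format(escape(agent), escape(content))


# A frequently updated news page: keep the snippet, hide the cached copy.
newspaper_tag = robots_meta_tag(show_cached_link=False)

# Hide both, and address Googlebot only instead of all crawlers.
strict_tag = robots_meta_tag(show_cached_link=False, show_snippet=False, agent="googlebot")

print(newspaper_tag)
print(strict_tag)
```

The generated tag belongs in the `<head>` section of the page it should apply to.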

To learn more about how the Robots Exclusion Protocol can help, read the complete post.


