PageTraffic SEO Blog


Want Your Pages to be Crawled by Google? Avoid the Use of .exe in URLs!

June 16th, 2008




In a post on his blog, Matt Cutts explains why Google doesn't crawl URLs that end in the .exe file extension. Google crawls most common file extensions, such as .php, .asp, .html and .htm. However, there are certain extensions, such as .exe, that Google doesn't crawl.

According to Matt Cutts, “But there are some file extensions that are mostly binary data, such as .exe, where the vast majority of the time the data would be meaningless blobs, so there are a few extensions to avoid. If your files are named example.dll or example.bin and you don’t see Google crawling pages with that file extension, I’d recommend changing your file extension to something else.”
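As a rough illustration of that advice, here is a minimal Python sketch that flags URLs on your own site whose path ends in one of the extensions mentioned in the quote. The set below (.exe, .dll, .bin) is only an assumption taken from the quote, not Google's actual list, and the function name and example.com URLs are hypothetical.

```python
from posixpath import splitext
from urllib.parse import urlparse

# Extensions named in the quote above. Illustrative only:
# this is NOT Google's actual list of skipped extensions.
BINARY_EXTENSIONS = {".exe", ".dll", ".bin"}


def has_binary_extension(url: str) -> bool:
    """Return True if the URL's path ends in one of the extensions above."""
    path = urlparse(url).path      # drops the query string and fragment
    _, ext = splitext(path)        # "/tool.EXE" -> ("/tool", ".EXE")
    return ext.lower() in BINARY_EXTENSIONS


# Hypothetical URLs used only to show the check.
urls = [
    "http://example.com/download/example.dll",
    "http://example.com/about.html",
    "http://example.com/tool.EXE?id=3",
]
flagged = [u for u in urls if has_binary_extension(u)]
print(flagged)
```

Any URL that ends up in `flagged` would be a candidate for renaming, if it is a page you want crawled.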

One way to determine whether Google will crawl pages with a certain file extension is to run a query such as [filetype:exe]. If you don't see any URLs that end directly in .exe, one of the following is the likely reason:

  1. There are no such files on the web, which we know isn’t true for .exe.
  2. Google chooses not to crawl such pages at this time — usually because pages with that file extension have been unusually useless in the past
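If you want to check several extensions this way, a small helper can build the search URL for each [filetype:] query. This is only a sketch: the `https://www.google.com/search?q=` address is an assumption about the usual Google search URL and is not taken from the post, and the function name is hypothetical. It only builds the link, so you still have to open it and read the results yourself.

```python
from urllib.parse import urlencode


def filetype_query_url(extension: str) -> str:
    """Build a Google search URL for the query [filetype:<extension>]."""
    ext = extension.lstrip(".")  # accept both "exe" and ".exe"
    # Assumed search endpoint; urlencode escapes the colon in the query.
    return "https://www.google.com/search?" + urlencode({"q": f"filetype:{ext}"})


for ext in ("exe", "dll", "bin"):
    print(filetype_query_url(ext))
```

Open each printed link in a browser and look for results whose URL ends directly in that extension.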

An example from SEOmoz shows that URLs ending in “0” were also not being crawled by Google. SEOmoz recently moved a page to a URL that ended in “/web2.0”. Because Google treated the trailing “0” as a file extension that it did not crawl, the page was not crawled at all.

However, Google is now rethinking this crawling policy and is willing to crawl pages whose URLs end in “0”.
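To see why a URL like “/web2.0” can look as if it has a “0” file extension, consider a naive parser that treats everything after the last dot in the final path segment as the extension. The sketch below is an assumption made for illustration, not a description of how Google's crawler actually works, and the function name is hypothetical.

```python
def naive_extension(url_path: str) -> str:
    """Return the text after the last dot in the final path segment, or ""."""
    last_segment = url_path.rstrip("/").rsplit("/", 1)[-1]
    _, dot, ext = last_segment.rpartition(".")
    return ext if dot else ""  # no dot means no extension


print(naive_extension("/web2.0"))      # 0
print(naive_extension("/about.html"))  # html
print(naive_extension("/blog"))        # prints an empty line
```

Under that reading, “/web2.0” falls into the same bucket as any other file with the extension “0”, even though the dot is part of the page name.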


So, in the end, it all comes down to this:

  1. Why Google doesn’t crawl some filetype extensions (when we’ve seen good evidence that the extensions are mostly binary or otherwise not-very-indexable files).
  2. An easy way to use the filetype: operator, so that you can decide whether to avoid a particular filename extension yourself.
  3. Google is willing to revisit old decisions and test them again, which is what we’re doing with the “0” filetype extension.



 



Copyright © 2006-2009 PageTraffic SEO Blog. All rights reserved.
