PageTraffic SEO Blog

Yahoo! Explains Ad Tracking And Dead URLs

February 27th, 2007



The official Yahoo! blog has published a post explaining how best to keep ad tracking URLs and dead URLs out of the Yahoo! Search index.

Describing ad tracking URLs, Yahoo! says: “Ad tracking URLs are used by Webmasters to help determine what traffic is coming in from advertisements (e.g., Yahoo! Sponsored Search and Yahoo! Publisher Network) but aren’t necessary to include in the Yahoo! Search index. Sometimes you might notice that these URLs still appear in the index. That’s because they’ve appeared on pages that are ‘crawlable’ or may have been copied over to crawlable pages by users. If you don’t want Yahoo! Slurp, our Web crawler to index these URLs you can use wildcards in robots.txt. For example, if you are using the parameter 'ref' to track ad sources, you can use a rule like the one below to keep your tracking URLs from being Slurped:
User-Agent: Yahoo! Slurp
Disallow: /*ref=YahooPublisherNetwork”
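
To see which URLs a wildcard rule like this would cover, here is a minimal Python sketch. It assumes that '*' in a Disallow rule stands for any run of characters and that the rule is matched from the start of the URL path plus query string. The function names and the sample paths are made up for illustration; this is not Yahoo!'s actual matching code.

```python
import re

def rule_to_regex(rule):
    """Turn a robots.txt path rule into a regex, treating '*' as 'any characters'."""
    # Escape every literal piece, then join the pieces with '.*' for each wildcard.
    literal_parts = [re.escape(part) for part in rule.split("*")]
    return re.compile(".*".join(literal_parts))

def is_disallowed(path, rule):
    """Return True if the path (including any query string) matches the rule."""
    # re.match anchors at the start of the string, like a robots.txt prefix rule.
    return rule_to_regex(rule).match(path) is not None

RULE = "/*ref=YahooPublisherNetwork"

# Hypothetical paths on a site that tags ad traffic with the 'ref' parameter.
print(is_disallowed("/products/widget.html?ref=YahooPublisherNetwork", RULE))  # True
print(is_disallowed("/products/widget.html", RULE))                            # False
print(is_disallowed("/products/widget.html?ref=newsletter", RULE))             # False
```

Only the tracking variant of the page is blocked under these assumptions; the clean URL stays crawlable.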

According to Yahoo!, the best way to remove dead URLs from the Yahoo! Search index is to return an HTTP 404 error when the Yahoo! crawler requests the page. The post goes on: “If you want to act before the 404 discovery and URL removal process completes, you can use Site Explorer to quickly delete the URLs from the index. One advantage to using Site Explorer is that you can delete multiple URLs including an entire subpath so long as the URL prefix is the same. As Danny Sullivan points out in his deep-dive post on the delete function, if you delete http://domain.com/subarea1/, then all the pages that begin with ‘domain.com/subarea1’ will get removed. E.g.:
http://domain.com/subarea1/page1.html
http://domain.com/subarea1/page45.html”
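
The prefix behaviour described in the quote can be sketched in a few lines of Python. The post does not spell out Site Explorer's exact matching rules, so this assumes a plain string-prefix comparison on host plus path, ignoring the scheme and keeping the trailing slash. The function names and the extra sample URLs are hypothetical.

```python
from urllib.parse import urlsplit

def host_and_path(url):
    """Drop the scheme and query string, keeping 'domain.com/some/path'."""
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path

def removed_by_prefix_delete(indexed_urls, deleted_url):
    """List the indexed URLs that share the deleted URL's prefix."""
    prefix = host_and_path(deleted_url)
    return [url for url in indexed_urls if host_and_path(url).startswith(prefix)]

# Hypothetical index contents; only the first two sit under /subarea1/.
indexed = [
    "http://domain.com/subarea1/page1.html",
    "http://domain.com/subarea1/page45.html",
    "http://domain.com/subarea2/page1.html",
    "http://domain.com/index.html",
]

for url in removed_by_prefix_delete(indexed, "http://domain.com/subarea1/"):
    print(url)
```

Running a check like this against a list of your own URLs before deleting a subpath is a cheap way to confirm that nothing you still want indexed shares the prefix.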


Copyright © 2006-2008 PageTraffic SEO Blog. All rights reserved.
