Clinton/Gore – Math error 714; division by zero.

Robot Control

November 7, 2003 - 11:13pm

Lazy Web, I invoke thee…

Google’s sending people to my pages’ RSS feeds rather than the real pages. The robots.txt file, which in its standard form only matches path prefixes, seems to have no control over file extensions or (specifically) query arguments. There appear to be no HTTP headers to control crawling either (the caching headers are not appropriate).

Is there any way to prevent Google from going to any URL ending in ?rss on the site?

Answer: It was in the FAQ, of all places. grumble

==

12. How do I tell Googlebot not to crawl dynamically generated pages on my site?
The following robots.txt file will achieve this.
User-agent: Googlebot
Disallow: /*?

==
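To see why that rule catches the ?rss URLs, here is a rough Python sketch of wildcard matching as Googlebot documents it: * stands for any run of characters, and a trailing $ anchors the rule to the end of the URL. The helper names rule_to_regex and is_blocked are made up for illustration, and the narrower rule /*?rss$ is an assumption based on Google's documented $ extension, not something quoted from the FAQ. Other crawlers may ignore both extensions.

```python
import re

def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a robots.txt path rule using Google's * and $ extensions into a regex."""
    # A trailing '$' means the rule must match through the end of the URL.
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(rule: str, path: str) -> bool:
    """Return True if a Disallow rule would match the path (query string included)."""
    # re.match anchors at the start, mirroring how rules match from the first '/'.
    return rule_to_regex(rule).match(path) is not None

# The rule from the FAQ blocks anything with a query string.
print(is_blocked("/*?", "/node/12?rss"))      # True
print(is_blocked("/*?", "/node/12"))          # False

# Hypothetical narrower rule: only URLs that end in ?rss.
print(is_blocked("/*?rss$", "/node/12?rss"))     # True
print(is_blocked("/*?rss$", "/node/12?page=2"))  # False
```

Note that Disallow: /*? blocks every URL with a query string, not just the feeds. If other dynamic pages should stay crawlable, something like Disallow: /*?rss$ is probably the closer fit to the original question.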

“The whole modern world has divided itself into Conservatives and Progressives. The business of Progressives is to go on making mistakes. The business of the Conservatives is to prevent the mistakes from being corrected.” — ILN, 4/19/24 – G. K. Chesterton
