You are reading The Rietta Blog, a publication about the web since 2005.

Re: David Bartosik: Why Robots.txt by Matt Benya

I came across a blog article, David Bartosik: Why Robots.txt by Matt Benya, which happens to mention RoboGen, a program for editing robots.txt files that I wrote nearly six years ago! I do enjoy finding references to my previous work. Mr. Benya’s explanation of the robots.txt file reminds me of a situation I came across a few weeks ago.

I had logged into one of the web servers and noticed the system was not responding as snappily as usual. It turns out the load average was at 15, caused by a large number of instances of a customer CGI script. Fortunately, these scripts were being run by a particular user, so I was able to find and inspect the tail end of the log file and determine that ZyBorg, from wisenutbot.com, was rapidly accessing the dynamically generated site through its Perl CGI scripts. To get the server load under control, I created a robots.txt for the site and blocked the ZyBorg user-agent from indexing the Perl scripts. Fortunately, the robot did comply with the exclusion standard and the rapid-fire crawling stopped.
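A robots.txt along these lines would do the job. This is a sketch, not the actual file from that server; the /cgi-bin/ path is an assumption, since the real location of the Perl scripts is not recorded here.

```
# Hypothetical example: block only the ZyBorg crawler from the
# CGI scripts while leaving the rest of the site crawlable.
# The /cgi-bin/ path is assumed for illustration.
User-agent: ZyBorg
Disallow: /cgi-bin/

# All other well-behaved crawlers remain unrestricted.
User-agent: *
Disallow:
```

The key point is that rules are matched per user-agent, so a misbehaving bot can be excluded from expensive dynamic pages without hiding them from every crawler.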

While this story has nothing to do with RoboGen (I edited the file with Vim in the SSH session), it does show one concrete example of the continued applicability of the robot exclusion standard.

About Frank Rietta


Frank Rietta is a web application security architect, author, and speaker. He is a computer scientist with a Master's in Information Security from the College of Computing at the Georgia Institute of Technology. He speaks about security topics and was a contributor to the security chapter of the 7th edition of the "Fundamentals of Database Systems" textbook published by Addison-Wesley.
