The Web Robots Pages
--------------------------------------------------------------------------------
The Web Robots Database

The List of Active Robots has been changed to a new format, called The Web Robots Database. This format allows more information to be stored, updates to happen faster, and the information to be presented more clearly.

Note that now that robot technology is being used in increasing numbers of end-user products, this list is becoming less useful and less complete.

For general information on robots see Web Robots Pages.

The robot information is now stored in individual files, with several HTML tables providing different views of the data:

View Names
View Type Details using tables
View Contact Details using tables

Browsers without support for tables can consult the overview of text files. The combined raw data in machine-readable format is available in a text file.

To add a new robot, fill in this empty template, using this schema description, and email it to me.
--------------------------------------------------------------------------------
Others

There are robots out there that the database contains no details on. If and when I get those details they will be added; otherwise they'll remain on the list below, as unresponsive or unknown sites.

Services with no information

These services must use robots, but haven't replied to requests for an entry...

Magellan
User-agent field: Wobot/1.00
From: mckinley.mckinley.com (206.214.202.2) and galileo.mckinley.com (206.214.202.45)
Honors "robots.txt": yes
Contact: cedeno@mckinley.mckinley.com (or possibly: spider@mckinley.mckinley.com)
Purpose: Resource discovery for Magellan (http://www.mckinley.com/)

User Agents

These look like new robots, but have no contact info...
BizBot04 kirk.overleaf.com
HappyBot (gserver.kw.net)
CaliforniaBrownSpider
EI*Net/0.1 libwww/0.1
Ibot/1.0 libwww-perl/0.40
Merritt/1.0
StatFetcher/1.0
TeacherSoft/1.0 libwww/2.17
WWW Collector
processor/0.0ALPHA libwww-perl/0.20
wobot/1.0 from 206.214.202.45
Libertech-Rover www.libertech.com?
WhoWhere Robot
ITI Spider
w3index
MyCNNSpider
SummyCrawler
OGspider
linklooker
CyberSpyder (amant@www.cyberspyder.com)
SlowBot
heraSpider
Surfbot
Bizbot003
WebWalker
SandBot
EnigmaBot
spyder3.microsys.com
www.freeloader.com.

Hosts

These have no known user-agent, but have requested /robots.txt repeatedly or exhibited crawling patterns.

205.252.60.71
194.20.32.131
198.5.209.201
acke.dc.luth.se
dallas.mt.cs.cmu.edu
darkwing.cadvision.com
waldec.com
www2000.ogsm.vanderbilt.edu
unet.ca
murph.cais.net (rapid fire... sigh)
spyder3.microsys.com
www.freeloader.com.
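
The page does not say how the hosts listed above were spotted. As a rough sketch only, one way to find hosts that request /robots.txt repeatedly is to count those requests per host in a server access log. The sketch assumes the log is in Common Log Format; the function name `robots_txt_requesters`, the `min_requests` threshold, and the sample log entries are all hypothetical and made up for illustration.

```python
import re
from collections import Counter

# Assumed input: Common Log Format, i.e.
#   host ident authuser [date] "METHOD path PROTOCOL" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*"'
)


def robots_txt_requesters(lines, min_requests=2):
    """Return {host: count} for hosts that fetched /robots.txt
    at least min_requests times. Unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path") == "/robots.txt":
            counts[match.group("host")] += 1
    return {host: n for host, n in counts.items() if n >= min_requests}


# Made-up sample entries; hosts and timestamps are illustrative only.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/1997:00:00:01 +0000] "GET /robots.txt HTTP/1.0" 200 68',
    '192.0.2.10 - - [01/Jan/1997:00:00:02 +0000] "GET /index.html HTTP/1.0" 200 1024',
    '192.0.2.10 - - [01/Jan/1997:06:00:01 +0000] "GET /robots.txt HTTP/1.0" 200 68',
    'crawler.example.org - - [01/Jan/1997:07:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 68',
    'browser.example.net - - [01/Jan/1997:08:00:00 +0000] "GET /index.html HTTP/1.0" 200 1024',
]

if __name__ == "__main__":
    # Print each repeat requester with its /robots.txt request count.
    for host, n in sorted(robots_txt_requesters(SAMPLE_LOG).items()):
        print(f"{host}\t{n}")
```

A count like this only flags candidates. A host that fetches /robots.txt once and then crawls heavily would be missed, so the "exhibited crawling patterns" case still needs a separate look at request rates.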