IP addresses of search engines
This page is also available in German.
I use my own website as a test platform for everything related to the Internet. I also log every access to my web pages and try to track the behaviour of search engines.
To distinguish real visits by humans from visits by search engine robots, I try to filter my logs. To make this easier, I maintain a list of all search engines I have identified visiting my website. As this list is stored in my MySQL database, publishing it here takes no extra work.
| Robot | Subnet / URL | Record last updated | SQL String |
|---|---|---|---|
| Adobe | sjfw1.adobe.com | October 13, 2004 | = "sjfw1.adobe.com" |
| Alexa | 209.237.224.0/19 | November 11, 2004 | LIKE "209.237.2%" |
| Alexa | public.alexa.com | October 1, 2004 | LIKE "%public.alexa.com" |
| Allesklar | scooter.allesklar.de | August 27, 2004 | = "scooter.allesklar.de" |
| Almaden | wfp2.almaden.ibm.com | October 1, 2004 | = "wfp2.almaden.ibm.com" |
| Ask Jeeves | %.ask.com | March 11, 2005 | LIKE "%.ask.com" |
| Convera | 63.241.61.8 | January 7, 2005 | = "63.241.61.8" |
| Convera | 63.241.61.8 | January 18, 2005 | = "63.241.61.8" |
| Cosmix | %.cosmixcorp.com | December 29, 2005 | LIKE "%.cosmixcorp.com" |
| Cuill | %.cuill.com | August 7, 2007 | LIKE "%.cuill.com" |
| Diariofotografico | diariofotografico.com | October 3, 2005 | = "ns1.diariofotografico.com" |
| Dir.com | crawl20.dir.com | March 14, 2005 | = "crawl20.dir.com" |
| Echo | x-echo.com | October 2, 2004 | LIKE "%.x-echo.com" |
| Exabot | exabot.com | October 2, 2004 | LIKE "%.exabot.com" |
| Gamekit | bot1.gamekit.de | October 23, 2004 | = "bot1.gamekit.de" |
| Gigablast | www.gigablast.com | March 14, 2006 | = "www.gigablast.com" |
| GlobalSpec | 66.194.55.242 | November 21, 2004 | = "66.194.55.242" |
| Goo | goo.ne.jp | February 18, 2006 | LIKE "%.goo.ne.jp" |
| Google | 64.233.0.0/17 | October 1, 2004 | LIKE "64.233.%" |
| Google | 64.68.64.0/19 | October 1, 2004 | LIKE "64.68.64.%" |
| Google | 64.68.88.0/21 | October 1, 2004 | LIKE "64.68.88.%" |
| Google | googlebot.com | October 1, 2004 | LIKE "%googlebot.com" |
| Google | proxy.google.com | October 1, 2004 | LIKE "%proxy.google.com" |
| Google (Mediapartners) | 66.249.0.0/19 | October 1, 2004 | LIKE "66.249.%" |
| Inacts | search.inacts.com | December 19, 2004 | = "search.inacts.com" |
| Inktomi | %inktomisearch.com | October 1, 2004 | LIKE "%inktomisearch.com" |
| IRL Crawler | irl-crawler%.cs.tamu.edu | July 20, 2006 | LIKE "irl-crawler%.cs.tamu.edu" |
| Jeteye | jeteye.com | October 3, 2004 | LIKE "%.jeteye.com" |
| KnowItAll | hail.cs.washington.edu | August 15, 2004 | = "hail.cs.washington.edu" |
| Looksmart | looksmart.com | January 15, 2005 | LIKE "%.looksmart.com" |
| Majestic-12 | %.idi.ntnu.no | August 7, 2007 | LIKE "%.idi.ntnu.no" |
| Majestic-12 | 205.209.182.240 | February 18, 2006 | = "205.209.182.240" |
| metager2.de | 193.164.8.43 | December 29, 2005 | = "193.164.8.43" |
| Microsoft | 207.46.0.0/16 | October 1, 2004 | LIKE "207.46.%" |
| Microsoft | msnbot.msn.com | December 19, 2004 | = "msnbot.msn.com" |
| Microsoft | search.msn.com | October 1, 2004 | LIKE "%search.msn.com" |
| Microsoft Live | %.search.live.com | January 7, 2007 | LIKE "%.search.live.com" |
| Neofonie | spider.neofonie.de | March 16, 2007 | = "spider.neofonie.de" |
| Netcraft | 195.92.95.61 | December 30, 2004 | = "195.92.95.61" |
| NoxtrumBot | tpiol.tpiol.com | August 7, 2007 | = "tpiol.tpiol.com" |
| Overture | nat-yrl.overture.com | February 18, 2005 | = "nat-yrl.overture.com" |
| Picsearch | picsearch.com | October 1, 2004 | LIKE "%.picsearch.com" |
| Seekbot | %.seekbot.net | December 29, 2005 | LIKE "%.seekbot.net" |
| seventwentyfour.com | 209.167.50.???/?? | October 1, 2004 | LIKE "209.167.50.%" |
| Spammer | 211.157.8.44 | October 3, 2004 | = "211.157.8.44" |
| Spammer | 217.107.222.75 | August 11, 2005 | = "217.107.222.75" |
| Spammer | 66.246.218.107 | December 29, 2005 | = "66.246.218.107" |
| Spammer | 81.169.180.237 | December 30, 2004 | = "81.169.180.237" |
| Suchen.de | %.suchen.de | August 7, 2007 | LIKE "%.suchen.de" |
| Teoma | %.teoma.com | October 1, 2004 | LIKE "%.teoma.com" |
| Thunderstone | copilot.thunderstone.com | October 2, 2004 | = "copilot.thunderstone.com" |
| Tricus | 213.221.109.???/?? | October 1, 2004 | LIKE "213.221.109.%" |
| Turnitin | turnitin.com | October 2, 2004 | LIKE "%.turnitin.com" |
| W3C Validator | w3.org | October 1, 2004 | LIKE "%.w3.org" |
| Yahoo | yahoo.com | December 13, 2004 | LIKE "%.yahoo.com" |
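
The entries in the "SQL String" column can also be applied outside the database. The following is a minimal sketch in Python, assuming that `%` and `_` carry their usual SQL LIKE meaning and that matching should ignore case. The names `RULES`, `like_to_regex` and `identify_robot` are hypothetical and only serve this illustration; the rules are a small sample copied from the table.

```python
import re

# A small sample of (robot, operator, pattern) rows copied from the table.
RULES = [
    ("Alexa", "LIKE", "209.237.2%"),
    ("Ask Jeeves", "LIKE", "%.ask.com"),
    ("Gigablast", "=", "www.gigablast.com"),
    ("IRL Crawler", "LIKE", "irl-crawler%.cs.tamu.edu"),
    ("Yahoo", "LIKE", "%.yahoo.com"),
]


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression."""
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")   # % stands for any run of characters
        elif char == "_":
            parts.append(".")    # _ stands for exactly one character
        else:
            parts.append(re.escape(char))
    # Case-insensitive matching is an assumption of this sketch.
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def identify_robot(host):
    """Return the robot name for a host name or IP string, or None."""
    for robot, operator, pattern in RULES:
        if operator == "=":
            matched = host.lower() == pattern.lower()
        else:
            # fullmatch: the pattern must cover the whole string, as LIKE does.
            matched = like_to_regex(pattern).fullmatch(host) is not None
        if matched:
            return robot
    return None
```

With these rules, `identify_robot("crawler4.ask.com")` yields `"Ask Jeeves"`, while a host that matches no rule yields `None` and would be counted as a human visitor.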
To make this list easier to use, I have also included a special version as an SQL DELETE statement that can be run directly in MySQL.
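
As an illustration of what such a statement could look like, here is a minimal sketch that assembles a DELETE statement from a few rows of the table. It is not the published version. The table name `access_log` and the column name `remote_host` are assumptions, since the real schema is not shown here, and SQLite from the Python standard library stands in for MySQL so that the example runs on its own. MySQL drivers typically use a different placeholder than `?`.

```python
import sqlite3

# Hypothetical table and column names; the real schema is not given.
TABLE = "access_log"
COLUMN = "remote_host"

# (operator, pattern) pairs copied from the "SQL String" column.
CONDITIONS = [
    ("=", "sjfw1.adobe.com"),
    ("LIKE", "%.ask.com"),
    ("LIKE", "207.46.%"),
]


def build_delete(conditions):
    """Return a parameterised DELETE statement and its parameter tuple."""
    for operator, _ in conditions:
        if operator not in ("=", "LIKE"):
            raise ValueError(f"unsupported operator: {operator}")
    clauses = " OR ".join(f"{COLUMN} {operator} ?" for operator, _ in conditions)
    params = tuple(pattern for _, pattern in conditions)
    return f"DELETE FROM {TABLE} WHERE {clauses}", params


def remaining_hosts(hosts):
    """Load hosts into an in-memory table, run the DELETE, return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {TABLE} ({COLUMN} TEXT)")
    conn.executemany(f"INSERT INTO {TABLE} VALUES (?)", [(h,) for h in hosts])
    statement, params = build_delete(CONDITIONS)
    conn.execute(statement, params)
    rows = conn.execute(
        f"SELECT {COLUMN} FROM {TABLE} ORDER BY {COLUMN}"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Passing the patterns as parameters instead of pasting them into the statement avoids quoting problems when a pattern contains a quotation mark.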
This list is not exhaustive, as I can only include robots that have visited my website. Unfortunately, most search engines update their records only sporadically and visit only the home page and at most two or three sub-pages. It is therefore easy to overlook some of the smaller search bots. Google and Microsoft, on the other hand, often account for thousands of page requests per day.
Furthermore, the SQL strings shown are often inaccurate, as they do not map the exact subnets of the search engines. I chose this simpler approach to make editing easier.
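
To see how far a LIKE prefix and the listed subnet can diverge, the following sketch compares the two for individual addresses using Python's standard `ipaddress` module. The helper names `like_prefix_matches` and `false_positives` are hypothetical, and the sketch only handles patterns with a single trailing `%`, which covers the IP-based rows of the table.

```python
import ipaddress


def like_prefix_matches(pattern, address):
    """Emulate LIKE for patterns whose only wildcard is a trailing %."""
    if not pattern.endswith("%") or "%" in pattern[:-1] or "_" in pattern:
        raise ValueError("only a single trailing % is supported in this sketch")
    return address.startswith(pattern[:-1])


def false_positives(pattern, subnet, candidates):
    """Return candidates the LIKE pattern accepts but the subnet excludes."""
    network = ipaddress.ip_network(subnet)
    return [
        address
        for address in candidates
        if like_prefix_matches(pattern, address)
        and ipaddress.ip_address(address) not in network
    ]


# Alexa row: 209.237.224.0/19 spans 209.237.224.0 to 209.237.255.255,
# but LIKE "209.237.2%" also accepts 209.237.2.x and 209.237.200.x.
extra = false_positives(
    "209.237.2%",
    "209.237.224.0/19",
    ["209.237.224.1", "209.237.200.1", "209.237.2.7"],
)
```

Here `extra` contains `209.237.200.1` and `209.237.2.7`, two addresses outside the subnet that the pattern would still treat as Alexa. The reverse also occurs: `LIKE "64.68.64.%"` covers only a single /24 of the listed 64.68.64.0/19.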
Further information on search engines can be found on these websites:
Disclaimer: Despite careful review of the content, I accept no liability for the content of external links. The operators of the linked pages are solely responsible for their content.
Also visit my picture gallery at gallery.plogmann.net.
© Stefan Plogmann, 1996-2008
