{"id":106,"date":"2005-02-04T11:56:15","date_gmt":"2005-02-04T16:56:15","guid":{"rendered":"http:\/\/mattclare.ca\/wordpress\/?p=106"},"modified":"2005-02-08T09:17:00","modified_gmt":"2005-02-08T14:17:00","slug":"msn-search-bot","status":"publish","type":"post","link":"https:\/\/mattclare.ca\/blog\/2005\/02\/04\/msn-search-bot\/","title":{"rendered":"MSN Search bot"},"content":{"rendered":"<p>My web site used to get along pretty well with Google.  They searched my web site a lot and I gained some pretty good rankings.  I could just mention someone I know in my blog&#8230;. say&#8230;. Chris Court A.K.A. Monkey, and pretty soon I was the number one Google result for that person.<\/p>\n<p>Well, now that MSN has revamped their search engine, it&#8217;s been getting a little greedy.  As per this thread over on <a href=\"http:\/\/slashdot.org\/comments.pl?sid=138325&#038;threshold=0&#038;commentsort=0&#038;tid=109&#038;tid=217&#038;mode=thread&#038;pid=11571924#11572030\">Slashdot<\/a>, the bot which Microsoft uses to index the world&#8217;s web pages appears to needlessly index all kinds of content.<\/p>\n<p>Here&#8217;s what it&#8217;s been doing to this site in the last four days, versus all the other search engine bots:<\/p>\n<table class=\"aws_border\" border=\"0\" cellpadding=\"2\" cellspacing=\"0\" width=\"100%\">\n<tr>\n<td class=\"aws_title\" width=\"70%\">Robots\/Spiders visitors (Top 25) <\/td>\n<td class=\"aws_blank\">&nbsp;<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\">\n<table class=\"aws_data\" border=\"1\" bordercolor=\"#ECECEC\" cellpadding=\"2\" cellspacing=\"0\" width=\"100%\">\n<tr bgcolor=\"#ECECEC\">\n<th>9 different robots*<\/th>\n<th bgcolor=\"#66F0FF\" width=\"80\">Hits<\/th>\n<th bgcolor=\"#339944\" width=\"80\">Bandwidth<\/th>\n<th width=\"120\">Last visit<\/th>\n<\/tr>\n<tr>\n<td class=\"aws\">MSNBot<\/td>\n<td>305+44<\/td>\n<td>3.11 MB<\/td>\n<td>04 Feb 2005 &#8211; 09:49<\/td>\n<\/tr>\n<tr>\n<td 
class=\"aws\">Googlebot<\/td>\n<td>138+14<\/td>\n<td>1.96 MB<\/td>\n<td>04 Feb 2005 &#8211; 09:31<\/td>\n<\/tr>\n<tr>\n<td class=\"aws\">AskJeeves<\/td>\n<td>35+7<\/td>\n<td>214.61 KB<\/td>\n<td>04 Feb 2005 &#8211; 00:41<\/td>\n<\/tr>\n<tr>\n<td class=\"aws\">Inktomi Slurp<\/td>\n<td>19+22<\/td>\n<td>117.74 KB<\/td>\n<td>04 Feb 2005 &#8211; 03:41<\/td>\n<\/tr>\n<tr>\n<td class=\"aws\">Unknown robot (identified by &#8216;crawl&#8217;)<\/td>\n<td>19+9<\/td>\n<td>140.79 KB<\/td>\n<td>04 Feb 2005 &#8211; 03:30<\/td>\n<\/tr>\n<tr>\n<td class=\"aws\">Unknown robot (identified by &#8216;spider&#8217;)<\/td>\n<td>6+1<\/td>\n<td>76.38 KB<\/td>\n<td>04 Feb 2005 &#8211; 08:01<\/td>\n<\/tr>\n<tr>\n<td class=\"aws\">Unknown robot (identified by hit on &#8216;robots.txt&#8217;)<\/td>\n<td>0+7<\/td>\n<td>610 Bytes<\/td>\n<td>03 Feb 2005 &#8211; 06:42<\/td>\n<\/tr>\n<tr>\n<td class=\"aws\">Netcraft<\/td>\n<td>4<\/td>\n<td>0<\/td>\n<td>03 Feb 2005 &#8211; 16:36<\/td>\n<\/tr>\n<tr>\n<td class=\"aws\">Alexa (IA Archiver)<\/td>\n<td>1+2<\/td>\n<td>5.98 KB<\/td>\n<td>04 Feb 2005 &#8211; 01:48<\/td>\n<\/tr>\n<\/table>\n<\/td>\n<\/tr>\n<\/table>\n<p><span style=\"font: 11px verdana, arial, helvetica;\">* Robots shown here gave hits or traffic &#8220;not viewed&#8221; by visitors, so they are not included in other charts. Numbers after + are successful hits on &#8220;robots.txt&#8221; files<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>My web site used to get along pretty well with Google. They searched my web site a lot and I gained some pretty good rankings. I could just mention someone I know in my blog&#8230;. say&#8230;. Chris Court A.K.A. Monkey, and pretty soon I was the number one Google result for that person. 
Well&hellip; <a class=\"continue\" href=\"https:\/\/mattclare.ca\/blog\/2005\/02\/04\/msn-search-bot\/\">Continue Reading<span> MSN Search bot<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1,3],"tags":[],"_links":{"self":[{"href":"https:\/\/mattclare.ca\/blog\/wp-json\/wp\/v2\/posts\/106"}],"collection":[{"href":"https:\/\/mattclare.ca\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mattclare.ca\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mattclare.ca\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mattclare.ca\/blog\/wp-json\/wp\/v2\/comments?post=106"}],"version-history":[{"count":0,"href":"https:\/\/mattclare.ca\/blog\/wp-json\/wp\/v2\/posts\/106\/revisions"}],"wp:attachment":[{"href":"https:\/\/mattclare.ca\/blog\/wp-json\/wp\/v2\/media?parent=106"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mattclare.ca\/blog\/wp-json\/wp\/v2\/categories?post=106"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mattclare.ca\/blog\/wp-json\/wp\/v2\/tags?post=106"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}