This morning I got an email telling me that I was getting close to exceeding my bandwidth for the month. Interesting, that’s never happened before. So I checked my stats and sure enough, I’ve served up 8.6 gigs out of my 10 allotted.
Things were running about normal until the 23rd of the month, and then usage quadrupled. Normally I use between 150 and 200 MB a day, but all of a sudden it jumped to over 900 MB. Visits and hits stayed pretty much the same, but pages went way up. The biggest page served was “/archives/miatatude/”, which is automatically generated when requested.
Further delving into the stats, a lot of external links had web addresses with names like: – – – – – – –
Next I looked in the raw access logs and found a bunch of entries that looked like this:
– – [26/Aug/2005:00:00:08 -0500] “GET /archives/miatatude/ HTTP/1.0” 200 26131 “” “Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7”
and
– – [26/Aug/2005:00:02:00 -0500] “GET /archives/miatatude/ HTTP/1.0” 200 1723287 “” “Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7”
I did some reading up on deciphering that mumbo-jumbo, and what is really strange is that both those requests are for the same web page, yet one returned about 26 KB and the other about 1.7 MB. The big question is, what is going on here? I found the top ten or so IP addresses making these requests and denied them access, so they will get a 403 instead of content. What really worries me is that this looks a lot like comment spam roaches: you squash one and several more crawl out from the baseboards. Am I going to have to check my logs daily and ban IPs until I close every one?
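For anyone who wants to do the same tally without eyeballing the raw log, here is a minimal Python sketch that adds up bytes served per requesting host. It assumes the log is in the common Apache "combined" layout with plain ASCII quotes, where the first field is the client address; the function names are made up for illustration, and the addresses in the usage note are placeholders from the reserved documentation ranges, not the real offenders.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident user [time] "METHOD path PROTOCOL" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def bytes_by_host(lines, path=None):
    """Sum response sizes per client host, optionally for one path only."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        if path is not None and match.group("path") != path:
            continue
        size = match.group("bytes")
        # A "-" in the size field means no body was sent.
        totals[match.group("host")] += 0 if size == "-" else int(size)
    return totals


def top_hosts(lines, path=None, n=10):
    """Return the n hosts that pulled the most bytes, largest first."""
    return bytes_by_host(lines, path).most_common(n)
```

Usage would look something like `top_hosts(open("access.log"), path="/archives/miatatude/")`, where `access.log` is a stand-in for wherever your host keeps the raw log. The result is a list of (host, total bytes) pairs, which makes it easier to see whether ten addresses account for most of the traffic or whether it is spread across hundreds.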
Have you tried taking a look in your comment table in your DB? Doing a search ordered by date could show you whether they’re seeding your past pages with comments pointing at their domains. Also, are you rewriting URLs placed in comments with the rel=’nofollow’ attribute? This kills the benefit in terms of PageRank and Google, which is typically why you get this spam to begin with. Somehow I think you already know this stuff though… =)
This may be what they are trying to do, but it ain’t happening. I run some sort of MT plugin that closes comments once the entry falls off the front page (10 days?).
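To illustrate the nofollow idea, here is a rough Python sketch that adds rel="nofollow" to links in a comment body. It is only an illustration of the technique, not how Movable Type or any particular plugin does it; the function name is hypothetical, and a regex like this is fine for simple comment markup but is no substitute for a real HTML parser.

```python
import re

# Matches an opening <a ...> tag and captures its attributes.
ANCHOR_OPEN = re.compile(r'<a\b([^>]*)>', re.IGNORECASE)
# Detects an existing rel attribute so it is not added twice.
REL_ATTR = re.compile(r'\brel\s*=', re.IGNORECASE)


def add_nofollow(comment_html):
    """Add rel="nofollow" to every <a> tag that has no rel attribute yet."""
    def fix(match):
        attrs = match.group(1)
        if REL_ATTR.search(attrs):
            return match.group(0)  # leave tags that already set rel alone
        return '<a' + attrs + ' rel="nofollow">'
    return ANCHOR_OPEN.sub(fix, comment_html)
```

Run over a comment before it is published, this leaves the link clickable for readers while telling search engines not to count it.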
Go forth and acquire MT Blacklist.
It’s a slick tool that will allow you to ban based on content, not IP. It will also help eliminate that pernicious trackback ping spam.
Used to run Blacklist, but stopped once I installed MT Close Comments. You think that is what those are, trackback pings? That feature isn’t enabled on my blog…