Because I run my own mail server, I’m able to watch for trends related to incoming email and crunch numbers on those that seem interesting. Today, listening to the voice which has been telling me for the past few weeks that spam seems to be on a major uptrend, I looked at the numbers of spam messages that have hit my own inbox. (Well, make that “tried to hit my own inbox,” since I’m also able to run a general spam filter which catches most of the unsolicited crap.) And that voice appears to be correct; over the past two months, I’ve received way more than double the number of spam emails that I received in any of the months in the first half of 2007. For example, I’ve received over 28,000 spams through today in November, compared to just under 12,000 in January.

spam trend, 2007

As always, stats can lie as much as they can reveal truth; I don’t know what my 2006 chart would have looked like, whether there’s always an uptick towards the later months of a calendar year, or any other such comparison information. Nonetheless, I figured this was interesting enough to share.

This morning, I took a look at my mail server logs to see if yesterday’s changes had caused any unexpected issues, and I’m happy to say that all appears well. I also took a few minutes to analyze the logs a little bit, and here’s what the past 20 hours have brought:

don't send email to these accounts; they don't exist!
  • In 1,202 minutes, senders tried to deliver 16,605 messages to nonexistent accounts on my server, for a rate of roughly one message every 4.3 seconds.
  • Those 16,605 messages were addressed to 915 unique (and still nonexistent!) email addresses.
  • By far, the queso.com domain bore the brunt of this, with 759 of the addresses living there; no other domain had more than 60 or 70 false attempts.
  • The most popular fake email address is one that’s never existed, and doesn’t make much sense at all; it received 461 attempts. (The top 10 list is in the graphic to the right.)
  • As you’d expect, generic “webmaster” email addresses are popular, accounting for 225 of the attempts across all the domains I host; “postmaster” and “mail” are a lot less popular than you’d think.
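For anyone who wants to check the arithmetic behind those bullets, here’s a quick sketch in Python. The numbers come straight from the list above; the variable names are just mine, and nothing here reads the actual mail logs.

```python
# Figures quoted in the list above; variable names are illustrative only.
minutes_observed = 1202
rejected_messages = 16605
unique_addresses = 915
queso_addresses = 759

# Average gap between delivery attempts to nonexistent accounts.
seconds_per_message = minutes_observed * 60 / rejected_messages

# Average number of attempts aimed at each bogus address.
messages_per_address = rejected_messages / unique_addresses

# Fraction of the bogus addresses that lived at queso.com.
queso_share = queso_addresses / unique_addresses

print(f"one rejected message every {seconds_per_message:.1f} seconds")
print(f"about {messages_per_address:.0f} attempts per nonexistent address")
print(f"{queso_share:.0%} of the bogus addresses were at queso.com")
```

The averages hide a lot of skew, of course; the single most popular fake address alone accounts for 461 of the attempts.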

All in all, I’m glad to have made the configuration change, and my mail server seems to be operating under quite a bit less load as a result.

There’s really no debate that despite all efforts to combat it, spam email continues to grow and thrive on the internet. Since I host my own email server (providing accounts for myself, my family, and a few friends), I’ve watched as gargantuan volumes of unsolicited email stream in over the wire, and I’ve had to keep up to speed on the latest and greatest spamfighting techniques in order to keep our mailboxes reasonably free of the nuisance. That being said, the whole system has always felt like a fragile beast, and when my spam system fails for even a few minutes, my inbox can get buried. (For example, a component of my filters got overloaded this morning for just over six minutes, and over 50 spam emails slipped through in that period.) So, for the past year, I’ve been hunting for ways to optimize my mail setup in order to lessen the load on the spam filters, and one specific way has eluded me until this morning. Since I’ve actively searched for this very solution for over a year and not had success until today, I figured I’d describe what I did in case anyone else is looking for the same fix.

(Really, I shouldn’t have to tell you this, but what follows is an extremely detailed, low-level description of my mail setup and the innards of a spam filtering system. It’s dorky, and you probably won’t want to read the rest unless you’ve imbibed a good deal of caffeine and know your way around sendmail.)

There was quite a bit of teeth gnashing across the web throughout the evening yesterday as TypePad, LiveJournal, and all the other hosted Six Apart websites went dark; we learned late in the night that the cause was a “sophisticated distributed denial of service attack” against the sites. Digging a little deeper, though, it doesn’t look like this is a particularly accurate description of what happened — but instead of this being a case of the folks at Six Apart trying to cover up some internal issue, it instead looks like they’re being far too gracious in not revealing more about another company, Blue Security, which appears to have been responsible for the whole disaster. An explanation of this requires a slight bit of background.

Blue Security is a company which has recently garnered a little bit of notoriety on the ‘net due to its unorthodox method of attempting to control the problem of spam email. Last summer, PC World published a reasonably good summary of Blue Security’s antispam efforts; a charitable way of describing the method would be to say it attempts to bury spammers in unsubscription requests, but a more accurate description would be that the service performs outright denial-of-service attacks on spammers, and does so by convincing people to install an application (Blue Frog) on their computers which launches and participates in the attacks. Without a doubt, Blue Security’s system has generated controversy from the perspective of both unsolicited emailers and regular ‘net citizens alike, so it’s not all that surprising that the spammers recently began fighting back. One of the methods used against Blue Security has been a more traditional denial-of-service attack against the company’s main web server, www.bluesecurity.com, an attack which was effective enough to knock that web server offline for most of yesterday.

OK, so why is any of this information — about a company completely unrelated to Six Apart — important background? Because according to a post on the North American Network Operators Group mailing list, at some point yesterday the people at Blue Security decided that the best way to deal with the attack was to point the hostname www.bluesecurity.com to their TypePad-hosted weblog, bluesecurity.blogs.com. This effectively meant that the target of the attack shifted off of Blue Security’s own network and onto that of Six Apart, and did so as the direct result of a decision made by the folks at Blue Security. (The best analogy I can think of is that it’d be like you dealing with a water main break in your basement by hooking a big hose up to the leaking joint and redirecting the water into your neighbor’s basement instead.) Soon thereafter, the Six Apart network (understandably) buckled under that weight and fell off the ‘net, and over four hours passed before packets began to flow again. (And given that the www.bluesecurity.com hostname was still pointed at TypePad for most of today, I’d imagine that the only way those packets began to flow was as the result of some creative filtering at the edge of its network.) Judging from the outage, it’s unlikely that Blue Security gave them any warning — although who knows whether a warning would’ve prevented the basement from filling up with water all the same.

So, returning to my original point: saying that Six Apart’s services were taken down as the result of a “sophisticated distributed denial of service attack” is an incredibly gracious statement that only addresses about 10% of the whole story. The other 90% of that story is that Blue Security, a company with already-shady practices, decided to solve its problems by dumping them onto Six Apart’s doorstep, something I’m pretty damn sure isn’t part of the TypePad service agreement. I know that ultimately, the denial-of-service attack came from the spammers themselves, but it was specifically redirected to the Six Apart network by Blue Security, and I hope that they get taken to the cleaners for this one.

(I’ve just begun experimenting with the social bookmarking/commenting site Digg; as I’m clearly in favor of more people understanding how the outage came to occur, feel free to Digg this post.)

Update: Computer Business Review Online has picked up the story, and has some other details. Netcraft also has a post on the DDoS, and News.com picked up the bit from them, but there’s not much more in either bit.

There’s been the tiniest bit of preview press given to Sphere, which bills itself as a weblog search engine and has been in soft-launch mode for a little while now. Today, the service actually went live, so I figured a little exploration might be in order. Alas, after spending a little time with it, I concluded that the folks in charge of Sphere might want to change its billing to reflect that it’s more a splog search engine — the sheer number of spam weblogs in the search returns is pretty amazing. That, combined with Sphere’s apparent indexing of quite a few non-weblogs, makes its usefulness dwindle quite a bit.

Here are a few example searches, looking at the first page of ten hits that Sphere returns:

  • “razr v3c”: returns five spam weblogs, two questionable spam weblogs, one overt non-weblog, and two legitimate sites.
  • “honda accord”: three spam weblogs, one non-weblog, six legitimate sites.
  • “bluetooth headset”: four spam weblogs, three legitimate sites.
  • “dual core intel”: three spam weblogs, one questionable spam weblog, six legitimate sites.

I don’t claim that these results are rigorously scientific, only that they’re representative of the experience that’s led me to relegate Sphere to the bin of sites that seem to have gone live without addressing all the issues inherent in their areas of focus, and as such, aren’t really all that useful.

Jonathan Corbet, the Grumpy Editor over at LWN.net, has a reasonably good review of the current offerings in the world of Bayesian spam email filters. His tests hinted that SpamAssassin remains difficult to beat in terms of accuracy, but that it’s still the slowest and most computing-intensive of all the solutions out there (in part because SpamAssassin does a lot more than just act as a Bayesian filter). It’s the system I run all my incoming mail through, but I definitely feel the processor crunch at times — and it’s definitely the kind of service I’d love to offload if there were a reasonable and inexpensive way to do so.

(For those of you who aren’t hip to the lingo of internet system administration, Bayesian spam filters are “trainable” applications that scan incoming email and make predictions about whether any given message is spam, predictions that are based in part on the content of prior legitimate and illegitimate messages to the same users. Back in 2002, Paul Graham wrote an article which posited that applying Bayesian probability theories to email might help alleviate the growing spam problem, and since then the notion has established itself as one of the cornerstones of email administration.)
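To make the “trainable” part a little more concrete, here’s a toy sketch of the general idea in Python. It’s a plain naive Bayes classifier with add-one smoothing, not the specific formula from Paul Graham’s article and certainly not what SpamAssassin does internally; the class name, the tokenizer, and the four training messages are all made up for illustration.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes spam filter; everything here is illustrative."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase the text and split it into simple word-like tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        # label must be "spam" or "ham" (ham being legitimate mail).
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        if 0 in self.message_counts.values():
            raise ValueError("train on at least one spam and one ham message first")
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Prior: how common this class was in the training mail.
            score = math.log(self.message_counts[label] / total_messages)
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                # Add-one smoothing so an unseen word can't zero out the score.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocabulary)))
            log_scores[label] = score
        # Turn the two log scores back into a probability of spam.
        highest = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - highest)
        ham = math.exp(log_scores["ham"] - highest)
        return spam / (spam + ham)


spam_filter = TinyBayesFilter()
spam_filter.train("cheap pills, buy now", "spam")
spam_filter.train("buy cheap watches now", "spam")
spam_filter.train("lunch meeting moved to noon", "ham")
spam_filter.train("notes from the meeting attached", "ham")

likely_spam = spam_filter.spam_probability("buy cheap pills")
likely_ham = spam_filter.spam_probability("meeting notes at noon")
print(likely_spam)  # well above 0.5
print(likely_ham)   # well below 0.5
```

The “prior messages to the same users” bit is the train() step: the more real mail, good and bad, the filter sees, the better its word counts reflect what spam looks like for that particular mailbox.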

In an effort to cut down on spam email, a few years ago I put together a clean little framework for a contact form, and put it up in all the relevant places so that people could still send me the occasional note through the various websites that I host. Lately, I’ve been getting a bit of spam submitted through the forms, more annoying than voluminous, and then tonight I learned from Matt Haughey that he’s even seeing a steady stream of spam submitted through the “suggest a post” function of his website PVRblog. It’s amazing to me, only inasmuch as it’s clear that the content spammers are now literally shoving their bits into any and all <textarea>s they can find on the web, much like that dog in heat that won’t stop humping your lamppost.

There’s been a lot of (virtual) ink lately about weblog comment spam, and a similar amount of activity on the part of those who write weblog software to make the practice more difficult, less inviting, and easier to manage. Today, while trawling around in the forums for MT-Blacklist (a plugin for Movable Type that helps deal with the problem), I ran across this genius idea for further fighting comment spam — a proposition for a fake installation of MT that would serve as a honeypot for spammers, drawing their comments in and then feeding them directly into the anti-spam database. A neat idea, indeed, awaiting someone with the skills to implement it!

It’s disappointing to see an information security organization as good as SANS get an issue about information security so painfully wrong. In its weekly NewsBites newsletter (issue 48, not available in the online archive at the time of writing), the following entry appears as a link to an eWeek article:

—Spammers Exploit Anti-Spam Technology - DomainKeys
(29 November 2004)
Spammers have begun using DomainKeys to make their fake messages appear legitimate. DomainKeys was one of the more promising technologies designed to eliminate forging, but spammers appear to have co-opted it.
http://www.eweek.com/print_article2/0,2533,a=139951,00.asp

What’s the problem with this? That in this case, DomainKeys is actually doing its job, not somehow being subverted. Much like Sender Policy Framework, Yahoo’s DomainKeys technology is not an antispam solution, but an antiforgery solution. As it’s described on that Yahoo! page (and by Ars Technica in a review), DomainKeys provides a way for email recipients to see whether or not a piece of email comes from the sender it claims to have come from. In other words, DomainKeys only helps assess whether or not an email really did come from billg@microsoft.com; it specifically makes no claims about helping users figure out whether or not his product will actually make your penis grow five inches overnight.

So when SANS says that “spammers appear to have co-opted” DomainKeys, everyone should be ecstatic: that means that email users and administrators gain the ability to know for certain when email comes from certain mail servers and domains, and can thus block those servers and domains with absolute confidence that it’s the right thing to do. Shame on SANS (and Dennis Fisher at eWeek) for not knowing the difference.

I just made a change to my mail system that (hopefully) will put a clamp on the last bit of spam that’s been making it through. For years — ever since I registered my domain back in 1993 — I’ve been receiving all of the mail that’s sent to any random name at queso.com that doesn’t match one of the few users that I’ve configured. This was great, for a while; for the most part, it meant when websites and companies demanded an email address from me, I could create unique ones that would both let any email they sent reach me and allow me to pinpoint the companies that sold their email address lists to spammers.

Alas, with the good came the bad, namely that over the years, many, many web users have given websites fake email addresses at my domain. (I guess queso is a pretty common word, and from the makeup of most of the addresses, it looks like most of the abusers have been Spanish-speaking.) In the past six months, unsolicited email to those fake addresses has comprised between 50 and 75 percent of all unsolicited email that hit my server, and as the number kept increasing, I realized that I needed to do something about it.

So I’ve put together a new system for registering at websites, using another (less common) domain name. I’ve also gone to most every site and mailing list that I care about, making sure that I change my registration to the new domain. And after watching my inbox for the past month to make sure I haven’t missed anything (I’m sure I have, but I’m definitely at the point of diminishing returns), it was time to flip the switch.

We’ll see if this works!

I don’t know about you guys, but I think it’s hilarious that this article on the Forbes website about the felony convictions of two spammers has, among others, a paid advertisement linked into the article (“penny stock”) for an email marketing company.