I've been thinking a lot about the spam in blog comments that John Dowdell and Brajeshwar have been mentioning. This could become a huge problem if it gets automated in a big way. To really filter spam, or filter anything, you first have to determine what exactly makes it different. Bayesian filtering has had some success, but really, at heart, it isn't the text that makes spam spam. For example, if I copied a piece of spam and sent it to a person collecting spam samples for a filter, that would be legitimate mail, right?

What really makes spam spam, and what makes it hateable, are two things:

1) there is no human at the other end
2) it is sent by the truckload

The best filter for point one would be a Turing test - i.e., does the thing on the other end seem capable of human intelligence? Just a simple one of course - you wouldn't want to miss any important mail from world leaders. It could be as simple as a random picture, where the person has to type in 'tree' or 'book' etc. The problem with this would be language, so maybe best to stick with numbers (and I've seen this type of thing before on the net - can't remember where, anyone?). You generate a gif (or use swf) that displays a human-but-not-machine readable image representing a 4 digit number, different each time. Then when commenting, the user has to type in the numbers they see as a validating key. No match, no post. I think that would go a long way to heading off the problem - I'll try to make one for this .Text software, maybe this weekend, and see how it works.
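The "type in the numbers you see" check could be sketched roughly like this. It is only an illustration in Python, not the .Text implementation: the names `new_challenge` and `check_answer` and the in-memory `_pending` store are made up for the example, and drawing the digits into a distorted image is left out entirely.

```python
import hmac
import secrets

# Hypothetical in-memory store: challenge id -> expected 4-digit code.
# A real blog engine would keep this in a session or database.
_pending = {}

def new_challenge():
    """Create a fresh 4-digit code and return (challenge_id, code).

    The code would be drawn into a hard-to-OCR image for the commenter;
    only the challenge_id would travel with the form as a hidden field.
    """
    challenge_id = secrets.token_hex(8)
    code = f"{secrets.randbelow(10_000):04d}"  # "0000" through "9999"
    _pending[challenge_id] = code
    return challenge_id, code

def check_answer(challenge_id, typed):
    """Return True only if the typed digits match. Each challenge is single use."""
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return False  # unknown or already used challenge
    return hmac.compare_digest(expected.encode(), typed.strip().encode())
```

Removing the challenge on the first attempt, right or wrong, means a script cannot keep guessing against the same image: no match, no post, and the next try gets a new number.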

But really, why stop at blog comments? The problem with e-mail filtering today, is there is always a chance you'll miss a message, yet still a good chance your kids will wake up to "xnawlprqdt olwinw big wide open pussies", along with a descriptive photograph (that is if you still dare give your kids an e-mail account). The idea would be that your server (or a mail program extension) has a list of 'allowed' and 'disallowed' e-mail addresses. You could seed these with your current address book, scans of mail you've answered, addresses you haven't deleted (which would include newsgroups), and so on. Of course you could edit it manually too, allowing all mail from *yourCompany.com for example. People you send to would automatically be added to the allowed list too.
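As a rough sketch of the 'allowed' and 'disallowed' lists, assuming shell-style wildcards for entries like the company domain. The class name `SenderLists` and its methods are invented for the example, and the addresses are placeholders.

```python
from fnmatch import fnmatchcase

class SenderLists:
    """Allowed/disallowed address patterns, with '*' as a wildcard."""

    def __init__(self, allowed=(), disallowed=()):
        # Lowercase everything so matching ignores case.
        self.allowed = {p.lower() for p in allowed}
        self.disallowed = {p.lower() for p in disallowed}

    def note_outgoing(self, recipient):
        # People you send to are added to the allowed list automatically.
        self.allowed.add(recipient.lower())

    def status(self, sender):
        """Return 'disallowed', 'allowed', or 'unknown' for a sender address."""
        sender = sender.lower()
        if any(fnmatchcase(sender, p) for p in self.disallowed):
            return "disallowed"
        if any(fnmatchcase(sender, p) for p in self.allowed):
            return "allowed"
        return "unknown"

lists = SenderLists(allowed=["*@yourCompany.com", "friend@example.org"])
print(lists.status("boss@yourcompany.com"))
print(lists.status("stranger@example.net"))
```

The sketch checks the disallowed list first so a manual block always wins. Writing the pattern as `*@yourCompany.com` rather than `*yourCompany.com` is a choice made here to avoid also matching look-alike domains that merely end in the same text.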

Then, if your server gets a message from a new address, not in the 'allowed' list, it writes back automatically (using your address), with a picture of numbers (a Turing test) and a question (written in all the languages you speak, and sure, the numbers in all the localized glyphs if needed). Answering the question correctly will allow the sender to be added to your 'allowed' list and allow the message to arrive. This only has to happen once per person. If the person had the same filter installed, the Turing message you sent would still reach them - they would have just sent you mail (and thus you would now be in their allow list). This would prevent infinite ping pong, though you could still have safeguards for that. So they simply reply, type in the numbers they see, and then you can speak with each other as normal from then on.
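The hold-challenge-release flow described above might look something like the following. This is a sketch under assumptions: `ChallengeGate` and its method names are hypothetical, actual mail sending and the image itself are left out, and the only ping-pong safeguard shown is "one outstanding challenge per sender".

```python
import secrets

class ChallengeGate:
    """Holds mail from unknown senders until they answer a one-time challenge."""

    def __init__(self, allowed=()):
        self.allowed = {a.lower() for a in allowed}
        self.held = {}  # sender -> (code, [held messages])

    def on_send(self, recipient):
        # Anyone you write to is allowed, so their own challenge can reach you.
        self.allowed.add(recipient.lower())

    def on_receive(self, sender, message):
        """Return ('deliver', None), ('challenge', code) on first contact,
        or ('hold', None) while a challenge is already outstanding."""
        sender = sender.lower()
        if sender in self.allowed:
            return "deliver", None
        if sender in self.held:
            # Do not send a second challenge; just queue the message.
            self.held[sender][1].append(message)
            return "hold", None
        code = f"{secrets.randbelow(10_000):04d}"
        self.held[sender] = (code, [message])
        return "challenge", code

    def on_answer(self, sender, typed):
        """Release held mail if the digits match; otherwise keep holding."""
        sender = sender.lower()
        entry = self.held.get(sender)
        if entry is None or typed.strip() != entry[0]:
            return []
        self.allowed.add(sender)
        return self.held.pop(sender)[1]
```

Because `on_send` adds the recipient before the mail goes out, two people running the same filter should not bounce challenges back and forth: the challenge arrives from an address that was just written to.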

The reason I think this would work is that there would be a cost to becoming an 'allowed' sender, or more like a 'friction'. The cost would be five seconds for a human, and thousands of hours for a bulk mailer. This filters the second unique thing about spam: it is sent out by the millions. Even for companies, like your bank, it is nice that there is a cost. If they care so little about their message getting through that it isn't worth 5 seconds of their time, once, then you don't need their message, right? Also, if they do care that much, they can still get through to you - something that isn't always allowed in present systems. Of course you can still move them to the disallowed list or, since there is a human at the other end, just write them and ask them to stop.
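To put a rough number on that friction, here is the arithmetic as a tiny sketch. The five seconds per test comes from the paragraph above; the one million addresses is an assumed size for a single bulk run, not a measured figure.

```python
SECONDS_PER_TEST = 5      # figure used in the post
ADDRESSES = 1_000_000     # assumed size of one bulk mailing

def hours_to_answer(n_addresses, seconds_each=SECONDS_PER_TEST):
    """Human time, in hours, needed to answer one test per address."""
    return n_addresses * seconds_each / 3600

# One person answering one test spends a tiny fraction of an hour.
print(hours_to_answer(1))

# One bulk run, if every address sends back a test.
print(round(hours_to_answer(ADDRESSES)))  # 1389
```

So under these assumptions a single run of a million messages costs on the order of 1,400 hours of somebody's time, and the cost repeats for every new batch of addresses.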

If the throngs of soon to be unemployed telemarketers are put to work answering Turing tests, you could always put a legal bit of text in the test - saying you do not accept unsolicited ads. Their response would be somewhat traceable here (they have to get the message back), and this type of thing would probably be enforceable. Probably some of this system could be circumvented, but the trick to stopping the 'million transactions a minute' aspect of spam is to add a tiny bit of friction - and perhaps risk - to each one.

The system would also be self propagating. Each verify message could contain information that explains and links to this type of software. ISPs could install it as an option for their customers. It is also simple enough that there couldn't be some dick that would have the bright idea of making it ad supported - or at least they wouldn't survive because in an afternoon you could make your own : ). I will try to get one of these going, maybe on a bogus address at first.

The best part is that, if this ever works out, I could finally go back to being the only one on the block with a 12 incher - woot! If you have any thoughts on the matter (or the spam filter), please let me know : ).
posted on Friday, October 31, 2003 5:58 AM
Feedback
  • # re: A Cure for Spam?
    aSH
    Posted @ 11/1/2003 8:32 PM
    Honestly, I think that the only REAL solution is the law. I mean, why should we tolerate this kind of attack on our privacy? The European Union, as I wrote today ( http://www.actionscripthero.com/blog/archives/000101.php ), has banned spam:
    "With a limited exception - covering existing customer relationship - e-mail marketing is only allowed with prior consent. Disguised identities and invalid return addresses, often used by “spammers”, are also outlawed."
    Yeah, I know that it is very hard to fight spam, but a Nigerian spammer was arrested; Italian spammers are facing jail, the UK bans spam messages and presses the US, and "Last week a court in California issued a huge fine against one spamming company, and the United States Senate has approved a bill banning spam in the US." BBC ( http://news.bbc.co.uk/2/hi/europe/3231861.stm ) Now the EU bans spam.
    Tools are not going to solve the problem, but they can be helpful. Only the Law can protect us. Just my 2 cents.

  • # re: A Cure for Spam?
    aSH
    Posted @ 11/1/2003 8:33 PM
    btw nice to see your rss feed!

  • # re: A Cure for Spam?
    Robin Debreuil
    Posted @ 11/2/2003 12:39 AM
    Yeah, I noticed that on your blog, great thing, way to go EU! : ). I totally agree that there need to be laws; that could go a long way to solving this. Although I read not that long ago that 90% of spam is fraud, so it isn't like there isn't enough to nail these people as it is. Spam has been banned in some states too as you mentioned, though so far that hasn't slowed it much (ok, at all, but these cases may have a chilling effect). The 'do not call' list is a hopeful sign too - that was really rammed through in spite of a very strong lobby against it. They should also go after the 'legit' companies benefiting from others doing their dirty work, like insurance companies, banks, etc. In the end it might still be hard to ensure your kids don't end up with porn in their inbox without a filter though, but here's to hoping : ).



  • # re: A Cure for Spam?
    aSH
    Posted @ 11/2/2003 4:56 AM
    Jesse Warden got 280 spam comments in a couple of hours... I'm for more resources and laws to fight the spam, but I welcome any initiative to protect privacy. If you have some time to set up a filter it would be very nice. Much appreciated. If you need beta testers, I'll be glad to help :)

  • # re: A Cure for Spam?
    Robin Debreuil
    Posted @ 11/2/2003 9:06 AM
    Wow, 280 - that is out of bounds! I think with the google power of blogs this is only going to become more common, unfortunately...

    I did try that image idea tonight rather than put it off. It is pretty easy to do. I know it won't stop individual spammers, but at least it (should) stop the bulk spammers. Thanks for all the feedback : )

    Cheers,
    Robin
