Distributed checksum SPAM filtering

Discussion for developers using MailEnable.
Posts: 30
Joined: Mon Jul 07, 2003 7:05 pm
Location: Budapest, Hungary

Distributed checksum SPAM filtering

Post by gkovacs » Tue Apr 06, 2004 11:43 pm

Over the past few months, I have evaluated a couple of anti-spam solutions. Most of them didn't impress me, to say the least. I would like to implement something efficient at the server level, but rule/keyword-based filtering is tedious and has a high false positive rate, and Bayesian filtering has its own problems as well (it can be fooled, it's very resource intensive, it's hard to train properly, etc.).

I have just found out about a neat concept: distributed checksum networks. They work like this: every mail that a "network user" classifies as spam goes into the network's distributed database as a special checksum number. As more people classify the same message as spam, the network learns it, and other network users are automatically protected from it. (Yes, the mail filter must contact the network when filtering each message, but network traffic is very low up to about 100,000 messages/day. Above that, it's advised that you set up your own server.)
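To make the idea concrete, here is a minimal sketch of the report-and-lookup cycle described above. It is not how DCC or Razor are actually implemented: the `ChecksumNetwork` class, the `threshold` parameter, and the use of SHA-256 over whitespace-normalized text are all assumptions for illustration. Real networks use fuzzy checksums that survive small text variations, and the database lives on remote servers rather than in memory.

```python
import hashlib
from collections import Counter


def checksum(body: str) -> str:
    """Hash a message body after collapsing case and whitespace.

    Plain SHA-256 is a stand-in here; it only matches messages that are
    identical after normalization.
    """
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class ChecksumNetwork:
    """Hypothetical shared database mapping checksums to report counts."""

    def __init__(self, threshold: int = 3):
        # Number of independent reports needed before a message is blocked.
        self.threshold = threshold
        self.reports = Counter()

    def report_spam(self, body: str) -> None:
        """Called when a network user classifies a message as spam."""
        self.reports[checksum(body)] += 1

    def is_spam(self, body: str) -> bool:
        """Called by the mail filter for each incoming message."""
        return self.reports[checksum(body)] >= self.threshold


network = ChecksumNetwork(threshold=2)
network.report_spam("Cheap   mortgage offers!")
network.report_spam("cheap mortgage OFFERS!")
print(network.is_spam("Cheap mortgage offers!"))
print(network.is_spam("Meeting moved to 3 pm"))
```

Only the checksum needs to travel over the wire in such a design, which is consistent with the low network traffic mentioned above.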

One implementation is called DCC and the other is Razor. They have hundreds of thousands of users. Both of them are implemented in Perl and require Unix to function properly. (SpamAssassin supports them, but not on Windows.) They seem to be pretty effective. I'm presently trying Spamnet, the commercial implementation of Razor by Cloudmark, as an Outlook plugin.

Razor is http://razor.sourceforge.net
DCC is http://www.rhyolite.com/anti-spam/dcc/
Spamnet is http://www.cloudmark.com

These graphs are very interesting regarding the spam-catching performance of DCC:

I would really like to see a filter that implements this for MailEnable. I think most of the people here would even pay a small one-time fee to justify the development effort. (I have notified the developer of MEFilter about this.)

Anyone interested in coding this for MailEnable?

Greg Kovacs

Posts: 111
Joined: Tue Oct 14, 2003 2:56 pm

SPAM filtering

Post by Kogo » Fri Apr 09, 2004 2:41 pm

I'm interested in applying SOME sort of spam filtering to ME... If there's nothing already available I could probably be convinced to code something.

Tell me: there are some messages that may be considered spam by some, but not by others. How do the systems you suggest handle that type of message?

Posts: 438
Joined: Wed Sep 04, 2002 3:04 pm

Post by sunpost » Fri Apr 09, 2004 6:57 pm

a system can be designed to allow users to bypass the filter.

the system above could benefit by having people categorize the type of spam they report. this way, for example, a user can get all the pr0n spam, but filter the mortgage offer spam.
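A rough sketch of what categorized reports could look like, assuming each report carries a category label and each user keeps a list of categories to block. The `CategorizedReports` class and the category names are made up for illustration; nothing here reflects a feature that DCC or Razor is known to offer.

```python
import hashlib
from collections import defaultdict


def checksum(body: str) -> str:
    """Stand-in checksum: SHA-256 over case- and whitespace-normalized text."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class CategorizedReports:
    """Hypothetical store of spam reports tagged with a category."""

    def __init__(self):
        # checksum -> set of categories reporters assigned to that message
        self.categories = defaultdict(set)

    def report(self, body: str, category: str) -> None:
        self.categories[checksum(body)].add(category)

    def should_block(self, body: str, blocked_categories) -> bool:
        """Block only if the message was reported under a category the user filters."""
        reported = self.categories.get(checksum(body), set())
        return bool(reported & set(blocked_categories))


reports = CategorizedReports()
reports.report("Refinance your home today", "mortgage")

# This user filters mortgage spam only.
print(reports.should_block("Refinance your home today", ["mortgage"]))
# This user filters a different category, so the message gets through.
print(reports.should_block("Refinance your home today", ["pharmacy"]))
```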


spam trap

Post by CarlK » Thu Apr 22, 2004 1:22 pm

I have an account that I stopped using over 2 years ago. I am confident that the 300 messages/day it receives are not anything I asked for, so they are spam. I log all the subjects to a db, and one of the filters on my current account looks for these "spam subjects".
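A minimal sketch of that spam-trap setup, assuming subjects are compared after lowercasing and collapsing whitespace. The `SubjectTrap` class is hypothetical, and an in-memory set stands in for the database.

```python
def normalize(subject: str) -> str:
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(subject.lower().split())


class SubjectTrap:
    """Hypothetical store of subjects seen on an abandoned (trap) account."""

    def __init__(self):
        self.spam_subjects = set()

    def record_trap_message(self, subject: str) -> None:
        """Everything arriving at the trap account is treated as spam."""
        self.spam_subjects.add(normalize(subject))

    def is_spam(self, subject: str) -> bool:
        """Filter for the live account: match against logged trap subjects."""
        return normalize(subject) in self.spam_subjects


trap = SubjectTrap()
trap.record_trap_message("Lowest  rates GUARANTEED")
print(trap.is_spam("lowest rates guaranteed"))
print(trap.is_spam("Lunch on Friday?"))
```

Exact subject matching is easy to evade by varying the subject line, which is the weakness the checksum networks discussed earlier in the thread try to address.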

Posts: 57
Joined: Sat Feb 28, 2004 9:56 pm

Re: SPAM filtering

Post by DavidPayer » Fri Apr 23, 2004 5:06 pm

Kogo wrote: I'm interested in applying SOME sort of spam filtering to ME... If there's nothing already available I could probably be convinced to code something.
I would like to encourage all ME users to take a look at EWall:

http://www.sssolutions.net. They offer what they call an X version of their product for low cost mail servers. They JUST included ME as one of the X products. ($99)

You can do several types of filtering through it, and it allows rule-based analysis. You can also integrate SpamAssassin and run your virus scanners on it.

It runs either as a standalone SMTP proxy or integrated on the same machine you are using for mail.

Their newsgroup is at news://news.sssolutions.net and the group is friendly.

David Payer
OMNI Internet
When you get to the fork in the road . . . take it!

Posts: 30
Joined: Mon Jul 07, 2003 7:05 pm
Location: Budapest, Hungary

Different solutions

Post by gkovacs » Wed Apr 28, 2004 5:26 pm

As far as my experience goes, distributed checksum spam filtering is much better than any other filtering method. The reasons:

- no blacklisted words or phrases, no false positives from simple text variations
- very fast processing compared to Bayesian filtering
- no per user / per domain / per server maintenance required, since all spam classifications originate on the (DCC or Razor) network
- updated continuously by hundreds of thousands of people

Basically you install it and it works from there. And to reply to the comment about different users classifying different types of messages as spam: I don't care. No porn, no viagra, no credit cards, no mortgages. If any of my users are interested in this type of crap, they can easily find it on the web. Mail should be free from this, and hundreds of thousands of people can't be wrong.

Anyone interested in porting this? Most of us would even pay a small amount for a spam filter that just works.

Also, please don't advertise products on this forum, or at least open a new thread for them. Don't pollute our discussion with irrelevant products.

Greg Kovacs
