Spam filter not learning

Discussion forum for Enterprise Edition.
richelesro
Posts: 35
Joined: Tue Feb 04, 2003 5:39 pm
Location: Irvine, CA

Post by richelesro »

I've been using ME Standard for years. We recently wanted to have active spam filtering at the server, so I installed a trial of Enterprise. I set up a honeypot account and a ham account, then sent hundreds of emails to the ham account. I set up the MTA on auto-learning with a global filter set at 90%. Less than 1% of the emails coming into our system were marked by the filter, even though 75% of the emails we get are spam.

So I created a public folder and dumped 4,000 messages into the ham folder along with 6,000 spam messages. I turned auto-learning off and did the training manually. No change. Even if I turn the percentage down to 30%, less than 5% get caught as spam.

What am I doing wrong? I look at the dictionary through the system overview and it says 1,881 ham and 6,932 spam, which is the same as when I installed it. I need a spam filtering system that works ASAP.
James Bailey
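
For readers unsure what a filter "set at 90%" means, here is a minimal sketch of how a generic Bayesian spam filter turns dictionary counts into a score and compares it with a threshold. It follows the widely used per-token combination formula and is not taken from MailEnable; the function names, the neutral value of 0.5 for unseen tokens, and the 0.01/0.99 clamping are assumptions made for illustration.

```python
from math import prod


def token_spam_probability(spam_hits, ham_hits, spam_total, ham_total):
    """Estimate how 'spammy' one token is from its frequency in each corpus.

    spam_hits / ham_hits: messages of each kind containing the token.
    spam_total / ham_total: messages of each kind in the dictionary.
    """
    spam_freq = spam_hits / spam_total if spam_total else 0.0
    ham_freq = ham_hits / ham_total if ham_total else 0.0
    if spam_freq + ham_freq == 0:
        return 0.5  # assumption: a token never seen before is neutral
    p = spam_freq / (spam_freq + ham_freq)
    # Assumption: clamp so a single token can never decide the result alone.
    return min(0.99, max(0.01, p))


def message_spam_score(token_probs):
    """Combine per-token probabilities into one score between 0 and 1."""
    spam_side = prod(token_probs)
    ham_side = prod(1 - p for p in token_probs)
    return spam_side / (spam_side + ham_side)


def is_spam(token_probs, threshold=0.90):
    """Mark the message as spam when its score reaches the threshold."""
    return message_spam_score(token_probs) >= threshold
```

In this sketch, lowering the threshold from 0.90 to 0.30 only helps if the dictionary has learned tokens that push scores above 0.30 in the first place, which may be why changing the percentage alone made so little difference.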

moegal
Posts: 118
Joined: Mon Feb 09, 2004 10:30 pm

Post by moegal »

We never got the auto training to work either. It works well if you train it yourself. We mostly followed the instructions on:

http://www.mefilter.com/forum/topic.asp?TOPIC_ID=1283

MailEnable-Ben
Posts: 5858
Joined: Fri Jan 16, 2004 6:49 am
Location: Melbourne

Post by MailEnable-Ben »

richelesro wrote: set up a honeypot account and a ham account
I would set up more than one of each account for the whole server; in fact, you can use a whole domain by using wildcards.
richelesro wrote: then sent hundreds of emails to the ham account
How did you do this?
richelesro wrote: I set up the MTA on auto-learning with a global filter set at 90%. Less than 1% of the emails coming into our system were marked by the filter, even though 75% of the emails we get are spam.
It may not be the only reason your filter has such a poor detection rate, but your dictionary tokens are too far out of sync; the two counts must be closer. In fact, when you are training the dictionary in auto mode with 1,881 ham and 6,932 spam, you will not collect any new spam, only ham messages, until the ham count catches up. And if you only have one mailbox, this is going to be slow.
richelesro wrote: So I created a public folder and dumped 4,000 messages into the ham folder along with 6,000 spam messages
If you did this manually and your message count is still only 1,881 ham and 6,932 spam, then something has gone wrong.
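
To make the balancing behaviour described above concrete, here is a minimal sketch of a gate that only accepts a training message when its class is not already ahead of the other. This is an illustration of the idea, not MailEnable's actual code; the function names and the exact rule (a class is accepted while its count is less than or equal to the other's) are assumptions.

```python
def accepts_training(kind, ham_count, spam_count):
    """Return True if a new message of `kind` would be learned.

    Assumed rule: a class keeps learning only while its message count
    has not overtaken the other class.
    """
    if kind not in ("ham", "spam"):
        raise ValueError("kind must be 'ham' or 'spam'")
    if kind == "ham":
        own, other = ham_count, spam_count
    else:
        own, other = spam_count, ham_count
    return own <= other


def ham_needed_to_balance(ham_count, spam_count):
    """How many more ham messages are needed before spam is learned again."""
    return max(0, spam_count - ham_count)
```

With the counts from this thread, `accepts_training("spam", 1881, 6932)` is false and `ham_needed_to_balance(1881, 6932)` is 5051, so under this assumed rule roughly five thousand ham messages would have to be learned before any new spam is collected.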

I have sent through a document that better covers the configuration of Bayesian filtering. Check this, and more things about its setup will make sense.
Regards,

Product Services
MailEnable Pty Ltd

To keep track of all ME company updates and version releases you should subscribe to the MailEnable list at http://www.mailenable.com or the RSS feed http://www.mailenable.com/rss.

MailEnable-Ben
Posts: 5858
Joined: Fri Jan 16, 2004 6:49 am
Location: Melbourne

Post by MailEnable-Ben »

richelesro, the mailbox you have associated with your forum account does not resolve, so delivery of the document I sent you failed.
Regards,

Product Services
MailEnable Pty Ltd

boxfan
Posts: 6
Joined: Thu Apr 19, 2007 8:36 pm

Post by boxfan »

I would be interested in the document as well.

MailEnable-Ben
Posts: 5858
Joined: Fri Jan 16, 2004 6:49 am
Location: Melbourne

Post by MailEnable-Ben »

Just sent it, boxfan.
Regards,

Product Services
MailEnable Pty Ltd

boxfan
Posts: 6
Joined: Thu Apr 19, 2007 8:36 pm

Post by boxfan »

Got it, thanks.

smilinmike
Posts: 3
Joined: Wed May 09, 2007 6:58 pm

Post by smilinmike »

MailEnable-Ben wrote: I have sent through a document that better covers the configuration of Bayesian filtering. Check this, and more things about its setup will make sense.
I would like a copy of this as well.

Thanks

peter
Posts: 6
Joined: Sat Nov 23, 2002 10:26 am

Post by peter »

Please send a copy of the document to me as well.

dhaag
Posts: 2
Joined: Fri Apr 20, 2007 4:37 pm

Post by dhaag »

May I get a copy of this document also?

MailEnable-Ian
Site Admin
Posts: 9738
Joined: Mon Mar 22, 2004 4:44 am
Location: Melbourne, Victoria, Australia

Post by MailEnable-Ian »

Hi,

PM me with an email address to send the whitepaper to.

regards,

MailEnable Support.

trusnock
Posts: 132
Joined: Tue Jan 31, 2006 8:42 pm

Post by trusnock »

We set up Bayesian filtering about a week ago, and we're getting good results so far. However, even though our message count is quite balanced (10711 Ham, 10705 Spam), both the auto-training and manual training methods seem to have stopped working.

We have auto-training enabled, with about a dozen wildcard domains configured for Ham and about a dozen honeypot addresses configured for Spam.

The Ham/Spam count does not increase in the System Overview screen, even after restarting the MTA. I thought I noticed it increasing by at least a handful of messages per hour after we first configured it.

When we run the manual training script (copied from the ME manual), it reports:

Dictionary Size: 27044 tokens
Dictionary Loaded.
Ham added=0, Spam added=0

This happened even though there were messages in the Spam and NoSpam directories we passed to it. Again, I'm sure this worked earlier in the week, because we successfully trained on over 20,000 messages.

We purged the dictionary because we were close to 200,000 tokens, but it still won't update.

Did I make a dumb mistake somewhere and turn something off that I'm forgetting? Is there a log for the auto-training that might help me figure out why nothing is changing?
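
One thing that can be ruled out quickly when the script reports "Ham added=0, Spam added=0" is that the folders handed to it are missing or empty from the point of view of the account running the script. The following is a minimal sketch of such a pre-flight check; it is independent of MailEnable, the function names are hypothetical, and it simply counts files, so the `pattern` argument would need narrowing if the folders hold anything other than messages.

```python
from pathlib import Path


def count_files(folder, pattern="*"):
    """Count regular files in `folder`; return None if it is not a directory."""
    path = Path(folder)
    if not path.is_dir():
        return None
    return sum(1 for entry in path.glob(pattern) if entry.is_file())


def preflight(spam_dir, ham_dir):
    """Summarise whether each training folder exists and contains files."""
    report = {}
    for label, folder in (("spam", spam_dir), ("ham", ham_dir)):
        n = count_files(folder)
        if n is None:
            report[label] = "missing directory"
        elif n == 0:
            report[label] = "empty"
        else:
            report[label] = f"{n} file(s)"
    return report
```

Calling `preflight()` with the same two paths that are passed to the training script, under the same user account, shows whether the script had anything to read. If both folders report files and the counts still do not move, the cause lies elsewhere.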

amwmedia
Posts: 5
Joined: Wed Dec 05, 2007 6:14 pm

Spam added=0

Post by amwmedia »

trusnock wrote: Ham added=0, Spam added=0
I'm having this same problem! Any ideas?

Thanks,
Andrew

trusnock
Posts: 132
Joined: Tue Jan 31, 2006 8:42 pm

Post by trusnock »

amwmedia,
A few days after I posted that message in January, our training just started working again! I suppose it was after restarting something, or rebooting the server, or purging the extraneous tokens again, but there was no "Eureka" moment... we just noticed it was working and it has been fine ever since.

-Tom R.

amwmedia
Posts: 5
Joined: Wed Dec 05, 2007 6:14 pm

Post by amwmedia »

Hi Tom,

Thanks for the feedback. I just restarted the server and it's still not working. Has ANYONE else out there experienced this problem?

I really need to get this working. This is a new install of ME and I've not seen this working yet at all. So for now I'm stuck with the default TAB file, which doesn't catch much!

Thanks,
Andrew
