Bayesian filtering

Discussions on webmail and the Professional version.
abbynormal1
Posts: 33
Joined: Mon May 08, 2006 7:54 am

Post by abbynormal1 » Wed Jun 28, 2006 6:04 pm

keithc wrote: How in the world do you have so much ham?
So my question really was: how are you keeping your ham balance up with the spam one?
I have a filter set up to catch any email coming from *@domain1.com, *@domain2.com and *@domain3.com. Any mail sent by users with addresses on these three domains is added to ham. These domains can be trusted to send only legitimate email, and they all send a fair volume. In fact, my ham is exceeding my spam at the moment.
LumTech wrote: How are you getting your statistics of blocked messages to know what percent you are at?
I have a batch scheduled to run every night at 5am. When that batch finishes adding all the spam messages in the spam IMAP inbox to the dictionary, it deletes them. So each night I look on my PC where the IMAP account is set up and I can see the number of spam emails that have come in since the last time they were deleted. I then look at the number of "Bayesian filter" notifications I have received during the day (I receive a notification each time a spam has been filtered). So I simply divide the number of spam messages I've received by the sum of spam messages plus Bayesian filter notifications to get the percentage of spam I am still receiving, or divide the notifications by that same sum to get the percentage of spam that's being blocked.
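
As a rough sketch of that arithmetic, here is the same calculation as a small batch file. The counts are made-up sample values and the variable names are only for illustration. Note that set /a does integer math only, so the percentages are truncated to whole numbers.

```bat
@echo off
rem Hypothetical one-day counts; substitute your own numbers.
rem received = spam messages that still reached the spam IMAP inbox
rem blocked  = "Bayesian filter" notifications (one per filtered spam)
set /a received=12
set /a blocked=88

set /a total=received+blocked
if %total% EQU 0 (
    echo No spam counted today.
    exit /b 0
)

rem Multiply by 100 before dividing, because set /a drops fractions.
set /a pct_received=received*100/total
set /a pct_blocked=100-pct_received

echo Still received: %pct_received%%%
echo Blocked: %pct_blocked%%%
```

With the sample numbers above this reports 12% still received and 88% blocked.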

Hope that makes sense.

Also, I've temporarily disabled ham collection to allow spam collection to catch up.

messages I can't filter out

Post by abbynormal1 » Tue Jul 04, 2006 8:11 am

Lately, we've been getting a lot of spam that is seemingly designed to evade Bayesian-type filters. These are emails whose message is an image of the text they are trying to get to me. So, there's nothing in the email except a big image of the spam message.

Does anyone know whether Bayesian can filter this out, or is there some other method I can use in addition to Bayesian?

rockinthesixstring
Posts: 844
Joined: Mon Dec 05, 2005 7:51 am
Location: Canada

Post by rockinthesixstring » Tue Jul 04, 2006 4:39 pm

For emails like that I use the SPF filter and the GAP filter found in MEFilter. You can find out more at http://www.mefilter.com/forum (you will have to register).
Chase
Server 2008 Standard (x64)
ME Ent 6.51 (SQL Server 2008 Config)
ASSP 1.9

support@aspportal.net
Posts: 4
Joined: Mon Jul 10, 2006 8:21 am
Location: Austria

Good combination

Post by support@aspportal.net » Mon Jul 10, 2006 8:23 am

I have just upgraded to ME Pro 2.x and it seems nice.

I think the SPAM filter could have been worked on more before the release but...

I found that by using the new Bayesian filter together with MEFilter, almost all spam can be stopped, and this is while my Bayesian filter is still in training.

I can really recommend MEFilter.
Please note that it is not upgrade friendly and has a lot of issues, but if you get it to work it really helps with spam.
Gerhard Kruger

Post by rockinthesixstring » Mon Jul 10, 2006 3:28 pm

We are working on the upgradability of MEFilter. There will soon be a database backup utility in the WebUI. Then upgrading and restoring the database will be a breeze.
Chase

Looked at MEFilter, decided against

Post by abbynormal1 » Mon Jul 10, 2006 11:56 pm

Well, I registered for MEFilter, installed it (partially) and then deleted it. Installation is extremely unfriendly with very poor documentation. I hope ME develops something of their own to help with spam beyond what Bayesian can do.

MEFilter only supports MS Access databases - WTF? What kind of programmer writes a program for mail servers based on MS Access rather than a real DB engine like SQL Server or MySQL? This is what finally made me delete MEFilter. After spending an hour trying to figure out what the configuration instructions mean, I finally learned that to get MEFilter to work with a real DB engine I would have to go through a giant mess of scripts, updates and other things I have no idea about. The thought of spending 24 hours setting up and configuring something that may not even work anyway was too much.

Post by rockinthesixstring » Tue Jul 11, 2006 12:05 am

I am sorry for your experience. I currently run MEF on SQL Server, and it works great. Installing MEF is becoming easier and easier, but you are right that it is still more difficult to install than a point-and-click installation. It needs to run in a subdirectory of ME, and that is what is preventing the point-and-click method.

There is now someone in charge of the SQL script, and it seems to work really well. MySQL is still a little way off but is being worked on. There is also a web interface for simple data entry of phrases and lists.

The most recent release of MEF has some really great features, one of which is SURBL checking.

If you are still looking for an inexpensive filter, I would hope you would come back to MEF (you have already paid for it). I would personally do my best to help you with any installation issues.
Chase

Post by abbynormal1 » Tue Jul 11, 2006 12:23 am

Thanks Chase. I'm not sure what you mean by the "point and click installation". Aren't all installations point and click? Or is there a command line installation that will make the task easier? I don't care whether it's SQL Server or MySQL, but how many servers really run Access as their DB engine? My question is why make that the default?

The installation and documentation just aren't right for me. I didn't know what the hell ODBC data sources were and had to Google it. A simple "Start Menu > Programs > Administrative Tools > Data Sources" in the instructions would have done the trick.

Basically, everything in the install and config according to the documentation and forum posts seemed so confusing that I decided to quit attempting it.

Thanks for offering your free utility to the world, I'm just sorry I can't use it.

MailEnable-Ben
Posts: 5858
Joined: Fri Jan 16, 2004 6:49 am
Location: Melbourne

Post by MailEnable-Ben » Tue Jul 11, 2006 1:00 am

I have a document on the Bayesian spam filter that will help. If anyone wants it, please PM me and I will send it through. It is an unsupported document, almost a white paper, but it is pretty extensive. And the "apparently" part is correct: you will need to collect your ham and spam on your own server, as using other dictionary files is nowhere near as successful. More is covered in the doc.
Regards,

Product Services
MailEnable Pty Ltd

To keep track of all ME company updates and version releases you should subscribe to the MailEnable list at http://www.mailenable.com or the RSS feed http://www.mailenable.com/rss.

collecting ham even though disabled

Post by abbynormal1 » Thu Jul 20, 2006 5:58 am

I've disabled the Ham collection Filter in Messaging Manager>Filters but for some reason it is still adding Ham to the dictionary.

I'm now at 191861 spam tokens and 300373 ham tokens. I desperately need to pause ham collection for a while to allow spam collection to catch up.

Help appreciated!

Post by abbynormal1 » Thu Jul 20, 2006 6:02 am

I'd also like to add that my dictionaries folder is 419 megs. Can I delete the contents of the Spam and NoSpam directories without losing my work?

The batch that I have running each night is:

move C:\Progra~2\MailEn~1\Postoffices\domain.com\MAILROOT\spammailbox\Inbox\*.mai C:\Progra~2\MailEn~1\Dictionaries\Spam\
move C:\Progra~2\MailEn~1\Postoffices\domain.com\MAILROOT\ham\inbox\*.mai C:\Progra~2\MailEn~1\Dictionaries\NoSpam\
net stop memtas
mespamcmd -m "C:\Progra~2\MailEn~1\Dictionaries\MailEn~1.TAB" "C:\Program Files (x86)\Mail Enable\Dictionaries\Spam" "C:\Program Files (x86)\Mail Enable\Dictionaries\NoSpam"
net start memtas

Post by MailEnable-Ben » Thu Jul 20, 2006 6:21 am

I don't see where you are removing the files from your Spam and NoSpam folders. If you are not removing them, you will continually be re-adding the same messages, i.e. from:

C:\Progra~2\MailEn~1\Dictionaries\NoSpam\
C:\Progra~2\MailEn~1\Dictionaries\Spam\

After the migrate step you should run a del /Q on each folder to remove the messages.

Post by abbynormal1 » Thu Jul 20, 2006 6:46 am

So, I just add this to the end of the batch?

del C:\Progra~2\MailEn~1\Dictionaries\NoSpam\*.*
del C:\Progra~2\MailEn~1\Dictionaries\Spam\*.*

Post by MailEnable-Ben » Thu Jul 20, 2006 7:15 am

Yes. You may also want to look at your dictionary and start a new one, as the variation is probably too great.

Post by abbynormal1 » Thu Jul 20, 2006 4:34 pm

Added those lines and ran the batch to recreate the dictionary. It did fine until it got to the part where the delete command came up. It asked me to confirm y/n if I wanted to delete the files. Any way to bypass that?
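
The /Q switch mentioned earlier in the thread is what suppresses that prompt. A minimal sketch of the two cleanup lines, assuming the same folder paths as the batch posted above. Each path goes directly after the switch, because del treats every argument as something to delete, so a bare *.* in front of the path would also target the current directory.

```bat
rem Empty both training folders without the Y/N confirmation.
rem Paths are assumed from the nightly batch above; adjust to your install.
del /Q C:\Progra~2\MailEn~1\Dictionaries\Spam\*.*
del /Q C:\Progra~2\MailEn~1\Dictionaries\NoSpam\*.*
```

These would go after the net start memtas line, so the messages are removed only once mespamcmd has processed them.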
