MailEnable Professional Guide
    Manual training

    Manual training of the Bayesian filter uses scripts and the Spam Training Utility to update the dictionary file with spam and ham (legitimate mail). Manual training can run alongside auto-training and is a good way of adding emails that avoided detection to the dictionary so they can be caught in future.

    As with auto-training, both spam and ham need to be collected, but the collection process differs, as detailed below.

    Collecting spam for manual training

    Two ways to collect spam for manual training purposes are:

    1. Creating a catchall address.  Set up a mailbox address as a catchall address.  This address collects all email for a domain that does not map to a mailbox.  Most of the mail in this mailbox will be spam, because spammers often send to unknown addresses at a domain.  Do not use the same address as one that is being used for auto-training.
    2. Using public folders.  Set up public folders for post offices for the purpose of collecting spam. IMAP users can drag and drop spam messages from their inbox into the public folder for collection. A script can then be scheduled to copy the content of these folders to a single spam repository folder for addition to the dictionary. For an example script, see Compiling the dictionary using a script.
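
    The consolidation step for public folders can be sketched as a short batch loop. This is only an illustration under stated assumptions: it assumes each post office has its own directory under Postoffices containing a PUBROOT\SPAM public folder, and that the repository folder is Dictionaries\Custom\Spam. The variable names POSTOFFICES and REPO are hypothetical; adjust all paths to the local installation.

```bat
@echo off
REM Sketch only: gather spam from every post office's public SPAM folder
REM into one repository folder. The folder layout below is an assumption.
set "POSTOFFICES=C:\Program Files\Mail Enable\Postoffices"
set "REPO=C:\Program Files\Mail Enable\Dictionaries\Custom\Spam"

if not exist "%REPO%" mkdir "%REPO%"

REM Visit each post office directory in turn
for /D %%P in ("%POSTOFFICES%\*") do (
    if exist "%%P\PUBROOT\SPAM\*.mai" (
        REM Delete the source messages and index only if the copy succeeded
        copy "%%P\PUBROOT\SPAM\*.mai" "%REPO%" >nul && (
            del /Q "%%P\PUBROOT\SPAM\*.mai"
            del /Q "%%P\PUBROOT\SPAM\*.xml" 2>nul
        )
    )
)
```

    Chaining the delete after the copy with && means a failed copy leaves the original messages in place, which may be preferable to deleting them unconditionally.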

    Collecting ham for manual training

    One way of collecting ham for manual training is to configure a filter that collects mail from senders who have authenticated.  To do this, follow this procedure:

    The inbox of this mailbox can then be used as a source for ham messages to be used for manual training.

    Compiling the dictionary using a script

    The Spam Training Utility is used to add emails to a dictionary. It takes spam and ham from two specified folders, processes them, and adds them to the dictionary. Because the emails to add may be spread across various public folders and catchall mailboxes, a scheduled batch script is normally used to copy them from these locations into the two folders that the Spam Training Utility reads.

    An example script for this is below. The script also stops and starts the MTA service so that it can be used along with auto-training. Because the Spam Training Utility only works on the dictionary on the hard drive, the MTA service needs to be stopped first so that any auto-training additions are written out.

    The script is just an example and would need to be modified to match the MailEnable configuration.

    Example Script
    REM Copy mail stored by either a catchall account mailbox or filter into two folders,
    REM Spam and NoSpam which will be used by the training utility to add to the
    REM dictionary
    copy "C:\Program Files\Mail Enable\Postoffices\\MAILROOT\spam\Inbox\*.mai" "C:\Program Files\Mail Enable\Dictionaries\Custom\Spam\*.*"
    del /Q "C:\Program Files\Mail Enable\Postoffices\\MAILROOT\spam\Inbox\*.mai"
    copy "C:\Program Files\Mail Enable\Postoffices\\MAILROOT\ham\Inbox\*.mai" "C:\Program Files\Mail Enable\Dictionaries\Custom\NoSpam\*.*"
    del /Q "C:\Program Files\Mail Enable\Postoffices\\MAILROOT\ham\Inbox\*.mai"
    REM Now the email from Public folders is copied. Normally only junk emails will be
    REM used when using Public Folders for dictionary training
    copy "C:\Program Files\Mail Enable\Postoffices\\PUBROOT\SPAM\*.mai" "C:\Program Files\Mail Enable\Dictionaries\Custom\Spam\*.*"
    REM Remove the index file and messages from the folder
    del /Q "C:\Program Files\Mail Enable\Postoffices\\PUBROOT\SPAM\*.mai"
    del /Q "C:\Program Files\Mail Enable\Postoffices\\PUBROOT\SPAM\*.xml"
    REM Stop the MTA service to write out any auto-training dictionary
    net stop MEMTAS
    REM Process the messages in the Spam and NoSpam folders and add them to the dictionary token file.
    mespamcmd -m "c:\Program Files\Mail Enable\Dictionaries\default\" "c:\Program Files\Mail Enable\Dictionaries\Custom\Spam" "c:\Program Files\Mail Enable\Dictionaries\Custom\NoSpam"
    REM Clean up the dictionary spam and ham folders
    del /Q "C:\Program Files\Mail Enable\Dictionaries\Custom\Spam\*.MAI"
    del /Q "C:\Program Files\Mail Enable\Dictionaries\Custom\NoSpam\*.MAI"
    REM Start the MTA service
    net start MEMTAS
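
    Since the script is meant to run on a schedule, one way to register it is with the Windows schtasks command. The following is a minimal sketch, assuming the script has been saved as C:\Scripts\train-bayes.bat and that a nightly run at 02:00 is acceptable. The task name, script path, and start time are all hypothetical.

```bat
REM Hypothetical task name, script path and start time; adjust to the local setup.
REM The task runs as SYSTEM so that "net stop" and "net start" have sufficient rights.
schtasks /Create /TN "MailEnable Bayesian Training" /TR "C:\Scripts\train-bayes.bat" /SC DAILY /ST 02:00 /RU SYSTEM
```

    Because the script stops the MTA service while the dictionary is updated, a quiet period for mail delivery is usually the better choice for the start time.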