Step 1: Set up auto-training for the filter

The Bayesian filter can be auto-trained using ‘good’ emails (ham) and ‘bad’ emails (spam). The auto-training feature can be enabled under Servers > Localhost > Filters > MailEnable Bayesian Filter > Properties > Auto-training tab.

Setting	Description
Enable auto-training	Check this box to enable auto-training. While the Bayesian filter is in auto-training mode, the dictionary cannot be updated manually with the “mespamcmd.exe” command-line utility (described in the Spam Training Utility section). This is because, while auto-training is running, new additions to the dictionary are held in memory and are not written to the hard drive until the MTA service is stopped. A global filter with the 'Bayesian filter spam probability' criterion must be configured for auto-training to work; this is described in Step 4. If no filter is configured with a Bayesian criterion, no auto-training occurs.
Options (Process HTML content in Messages)	If this option is selected and the message contains HTML, the HTML is parsed in addition to the plain-text part of the message, so tokens also include data from the HTML content. This makes the filter more likely to detect HTML spam, because the tokens and patterns found in the HTML of bad messages can be used when calculating the spam probability.
Spam Honeypot Email Addresses (Edit address list)	Define email addresses that never receive valid mail; messages sent to these addresses are sampled as spam. This is described in Step 2.
Ham Addresses (Edit address list)	Define 'ham' (legitimate) email addresses for sampling. This is described in Step 3.
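
The effect of the "Process HTML content in Messages" option can be pictured with a small Python sketch. This is an illustration only, not MailEnable's tokenizer: the function name tokenize, the token formats, and the word-splitting rule are all assumptions made for the example. It shows how enabling HTML processing adds tag and attribute tokens to those taken from the plain-text part.

```python
import re
from html.parser import HTMLParser


class _TokenCollector(HTMLParser):
    """Collects tags, attributes and words from HTML (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Record the tag itself, then each attribute as its own token.
        self.tokens.append(f"<{tag}>")
        for name, value in attrs:
            self.tokens.append(name if value is None else f"{name}={value}")

    def handle_data(self, data):
        # Words inside the HTML body are tokenized like plain text.
        self.tokens.extend(re.findall(r"[a-z0-9$']+", data.lower()))


def tokenize(plain_text, html=None, process_html=False):
    """Hypothetical tokenizer: plain-text words, plus HTML tokens if enabled."""
    tokens = re.findall(r"[a-z0-9$']+", plain_text.lower())
    if process_html and html:
        collector = _TokenCollector()
        collector.feed(html)
        collector.close()
        tokens.extend(collector.tokens)
    return tokens
```

With the option off, only the plain-text words are produced; with it on, markup such as a link's href also becomes a token that can contribute to the spam probability.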

Auto-training only adds further spam messages to the dictionary when the total number of ‘good’ (ham) messages is equal to or greater than the total number of ‘bad’ (spam) messages, and vice versa: further ham messages are only added when the spam total is equal to or greater than the ham total.
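
The balancing rule can be expressed as a short check. The following Python sketch only restates the rule as described in this topic; the function name may_train and its parameters are hypothetical and do not correspond to any MailEnable API.

```python
def may_train(kind: str, ham_total: int, spam_total: int) -> bool:
    """Return True if a sample of the given kind may be added to the dictionary.

    Assumption: ham_total and spam_total are the counts of messages
    already sampled for each class.
    """
    if kind == "spam":
        # Spam is only added while ham has kept pace with spam.
        return ham_total >= spam_total
    if kind == "ham":
        # Ham is only added while spam has kept pace with ham.
        return spam_total >= ham_total
    raise ValueError("kind must be 'ham' or 'spam'")
```

For example, with 5 ham and 10 spam messages sampled, may_train("spam", 5, 10) is False and may_train("ham", 5, 10) is True, so only ham samples are accepted until the totals even out. When the totals are equal, either kind is accepted.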