How To Learn Spam Messages Collected
in an MS Exchange Mail Folder

Are you using Microsoft Exchange™ and moving false positives and false negatives to separate mail folders? Do you now want to use those folders to train the SpamAssassin™ Bayes database? Then you have a problem: Microsoft uses proprietary file formats for its mail folders, which are incompatible with the mbox format that sa-learn can read.

There are two ways around this problem. One involves IMAP2mbox (available in the contribution area), which uses IMAP access to retrieve a mail folder for learning.

The other is to use one of the many mail clients that store mail natively in mbox format. Retrieve the mail from Exchange using POP3 or IMAP, and save the messages as an mbox file that sa-learn can read. Suitable mail clients include Eudora and Mozilla Thunderbird; the latter is available for free.
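
Either way, the underlying idea is the same: fetch each message over IMAP with its headers intact and append it to an mbox file. The following is a minimal sketch of that idea in Python, not the way IMAP2mbox itself works. The function name folder_to_mbox is made up for this illustration, and the connection passed in is assumed to be an already logged-in imaplib.IMAP4 or imaplib.IMAP4_SSL object.

```python
import imaplib
import mailbox


def folder_to_mbox(conn, folder, mbox_path):
    """Copy every message in an IMAP folder into a local mbox file.

    conn is assumed to be a logged-in imaplib.IMAP4 / IMAP4_SSL object.
    Returns the number of messages written.
    """
    # Open the folder read-only so the messages are not marked as seen.
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")

    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("IMAP search failed")

    box = mailbox.mbox(mbox_path)  # created if it does not exist yet
    count = 0
    try:
        for num in data[0].split():
            # BODY.PEEK[] asks for the complete raw message, headers included.
            status, parts = conn.fetch(num, "(BODY.PEEK[])")
            if status != "OK":
                continue
            for part in parts:
                if isinstance(part, tuple):
                    # part[1] holds the raw bytes; mbox files use LF line ends.
                    box.add(part[1].replace(b"\r\n", b"\n"))
                    count += 1
        box.flush()
    finally:
        box.close()
    return count
```

To use it, open a connection with imaplib.IMAP4_SSL, call its login method with the account name and password, and pass the connection in together with the folder name. Folder names that contain spaces have to be wrapped in double quotes by the caller, because imaplib does not quote them.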

The following step-by-step instructions assume you have an Exchange server and Outlook clients, and use IMAP2mbox as the mbox conversion tool.


  1. Create a public folder for spam mail messages.
  2. Create a public folder for ham mail messages.

Instruct your users to drag and drop undetected spam to the 'Spam' public folder, and false positives to the 'Ham' public folder. They must not forward the messages instead, because forwarding loses the original message headers, which are very important to the Bayes filter.

Using IMAP2mbox to convert mail folders to mbox format

The IMAP2mbox archive contains the IMAP2mbox manual, and sample batch files that retrieve a mailbox and learn the messages either as spam or as ham.

Edit the files to supply the necessary parameters (account names and passwords).
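
The sample batch files themselves are not reproduced here. As a rough sketch of the learning step only: once a folder has been saved as an mbox file, learning it comes down to calling sa-learn with --spam or --ham together with --mbox. The Python helper names below (sa_learn_command, learn) and the file names in the usage note are hypothetical, and sa-learn is assumed to be on the PATH.

```python
import subprocess


def sa_learn_command(kind, mbox_path):
    """Build the sa-learn command line for one mbox file.

    kind is "spam" or "ham", matching sa-learn's --spam / --ham options.
    """
    if kind not in ("spam", "ham"):
        raise ValueError(f"unknown kind: {kind!r}")
    return ["sa-learn", f"--{kind}", "--mbox", str(mbox_path)]


def learn(kind, mbox_path):
    """Run sa-learn; raises CalledProcessError if it exits non-zero."""
    subprocess.run(sa_learn_command(kind, mbox_path), check=True)
```

For example, learn("spam", "spam.mbox") followed by learn("ham", "ham.mbox") would train both folders in one run. Passing the arguments as a list avoids shell quoting problems with unusual file names.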

Ongoing Usage

Daily or weekly, the administrator should:

  1. Look over the messages to be learned. Users cannot be trusted to always do the right thing, and once a ham message has been learned as spam, the error is hard to find and undo later.
  2. Run the batch files created earlier.
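
If the messages have already been saved as an mbox file, the review step can be made quicker by printing one line per message before anything is learned. This is only a sketch of one possible review aid; the function name summarize_mbox is made up for this illustration.

```python
import mailbox


def summarize_mbox(mbox_path):
    """Return a (sender, subject) pair for each message in an mbox file."""
    # create=False: fail loudly instead of creating an empty file on a typo.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        rows = []
        for msg in box:
            rows.append((
                str(msg.get("From", "(no From header)")),
                str(msg.get("Subject", "(no Subject header)")),
            ))
        return rows
    finally:
        box.close()
```

Printing the returned pairs for the spam file and the ham file side by side makes a misfiled message easier to spot than opening each one in turn.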

It is a good idea to archive the messages for later reference (e.g. in case you have to unlearn a message with sa-learn --forget).
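
One simple way to archive, assuming the learned messages are kept as mbox files, is to copy each file to a date-stamped name after learning it. The sketch below does that and also builds the matching sa-learn --forget command for an archived file. The function names and the naming scheme are assumptions made for this example.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_mbox(mbox_path, archive_dir, day=None):
    """Copy an mbox file into archive_dir under a date-stamped name."""
    day = day or date.today()
    src = Path(mbox_path)
    dest_dir = Path(archive_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # e.g. spam.mbox becomes spam-2024-01-02.mbox
    dest = dest_dir / f"{src.stem}-{day.isoformat()}{src.suffix}"
    shutil.copy2(src, dest)
    return dest


def forget_command(mbox_path):
    """Command line that makes sa-learn forget every message in the mbox."""
    return ["sa-learn", "--forget", "--mbox", str(mbox_path)]
```

Note that a second run on the same day overwrites that day's archive file, so this scheme fits a schedule of at most one learning run per day.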

Closing Remarks

Your feedback is welcome! Please submit hints and suggestions to .