SpamAssassin site wide spam learning

August 13th, 2007

SpamAssassin is great. I wouldn’t run a mail server without it. Obviously it isn’t 100% accurate from day one; that’s where Bayes learning comes in. Yes, it auto-learns, but sometimes we want to convince it a little more.


Take my ISP setup, for example. I encourage IMAP/WebMail users to move any spam that has slipped through into a Junk folder.
My Thunderbird has its own Bayes engine, which catches 89% of the 10% that slips through and files those messages there automatically.
* OK, those are made-up stats, but who cares.

Back to the point: take this sa_wrapper.sh, which runs nightly from cron.
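#!/bin/bash
# sa_wrapper.sh: nightly site-wide Bayes training, run from cron.
echo "Forcing Expire..."
/usr/bin/sa-learn --force-expire
echo "Learning from Junk folders..."
# Pass the cur/ directories rather than every message file, so a busy night
# does not overflow the shell's argument list.
/usr/bin/sa-learn --spam /home/vpopmail/domains/*/*/Maildir/.Junk/cur/
echo "Cleaning Junk Folders..."
# Delete only the message files; find walks the directories itself, so the
# individual files never appear on the command line.
/usr/bin/find /home/vpopmail/domains/*/*/Maildir/.Junk/cur/ -type f -print -delete
echo "Current Bayes Info..."
/usr/bin/sa-learn --dump magic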

This script learns the spam from all users’ Junk folders and mops up after itself.
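
One thing to be aware of: the mop-up step runs whether or not sa-learn actually succeeded. A more cautious variant might look something like the sketch below, which handles one Junk directory at a time and only empties it when learning worked. The function name learn_and_clean and the SA_LEARN override are my own inventions for illustration, not part of SpamAssassin.

```shell
#!/bin/bash
# Sketch: learn from one Junk directory and empty it only if learning succeeded.
# SA_LEARN is a hypothetical override (handy for dry runs); it defaults to the
# sa-learn path used in the post.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"

learn_and_clean() {
    local junk_dir="$1"
    # Skip mailboxes that have no Junk folder or no messages waiting.
    [ -d "$junk_dir" ] || return 0
    [ -n "$(find "$junk_dir" -type f -print -quit)" ] || return 0
    if "$SA_LEARN" --spam "$junk_dir"; then
        # Learning worked, so the messages are safe to remove.
        find "$junk_dir" -type f -delete
    else
        echo "sa-learn failed for $junk_dir, leaving messages in place" >&2
        return 1
    fi
}

# Example loop over the vpopmail layout from the post (uncomment to use):
# for dir in /home/vpopmail/domains/*/*/Maildir/.Junk/cur/; do
#     learn_and_clean "$dir"
# done
```

Calling sa-learn once per mailbox is slower than one big run, but a failure in one mailbox no longer costs you the unlearned spam in it.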

See the sa-learn man page for a detailed description of what is going on.
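
The --dump magic output at the end of the script is worth a glance now and then, since by default SpamAssassin does not use Bayes scores until it has learned at least 200 spam and 200 ham messages. As a rough sketch, the two counts can be pulled out with awk. This assumes the usual output layout, where the value sits in the third column and the label follows "non-token data:", so check it against your own output first. The function name bayes_counts is made up for this example.

```shell
#!/bin/bash
# Sketch: summarise the message counts from `sa-learn --dump magic` output.
# Assumes the value is the third column and the line ends with the label
# "non-token data: nspam" or "non-token data: nham".
bayes_counts() {
    awk '
        /non-token data: nspam$/ { nspam = $3 }
        /non-token data: nham$/  { nham  = $3 }
        END { printf "spam learned: %d, ham learned: %d\n", nspam, nham }
    '
}

# Usage: /usr/bin/sa-learn --dump magic | bayes_counts
```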

* I’m using the qmail/vpopmail combo, which stores mail in Maildir format. If you don’t use the same, I’m sure the script can be adapted to suit.
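
As a starting point for adapting it, here is a minimal sketch that keeps the mail layout in one variable and lists whichever Junk directories exist under it. MAIL_ROOT_GLOB and list_junk_dirs are hypothetical names, and the /home/*/Maildir example is only a guess at a common system-user layout, so adjust both to your server.

```shell
#!/bin/bash
# Sketch: list the Junk directories for a given mail layout.
# MAIL_ROOT_GLOB is a hypothetical setting; example values:
#   vpopmail (from the post):   /home/vpopmail/domains/*/*/Maildir
#   system users (assumption):  /home/*/Maildir
MAIL_ROOT_GLOB="${MAIL_ROOT_GLOB:-/home/vpopmail/domains/*/*/Maildir}"

list_junk_dirs() {
    local root
    # $MAIL_ROOT_GLOB is left unquoted on purpose so the shell expands the
    # wildcards; this breaks on paths containing spaces.
    for root in $MAIL_ROOT_GLOB; do
        [ -d "$root/.Junk/cur" ] && printf '%s\n' "$root/.Junk/cur"
    done
    return 0
}

# Usage: list_junk_dirs | while read -r dir; do /usr/bin/sa-learn --spam "$dir"; done
```

If your server stores mail as mbox files instead, sa-learn has a --mbox option for that, though the clean-up step would need a different approach than deleting files.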


One Response to “SpamAssassin site wide spam learning”

  1. bash - Argument List Too Long | kieranbarnes Says:

    [...] for example, my site wide spamassassin Bayes learning script that runs nightly, it has started failing sometimes when a lot of spam has been caught by [...]
