SpamAssassin site wide spam learning
August 13th, 2007
SpamAssassin is great. I wouldn’t run a mail server without it. Obviously it isn’t 100% accurate from day one; that’s where Bayes learning comes in. Yes, it auto-learns, but sometimes we want to convince it a little more.
Take my ISP setup for example. I encourage IMAP/WebMail users to slam spam that has slipped through into a Junk folder.
My Thunderbird has its own Bayes engine, which catches 89% of the 10% that slips through and moves those messages there automatically.
* OK, they are random stats, but who cares.
Back to the point, take this sa_wrapper.sh that runs nightly from cron.
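#!/bin/bash
# Nightly site-wide Bayes training from every user's Junk folder.
echo "Forcing Expire..."
/usr/bin/sa-learn --force-expire
echo "Learning from Junk folders..."
# Pass the cur directories rather than every message file, so a big haul
# of spam is far less likely to hit the shell's argument-list limit.
/usr/bin/sa-learn --spam /home/vpopmail/domains/*/*/Maildir/.Junk/cur
echo "Cleaning Junk Folders..."
# Remove only the message files, printing each one as it goes.
/usr/bin/find /home/vpopmail/domains/*/*/Maildir/.Junk/cur -type f -print -delete
echo "Current Bayes Info..."
/usr/bin/sa-learn --dump magic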
This script learns the spam from all users’ Junk folders and mops up after itself.
See the sa-learn man page for a detailed description of what is going on.
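The script above deletes the Junk mail whether or not the learning step worked. A more defensive variant could handle one Junk folder at a time and only clean up after sa-learn exits successfully. This is a rough sketch, assuming sa-learn returns a non-zero status on failure; the names SA_LEARN and learn_and_clean are made up for illustration and are not part of the original script.

```shell
#!/bin/bash
# Sketch: learn and clean one Junk folder at a time, deleting mail only
# after sa-learn reports success. SA_LEARN and learn_and_clean are
# illustrative names, not part of the original script.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"

learn_and_clean() {
    local dir="$1"
    # Skip missing folders and folders that hold no messages.
    [ -d "$dir" ] || return 0
    [ -n "$(find "$dir" -type f -print -quit)" ] || return 0
    # Hand sa-learn the directory itself; remove mail only if it succeeded.
    if "$SA_LEARN" --spam "$dir"; then
        find "$dir" -type f -delete
    else
        echo "sa-learn failed for $dir; leaving messages in place" >&2
        return 1
    fi
}

for dir in /home/vpopmail/domains/*/*/Maildir/.Junk/cur; do
    learn_and_clean "$dir"
done
```

If a folder fails to learn, its messages stay put and get another chance on the next nightly run.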
* I’m using the qmail/vpopmail combo, which stores mail in Maildir format. If you don’t use the same, I’m sure it can be adapted to suit.
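As a starting point for adapting it, the Junk folders could be located with find instead of a hard-coded vpopmail glob. This is only a sketch: MAIL_ROOT, JUNK_NAME and find_junk_dirs are hypothetical names, and it assumes your mailboxes still follow the Maildir/.Junk/cur layout somewhere below the root you give it.

```shell
#!/bin/bash
# Sketch: list Junk folders under any mail root rather than hard-coding
# the vpopmail path. MAIL_ROOT and JUNK_NAME are illustrative assumptions.
MAIL_ROOT="${MAIL_ROOT:-/home/vpopmail/domains}"
JUNK_NAME="${JUNK_NAME:-.Junk}"

# Print the cur directory of every Junk folder found below the given root.
find_junk_dirs() {
    local root="$1"
    [ -d "$root" ] || return 0
    find "$root" -type d -path "*/Maildir/$JUNK_NAME/cur" | sort
}

find_junk_dirs "$MAIL_ROOT"
```

Each line of output is a directory that can be passed to sa-learn --spam in place of the glob used in the script.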
August 13th, 2007 at 17:07
[...] for example, my site wide spamassassin Bayes learning script that runs nightly, it has started failing sometimes when a lot of spam has been caught by [...]