SpamAssassin site wide spam learning
August 13th, 2007
SpamAssassin is great. I wouldn’t run a mail server without it. Obviously it isn’t 100% accurate from day one; that’s where Bayes learning comes in. Yes, it auto-learns, but sometimes we want to convince it a little more.
Take my ISP setup for example. I encourage IMAP/WebMail users to slam any spam that has slipped through into a Junk folder.
My Thunderbird has its own Bayes engine, which catches 89% of the 10% that slips through and moves those messages there automatically.
* OK, they are random stats, but who cares.
Back to the point, take this sa_wrapper.sh that runs nightly from cron.
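#!/bin/bash
# Expire old tokens from the Bayes database before learning
echo "Forcing Expire..."
/usr/bin/sa-learn --force-expire
# Teach Bayes that everything in the users' Junk folders is spam
echo "Learning from Junk folders..."
/usr/bin/sa-learn --spam /home/vpopmail/domains/*/*/Maildir/.Junk/cur/*
# Delete the learned messages (options go before the paths)
echo "Cleaning Junk Folders..."
/bin/rm -rfv /home/vpopmail/domains/*/*/Maildir/.Junk/cur/*
# Print a summary of the Bayes database
echo "Current Bayes Info..."
/usr/bin/sa-learn --dump magic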
This script learns the spam from all users’ Junk folders and mops up after itself.
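For the "runs nightly from cron" part, a crontab entry along these lines would do it. This is only a sketch: the install path /usr/local/bin/sa_wrapper.sh, the 03:15 start time and the log file are my assumptions, so adjust them to taste, and run it as whichever user owns your site wide Bayes database.

```
# m  h  dom mon dow  command
# Hypothetical path, time and log file; change to match your setup
15   3  *   *   *    /usr/local/bin/sa_wrapper.sh >> /var/log/sa_wrapper.log 2>&1
```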
See the sa-learn man page for a detailed description of what is going on.
* I’m using the qmail/vpopmail combo, which stores mail in Maildir format. If you don’t use the same, I’m sure it can be adapted to suit.
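The globs in the script expand to one argument per message, which can turn into a very long command line on a busy server. Here is a rough sketch of a variant that hands sa-learn one Junk folder at a time instead (sa-learn accepts a directory as well as individual files). SA_LEARN, MAIL_ROOT and learn_junk_folders are names I made up for illustration, they are not part of the script above, and the find options assume GNU find.

```shell
#!/bin/bash
# Hypothetical variant of sa_wrapper.sh: one sa-learn call per Junk folder.
# SA_LEARN and MAIL_ROOT are illustrative names; override them for other layouts.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"
MAIL_ROOT="${MAIL_ROOT:-/home/vpopmail/domains}"

learn_junk_folders() {
    local dir
    for dir in "$MAIL_ROOT"/*/*/Maildir/.Junk/cur; do
        # An unmatched glob stays literal, so skip anything that is not a directory
        [ -d "$dir" ] || continue
        # Skip folders that contain no messages
        if [ -z "$(find "$dir" -maxdepth 1 -type f -print -quit)" ]; then
            continue
        fi
        # Learn the whole folder; leave the mail in place if learning failed
        "$SA_LEARN" --spam "$dir" || continue
        # Remove the learned messages without building a long argument list
        find "$dir" -maxdepth 1 -type f -delete
    done
}
```

To try it, call learn_junk_folders in place of the learn and clean steps in the nightly script. If your mail lives somewhere else, set MAIL_ROOT and tweak the glob inside the loop to match your layout.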
August 13th, 2007 at 17:07
[...] for example, my site wide spamassassin Bayes learning script that runs nightly, it has started failing sometimes when a lot of spam has been caught by [...]