kieranbarnes Independent PHP, WordPress & CubeCart Programmer

SpamAssassin site wide spam learning

Posted on August 13, 2007

SpamAssassin is great. I wouldn't run a mail server without it. Obviously it isn't 100% accurate from day one; that's where Bayes learning comes in. Yes, it auto-learns, but sometimes we want to convince it a little more.


Take my ISP setup, for example. I encourage IMAP/WebMail users to move any spam that has slipped through into a Junk folder.
My Thunderbird has its own Bayes engine, which catches 89% of the 10% that slips through and files those messages there automatically.
* OK, those are made-up stats, but who cares.

Back to the point, take this sa_wrapper.sh that runs nightly from cron.
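#!/bin/bash
# Force an expiry run to prune old tokens from the Bayes database
echo "Forcing Expire..."
/usr/bin/sa-learn --force-expire
# Feed every message in every user's Junk folder to Bayes as spam
echo "Learning from Junk folders..."
/usr/bin/sa-learn --spam /home/vpopmail/domains/*/*/Maildir/.Junk/cur/*
# Delete the messages that were just learned (options go before the paths)
echo "Cleaning Junk Folders..."
/bin/rm -fv /home/vpopmail/domains/*/*/Maildir/.Junk/cur/*
# Print the Bayes database summary
echo "Current Bayes Info..."
/usr/bin/sa-learn --dump magic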

This script learns the spam from all users' Junk folders and mops up after itself.

See the sa-learn man page for a detailed description of what is going on.
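If you want the nightly run to report just the learned message counts rather than the full "Current Bayes Info" dump, something like the following might do. It is only a sketch: bayes_count is a hypothetical helper name, and it assumes the usual --dump magic layout, where the value sits in the third column and the line ends with a label such as "non-token data: nspam".

```shell
#!/bin/bash
# Sketch: extract one counter from `sa-learn --dump magic` output.
# bayes_count is a hypothetical helper; it reads the dump on stdin.
# Assumed line layout: <prob> <spam> <value> <atime> non-token data: <name>
bayes_count() {
    # $1 is the counter name, e.g. nspam or nham
    awk -v key="$1" '$5 == "non-token" && $NF == key { print $3 }'
}
```

Usage would be along the lines of `/usr/bin/sa-learn --dump magic | bayes_count nspam`. The counts are worth watching because, by default, SpamAssassin does not use Bayes at all until it has learned at least 200 spam and 200 ham messages, and the script above only feeds it spam.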

* I'm using the qmail/vpopmail combo, which stores mail in Maildir format. If you don't use the same, I'm sure it can be adapted to suit.
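For anyone adapting this to a different layout, here is a rough sketch of one way to do it. MAIL_ROOT, SA_LEARN and learn_junk are names made up for illustration, not part of the script above. It assumes sa-learn will accept a Maildir directory as its target, which also sidesteps the "Argument list too long" error that a very large glob can trigger. Messages are only deleted when the learn step reports success.

```shell
#!/bin/bash
# Sketch: learn from, then empty, every .Junk/cur folder found under MAIL_ROOT.
# MAIL_ROOT and SA_LEARN are illustrative variables; adjust them to your setup.
MAIL_ROOT="${MAIL_ROOT:-/home/vpopmail/domains}"
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"

learn_junk() {
    # Find the folders at any depth, so the domain/user layout does not matter.
    # Assumes no newlines in directory names.
    find "$MAIL_ROOT" -type d -path '*/.Junk/cur' | while IFS= read -r dir; do
        # Skip folders with no messages in them
        [ -n "$(find "$dir" -type f | head -n 1)" ] || continue
        # Hand the whole directory to sa-learn; stdin is redirected so the
        # command cannot swallow the list of folders the loop is reading
        if "$SA_LEARN" --spam "$dir" </dev/null; then
            # Learning succeeded, so the messages can go
            find "$dir" -type f -exec rm -f {} +
        else
            echo "sa-learn failed for $dir, leaving messages in place" >&2
        fi
    done
}
```

Calling `learn_junk` in place of the learn and clean steps would cover any tree where the Junk folder is named .Junk. Change the -path pattern if your mail client uses another folder name.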

