This guide shows only the steps needed to set up SpamAssassin, which can be used to identify and eliminate unwanted spam email within the School of Computing. It is not a general guide to the use of SpamAssassin. For more information, please consult the SpamAssassin web site.
Setting up SpamAssassin to Identify Spam
The first step in dealing with the blight of spam is to identify it. This is exactly what SpamAssassin does. It doesn't remove or refile spam; it merely tags incoming email messages with a rating of how confident it is that the message is spam (we'll show you how to remove/refile it later).
All mail is automatically run through SpamAssassin via Amavisd-new on the mail server. If you would like to make changes beyond what Amavisd-new provides, just pop open your favorite text editor and add the following lines to the top of your ~/.procmailrc file:
# Filter all incoming email through spamassassin to add in headers
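A minimal sketch of such a filtering recipe, assuming the spamassassin executable is on procmail's PATH (the install location on the School of Computing machines is not stated here, so substitute the full path if needed):

```procmail
# Sketch only: pass each message through spamassassin as a filter.
# Assumes `spamassassin` can be found on PATH.
#   f = treat the command as a filter (its output replaces the message)
#   w = wait for the filter to finish and check its exit code
:0fw
| spamassassin
```

The message that comes out of the filter carries the added headers and continues on to any recipes further down in the file.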
After being processed, every email that comes in for you will be tagged with special headers that describe how likely SpamAssassin thinks it is that the message is spam. Here is an example, with all the headers:
Date: Tue, 11 Mar 2003 12:30:45 -0500
From: Mama Maria <
Subject: *****SPAM***** 6 Piece Pasta Bonus, Seen on TV.
X-Spam-Status: Yes, hits=14.8 required=5.0
X-Spam-Checker-Version: SpamAssassin 2.43 (184.108.40.206-2002-10-15-exp)
SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM: Content analysis details: (14.80 hits, 5 required)
SPAM: OPT_IN (1.5 points) BODY: Talks about opting in
SPAM: AS_SEEN_ON (1.4 points) BODY: As seen on national TV!
SPAM: SUBJ_REMOVE (0.7 points) BODY: List removal information
SPAM: EXCUSE_3 (0.4 points) BODY: Claims you can be removed from the list
SPAM: CLICK_BELOW (0.3 points) BODY: Asks you to click below
SPAM: EXCUSE_1 (0.1 points) BODY: Gives an excuse about why you were sent this spam
SPAM: EXCUSE_16 (0.1 points) BODY: I wonder how many emails they sent in error...
SPAM: SPAM_PHRASE_13_21 (1.3 points) BODY: Spam phrases score is 13 to 21 (high)
SPAM: [score: 18]
SPAM: HTML_EMBEDS (0.4 points) BODY: HTML with embedded plugin object
SPAM: HTML_FONT_INVISIBLE (0.3 points) BODY: HTML font color is same as background
SPAM: BIG_FONT (0.3 points) BODY: FONT Size +2 and up or 3 and up
SPAM: HTML_FONT_COLOR_GRAY (0.3 points) BODY: HTML font color is gray
SPAM: LINES_OF_YELLING (0.2 points) BODY: A WHOLE LINE OF YELLING DETECTED
SPAM: WEB_BUGS (0.2 points) BODY: Image tag with an ID code to identify you
SPAM: CLICK_HERE_LINK (0.3 points) BODY: Tells you to click on a URL
SPAM: MAILTO_LINK (0.2 points) BODY: Includes a URL link to send an email
SPAM: NORMAL_HTTP_TO_IP (1.3 points) URI: Uses a dotted-decimal IP address in URL
SPAM: MAILTO_WITH_SUBJ_REMOVE (0.6 points) URI: Includes a URL link to send an email with the subject 'remove'
SPAM: MAILTO_WITH_SUBJ (0.4 points) URI: Includes a link to send a mail with a subject
SPAM: MAILTO_TO_REMOVE (0.2 points) URI: Includes a 'remove' email address
SPAM: RCVD_IN_OSIRUSOFT_COM (0.4 points) RBL: Received via a relay in relays.osirusoft.com
SPAM: [RBL check: found 220.127.116.11.relays.osirusoft.com., type: 127.0.0.6]
SPAM: RCVD_IN_SBL (3.2 points) RBL: Received via SBLed relay, see http://www.spamhaus.org/sbl/
SPAM: [RBL check: found 18.104.22.168.sbl.spamhaus.org.]
SPAM: X_OSIRU_SPAMWARE_SITE (0.3 points) RBL: DNSBL: sender is a Spamware site or vendor
SPAM: CTYPE_JUST_HTML (0.4 points) HTML-only mail, with no text version
SPAM: -------------------- End of SpamAssassin results ---------------------
<title>Better Pasta Pot</title>
<meta http-equiv="Content-Type" content="text/html;">
<img xsrc="http://www.asseenontvnetwork.com/order/track.php?clid=6&gid=PP01&CID=LAEM0001TBPP01-1" height="1" width="1"
....the rest of the voluminous HTML email has been left out
Filtering Email Messages Marked as Spam
Whew! That is a lot of information. You can configure spamassassin to be more terse, and we'll get into that later. Right now we are just interested in the following header:
X-Spam-Status: Yes, hits=14.8 required=5.0
This is the magic line that we will filter on using procmail. One common practice is to refile all email marked as spam to another folder. That way, if something gets mistakenly tagged as spam, you can easily retrieve it by just copying it from one folder to another. Here are the procmail instructions to refile mail that has been tagged as spam by SpamAssassin into a Maildir folder called "Spam":
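# Refile everything marked as spam to the "Spam" folder.
# Assumption: "Spam/" is a Maildir path relative to $MAILDIR (the trailing
# slash selects Maildir delivery); adjust it to your own folder layout.
:0 : spam.lock
* ^X-Spam-Status: Yes
Spam/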
Boom. That's it. Everything that SpamAssassin thinks is spam will no longer show up in your inbox; it will be refiled to a folder called Spam instead. If anything is mistakenly identified as spam, you can go to this folder and move it back out.
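If you would rather remove the most obvious spam outright than refile it, a recipe along the following lines could sit above the refiling recipe in ~/.procmailrc. This is only a sketch: it relies on the "hits=" format of the X-Spam-Status header shown in the example above (newer SpamAssassin releases write "score=" instead), and the cutoff of 10 is an arbitrary value chosen for illustration.

```procmail
# Sketch only: discard messages tagged as spam with a score of 10.0 or more.
# [1-9][0-9] matches a score whose integer part has at least two digits,
# so "hits=14.8" matches but "hits=6.3" does not.
# The cutoff of 10 is an illustrative assumption, not a recommended value.
:0
* ^X-Spam-Status: Yes, hits=[1-9][0-9]
/dev/null
```

Anything delivered to /dev/null is gone for good, so a message wrongly given a high score cannot be recovered. Lower-scoring spam falls through to the refiling recipe and still lands in the Spam folder.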
Customizing SpamAssassin
There are tons of ways that you can customize SpamAssassin. Individual configurations go in the file ~/.spamassassin/user_prefs, and are listed one per line. Please see the SpamAssassin Conf documentation for all the configuration details. Another page that is useful to read over is the list of tests SpamAssassin performs.
There are a few SoC-specific customizations that we have made to help SpamAssassin do its job. Specifically, all mail that comes from utah.edu has 1.5 subtracted from its score (decreasing the likelihood that it is tagged as spam), and another 1.0 is subtracted if the email comes from cs.utah.edu, so don't feel like you need to do that on your own. For example, a message from cs.utah.edu that would otherwise score 6.0 ends up at 3.5, below the required score of 5.0 shown in the example above.