This guide is meant only to show the steps needed to set up Spamassassin, which can be used to identify and eliminate unwanted spam email within the School of Computing. This is not a general guide to the use of Spamassassin. For more information, please consult the Spamassassin web site.
Setting up Spamassassin to Identify Spam
The first step to dealing with the blight of spam is to identify it. This is exactly what spamassassin does. It doesn't remove or refile spam; it merely tags incoming email messages with a rating of how confident it is that the email is spam (we'll show you how to remove or refile it later).
All mail is automatically run through SpamAssassin via Amavisd-new on the mail server. If you would like to make changes beyond what Amavisd-new provides, just pop open your favorite text editor and add the following lines to the top of your ~/.procmailrc file:
#
# Filter all incoming email through spamassassin to add in headers
#
:0fw
| /usr/bin/spamc
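A common variant of the recipe above adds a size condition so that very large messages skip the filter, since scanning them is slow and spam tends to be small. The following is only a sketch: the 256000-byte threshold is an illustrative assumption, not a School of Computing setting, and you should use it in place of the recipe above rather than alongside it.

```procmail
#
# Filter incoming email through spamassassin, but only messages
# smaller than 256000 bytes (assumed threshold; adjust to taste)
#
:0fw
* < 256000
| /usr/bin/spamc
```

Messages at or above the threshold are passed along untouched, so they will not carry any headers added by this recipe.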
After being processed, every email that comes in for you will get tagged with special headers that describe just how much spamassassin thinks this particular email is spam. Here is an example, with all the headers:
From  Tue Mar 11 14:28:49 2003
Return-Path: <>
To:
Date: Tue, 11 Mar 2003 12:30:45 -0500
From: Mama Maria <>
Subject: *****SPAM***** 6 Piece Pasta Bonus, Seen on TV.
X-Spam-Status: Yes, hits=14.8 required=5.0
    tests=AS_SEEN_ON,BIG_FONT,CLICK_BELOW,CLICK_HERE_LINK,
    CTYPE_JUST_HTML,EXCUSE_1,EXCUSE_16,EXCUSE_3,HTML_EMBEDS,
    HTML_FONT_COLOR_GRAY,HTML_FONT_INVISIBLE,LINES_OF_YELLING,
    MAILTO_LINK,MAILTO_TO_REMOVE,MAILTO_WITH_SUBJ,
    MAILTO_WITH_SUBJ_REMOVE,NORMAL_HTTP_TO_IP,OPT_IN,
    RCVD_IN_OSIRUSOFT_COM,RCVD_IN_SBL,SPAM_PHRASE_13_21,
    SUBJ_REMOVE,WEB_BUGS,X_OSIRU_SPAMWARE_SITE
    version=2.43
X-Spam-Flag: YES
X-Spam-Level: **************
X-Spam-Checker-Version: SpamAssassin 2.43 (1.115.2.20-2002-10-15-exp)
X-Spam-Prev-Content-Type: text/html

SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM:
SPAM: Content analysis details: (14.80 hits, 5 required)
SPAM: OPT_IN (1.5 points) BODY: Talks about opting in
SPAM: AS_SEEN_ON (1.4 points) BODY: As seen on national TV!
SPAM: SUBJ_REMOVE (0.7 points) BODY: List removal information
SPAM: EXCUSE_3 (0.4 points) BODY: Claims you can be removed from the list
SPAM: CLICK_BELOW (0.3 points) BODY: Asks you to click below
SPAM: EXCUSE_1 (0.1 points) BODY: Gives an excuse about why you were sent this spam
SPAM: EXCUSE_16 (0.1 points) BODY: I wonder how many emails they sent in error...
SPAM: SPAM_PHRASE_13_21 (1.3 points) BODY: Spam phrases score is 13 to 21 (high)
SPAM:     [score: 18]
SPAM: HTML_EMBEDS (0.4 points) BODY: HTML with embedded plugin object
SPAM: HTML_FONT_INVISIBLE (0.3 points) BODY: HTML font color is same as background
SPAM: BIG_FONT (0.3 points) BODY: FONT Size +2 and up or 3 and up
SPAM: HTML_FONT_COLOR_GRAY (0.3 points) BODY: HTML font color is gray
SPAM: LINES_OF_YELLING (0.2 points) BODY: A WHOLE LINE OF YELLING DETECTED
SPAM: WEB_BUGS (0.2 points) BODY: Image tag with an ID code to identify you
SPAM: CLICK_HERE_LINK (0.3 points) BODY: Tells you to click on a URL
SPAM: MAILTO_LINK (0.2 points) BODY: Includes a URL link to send an email
SPAM: NORMAL_HTTP_TO_IP (1.3 points) URI: Uses a dotted-decimal IP address in URL
SPAM: MAILTO_WITH_SUBJ_REMOVE (0.6 points) URI: Includes a URL link to send an email with the subject 'remove'
SPAM: MAILTO_WITH_SUBJ (0.4 points) URI: Includes a link to send a mail with a subject
SPAM: MAILTO_TO_REMOVE (0.2 points) URI: Includes a 'remove' email address
SPAM: RCVD_IN_OSIRUSOFT_COM (0.4 points) RBL: Received via a relay in relays.osirusoft.com
SPAM:     [RBL check: found 84.8.236.209.relays.osirusoft.com., type: 127.0.0.6]
SPAM: RCVD_IN_SBL (3.2 points) RBL: Received via SBLed relay, see http://www.spamhaus.org/sbl/
SPAM:     [RBL check: found 84.8.236.209.sbl.spamhaus.org.]
SPAM: X_OSIRU_SPAMWARE_SITE (0.3 points) RBL: DNSBL: sender is a Spamware site or vendor
SPAM: CTYPE_JUST_HTML (0.4 points) HTML-only mail, with no text version
SPAM:
SPAM: -------------------- End of SpamAssassin results ---------------------

<html>
<head>
<title>Better Pasta Pot</title>
<meta http-equiv="Content-Type" content="text/html;">
</head>
<body bgcolor="#ffffff">
<img xsrc="http://www.asseenontvnetwork.com/order/track.php?clid=6&gid=PP01&CID=LAEM0001TBPP01-1" height="1" width="1" border="0">
....the rest of the voluminous HTML email has been left out
Filtering Email Messages Marked as Spam
Whew! That is a lot of information. You can configure spamassassin to be more terse, and we'll get into that later. Right now we are just interested in the following header:
X-Spam-Status: Yes, hits=14.8 required=5.0
This is the magic line that we will filter on using procmail. One common practice is to refile all email marked as spam to another folder. That way, if something gets mistakenly tagged as spam, you can easily retrieve it by just copying it from one folder to another. Here are the procmail instructions to refile mail that has been tagged as spam by spamassassin into a Maildir folder called "Spam":
#
# Refile everything marked as spam to the "Spam" folder
#
:0 : spam.lock
* ^X-Spam-Status: Yes
$HOME/Maildir/.Spam/
Boom. That's it. Everything that Spamassassin thinks is spam will now not show up in your inbox, but instead will be refiled to a different folder called Spam. This way, if anything mistakenly gets identified as spam, you can just go to this folder and move it out into another folder.
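If you would rather keep the most obvious spam apart from the borderline cases, one option is to filter on the X-Spam-Level header, which carries one asterisk per whole point of score (the example message above scored 14.8 and has 14 asterisks). The following is a minimal sketch, not part of the School of Computing setup: the folder name "Spam-High" and the cutoff of 10 are assumptions chosen for illustration. It would need to appear before the general "Spam" recipe, because procmail stops at the first recipe that delivers the message.

```procmail
#
# Refile mail scoring 10 or more to a separate "Spam-High" folder
# (hypothetical folder name and cutoff; ten escaped asterisks match
# an X-Spam-Level of ten or more stars)
#
:0
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*
$HOME/Maildir/.Spam-High/
```

Mail scoring between the required threshold and the cutoff still falls through to the "Spam" folder, so that is the folder worth checking for messages tagged by mistake.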
Customizing Spamassassin
There are tons of ways that you can customize spamassassin. Individual configurations go in the file ~/.spamassassin/user_prefs, and are listed one per line. Please see the Spamassassin Conf documentation for all the configuration details. Another page that is useful to read over is the list of tests spamassassin performs.
There are a few SoC-specific customizations that we have done to help spamassassin do its job. Specifically, all mail that comes from utah.edu has 1.5 subtracted from the score (decreasing the likelihood it is spam), and another 1.0 is subtracted if the email comes from cs.utah.edu, so don't feel like you need to do that on your own.