SingleUserUnixInstall

Single-user SpamAssassin installation

This page will take you through a complete single-user SpamAssassin installation on a typical Unix account.

Many systems already have SpamAssassin and the support packages installed, in which case this whole page is unnecessary for you (you can check with spamassassin -V). However, if you want to run the newest version of SpamAssassin and all the related packages, you can always guarantee the setup by installing it yourself in your own directory. This also means, of course, that you must update it yourself as new versions are released.

Note that, unfortunately, almost every Unix installation is slightly different, and that if you just blindly follow the commands here, this may not work, and you could even lose some mail. In other words, if you really don't know Unix, you may want to get someone who knows it better to help you with this install.

Overview

We're going to install SpamAssassin 3.0.2, and add in SPF, Razor, Pyzor, and DCC. We're going to set up a mistake-based Bayesian learner (including an IMAP LearnAsSpam folder and support for forwarding the mail to another account), as described at ProcmailToForwardMail. We're also going to create a web front-end to ease whitelist administration. This assumes that your system already has procmail installed, and has a new enough Perl and Python to work with this software, plus a number of the standard Perl modules, such as Net::DNS and DB_File. You need these two installed to use DNSBLs and Bayes, both of which are important for good performance.
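Before starting, it can help to confirm that the Perl prerequisites named above are actually present. The following is a minimal sketch, not part of SpamAssassin: the helper name have_perl_module is made up for illustration, and it simply asks Perl to load each module.

```shell
# have_perl_module is a hypothetical helper, not part of SpamAssassin.
# It succeeds if Perl can load the named module.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Report on the two modules this page relies on.
for module in Net::DNS DB_File; do
    if have_perl_module "$module"; then
        echo "$module: found"
    else
        echo "$module: missing"
    fi
done
```

If either module is reported missing, ask your system administrator to install it or install it into your own directory the same way as the packages below.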

Setting up your path

If your system now or in the future may have other copies of spamassassin or the other packages installed on it, we want to make sure that it uses the version we're installing. We do this by having the shell look first in our local bin directory. Even more important, we need to tell Perl where to find the packages we're installing locally, and we try to work around a language bug with some versions of Perl 5.8.

We can do this with bash by entering the following lines at the top of the .bashrc (pico .bashrc):

export PATH=$HOME/bin:$HOME/perl5/bin:$PATH
export MANPATH=$HOME/man:$HOME/perl5/man:$MANPATH
export PERL5LIB=$HOME/lib/perl5/:$HOME/lib/perl5/site_perl/5.8.3/:$PERL5LIB
export LANG=en_US

The 5.8.3 should be replaced with the version you get when entering (perl -v).
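Rather than typing the version by hand, we can ask Perl for it. This is a sketch under the assumption that the directory layout matches the lines above; perl5lib_line is a hypothetical helper name, and it only prints the line so you can review it before pasting it into your .bashrc.

```shell
# perl5lib_line is a hypothetical helper. Given a Perl version such as 5.8.3,
# it prints the export line for .bashrc; with no argument it asks Perl itself.
perl5lib_line() {
    version=${1:-$(perl -MConfig -e 'print $Config{version}')}
    printf 'export PERL5LIB=$HOME/lib/perl5/:$HOME/lib/perl5/site_perl/%s/:$PERL5LIB\n' "$version"
}

# Print the line for the version used on this page.
perl5lib_line 5.8.3
```

Running perl5lib_line with no argument prints the same line with the version of whichever perl is first in your PATH.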

After saving and exiting (Ctrl-X, then y, then Enter), we reload the .bashrc with the command (cd ~; . .bashrc). The same export lines work if your shell is sh, ksh, or zsh; put them in the corresponding startup file.

For csh (.cshrc) and tcsh (.tcshrc), you would add the following lines instead:

setenv PATH $HOME/bin:$HOME/perl5/bin:$PATH
setenv MANPATH $HOME/man:$HOME/perl5/man:$MANPATH
setenv PERL5LIB $HOME/lib/perl5/:$HOME/lib/perl5/site_perl/5.8.3/:$PERL5LIB
setenv LANG en_US

Installing SpamAssassin

We're going to download SpamAssassin and the other packages into $HOME/src:

cd $HOME
mkdir src
cd src
wget http://www.apache.org/dist/spamassassin/Mail-SpamAssassin-3.0.2.tar.gz
tar xvzf Mail-SpamAssassin-3.0.2.tar.gz
cd Mail-SpamAssassin-3.0.2
perl Makefile.PL PREFIX=$HOME && make && make install

Press Enter four times to accept the defaults, which are all fine.

Testing installation

Make sure we're working with the version we just installed by entering (which spamassassin); we should see something like (/home/myusername/bin/spamassassin).
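The same check can be scripted. This is only a sketch: is_under_home is a hypothetical helper that tests whether a path lives below $HOME, and it assumes the local install went into $HOME/bin as above.

```shell
# is_under_home is a hypothetical helper: it succeeds if the given path
# lives somewhere below $HOME.
is_under_home() {
    case "$1" in
        "$HOME"/*) return 0 ;;
        *)         return 1 ;;
    esac
}

# command -v prints the path the shell would run, or nothing if not found.
sa_path=$(command -v spamassassin 2>/dev/null || true)
if is_under_home "$sa_path"; then
    echo "Using the local copy: $sa_path"
else
    echo "Not the local copy: ${sa_path:-spamassassin not found in PATH}"
fi
```

If it reports a system copy, recheck the PATH line in your shell startup file and reload it.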

Enter (spamassassin < $HOME/src/Mail-SpamAssassin-3.0.2/sample-spam.txt). You should see a message that spamassassin is creating user preferences file and then see the output of the message with the SpamAssassin markup.

If that doesn't work, look at the debug output with spamassassin -D < $HOME/src/Mail-SpamAssassin-3.0.2/sample-spam.txt and then perhaps take a look at FixingErrors.

SPF support

SpamAssassin 3.0 supports SPF to detect and penalize header forgery. This requires Mail::SPF::Query, a relatively new package that's not yet installed on most machines. You can confirm whether you have it by entering (perl -e 'require Mail::SPF::Query'). If you get the error "Can't locate Mail/SPF/Query.pm in @INC...", you need to install it. If you get no feedback, it is already installed and you can skip to the next section.

To install SPF, do the following:

cd $HOME/src
wget http://spf.pobox.com/Mail-SPF-Query-1.997.tar.gz
tar xvzf Mail-SPF-Query-1.997.tar.gz
cd Mail-SPF-Query-1.997
perl Makefile.PL PREFIX=$HOME && make && make install

You can test this installation (and that PERL5LIB is set correctly) with (perl -e 'require Mail::SPF::Query').

Razor support

To install the packages that Razor requires, do the following:

cd $HOME/src
wget http://unc.dl.sourceforge.net/sourceforge/razor/razor-agents-sdk-2.03.tar.gz
tar xvzf razor-agents-sdk-2.03.tar.gz
cd razor-agents-sdk-2.03
perl Makefile.PL PREFIX=$HOME && make && make install

To install Razor:

cd $HOME/src
wget http://unc.dl.sourceforge.net/sourceforge/razor/razor-agents-2.67.tar.gz
tar xvzf razor-agents-2.67.tar.gz
cd razor-agents-2.67
perl Makefile.PL PREFIX=$HOME && make && make install
razor-client
razor-admin -create
razor-admin -discover
razor-admin -register

It should then say "Register successful...". (Note that you may need to enter the last command a couple times to reach the registration server; if it says "Error 202", try "razor-admin -register" again.)
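If you'd rather not retype the command, a small retry loop can do it for you. This is a sketch only: retry is a hypothetical helper, and it assumes razor-admin exits with a non-zero status when registration fails. If that doesn't hold on your system, watch the output for "Register successful" instead.

```shell
# retry is a hypothetical helper: run a command up to N times, pausing
# RETRY_DELAY seconds (default 10) between failed attempts.
retry() {
    attempts=$1
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "Attempt $n of $attempts failed" >&2
        n=$((n + 1))
        if [ "$n" -le "$attempts" ]; then
            sleep "${RETRY_DELAY:-10}"
        fi
    done
    return 1
}

# Example (commented out so the sketch can be pasted safely):
# retry 5 razor-admin -register
```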

Pyzor support

To install Pyzor:

cd $HOME/src
wget http://unc.dl.sourceforge.net/sourceforge/pyzor/pyzor-0.4.0.tar.bz2
tar xvfj pyzor-0.4.0.tar.bz2
cd pyzor-0.4.0
python setup.py build
python setup.py install --home=$HOME
pyzor discover

If you get the following error message, define PYTHONPATH to point at ($HOME/lib/python):

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: No module named pyzor.client
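For bash, sh, ksh, or zsh, setting PYTHONPATH might look like the line below, added next to the other export lines in your shell startup file. This is a sketch that assumes Pyzor's modules landed in $HOME/lib/python, as the --home=$HOME install above implies.

```shell
# Put the per-user Python library directory first on PYTHONPATH,
# keeping whatever was there before.
export PYTHONPATH="$HOME/lib/python${PYTHONPATH:+:$PYTHONPATH}"

# To check, this should now run without the ImportError:
# python -c 'import pyzor.client'
```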

DCC support

To install DCC:

cd $HOME/src
wget http://www.dcc-servers.net/dcc/source/dcc-dccproc.tar.Z
tar xfvz dcc-dccproc.tar.Z
cd dcc-dccproc-*
./configure --disable-sys-inst  --disable-server --disable-dccm \
--disable-dccifd  --homedir=$HOME/dir  --bindir=$HOME/bin
make && make install

Test spamassassin installation

First, create your Bayes databases by entering (sa-learn --sync).

You should now have all the packages you need installed. You can test this by entering

spamassassin -D < $HOME/src/Mail-SpamAssassin-3.0.2/sample-nonspam.txt

and carefully reviewing the output. Specifically, look for the following lines:

debug: bayes: found bayes db version 3
debug: is DNS available? 1
debug: registering glue method for check_for_spf_helo_pass (Mail::SpamAssassin::Plugin::SPF=HASH(0x8d21990))
debug: Razor2 is available
debug: Pyzor is available: /home/username/bin/pyzor
debug: DCC is available: /home/username/bin/dccproc

These lines confirm, in order, that DB_File, Net::DNS, Mail::SPF::Query, Razor, Pyzor, and DCC are all correctly installed and configured.
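The debug output is long, so it can be easier to save it to a file and search it. The sketch below looks for a fragment of each of the six lines listed above. check_sa_debug is a hypothetical helper, the fragments are taken from the sample output on this page, and the example assumes the debug output is written to standard error.

```shell
# check_sa_debug is a hypothetical helper: it scans a saved debug log for the
# six lines listed above and reports any that are missing.
check_sa_debug() {
    log=$1
    missing=0
    while IFS='|' read -r name pattern; do
        if grep -F -q -- "$pattern" "$log"; then
            echo "ok:      $name"
        else
            echo "MISSING: $name"
            missing=1
        fi
    done <<'EOF'
DB_File (Bayes)|bayes: found bayes db version
Net::DNS|is DNS available? 1
Mail::SPF::Query|Mail::SpamAssassin::Plugin::SPF
Razor|Razor2 is available
Pyzor|Pyzor is available
DCC|DCC is available
EOF
    return "$missing"
}

# Example: capture both output streams, then check the log.
# spamassassin -D < $HOME/src/Mail-SpamAssassin-3.0.2/sample-nonspam.txt > sa-debug.log 2>&1
# check_sa_debug sa-debug.log
```

Anything reported as MISSING points at the package whose install section is worth repeating.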

Configure procmail

Copy the sample .procmailrc from ProcmailToForwardMail. The easiest way to do this is:

cd $HOME
wget http://wiki.apache.org/spamassassin-data/attachments/ProcmailToForwardMail/attachments/procmailrc.forward.txt
mv procmailrc.forward.txt .procmailrc

It's essential that you edit that file with your correct public and private addresses. Do this with (pico .procmailrc).

If you don't want your mail forwarded to another account, you can instead use the example procmail file by entering (cp $HOME/src/Mail-SpamAssassin-3.0.2/procmailrc.example $HOME/.procmailrc).

Configure .forward

Follow the steps in the first section of UsedViaProcmail to enable procmail.

Specifically, if your system supports .forward files (as opposed to .qmail) and is not already processing mail through procmail, then edit your .forward. Replace user with your username (which you can discover by entering whoami) and enter the correct procmail path (which you can discover with which procmail):

cd $HOME
pico .forward
"|IFS=' ' && exec /usr/bin/procmail -f- || exit 75 #user"

Choose Ctrl-X, then y, then enter to save.
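To avoid typos in that line, we can generate it from the two values. This is a sketch: forward_line is a hypothetical helper, and it only prints the line so you can review it before putting it in your .forward. The fallback path /usr/bin/procmail is the one used in the example above.

```shell
# forward_line is a hypothetical helper: print the .forward entry for the
# given procmail path and username.
forward_line() {
    printf "\"|IFS=' ' && exec %s -f- || exit 75 #%s\"\n" "$1" "$2"
}

# Show the line for this account; review it before writing it to ~/.forward.
forward_line "$(command -v procmail || echo /usr/bin/procmail)" "$(whoami)"
```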

Test mail installation

Now, you should be ready to send some test emails and ensure everything works as expected. First, send yourself a test email that doesn't contain anything suspicious. You should receive it normally, but there will be a header containing "X-Spam-Status: No".

Now, send yourself a copy of the GTUBE test string to check to be sure it is marked as spam. That string is:

XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X

This email will be recognized as spam and put in the almost-certainly-spam folder. You should be able to see it by entering (less $HOME/mail/almost-certainly-spam).
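If the GTUBE mail doesn't show up there, it helps to test SpamAssassin on its own, without sending any mail. The sketch below builds a minimal message containing the string. gtube_message is a hypothetical helper and the address is a placeholder. Note that this checks only SpamAssassin, not procmail delivery.

```shell
# gtube_message is a hypothetical helper: print a minimal mail message whose
# body is the GTUBE test string. The address is a placeholder.
gtube_message() {
    printf '%s\n' \
        "From: $1" \
        "To: $1" \
        "Subject: GTUBE test" \
        "" \
        'XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X'
}

# Example: run it through the local install and look at the verdict header.
# gtube_message myusername@example.com | spamassassin | grep '^X-Spam-Status'
```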

If your test non-spam email doesn't get through to you, immediately rename your .forward file (mv .forward .forward.broken) until you figure out the cause of the problem, so you don't lose incoming email.

Note: one possible cause for problems is the use of smrsh on the MTA system; see ProcmailVsSmrsh for details.

End-user mail filtering

You now want to set up filtering in your mail client to automatically move likely spam to a Junk mail folder. (Note that the .procmailrc we're using leaves very-high-likelihood spam on the server or drops it on the floor, so we never see it.) The directions for this are different for every mail client, but they all involve filtering on the header X-Spam-Flag: YES and moving the resulting mail to a junk folder.

If you have a false positive (a real mail that winds up in your junk folder), you can add a whitelist_from *@example.com line to your ($HOME/.spamassassin/user_prefs). For false negatives (spam that gets through), redirect it to spam@yourservername.com or just move it to the LearnAsSpam folder.
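Adding a whitelist entry from the command line might look like the sketch below. add_whitelist is a hypothetical helper. It appends the line only if it isn't already there, and defaults to the user_prefs path used on this page.

```shell
# add_whitelist is a hypothetical helper: append a whitelist_from line to a
# prefs file unless that exact line is already present.
add_whitelist() {
    address=$1
    prefs=${2:-$HOME/.spamassassin/user_prefs}
    line="whitelist_from $address"
    if grep -F -x -q -- "$line" "$prefs" 2>/dev/null; then
        echo "Already present: $line"
    else
        printf '%s\n' "$line" >> "$prefs"
        echo "Added: $line"
    fi
}

# Example (quote the address so the shell doesn't expand the *):
# add_whitelist '*@example.com'
```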

Enable IMAP LearnAsSpam folder

If your final delivery is to an IMAP accessible MTA, you can set up an even easier way to do mistake-based Bayesian learning. Namely, you can create a LearnAsSpam folder. Rather than resending spam for learning, you can just move any false negatives (spam that got delivered to your inbox) to this folder. Then, every hour, those mails are pulled down (and deleted) from your IMAP server and learned as spam. Specifically, many installations of Exchange server support access via IMAP, so this solution is one of the easiest ways to enable end-user Bayesian training by Exchange users.

To do this, we need fetchmail, which we can confirm is installed with (which fetchmail). First, we create a .fetchmailrc (pico .fetchmailrc) with our IMAP account information. This should look like the following, filling in your own information for the server, username, and password:

poll mail.example.com protocol IMAP:
user myusername with password mypassword

Now make it only readable to you with:

chmod 600 .fetchmailrc

In your mail client, create a top level IMAP folder called LearnAsSpam. Now, to test if the setup works, move some spam into this folder. It's essential that this be real spam or else you'll mistrain your Bayesian learner.

The path to fetchmail (/usr/local/bin/fetchmail) in the following command should be set to the result of (which fetchmail). From the command line, enter:

/usr/local/bin/fetchmail -a -v -n --folder LearnAsSpam -m '$HOME/bin/sa-learn -D --spam'

You should see debug information of fetchmail accessing your IMAP account and downloading one message at a time from the LearnAsSpam folder, and then debug info from sa-learn as it learns the message as spam. sa-learn is smart enough to automatically strip away the SpamAssassin markup, if any. The messages should have disappeared from your LearnAsSpam folder. Once that's working well, you're ready to create a cron job to automatically do this every hour.

Enter the following commands:

echo "0 * * * * /usr/local/bin/fetchmail -a -s -n --folder \
LearnAsSpam -m '$HOME/bin/sa-learn --spam' > /dev/null" > cronfile
crontab cronfile
crontab -l

You should see the line starting with "0 * * * *" displayed. This means that you've set up a cron job to automatically run fetchmail every hour. In case you're curious, -a means all mail in the folder, -s is silent, -v verbose, -n means not to modify any headers, and -D turns on debugging in sa-learn. We redirect the output to /dev/null to avoid having cron email us the output from sa-learn about messages having been learned.
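One caution: crontab cronfile replaces your entire crontab, so any jobs you already had are lost. If you have existing jobs, a merge along the following lines keeps them. This is a sketch only; add_cron_line is a hypothetical helper and the file names cron.current and cron.new are placeholders.

```shell
# add_cron_line is a hypothetical helper: print the contents of an existing
# crontab file followed by a new entry, skipping the entry if it is already there.
add_cron_line() {
    existing=$1
    entry=$2
    if [ -s "$existing" ]; then
        cat "$existing"
    fi
    if ! grep -F -x -q -- "$entry" "$existing" 2>/dev/null; then
        printf '%s\n' "$entry"
    fi
}

# Example: save the current crontab, merge in the new line, and install the result.
# crontab -l > cron.current 2>/dev/null
# add_cron_line cron.current "$(cat cronfile)" > cron.new
# crontab cron.new
```

Running it a second time leaves the crontab unchanged, since the entry is only added when it is not already present.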

Enabling user configuration through webuserprefs

The LearnAsSpam folder is a great way to do mistake-based training of the Bayesian filter based on false negatives. However, when SpamAssassin (very occasionally) misclassifies a real mail (ham) as spam, I like to whitelist the sender to avoid it occurring again. The advantage of this approach is that it guarantees that any future mail with that From address will get through. The disadvantage is that spammers could forge that address to get spam through to me. However, since the addresses I'm entering are fairly random, I haven't had any problem with any forgery.

The easiest way to enable end-users to configure their whitelists is via a web user interface. I prefer webuserprefs, which is flexible but fairly easy to install. These directions assume you have Apache web hosting and PHP support on the same account where you've installed spamassassin mail processing. Specifically, it assumes that your SpamAssassin user prefs are at $HOME/.spamassassin/user_prefs and that $HOME/public_html is the home directory of your website (or an alias to it). We're going to create a password-protected directory where you can edit your SpamAssassin preferences. (myusername needs to be the same as your username on this server, but mypassword can and probably should be different):

cd $HOME/src
wget http://voxel.dl.sourceforge.net/sourceforge/webuserprefs/webuserprefs-0.6.tar.gz
tar xvzf webuserprefs-0.6.tar.gz
mv webuserprefs-0.6 $HOME/public_html/webuserprefs
cd $HOME/public_html/webuserprefs
htpasswd -bc .passwd myusername mypassword

Enter pico .htaccess and create the following file (correcting the path, which you can find with pwd):

## password begin ##
AuthUserFile /usr/www/users/myusername/webuserprefs/.passwd
AuthName     "Protected"
AuthType     Basic
<Limit GET POST PUT>
require valid-user
</Limit>
<Files .passwd>
deny from all
</Files>
## password end ##

And we need to set permissions on the necessary files:

chmod 666 $HOME/.spamassassin/user_prefs
chmod 705 .htaccess

Now we edit config.php (pico config.php), removing the "// " from the start of the line:

// require("auth/server.php");

We also need to set the correct path to your home directory. Enter (cd $HOME/.spamassassin; pwd). If the path starts with (/home/myusername), no change is necessary. If it starts with (/usr/home/myusername), make the following change (or adjust the path accordingly):

$user_prefs     = "/home/$auth_user/.spamassassin/user_prefs";

To:

$user_prefs     = "/usr/home/myusername/.spamassassin/user_prefs";

Finally, find where $group_sort is set to no and change to:

$group_sort     = "yes";

You should now be able to access your preferences from http://www.example.com/webuserprefs/, which should also require a username and password.

webuserprefs lets you configure a lot of things. In fact, if you (cp contrib/panels/* panels/) and reload the webpage, you'll see some extra panels that allow you to control even more. This is fine for power users, but the end users I'm working with don't want to be confused by all of these options, since the defaults I've set up for them are fine. They just want a simple way to edit their whitelist. So, (mv panels/* contrib/panels/) will get rid of all the panels and just leave the editing of whitelists and blacklists.

Now, if you find a sender whose mail is being incorrectly put in the Junk folder (a false positive), you can just go to the webpage, enter their email address with Accept Mail From and click Add Rule. Also, occasionally, I use the Reject Mail From (blacklist) for senders that won't honor an unsubscribe. However, the Bayesian learning can work just as well as a blacklist. As the webpage describes, whitelists can also support wildcards, of the form:

*@unitedoffers.com

Follow-up

You'll want to subscribe to the spamassassin-announce list to be alerted when new updates come out. Follow the same steps as your original install (with the new filename, of course), and the make install will automatically overwrite old versions.

If you want to install custom rules, such as those at CustomRulesets, just

cd $HOME/etc/mail/spamassassin

and wget the ones you want. Note that many of these rules have already been incorporated into SpamAssassin 3.0.2, so you may have an unduly high risk of FalsePositives if you download more.

If you're using SpamAssassin for non-commercial use, you may also want to turn on the MAPS rules, which are useful DNSBLs. Edit the user_prefs by entering pico $HOME/.spamassassin/user_prefs and add the following 4 lines:

score RCVD_IN_MAPS_RBL 2.0
score RCVD_IN_MAPS_DUL 1.0
score RCVD_IN_MAPS_RSS 2.0
score RCVD_IN_MAPS_NML 2.0

Contributors


CategoryInstall

last edited 2005-04-16 02:58:54 by DanKohn