I had been aware of the spambayes project for a while, but never got around to checking it out, much less installing it. I was assuming that you would need a certain email client to make it work (Outlook, or default Unix mail clients), or that I would have to write my own email client (in Python) and then integrate the package.
Fortunately, I turned out to be wrong. I now have spambayes installed on my Windows 2000 box, happily cooperating with Pegasus Mail. I didn't need a special plugin. All you need is a mail client that has some decent filtering features. It does not matter whether you use Outlook, Eudora, Pegasus, or Joe's Own Email App.
So, the prerequisites are:
- email client with decent filtering functions (more precisely, that lets you filter on email headers)
- Python 2.2.2 or newer
- spambayes (duh)
The following text describes how spambayes works (although there are better sources; see below) and how it's installed. The assumed level is somewhere between expert (who doesn't need this text) and clueless newbie. :-)
How it works
spambayes maintains a collection of emails, each classified as either spam or ham (where 'ham' simply means non-spam). You train it by feeding it more mails and telling it how each one is classified.
Normally, when you check your email, your email client connects to a server and downloads new messages (if any). With spambayes, you run a proxy server that sits between your email program and the server. You tell your email program to connect to the proxy, and the proxy connects to the server. The proxy uses spambayes to inspect incoming messages, and tags them as spam or ham.
All messages are still downloaded as normal by your mail program, but they now have a new header that looks like this:
X-Spambayes-Classification: spam
Instead of spam, it can also say ham, or unsure. This is where the filtering capabilities of your mail program come into play. Add a filter that looks for the header above and, if it says spam, moves the message to a spam folder. (Or you can delete it immediately, but this is not recommended until spambayes is well trained and makes very few mistakes.) In my case, I have two folders, "spam" and "unsure"; Pegasus moves incoming spam and unsure messages to these, so I can check later what to do with them.
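To make the filtering rule concrete, here is a rough sketch in Python of what such a filter boils down to. This is not how Pegasus or spambayes does it internally; it only uses the standard email module to read the header and pick a folder. The function name target_folder and the folder names are made up for the example, and the code assumes a current Python 3 rather than the 2.2 mentioned above.

```python
from email import message_from_string

# Folder names are illustrative; use whatever your mail client calls them.
FOLDERS = {"spam": "spam", "unsure": "unsure"}

def target_folder(raw_message, default="inbox"):
    """Pick a folder based on the X-Spambayes-Classification header."""
    msg = message_from_string(raw_message)
    # Header names are matched case-insensitively; a missing header gives "".
    verdict = msg.get("X-Spambayes-Classification", "").strip().lower()
    # Anything that is not spam or unsure (ham, or no header) stays put.
    return FOLDERS.get(verdict, default)

sample = (
    "From: someone@example.com\n"
    "Subject: Cheap pills\n"
    "X-Spambayes-Classification: spam\n"
    "\n"
    "Buy now!\n"
)
print(target_folder(sample))
```

Messages tagged ham, and messages without the header at all, fall through to the default folder, which is the behaviour you want while the proxy is not running.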
You can train spambayes using your mail client, as well. Simply send a mail to spambayes_spam@localhost (or spambayes_ham@localhost) to tell it whether that mail is spam or ham.
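If you would rather script this than click around in your mail client, the sending side might look something like the sketch below. It only shows how to address a mail to the training addresses and hand it to the SMTP proxy; how the proxy then digests the message is up to spambayes and not shown here. The function names, the sender address and the subject are invented for the example, port 9999 is simply the SMTP proxy port used in the configuration example further down, and the code assumes a current Python 3.

```python
import smtplib
from email.message import EmailMessage

def build_training_mail(original, is_spam, sender="me@localhost"):
    """Wrap a message (given as text) in a mail to the training address."""
    msg = EmailMessage()
    msg["From"] = sender  # hypothetical sender address
    msg["To"] = "spambayes_spam@localhost" if is_spam else "spambayes_ham@localhost"
    msg["Subject"] = "training"
    msg.set_content(original)
    return msg

def send_training_mail(msg, host="localhost", port=9999):
    """Hand the mail to the SMTP proxy (assumed to listen on host:port)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)

mail = build_training_mail("Buy cheap pills now!", is_spam=True)
print(mail["To"])
# send_training_mail(mail)  # uncomment once the proxy is running
```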
How to install
1. Download the latest version of spambayes from SourceForge. Unpack it and run python setup.py install. In my case, this installed the spambayes package in c:\python22\lib\site-packages\spambayes, and some scripts in c:\python22\scripts. Make sure you know where it ends up on your box.
2. Create a file bayescustomize.ini in the scripts directory. (I'm not sure if that is the preferred location, but it seems to work.) This is where you add data for your POP and SMTP servers. It will look roughly like this:
[pop3proxy]
listen_ports:9001
remote_servers:pop3.wanadoo.nl
[smtpproxy]
listen_ports=9999
remote_servers=mail.alltel.net
Or, if you use multiple POP servers (like me):
[pop3proxy]
listen_ports:9001, 9002
remote_servers:pop3.wanadoo.nl, earthlink.net
[smtpproxy]
listen_ports=9999
remote_servers=mail.alltel.net
The POP and SMTP servers can just be copied from your mail client settings. You will have to make up the ports yourself; I use 9001 for the first POP server, 9002 for the next, etc, and 9999 for the SMTP server.
3. Start pop3proxy.py in the scripts directory. This will function both as a POP server and an SMTP server. If it fails with an error message, chances are your configuration file is missing or has incorrect values.
4. In your mail program, change the POP and SMTP servers to localhost and the appropriate port. For example, where I first connected to pop3.wanadoo.nl on port 110 (default), I now want to connect to localhost on port 9001. Change these settings carefully for all your POP and SMTP servers, or things won't work.
5. To test if everything works correctly, check your mail. Messages coming in after the proxy was started should have the X-Spambayes-Classification header. (In Pegasus 4, you have to look at the "Raw view" tab for this. Your email program should also have a way to check a message's headers.) If it's missing, something is wrong, and you should check your settings, and/or if the proxy is (still) running.
6. Also test if you can send mail to spambayes_spam@localhost. Mail that you send there should not bounce.
7. To configure spambayes further, there's a web interface that becomes accessible once the proxy runs. Point your trusty browser to http://localhost:8880. (If you're running spambayes now, that link should work.) Here, you can add new POP servers, train the system, etc. The interface pretty much speaks for itself.
8. Set appropriate filters in your mail client, moving spam and unsure to designated folders, so they won't clutter your inbox any longer.
9. This is not strictly necessary, but useful: I copied pop3proxy.py to pop3proxy.pyw, created a shortcut to it, and put that in the Startup menu. (Simply drag pop3proxy.pyw to the Startup folder in the start menu.) This way, the proxy will start automagically and invisibly when your computer starts. (Note: I haven't tested yet what it does when you're not connected to the Internet.) You can do it with the .py file instead, if you like to watch the messages the proxy spits out.
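When something doesn't work, the usual suspects are the configuration from step 2 and the port numbers from step 4. The following sketch may help with checking both. It reads text in the same format as bayescustomize.ini, pairs each local port with its remote server (assuming, as the numbering above suggests, that the first port belongs to the first server, and so on), and checks whether anything is listening on a port. This is my own illustration using the standard configparser and socket modules, not the way spambayes reads its options; the names proxy_map and is_listening are made up, and the code assumes a current Python 3.

```python
import configparser
import socket

SAMPLE = """\
[pop3proxy]
listen_ports:9001, 9002
remote_servers:pop3.wanadoo.nl, earthlink.net
[smtpproxy]
listen_ports=9999
remote_servers=mail.alltel.net
"""

def proxy_map(ini_text):
    """Return {section: [(local_port, remote_server), ...]}."""
    # ConfigParser accepts both "key:value" and "key=value" lines.
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    result = {}
    for section in parser.sections():
        ports = [int(p) for p in parser.get(section, "listen_ports").split(",")]
        servers = [s.strip() for s in parser.get(section, "remote_servers").split(",")]
        if len(ports) != len(servers):
            raise ValueError("%s: %d ports for %d servers"
                             % (section, len(ports), len(servers)))
        # Assumption: ports and servers are matched up by position.
        result[section] = list(zip(ports, servers))
    return result

def is_listening(port, host="localhost", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for section, pairs in proxy_map(SAMPLE).items():
        for port, server in pairs:
            state = "up" if is_listening(port) else "not listening"
            print("%s: localhost:%d -> %s (%s)" % (section, port, server, state))
```

A port that shows up as not listening while the proxy is supposedly running usually means the proxy never read your configuration file, or that it exited with an error.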
Links
Further reading: