My relationship with dspam, a spam filter built on a Bayesian classifier, has been unwavering since I first installed it in 2005. I previously used SpamAssassin. While SpamAssassin offers a wide variety of tests drawing on potentially dozens of sources, including rulesets, DNS blacklists, URI blacklists, checksum databases, and Bayesian classification, I found its false negative rate unacceptable and its resource overhead excessive.
Recently, the time came for a fresh installation of the latest stable release of dspam. My prior install used version 3.0, which has become quite old, though still effective. It was simply a brute-force effort involving procmail and some other nastiness. Instead of continuing that tradition, I have configured dspam to work with Exim4 and PostgreSQL on Ubuntu 6.06.
The configuration that follows allows Exim4 to pass incoming, non-local messages without a dspam header to dspam. The message is checked, then reinjected into Exim for local delivery if the message is deemed innocent. If the message is believed to be spam, the mail will be quarantined on the system. All tokens and user preferences are stored within a PostgreSQL database. The Web interface allows you to report false positives and negatives. Should that sound appealing, read on for details.
To install dspam, you will need the following entry uncommented in your sources.list.
deb http://us.archive.ubuntu.com/ubuntu/ dapper universe
And the usual installation of packages.
# apt-get install dspam \
    libdspam7-drv-pgsql dspam-webfrontend
For dspam to run at start up, you must edit /etc/default/dspam accordingly.
START=yes
For dspam to speak to PostgreSQL, you will want to modify /etc/dspam/dspam.d/pgsql.conf accordingly.
PgSQLServer 127.0.0.1
PgSQLUser dspam
PgSQLPass
PgSQLDb libdspam7drvpgsql
It’s necessary to create a database and populate it with the current schema.
# su - postgres -c 'createuser -R -D -S -e dspam'
# su - postgres -c 'createdb libdspam7drvpgsql -O dspam'
# su - postgres -c \
    'cat /usr/share/doc/libdspam7-drv-pgsql/pgsql_objects.sql | psql -U postgres libdspam7drvpgsql'
Finally, PostgreSQL must allow trust for the dspam user to the dspam database on localhost. While having dspam use a password is supposedly possible, I was never successful in having it connect to localhost via TCP with a password required. Add the following to the /etc/postgresql/8.1/main/pg_hba.conf file and restart PostgreSQL.
# Most specific first.
# Must come before more general host rules.
host libdspam7drvpgsql dspam 127.0.0.1/32 trust
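The ordering comment matters because PostgreSQL stops at the first pg_hba.conf record that matches a connection. As a rough illustration, the sketch below reports which auth method would win for a given database and user. The function name first_method and the general "host all all ... md5" rule are hypothetical, chosen only for the demonstration, and the matching is deliberately simplified (no address checks, CIDR-style records only).

```shell
# first_method FILE DB USER
# Print the auth method of the first "host" record in FILE whose database
# and user fields match DB and USER (or are "all").
# Simplified sketch: assumes CIDR-style addresses (method in field 5) and
# ignores address matching, comma-separated lists, and non-host records.
first_method() {
    awk -v db="$2" -v usr="$3" '
        $1 == "host" && ($2 == db || $2 == "all") && ($3 == usr || $3 == "all") {
            print $5
            exit
        }
    ' "$1"
}

# Try it on a scratch file, not the live pg_hba.conf.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Most specific first.
host libdspam7drvpgsql dspam 127.0.0.1/32 trust
host all all 127.0.0.1/32 md5
EOF

first_method "$sample" libdspam7drvpgsql dspam   # prints: trust
first_method "$sample" template1 jasonb          # prints: md5
```

If the two host lines in the scratch file were swapped, the first call would report md5 instead, which is the situation the comment in pg_hba.conf warns about.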
Configuring dspam itself is necessary and no easy task. Given the wide range of configurations dspam supports, there are many, many options. The configuration that follows supports using dspam over a UNIX socket in a daemon setup. Many options are described in the dspam.conf file in some detail, while the purpose of others is entirely mystifying.
StorageDriver /usr/lib/dspam/libpgsql_drv.so
TrustedDeliveryAgent "/usr/sbin/exim4 -oi -oMr despammed"
OnFail error
Trust root
Trust dspam
Trust mail
Trust mailnull
Trust smmsp
Trust daemon
Trust Debian-exim
TrainingMode teft
TestConditionalTraining on
Feature chained
Feature whitelist
Algorithm graham burton
PValue graham
Preference "spamAction=quarantine"
Preference "signatureLocation=headers"
Preference "showFactors=on"
AllowOverride trainingMode
AllowOverride spamAction spamSubject
AllowOverride statisticalSedation
AllowOverride enableBNR
AllowOverride enableWhitelist
AllowOverride signatureLocation
AllowOverride showFactors
AllowOverride optIn optOut
AllowOverride whitelistThreshold
HashRecMax 98317
HashAutoExtend on
HashMaxExtents 0
HashExtentSize 49157
HashMaxSeek 100
HashConnectionCache 10
Notifications off
PurgeSignatures 14 # Stale signatures
PurgeNeutral 90 # Tokens with neutralish probabilities
PurgeUnused 90 # Unused tokens
PurgeHapaxes 30 # Tokens with less than 5 hits (hapaxes)
PurgeHits1S 15 # Tokens with only 1 spam hit
PurgeHits1I 15 # Tokens with only 1 innocent hit
LocalMX 127.0.0.1
SystemLog on
UserLog on
Opt in
ServerMode dspam
ServerPass.heh "huh"
ServerDomainSocketPath "/var/spool/dspam/dspam.sock"
ClientHost /var/spool/dspam/dspam.sock
ClientIdent "huh@heh"
ProcessorBias on
Include /etc/dspam/dspam.d/
In my configuration above, the usage of dspam is entirely opt-in. With a database storage driver, you absolutely must set any preferences, such as the optIn flag, using the dspam_admin tool, which will properly set the option in the database. The per-user preference files on disk are completely ignored under the database drivers.
# dspam_admin list preferences jasonb
# dspam_admin add preference jasonb optIn on
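With more than a handful of accounts, typing that command per user gets tedious. A minimal sketch of a helper follows, assuming the same dspam_admin syntax shown above. The function name opt_in_users and the user alice are hypothetical. It only prints the commands so they can be reviewed first.

```shell
# opt_in_users USER...
# Emit one "dspam_admin add preference" command per user.
# Prints the commands rather than running them; pipe the output to sh
# (as root) once it looks right.
opt_in_users() {
    for u in "$@"; do
        printf 'dspam_admin add preference %s optIn on\n' "$u"
    done
}

opt_in_users jasonb alice
```

Printing instead of executing is a deliberate choice here: a mistyped user name lands in the preferences table just as readily as a real one.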
Also, add the dspam user to the Debian-exim group.
# usermod -G Debian-exim -a dspam
The Exim4 portion of my configuration is based almost entirely upon one by Simon McVittie.
First, in /etc/exim4/conf.d/router/550_exim4-local-dspam
dspam_router:
no_verify
check_local_user
condition = "${if and { \
{!def:h_X-My-Dspam:} \
{!eq {$received_protocol}{local}} \
{!eq {$received_protocol}{despammed}} \
{ <= {$message_size}{3M}} \
}\
{1}{0}}"
headers_add = "X-My-Dspam: scanned by $primary_hostname, $tod_full"
driver = accept
transport = dspam_transport
dspam_error_spam_router:
driver = accept
domains = example.com
local_part_suffix = -spam
transport = dspam_error_spam_transport
dspam_error_ham_router:
driver = accept
domains = example.com
local_part_suffix = -fp
transport = dspam_error_ham_transport
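The condition on dspam_router is dense, so here is the same test restated as a small shell function. This is only a sketch of the logic, not something Exim runs. The name should_scan and its three arguments are hypothetical, and the check_local_user requirement is left out. Exim reads 3M as 3 × 1024 × 1024 bytes.

```shell
# should_scan HAS_HEADER PROTOCOL SIZE_BYTES
# Succeeds when the message would be handed to dspam_transport:
#   - no X-My-Dspam header yet (HAS_HEADER is "no")
#   - not submitted locally
#   - not already reinjected by dspam (protocol "despammed")
#   - no larger than 3M
should_scan() {
    [ "$1" = "no" ] &&
    [ "$2" != "local" ] &&
    [ "$2" != "despammed" ] &&
    [ "$3" -le $((3 * 1024 * 1024)) ]
}

should_scan no esmtp 20000 && echo "scan"
should_scan no despammed 20000 || echo "skip: already been through dspam"
should_scan yes esmtp 20000 || echo "skip: header present"
```

The despammed check is what prevents a loop: dspam reinjects innocent mail with -oMr despammed, so the second pass fails the condition and falls through to normal local delivery.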
Next, in /etc/exim4/conf.d/transport/40_exim4-config_local_dspam the following will actually summon dspam. Notice only innocent mail is delivered, allowing for spam to be quarantined. If you deliver both innocent and spam, spam will not be quarantined even if quarantining is enabled in the dspam.conf configuration.
dspam_transport:
driver = pipe
command = "/usr/bin/dspam --client --deliver=innocent --user ${lc:$local_part} -f '$sender_address' -oi -oMr despammed -- %u"
user = dspam
group = dspam
log_output = true
return_fail_output = true
return_path_add = false
message_prefix =
message_suffix =
dspam_error_spam_transport:
driver = pipe
command = "/usr/bin/dspam --client --source=error --class=spam --user ${lc:$local_part} -f '$sender_address' -oi -oMr despammed -- %u"
user = dspam
group = dspam
log_output = true
return_fail_output = true
return_path_add = false
message_prefix =
message_suffix =
dspam_error_ham_transport:
driver = pipe
command = "/usr/bin/dspam --client --source=error --class=innocent --user ${lc:$local_part} -f '$sender_address' -oi -oMr despammed -- %u"
user = dspam
group = dspam
log_output = true
return_fail_output = true
return_path_add = false
message_prefix =
message_suffix =
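The two error transports are reached through the -spam and -fp suffixes declared on the routers. When a local_part_suffix matches, Exim strips it, so $local_part in the transport command is the bare user name. The sketch below mimics that mapping for illustration only. The function name classify_address is hypothetical, and it ignores the domain and local-user checks.

```shell
# classify_address LOCAL_PART
# Print "<user> <action>" the way the three routers would split the address.
classify_address() {
    case "$1" in
        *-spam) printf '%s retrain-as-spam\n' "${1%-spam}" ;;
        *-fp)   printf '%s retrain-as-innocent\n' "${1%-fp}" ;;
        *)      printf '%s scan\n' "$1" ;;
    esac
}

classify_address jasonb        # prints: jasonb scan
classify_address jasonb-spam   # prints: jasonb retrain-as-spam
classify_address jasonb-fp     # prints: jasonb retrain-as-innocent
```

So mail addressed to jasonb-spam@example.com ends up as dspam --source=error --class=spam --user jasonb, and jasonb-fp@example.com as the --class=innocent equivalent.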
To access the dspam Web interface, you must ensure that the dspam CGI will execute as the dspam group so it can read and write files in /var/spool/dspam. For Apache 1.3 on Ubuntu 6.06, that means renaming the disabled suexec wrapper into place.
# mv /usr/lib/apache/suexec.disabled /usr/lib/apache/suexec
# invoke-rc.d apache restart
dspam will use the valid-user from AuthType Basic against whatever authentication framework you configure. For a small number of users I am simply using AuthUserFile.
AuthType Basic
AuthUserFile /etc/apache/authz
AllowOverride None
AuthName "DSPAM Control Center"
Require valid-user
If dspam does segfault, which sadly has happened, your mail will bounce.
Client exited with error -5 R=dspam_router T=dspam_transport: Child process of dspam_transport transport returned 251 (could mean shell command ended by signal 123 (Unknown signal 123)) from command: /usr/bin/dspam
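The numbers in that log line look odd but are consistent with each other. A process exit status is a single byte, so a return value of -5 is seen by the parent as 251, and Exim's "signal 123" guess is just 251 minus 128. This reading is an inference from the arithmetic, not something confirmed from the dspam source.

```shell
# Exit statuses are reported modulo 256, so -5 shows up as 251.
status=$(( -5 & 255 ))
echo "$status"             # prints: 251

# Exim's "signal N" hint for statuses above 128 is status minus 128.
echo $(( status - 128 ))   # prints: 123
```

In other words, there is probably no real "signal 123" involved; the 251 is simply the client's -5 error wrapped into a byte.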
Since a crashed daemon means bounced mail, it is worth watching. Though beyond the scope of this discussion, if you use monit to monitor your services, it can monitor dspam as well.
check process dspamd with pidfile /var/run/dspam.pid
  group dspam
  start program = "/etc/init.d/dspam start"
  stop program = "/etc/init.d/dspam stop"
  if failed unixsocket /var/spool/dspam/dspam.sock then restart
  if 3 restarts within 5 cycles then timeout
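For a quick manual check without monit, the sketch below tests whether the daemon's socket exists at the path from ServerDomainSocketPath. The function name socket_present is hypothetical. This is weaker than the monit rule, which, as I understand it, tries to connect to the socket rather than merely looking for it.

```shell
# socket_present PATH
# Succeed if PATH exists and is a UNIX domain socket.
# It does not prove the daemon is answering on it.
socket_present() {
    [ -S "$1" ]
}

if socket_present /var/spool/dspam/dspam.sock; then
    echo "dspam socket present"
else
    echo "dspam socket missing"
fi
```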