Advanced Layouts

Clapf supports larger setups with multiple antispam nodes. Treat the following layouts as recipes from a cookbook: experiment with them, or create your own. Have your own layout? Be sure to share it with us!

#1: External SQL server, no quarantine

This is the simplest way to use multiple clapf nodes if you don’t need to maintain a quarantine: just point them all at an external SQL server, and make sure you assign each clapf node a unique server_id (0..255).
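Because a duplicated or out-of-range server_id is easy to miss when nodes are configured by hand, a small check can help before rollout. The following is a minimal sketch, not part of clapf: the function name, the node names, and the idea of keeping the assignments in a dictionary are all assumptions made for illustration.

```python
def check_server_ids(nodes):
    """Return a list of problems found in a {hostname: server_id} mapping."""
    problems = []
    seen = {}  # server_id -> first hostname that claimed it
    for host, server_id in sorted(nodes.items()):
        # server_id must be an integer in the documented range 0..255
        if not isinstance(server_id, int) or not 0 <= server_id <= 255:
            problems.append(f"{host}: server_id {server_id!r} is outside 0..255")
            continue
        # every node needs its own value
        if server_id in seen:
            problems.append(
                f"{host}: server_id {server_id} already used by {seen[server_id]}"
            )
        else:
            seen[server_id] = host
    return problems


# Hypothetical node names; replace with your own inventory.
nodes = {"clapf-a": 0, "clapf-b": 1, "clapf-c": 1, "clapf-d": 300}
for problem in check_server_ids(nodes):
    print(problem)
```

An empty result means every node has a distinct, valid server_id.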

#2: Master - slave SQL servers, no quarantine

In this scenario each clapf node has its own slave SQL server. The important change from the previous setup is that you can’t just write to the clapf database, since a slave SQL server should be used read-only. Thus we disable updating the token timestamp values (update_tokens=0). Note that the counters should be written to a memcached / redis server, and from there to the master, instead of directly to the local slave. The advantage of this setup is that the local slave SQL servers can be tuned heavily for read performance.
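The counter path described above (node, then cache, then master) can be pictured with a small sketch. This is only an illustration of the buffering idea, not how clapf implements it: the CounterBuffer class stands in for a memcached / redis server, a plain dict stands in for the master SQL server, and the counter names are hypothetical.

```python
from collections import Counter


class CounterBuffer:
    """Stand-in for a shared memcached / redis instance (hypothetical)."""

    def __init__(self):
        self._pending = Counter()

    def incr(self, name, amount=1):
        # Called by a node instead of updating its read-only local slave.
        self._pending[name] += amount

    def drain(self):
        # Hand over everything collected so far and start from zero again.
        pending, self._pending = self._pending, Counter()
        return dict(pending)


def flush_to_master(buffer, master_counters):
    """Apply the buffered increments to the master (here: a plain dict)."""
    for name, amount in buffer.drain().items():
        master_counters[name] = master_counters.get(name, 0) + amount
    return master_counters


buffer = CounterBuffer()
buffer.incr("rcvd")  # hypothetical counter names
buffer.incr("rcvd")
buffer.incr("spam")

master = {"rcvd": 100, "spam": 40}
flush_to_master(buffer, master)
print(master)
```

The point of the design is that the slaves never receive a write: increments accumulate in the cache, and only the periodic flush touches the master.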

Also note that you need dedicated spam@ and ham@ addresses for training, served by an extra clapf node which writes the training requests to the master SQL server.

We also need a workaround to write the history table on the master SQL server. So we set history=1, which makes the clapf nodes write the history information to short files. A cron entry for user clapf then processes these files and writes them to the master SQL server:

*/5 * * * * /usr/bin/php /usr/local/libexec/clapf/history-to-sql.php --host 192.168.100.100 --database clapf --username clapf --password verystrongsecret

#3: External SQL server with quarantine

This setup is slightly different from scenario #1. Now we need a quarantine, so we instruct clapf to keep the recognised spam emails in a quarantine rather than deliver them to users:

spam_overall_limit=0.92
spaminess_oblivion_limit=0.92
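To see why both limits are set to the same value, here is a rough sketch of the decision they imply. It rests on assumptions that are not stated in this section: that spam_overall_limit is the score at which a message counts as spam, that spaminess_oblivion_limit is the score at which clapf stops delivering it, and that both comparisons are "greater than or equal". The function and its return labels are hypothetical.

```python
def classify(spaminess, spam_overall_limit=0.92, spaminess_oblivion_limit=0.92):
    """Return what happens to a message with the given spaminess score."""
    if spaminess >= spaminess_oblivion_limit:
        # Not delivered to the user; the file stays on the node for the quarantine.
        return "quarantine"
    if spaminess >= spam_overall_limit:
        # Recognised as spam, but still delivered (tagged).
        return "deliver-tagged"
    return "deliver"


print(classify(0.95))              # both limits at 0.92
print(classify(0.50))
print(classify(0.95, 0.92, 1.01))  # oblivion limit out of reach
```

With equal limits the middle branch can never be reached, so under these assumptions every message recognised as spam ends up in the quarantine.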

The only problem is that we have several clapf nodes, so we have to find a way to move these spam emails (i.e. files) to the host running the GUI. I recommend using rsync:

*/5 * * * * /usr/bin/rsync -azr --remove-source-files /var/clapf/queue/ gui-node:/var/clapf/queue

#4: Master - slave SQL servers with quarantine

In this scenario we combine the quarantine feature with the master - slave SQL server layout. We need the following settings and cron jobs:

spam_overall_limit=0.92
spaminess_oblivion_limit=0.92

*/5 * * * * /usr/bin/rsync -azr --remove-source-files /var/clapf/queue/ gui-node:/var/clapf/queue
*/5 * * * * /usr/bin/php /usr/local/libexec/clapf/history-to-sql.php --host 192.168.100.100 --database clapf --username clapf --password verystrongsecret