Increase Bayesian Token DB max size?

Post by EKjellquist » Wed Sep 24, 2014 2:13 pm

Back in 2.60, the max size of the Bayesian DB was increased to 1500k entries, which was a good step, but I'm wondering if it would be possible to increase that again? I've been at the max token limit for some time, and as a result, newer spam has been getting through more easily. I wasn't sure whether the DB limit was more of an internal memory limitation, or perhaps an x86 limitation. If I could point it at a SQL Server Express DB to store tokens, then at least on my current hardware the DB could be far bigger without performance issues (or at least I'd like to test that theory).

Any chance of that token limit being able to increase?
EKjellquist
 
Posts: 89
Joined: Tue Sep 09, 2014 10:40 pm

Re: Increase Bayesian Token DB max size?

Post by Code Crafters » Fri Sep 26, 2014 10:36 am

We can increase the max limit. As you guessed, the limit exists mainly to cap RAM usage. It was also set to avoid slowing down email scanning when there are too many tokens, but RAM and processing power have increased enough that a larger token limit should now be feasible. Another option is to retrain the Bayesian filter using only newer emails, if you've saved them all, which we recommend you do. Note also that, by default, tokens not used for 60 days are deleted to free up space in the token database. Please see below for the help manual's comment on this setting and our recommendations.

Max Tokens - This sets the maximum number of tokens allowed in the database. It is recommended that you use a value between 250,000 and 500,000 to ensure optimum performance. Please note that for every 100,000 tokens, 10MB of system memory will be required.


This means that at your limit of 1.5 million tokens, RAM usage should be roughly 150MB. Modern servers can have 8GB, 16GB or even more, so this limitation may no longer be a problem. I'll try to raise the limit in the next update.
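
For anyone wanting to check the numbers for other limits, here is a rough sketch of the manual's rule of thumb (10MB per 100,000 tokens). The function and constant names are made up for illustration and are not part of the product; actual memory use may differ.

```python
# Rule of thumb quoted from the help manual: 10 MB of RAM per 100,000 tokens.
# Names below are hypothetical, for illustration only.
MB_PER_100K_TOKENS = 10


def estimated_token_db_mb(max_tokens: int) -> float:
    """Estimate Bayesian token DB memory use in MB for a given token limit."""
    return max_tokens / 100_000 * MB_PER_100K_TOKENS


if __name__ == "__main__":
    # The manual's recommended range, plus the current 1.5 million limit.
    for limit in (250_000, 500_000, 1_500_000):
        print(f"{limit:>9,} tokens -> ~{estimated_token_db_mb(limit):.0f} MB")
```

By the same estimate, even a limit several times larger than 1.5 million would stay well under 1GB, assuming the rule of thumb scales linearly.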
Code Crafters
 
Posts: 933
Joined: Mon Sep 10, 2007 2:35 pm

Re: Increase Bayesian Token DB max size?

Post by EKjellquist » Wed Oct 08, 2014 4:37 pm

Chris,

Having looked at this over long periods of time, the amount of spam we get has more or less followed the activity of the larger botnets and spammers (when they're active, and when they're caught and disbanded), but we get a pretty wide variety of stuff. I use every antispam tool the server offers (save greylisting), and it's effective at blocking a lot at the gate. Our Bayesian filter has been full for a while now, though, so that last line of defense has allowed more into inboxes. Not a big issue in the near term, but I figured this was something that could be updated...

Of course, we've also been running AMS since 2006, and our server h/w has certainly changed enough, as you point out, that memory isn't really a limitation anymore. It used to be an old spare HP desktop from 2002; now it's one of several Server 2012 R2 VMs on a PowerEdge R620.

Thinking of that, are there any plans to offer AMS / AFS as native 64-bit apps in the future? We haven't run into any performance issues relating to the application resource limits of x86, but I was curious...
EKjellquist
 
Posts: 89
Joined: Tue Sep 09, 2014 10:40 pm

Re: Increase Bayesian Token DB max size?

Post by Code Crafters » Sun Oct 12, 2014 7:20 am

We'll increase the max token limit. There's no need for a 64-bit version of the app yet, but we may offer one eventually.
Code Crafters
 
Posts: 933
Joined: Mon Sep 10, 2007 2:35 pm

