Bayesian/Tokens

Postby tiggi » Fri May 28, 2010 3:21 pm

Morning Rob/Chris,

I have some questions regarding the Bayesian filter.

Right now the settings for Bayesian are:
Score Required: 60%
Max Tokens: 1,500,000
Expire Unused Tokens after 30 Days

Current Status: 1,322,751 tokens - 180,677 SPAM and 5,896 non-SPAM
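
To put those numbers side by side, here is my own quick arithmetic (nothing official, just Python as a calculator):

max_tokens   = 1500000   # the Max Tokens setting
total_tokens = 1322751   # current token count
print(max_tokens - total_tokens)   # 177249 tokens of headroom left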

The filter had been working great until the last week or two, so I am trying to examine what's wrong and optimize it.
What happens to auto-learn when the tokens reach their max limit? Does auto-learn just stop?
I see 180k spam tokens and 5,896 non-spam; what are the other tokens?
Do the 1,322,751 tokens count toward the Max Tokens limit, or just the spam + non-spam ones?

What would be the optimal settings for the Bayesian filter?

Thanks for your help.
tiggi
 
Posts: 11
Joined: Mon Mar 17, 2008 10:37 pm

Re: Bayesian/Tokens

Postby Code Crafters » Sat May 29, 2010 6:40 pm

You can just raise this default limit to many millions to avoid limiting the number of tokens the Bayesian database can hold. It's the 1,322,751 tokens figure that is approaching this limit. The SPAM / non-SPAM numbers are the number of SPAM / non-SPAM mails that have been learned from. Chances are that if SPAM mails are getting through now, with so many learned mails, there is simply a new strain of SPAM template coming in, and after a week or two of auto-learning from users' mails these should be automatically learned and filtered properly. We have an extremely well trained Bayesian filter here too, but it only takes a few mails to train past a new strain and stop it coming through.
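
For anyone wondering what a "token" actually is, here is a rough, generic naive-Bayes sketch in Python (purely illustrative and simplified, not the actual server code) showing how tokens accumulate from learned mails and how a message's score is compared against the Score Required setting:

# Illustrative only: a generic naive-Bayes token model, not the product's code.
from collections import defaultdict
import math
import re

spam_mails = 0                    # mails learned as SPAM (your 180,677)
ham_mails = 0                     # mails learned as non-SPAM (your 5,896)
spam_tokens = defaultdict(int)    # token -> times seen in SPAM mails
ham_tokens = defaultdict(int)     # token -> times seen in non-SPAM mails

def tokenize(text):
    return re.findall(r"[a-z0-9$'-]+", text.lower())

def learn(text, is_spam):
    global spam_mails, ham_mails
    counts = spam_tokens if is_spam else ham_tokens
    if is_spam:
        spam_mails += 1
    else:
        ham_mails += 1
    for token in tokenize(text):
        counts[token] += 1

def spam_probability(text):
    # Combine per-token spamminess in log-odds form; unseen tokens are skipped.
    log_odds = 0.0
    used = 0
    for token in set(tokenize(text)):
        ps = spam_tokens.get(token, 0) / max(spam_mails, 1)   # rate in SPAM mails
        ph = ham_tokens.get(token, 0) / max(ham_mails, 1)     # rate in non-SPAM mails
        if ps + ph == 0:
            continue
        p = min(max(ps / (ps + ph), 0.01), 0.99)
        log_odds += math.log(p / (1 - p))
        used += 1
    return 0.5 if used == 0 else 1 / (1 + math.exp(-log_odds))

# A mail is treated as SPAM when the score reaches the Score Required setting:
# is_spam = spam_probability(mail_text) >= 0.60

Note that in a sketch like this the token count is the number of distinct words stored across both dictionaries, which is why it grows far faster than the counts of learned mails.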
Code Crafters
 
Posts: 933
Joined: Mon Sep 10, 2007 2:35 pm

Re: Bayesian/Tokens

Postby tiggi » Wed Jun 30, 2010 5:43 pm

Chris,

chris wrote: You can just raise this default limit to many millions to avoid limiting the number of tokens the Bayesian database can hold. It's the 1,322,751 tokens figure that is approaching this limit.


I've tried raising the token database limit (Max Tokens) to over 1,500,000, but I am getting a popup message saying "The bayesian max token limit must be between 5,000 and 1,500,000".
How can I raise it to many millions as you suggested? Also, what happens when the max token limit is reached? Does Bayesian stop learning?
tiggi
 
Posts: 11
Joined: Mon Mar 17, 2008 10:37 pm

Re: Bayesian/Tokens

Postby Code Crafters » Thu Jul 01, 2010 8:15 am

In the current version, 1,500,000 tokens is the limit because they all have to be loaded into memory and some systems won't cope with more. We will probably raise this limit in a future update. For now, you should simply shorten the lifetime of your tokens so they expire after maybe 30 days instead of the default 60 days. Older tokens become less relevant with time, and if you have this many tokens then keeping only the more recent ones will improve the quality of your token database.
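
As a rough illustration of the expiry idea (again just a generic Python sketch, not the actual implementation), each token can carry a last-used timestamp and be pruned once it has gone unused for longer than the configured lifetime:

# Illustrative only: pruning tokens that have not been used recently.
import time

EXPIRE_AFTER_DAYS = 30          # the "Expire Unused Tokens" setting
token_last_used = {}            # token -> epoch seconds when last learned/used

def touch(token):
    # Call whenever a token is learned or used while scoring a mail.
    token_last_used[token] = time.time()

def expire_unused_tokens(spam_tokens, ham_tokens):
    # Drop every token that has not been touched within the lifetime.
    cutoff = time.time() - EXPIRE_AFTER_DAYS * 24 * 60 * 60
    for token, last_used in list(token_last_used.items()):
        if last_used < cutoff:
            spam_tokens.pop(token, None)
            ham_tokens.pop(token, None)
            del token_last_used[token]

A shorter lifetime keeps the total token count further below the 1,500,000 ceiling, so there is room for auto-learning to keep adding tokens from new SPAM strains.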
Code Crafters
 
Posts: 933
Joined: Mon Sep 10, 2007 2:35 pm

