Question about learning the bayesian filter

by **waterman34** » Wed Oct 24, 2007 5:22 pm

Hi guys

Once an email has passed through our RBL filters and been found to be spam, our spam identifier content filter edits the subject to start with <SPAM> and gleefully sends it on its merry way to its destination for the client to delete, filter, or do whatever they see fit.

My question is this: if all these spam emails are filtered into a junk mail folder with this <SPAM> in the subject, can they still be used for training the Bayesian filter? Or will the Bayesian filter start learning that, to be spam, an email must have <SPAM> in the subject?

Sorry if this sounds a bit dull, but I'm unsure how the Bayesian filter would learn from these.

by **rob** » Fri Oct 26, 2007 10:27 am

The <SPAM> tag will affect the Bayesian filter, as it learns from every part of the email. However, it takes into account every single word within the mail, so the effect probably wouldn't be noticeable. We do exactly the same on our system and it has had no negative effect.
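To make that concrete, here is a toy sketch in Python. It is not the code of any particular mail filter: it is a minimal naive Bayes scorer with Laplace smoothing, and the function names, the whitespace tokenizer, and the sample messages are all made up for illustration. It also assumes the scorer only looks at tokens that are actually present in the message being scored, which a real product may or may not do.

```python
import math
from collections import Counter

def train(messages):
    """Count, per token, how many messages contain it (hypothetical helper)."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-token evidence in log-odds space (toy naive Bayes)."""
    log_odds = math.log(n_spam / n_ham)  # prior from the training set sizes
    for token in sorted(set(text.lower().split())):
        # Laplace smoothing so unseen tokens never give a zero probability.
        p_token_spam = (spam_counts[token] + 1) / (n_spam + 2)
        p_token_ham = (ham_counts[token] + 1) / (n_ham + 2)
        log_odds += math.log(p_token_spam / p_token_ham)
    return 1 / (1 + math.exp(-log_odds))

# Made-up training data.
spam_bodies = ["cheap pills buy now", "win money now click here", "cheap loans click now"]
ham_bodies = ["meeting notes attached for monday", "lunch on monday at noon",
              "please review the attached report"]
tagged_spam = ["<SPAM> " + body for body in spam_bodies]

ham_counts = train(ham_bodies)
plain_model = train(spam_bodies)    # trained on untagged spam
tagged_model = train(tagged_spam)   # trained on spam carrying the <SPAM> tag

def score(text, spam_counts):
    return spam_probability(text, spam_counts, ham_counts,
                            len(spam_bodies), len(ham_bodies))

new_spam = "buy cheap pills now"    # arrives without the tag
new_ham = "monday meeting report"

print(score(new_spam, plain_model), score(new_spam, tagged_model))
print(score(new_ham, plain_model), score(new_ham, tagged_model))
print(score("<SPAM> " + new_ham, plain_model), score("<SPAM> " + new_ham, tagged_model))
```

In this sketch, training on tagged spam leaves the scores of untagged messages unchanged, because the tag is just one more token and it only contributes when it appears in the message being scored. The one case that shifts is a message that already carries <SPAM> in the subject, which scores higher under the model trained on tagged mail.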
