Question about learning the bayesian filter

Postby waterman34 » Wed Oct 24, 2007 5:22 pm

Hi guys

Once an email has passed through our RBL filters and been found to be spam, our spam identifier content filter edits the subject to start with <SPAM> and gleefully sends it on its merry way to its destination, for the client to delete, filter, or do whatever they see fit.

My question is this: if all these spam emails are filtered into a junk mail folder with <SPAM> in the subject, can they still be used for training the Bayesian filter? Or will the Bayesian filter start learning that, to be spam, an email must have <SPAM> in the subject?

Sorry if this sounds a bit dull, but I'm unsure how the Bayesian filter would learn from these.
waterman34
 
Posts: 57
Joined: Thu Sep 27, 2007 11:33 am

Re: Question about learning the bayesian filter

Postby rob » Fri Oct 26, 2007 10:27 am

The <SPAM> tag will affect the Bayesian filter, as it learns from every part of the email. However, it takes into account every single word within the mail, so the effect probably wouldn't be noticeable. We do exactly the same on our system and it has had no negative effect.
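
To make that a bit more concrete, here is a rough sketch of the idea in Python. It is not how any particular product's Bayesian filter is implemented; the TinyBayes class, the whitespace tokenizer, the smoothing, and the sample messages are all made up for illustration. It only shows that a naive Bayes style filter does pick up <SPAM> as a spammy token, but the other words in the mail still carry the decision when the tag is absent.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class TinyBayes:
    """Toy naive Bayes filter (illustrative only, not a real product's algorithm)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        # Count each distinct token once per message.
        self.docs[label] += 1
        self.counts[label].update(set(tokenize(text)))

    def token_spam_prob(self, token):
        # Smoothed share of spam/ham messages containing the token.
        p_spam = (self.counts["spam"][token] + 1) / (self.docs["spam"] + 2)
        p_ham = (self.counts["ham"][token] + 1) / (self.docs["ham"] + 2)
        return p_spam / (p_spam + p_ham)

    def spam_score(self, text):
        # Combine every token's evidence as summed log-odds.
        log_odds = 0.0
        for token in set(tokenize(text)):
            p = self.token_spam_prob(token)
            log_odds += math.log(p) - math.log(1.0 - p)
        return 1.0 / (1.0 + math.exp(-log_odds))


nb = TinyBayes()
# Spam is learned from the junk folder, so every subject carries the tag.
for msg in ["<SPAM> cheap pills buy now",
            "<SPAM> win cash prize now",
            "<SPAM> cheap loans buy today"]:
    nb.learn(msg, "spam")
for msg in ["meeting agenda for monday",
            "lunch on friday at noon",
            "project report attached for review"]:
    nb.learn(msg, "ham")

tag_prob = nb.token_spam_prob("<spam>")
untagged = nb.spam_score("cheap pills buy now")
tagged = nb.spam_score("<SPAM> cheap pills buy now")
ham = nb.spam_score("meeting agenda for monday")
print(tag_prob, untagged, tagged, ham)
```

In this toy setup the tag nudges the score of a tagged message upwards, but an untagged spam message still scores well above 0.5 because of its other words. With a real corpus of thousands of tokens the tag is one piece of evidence among many.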
rob
 
Posts: 415
Joined: Mon Sep 10, 2007 2:34 pm

