6 Sep 2007 15:23
Idea for improving the learning stage
Andrew <aremo <at> ngi.it>
2007-09-06 13:23:37 GMT
Hello,

I would like to submit an idea which I think would improve the accuracy and the learning stage of any statistical spam filter.

The concept: learn where the "giveaway" is by watching user behaviour.

It basically comes down to having the filter take note of this: did the user need to open the email before flagging it as spam?

If the answer is "no", then concentrate your stats on the subject line and ignore the body (which might be full of random words used by the spammer to pollute the filter's database).

If the answer is "yes", the reverse applies: ignore the subject, which must have looked "legitimate" to the user, and concentrate on the body, which is what clued the user in about the email being spam.

By analyzing only the subject OR the body, you analyze only what actually looks like spam, thus ignoring the parts of the email that are there to deceive.

What do you think?

Regards,
Andrew

_______________________________________________
Bogofilter-dev mailing list
Bogofilter-dev <at> bogofilter.org
http://www.bogofilter.org/mailman/listinfo/bogofilter-dev
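
To make the proposal concrete, here is a minimal sketch of the selection rule in Python. It is not bogofilter code and does not use any bogofilter API. The function names, the opened_before_flagging flag and the simple tokenizer are all hypothetical, and it assumes the mail client can report whether the user opened the message before flagging it.

```python
import re
from collections import Counter


def tokens_for_training(subject, body, opened_before_flagging):
    """Return tokens from the part of the message assumed to be the giveaway.

    Hypothetical rule from the proposal above:
      - flagged without opening -> the subject gave it away, ignore the body
      - flagged after opening   -> the body gave it away, ignore the subject
    """
    text = body if opened_before_flagging else subject
    # Deliberately simple tokenizer for illustration only.
    return re.findall(r"[a-z0-9$'-]+", text.lower())


def train_spam(spam_counts, subject, body, opened_before_flagging):
    """Add the selected tokens to a spam token counter and return it."""
    spam_counts.update(tokens_for_training(subject, body, opened_before_flagging))
    return spam_counts


if __name__ == "__main__":
    counts = Counter()
    # Flagged straight from the message list: only the subject is learned.
    train_spam(counts, "Cheap pills now", "lorem ipsum filler words", False)
    # Flagged after reading: only the body is learned.
    train_spam(counts, "Re: meeting", "Buy cheap pills today", True)
    print(counts.most_common(3))
```

A real filter would still need the same signal for ham, and a fallback for clients that cannot tell whether a message was opened. Neither is covered by this sketch.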