[sanesecurity] Re: Bayesian approach to Clamav signatures

  • From: Steve Basford <steveb_clamav@xxxxxxxxxxxxxxxx>
  • To: sanesecurity@xxxxxxxxxxxxx
  • Date: Thu, 25 Nov 2010 22:07:36 +0000



PiK wrote:
Some time ago I realized that ClamAV logical signatures make it possible to build heuristic filters, something close to the Bayesian method. They are not strictly Bayesian, but they rest on the same idea of building phrase lists: one list of phrases found mainly in spam and another (a stoplist) of phrases often found in ham. ClamAV logical signatures do not allow a real-valued weight to be assigned to each phrase. Instead, they let you specify the minimum or maximum number of times a phrase must appear in the text.

Nice work there! I did a quick scan of a few messages I have and it certainly picked up a few more 419s.
I'll hopefully have a deeper look over the next few days.
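
For anyone following along, here is a rough sketch of the phrase-list idea described above. It is not PiK's generator: the function name, signature name, phrases and threshold are all made up for illustration, and the output has not been checked against clamscan. It assumes the documented .ldb layout (name; target block; logical expression; hex subsignatures), with Target:4 meaning mail files, a "more than N matches" count on the spam block, and "=0" on the stoplist block.

```python
def build_ldb_line(name, spam_phrases, ham_phrases, min_spam_hits):
    """Build one ClamAV logical-signature (.ldb) line from two phrase lists.

    Hypothetical helper for illustration only; the names and thresholds
    here are assumptions, not taken from the attached generator.
    """
    # Subsignatures are hex-encoded byte strings; spam phrases come first,
    # so their indices are 0..len(spam_phrases)-1.
    phrases = list(spam_phrases) + list(ham_phrases)
    subsigs = [p.encode("ascii").hex() for p in phrases]

    spam_idx = "|".join(str(i) for i in range(len(spam_phrases)))
    # "(block)>N" means the block must match more than N times,
    # so "at least min_spam_hits" becomes "> min_spam_hits - 1".
    expr = f"(({spam_idx})>{min_spam_hits - 1})"

    if ham_phrases:
        start = len(spam_phrases)
        ham_idx = "|".join(str(i) for i in range(start, len(phrases)))
        # "(block)=0" means none of the stoplist phrases may appear.
        expr += f"&(({ham_idx})=0)"

    # Target:4 restricts the signature to mail files.
    return ";".join([name, "Target:4", expr] + subsigs)


if __name__ == "__main__":
    print(build_ldb_line(
        "Example.Spam.Heur",              # hypothetical signature name
        ["lottery", "bank transfer"],     # phrases seen mainly in spam
        ["unsubscribe"],                  # stoplist: phrases common in ham
        min_spam_hits=2,
    ))
```

The count threshold is what stands in for per-phrase weights here: a phrase cannot be made to count for more, but the signature can demand more hits before it fires.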

I attached two files: the human-readable input for the generator and the signature (.ldb) file. I am not sure whether they will get through: if freelist.org strips attachments, I will supply links to the files.

Came through here OK.

Cheers,

Steve
Sanesecurity
