[antispam-f] Re: Recent Id: and Ref: Spams

From: Frank de Bruijn <antispam@xxxxxxxxxx>
To: antispam@xxxxxxxxxxxxx
Date: Wed, 24 Jan 2007 21:56:04 +0100

In article <4ea9e74bc9steve.pampling@xxxxxxxxxxxxx>,
   Steven Pampling <steve.pampling@xxxxxxxxxxxxx> wrote:

 [snip]

> I have a list of headered items collected over the last couple of months
> (400+) mixed with some genuine mail for reference - the spam content all
> seems to be matched by RBLcheck[1] looking at zen.spamhaus.org.[2] 

> The (very) rough and ready BASIC test code I have to simply send the same
> query set also produces spam identification responses from the RBL server
> zone. The blacklist response times seem to be about 0.02 seconds (faster
> than a blink?) on a broadband line also being used to surf with a Windross
> machine.

> One of the features of spammers is that although they can fiddle with
> various elements of the mail message and To: / From: headers the Received:
> headers provide a trace back and it's the IP in there I'm ripping out...
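
For illustration, pulling the addresses out of the Received: headers might look roughly like the Python sketch below. It is not Steven's BASIC code or anything from AntiSpam; the function name, the regular expression and the sample message are all assumptions, and it only looks at IPv4 addresses written in square brackets.

```python
import re
from email import message_from_string

# Dotted-quad IPv4 address in square brackets, e.g. [192.0.2.10]
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def received_ips(raw_message):
    """Return bracketed IPv4 addresses from Received: headers, newest hop first."""
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        for candidate in BRACKETED_IPV4.findall(header):
            # Discard matches with out-of-range octets such as 999.1.1.1
            if all(int(octet) <= 255 for octet in candidate.split(".")):
                ips.append(candidate)
    return ips

# Hypothetical message using documentation-only addresses
SAMPLE = (
    "Received: from mail.example.net (mail.example.net [192.0.2.10])\n"
    "\tby mx.example.org with ESMTP; Wed, 24 Jan 2007 21:00:00 +0100\n"
    "Received: from unknown (HELO foo) ([198.51.100.7])\n"
    "\tby mail.example.net with SMTP; Wed, 24 Jan 2007 20:59:58 +0100\n"
    "From: someone@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

print(received_ips(SAMPLE))
```

Each relay adds its Received: line at the top, so the list runs from the most recent hop down to the earliest one.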

> I'm e-mailing Frank with the early code to see if he can use any portion in
> 1.60+

Oh dear. I've already had the necessary code [1] since version 1.57 (see
the StrongHelp page 'New in 1.57' for the reason why it hasn't been put
to use yet). I'm currently running extensive tests on a module I've
written to handle block list queries.

> I don't think it would be much effort to extend the code to (optionally)
> use multiple query zones from various blacklist providers.

Yes, it does that as well. And figuring out which of those providers to
use has been taking up far too much of my time lately... :-( [2]

Using more than one does cause noticeable 'hiccups', by the way, but (in
my opinion) not too badly.
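
As a rough sketch of what querying several zones involves (this is not the module's code): a block list lookup reverses the octets of the address, appends the zone name and asks for an A record. The zone list, the function names and the injectable `resolve` parameter below are assumptions made for the example.

```python
import socket

# Example zones only; which providers to use is exactly the open question
ZONES = ["zen.spamhaus.org", "bl.spamcop.net"]

def query_name(ip, zone):
    """Reverse the IPv4 octets and append the zone.

    192.0.2.10 with zen.spamhaus.org gives 10.2.0.192.zen.spamhaus.org
    """
    return ".".join(reversed(ip.split("."))) + "." + zone

def lookup(ip, zones=ZONES, resolve=socket.gethostbyname_ex):
    """Return {zone: [return codes]} for every zone that lists the address."""
    hits = {}
    for zone in zones:
        try:
            _, _, addresses = resolve(query_name(ip, zone))
        except OSError:
            # Name not found means "not listed"; this sketch also treats
            # any other resolver failure the same way.
            continue
        hits[zone] = addresses
    return hits
```

The zones are queried one after another here, so the delays add up, which is one plausible source of the 'hiccups' when more than one provider is used.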

> [1] Surprisingly easy to port to RISC OS.[3]

I looked at John Tytgat's port in 2005 but decided I needed something
that would cache responses and could be interfaced with AntiSpam more
easily. So I started writing a module when I discovered the RISC OS
resolver didn't want to play. Then I got sidetracked several times over
and didn't get back to working on it until last December. It's almost
ready for use now.
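
To show what caching responses could mean in practice, here is a minimal Python sketch. It is not the module itself; the class name, the fixed one-hour lifetime and the `fetch` callback are assumptions made for the example.

```python
import time

class ResponseCache:
    """Remember block list answers for a fixed time to avoid repeat queries."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # query name -> (expiry time, result)

    def get(self, name, fetch):
        """Return the cached result for name, calling fetch(name) on a miss."""
        now = self.clock()
        cached = self.entries.get(name)
        if cached is not None and cached[0] > now:
            return cached[1]
        result = fetch(name)
        # "Not listed" results (an empty list) are cached as well
        self.entries[name] = (now + self.ttl, result)
        return result
```

A real implementation would more likely honour the TTL of each DNS answer than use one fixed lifetime.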

> [2] Spamhaus maintain multiple query zones but zen is a combination of all
> of them.

And needs to be 'handled with care' if you don't want to get false
positives when using 'deep parsing', which is what AntiSpam would do by
default. There are several ways to handle this (ignoring 127.0.0.10 and
127.0.0.11 codes is one, not checking the earliest trace fields another)
but I haven't decided which one to use yet.
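
Both options can be sketched in a few lines of Python. The names are hypothetical and nothing here is AntiSpam code. The 127.0.0.10 and 127.0.0.11 codes mark policy listings (address ranges that should not be sending mail directly), so a legitimate sender's own dial-up or broadband address in the earliest Received: field can trigger them.

```python
# Return codes to disregard when parsing all trace fields (from the text above)
IGNORED_CODES = {"127.0.0.10", "127.0.0.11"}

def significant_codes(codes, ignored=IGNORED_CODES):
    """Option 1: drop the return codes that should not count as a hit."""
    return [code for code in codes if code not in ignored]

def hops_to_check(received_ips, skip_earliest=1):
    """Option 2: leave out the earliest trace fields.

    received_ips is ordered newest hop first, so the earliest hops
    are at the end of the list.
    """
    if skip_earliest <= 0:
        return list(received_ips)
    return list(received_ips[:-skip_earliest])
```

With the first option an address that only returns 127.0.0.11 is treated as not listed; with the second the sender's own address is never queried at all.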

Regards,
Frank

[1] Most of it not present in the release versions and a bit outdated
    because of later developments, so your code is still very welcome.

[2] I haven't made up my mind about it yet either. Spamhaus will
    definitely be in the list and probably SpamCop as well. I'm also
    considering AHBL and maybe DSBL and WPBL.
    There are about 150 of them, some of which are 'out' for various
    reasons (they must be free and testable for starters and not too
    slow to query). Serious suggestions (i.e. done after serious
    investigation of the provider's specs) are most welcome.

Follow-Ups:
- [antispam-f] Re: Recent Id: and Ref: Spams
  - From: Steven Pampling

References:
- [antispam-f] Recent Id: and Ref: Spams
  - From: Martin
- [antispam-f] Re: Recent Id: and Ref: Spams
  - From: Steven Pampling
