[antispam-f] Re: Recent Id: and Ref: Spams

From: Steven Pampling <steve.pampling@xxxxxxxxxxxxx>
To: antispam@xxxxxxxxxxxxx
Date: Thu, 25 Jan 2007 08:04:02 +0000 (GMT)

On 24 Jan, Frank de Bruijn <antispam@xxxxxxxxxx> wrote:
> In article <4ea9e74bc9steve.pampling@xxxxxxxxxxxxx>, Steven Pampling
>    <steve.pampling@xxxxxxxxxxxxx> wrote:

>  [snip]

> > I have a list of headered items collected over the last couple of
> > months (400+) mixed with some genuine mail for reference - the spam
> > content all seems to be matched by RBLcheck[1] looking at
> > zen.spamhaus.org.[2] 

> > The (very) rough and ready BASIC test code I have to simply send the
> > same query set also produces spam identification responses from the
> > RBL server zone. The blacklist response times seem to be about 0.02
> > seconds (faster than a blink?) on a broadband line also being used to
> > surf with a Windross machine.

> > One of the features of spammers is that although they can fiddle with
> > various elements of the mail message and To: / From: headers the
> > Received: headers provide a trace back and it's the IP in there I'm
> > ripping out...

> > I'm e-mailing Frank with the early code to see if he can use any
> > portion in 1.60+

> Oh dear. 

I noted the REM bit about a module, but did this stuff anyway as a mental
exercise.

> I've already had the necessary code [1] since version 1.57 (see
> the StrongHelp page 'New in 1.57' for the reason why it hasn't been put
> to use yet). I'm currently running extensive tests on a module I've
> written to handle block list queries.

I had noticed the code, but I hadn't noticed the info in the 1.57 help
page. (I forget what was going on when 1.57 came about, but I do recall
being late taking it up, and you were several incarnations along when I
did.)

BTW, the problem with the Resolver module is that it just throws away the
zen.spamhaus.org portion of a query like 80.123.246.12.zen.spamhaus.org and
sends a query of 80.123.246.12.in-addr.arpa instead.
Someone in the developer team did a bit of naff coding.
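
For reference, a minimal sketch (in Python, not the BASIC test code or
anything from AntiSpam) of how a block list query name like the one above
is normally built: the octets of the address are reversed and the zone is
appended. The function name dnsbl_query_name is made up for illustration.

```python
import ipaddress


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to look up for an IPv4 address in a block list zone.

    Hypothetical helper: reverse the octets, then append the zone.
    """
    # IPv4Address() rejects anything that is not a valid dotted quad.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")


print(dnsbl_query_name("12.246.123.80"))
# -> 80.123.246.12.zen.spamhaus.org
```

The point is that the zone suffix is part of the name being resolved, so a
resolver that swaps it for in-addr.arpa ends up asking a different question.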

> > I don't think it would be much effort to extend the code to
> > (optionally) use multiple query zones from various blacklist providers.

> Yes, it does that as well. And figuring out which of those providers to
> use has been taking up far too much of my time lately... :-( [2]

> Using more than one does cause noticeable 'hiccups', by the way, but (in
> my opinion) not too badly.

Normal response times from single-source queries tend to be around 0.02
seconds, but with multiple-source queries the typical response on the
*Windows* rblcheck is the same 0.02 with gaps between responses of up to 2
seconds, so a "hiccup" on your module wouldn't be surprising.

> > [1] Surprisingly easy to port to RISC OS.[3]

> I looked at John Tytgat's port in 2005 but decided I needed something
> that would cache responses and could be interfaced with AntiSpam more
> easily. So I started writing a module when I discovered the RISC OS
> resolver didn't want to play. Then I got sidetracked several times over
> and didn't get back to working on it until last December. It's almost
> ready for use now.

Ah, yes, modules. Must learn how to produce those. I have this grammar
parser as a command line reliant on UnixLib...

> > [2] Spamhaus maintain multiple query zones but zen is a combination of
> > all of them.

> And needs to be 'handled with care' if you don't want to get false
> positives when using 'deep parsing', which is what AntiSpam would do by
> default. There are several ways to handle this (ignoring 127.0.0.10 and
> 127.0.0.11 codes is one, not checking the earliest trace fields another)
> but I haven't decided which one to use yet.
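
A minimal sketch of the first option mentioned above (ignoring the
127.0.0.10 and 127.0.0.11 return codes), written in Python rather than
taken from AntiSpam. The names IGNORED_CODES and counts_as_listed are
hypothetical, and treating every other 127.x.x.x answer as a hit is an
assumption.

```python
import ipaddress

# The two return codes named in the message above.
IGNORED_CODES = {"127.0.0.10", "127.0.0.11"}
LOOPBACK_NET = ipaddress.IPv4Network("127.0.0.0/8")


def counts_as_listed(answers, ignored=IGNORED_CODES):
    """Return True if any A-record answer is a return code we act on.

    answers is the list of addresses the block list zone returned for
    one query; an empty list means the address was not listed at all.
    """
    for answer in answers:
        addr = ipaddress.IPv4Address(answer)
        if addr not in LOOPBACK_NET:
            continue  # not a block list return code
        if str(addr) in ignored:
            continue  # deliberately skipped code
        return True
    return False


print(counts_as_listed(["127.0.0.10"]))               # -> False
print(counts_as_listed(["127.0.0.11", "127.0.0.4"]))  # -> True
```

In practice the answers could come from socket.gethostbyname_ex(name)[2],
which raises socket.gaierror when the name does not resolve.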

My test routine (not part of AntiSpam) simply took the complete text of a
set of e-mails exported from Pluto, extracted the IP from *all*
Received: lines, sent a query for each and moved on to the next line with
an IP.
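
A rough Python sketch of that kind of extraction, not the actual BASIC
routine. The function name received_ips and the sample message are made
up, and skipping private and loopback addresses is an added assumption
(the routine described above looked at every IP it found).

```python
import ipaddress
import re

# A dotted quad that is not part of a longer run of digits and dots.
IPV4 = re.compile(r"(?<![\d.])(\d{1,3}(?:\.\d{1,3}){3})(?![\d.])")


def received_ips(message_text):
    """Collect IPv4 addresses from all Received: headers, in order."""
    text = message_text.replace("\r\n", "\n")
    headers = text.split("\n\n", 1)[0]             # headers end at first blank line
    unfolded = re.sub(r"\n[ \t]+", " ", headers)   # join continuation lines
    found = []
    for line in unfolded.splitlines():
        if not line.lower().startswith("received:"):
            continue
        for candidate in IPV4.findall(line):
            try:
                ip = ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # looked like an IP but is not one
            if ip.is_private or ip.is_loopback:
                continue  # assumption: not worth a block list query
            if candidate not in found:
                found.append(candidate)
    return found


sample = (
    "Received: from a.example (a.example [12.246.123.80])\n"
    "\tby mx.example.org; Thu, 25 Jan 2007 08:04:02 +0000\n"
    "Received: from laptop (unknown [10.0.0.5])\n"
    "\tby a.example; Thu, 25 Jan 2007 08:03:59 +0000\n"
    "Subject: test\n"
    "\n"
    "Body mentioning 1.2.3.4 is ignored.\n"
)
print(received_ips(sample))  # -> ['12.246.123.80']
```

Each address returned would then be turned into a query against the chosen
block list zone.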

43 queries sent, 86 DNS packets in the exchange all within 0.5 seconds.
No false positives (like from my own ISP markers) and a few spam items not
recognised.
I haven't seen any evidence of false positives - but then the only mail I
was looking at was either spam that had gone through spammy addresses or
relatively local transfer. Might need some genuine correspondence from
someone in China to check :-)

-- 

Steve Pampling

Follow-Ups:
- [antispam-f] Re: Recent Id: and Ref: Spams
  - From: Frank de Bruijn

References:
- [antispam-f] Recent Id: and Ref: Spams
  - From: Martin
- [antispam-f] Re: Recent Id: and Ref: Spams
  - From: Steven Pampling
- [antispam-f] Re: Recent Id: and Ref: Spams
  - From: Frank de Bruijn
