[hashcash] Re: OmniMix with recipient related Hashcash

  • From: "Eric S. Johansson" <esj@xxxxxxxxxx>
  • To: hashcash@xxxxxxxxxxxxx
  • Date: Tue, 05 Dec 2006 10:37:25 -0500

Christian Danner wrote:


I only had to add a few lines of code, nothing spectacular to be
impressed about.

Actually, yes there is. You did something constructive with this technique, and that's worth a lot. The next stage will be PR. When I finish up twopenny blue, I'm planning on writing up a press release and sending it out via one of the free or cheap channels. I'm also going to include references to all the other folks that are compatible to make it look like a huge movement. :-)

Above all OmniMix is a privacy tool. So if you only run an IMAP client
and don't rule over the server your privacy is irretrievably gone.
OTOH if you run both it may be possible to employ OmniMix at the
outbound side and control it with header directives, one of which now
is a Hashcash activation switch ('O-HashcashRecipient-Active: Yes').
But this would imply all the problems that come with server side
minting.

I assume this isn't a problem with POP3? It seems to me that if you leave the traffic encrypted and only decrypt on read/display, IMAP shouldn't be a problem even if the server is not under your control. Interesting. I need to look at your tool more closely.

It wasn't my intention to bore you or revitalize ancient discussions.
I'm aware of the strategies you developed since then to optimize spam
filtering and appreciate the work you've done.

I didn't mean to imply that you are boring us. Apologies if it did come across that way. As a friend pointed out to me, I accidentally solved a big chunk of the mailing list problem with the hybrid system I had developed. The first message to a new user gets a stamp. Then the load goes back to normal. Jonathan pointed out that if we implemented the opportunistic signature technique, then mailing lists could sign their messages, which would make it harder to spoof users who rely on a by-address white list.

Yes. I know. This is not part of the simple Hashcash implementation. :-)


The main problem I see with this 'hybrid' approach, which adds the
element of some 'entry key information', is that it requires a large
amount of advance effort:

- for every addressee there has to be a depository on the net to
provide the sender with the required information. The necessary stamp
size isn't static but has to be adjusted in reaction to the behavior
of the spammers.

Quite true. The feedback mechanism is not simple. You need to use the source IP address of the message for the postage test, while the query for postage uses the "from" address. If the two match up, then the postage is successful. If they don't match up (i.e. a spammer is spoofing), then the postage fails and that adds a negative score.

In reality, all we're talking about is a simple DBM indexed by IP address with a postage field. This creates a blacklist with more nuances than a simple binary test.
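
As an illustration of the DBM idea, here is a minimal Python sketch. The function names, the 20-bit baseline, and the storage of the postage as a decimal string are all assumptions made for the example, not a description of any existing implementation.

```python
import dbm

DEFAULT_BITS = 20  # assumed baseline for senders we know nothing about


def set_postage(db_path, source_ip, bits):
    """Record how many stamp bits mail from this IP must carry."""
    with dbm.open(db_path, "c") as db:
        db[source_ip.encode()] = str(bits).encode()


def required_bits(db_path, source_ip):
    """Look up the postage for an IP, falling back to the baseline."""
    key = source_ip.encode()
    with dbm.open(db_path, "c") as db:
        if key in db:
            return int(db[key].decode())
    return DEFAULT_BITS


def postage_ok(db_path, source_ip, stamp_bits):
    """True if the stamp on the message meets the postage for its source IP."""
    return stamp_bits >= required_bits(db_path, source_ip)
```

Raising the stored value for a misbehaving IP is what makes this more nuanced than a binary blacklist: the sender is not blocked outright, only charged more.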

- the client software has to be designed to perform network lookups
and create Hashcash before it sends messages.

Yes. No doubt about it. But with a library encapsulating whatever complexity there is, this shouldn't be a problem. I don't see it as having more than one method, which would be a "who I am, who I'm looking for" query yielding "this many bits, every time".
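
A rough Python sketch of what such a one-method library could look like follows. Everything here is hypothetical: the `Postage` type, the `query_postage` name, the pluggable `lookup` callable, and the 20-bit default are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Postage:
    bits: int  # stamp size the recipient asks for
    mode: str  # e.g. "every time" or "first time"


# Hypothetical backend signature: (sender, recipient) -> Postage or None.
Lookup = Callable[[str, str], Optional[Postage]]


def query_postage(sender: str, recipient: str, lookup: Lookup,
                  default: Postage = Postage(20, "every time")) -> Postage:
    """The single "who I am, who I'm looking for" query."""
    answer = lookup(sender, recipient)
    return answer if answer is not None else default


def table_lookup(table: dict) -> Lookup:
    """Example backend: a static table keyed by recipient domain."""
    def lookup(sender: str, recipient: str) -> Optional[Postage]:
        return table.get(recipient.rsplit("@", 1)[-1])
    return lookup
```

A mail client would call `query_postage` once before minting and would not need to know whether the answer came from a table, a cache, or the network.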

- The spam filter has to be extended to look up on the net or in a
built-in cache to get the required recipient specific parameters
before it evaluates the Hashcash token and decides on filtering.
Moreover it has to keep records of senders with valid Hashcash in past
mails to be able to permit (PDA) messages without a token.

Again, the DBM described above works for the first query, and the second query is my friends white list, maybe with a slight extension. Again, a simple DBM. Now we could make things more complex and stick the data into something like SQLite, but I like DBM files. They're simple, they're easy, and the only problem is I can't share a single in-memory instance between multiple processes. The lock, open, read/write, close cycle gets really old.
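
To make the friends white list and the lock, open, read/write, close cycle concrete, here is a minimal Python sketch. It assumes a POSIX system (it uses `fcntl.flock` on a separate lock file), and the names `locked_db` and `accept_message` are invented for the example.

```python
import dbm
import fcntl
from contextlib import contextmanager


@contextmanager
def locked_db(path):
    """Lock, open, hand over the DBM, then close and unlock."""
    with open(path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until we own the lock
        try:
            with dbm.open(path, "c") as db:
                yield db
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def accept_message(path, sender, has_valid_stamp):
    """Known senders pass without a stamp; a valid stamp adds a new sender."""
    key = sender.lower().encode()
    with locked_db(path) as db:
        if key in db:
            return True
        if has_valid_stamp:
            db[key] = b"1"
            return True
    return False
```

This is also what lets a PDA send unstamped mail later: once one stamped message has put the sender on the list, subsequent messages are accepted on the white list alone.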

To me it seems like trying to sell a high-tech Ferrari a hundred years
ago, where the infrastructure mainly consists of bumpy country lanes.
Some time later this may work out well, but in the beginning it's
suitable at best for some test tracks.

It's actually more like selling an ATV in the days of horse and wagon. Both can run over bumpy roads but horse owners get really upset if you try to tell them about new ways of doing things.

Seriously, there has been a huge psychological barrier to Hashcash over the years. The biggest objection has been the stamp-on-every-message model, which is what drove me to the level of complexity I am currently implementing. I'm trying to use code to get past psychological barriers. Sometimes I think a baseball bat to the knees or ankles might have been more effective. It's truly frightening that people involved with technology get so emotionally bound up and resistant to new ideas (yes, I know I should talk).


Please correct me if I'm wrong, but I see the (pure) principle of
Hashcash in spending a certain amount of computation time in order to
supply evidence for the seriousness of my message. If my device (PDA
etc.) doesn't allow me to support such a strategy, then my mail has to
be better in other areas to pass spam filters. Hashcash isn't a
cure-all for spam, however it provides a valuable additional
characteristic to base a filtering decision on.

Or... my argument for low-power devices is "let the server do it". It's totally appropriate and is a good example of how one can assign a value to a proof-of-work stamp. How much is it worth to you to generate a stamp that will guarantee your message will be delivered to a destination mailbox? Again, this is another reason for the one-stamp friends list. Once the PDA has sent a message and you've paid for that stamp via some other resource, the PDA is free to send as many messages to that address as you can create.

I agree it's not a cure-all, but it has the wonderful property that if we can fool spammers into generating lots of stamps, the volume of spam on the net will go down. And if we can't fool them, we now have a signal for making our own messages much more recognizable.

It wins either way.
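
For readers new to the "pure" principle discussed above, here is a minimal Python sketch of minting and checking a version 1 style stamp (SHA-1 with a required number of leading zero bits). It is a simplified illustration, not the reference implementation: the counter is written in hex, and expiry and double-spend checks are left out.

```python
import base64
import hashlib
import os
from datetime import date
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of a hash digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def mint(resource: str, bits: int = 20) -> str:
    """Try counters until the stamp's SHA-1 has `bits` leading zero bits."""
    rand = base64.b64encode(os.urandom(8)).decode()
    prefix = f"1:{bits}:{date.today():%y%m%d}:{resource}::{rand}:"
    for counter in count():
        stamp = prefix + format(counter, "x")
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp: str, resource: str, min_bits: int) -> bool:
    """Check format, recipient, claimed size, and the actual work done."""
    fields = stamp.split(":")
    if len(fields) != 7 or fields[0] != "1" or fields[3] != resource:
        return False
    if not fields[1].isdigit() or int(fields[1]) < min_bits:
        return False
    digest = hashlib.sha1(stamp.encode()).digest()
    return leading_zero_bits(digest) >= int(fields[1])
```

Each extra bit doubles the expected minting work while verification stays a single hash, which is the asymmetry the whole argument rests on.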

So why not start at a lower level to bring the developers of MUAs and
filtering systems as well as mailserver operators on board instead of
confusing and by that demotivating them with complicated data flows
right from the beginning. AFAIK widespread SpamAssassin - after some
configuring - is already capable of handling Hashcash the way OmniMix
now provides, and it modifies the score value of a mail based on the
stamp size, which allows the sender to invest as many resources as
expendable. So a sender may already experience an advantage in
equipping her/his mails with Hashcash tokens even the naïve way, which
however isn't general knowledge yet.

With regard to SpamAssassin, I think it's wonderful that it's been included. I am sad, however, that it is not on by default. But as I've said before, I have tried for years to get others to turn on stamp systems, and the basic argument circles around the drain this way:

Why should I filter for stamps if nobody's generating them?

Why should I generate stamps if nobody's filtering for them?

Why should I ask my users to sit there and wait for minutes to generate stamps when it's useless because nobody's filtering for them?

Generating stamps is going to break my server if I generate them for every message.

And then there's the usual angry retort along the lines of "to hell with you for assuming that I'm going to burn cycles for your stupid stamp".

Seriously, there is significant psychological resistance to creating stamps. About half of the resistance goes away if there is some sort of throttle on stamp generation. The white list technique is one of the better throttling techniques, but I accept that it is too complex for simple systems. That's why I think the DNS technique is a good first step. The model I've worked out for DNS doesn't even require a server telling you how much per user. All it needs is a static record describing something about the recipient site and the stamp size required.

For example, a static record could tell you the number of bits, the type of interaction (every time, first time, every time till reply), version number, and receive only/bidirectional.
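
As a sketch of how a shared library might interpret such a record, here is a small Python parser. The record syntax (`v=1 bits=20 mode=first dir=bidirectional`) and the field and value names are invented for this example; no record format has been agreed on in this thread, and the DNS lookup itself is left out.

```python
from dataclasses import dataclass

# Hypothetical value names for the fields listed above.
MODES = {"every", "first", "until-reply"}
DIRECTIONS = {"receive-only", "bidirectional"}


@dataclass(frozen=True)
class StampPolicy:
    version: int
    bits: int        # stamp size the site requires
    mode: str        # every time, first time, or every time till reply
    direction: str   # receive-only or bidirectional


def parse_policy(txt: str) -> StampPolicy:
    """Parse a space-separated key=value record into a StampPolicy."""
    fields = dict(item.split("=", 1) for item in txt.split())
    policy = StampPolicy(
        version=int(fields["v"]),
        bits=int(fields["bits"]),
        mode=fields["mode"],
        direction=fields["dir"],
    )
    if policy.mode not in MODES or policy.direction not in DIRECTIONS:
        raise ValueError(f"unsupported policy: {txt!r}")
    return policy
```

For instance, `parse_policy("v=1 bits=20 mode=first dir=bidirectional")` yields a policy asking for a 20-bit stamp on the first message only.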

Like I said before, the interpretation could be in a library which would make interpretation uniform across multiple filters.


No, of course not. Please don't take my words too seriously, as in this
field I'm a newbie without any experience. I merely considered using
Hashcash a good strategy to qualify individually sent anonymous
messages, which among other things don't come with a valid 'From'
header, over mass-mailed spam, the originators of which would need
thousandfold my computation power to reach a comparable value of their
tokens.

Okay. We are basically on the same page. If I have the neurons, maybe I'll take a stab at the modeling and see if it yields something simple. And yes, the assumption about Hashcash and the impact on spammers is quite right. But the beauty is that it has the same impact on any mass e-mail, such as commercial advertising. I think that's wonderful.

---eric
