[hashcash] Thanks for the replies

From: Mate Soos <msoos@xxxxxxxxxxx>
To: hashcash@xxxxxxxxxxxxx
Date: Sat, 27 May 2006 01:34:02 +0200
Hi!

Thanks for the replies to my previous email. By the way, I was taught
about hashcash at my university as a scientifically sound way of
blocking spam - so congratulations, academia is already listening :)
Back to the replies:

>> [...] hash the address, and use that (or some first bytes of that)
>> instead of the email string. We are done - hmm, well, almost. Except
>> that if the person thinks that the email might have been sent to
>> someone he knows the email address of, he could check.

>I think you identified the problem with the hash of email address for
>Bcc -- a guess can be confirmed.  To expand on that also note that a
>"guess" can mean hashing a database of 1 billion email addresses to
>see which it is -- hashcash shows us this will take just an hour or
>so.

I actually thought about this possibility, but I assumed that guessing
would be hard if we have no clue about the other person's address. I
still think it is, though guessing easy addresses (like bob@ and tom@
at common domains like @hotmail.com and @aol.com, plus domains we
frequently email) can be trivial. I will think hard about this :D
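
To make the guessing attack concrete, here is a minimal sketch in
Python. The tag format (first 8 bytes of SHA-1 over the lowercased
address) and the names address_tag and confirm_guess are assumptions
made for illustration only; nothing like this is part of hashcash.

```python
import hashlib

def address_tag(address: str, nbytes: int = 8) -> str:
    """Truncated SHA-1 of a normalised address (hypothetical tag format)."""
    digest = hashlib.sha1(address.strip().lower().encode("utf-8")).digest()
    return digest[:nbytes].hex()

def confirm_guess(tag: str, local_parts, domains, nbytes: int = 8):
    """Return every candidate address whose tag equals the observed one."""
    return [
        f"{local}@{domain}"
        for local in local_parts
        for domain in domains
        if address_tag(f"{local}@{domain}", nbytes) == tag
    ]

# The tag an observer would see in the message headers.
tag = address_tag("bob@hotmail.com")
print(confirm_guess(tag, ["alice", "bob", "tom"], ["hotmail.com", "aol.com"]))
```

Each guess costs one hash, so the attack is only as hard as the
candidate list is long; a keyed or salted tag would be needed to stop
this kind of confirmation.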

>The other thing is it leaks the number of Bccs.  Perhaps one could
>defend against it by adding a few extra bogus email addresses
>if there aren't any bcc's so you cant tell.

Well, this I did not think about, and it is a valid point. Your
solution seems nice, but it only works for a small number of BCCs: if
someone puts 100 BCCs in the message, then the usual, e.g., 4-5 bogus
BCCs won't help - the receiver will know that there were other
receivers (unless we put ~100 bogus BCCs into every email, which sounds
bad). It would work for about 99.99% of emails, though. And having more
than 5 sounds like a bad social habit anyway (or the lack of an email list).
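
The padding idea and its limit can be sketched roughly as follows. The
function name pad_tags, the minimum of 5 entries, and the 8-byte hex
tags are assumptions for illustration, not an existing format.

```python
import secrets

def pad_tags(real_tags, minimum: int = 5, nbytes: int = 8):
    """Pad a list of recipient tags with random decoys up to `minimum`.

    Decoys are random hex strings of the same length as a real tag, and
    the result is shuffled so position does not reveal which are real.
    """
    tags = list(real_tags)
    while len(tags) < minimum:
        tags.append(secrets.token_hex(nbytes))
    secrets.SystemRandom().shuffle(tags)
    return tags

print(len(pad_tags([])))                                        # no BCCs
print(len(pad_tags([secrets.token_hex(8) for _ in range(100)])))  # 100 BCCs
```

A message with no BCCs and one with three both end up with five
entries, but a message with 100 BCCs still shows 100, which is exactly
the leak described above.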

>>[...] Is the length of the 0-bits that must be calculated
>> by the sender fixed ? [...]

>We need some extension to communicate updates to it scalably, without
>any central bandwidth overhead. 

Agreed.

>Something like each sender sends
>their highest received hashcash bits requirement. We can have some
>transparent process for deciding the current (some formula on current
>hardware say), and update it say once a year.

Then different machines with different received emails and different
partners will set different limits (this is like agreeing on time in a
distributed environment - everybody will think it's something
different). From a practical point of view: with distributed things,
cheating is always too easy, no matter how good the idea. I might be
able to convince the others that my computation limit is low, or put in
a lot of nodes that I control and set their computation limit low,
forcing everybody to lower theirs.

> And have a requirement
>to create one more bit than required so that people who haven't got
>the update yet dont get their mail misfiled.  Then basically updates
>would distribute via software update, and via received mails.

Well, to be honest, as a (moderately bad) programmer, I say: this sounds
like an unworkable idea. Email seems such a simple thing, yet it is an
awful mess on the internet; it's a wonder it works. I can only imagine
the horror a distributed program would cause on the net :)
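
For reference, the "one more bit than required" rule from the quote
amounts to something like the sketch below. It is a simplified
illustration loosely modelled on the version 1 stamp layout
(ver:bits:date:resource:ext:rand:counter); the hex encoding of the
random and counter fields, the fixed sample date, and the names mint
and accepts are my assumptions, and a real verifier also checks the
date, the resource and double spending.

```python
import hashlib
import secrets
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def mint(resource: str, required_bits: int, margin: int = 1) -> str:
    """Search for a stamp with `margin` more zero bits than required."""
    bits = required_bits + margin
    salt = secrets.token_hex(8)  # random field, taken from the OS CSPRNG
    prefix = f"1:{bits}:060527:{resource}::{salt}:"  # 060527 is a sample date
    for counter in count():
        stamp = f"{prefix}{counter:x}"
        digest = hashlib.sha1(stamp.encode("ascii")).digest()
        if leading_zero_bits(digest) >= bits:
            return stamp

def accepts(stamp: str, local_requirement: int) -> bool:
    """True if the stamp's SHA-1 has at least the locally required bits."""
    digest = hashlib.sha1(stamp.encode("ascii")).digest()
    return leading_zero_bits(digest) >= local_requirement

# Sender believes 12 bits are required and mints 13.
stamp = mint("bob@example.com", required_bits=12)
print(stamp, accepts(stamp, 12), accepts(stamp, 13))
```

A receiver that has already moved to 13 bits still accepts the stamp,
at the price of doubling the sender's expected work.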

>Another option is DNS some text record on hashcash.org eg
>bits.hashcash.org returns a text record lets say.  However people will
>misimplement their DNS clients or caches and it'll get hammered :(

This is the solution. It needs investment :( but it would work great.
DNS is a good way of distributing such simple data in a way that scales.
RBLs work this way (as far as I know, at least).
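
A client-side sketch of the DNS idea, with the caching that keeps the
server from being hammered. The record name bits.hashcash.org is only
the example from the quote, the record is assumed to contain a bare
integer, and the default of 20 bits and one-day lifetime are my
assumptions. The actual TXT query is passed in as a function, because
Python's standard library has no TXT lookup and a DNS library would be
needed for it.

```python
import time
from typing import Callable

class BitsCache:
    """Cache the advertised bit requirement so it is queried rarely."""

    def __init__(self, lookup: Callable[[str], str],
                 ttl: float = 86400.0, default: int = 20):
        self._lookup = lookup            # hypothetical TXT query function
        self._ttl = ttl                  # seconds between queries
        self._value = default            # used until a query succeeds
        self._expires = float("-inf")    # forces a query on first use

    def bits(self, name: str = "bits.hashcash.org") -> int:
        now = time.monotonic()
        if now >= self._expires:
            try:
                self._value = int(self._lookup(name).strip())
            except (OSError, ValueError):
                pass  # lookup failed or record malformed: keep old value
            self._expires = now + self._ttl
        return self._value

# Stand-in for a real resolver, for demonstration only.
cache = BitsCache(lambda name: "22")
print(cache.bits(), cache.bits())
```

Failures are cached for the same period as successes, so a broken
resolver does not turn into a query storm.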

>> Generating
>> truly random numbers is not as easy as people(programmers) think.

>Yeah its a valid point, but taken care of in the C implementation: it
>uses /dev/urandom on linux and CryptGenRandom from CAPI on windows.

Great :) I once implemented the Yarrow-160 cryptographically secure
random number generator on a Symbian mobile phone, which is why I was
asking.


BTW I hope hashcash will start to get used almost everywhere. I am
astounded that it hasn't already taken off. All the other methods have
serious flaws. RBLs don't block spam from zombies (or from a lot of
other places, and they block good traffic too), content analysers will
always have a non-negligible false positive rate, and central
bulk-analysing methods (hash the email, send the hash to a central
server, and if the server sees the hash too many times, it decides it
is spam) are very costly both computationally and in real money, plus
they are slow to respond and don't block spam that changes slightly
every few hundred mails. Hashcash would solve the real problem of spam:
the burden of the spam is not on the sender but on the receiver.

Bye,

Máté
Follow-Ups:
- [hashcash] Re: Thanks for the replies
  - From: Adam Back