Hi!

Thanks for the replies to my previous email. By the way, I was taught
about hashcash at my university as a scientifically sound way of
blocking spam, so congratulations, academia is already listening :)

Back to the replies:

>> [...] hash the address, and use that(or some first bytes of that)
>> instead of the email string. We are done - hmm, well, almost. Except
>> that if the person thinks that the email might have been sent to
>> someone he knows the email address of, he could check.

>I think you identified the problem with the hash of email address for
>Bcc -- a guess can be confirmed. To expand on that also note that a
>"guess" can mean hashing a database of 1 billion email addresses to
>see which it is -- hashcash shows us this will take just an hour or
>so.

I actually thought about this possibility, but I assumed that guessing
would be hard if we have no clue about the other person's address. I
still think it is hard in general, though guessing easy addresses
(like bob@ and tom@ at easy domains like @hotmail.com, @aol.com, plus
the domains we frequently email) can be trivial. I will think hard
about this :D

>The other thing is it leaks the number of Bccs. Perhaps one could
>defend against it by adding a few extra bogus email addresses
>if there aren't any bcc's so you cant tell.

Well, I did not think about this, and it is a valid point. Your
solution seems nice, but it only works for a small number of Bccs: if
someone puts 100 Bccs in a message, then the usual 4-5 bogus Bccs
won't help. The receiver will know that there were other receivers
(unless we put ~100 bogus Bccs into every email, which sounds bad). It
would work for about 99.99% of emails, though. And having more than 5
Bccs sounds like a bad social habit anyway (or the lack of a mailing
list).

>>[...] Is the length of the 0-bits that must be calculated
>> by the sender fixed ? [...]

>We need some extension to communicate updates to it scalably, without
>any central bandwidth overhead.

Agreed.
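
Coming back to the Bcc hashing for a moment: the guess confirmation
discussed above can be sketched in a few lines of Python. This is only
an illustration. The tag format (first 8 bytes of SHA-1 over the
lowercased address) and the names bcc_tag and confirm_guess are my own
assumptions, not anything hashcash defines.

```python
import hashlib

def bcc_tag(address: str, nbytes: int = 8) -> str:
    """Hypothetical Bcc tag: first nbytes of SHA-1 over the address (assumed format)."""
    digest = hashlib.sha1(address.strip().lower().encode("utf-8")).digest()
    return digest[:nbytes].hex()

def confirm_guess(tag: str, candidates, nbytes: int = 8):
    """Return every candidate address whose tag matches the one seen in the mail."""
    return [addr for addr in candidates if bcc_tag(addr, nbytes) == tag]

# A recipient sees this tag in the headers and tries a list of likely addresses.
seen_tag = bcc_tag("bob@hotmail.com")
guesses = ["tom@aol.com", "bob@hotmail.com", "alice@example.org"]
print(confirm_guess(seen_tag, guesses))  # ['bob@hotmail.com']
```

The same loop run over a large address database is the attack described
in the reply: each guess costs a single hash, so nothing but the size of
the candidate list protects the hidden recipient.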

>Something like each sender sends
>their highest received hashcash bits requirement. We can have some
>transparent process for deciding the current (some formula on current
>hardware say), and update it say once a year.

Then different machines, with different received emails and different
partners, will set different limits (this is like agreeing on the time
in a distributed environment: everybody will think it is something
different). From a practical point of view, cheating is always too
easy in distributed systems, no matter how good the idea. I might be
able to convince the other side that my computation limit is low. Or I
could put in a lot of nodes that I control and set their computation
limit low, forcing everybody else to lower theirs.

> And have a requirement
>to create one more bit than required so that people who haven't got
>the update yet dont get their mail misfiled. Then basically updates
>would distribute via software update, and via received mails.

Well, to be honest, as a (moderately bad) programmer, I say this
sounds like an unworkable idea. Email seems such a simple thing, yet
it is an awful mess on the internet, and it's a wonder it works. I can
only imagine the horror a distributed program would cause on the net :)

>Another option is DNS some text record on hashcash.org eg
>bits.hashcash.org returns a text record lets say. However people will
>misimplement their DNS clients or caches and it'll get hammered :(

This is the solution. It needs investment :( but it would work great.
DNS is a good way of distributing such simple data in a way that
scales. RBLs work this way (as far as I know, at least).

>> Generating
>> truly random numbers is not as easy as people(programmers) think.

>Yeah its a valid point, but taken care of in the C implementation: it
>uses /dev/urandom on linux and CryptGenRand from CAPI on windows.

Great :) I once implemented the Yarrow-160 cryptographically secure
random number generator on a Symbian phone, which is why I was asking.

BTW I hope hashcash will start to get used almost everywhere. I am
astounded that it hasn't really taken off yet. All the other methods
have serious flaws. RBLs don't block spam from zombies (or from a lot
of other places, and they block good traffic too). Content analysers
will always have a non-negligible false positive rate. Central
bulk-analysing methods (hash the email, send the hash to a central
server, and if the server sees the hash too many times it decides the
mail is spam) are very costly both computationally and in real money,
are slow to respond, and don't block spam that changes a bit every few
hundred mails. Hashcash would solve the real problem of spam: that the
burden falls not on the sender but on the receiver.

Bye,
Máté
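
P.S. To make the "one more bit than required" rule and the random salt
concrete, here is a rough Python sketch. It is not the real hashcash
stamp format (real stamps carry more fields, such as a version and a
date). The resource:salt:counter layout and the names mint, accept and
leading_zero_bits are assumptions for illustration only. The salt comes
from os.urandom, which reads the operating system's CSPRNG.

```python
import hashlib
import os
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def mint(resource: str, required_bits: int, slack: int = 1) -> str:
    """Search for a stamp with required_bits + slack leading zero bits (simplified format)."""
    salt = os.urandom(8).hex()  # random salt from the OS CSPRNG
    target = required_bits + slack
    for counter in count():
        stamp = f"{resource}:{salt}:{counter:x}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= target:
            return stamp

def accept(stamp: str, resource: str, required_bits: int) -> bool:
    """Receiver-side check: right resource and enough leading zero bits."""
    digest = hashlib.sha1(stamp.encode()).digest()
    return stamp.startswith(resource + ":") and leading_zero_bits(digest) >= required_bits

# Sender believes 12 bits are required and mints 13; a receiver that has
# already moved to 13 bits still accepts the stamp.
stamp = mint("someone@example.org", required_bits=12)
print(accept(stamp, "someone@example.org", 13))
```

The extra bit doubles the sender's expected work, which is the price
paid for tolerating receivers that got the update earlier.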