[hashcash] Hashcash performance improvements
- From: Jonathan Morton <chromi@xxxxxxxxxxxxxxxxxxxxx>
- To: hashcash@xxxxxxxxxxxxx
- Date: Wed, 26 May 2004 04:31:22 +0100
I recently sent Adam an updated version of libsha1.c, which is about
20% faster on x86 machines. After some more thought in this vein, I'd
like to help write a set of "fast minting" routines which maximise
sustained minting performance on a given class of machines.
I think this is in our best interests, as determined spammers will do
this kind of optimisation as soon as it becomes profitable for them,
and we want to stay abreast of them at worst. Optimising the software
available to the "good guys" limits the advantage spammers can gain
over us, at least until they decide to use dedicated hardware.
I've already determined that the vanilla x86 instruction set is
incapable of computing SHA-1 hashes without spilling working variables
to memory - the revised routine reduces this problem but doesn't
eliminate it. There are 5 explicit working variables, 2 temporaries,
and one frequently-accessed constant needed, but the x86 only has 8
GPRs, some of which are needed for system purposes.
It should, however, be possible to compute SHA-1 more efficiently using
MMX, which has a set of 8 registers separate from the GPRs - just
enough to store all the needed values. Additionally, MMX would be able
to compute two hashes simultaneously, potentially doubling the
collision rate. I estimate that an efficient MMX routine could be as
much as 4 times as fast as the present implementation - that's
potentially 4 million collision tests per second from a recent Athlon.
One disadvantage, though, is that MMX has no rotate instruction - it
takes 4 instructions (move, shift left, shift right, or) to emulate
that operation.
Even more efficient routines could possibly be written using SSE2, but
I don't have hardware capable of running such code. Someone else might
want to implement an SSE2 extension once the MMX and Altivec routines
are mature, so I'll try to ensure such extensions are easy to add.
PowerPC machines (meaning Macs) have a rich set of registers and are
not limited by spilling. I still think a faster implementation is
possible, by running multiple "pipes" alongside each other and thus
taking advantage of pipeline bubbles. Other register-rich
architectures, including perhaps ARM, could benefit from the same code.
On PowerPC machines supporting Altivec (G4 and G5 at present), there
are 32 128-bit vector registers available, so it is theoretically
possible to run a 16-pipe hash routine, consisting of 4 pipes each
operating on 4-element vectors. This would probably achieve a rate of
1.6 million collision tests per second, or better, on a 400MHz G4.
To allow multi-pipe implementations and reduce overhead, it's necessary
to move away from the standard libsha1-style implementation, towards a
dedicated minting routine. The standard libsha1 routines can be used
to verify the token after minting and upon receipt. I suggest
decoupling the token formatting from the hash minting, and simply
passing a string and a bit-count request to the minter. The formatting
only needs to be done once per token, so doesn't need to be specially
optimised.
The minter routine would then append the random salt string, which
would be at least 64 (or 96) bits worth but extended (if necessary) to
allow minting to progress on a single SHA-1 block - which is much more
efficient than calculating SHA-1 afresh for the entire message.
Hexadecimal and base64 strings are both feasible, though the tail of
the salt string may be more efficiently coded in some subset of base64,
eg. the first 16 letters of the alphabet.
I'd like this to work within the official hashcash distribution, but if
that is not possible, I hope there is no objection to my implementing a
standalone minter.
--------------------------------------------------------------
from: Jonathan "Chromatix" Morton
mail: chromi@xxxxxxxxxxxxxxxxxxxxx
website: http://www.chromatix.uklinux.net/
tagline: The key to knowledge is not to rely on people to teach you it.
- Follow-Ups:
- [hashcash] Re: Hashcash performance improvements
- From: Adam Back
Other related posts:
- » [hashcash] Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- » [hashcash] Re: Hashcash performance improvements
- [hashcash] Re: Hashcash performance improvements
- From: Adam Back