On Wed, 26 May 2004, Jonathan Morton wrote: > I recently sent Adam an updated version of libsha1.c, which is about > 20% faster on x86 machines. After some more thought in this vein, I'd > like to help write a set of "fast minting" routines which maximise > sustained minting performance on a given class of machines. In that case, I'd like to help out if I may. Optimising assembly language is a hobby of mine. Although I mostly have experience of PPC so it seems we overlap a bit there. > The minter routine would then append the random salt string, which > would be at least 64 (or 96) bits worth but extended (if necessary) to > allow minting to progress on a single SHA-1 block - which is much more > efficient than calculating SHA-1 afresh for the entire message. A couple of further suggestions: first, some of the hash computation can be calculated just once. For example, in the stamp 1:20:040428:foo::0298b26853c8ebb3:beca0 there are 34 bytes up to and including the last ':'. This means that the first eight words of the block are constant, so the first eight steps of the 80 can be calculated once and then re-used. This could mean a 10% improvement, roughly. Second, the random salt string could even be extended to take up more of the block. Suppose the stamp above is extended to have 48 bytes up to the last ':', then 12 words remain constant and hence 12 steps out of 80 don't need to be repeated, a possible 15% speedup. This leaves 7 bytes for the counter string before the stamp overflows one SHA-1 block, which should be enough (even if only using a hex counter, that's 2^28 chances at finding a 20-bit collision). It needn't be the random salt which is extended in this way; the counter could be initialised to take up the most space possible and then incremented at the tail end. That is, the minter would try 1:20:040428:foo::0298b26853c8ebb3:000000000000000000000 1:20:040428:foo::0298b26853c8ebb3:000000000000000000001 1:20:040428:foo::0298b26853c8ebb3:000000000000000000002 This is the longest stamp which fits in one SHA-1 block, and permits the minter to precalculate 13 out of 80 steps, with one (tiny) additional adjustment every time the counter carries between four-byte words: 1:20:040428:foo::0298b26853c8ebb3:000000000000000000FFF 1:20:040428:foo::0298b26853c8ebb3:000000000000000001000 ^^^^ Obviously this leaves us looking at a trade-off between stamp size and minting time. > Hexadecimal and base64 strings are both feasible, though the tail of > the salt string may be more efficiently coded in some subset of base64, > eg. the first 16 letters of the alphabet. I agree, this would make advancing through the counter values cleaner and faster. This could also depend on the particular mint function, for example, an optimised PPC G4 function might use one set of vector registers and two sets of integer registers to implement a six pipeline minter. Then it might be convenient to use the first 24 letters of the alphabet, and perhaps a simple matter to use both upper and lower case for a total of 48 different values per byte. (The G4 can carry out one vector and two integer ops per clock cycle, so this is a likely scenario). Is there any concept of "order" on the counter, or a requirement that the value there is the "lowest"? And if so, who's going to check it? In other words, is it fair to say that a minting function can produce any string for the counter provided it's in the base64 alphabet? I ask because I vaguely remember a statement that the counter should be "the first" value that works, although I can't find it and am possibly mistaken. The assumption may have simply been that the minter will naturally return the first value it finds that works. One last thing, Kyle Hasselbacher said in an old comment: "I'm going to rewrite my generator to start the counter at 2^(bits+2). It'll be the unluckiest generator on the net. 8-)" Referring back to my second suggestion, this could also make it the fastest generator on the net :-) Regards, Malcolm Howell.