I think my code is slower (lots) than Jonathan's, but maybe there are some ideas in it that can be picked up. Mine is optimized for P1, since that's all I have at home to develop on :(. I tested it at work on the P4, but I'm not comfortable doing development on this at work, which precludes using SIMD. On my 200Mhz P1, I get 170,000 collision tests per second, on my work 2.4Ghz P4, its 1,600,000. The two key ideas are: 1) changes to W at position 0x0c do not affect 11 of the remaining 64 words in W, and 2) if you only change positions 0x0c and 0x0d of a sha-1 block, then you can calculate A,B,C,D, and E after the first 12 iterations of the sha-1 loop once, use those values instead of H0,H1,H2,H3, and H4 as initializers and skip the first 12 (out of 80) iterations of the sha-1 loop. I've attached the assembly, driver, and makefile for nasm/vc. To make it work on Linux nasm/gcc, you need to change the name of _hashcash_mint to hashcash_mint to meet ELF naming requirements. My code isn't drop-in compatible with hashcash, which is too bad. It seems worthless to do that, though, since the SIMD code is so much faster. The tar file also has the Excel spreadsheet I used to find the least change-sensitive word in the W array. To use it, put an "X" in those of the first 16 word(s) that you intend to change. Don Briklin and Bob Frankston will automatically fill in X's in the words that will change in the calculated part of the W array. Michael