Has anyone planned to test whether an inline bswapq is faster on AMD64 than the current C implementation of the 64-bit byteswap? Also, I would like to know what the speed difference is between passing variables to the swap functions in registers and passing them on the stack. I know only gcc implements it (VC++ doesn't). I know VC++ has its own speed hack (naked functions). ;)
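
For anyone who wants to try the first comparison, here is a minimal sketch of the two variants, assuming gcc (or a compatible compiler) on x86-64. The names swap64_c, swap64_asm and check_swaps_agree are made up for illustration, and the C version is only a stand-in for the current implementation, which may well look different.

```c
#include <assert.h>
#include <stdint.h>

/* Portable C variant: shift-and-mask byteswap.
   Hypothetical stand-in for the "current C implementation". */
static inline uint64_t swap64_c(uint64_t x)
{
    /* Swap the two 32-bit halves. */
    x = ((x & 0x00000000FFFFFFFFULL) << 32) | (x >> 32);
    /* Swap the 16-bit words inside each 32-bit half. */
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    /* Swap the bytes inside each 16-bit word. */
    x = ((x & 0x00FF00FF00FF00FFULL) << 8) | ((x >> 8) & 0x00FF00FF00FF00FFULL);
    return x;
}

/* Inline-asm variant: a single bswapq on a 64-bit register.
   Only used with gcc-style inline asm on x86-64; otherwise this
   falls back to the portable version (an assumption of this sketch). */
static inline uint64_t swap64_asm(uint64_t x)
{
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__("bswapq %0" : "+r"(x));
    return x;
#else
    return swap64_c(x);
#endif
}

/* Sanity check to run before timing: both variants must agree. */
static void check_swaps_agree(uint64_t x)
{
    assert(swap64_c(x) == swap64_asm(x));
}
```

The timing loop is left out on purpose, since the result depends on how the swap is called in the real code. When measuring, it is worth checking the generated assembly first, because an optimizing compiler may already turn the C version into a bswap instruction, in which case the two variants would time the same.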