[hashcash] a note about hyperthreading (was: parallel hashcash)
- From: Jonathan Morton <chromi@xxxxxxxxxxxxxxxxxxxxx>
- To: hashcash@xxxxxxxxxxxxx
- Date: Thu, 7 Apr 2005 22:17:38 +0100
Hal> I did a little research then and it seems that most hyperthreading
Hal> benchmarks show similar results of only a few percent increase at
Hal> best. I have to say that this CPU technology is more hype than
Hal> hyper.
I try to avoid the hype. ;-)
I think it all depends on the type of application you're running. If
both processes are numeric-intensive (or are doing the exact same
thing), I guess hyperthreading won't help, since you're still only using
one core. If your processes are doing vastly different things, like in
a typical office app or game, hyperthreading is probably more of a
winner. I don't think hashcash is, or ever will be, an application that
can take advantage of hyperthreading, since it just does a whole bunch
of integer calculations.
Hyperthreading is supposed to help the "front end" part of the
processor find more instructions per clock to send to the execution
units. Thus it only really helps raw performance if *two* conditions
hold:
- Each process, if run on its own, would leave execution units unused
during a significant percentage of clock cycles. This generally
happens when there is a lot of random branching and/or memory access
going on, but it also often happens with floating-point code on the P4.
- The other process(es) make use of the unused unit-cycles. (A
particular hyperthreading implementation, e.g. Niagara, may use more
than two threads per core.) If the first process is doing a lot of
branching, or uses FP code that's not already optimised for the P4,
that's quite easy to do. If the problem is memory access, and the
second process is also doing heavy memory access, there may be a
conflict over resources that have nothing to do with execution units, and
total performance could actually decrease. Where difficulties really
begin, however, is where both processes are already using the same
execution units quite efficiently.
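The two conditions above can be put into a deliberately crude model. This
is only a sketch under loud assumptions: the function name and the model
itself are mine, the execution units are treated as the only shared
resource, and the two threads are assumed to interleave perfectly, which
real hardware never quite manages. It compares running both threads at
once against time-slicing them one after the other.

```python
def smt_speedup(u1: float, u2: float) -> float:
    """Toy estimate of the throughput gain from running two threads together.

    u1, u2: fraction of execution-unit cycles each thread keeps busy when it
    runs alone (0 < u <= 1). Hypothetical model, not a measurement.
    """
    if not (0.0 < u1 <= 1.0 and 0.0 < u2 <= 1.0):
        raise ValueError("utilisation must be in (0, 1]")
    # Without hyperthreading the threads take turns, so on average the
    # units are busy for the mean of the two utilisations.
    time_sliced = (u1 + u2) / 2.0
    # With hyperthreading both demands are served at once, but the units
    # cannot be more than 100% busy.
    shared = min(1.0, u1 + u2)
    return shared / time_sliced
```

With two threads that each already saturate the units, smt_speedup(1.0,
1.0) gives 1.0, i.e. no gain. With two threads that each leave half the
cycles idle, smt_speedup(0.5, 0.5) gives 2.0, the best case this model
allows.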
On the P4, running two hashcash threads together doesn't satisfy these
conditions. As Hubert pointed out, hashcash uses integer instructions
(specifically, bitwise logic and addition) pretty much exclusively.
The P4 allegedly has a pair of double-pumped ALUs that *should* be
equivalent to four ALUs at normal clock speed, but the evidence
suggests that half the clock cycles on each one can only be used for
address generation, not for real work. Thus, the P4 tends to be
limited by the execution units for this particular algorithm, not by
the front end.
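To make the "bitwise logic and addition" point concrete, here is a plain
SHA-1 (the hash that hashcash stamps are built on) written out in Python.
It is an illustrative sketch, not the hashcash tool's own code, and the
function names are mine. Note that every step of the inner loop is an AND,
OR, XOR, NOT, rotate or 32-bit add, with no floating point and no
data-dependent memory access.

```python
import hashlib  # used only to cross-check the sketch at the end
import struct

MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def sha1_compress(state, block: bytes):
    """Mix one 64-byte block into the five-word SHA-1 state."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        # The whole round: one rotate, four additions, a little logic.
        temp = (rotl(a, 5) + f + e + k + w[t]) & MASK
        a, b, c, d, e = temp, a, rotl(b, 30), c, d
    return tuple((s + v) & MASK for s, v in zip(state, (a, b, c, d, e)))


def sha1(message: bytes) -> bytes:
    """Pad the message, run the compression function, return the digest."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
    for i in range(0, len(padded), 64):
        state = sha1_compress(state, padded[i:i + 64])
    return struct.pack(">5I", *state)


# Cross-check against the library implementation.
assert sha1(b"abc") == hashlib.sha1(b"abc").digest()
```

A tuned minter would of course do this in C or assembler, but the
instruction mix is the same, which is why two copies of it end up
competing for the same ALUs.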
What hyperthreading *is* useful for, in a hashcash context, is allowing
ordinary user applications to remain responsive while hashcash is
churning away in the background. Most "business" applications do a lot
of memory access and branching, or else they do floating-point
calculations, both of which can be neatly slipped into the execution
backend of the P4 without interfering much with the hashcash thread.
Interestingly, this last is roughly what Intel marketing actually
describes.
Demos performed by a variety of hardware-review sites back this up,
although they usually do this on Windows where hyperthreading gets a
large advantage from an unexpected source. This source is actually
Windows itself... the SMP and UP kernels have different schedulers,
and the SMP one is considerably more intelligent. This is unlike
Linux, where the SMP and UP kernel variants use the same scheduler, but
the UP kernel is able to take some shortcuts because it can assume only
one CPU is being used (it still runs on SMP boxes, but "parks" the
unused CPUs so that they cannot interfere). Microsoft persist in using
their braindead UP kernel on non-SMP and non-hyperthreading PCs, which
means that a hyperthreading P4 gets an almost unfair boost in
performance. The Linux UP kernel gets a modest performance gain over
the SMP kernel on UP hardware, simply because of the shortcuts.
Hyperthreading may provide a direct performance benefit on other
processor architectures, however. In particular, the IBM POWER5 uses
hyperthreading in conjunction with an unusually wide backend, in an
attempt to obtain more-than-dualcore performance from an equivalent
transistor count to implementing dual cores. Multithreading at the
application level will also, obviously, speed up true dualcore and SMP
machines. It should therefore be left up to the user to decide whether
to turn multithreading on.
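As a rough sketch of what "left up to the user" could look like, the
worker count below is an ordinary parameter that defaults to 1, and the
counter space is split between workers by stride. Everything here is an
assumption for illustration: the names mint, search and leading_zero_bits
are hypothetical, the stamp is only a stamp-like string and is not
guaranteed to be compatible with the hashcash tool, and Python threads are
used purely to keep the example short (in CPython they will not hash in
parallel, so a real minter would use native threads or processes).

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def search(prefix: str, bits: int, start: int, stride: int,
           stop: threading.Event):
    """Try counters start, start+stride, ... until some worker succeeds."""
    counter = start
    while not stop.is_set():
        stamp = f"{prefix}{counter:x}"
        digest = hashlib.sha1(stamp.encode("ascii")).digest()
        if leading_zero_bits(digest) >= bits:
            stop.set()  # tell the other workers to give up
            return stamp
        counter += stride
    return None


def mint(prefix: str, bits: int, workers: int = 1) -> str:
    """Find a stamp with at least `bits` leading zero bits.

    `workers` is the user's choice; 1 means no multithreading at all.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    stop = threading.Event()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(search, prefix, bits, w, workers, stop)
                   for w in range(workers)]
        results = [f.result() for f in futures]
    # At least one worker returned a stamp, since only a success sets `stop`.
    return next(r for r in results if r is not None)
```

Calling mint("1:12:050407:example::salt:", 12) runs single-threaded, and
passing workers=2 is the user opting in. Because worker w only ever tries
counters congruent to w modulo the worker count, no two workers repeat
each other's work.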
Finally, I've found an application which might have been designed for
hyperthreading, if I didn't know for certain that it's been around for
over a decade. It's about as different from hashcash as possible.
While it is not presently multithreaded, it could easily be split into
a producer thread (doing heavy FPU work to iterate over an IFS-type
fractal) and a consumer thread (doing lots of random memory access to
render the fractal data points into an image). For SMP boxes, multiple
producer and consumer threads would also be beneficial.
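For the curious, the producer/consumer split might look roughly like the
sketch below. The fractal here is a stand-in of my choosing (the
Sierpinski triangle, drawn by the "chaos game"), not the application I'm
referring to, and all the names are hypothetical. The producer does
nothing but floating-point iteration; the consumer does nothing but
scattered writes into an image buffer.

```python
import queue
import random
import threading

# Three contraction maps, each moving the point halfway towards a vertex.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
DONE = None  # sentinel telling the consumer that no more batches will come


def producer(out: queue.Queue, n_points: int, batch: int = 1024,
             seed: int = 0) -> None:
    """Iterate the IFS (floating-point work) and hand points over in batches."""
    rng = random.Random(seed)
    x, y = 0.25, 0.25  # any starting point inside the triangle will do
    buf = []
    for _ in range(n_points):
        vx, vy = rng.choice(VERTICES)
        x, y = (x + vx) / 2.0, (y + vy) / 2.0
        buf.append((x, y))
        if len(buf) == batch:
            out.put(buf)
            buf = []
    if buf:
        out.put(buf)
    out.put(DONE)


def consumer(inp: queue.Queue, image: list, size: int) -> None:
    """Render points into a size*size hit-count image (random memory access)."""
    while True:
        points = inp.get()
        if points is DONE:
            return
        for x, y in points:
            col = min(int(x * size), size - 1)
            row = min(int(y * size), size - 1)
            image[row * size + col] += 1


def render(n_points: int = 100_000, size: int = 64) -> list:
    """Run one producer and one consumer thread and return the image."""
    q = queue.Queue(maxsize=8)  # bounded, so the producer cannot run far ahead
    image = [0] * (size * size)
    threads = [
        threading.Thread(target=producer, args=(q, n_points)),
        threading.Thread(target=consumer, args=(q, image, size)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return image
```

Handing points over in batches keeps the queue overhead small compared
with the work done per point. For SMP boxes, one way to extend this would
be several producers with different seeds feeding the same queue, with
each consumer rendering into its own image and the images summed at the
end.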
--------------------------------------------------------------
from: Jonathan "Chromatix" Morton
mail: chromi@xxxxxxxxxxxxxxxxxxxxx
website: http://www.chromatix.uklinux.net/
tagline: The key to knowledge is not to rely on people to teach you it.