[openbeos] Re: app_server: MMX/SSE help wanted
- From: Christian Packmann <Christian.Packmann@xxxxxx>
- To: openbeos@xxxxxxxxxxxxx
- Date: Mon, 09 Aug 2004 22:47:44 +0200
On 2004-08-08 14:19:19 [+0000], Adi Oanca wrote:
> What about SSE, SSE2, SSE3? what can you tell us?
> Knowing they use 128bit registers, do they deliver a 4x performance
> gain over the CPU?
In some cases, probably. I wouldn't count on that being the general case,
though. But even if it's only 2-3x, that's still a serious speedup which
can be had for 'free'.
> These have support for floating point instructions isn't it?
All SSE instruction sets and 3DNow! offer this. Probably very useful for
doing OP_ALPHA stuff, and all other cases where you want to mix MMX and FP
operations. I can't judge how useful that will turn out; it really
depends on the particular tasks to be performed.
>> I'm not really a SIMD pro, but I'll gladly help with whatever I know.
>> And I already have a few suggestions about data alignment of bitmaps,
>> which would help SIMD coders a lot in writing efficient code.
> Good, let's hear them. Before that: do you want to write some code
> for Haiku project?
Absolutely. To do SIMD coding for a good purpose would make me very happy.
About data alignment issues (short version):
Most CPUs prefer to perform reads/writes on natural alignment
boundaries, i.e. a boundary of 2 bytes for a word (int16), 4 bytes for a double
word (int32), 8 bytes for a quad word (abstract MMX datatype), etc.
When doing unaligned accesses, it'll take the CPU some extra cycles to
perform the read/write operations. As these delays happen very often when
reading lots of data, this will incur a significant slowdown.
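
To make "natural alignment" concrete, here is a minimal sketch of how a
routine could test an address before choosing an access width. The helper
names IsAligned and BytesToAlignment are made up for illustration; they are
not part of any BeOS or Haiku API.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers, not part of any BeOS/Haiku API.

// Returns true if `ptr` sits on a multiple of `alignment` bytes.
// `alignment` must be a power of two (2, 4, 8, 16, ...).
inline bool IsAligned(const void* ptr, std::size_t alignment)
{
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Number of bytes from `ptr` up to the next `alignment` boundary
// (0 if `ptr` is already aligned). `alignment` must be a power of two.
inline std::size_t BytesToAlignment(const void* ptr, std::size_t alignment)
{
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(ptr);
    std::uintptr_t misalignment = address & (alignment - 1);
    return static_cast<std::size_t>((alignment - misalignment) & (alignment - 1));
}
```

For example, an address 4 bytes past a 16-byte boundary is fine for int32
reads, but it needs 12 more bytes before a 16-byte SSE access is aligned.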
AFAIK the worst case is a P4 doing an access across a 16- or 64-byte
boundary; Intel's docs state a penalty of up to the pipeline depth. The P4's
pipeline has a length of 20-30 stages (depending on P4 model), and if you
run into a 20 or 30 cycle delay... that's very bad.
But even if you're 'only' losing a few cycles during each mem access, that
can hurt performance quite badly.
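
Whether a given access straddles such a boundary is simple arithmetic. The
sketch below uses a made-up function name (CrossesBoundary); it only
illustrates the check and says nothing about the actual penalty on any CPU.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration only.
// Returns true if an access of `size` bytes starting at `address` touches
// bytes on both sides of a `boundary`-byte boundary.
// `boundary` must be nonzero and `size` must be greater than zero.
inline bool CrossesBoundary(std::uintptr_t address, std::size_t size,
    std::size_t boundary)
{
    std::uintptr_t firstBlock = address / boundary;
    std::uintptr_t lastBlock = (address + size - 1) / boundary;
    return firstBlock != lastBlock;
}
```

An 8-byte read at address 60 crosses a 64-byte boundary, while the same
read at address 56 does not.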
The problem with current BeOS is that it doesn't provide any kind of
control over alignment when allocating bitmaps; they'll only be aligned to
4-byte boundaries. In order to get optimum memory throughput, I have to
write special code which reads 32-bit values until I reach a well-aligned
address (8/16 bytes for MMX/SSE), then does full-width accesses until there's
only a 'remainder' of data left which has to be read in 32-bit chunks again.
The resulting code is messy.
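
The head/body/tail pattern described above can be sketched as follows. This
is an illustration under assumptions, not code from the post: the function
name InvertPixels, the SplitCounts struct and the colour-inverting operation
are invented, and the "full-width" step is a plain scalar loop standing in
for one SIMD load/op/store so the sketch stays portable.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical bookkeeping struct, used only to show how the work splits.
struct SplitCounts {
    std::size_t head;  // pixels handled one at a time before the aligned run
    std::size_t body;  // pixels handled in full 16-byte groups
    std::size_t tail;  // leftover pixels after the aligned run
};

// Inverts the colour channels of `count` 32-bit pixels in place.
// Assumes `pixels` is at least 4-byte aligned.
SplitCounts InvertPixels(std::uint32_t* pixels, std::size_t count)
{
    const std::size_t kVectorBytes = 16;  // width of an SSE register
    const std::size_t kPixelsPerVector = kVectorBytes / sizeof(std::uint32_t);
    const std::uint32_t kMask = 0x00FFFFFFu;

    SplitCounts counts = {0, 0, 0};
    std::size_t i = 0;

    // Head: single 32-bit accesses until the address is 16-byte aligned.
    while (i < count
        && (reinterpret_cast<std::uintptr_t>(pixels + i)
            & (kVectorBytes - 1)) != 0) {
        pixels[i] ^= kMask;
        ++i;
        ++counts.head;
    }

    // Body: one aligned 16-byte group per iteration. A real routine would
    // use a single SIMD operation here; the inner loop is a scalar stand-in.
    while (count - i >= kPixelsPerVector) {
        for (std::size_t lane = 0; lane < kPixelsPerVector; ++lane)
            pixels[i + lane] ^= kMask;
        i += kPixelsPerVector;
        counts.body += kPixelsPerVector;
    }

    // Tail: the remainder, again in 32-bit chunks.
    while (i < count) {
        pixels[i] ^= kMask;
        ++i;
        ++counts.tail;
    }
    return counts;
}
```

Three loops for one trivial operation shows where the mess comes from; with
an aligned base address and padded rows, only the middle loop would remain.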
So from the SIMD coder's perspective, it would be very good if Haiku
offered some control over data alignment for bitmap allocations.
This includes not only the base address of the bitmap, but should ideally
extend to each bitmap row, in cases where each row has to be processed
separately (e.g. by blur routines).
Of course there'd be some 'waste', but I don't think this would matter too
much on modern systems. Binary compatibility shouldn't be a problem either,
as the BeBook already says that BytesPerRow() is decisive in determining a
bitmap's actual size.
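
To put a number on the 'waste', here is a rough sketch of how a padded row
length could be computed. The function names (RoundUp, PaddedBytesPerRow,
PaddingBytes) are hypothetical and only illustrate the arithmetic; nothing
here is how BeOS or Haiku actually computes BytesPerRow().

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only.

// Rounds `value` up to the next multiple of `alignment` (a power of two).
inline std::size_t RoundUp(std::size_t value, std::size_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

// Bytes per row so that, given an aligned base address, every row starts
// on an `alignment`-byte boundary.
inline std::size_t PaddedBytesPerRow(std::size_t width,
    std::size_t bytesPerPixel, std::size_t alignment)
{
    return RoundUp(width * bytesPerPixel, alignment);
}

// Total number of padding bytes over the whole bitmap.
inline std::size_t PaddingBytes(std::size_t width, std::size_t height,
    std::size_t bytesPerPixel, std::size_t alignment)
{
    std::size_t used = width * bytesPerPixel;
    return (PaddedBytesPerRow(width, bytesPerPixel, alignment) - used) * height;
}
```

A 101-pixel-wide 32-bit bitmap has 404 bytes of pixel data per row; padding
to 16 bytes gives 416 bytes per row, i.e. 12 wasted bytes per row. Code that
steps from row to row by the bytes-per-row value rather than by width * 4
keeps working unchanged.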
This might be implemented by an additional constructor without too much
fuss, I'd guess. It would be great if you could implement this, as this
would make SIMD coding much easier and less bug-prone.
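
As a rough idea of what alignment control would give the caller, here is a
stand-alone sketch of a buffer with an aligned base address and aligned
rows. AlignedPixelBuffer and all of its methods are invented for this
example; this is not the BBitmap API and not a proposal for its actual
signature, just one way to get the guarantee by over-allocating.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical class, NOT part of the BeOS/Haiku API: a pixel buffer whose
// base address and every row start are aligned to `alignment` bytes.
class AlignedPixelBuffer {
public:
    // `alignment` must be a power of two.
    AlignedPixelBuffer(std::size_t width, std::size_t height,
        std::size_t bytesPerPixel, std::size_t alignment)
        :
        fBytesPerRow((width * bytesPerPixel + alignment - 1)
            & ~(alignment - 1)),
        fHeight(height),
        // Over-allocate so an aligned start address always fits.
        fStorage(fBytesPerRow * height + alignment - 1),
        fBits(NULL)
    {
        std::uintptr_t start
            = reinterpret_cast<std::uintptr_t>(fStorage.data());
        std::size_t offset = static_cast<std::size_t>(
            (alignment - (start & (alignment - 1))) & (alignment - 1));
        fBits = fStorage.data() + offset;
    }

    // Copying would leave fBits pointing into the other object's storage.
    AlignedPixelBuffer(const AlignedPixelBuffer&) = delete;
    AlignedPixelBuffer& operator=(const AlignedPixelBuffer&) = delete;

    unsigned char* Bits() { return fBits; }
    std::size_t BytesPerRow() const { return fBytesPerRow; }
    std::size_t Height() const { return fHeight; }

    // Start of row `y`; aligned because both the base address and
    // fBytesPerRow are multiples of the alignment.
    unsigned char* Row(std::size_t y) { return fBits + y * fBytesPerRow; }

private:
    std::size_t fBytesPerRow;
    std::size_t fHeight;
    std::vector<unsigned char> fStorage;
    unsigned char* fBits;
};
```

With something like AlignedPixelBuffer(101, 8, 4, 16), every Row(y) is
16-byte aligned, so a per-row routine can use full-width accesses from the
first pixel on.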
Bye,
Chris
- Follow-Ups:
- [openbeos] Re: app_server: MMX/SSE help wanted
- From: Adi Oanca
- References:
- [openbeos] Re: app_server: MMX/SSE help wanted
- From: DarkWyrm
- [openbeos] Re: app_server: MMX/SSE help wanted
- From: Christian Packmann
- [openbeos] Re: app_server: MMX/SSE help wanted
- From: Adi Oanca