
[openbeos] Re: app_server: MMX/SSE help wanted
- From: Christian Packmann <Christian.Packmann@xxxxxx>
- To: openbeos@xxxxxxxxxxxxx
- Date: Tue, 10 Aug 2004 21:54:08 +0200
On 2004-08-10 14:52:32 [+0200], Alexander G. M. Smith wrote:
> Christian Packmann wrote on Mon, 09 Aug 2004 23:03:49 +0200:
>> Even for non-cacheable data and simple operations, SIMD processing (and
>> use of data prefetch instructions) can give more than decisive
>> advantages.
> Looks like somewhere between 2 and 3 times speedup for large data.
That's on my system with its slow RAM; P4s with fast RAM are a different
breed, and the same goes for Athlon64s. So on modern systems a speedup of
4 times seems more likely.
> Sure are lots of shift instructions in the C code - that's what MMX does
> do all in one operation.
Not quite; MMX can access the bytes as single operands and perform the
addition on all 8 values in a register at once, so it has no need to shift
any values - this is a huge advantage. Additionally, it can do saturated
additions, i.e. all values >255 are automatically clipped to 255; in C you
need to do that in a separate step with masking (value & 0xff).
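To make the saturation point concrete, here is a minimal sketch, not the
benchmark code from this thread. The function names are made up for
illustration. Note that a bare (value & 0xff) only keeps the low byte, so
the scalar version clamps with an explicit comparison. The SIMD path is an
assumption on my part: it uses the SSE2 form of the saturated byte add
(16 bytes per instruction) instead of the 8-byte MMX one, and it is only
compiled where the compiler defines __SSE2__.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating add of one unsigned byte pair: sums above 255 clamp to 255. */
static uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;  /* widen so the carry survives */
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Scalar reference: one byte per iteration, clamp done as a separate step. */
static void add_sat_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i)
        dst[i] = add_sat_u8(dst[i], src[i]);
}

#if defined(__SSE2__)
#include <emmintrin.h>

/* SIMD variant: 16 saturated byte additions per instruction, no shifts,
 * no separate clamp. Leftover bytes fall back to the scalar helper. */
static void add_sat_bytes_sse2(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(a, b));
    }
    for (; i < n; ++i)
        dst[i] = add_sat_u8(dst[i], src[i]);
}
#endif
```

Both functions add src into dst in place and should give identical
results; only the number of bytes handled per step differs.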
> I wonder if it would be faster or slower with
> byte pointers and math rather than shift operations to extract the bytes.
Good idea about the byte pointers. I just tested this, and while it gives
only a marginal +2% improvement for data in RAM, it's +30% for data in cache.
You can't use byte arithmetic though, as plain x86 has no saturated integer
addition; any overflow would give garbage results. But ADDs are usually
heavily optimized nowadays and should execute in 1 cycle regardless of
width.
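For reference, the two extraction styles discussed here could look roughly
like this. It is a sketch under assumptions, not the test program itself:
the function names and the 4-channels-per-32-bit-pixel layout are
hypothetical. In both versions each channel is widened to unsigned before
the add, so that the overflow can be detected and clamped instead of
wrapping around in a byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a widened channel sum to the 0..255 range. */
static unsigned clamp255(unsigned v)
{
    return v > 255u ? 255u : v;
}

/* Shift version: pull each 8-bit channel out of the 32-bit pixels. */
static uint32_t add_pixel_shift(uint32_t d, uint32_t s)
{
    uint32_t out = 0;
    int shift;
    for (shift = 0; shift < 32; shift += 8) {
        unsigned sum = ((d >> shift) & 0xffu) + ((s >> shift) & 0xffu);
        out |= (uint32_t)clamp255(sum) << shift;
    }
    return out;
}

/* Byte-pointer version: address the channels directly in memory.
 * Each byte is read once into a local, widened sum before clamping. */
static void add_pixels_bytes(uint32_t *dst, const uint32_t *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t i;
    for (i = 0; i < n * 4; ++i) {
        unsigned sum = (unsigned)d[i] + (unsigned)s[i];
        d[i] = (unsigned char)clamp255(sum);
    }
}
```

Because the add is done per channel, both versions should produce the same
pixels regardless of byte order; which one is faster is exactly what the
measurements above are about.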
> I'd also check the generated code to make sure *src was not being
> reloaded for every operation (copy it to a local variable first in that
> case) and compile with optimization.
The byte pointer version uses a local var, so this shouldn't be a problem.
And I had opt=full from the beginning.
I'll clean up the program a bit and upload a new version, hopefully by
tomorrow.
> Anyway, it's nice to see those actual numbers!
A pleasure! :)
Bye,
Chris