[openbeos] Re: app_server: MMX/SSE help wanted

Christian Packmann wrote on Sun, 08 Aug 2004 14:04:49 +0000:
> I've got a blur routine (3x3 matrix) for B_RGB32 bitmaps, which gives 
> following results on my Athlon XP 2100+ (1733MHz) with DDR266 memory:
> 
>               Bitmap 640x480, 1200 KB    Bitmap 100x100, 9.76KB
>       Code         MegaPixels/second          MegaPixels/second 
> C integer              33                           35
> MMX                    80                          125
> 3DNow!                110                          134

Speaking generally, those gains won't look as impressive in the future, because
CPU speeds are going up faster than memory speeds.  If all your data (or each
batch of it) fits in the CPU's L1 cache (typically tens of kilobytes or less)
and needs intensive processing, then MMX is great.  Otherwise the code spends
much of its time waiting on memory and you save less.  On top of that, newer
memory systems are good at sequential access but horrible at reading data from
random addresses, which again affects how you write your code.  Whole books and
courses are available on optimization tricks to work around those bottlenecks
and other quirks.  The key thing, though, is to measure your results with an
accurate timer; otherwise it's just wishful thinking.
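
To make the "measure it" point concrete, here is a rough sketch of a timing
harness in plain C.  It is not Christian's routine: blur3x3() is a hypothetical
stand-in (a simple 3x3 box blur that copies the border pixels unchanged), and
the names seconds_now() and blur_megapixels_per_second() are made up for this
example.  It assumes a POSIX system with clock_gettime(); on BeOS something
like system_time() would presumably take its place.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Monotonic time in seconds (assumes POSIX clock_gettime is available). */
double seconds_now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical stand-in for the routine under test: a plain C 3x3 box blur
 * on 32-bit pixels (four 8-bit channels).  Border pixels are copied as-is. */
void blur3x3(const uint32_t *src, uint32_t *dst, int w, int h)
{
    assert(w >= 3 && h >= 3);
    memcpy(dst, src, (size_t)w * (size_t)h * sizeof *src);
    for (int y = 1; y < h - 1; y++) {
        for (int x = 1; x < w - 1; x++) {
            uint32_t out = 0;
            for (int shift = 0; shift < 32; shift += 8) {
                unsigned sum = 0;
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++)
                        sum += (src[(y + dy) * w + (x + dx)] >> shift) & 0xFFu;
                out |= (uint32_t)(sum / 9u) << shift;
            }
            dst[y * w + x] = out;
        }
    }
}

/* Time `reps` blur passes and return the throughput in megapixels/second.
 * Returns 0.0 if allocation fails or the elapsed time is not measurable. */
double blur_megapixels_per_second(int w, int h, int reps)
{
    size_t n = (size_t)w * (size_t)h;
    uint32_t *src = malloc(n * sizeof *src);
    uint32_t *dst = malloc(n * sizeof *dst);
    if (!src || !dst) {
        free(src);
        free(dst);
        return 0.0;
    }
    for (size_t i = 0; i < n; i++)
        src[i] = (uint32_t)(i * 2654435761u);   /* arbitrary test pattern */

    blur3x3(src, dst, w, h);                    /* warm-up pass, not timed */
    double t0 = seconds_now();
    for (int r = 0; r < reps; r++)
        blur3x3(src, dst, w, h);
    double elapsed = seconds_now() - t0;

    volatile uint32_t sink = dst[n / 2];        /* keep the result live */
    (void)sink;
    free(src);
    free(dst);
    return elapsed > 0.0 ? (double)n * reps / elapsed / 1e6 : 0.0;
}
```

Calling blur_megapixels_per_second(640, 480, 50) and then
blur_megapixels_per_second(100, 100, 2000) gives the same kind of large-bitmap
versus small-bitmap comparison as the table above.  Repeat each run a few times
and take the best figure, since a single run is easily disturbed by whatever
else the machine is doing.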

- Alex
