[openbeos] Re: app_server: MMX/SSE help wanted

  • From: Adi Oanca <e2joseph@xxxxxxxxxx>
  • To: openbeos@xxxxxxxxxxxxx
  • Date: Tue, 10 Aug 2004 01:44:38 +0300

Christian Packmann wrote:

I've put the program with source up at <http://www.elenthara.de/BeOS/B_OP_ADD_Test.zip>, if anybody wants to look at it (it's just a quick hack, don't expect comments; if you have questions, contact me). I'd love benchmark results from a P4, as I'm very curious about how much it differs between SIMD and integer code. The program should auto-detect the supported SIMD sets and run only the appropriate routines, but the CPU ID routine has never been tested on PII/III/4s, so it might crash.

Here are your tests on a P4 2.6GHz HT:

$ ./B_OP_ADD_Test 800 600 1
Benchmarking C integer
93.13 MPixels/second

Benchmarking C integer, loop unrolling x4
146.16 MPixels/second

Benchmarking plain MMX
33.75 MPixels/second

Benchmarking SSE, loop unrolling x4, PREFETCHT0
657.53 MPixels/second

$ ./B_OP_ADD_Test 800 600 2
Benchmarking C integer
99.21 MPixels/second

Benchmarking C integer, loop unrolling x4
153.87 MPixels/second

Benchmarking plain MMX
35.94 MPixels/second

Benchmarking SSE, loop unrolling x4, PREFETCHT0
587.52 MPixels/second

$ ./B_OP_ADD_Test 100 100 1
Benchmarking C integer
101.05 MPixels/second

Benchmarking C integer, loop unrolling x4
174.55 MPixels/second

Benchmarking plain MMX
40.51 MPixels/second

Benchmarking SSE, loop unrolling x4, PREFETCHT0
1600.00 MPixels/second

$ ./B_OP_ADD_Test 100 100 2
Benchmarking C integer
109.09 MPixels/second

Benchmarking C integer, loop unrolling x4
174.55 MPixels/second

Benchmarking plain MMX
40.42 MPixels/second

Benchmarking SSE, loop unrolling x4, PREFETCHT0
1476.92 MPixels/second
=======================

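Isn't the MMX performance a bit odd? Plain MMX runs at well under half the speed of the plain C integer loop in every one of the four runs.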
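For anyone comparing against these numbers, here is a rough sketch of what a plain "C integer" path for this kind of test could look like. It is not the code from the zip: it assumes B_OP_ADD means a per-channel add clamped at 255 on 32-bit pixels, it treats all four bytes (including alpha) the same way, and the names add_sat_u8 and op_add_c are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Add two 8-bit channel values, clamping at 255 instead of wrapping. */
static uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/*
 * Hypothetical helper: for each 32-bit pixel, add every byte of src to the
 * matching byte of dst with saturation and store the result in dst.
 * Assumption: all four channels are handled identically.
 */
void op_add_c(uint32_t *dst, const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t s = src[i];
        uint32_t d = dst[i];
        uint32_t out = 0;

        for (int shift = 0; shift < 32; shift += 8) {
            uint8_t c = add_sat_u8((uint8_t)(s >> shift),
                                   (uint8_t)(d >> shift));
            out |= (uint32_t)c << shift;
        }
        dst[i] = out;
    }
}
```

A SIMD version would normally do the same work with a packed saturating byte add (PADDUSB on MMX), handling several pixels per instruction, so one would expect it to beat a loop like this rather than lose to it.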


Adi.
