2009/6/14 Rene Gollent <anevilyak@xxxxxxxxx>:
> Benchmark: Haiku app_server bilinear copy [
>
>          Minimum   Average   Maximum
> # 1:      436412    453995    556881  - 'C, original'
> # 2:      449611    451156    460086  - 'C, precise'
> # 3:      457694    463536    501845  - 'C, precise DIV'
> # 4:      239682    243406    246527  - 'MMX/SSE'
> # 5:      236101    240076    252101  - 'MMX/SSE optim-test'
> # 6:      301065    308596    352154  - 'SSE2'
> Skipped 'SSSE3', insufficient SIMD support

I haven't looked at the code, but it would be interesting to see how well
gcc4 optimizes the C version with different arch flags.
Also, IIRC, AMD's documentation on SIMD suggests how to order
instructions to avoid stalls.

-- 
Fredrik Holmqvist
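
For anyone who wants to try the gcc4 comparison, here is a minimal sketch of a scalar bilinear blend in C. It is not the app_server code: the names blend_channel and bilinear_sample, the 0..256 fixed-point weights, and the packed 4 x 8-bit pixel layout are all assumptions made for illustration. The idea would be to build the same file several times, for example with -O2 alone and then with -O2 -msse2 -ftree-vectorize or a specific -march= value, and compare the timings.

```c
#include <assert.h>
#include <stdint.h>

/* Blend one 8-bit channel of the four neighbouring pixels.
 * fx and fy are fixed-point weights in the range 0..256 (assumed format).
 * The result is truncated, not rounded. */
static uint32_t
blend_channel(uint32_t p00, uint32_t p10, uint32_t p01, uint32_t p11,
	uint32_t fx, uint32_t fy)
{
	uint32_t top = p00 * (256 - fx) + p10 * fx;
	uint32_t bottom = p01 * (256 - fx) + p11 * fx;

	/* Two 8-bit weight multiplications, so shift right by 16. */
	return (top * (256 - fy) + bottom * fy) >> 16;
}

/* Hypothetical helper: bilinear sample of four packed 32-bit pixels
 * (four 8-bit channels each). p00/p10 are the top row, p01/p11 the
 * bottom row. */
uint32_t
bilinear_sample(uint32_t p00, uint32_t p10, uint32_t p01, uint32_t p11,
	uint32_t fx, uint32_t fy)
{
	uint32_t result = 0;
	int shift;

	assert(fx <= 256 && fy <= 256);

	for (shift = 0; shift < 32; shift += 8) {
		uint32_t c = blend_channel(
			(p00 >> shift) & 0xff, (p10 >> shift) & 0xff,
			(p01 >> shift) & 0xff, (p11 >> shift) & 0xff,
			fx, fy);
		result |= c << shift;
	}
	return result;
}
```

A per-channel loop like this is the kind of code the auto-vectorizer may or may not handle well, which is exactly what the flag comparison would show.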