[haiku-development] Re: Optimizing Painter::_DrawBitmapBilinearCopy32

  • From: Fredrik Holmqvist <fredrik.holmqvist@xxxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Sun, 14 Jun 2009 18:04:02 +0200

2009/6/14 Rene Gollent <anevilyak@xxxxxxxxx>:
> Benchmark: Haiku app_server bilinear copy
[...]
>
>       Minimum    Average    Maximum
> # 1:    436412     453995     556881  - 'C, original'
> # 2:    449611     451156     460086  - 'C, precise'
> # 3:    457694     463536     501845  - 'C, precise DIV'
> # 4:    239682     243406     246527  - 'MMX/SSE'
> # 5:    236101     240076     252101  - 'MMX/SSE optim-test'
> # 6:    301065     308596     352154  - 'SSE2'
> Skipped 'SSSE3', insufficient SIMD support

Haven't looked at the code, but it would be interesting to see how well
gcc4 would optimize the C version for different arch flags.
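
To make that concrete, here is a rough sketch of the kind of scalar
kernel one could feed to gcc4. It is not the actual Painter code: the
function name bilinear_pixel32, the packed 4x8-bit pixel layout and the
8-bit fixed-point weights are all assumptions for illustration only.

```c
#include <stdint.h>

/* Hypothetical stand-in for the scalar path, NOT the app_server code.
 * Blends four neighbouring 32-bit pixels (4 x 8-bit channels each).
 * wx and wy are fixed-point weights in 0..255 toward the right and
 * bottom neighbours; the opposite weight is (256 - w). */
uint32_t
bilinear_pixel32(uint32_t p00, uint32_t p10, uint32_t p01, uint32_t p11,
	uint32_t wx, uint32_t wy)
{
	uint32_t result = 0;
	int shift;

	for (shift = 0; shift < 32; shift += 8) {
		/* extract one 8-bit channel from each neighbour */
		uint32_t c00 = (p00 >> shift) & 0xff;
		uint32_t c10 = (p10 >> shift) & 0xff;
		uint32_t c01 = (p01 >> shift) & 0xff;
		uint32_t c11 = (p11 >> shift) & 0xff;

		/* horizontal blend, result scaled by 256 */
		uint32_t top = c00 * (256 - wx) + c10 * wx;
		uint32_t bottom = c01 * (256 - wx) + c11 * wx;

		/* vertical blend, then drop the 16 fractional bits */
		uint32_t c = (top * (256 - wy) + bottom * wy) >> 16;

		result |= c << shift;
	}
	return result;
}
```

Building this with e.g. "gcc -O2 -S" and then again with -march=pentium3,
-march=athlon64 or -ftree-vectorize added, and diffing the assembly,
would show how much the arch flags alone change the generated code.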

Also, IIRC, AMD's documentation on SIMD suggested how to order
instructions to avoid stalls.

-- 
Fredrik Holmqvist
