[haiku-development] Re: Optimizing Painter::_DrawBitmapBilinearCopy32

  • From: Urias McCullough <umccullough@xxxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Sun, 14 Jun 2009 11:13:30 -0700

On Sun, Jun 14, 2009 at 10:44 AM, Urias McCullough<umccullough@xxxxxxxxx> wrote:
> PIII 450 results running on gcc4 Haiku r30993 (I apologize for the
> multitude of gcc4 results, but I use gcc4 Haiku far more than gcc2
> these days):
>
> ~> sysinfo
> Kernel name: kernel_x86 built on: Jun  7 2009 10:27:27 version 0x1
> 1 Intel Pentium III, revision 0673 running at 447MHz (ID: 0x00000000 0x00000000)
>
> CPU #0: GenuineIntel
>        Type 0, family 6, model 7, stepping 3, features 0x0383f9ff
>                FPU VME DE PSE TSC MSR PAE MCE CX8 SEP MTRR PGE MCA CMOV PAT PSE36
>                MMX FXSTR SSE
>        Extended Intel: 0x00000000
>
>        Instruction TLB: 4k-byte pages, 4-way set associative, 32 entries
>        Instruction TLB: 4M-byte pages, fully associative, 2 entries
>        Data TLB: 4k-byte pages, 4-way set associative, 64 entries
>        L2 cache: 512 KB, 4-way set associative, 32 bytes/line
>        L1 inst cache: 16 KB, 4-way set associative, 32 bytes/line
>        Data TLB: 4M-byte pages, 4-way set associative, 8 entries
>        L1 data cache: 16 KB, 4-way set associative, 32 bytes/line
>
>  194973696 bytes free      (used/max   73453568 /  268427264)
>                           (cached     24084480)
>     31547 semaphores free (used/max       1221 /      32768)
>      3971 ports free      (used/max        125 /       4096)
>      3989 threads free    (used/max        107 /       4096)
>      2031 teams free      (used/max         17 /       2048)
> ~> runme_haiku
> Benchmark: Haiku app_server bilinear copy
> Compile date: Jun 14 2009 14:38:02
> GCC version: 2.95.3-haiku-081024
>
> CPU vendor ID: GenuineIntel
> CPU:
>  SIMD instructions: MMX SSE SSE-Integer
>
> Can't lock process to CPU on this platform.
> Estimated CPUID/RDTSC overhead: 109 clock cycles.
> 10 runs per benchmark.
>
>                    --  Results  --
>
>       Minimum    Average    Maximum
> # 1:    453962     492521     676056  - 'C, original'
> # 2:    502890     523050     652266  - 'C, precise'
> # 3:    495008     499859     516316  - 'C, precise DIV'
> # 4:    291554     298556     343949  - 'MMX/SSE'
> Skipped 'MMX/SSE optim-test', insufficient SIMD support
> Skipped 'SSE2', insufficient SIMD support
> Skipped 'SSSE3', insufficient SIMD support
>
> Since the CPUID results from the benchmark were incomplete, I threw in
> a sysinfo too.
>

And here are some results from another PIII box running at 866 MHz.
This machine is running an ancient gcc2 Haiku r29420 (I might upgrade
it tonight):

~> sysinfo
Kernel name: kernel_x86 built on: Mar  6 2009 22:20:01 version 0x1
1 Intel Pentium III, revision 0686 running at 864MHz (ID: 0x00000000 0x00000000)

CPU #0: GenuineIntel
        Type 0, family 6, model 8, stepping 6, features 0x0383f9ff
                FPU VME DE PSE TSC MSR PAE MCE CX8 SEP MTRR PGE MCA CMOV PAT PSE36
                MMX FXSTR SSE
        Extended Intel: 0x00000000

        Instruction TLB: 4k-byte pages, 4-way set associative, 32 entries
        Instruction TLB: 4M-byte pages, fully associative, 2 entries
        Data TLB: 4k-byte pages, 4-way set associative, 64 entries
        L2 cache: 256 KB, 8-way set associative, 32 bytes/line
        L1 inst cache: 16 KB, 4-way set associative, 32 bytes/line
        Data TLB: 4M-byte pages, 4-way set associative, 8 entries
        L1 data cache: 16 KB, 4-way set associative, 32 bytes/line

 314851328 bytes free      (used/max   87465984 /  402317312)
                           (cached     46501888)
     64263 semaphores free (used/max       1273 /      65536)
      3976 ports free      (used/max        120 /       4096)
      3990 threads free    (used/max        106 /       4096)
      2032 teams free      (used/max         16 /       2048)
~> runme_haiku
Benchmark: Haiku app_server bilinear copy
Compile date: Jun 14 2009 14:38:02
GCC version: 2.95.3-haiku-081024

CPU vendor ID: GenuineIntel
CPU:
  SIMD instructions: MMX SSE SSE-Integer

Can't lock process to CPU on this platform.
Estimated CPUID/RDTSC overhead: 122 clock cycles.
10 runs per benchmark.

                    --  Results  --

       Minimum    Average    Maximum
# 1:    432823     519293     931042  - 'C, original'
# 2:    486212     525834     661193  - 'C, precise'
# 3:    475781     522175     723021  - 'C, precise DIV'
# 4:    273213     297925     473892  - 'MMX/SSE'
Skipped 'MMX/SSE optim-test', insufficient SIMD support
Skipped 'SSE2', insufficient SIMD support
Skipped 'SSSE3', insufficient SIMD support
~> runme_haiku
Benchmark: Haiku app_server bilinear copy
Compile date: Jun 14 2009 14:38:02
GCC version: 2.95.3-haiku-081024

CPU vendor ID: GenuineIntel
CPU:
  SIMD instructions: MMX SSE SSE-Integer

Can't lock process to CPU on this platform.
Estimated CPUID/RDTSC overhead: 122 clock cycles.
10 runs per benchmark.

                    --  Results  --

       Minimum    Average    Maximum
# 1:    432769     506279     848629  - 'C, original'
# 2:    486538     533291     663126  - 'C, precise'
# 3:    475901     514613     648138  - 'C, precise DIV'
# 4:    273223     310876     475751  - 'MMX/SSE'
Skipped 'MMX/SSE optim-test', insufficient SIMD support
Skipped 'SSE2', insufficient SIMD support
Skipped 'SSSE3', insufficient SIMD support
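
For anyone who wants to compare the runs above, here is a rough sketch
that works out the ratio of 'C, original' to 'MMX/SSE' from the Minimum
column. The numbers are copied from the output above. I am assuming the
columns are RDTSC clock cycles (the benchmark does not label its units),
and the dictionary layout and the speedup() helper are just names made up
for this example, not part of the benchmark.

```python
# Minimum counts copied from the benchmark output above.
# Assumption: the unit is clock cycles as read via RDTSC.
results = {
    "PIII 450": {
        "C, original": 453962,
        "C, precise": 502890,
        "C, precise DIV": 495008,
        "MMX/SSE": 291554,
    },
    "PIII 866, run 1": {
        "C, original": 432823,
        "C, precise": 486212,
        "C, precise DIV": 475781,
        "MMX/SSE": 273213,
    },
    "PIII 866, run 2": {
        "C, original": 432769,
        "C, precise": 486538,
        "C, precise DIV": 475901,
        "MMX/SSE": 273223,
    },
}


def speedup(run, baseline="C, original", candidate="MMX/SSE"):
    """Return how many times more cycles the baseline needs than the candidate."""
    return run[baseline] / run[candidate]


for name, run in results.items():
    print(f"{name}: 'C, original' / 'MMX/SSE' = {speedup(run):.2f}")
```

On both machines the ratio comes out between 1.5 and 1.6, and the two
866 MHz runs agree closely on the minimums even though their averages
and maximums vary quite a bit.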
