[haiku-appserver] Re: Quite some room for improvement

  • From: "Rudolf" <drivers.be-hold@xxxxxxxxxxxx>
  • To: haiku-appserver@xxxxxxxxxxxxx
  • Date: Mon, 12 Dec 2005 21:22:04 +0100 CET

Hi Stephan :-)

I really hope you're not _that_ surprised!
Unless I am very much mistaken, I already warned (in general) about
8/32/64-bit accesses.

Also, I informed some(?) people about a _very_ nice benchmarking app 
out there that you really need to have now, I guess (not Stephan's 
write-only benchmarker on BeBits, but a not-yet-released low-level one 
that tests both directions).

Here are some results from it on some of my systems. Note that I tested 
with and without MTRR support(!).

A few more things:
- If you want the app, I'll mail it. The author considers himself 
'lost' to our community, and I still lack clearance to publish it. :-/
- PLEASE optimize for 64-bit reads AND writes. If no MTRR is available, 
write speed WILL be affected by the access size.
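
To illustrate the point about 64-bit writes, here is a minimal sketch of a solid fill for a 16-bit screen mode that packs four pixels into one 64-bit store. The function name fill_span16 is hypothetical and not taken from any driver or from app_server. Whether the compiler turns the copy into a single 64-bit bus access depends on the target; on 32-bit x86 that typically takes MMX or SSE.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: fill `count` 16-bit pixels starting at `dst` with
// `color`, using one 64-bit store per four pixels where possible.
void fill_span16(uint16_t* dst, size_t count, uint16_t color)
{
    // Write single pixels until the destination is 8-byte aligned.
    while (count > 0 && (reinterpret_cast<uintptr_t>(dst) & 7) != 0) {
        *dst++ = color;
        count--;
    }

    // Replicate the pixel into all four 16-bit lanes of a 64-bit pattern.
    uint64_t pattern = color;
    pattern |= pattern << 16;
    pattern |= pattern << 32;

    // Main loop: four pixels per store. memcpy avoids aliasing problems
    // and is normally compiled down to a plain store.
    while (count >= 4) {
        std::memcpy(dst, &pattern, sizeof(pattern));
        dst += 4;
        count -= 4;
    }

    // Remaining zero to three pixels.
    while (count > 0) {
        *dst++ = color;
        count--;
    }
}
```

With write combining disabled, the numbers below suggest that the 8-bit and 32-bit paths lose the most, which is why the wide main loop matters there.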

Thanks :-)

Rudolf.

BTW: an HTML attachment with results from some systems is included 
(reports from the author, mainly about the results I reported to him)
BTW2: and don't forget about PCI-FW via the AGP busmanager! Have a look 
at those results...


=========
<snipping from mails to author>

laptop: Packard Bell EasyNote 4012C+.
CPU: Celeron 400 MHz, NeoMagic MagicGraph NM2160 PCI video chipset,
mainboard chipset Intel 82440MX

$ AGPBandwidth -d1 -r2 --allsizes --delay
AGPBandwidth 0.6
Screenmode: 1024x768x16, framebuffer address: 0x20c00000

Delaying 4 seconds...
Write 64-bit:   63.84 MB/s (210.00 MB in 3.29 s)
Write 32-bit:   63.66 MB/s (210.00 MB in 3.30 s)
Write  8-bit:   63.66 MB/s (210.00 MB in 3.30 s)
Read  64-bit:    5.24 MB/s (22.50 MB in 4.29 s)
Read  32-bit:    5.05 MB/s (22.50 MB in 4.46 s)
Read   8-bit:    1.19 MB/s (7.50 MB in 6.32 s)
$ AGPBandwidth -d1 -r2 --allsizes --delay
AGPBandwidth 0.6
Screenmode: 1024x768x16, framebuffer address: 0x20c00000

Delaying 4 seconds...
Write 64-bit:   63.87 MB/s (210.00 MB in 3.29 s)
Write 32-bit:   63.75 MB/s (210.00 MB in 3.29 s)
Write  8-bit:   63.67 MB/s (210.00 MB in 3.30 s)
Read  64-bit:    5.24 MB/s (22.50 MB in 4.29 s)
Read  32-bit:    5.06 MB/s (22.50 MB in 4.45 s)
Read   8-bit:    1.18 MB/s (7.50 MB in 6.33 s)
$

---------
disabled MTRR-WC:
---------
$ AGPBandwidth -d1 -r2 --allsizes --delay
AGPBandwidth 0.6
Screenmode: 1024x768x16, framebuffer address: 0x20c00000

Delaying 4 seconds...
Write 64-bit:   57.36 MB/s (180.00 MB in 3.14 s)
Write 32-bit:   29.44 MB/s (90.00 MB in 3.06 s)
Write  8-bit:    7.30 MB/s (30.00 MB in 4.11 s)
Read  64-bit:    5.04 MB/s (22.50 MB in 4.46 s)
Read  32-bit:    4.46 MB/s (15.00 MB in 3.36 s)
Read   8-bit:    1.11 MB/s (7.50 MB in 6.73 s)
$ AGPBandwidth -d1 -r2 --allsizes --delay
AGPBandwidth 0.6
Screenmode: 1024x768x16, framebuffer address: 0x20c00000

Delaying 4 seconds...
Write 64-bit:   57.82 MB/s (180.00 MB in 3.11 s)
Write 32-bit:   29.61 MB/s (90.00 MB in 3.04 s)
Write  8-bit:    7.37 MB/s (30.00 MB in 4.07 s)
Read  64-bit:    5.06 MB/s (22.50 MB in 4.44 s)
Read  32-bit:    4.48 MB/s (15.00 MB in 3.35 s)
Read   8-bit:    1.11 MB/s (7.50 MB in 6.73 s)
$
-----------------------------
+mtrr-wc, plus driver speedup fix (testable thanks to you!!)
(I'll sleep on including this in CVS, as it seems this speedup comes 
at the cost of CPU time.)
------------------------------
$ AGPBandwidth -d1 -r2 --allsizes --delay
AGPBandwidth 0.6
Screenmode: 1024x768x16, framebuffer address: 0x20c00000

Delaying 4 seconds...
Write 64-bit:   79.80 MB/s (240.00 MB in 3.01 s)
Write 32-bit:   79.78 MB/s (240.00 MB in 3.01 s)
Write  8-bit:   79.75 MB/s (240.00 MB in 3.01 s)
Read  64-bit:    5.33 MB/s (22.50 MB in 4.22 s)
Read  32-bit:    4.83 MB/s (15.00 MB in 3.11 s)
Read   8-bit:    1.15 MB/s (7.50 MB in 6.50 s)
$ AGPBandwidth -d1 -r2 --allsizes --delay
AGPBandwidth 0.6
Screenmode: 1024x768x16, framebuffer address: 0x20c00000

Delaying 4 seconds...
Write 64-bit:   79.89 MB/s (240.00 MB in 3.00 s)
Write 32-bit:   79.81 MB/s (240.00 MB in 3.01 s)
Write  8-bit:   79.70 MB/s (240.00 MB in 3.01 s)
Read  64-bit:    5.33 MB/s (22.50 MB in 4.22 s)
Read  32-bit:    4.82 MB/s (15.00 MB in 3.11 s)
Read   8-bit:    1.15 MB/s (7.50 MB in 6.51 s)

==================

> Hi all,
> 
> The good news is that I've found a way to accelerate the alpha blending
> inside the drawing modes that Painter uses by a factor of 4.6. For
> writing to graphics memory, the access pattern doesn't seem to matter
> much. There is virtually no difference if you write 8 bits, 32 bits or
> 64 bits at once. But when you need to read from the frame buffer, the
> difference is quite noticeable: It is 3.6 times faster to read 32 bits
> into a temporary variable, alpha blend into that, and write it back.
> It is 4.6 times faster to do the same, but with 64 bits. I always
> thought that this stuff mattered for just writing to graphics mem as
> well, but it seems that this is not the case.
> 
> I've also noticed an awesome possibility for speed improvement in my
> bitmap rendering code for clipped bitmaps. This should speed up
> WonderBrush on Haiku quite a bit. Maybe this applies to more stuff as
> well. If I understand AGG correctly, it will clip stuff you draw at a
> very late stage, mostly at the time it tries to write a generated
> scanline to the frame buffer. When you manually apply a bit of clipping
> before that, you can possibly skip the generation of much of the
> scanline as well. For bitmaps, this is especially easy to accomplish.
> And even more effective, since AGG would generate a color scanline,
> since the "fill" is not a solid color.
> 
> Best regards,
> -Stephan
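
The read, blend, write-back pattern described in the quoted message could look roughly like the sketch below. It is not Painter's actual code: blend_pixel and blend_span32 are hypothetical names, a 32-bit pixel layout with three 8-bit color channels in the low bytes is assumed, and a single constant alpha is used for the whole span. Two pixels are fetched with one 64-bit read into a temporary, blended there, and written back with one 64-bit write.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Blend the three low 8-bit channels of `src` over `dst` with a constant
// alpha (0..255). The top byte of `dst` is kept unchanged.
static inline uint32_t blend_pixel(uint32_t dst, uint32_t src, uint8_t alpha)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 24; shift += 8) {
        uint32_t d = (dst >> shift) & 0xff;
        uint32_t s = (src >> shift) & 0xff;
        out |= ((d * (255u - alpha) + s * alpha + 127u) / 255u) << shift;
    }
    return out | (dst & 0xff000000u);
}

// Blend `color` over `count` frame buffer pixels, touching graphics
// memory with one 64-bit read and one 64-bit write per pixel pair.
void blend_span32(uint32_t* fb, size_t count, uint32_t color, uint8_t alpha)
{
    size_t i = 0;
    for (; i + 2 <= count; i += 2) {
        uint64_t temp;
        std::memcpy(&temp, fb + i, sizeof(temp));   // single wide read

        // Both halves get the same treatment, so lane order is irrelevant.
        uint32_t lo = blend_pixel(static_cast<uint32_t>(temp), color, alpha);
        uint32_t hi = blend_pixel(static_cast<uint32_t>(temp >> 32), color, alpha);

        temp = (static_cast<uint64_t>(hi) << 32) | lo;
        std::memcpy(fb + i, &temp, sizeof(temp));   // single wide write
    }
    if (i < count)                                  // odd trailing pixel
        fb[i] = blend_pixel(fb[i], color, alpha);
}
```

The key point is that all per-channel work happens on the temporary in CPU registers or cache, never on individual bytes in graphics memory. On 32-bit x86 the compiler may still split the 64-bit copy into two 32-bit accesses unless MMX or SSE is used.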

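The early clipping idea from the quoted message can be sketched as follows. This does not use the AGG API; Rect and draw_bitmap_clipped are hypothetical names, the clip rectangle is assumed to lie inside the frame buffer, and a plain copy stands in for whatever per-pixel color generation the renderer would do. The bitmap's bounds are intersected with the clip rectangle first, so pixels outside it are never generated at all.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical rectangle type; right and bottom are exclusive.
struct Rect {
    int left, top, right, bottom;
};

// Draw `bitmap` at (destX, destY) into `frame`, generating only the
// pixels inside `clip`. Returns the number of pixels generated.
// Assumption: `clip` lies entirely inside the frame buffer.
int draw_bitmap_clipped(const std::vector<uint32_t>& bitmap,
    int bmpWidth, int bmpHeight, int destX, int destY, const Rect& clip,
    std::vector<uint32_t>& frame, int frameWidth)
{
    // Intersect the bitmap's destination bounds with the clip rectangle.
    int x0 = std::max(destX, clip.left);
    int x1 = std::min(destX + bmpWidth, clip.right);
    int y0 = std::max(destY, clip.top);
    int y1 = std::min(destY + bmpHeight, clip.bottom);
    if (x0 >= x1 || y0 >= y1)
        return 0;   // fully clipped: no scanline work at all

    int generated = 0;
    for (int y = y0; y < y1; y++) {
        // Start each row at the first visible source pixel.
        const uint32_t* src = &bitmap[(y - destY) * bmpWidth + (x0 - destX)];
        uint32_t* dst = &frame[y * frameWidth + x0];
        for (int x = x0; x < x1; x++) {
            *dst++ = *src++;
            generated++;
        }
    }
    return generated;
}
```

For a large bitmap that is mostly off screen or covered, the saving is proportional to the clipped area, since the per-pixel color generation is skipped and not just the final write.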
