[haiku-development] Re: A tale of two accelerant API's

  • From: looncraz <looncraz@xxxxxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Tue, 12 Feb 2013 13:39:05 -0800

On 2/12/2013 06:33, Axel Dörfler wrote:

That reminds me: the app_server is already completely agnostic about what the frame buffer looks like.
It doesn't care whether a workspace is in a single frame buffer or not.
Every Screen object can come with its own framebuffer, and that might be powered by the same graphics driver or not. In any case, the graphics driver is the only one that decides whether or not a multi-head situation shares the same framebuffer.
From the POV of the app_server, every screen has its own framebuffer.

Bye,
   Axel.



What about the case of differing resolutions for two monitors on a dual-head card? I don't see how that can be handled transparently to whatever is rendering into it - short of scaling, or simply over-provisioning (virtually, by hidden clipping logic I haven't seen anywhere in the code, or otherwise) so that rendering still occurs as if the frame buffer were one neat, tidy rectangle.

Currently the CompositeEngine isn't set up to handle multiple Screen objects or Desktops - just one. There is no multi-head support in Radeon HD yet to test against, so I am testing the code outside of app_server in a fake environment (which also makes it much easier to test the interaction of just my changes). I'm thinking each screen may get its own CompositeEngine but share all other resources. That means five threads on a quad-core system by default, though this will be configurable so as to limit the total number of threads spawned.

The overall setup is quite simple, actually, but it is designed around the idea that everything is 2D and we don't have the luxury of wonderful hardware acceleration - because we don't, and won't for years. It is flexible, though: WindowBuffers can easily represent VRAM / 3D textures, and the rest is adjustable easily enough.

Other than the obvious redirection of client paints into a buffer, it is necessary to properly determine when a client draw needs to change what is on screen - that is, when it is visible. I've added this logic into the window itself: just before an UpdateSession is completed, a RenderTask object (or two) is pulled from the RenderTaskPool, and all windows affected by the draw are added, along with their respective affected regions, sorted by z-order. Each RenderTask represents a single client draw, but compositing is currently designed to split rendering down two paths: one for alpha-mode rendering, another for fast copy-mode rendering. Sometimes two RenderTasks are "paired" together, one "owning" the other, because they represent parts of the same client draw. As such, windows will need to declare an alpha region in order to gain transparency. Decorators are the opposite: they will need to declare a copy-mode region to improve performance; otherwise they are considered 100% alpha-mode.
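To make the pairing idea concrete, here is a minimal sketch of splitting one client draw by the window's declared alpha region. None of these names or types are from the actual patch (real code would use BRegion arithmetic rather than single rectangles, and pairing is reduced to a flag):

```cpp
#include <algorithm>
#include <vector>

enum class RenderMode { kCopy, kAlpha };

struct IntRect {
    int left, top, right, bottom;
    bool IsValid() const { return right > left && bottom > top; }
};

static IntRect Intersect(const IntRect& a, const IntRect& b)
{
    return { std::max(a.left, b.left), std::max(a.top, b.top),
             std::min(a.right, b.right), std::min(a.bottom, b.bottom) };
}

struct RenderTask {
    RenderMode mode;
    IntRect region;
    bool paired = false;  // set when this task shares a client draw with another
};

// Split one client draw: the part of the dirty rect inside the window's
// declared alpha region goes down the alpha path, the rest down the fast
// copy path (real code would subtract the alpha region from the copy task).
static std::vector<RenderTask> SplitClientDraw(const IntRect& dirty,
    const IntRect& declaredAlpha)
{
    std::vector<RenderTask> tasks;
    tasks.push_back({ RenderMode::kCopy, dirty });
    IntRect alphaPart = Intersect(dirty, declaredAlpha);
    if (alphaPart.IsValid()) {
        tasks.push_back({ RenderMode::kAlpha, alphaPart });
        tasks[0].paired = tasks[1].paired = true;
    }
    return tasks;
}
```

A window with no alpha region intersecting the draw produces a single, unpaired copy-mode task - the decorator case in reverse.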

When a RenderTask is Finalize()'d, it calls into the CompositeEngine, where it is sorted by various properties (window priority, pixel area, drawing mode, foreground/background window, time since last update (minimum frame rate), etc.) - this is where prioritization comes into play. At that point the client draw is complete and the window thread can go on its merry way. This process is much faster than it sounds, and since it occurs in the window threads, multiple RenderTasks can be created at once. Exact prioritization isn't considered vital: all new RenderTasks are inserted after any RenderTasks skipped from the last frame unless they are deemed more important by a certain factor, as would be the case for MediaPlayer or games that don't use BDirectWindow. (There are some changes I want to make here - basically to let BDirectWindow clients give the app_server an opportunity to overlay its draws, so software cursors don't get obscured and overlays don't get messed up... at least not as badly as they do today. Yes, that requires client cooperation - and lower performance in that window.)
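The insertion rule can be sketched like this - queue new tasks behind anything skipped last frame unless they outrank it by the importance factor. The names and the factor's value are illustrative, not from the real code:

```cpp
#include <list>

struct QueuedTask {
    float priority;
    bool skippedLastFrame;
};

// the hypothetical "certain factor" a newcomer must beat to jump the queue
static const float kPreemptFactor = 2.0f;

static void EnqueueTask(std::list<QueuedTask>& queue, const QueuedTask& task)
{
    auto it = queue.begin();
    // walk past leftover skipped tasks that the newcomer doesn't outrank enough
    while (it != queue.end() && it->skippedLastFrame
            && task.priority < it->priority * kPreemptFactor)
        ++it;
    queue.insert(it, task);
}
```

An ordinary redraw lands behind last frame's leftovers; a high-priority media window jumps ahead of them.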

The next time the CompositeEngine frame control thread (it needs a nice name... Rembrandt?) wants to render a frame (which it tries to do every 1/60th of a second or so), it tests whether there are any pending RenderTasks and then switches contexts so new RenderTasks can be added while the current frame is being rendered. A group of threads, one per core by default, uses specialized DrawingEngines attached to the frame buffer for the screen in question - this is where hardware acceleration comes in. The only action these threads perform is calling DrawingEngine::DrawWindowBuffer(WindowBuffer* buffer, IntRect source, IntRect destination): a buffer-to-buffer rendering, either a raw copy or with alpha calculations applied. The drawing-mode switch is so fast I didn't worry about avoiding it, so the DrawingEngine is set to render in alpha mode or copy mode depending on the needs of the RenderTask its owning thread is servicing. Each thread recycles the RenderTask object once it is serviced, releases its read lock on the frame control MultiLocker, and then starts all over.
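A sketch of one composer thread's service loop, with std::shared_mutex standing in for the frame control MultiLocker and DrawWindowBuffer reduced to a plain pixel copy (the alpha path is omitted). All of these definitions are illustrative, not the real API:

```cpp
#include <cstdint>
#include <queue>
#include <shared_mutex>
#include <vector>

struct WindowBuffer {
    std::vector<uint32_t> pixels;
};

enum class Mode { kCopy, kAlpha };

struct RenderTask {
    WindowBuffer* source;
    WindowBuffer* target;
    Mode mode;
};

static void DrawWindowBuffer(const RenderTask& task)
{
    if (task.mode == Mode::kCopy)
        task.target->pixels = task.source->pixels;  // raw copy path
    // the alpha path would blend source over target per pixel instead
}

static void ComposerLoop(std::queue<RenderTask>& tasks,
    std::shared_mutex& frameLock)
{
    for (;;) {
        // the read lock is what the frame control thread revokes at flip time
        std::shared_lock<std::shared_mutex> readLock(frameLock);
        if (tasks.empty())
            return;  // real code would block waiting for work instead
        RenderTask task = tasks.front();
        tasks.pop();
        DrawWindowBuffer(task);
        // the task object would be recycled into the RenderTaskPool here
    }
}
```

Because each iteration reacquires the read lock, a frame flip only has to wait for in-flight tasks to finish, never for the whole queue to drain.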

Next, the CompositeEngine frame control thread (Rembrandt?) comes back to life some time before the actual on-screen image needs updating and stops the threads from working on any new RenderTasks by acquiring a write lock on the frame control MultiLocker. This is when the mouse and certain effects are handled. The back buffer becomes the front buffer and the front buffer the back buffer, the modified areas of the front buffer are copied back into the back buffer, any post-processing occurs, and the time to the next frame is calculated. Then it all begins again.

I currently have no code for the single-buffer case; in fact, I'm working on the premise that every single window is double-buffered, along with the frame buffer. In a test run with the previously released code, app_server memory usage went to 200MB with this setup, versus 112MB with single-buffering. And the performance difference would be noticeable on lower-end systems - where RAM is cheaper than CPU.

If the buffers can exist in VRAM, though, then we see a rather low memory requirement... and it is entirely feasible to hold one buffer in video RAM and the other in system RAM. (mmap, FTW!).

BTW, DrawingEngine::DrawWindowBuffer calls into the accelerated DrawBitmap_NoScale32, IIRC... so acceleration should come "for free."

--The loon

PS: there is plenty I left out... such as how system-wide effects are to be handled (no provisions as of yet - LOW priority)
