[haiku-development] Re: Compositing window management

  • From: Stephan Aßmus <superstippi@xxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Wed, 15 Jun 2011 11:19:50 +0200

Hi,

On 15.06.2011 06:37, Mark Watts wrote:
> Hi, I'm not exactly new to this list (lots of lurking), but I haven't
> done any hacking on the Haiku sources.
> I want to implement compositing in Haiku (along with other interface
> things). I've gotten Haiku compiled and running in a VM, so from here
> I guess I should start with the interface kit? Can anyone who's done
> related work suggest a starting point in the code or potential
> problems?

Great to hear, I hope you are indeed serious about getting your feet wet! :-) I am going to give you an overview, but it is intended to help you along when reading the code and piecing things together. It will just give you a direction that is guaranteed to work OK and with the minimal amount of changes, but you need to figure out the actual implementation yourself. Here it goes:

Compositing can initially be implemented without many changes to the drawing code or to how updates work. In the Interface Kit we have BWindow and BView. The BWindow has a messaging port to the app_server. Inside app_server, a ServerWindow (one instance per BWindow) receives the drawing commands which BViews send via their owning BWindow. BViews also have a server counterpart, which is called View. There is another class called Window. Window represents the on-screen object (it holds clipping, decorator, ...), while ServerWindow manages the communication and owns one Window object. View objects are owned by Window and mirror the client-side BView hierarchy, as long as those BViews are attached. Window also owns a DrawingEngine, which abstracts the entire drawing interface. The default implementation for all these calls is in the Painter class. Painter itself uses AGG for the general-purpose drawing algorithms, and it also has a whole bunch of optimized implementations for the cases where we can take a shortcut. So each Window instance has its own DrawingEngine and Painter instance.

Now comes the interesting part: Painter has the notion of being attached to a memory address which represents the frame buffer of the screen. At the moment, all the Painter objects owned by the Windows are attached to the same memory address. They each expect to be set up with a correct clipping region before any drawing calls are invoked on them (this happens in ServerWindow right before invoking drawing calls). The clipping is updated for all on-screen windows in an "atomic" operation by the Desktop object.

The Desktop object runs in its own thread, and whenever it calls into any ServerWindow/Window/View/... code, it is expected to hold the global window lock in write mode. This means all ServerWindow threads are then blocked. Whenever ServerWindow threads run, they hold the global lock in read-only mode, which means all other ServerWindow threads are allowed to run concurrently, and only the Desktop thread is blocked (since it always wants a write-lock). So when the global clipping is supposed to change, for example because a window is moved on screen, this operation blocks until all ServerWindow threads have released the read-lock.

Note that you can also invoke Desktop functions from a ServerWindow thread. In that case you have to first give up the read-lock, acquire the write-lock, call into the Desktop function, release the write-lock, and re-acquire the read-lock. I mention this just so you understand these bits of code when you come across them. That's basically the locking inside app_server, but there are additional utility classes which perform their own (inner) locking. Unfortunately, it is relatively easy to mess things up and cause a deadlock, so keep this in mind.

To recap: A ServerWindow thread only runs after it has successfully acquired the read-lock (or the write-lock, but the write-lock then blocks all other threads, including the Desktop thread). The Desktop thread only runs when it has successfully acquired the write-lock; it never read-locks.
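The lock hand-over described above can be sketched roughly as follows. This is only an illustration built on std::shared_mutex (C++17); GlobalWindowLockSketch, DesktopSketch and RebuildClipping() are hypothetical names, not the real app_server classes or functions.

```cpp
#include <cassert>
#include <shared_mutex>

// Hypothetical stand-in for the global window lock (not the real class).
class GlobalWindowLockSketch {
public:
    void ReadLock()    { fLock.lock_shared(); }
    void ReadUnlock()  { fLock.unlock_shared(); }
    void WriteLock()   { fLock.lock(); }
    void WriteUnlock() { fLock.unlock(); }

private:
    std::shared_mutex fLock;
};

// Hypothetical Desktop: the counter stands in for the global clipping state.
struct DesktopSketch {
    GlobalWindowLockSketch lock;
    int clippingGeneration = 0;

    // Stands in for any Desktop function that changes global state.
    // The caller must hold the write lock.
    void RebuildClipping() { ++clippingGeneration; }
};

// Called from a ServerWindow thread that currently holds the read lock.
void CallDesktopFromServerWindowThread(DesktopSketch& desktop)
{
    desktop.lock.ReadUnlock();   // give up the read lock first
    desktop.lock.WriteLock();    // blocks until all other readers are gone
    desktop.RebuildClipping();
    desktop.lock.WriteUnlock();
    desktop.lock.ReadLock();     // back to normal ServerWindow operation
}
```

Between giving up the read lock and getting the write lock, other threads may run, so anything the ServerWindow thread looked at before the hand-over has to be checked again afterwards.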

Anyway, the important bit is that all Painters are attached to the frame buffer, pointing to the same top-left pixel address, and they are already allowed to paint concurrently. All threads are blocked when the clipping is updated, then they can continue drawing again. The frame buffer which these Painters are attached to is called the "back-buffer". It resides in system RAM and is always 32 bit per pixel.

Since all Painter objects point to the top-left pixel of the back-buffer, all the coordinates of client drawing commands are converted to screen coordinate space! (This is important, since it needs to change later on.)
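To make the current situation concrete, here is a small sketch of two painters sharing one buffer, each restricted by its own clipping. All names are hypothetical, and the clipping is reduced to a single half-open rectangle, whereas the real clipping is a region.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Point { int x, y; };

// Half-open rectangle: right and bottom are exclusive (a simplification).
struct Rect { int left, top, right, bottom; };

Rect Intersect(const Rect& a, const Rect& b)
{
    return { std::max(a.left, b.left), std::max(a.top, b.top),
             std::min(a.right, b.right), std::min(a.bottom, b.bottom) };
}

// The shared buffer every painter is attached to, 32 bits per pixel.
struct FrameBufferSketch {
    int width, height;
    std::vector<uint32_t> pixels;
    FrameBufferSketch(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {}
};

class PainterSketch {
public:
    PainterSketch(FrameBufferSketch& buffer, Point windowLeftTop)
        : fBuffer(buffer), fWindowLeftTop(windowLeftTop),
          fClipping{0, 0, 0, 0} {}

    // Clipping is given in screen coordinates.
    void SetClipping(const Rect& screenRect) { fClipping = screenRect; }

    // Takes window-local coordinates and converts them to screen space.
    void FillRect(const Rect& local, uint32_t color)
    {
        Rect r = { local.left + fWindowLeftTop.x, local.top + fWindowLeftTop.y,
                   local.right + fWindowLeftTop.x, local.bottom + fWindowLeftTop.y };
        r = Intersect(r, fClipping);
        r = Intersect(r, Rect{0, 0, fBuffer.width, fBuffer.height});
        for (int y = r.top; y < r.bottom; y++) {
            for (int x = r.left; x < r.right; x++)
                fBuffer.pixels[y * fBuffer.width + x] = color;
        }
    }

private:
    FrameBufferSketch& fBuffer;
    Point fWindowLeftTop;
    Rect fClipping;
};
```

Two painters with disjoint clipping never touch the same pixels, which is why they can paint concurrently once the clipping has been set.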

Access to the frame buffer is managed by instances of a class called HWInterface. There is only a single instance of this class per Screen. In a regular desktop situation, this instance would be of the type AccelerantHWInterface. It is connected to the graphics card via the accelerant interface, by which the graphics card provides a frame buffer of its own. That frame buffer may have a different color space than the back-buffer. HWInterface provides utility methods for copying parts of the back-buffer (RAM) to the front-buffer (graphics card memory).
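As an illustration of such a copy with a color space change, here is a sketch which assumes the back-buffer pixels are 0xAARRGGBB values and the front-buffer happens to be 16-bit RGB565. Both the 16-bit target and the helper names are assumptions made for the example, not HWInterface's actual methods.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert one 32-bit pixel (0xAARRGGBB) to 16-bit RGB565.
inline uint16_t ToRGB565(uint32_t argb)
{
    uint32_t r = (argb >> 16) & 0xFF;
    uint32_t g = (argb >> 8) & 0xFF;
    uint32_t b = argb & 0xFF;
    return static_cast<uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Hypothetical helper: copy the half-open rectangle [left, right) x
// [top, bottom) from the back-buffer to the front-buffer. Both buffers
// are assumed to be `width` pixels wide.
void CopyBackToFront(const std::vector<uint32_t>& back,
    std::vector<uint16_t>& front, int width,
    int left, int top, int right, int bottom)
{
    for (int y = top; y < bottom; y++) {
        for (int x = left; x < right; x++) {
            std::size_t index = static_cast<std::size_t>(y) * width + x;
            front[index] = ToRGB565(back[index]);
        }
    }
}
```

Note that the copy only ever reads from system RAM and only ever writes to the front-buffer.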

Inside the DrawingEngine, you will come across code which performs this back-to-front copying. Here is another important bit of information: To avoid flickering, a Window object has the notion of an "update session". Basically, this means it collects all "dirty regions". Eventually it will have a chance to tell the client BWindow object (via a back channel, which is just regular BMessages being sent to the client BWindow) that it needs to paint these dirty regions. This is all asynchronous of course.

Some time later, the BWindow becomes aware that its Window companion in the server wants it to paint. It sends a command "begin update session". The first thing that happens is that Window locks the session's dirty region and tells the BWindow which views are affected. Updates which arrive while a session is ongoing are batched into the next (future) update session. Also, an update session puts the Window drawing code into a special mode: It does not perform back-to-front copying of painted regions. Once the BWindow has finished calling all the Draw() hooks of the affected BViews, it sends a command "end update session". This in turn finally triggers the back-to-front copy. This means there is no flickering when BViews draw in Haiku.

However, BViews can draw at any time. An application may just invoke someView->Draw(), or you can even invoke any drawing methods directly. When this happens, the server Window is not inside an update session, and the back-to-front copy happens immediately, with flickering and the obvious performance hit. That's why the proper way to redraw a BView is by calling Invalidate() instead of Draw().
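The update session logic can be sketched like this. It is a simplified model with hypothetical names: the dirty region is just a list of rectangles, and the back-to-front copy is replaced by a counter so the behavior is easy to observe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Rect { int left, top, right, bottom; };

class UpdateSessionSketch {
public:
    // Server side: something became dirty (expose, Invalidate(), ...).
    void MarkDirty(const Rect& r) { fPending.push_back(r); }

    // Client sent "begin update session": freeze the pending region.
    // Later MarkDirty() calls are collected for the next session.
    const std::vector<Rect>& BeginUpdate()
    {
        assert(!fInSession);
        fCurrent.swap(fPending);
        fInSession = true;
        return fCurrent;
    }

    // A drawing command has painted `r` into the back-buffer.
    void Painted(const Rect& r)
    {
        if (fInSession)
            return;          // deferred until EndUpdate()
        CopyToFront(r);      // outside a session: copied immediately
    }

    // Client sent "end update session": copy the session region once.
    void EndUpdate()
    {
        assert(fInSession);
        for (const Rect& r : fCurrent)
            CopyToFront(r);
        fCurrent.clear();
        fInSession = false;
    }

    std::size_t PendingCount() const { return fPending.size(); }

    int frontCopies = 0;     // counts back-to-front copies

private:
    void CopyToFront(const Rect&) { ++frontCopies; }

    std::vector<Rect> fPending;
    std::vector<Rect> fCurrent;
    bool fInSession = false;
};
```

Drawing inside a session produces no copies until EndUpdate(), while drawing outside a session triggers a copy for every single drawing command.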

So how do you implement compositing now? Basically, you would want to give each Window object currently on screen its own private frame buffer. The Painter objects are then each attached to a frame buffer of their own. Nothing would /need/ to change except that the coordinate conversion is done for the window coordinate system rather than the screen coordinate system. The buffer allocation management can be simple at first.
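The change in coordinate conversion might look like the following sketch. The function names and the WindowBufferSketch type are hypothetical; the point is only that the window's screen position drops out of the painting path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Point { int x, y; };

// Today: view-local coordinates end up in screen space.
Point ViewToScreen(Point p, Point viewOriginInWindow, Point windowOriginOnScreen)
{
    return { p.x + viewOriginInWindow.x + windowOriginOnScreen.x,
             p.y + viewOriginInWindow.y + windowOriginOnScreen.y };
}

// With per-window buffers: conversion stops at window space.
Point ViewToWindow(Point p, Point viewOriginInWindow)
{
    return { p.x + viewOriginInWindow.x, p.y + viewOriginInWindow.y };
}

// Hypothetical private buffer of one window.
struct WindowBufferSketch {
    int width, height;
    Point screenOrigin;             // only the compositor needs this
    std::vector<uint32_t> pixels;

    WindowBufferSketch(int w, int h, Point origin)
        : width(w), height(h), screenOrigin(origin),
          pixels(static_cast<std::size_t>(w) * h, 0) {}

    // Takes window coordinates; points outside the buffer are ignored.
    void SetPixel(Point windowPoint, uint32_t color)
    {
        if (windowPoint.x < 0 || windowPoint.x >= width
            || windowPoint.y < 0 || windowPoint.y >= height)
            return;
        pixels[windowPoint.y * width + windowPoint.x] = color;
    }
};
```

Moving the window only changes screenOrigin; the buffer contents stay valid, which is what the later optimizations build on.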

As your second step, you need something which actually composes all window buffers into the back-buffer at the right moment and transfers the result into the front-buffer inside the graphics memory. Notice that you absolutely *cannot* get rid of the back-buffer in system RAM. You want to access the graphics memory only in one direction -- writing -- you never want to read from it, because reading the graphics memory is insanely slow. Compositing, however, has to read the pixels it composes onto, which is why that work needs to happen in the back-buffer.
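A very naive software compositor along these lines could be sketched as below. The types are hypothetical, windows are opaque, and the whole screen is recomposed every time, whereas a real implementation would presumably restrict itself to dirty regions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SurfaceSketch {
    int width, height;
    std::vector<uint32_t> pixels;
    SurfaceSketch(int w, int h, uint32_t fill = 0)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, fill) {}
};

// One window: its private buffer plus its position on screen.
struct WindowLayerSketch {
    int left, top;
    SurfaceSketch buffer;
};

// Compose all window buffers, bottom-most first, into the back-buffer.
void Compose(const std::vector<WindowLayerSketch>& bottomToTop,
    SurfaceSketch& back, uint32_t desktopColor)
{
    std::fill(back.pixels.begin(), back.pixels.end(), desktopColor);
    for (const WindowLayerSketch& layer : bottomToTop) {
        for (int y = 0; y < layer.buffer.height; y++) {
            int screenY = layer.top + y;
            if (screenY < 0 || screenY >= back.height)
                continue;
            for (int x = 0; x < layer.buffer.width; x++) {
                int screenX = layer.left + x;
                if (screenX < 0 || screenX >= back.width)
                    continue;
                back.pixels[screenY * back.width + screenX]
                    = layer.buffer.pixels[y * layer.buffer.width + x];
            }
        }
    }
}

// Transfer the composed result; graphics memory is only written, never read.
void CopyToFront(const SurfaceSketch& back, SurfaceSketch& front)
{
    assert(front.pixels.size() == back.pixels.size());
    std::copy(back.pixels.begin(), back.pixels.end(), front.pixels.begin());
}
```

All reads happen on the window buffers and the back-buffer in system RAM; the front-buffer is only the target of the final copy.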

The third step (rather simple, but here for completeness) is to change BDirectWindow so that it points to the window buffer instead of the frame buffer.

When you trigger the compositing at the right time, you should now have a system which works exactly as before, only with a lot of memory overhead and no other benefits.

Obviously your next steps would then be to realize some benefits from your changes:

 * You would change the way dirty regions are triggered when parts of windows are exposed (all this code is in the Desktop class). Obviously, exposing parts of a window does not actually need anything to be redrawn anymore, since the exposed part is already fully valid in the private Window frame buffer. So expose events do not need to invalidate client BWindows anymore; they simply need to trigger an update at the compositing step. This change will be one of the biggest benefits, since it greatly reduces the CPU consumption when windows are moved on screen. You can consider it a performance optimization by "caching".

 * The only events that trigger actual redraw would be:
   - When a Window is resized. Note you can copy the valid
     region of the old buffer and limit the dirty region, but
     Views with B_FULL_UPDATE_ON_RESIZE, or views which follow
     the right/bottom edge of their parents still invalidate parts
     of the new buffer that you could copy from the old... the code
     already does this, so nothing needs to change.
   - When the client requests a redraw via BView::Invalidate().
   - When the View hierarchy changes.
   - When other properties change, like the decorator look.
   (All this already happens, no need to change anything except to
    avoid triggering the wrong kind of update when windows move.)

 * Your next chance of introducing some benefit is by using an alpha
   channel for Windows. Giving them a drop shadow would be nice.
   Compositing these in software may be fast enough, you just freed
   up some time by caching window contents.
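For the alpha channel idea, the per-pixel work in software is the usual "source over" blend. The sketch below assumes non-premultiplied 0xAARRGGBB pixels and an opaque destination (the back-buffer); the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Blend one 8-bit channel: src over dst with alpha in 0..255, rounded.
inline uint32_t BlendChannel(uint32_t src, uint32_t dst, uint32_t alpha)
{
    return (src * alpha + dst * (255 - alpha) + 127) / 255;
}

// Source-over for one pixel. The result is always opaque, since the
// destination is assumed to be the opaque back-buffer.
inline uint32_t BlendOver(uint32_t src, uint32_t dst)
{
    uint32_t alpha = src >> 24;
    uint32_t r = BlendChannel((src >> 16) & 0xFF, (dst >> 16) & 0xFF, alpha);
    uint32_t g = BlendChannel((src >> 8) & 0xFF, (dst >> 8) & 0xFF, alpha);
    uint32_t b = BlendChannel(src & 0xFF, dst & 0xFF, alpha);
    return 0xFF000000u | (r << 16) | (g << 8) | b;
}
```

A drop shadow would then be a mostly transparent black pixel blended over whatever lies below the window, for example BlendOver(0x80000000, 0xFFFFFFFF) for a half-transparent shadow on white.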

Once you have this system in place, you can start thinking about doing the compositing via the graphics card hardware. For this to work, you would set up the hardware to pull textures from main memory and compose in the graphics buffer. This allows you to get rid of the back-buffer in system RAM, but only for the situation when hardware acceleration is available. Obviously there is no driver for Haiku yet which has these capabilities. There isn't even an official accelerant API for this new functionality and the requirements of compositing.

There are some nice properties which can be coded into the compositor: It can be locked to the screen refresh rate, and you may cause window painting to happen in separate temporary buffers. The compositor would then compose the dirty parts of the back-buffer at a fixed rate, in its own thread. It requests buffers from each Window to do so, but the buffers would always be clean. When a Window draws, which is undisturbed by the compositor asking for the buffer, it draws into a temporary buffer. When it is done, it switches the new clean buffer for the old clean one by holding the compositor lock for a short time (also telling the compositor to recompose at the same time). This way you can never see dirty parts of any window anymore as a user, and since the compositor is locked to the screen refresh, you see no tearing either.
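The buffer switch described above could be sketched as follows. The names are hypothetical, and starting each new frame from a copy of the last clean buffer is an assumption of this sketch; the refresh-rate-locked compositor thread itself is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

typedef std::vector<uint32_t> Buffer;

// Hypothetical compositor state shared with the windows.
struct CompositorSketch {
    std::mutex lock;
    bool needsRecompose = false;
};

class DoubleBufferedWindowSketch {
public:
    explicit DoubleBufferedWindowSketch(std::size_t pixelCount)
        : fClean(std::make_shared<Buffer>(pixelCount, 0)) {}

    // Window thread: draw without holding the compositor lock, then
    // publish the result in a short critical section.
    template <typename DrawFunction>
    void DrawAndPublish(CompositorSketch& compositor, DrawFunction draw)
    {
        // Assumption: the temporary buffer starts as a copy of the clean one.
        std::shared_ptr<Buffer> scratch = std::make_shared<Buffer>(*fClean);
        draw(*scratch);                       // slow part, no lock held

        std::lock_guard<std::mutex> guard(compositor.lock);
        fClean = std::move(scratch);          // switch the buffers
        compositor.needsRecompose = true;     // ask for a recompose
    }

    // Compositor thread: call with compositor.lock held. The returned
    // pointer keeps the buffer alive even if the window switches later.
    std::shared_ptr<const Buffer> CleanBuffer() const { return fClean; }

private:
    std::shared_ptr<Buffer> fClean;
};
```

The compositor only ever sees complete frames: either the old clean buffer or the new one, never a buffer that is still being drawn into.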

Hope this helps,
-Stephan
