[haiku-appserver] Re: investigating some bugs

  • From: "Stephan Assmus" <superstippi@xxxxxx>
  • To: haiku-appserver@xxxxxxxxxxxxx
  • Date: Tue, 29 Mar 2005 15:48:25 +0200 CEST

Hi again,

> > sorry if I made you feel like I'm putting pressure on you. It's 
> > just 
> > that I got so excited about the possibility of showing a running 
> > Haiku 
> > with app_server at BeGeistert. Now, your explanation of the 
> > problem 
> > makes much sense. Maybe I can even find a short term fix until you 
> > have 
> > more time to fix it for real.
> 
>       :-)) I should borrow you my machine for BeGeistert, as those bugs
> do not reproduce. :-))))

I should try turning off one CPU maybe? I'm on an ancient 2 x 350MHz 
PII. If there are bugs in the locking, it's just the nature of those 
problems to only show up sometimes or only on other machines... The 
demonstration machine will most likely be a single CPU notebook though. 
;-) 

> > It happened (while doing exactly the stuff you describe above) when 
> > I 
> > used the UpdateQueue version of ViewHWInterface::Invalidate().
> 
>       A little more info please as I understand nothing...

To reproduce the bug, you were doing the same thing as me. Dragging 
around and resizing windows. Now, about ViewHWInterface::Invalidate(), 
here is the structure of my drawing backend.

DisplayDriverPainter
        * uses Painter
        * uses HWInterface

Painter
        * is attached to a RenderingBuffer
        * provides all BView style drawing functions

HWInterface
        * provides two RenderingBuffers, BackBuffer and FrontBuffer

Now, HWInterface is supposed to be the abstraction of a graphics card. 
Therefore, it provides all stuff like SetMode(display_mode) and so on. 
The only important thing however is the BackBuffer, to which 
DisplayDriverPainter attaches its Painter instance. Then there is a 
HWInterface::Invalidate(BRect) method, which tells the HWInterface that 
a certain part of the BackBuffer contains new contents.

So in any DisplayDriverPainter drawing function, this is essentially 
what happens:

  if (Lock()) {
    BRect areaEnclosingAllPixelsHavingChanged =
        fPainter->DoTheDrawingInBackBuffer();
    fHWInterface->Invalidate(areaEnclosingAllPixelsHavingChanged);
    Unlock();
  }

Or at least this is how it will eventually work. Right now, the 
DisplayDriverPainter calculates the dirty area itself. Never mind.
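
To make the inline path concrete, here is a minimal, self-contained 
sketch of the draw-then-invalidate pattern. Everything in it is a 
stand-in and an assumption: Rect replaces BRect, Buffer replaces 
RenderingBuffer, FillRect plays the role of a Painter drawing function, 
and HWInterfaceSketch is not the real HWInterface class. It only shows 
that the front buffer stays untouched until Invalidate() is called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for BRect (integer, inclusive coordinates).
struct Rect {
    int left, top, right, bottom;
};

// Hypothetical stand-in for a RenderingBuffer: an 8-bit pixel grid.
struct Buffer {
    int width, height;
    std::vector<std::uint8_t> pixels;

    Buffer(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {}

    std::uint8_t& At(int x, int y) {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Not the real HWInterface, just the two buffers and the two methods
// discussed in this mail.
class HWInterfaceSketch {
public:
    HWInterfaceSketch(int w, int h) : fBack(w, h), fFront(w, h) {}

    Buffer& BackBuffer() { return fBack; }
    Buffer& FrontBuffer() { return fFront; }

    // Inline variant: the transfer happens right inside the call.
    void Invalidate(const Rect& area) { CopyBackToFront(area); }

    void CopyBackToFront(const Rect& area) {
        for (int y = area.top; y <= area.bottom; y++)
            for (int x = area.left; x <= area.right; x++)
                fFront.At(x, y) = fBack.At(x, y);
    }

private:
    Buffer fBack;
    Buffer fFront;
};

// Painter stand-in: draws into the given buffer and returns the area
// enclosing all pixels that changed.
Rect FillRect(Buffer& buffer, const Rect& r, std::uint8_t color) {
    for (int y = r.top; y <= r.bottom; y++)
        for (int x = r.left; x <= r.right; x++)
            buffer.At(x, y) = color;
    return r;
}
```

A drawing function would call FillRect() on BackBuffer() and pass the 
returned Rect to Invalidate(), with the locking from the snippet above 
wrapped around both calls.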

HWInterface also has a method CopyBackToFront(BRect), which copies the 
provided area from the BackBuffer to the FrontBuffer. Right now, if you 
look in ViewHWInterface, Invalidate() is just implemented as calling 
CopyBackToFront(). So the transfer from back to front buffer is 
_inline_ with the drawing. What UpdateQueue is eventually supposed to 
do is to accumulate the dirty regions and do the back to front 
transfer in another thread, and therefore _decoupled_ from the actual 
drawing. So DisplayDriverPainter can keep drawing, and UpdateQueue will 
keep track of the dirty region and eventually cause a back to front 
transfer. For example, in a more advanced version of UpdateQueue, it 
could do the transfer within the time of a vertical blank, so that 
front buffer updates are synced with the monitor refresh.
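
The decoupled variant could look roughly like the sketch below. This is 
not the actual UpdateQueue code, only an illustration of the idea under 
some assumptions: standard C++ threads are used instead of the Be 
threading API, Rect is a hypothetical stand-in for BRect, the dirty 
area is kept as a single bounding rectangle rather than a real region, 
and the back to front transfer is passed in as a callback.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for BRect (integer, inclusive coordinates).
struct Rect {
    int left, top, right, bottom;
    bool IsValid() const { return left <= right && top <= bottom; }
};

// Smallest rectangle enclosing both a and b (a simplification; a real
// implementation would more likely keep a region of several rects).
inline Rect Union(const Rect& a, const Rect& b) {
    if (!a.IsValid()) return b;
    if (!b.IsValid()) return a;
    return Rect{std::min(a.left, b.left), std::min(a.top, b.top),
                std::max(a.right, b.right), std::max(a.bottom, b.bottom)};
}

class UpdateQueueSketch {
public:
    explicit UpdateQueueSketch(std::function<void(const Rect&)> copyBackToFront)
        : fCopy(std::move(copyBackToFront)), fThread([this] { RunLoop(); }) {}

    // Flushes whatever is still dirty, then stops the worker thread.
    ~UpdateQueueSketch() {
        {
            std::lock_guard<std::mutex> guard(fLock);
            fQuit = true;
        }
        fWakeUp.notify_one();
        fThread.join();
    }

    // Called from the drawing code: only records the area and returns.
    void Invalidate(const Rect& area) {
        assert(area.IsValid());
        {
            std::lock_guard<std::mutex> guard(fLock);
            fDirty = Union(fDirty, area);
        }
        fWakeUp.notify_one();
    }

private:
    void RunLoop() {
        std::unique_lock<std::mutex> lock(fLock);
        for (;;) {
            fWakeUp.wait(lock, [this] { return fQuit || fDirty.IsValid(); });
            if (fDirty.IsValid()) {
                Rect area = fDirty;
                fDirty = Rect{0, 0, -1, -1};
                lock.unlock();   // drawing may continue during the copy
                fCopy(area);
                lock.lock();
            } else if (fQuit) {
                return;
            }
        }
    }

    std::function<void(const Rect&)> fCopy;
    std::mutex fLock;
    std::condition_variable fWakeUp;
    Rect fDirty{0, 0, -1, -1};
    bool fQuit = false;
    std::thread fThread;   // declared last so it starts after the rest
};
```

Several Invalidate() calls that arrive while a transfer is in progress 
end up in one combined transfer afterwards. A vertical blank synced 
version would presumably wait for the retrace in RunLoop() before 
calling the copy function.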

Ok, back to the original problem: UpdateQueue already works, and it 
makes the drawing faster. When I enable it in 
ViewHWInterface::Invalidate(), I'm more likely to see the busy loop bug 
in the RootLayer1 thread. I think UpdateQueue is built so simply that 
I don't expect it to be the cause of problems in a completely unrelated 
part of app_server. There is no change at all to the semantics of the 
DisplayDriver interface. However, since the drawing becomes a little 
faster, I tend to attribute the problem to the change in timing.

> > When I 
> > added more debug output, the less often the bug happened,
> 
>       Hmm... the update code might be the cause...

Whose update code? The one in RootLayer calculating the invalid regions 
or my update code that causes a back to front buffer transfer? I rather 
think that adding more debug output made the drawing slower again, so 
the bug didn't show up anymore. The situation is sometimes very 
different on a dual CPU machine such as mine, where these timing 
related locking problems can become much more apparent.

Hope this explains things.

If you find some time to work on the update code, here is something you 
should consider. In the current design, I'm seeing that update requests 
pile up. Especially the very slow drawing of ViewDriver and the not 
much faster Painter bring these problems up. However, if nothing is 
changed in that design, the problems will be there in the Haiku version 
just as well. What I mean is this: Slow drawing operations are not a 
problem per se. Piling up drawing requests however is a huge problem, 
and that is the _true_ cause of the ViewDriver's slowness. So multiple 
update requests should be combined "by some mechanism" and result in 
only one drawing request. There are multiple approaches to that 
mechanism. I don't have a good overview of what happens in the rest of 
the app_server yet, so I might be totally off here. I think that the 
server side thread that exists for each window should try to pull 
update information with a timeout, while the thing in app_server that 
calculates the necessary updates should push the update region with 
hard locking. I.e., there is a lock that serializes access to the update 
region (I don't know what locks already exist, and if there should be 
one per server object or if there is one object holding the clipping 
information of all layers or what not...). Anyway, the part that sends 
update requests down to the client windows, which result in them 
calling Draw() of their child views - this part needs to try to access 
the update region with a timeout. Next time it tries to access it, it 
might already have been combined from the previous state and more 
recent goings-on in the app_server. Am I talking sense?
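
As a rough illustration of that push/pull idea, here is a sketch under 
stated assumptions. PendingUpdate, Push() and Pull() are made-up names, 
not existing app_server classes or methods. Rect stands in for BRect, 
the update region is simplified to one bounding rectangle, and 
std::timed_mutex stands in for whatever lock the server would really 
use.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <mutex>

// Hypothetical stand-in for BRect (integer, inclusive coordinates).
struct Rect {
    int left, top, right, bottom;
    bool IsValid() const { return left <= right && top <= bottom; }
};

// Smallest rectangle enclosing both a and b.
inline Rect Union(const Rect& a, const Rect& b) {
    if (!a.IsValid()) return b;
    if (!b.IsValid()) return a;
    return Rect{std::min(a.left, b.left), std::min(a.top, b.top),
                std::max(a.right, b.right), std::max(a.bottom, b.bottom)};
}

// Hypothetical holder of the not yet drawn update area of one window.
class PendingUpdate {
public:
    // Producer side (the code calculating the necessary updates):
    // hard locking, waits as long as it takes and merges the new area
    // into whatever is already pending.
    void Push(const Rect& area) {
        assert(area.IsValid());
        std::lock_guard<std::timed_mutex> guard(fLock);
        fRegion = Union(fRegion, area);
    }

    // Consumer side (the per-window server thread): gives up when the
    // lock cannot be had within the timeout. On success, hands over
    // everything that piled up as ONE combined area and resets it.
    bool Pull(std::chrono::milliseconds timeout, Rect& area) {
        std::unique_lock<std::timed_mutex> guard(fLock, timeout);
        if (!guard.owns_lock() || !fRegion.IsValid())
            return false;
        area = fRegion;
        fRegion = Rect{0, 0, -1, -1};
        return true;
    }

private:
    std::timed_mutex fLock;
    Rect fRegion{0, 0, -1, -1};
};
```

With this, three Push() calls followed by one Pull() result in a single 
update request covering all three areas, instead of three requests 
queued one after the other.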


Best regards,
-Stephan
