[haiku-development] Re: The new locking primitives and boosting -- I need some guidance here :)

  • From: Ingo Weinhold <ingo_weinhold@xxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Wed, 25 Jun 2008 05:37:04 +0200

On 2008-06-25 at 00:56:00 [+0200], André Braga <meianoite@xxxxxxxxx> wrote:

It surprises me anew every time how you manage to write a short novel on a 
relatively compact topic. :-)

[...]
> However, tracing is not the only way to know what code paths a process
> follows. When a process (or a thread) blocks, it blocks on a specific
> semaphore, a specific mutex, a specific condvar. We could use this
> knowledge to boost processes that we *know* to be interactive, because
> they blocked on a locking primitive that was declared on a code path
> that has to do with interactivity. But instead of feeding the
> scheduler with this information directly, we could make the process
> wakeup code itself adjust the process priority right before it gets
> inserted into the scheduler queues. As soon as this "privileged"
> scheduling round is finished, the process either blocks again on a
> locking primitive and gets boosted accordingly the next time it wakes
> up, or gets demoted back to its actual priority level.
> 
> I believe this is an incredibly lightweight, robust and precise way to
> detect interactivity, since it's not based on any heuristics, but the
> actual code path a process follows. If said process sometimes behaves
> CPU-bound and sometimes behaves interactive, it won't be boosted at
> the wrong moments; this is much better than *trying* to predict the
> future by using some CPU usage history window. Not to mention that
> there's zero code overhead except for an extra private call that
> creates a branded semaphore, a single numeric variable that holds the
> desired boosting level, and a couple extra lines of code on the wakeup
> code that tweaks the priority prior to reinserting the thread on the
> scheduler queues. And the subsystems can be gradually modified to use
> branded locking primitives: nothing in the system will be disturbed at
> all, except that the more key locking primitives are converted, the
> more responsive the system gets.
> 
> I actually can't find a single disadvantage in this approach.

Well, it adds overhead. The main question is whether this kind of 
optimization is necessary at all. As François already mentioned, threads 
that need very low latencies (and don't do much work) simply get a high 
priority. Analogously, worker threads should be given priorities lower than 
B_NORMAL_PRIORITY. IMHO, when these simple rules are respected, a relatively 
fair scheduler will automatically schedule threads in such a way that the 
interactive user experience is just fine.

Interactions with Haiku's GUI involve quite a few threads that mostly 
communicate via messaging. I wouldn't be much surprised to learn that most 
of those threads, most of the time, don't use their full quantum before 
they have to wait for a lock or the next message. To me, keeping scheduling 
paths short seems to be the best way to ensure low latencies.

> What I don't quite know so far is if this is at all feasible for R1.
> I'm not sure how the message-passing synchronisation harmonises with
> lower-level locking primitives (i.e., which messages to interactive
> subsystems are blocking and which aren't).

Not sure what you expect. Messaging uses ports, ports use semaphores. If 
you wait for a message on an empty message port or want to write a message 
to a full one, you effectively wait on a semaphore.

> I don't know if the new
> locking primitives that Ingo recently introduced permit such branding
> (although I don't quite see any reason it couldn't; I'm just being
> cautious).

I don't see why this shouldn't be possible.

> Anyway, I'd like to receive feedback regarding this idea I had, and if
> possible I'd love to implement it for the R1 timeframe, as I'm on the
> verge of becoming free of any college duties that held me back for the
> past several months. If we find out it's not feasible for R1, I'd
> really like to have it considered for R2.

To be honest, I'm very sceptical whether this kind of optimization is 
really useful.

CU, Ingo
