[haiku-development] Re: scheduler (was GCC 2 x GCC 4)

From: "André Braga" <meianoite@xxxxxxxxx>
To: haiku-development@xxxxxxxxxxxxx
Date: Thu, 28 Aug 2008 05:05:05 -0300

On Thu, Aug 28, 2008 at 03:02, Ari Haviv <arielbhaviv@xxxxxxxxx> wrote:
> Hopefully you'll get better soon.
> I meant, what is the structure now looking like as you have apparently
> rewritten it several times with some academic guidance. How does it
> compare to ULE and CFS?
> http://www.haiku-os.org/blog/meianoite goes up to some time last year.

Now it feels a lot like a calendar queue, however it looks like
anything but one. :)

I won't explain in detail here how I came up with the current model;
suffice to say that the one described in the last blog posts I wrote
gets FUBARed when threads enter and leave the runqueues.

The structure is still a tree of runqueues, each runqueue
corresponding to a priority level. The key used to sort the tree is a
number that is always increased by the ratio between the summation of
all the thread priorities in all runqueues and the summation of all
the thread priorities in a given priority level's runqueue (the ratio
is between global to local so that the ratio is bigger than 1 and
higher precision is attained by shifting left the divident by a few
bits, and so that the increment stays an integer; IOW, you could call
this ratio a... stride ;)).

The threads in the runqueues themselves could be sorted by % of
quantum used, and even promoted to a higher priority level if they
consistently use less than the quantum, but I haven't experimented
with that yet.

I expect the RBTree of runqueues to be less than 3 levels deep on the
common scenario, and the worst case scenario is 7 levels deep.
Actually, because the RBTree can have an imbalance of 2n, those
numbers could theoretically double, but only for very very brief
moments. And since there's one tree per CPU, they'll tend to be quite
small indeed on SMP machines.

Now, about ULE and CFS.

They both try to model Shortest Job First, ULE by attempting to
segregate interactive and batch threads and then get them sorted by
estimated CPU usage, but ULE uses a simple data structure (a calendar
queue); CFS does this sorting as well, but it uses a huge RBTree that
models a virtual timetable where threads deemed shorter-running are
given the first slots.

Forget about directly comparing it against ULE or CFS. In Unix,
priorities do not express roles, while in BeOS they do. Most of the
algorithms used in ULE and CFS are there to add expressiveness to the
otherwise mute defaut priority level Unix processes are spawned with.
Quite frankly, the Linux and BSD camps spent years banging their heads
trying to support 'nice' leves and fairness because they approached
these problems with a Unix processes mindset. This is not 1970
anymore... (Fortunately or unfortunately, since contemporary pop music
essentially sucks, and in this sense the 70's ruled :D)

I did draw some inspiration from the *data structures* they used in
recent times in both ULE and CFS (namely the calendar queue and the
RBTree), but the algorithms themselves didn't really help much.

Now, back to the scheduler-without-a-name (SWAN?):

The current C++ code is broken due to unfinished refactoring... Eh. If
you're not afraid of looking at uncommented prototype code put
together in a rush, months ago I cooked up a behavioral simulator in
Lua of what I wanted to achieve. For those unfamiliar with the
semantics of Lua, just *ignore* the code since I cheated by using Lua
tables instead of implementing the RBTree for absolutely no gain (it's
just a behavioral simulator, not a benchmark!) and play with the stuff
from line 115 on.

Hints: add an extra dash in line 119 so that it reads "---[[" instead
of "--[["; this will print the elements IDs as they're scheduled; but
if you do, remember to change the value in line 138 to a much smaller
number, preferrably a multiple of cumulativePriority (i.e., summation
of all the thread priorities in all runqueues), else it will print
over 9000 lines of text! In the example, cumulativePriority = 10 + 10
+ 20 + 20 + 30 = 90, so I suggest using a smaller multiple of 90.

You can find the code here: http://pastebin.com/m5330479a
And you can run it by pasting it here: http://www.lua.org/cgi-bin/demo

BTW, for those interested in the details and shortcomings (given the
context above) of both ULE and CFS, you may find the links below
interesting (they're from the author of ULE, btw):
http://jeffr-tech.livejournal.com/3729.html
http://jeffr-tech.livejournal.com/12933.html

Cheers,
A.

Follow-Ups:
- [haiku-development] Re: scheduler (was GCC 2 x GCC 4)
  - From: André Braga

References:
- [haiku-development] Re: scheduler (was GCC 2 x GCC 4)
  - From: Ari Haviv

[haiku-development] Re: scheduler (was GCC 2 x GCC 4)

Other related posts: