[openbeos] GSoC ticket #1069 (thread scheduler) application

  • From: "André Braga" <meianoite@xxxxxxxxx>
  • To: openbeos@xxxxxxxxxxxxx, open-beos-kernel-devel@xxxxxxxxxxxxxxxxxxxxx
  • Date: Sun, 8 Apr 2007 20:00:08 -0300

Hi,

As some of you may know, I've applied for ticket #1069, "Create an
O(1) thread scheduler with CPU affinity and soft real-time support
which targets desktop responsiveness". I designed a nice O(1)
thread scheduler a few years ago, and I'll be using it as the basis
for the new Haiku scheduler, should my application be accepted.

There are a few things I'd like to discuss before potentially wasting
time either implementing stuff in a way people won't agree with, or
trying several (flawed in some way) approaches before settling on a
good compromise. Here are my questions:

Should we follow the letter of the "Thread Priorities" entry in the
BeBook (Kernel Kit, Thread Concepts) and allow RT threads to run
uninterrupted until some event leads to unscheduling? Wouldn't it be
better to set some threshold on how long an RT thread is allowed to
run? I remember Axel pointing towards the latter.
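Roughly the kind of threshold I have in mind, as a sketch only: the
names are made up (rt_quantums_used is a hypothetical accounting
field, RT_BUDGET_QUANTUMS a hypothetical tuning knob), and only the
priority constant is the real BeOS one:

    #include <stdbool.h>

    #define B_REAL_TIME_DISPLAY_PRIORITY 100 /* BeOS: priorities >= 100 are RT */
    #define RT_BUDGET_QUANTUMS 16            /* hypothetical cap, to be tuned */

    struct thread {
        int priority;
        int rt_quantums_used;                /* hypothetical accounting field */
    };

    /* Called on every quantum expiry; returns true when an RT thread
       has exhausted its budget and should be rescheduled anyway. */
    static bool rt_budget_exceeded(struct thread *t)
    {
        if (t->priority < B_REAL_TIME_DISPLAY_PRIORITY)
            return false;                    /* only police real-time threads */
        if (++t->rt_quantums_used < RT_BUDGET_QUANTUMS)
            return false;                    /* still within budget */
        t->rt_quantums_used = 0;             /* budget spent: reset and yield */
        return true;
    }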

Should we use separate structures, one to reference blocked threads
and another to reference ready threads? My initial impression is that
it's not really necessary, and that the overhead of transferring
threads from one structure to the other is probably higher than that
of skipping candidate threads that can't run until a suitable one is
found. If, for example, 66% of threads are blocked, it takes 3
comparisons on average to find a ready thread, and O(n) in the
degenerate case. We might avoid the degenerate case by exploiting
workload characteristics. For instance, assume there is a queue for
every priority level and that threads within a level are chosen by a
round-robin algorithm: recently unscheduled threads go to the end of
their queue, so the odds are that a thread will be ready again by the
time the scheduler reaches it. Likewise, if we boost or reduce a
thread's priority based on how it was unscheduled, and always insert
it at the tail of its new priority queue, it will probably be ready
to run by the time it is picked again. These effects might just
compensate each other, and the degenerate case, which by Murphy's law
WILL eventually happen, would be so rare it wouldn't really matter
much.
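To make the skipping idea concrete, here is roughly what the
selection loop would look like in the single-structure case (a
sketch with made-up names, not actual NewOS code). With a fraction p
of threads blocked, the expected scan length is 1/(1-p), hence the 3
comparisons above for p = 2/3:

    #include <stddef.h>

    #define NUM_PRIORITIES 121               /* BeOS priorities run 0..120 */

    enum thread_state { THREAD_READY, THREAD_BLOCKED };

    struct thread {
        enum thread_state state;
        struct thread *queue_next;           /* per-priority round-robin queue */
    };

    static struct thread *queue_head[NUM_PRIORITIES];

    /* Scan from the highest priority downwards, leaving blocked
       threads in place and skipping them. O(1) when the head of a
       queue is ready; O(n) in the degenerate case where almost
       everything is blocked. */
    static struct thread *pick_next_thread(void)
    {
        for (int prio = NUM_PRIORITIES - 1; prio >= 0; prio--) {
            for (struct thread *t = queue_head[prio]; t != NULL; t = t->queue_next) {
                if (t->state == THREAD_READY)
                    return t;                /* first ready candidate wins */
            }
        }
        return NULL;                         /* fall back to the idle thread */
    }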

OTOH, actual O(1) scheduling could be attained by separating those
structures, so it is definitely desirable. So, if no one objects, and
despite my own previous reasoning and gut feelings, this is probably
how I'll proceed. I believe the blocked-threads structure should take
into account the reason why a thread was unscheduled. Should the
structure actually be organized by unscheduling reason (the
categories would be, for example, "blocked on slow I/O", "blocked on
fast I/O", "slept for x (m,n,µ)s", "waiting on a semaphore", etc.),
with threads kept partially ordered by expected wakeup time inside
each category? Or should we just use one big clock structure with
cues every y seconds (for some appropriate submultiple of a second),
order threads by said expected wakeup time, and walk this whole
structure every time the scheduler runs? I tend to prefer the finer
grain, but it might just not pay off.
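By "big clock structure" I mean something along the lines of a timer
wheel; a rough sketch, all names made up:

    #define WHEEL_SLOTS 256     /* e.g. one slot per tick of the cue period */

    struct blocked_thread {
        unsigned long wake_tick;          /* absolute tick of expected wakeup */
        struct blocked_thread *next;
    };

    static struct blocked_thread *wheel[WHEEL_SLOTS];

    static void wheel_insert(struct blocked_thread *t)
    {
        unsigned slot = t->wake_tick % WHEEL_SLOTS;
        t->next = wheel[slot];            /* O(1) insert, unsorted in a slot */
        wheel[slot] = t;
    }

    /* Called once per tick: wake everything in the current slot whose
       time has come; threads that hashed here for a later revolution
       of the wheel stay put. */
    static void wheel_tick(unsigned long now)
    {
        unsigned slot = now % WHEEL_SLOTS;
        struct blocked_thread **link = &wheel[slot];
        while (*link != NULL) {
            struct blocked_thread *t = *link;
            if (t->wake_tick <= now) {
                *link = t->next;          /* unlink... */
                /* make_thread_ready(t); -- hypothetical hand-off */
            } else {
                link = &t->next;
            }
        }
    }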

Is O(1) a concern for scheduling only, or is it desirable for every
thread operation, like find_thread()? This is just the classic
time-space tradeoff: we either search linearly for a thread, or use a
separate, more efficient structure to map IDs to their locations,
like a hashtable.
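For illustration, the hashtable variant could be as simple as this (a
sketch; lookup_thread and the table layout are made up, not the
existing NewOS hash API):

    #include <stddef.h>

    typedef int thread_id;

    #define THREAD_HASH_SIZE 1024           /* power of two for cheap masking */

    struct thread {
        thread_id id;
        struct thread *hash_next;           /* chain link, costs one pointer */
    };

    static struct thread *thread_hash[THREAD_HASH_SIZE];

    /* Expected O(1) with a decent spread, vs. O(n) for a linear walk
       over the global thread list. */
    static struct thread *lookup_thread(thread_id id)
    {
        struct thread *t = thread_hash[(unsigned)id & (THREAD_HASH_SIZE - 1)];
        while (t != NULL && t->id != id)
            t = t->hash_next;
        return t;
    }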

Finally (for now; you'll surely hear lots of further questioning as I
move along), the previous question brought up another one I'd like to
generalize here: how big a concern is memory usage? I'm talking about
bit-twiddling here. I could, for example, use some pointer math with
8- and 16-bit offsets instead of storing full pointers in a 32- or
64-bit word, and then use the saved bits to store extra information;
or it might not be worth the trouble. I could forgo the extra
pointers, use singly-linked lists and walk the list to perform
removals, or I could use doubly-linked lists and make removal a
constant-time operation. I'm pretty much sold on the idea that a
custom memory allocator tailored to thread creation and access
patterns is VERY desirable, so that thread structures are allocated
close together according to their base priority classes and other
relevant information, better exploiting spatial and temporal
locality. I'm also not ruling out using memory prefetch instructions
straight in the scheduler code. w00t.
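To illustrate the offset trick: if a custom allocator keeps all
thread structures in one contiguous pool, a 16-bit slot index can
stand in for a full pointer and the saved bits hold flags (made-up
names; slot 0 could be reserved as a NULL sentinel):

    #include <stdint.h>

    #define POOL_SLOTS 65536                /* what a 16-bit index can address */

    struct thread {
        uint16_t next_index;                /* 16-bit "pointer" into the pool */
        uint16_t flags;                     /* the bits saved over a full pointer */
    };

    static struct thread *thread_pool;      /* one contiguous, cache-friendly pool */

    static inline struct thread *thread_from_index(uint16_t index)
    {
        return &thread_pool[index];         /* base + offset, not a raw pointer */
    }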


I keep suggesting alternatives ranging from naïve to sophisticated
because some implementations may demand considerable kernel-code
surgery to accommodate all the changes required to support them. That
would probably mean the code won't resemble its NewOS heritage any
more, making syncing with the NewOS codebase difficult, if not
impossible. I don't know how desirable that is, and I thought it
would be best to discuss these matters before tearing the kernel
apart just to reassemble it in a way people won't be comfortable
with. I'm trying to find the middle ground between obviousness and
cunning, elegance and efficiency, and all the compromises we've come
to know and love/hate in Computer Science.

Any input is much appreciated.

Thanks,
André
