[haiku-development] Thread migration strategy

  • From: André Braga <meianoite@xxxxxxxxx>
  • To: "haiku-development@xxxxxxxxxxxxx" <haiku-development@xxxxxxxxxxxxx>
  • Date: Sat, 21 Mar 2009 21:58:30 -0300

Splitting this into a different thread in order not to hijack René's.
Copying and pasting my own text liberally :)

I'm considering this approach to implement load balancing and migration:

First, supposing the kernel is equipped with some means of discovering
the topology of the cores on the system, it would build a graph of
migration penalties, such that cores connected by slower means would
receive a higher penalty (SMT* < shared cache < SMT** < shared socket
< shared... backplane? [and Haiku goes NUMA!]). Let's assume a tree
where cores are the leaves and interconnections are the nodes. The
kernel could use default values for different classes of
interconnects, or it could actually measure the actual latency of
transferring a thread from one core to others sitting on different
branch of the tree.

(*,**: SMT is tricky, as discussed on a previous thread. Affinity
flags could be used to hint the scheduler about the proper situation
to run a thread on a logical core, in order not to trash caches and so
on.)

The graph of penalties would be represented by a compact triangular
matrix, since the penalties would be symmetrical between a pair of
cores and its reciprocal, i.e., (Ca, Cb) and (Cb, Ca).

Let's assume that migration would either happen explicitly, where an
application would use the proper syscalls to distribute its threads to
different cores, or implicitly, for load-balancing. So only the latter
concerns the scheduler and its load balancer.

I'm under the strong impression that this is already implemented in
Haiku, but anyway, each core should have some book-keeping structure
to note things like load average and so on. Each core would also store
pointers for N "hog" threads, that is, the threads that consume the
most of their allocated timeslices. Those threads are the prime
candidates for being migrated to lesser-loaded cores. There would be
an array of pointers to each such structures.

When the load-balancer kicks in, it sorts this array using the load
average as the key, while computing the average load between cores, It
then proceeds to examine the most loaded cores for "hog" threads,
stopping when it reaches a core whose load is smaller than the
average. Let K be the quantity of cores above this threshold.

It then does the reverse, comparing each of the selected loaded cores
to the less loaded ones, adding the migration penalty to the load
average of the candidate destination core. The penalties should be
chosen so that they represent the time wasted on migration and
restoring the caches, but being reasonable enough so that the
scheduler wouldn't avoid migrating to an idling core just because it's
on a distant branch of the interconnects tree.

Then the load-balancer migrates this thread to the best destination
core among the candidates. Let W be the quantity of cores considered
for receiving migrated threads, so that K <= W <= 2K. This is so that
the algorithm scales reasonably well for a large number of cores.

In order to avoid the ping-pong effect, a just-migrated thread could
have a "cool-down" period where it would not be allowed to get to the
"hogs" list.


What do you people think?


Cheers,
A.

Other related posts: