[interfacekit] Proposal: Only one thread per application (not window) on app_server side without polling
- From: "Marcus Overhagen" <ml@xxxxxxxxxxxx>
- To: interfacekit@xxxxxxxxxxxxx
- Date: Sat, 26 Jul 2003 01:40:45 GMT
Hi, please read everything and comment on this.
Every BWindow currently uses the PortLink class for its connection
to the app_server, where one thread per window reads from the port.
This is a problem because a large number of BWindows or offscreen
bitmaps creates a large number of threads in the app_server, which
will eventually run out of virtual address space (each thread needs
its own stack).
The obvious solution is to have only one thread (or at least only a small
number of them) per application, regardless of the number of windows (or ports).
Limitations of the current BeOS kernel, and also of the NewOS and OpenBeOS
kernels, prevent the use of a simple WaitForMultipleObjects() call to wait
on a number of ports at the same time. That's the reason why one thread
for each connection port is created on the app_server side.
The best solution would be to create a WaitForMultipleObjects() function.
However, this would involve very large changes to the kernel, and will
not be done for R1.
Another solution would be continuous polling of all window ports,
but that would have a huge performance impact, since each read_port()
would have to be used with a rather small timeout to keep the response
time of the GUI low.
My proposal is different:
The number of BWindows or offscreen bitmaps is not limited, but a granularity
of 1024 is used (because one 4096-byte memory page can hold 1024 int32 values).
For up to 1024 ports, one semaphore is used, and one 4096-byte (one page)
memory block is shared between the application and the app_server.
The app_server assigns an int32 in the memory block, to be used as a port
event counter, by giving the BWindow or offscreen bitmap the offset into the
shared memory block. The app_server keeps track of used and unused int32
slots in the memory block so that it can reuse them. (This can be done using
a hash map or a 128-byte bitmap, one bit per slot.) It also keeps track (for
efficiency reasons) of the highest offset that is assigned as an int32 event
counter.
In addition, it needs to keep whatever state data it needs per window
(like the port used), and needs to know which int32 event counter belongs
to which state data (to know the port id, and more).
This needs to be stored in a hash map indexed by the event counter
(or in an array), since the state is no longer thread local: one thread
now serves multiple windows.
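
The bookkeeping described above could be sketched roughly as follows.
This is only an illustration in portable C++, not app_server code: the
names CounterSlots and WindowState are hypothetical, the per-window state
is reduced to a port id, and the bitmap variant is chosen over the hash
map for the used/unused tracking.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical per-window state; the proposal only names the port id.
struct WindowState {
    std::int32_t port;
};

// Tracks which of the 1024 int32 counters in one shared page are in use.
class CounterSlots {
public:
    static const int kSlots = 1024;   // int32 counters in one 4096-byte page

    // Returns the lowest free slot index, or -1 if all 1024 are in use.
    int Allocate(const WindowState& state)
    {
        for (int i = 0; i < kSlots; i++) {
            if (!InUse(i)) {
                fBitmap[i / 8] |= std::uint8_t(1u << (i % 8));
                fStates[i] = state;
                if (i > fHighest)
                    fHighest = i;
                return i;
            }
        }
        return -1;
    }

    // Marks a slot as unused so that a later Allocate() can hand it out again.
    void Free(int slot)
    {
        fBitmap[slot / 8] &= std::uint8_t(~(1u << (slot % 8)));
        fStates.erase(slot);
        // Shrink the scan bound past any trailing free slots.
        while (fHighest >= 0 && !InUse(fHighest))
            fHighest--;
    }

    bool InUse(int slot) const
    {
        return ((fBitmap[slot / 8] >> (slot % 8)) & 1) != 0;
    }

    // Highest slot currently assigned, or -1 when no slot is in use.
    int Highest() const { return fHighest; }

    // State data belonging to a slot, or nullptr if the slot is unused.
    const WindowState* StateFor(int slot) const
    {
        auto it = fStates.find(slot);
        return it == fStates.end() ? nullptr : &it->second;
    }

private:
    std::uint8_t fBitmap[kSlots / 8] = {};          // 128 bytes, one bit per slot
    std::unordered_map<int, WindowState> fStates;   // slot index -> state data
    int fHighest = -1;
};
```

The offset handed to the BWindow would then be the slot index times 4, and
the server thread only needs to scan slots 0 through Highest().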
Communication will work like this:
When the first BWindow or offscreen bitmap in an application is created,
the initial 4096 bytes of shared memory, one semaphore (initialized to 0),
and the server-side thread are created.
The BWindow is assigned the first 4 bytes of the shared memory, to be used
as an event counter (initialized to 0), and also one port.
The server-side thread blocks waiting on the semaphore.
As more BWindows are created, each one is assigned another place in the
4096-byte block as its event counter.
When BWindows are destroyed, the app_server marks their event counters as
no longer used.
When a BWindow needs to transfer data, it does:
    write_port(window_port, data);
    atomic_add(event_counter_pointer, 1);
    release_sem(event_sem);
The server-side thread checks the event counters and BWindow ports like this:
    for (;;) {
        acquire_sem(event_sem);
        int32 allcount = 0;
        // max_assigned_event_counter is the highest index in use, so include it
        for (int32 i = 0; i <= max_assigned_event_counter; i++) {
            if (!is_event_counter_in_use(i))
                continue;
            // event_counter is the shared block viewed as an int32 array,
            // so it is indexed by i, not by the byte offset i * 4.
            // atomic_read() stands for an atomic load of the counter.
            int32 count = atomic_read(&event_counter[i]);
            if (count > 0) {
                // find the BWindow object assigned this (i) event counter,
                // read count times from the port assigned to this BWindow
                // and process the data. The port read will never block.
                // do drawing here
                allcount += count;
                atomic_add(&event_counter[i], -count);
            }
        }
        allcount -= 1; // the event sem was already acquired once above
        if (allcount > 0) {
            // account for all other processed events; each one has a
            // matching release_sem(), so this will not block for long
            acquire_sem_etc(event_sem, allcount, 0, 0);
        }
    }
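
To make the semaphore accounting easier to check, here is a small portable
model of the scheme. It is a sketch under stated assumptions, not BeOS
code: the Semaphore class stands in for the kernel semaphore, a std::deque
stands in for each port, and the names Window, Send and DrainOnce are
made up for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

// Stand-in for a kernel counting semaphore (release_sem / acquire_sem_etc).
class Semaphore {
public:
    void Release()
    {
        std::lock_guard<std::mutex> lock(fLock);
        fCount++;
        fCond.notify_all();
    }
    void Acquire(int n = 1)
    {
        std::unique_lock<std::mutex> lock(fLock);
        fCond.wait(lock, [&] { return fCount >= n; });
        fCount -= n;
    }
    int Count()
    {
        std::lock_guard<std::mutex> lock(fLock);
        return fCount;
    }
private:
    std::mutex fLock;
    std::condition_variable fCond;
    int fCount = 0;
};

// Stand-in for one window: a message queue ("port") plus its event counter.
struct Window {
    std::mutex lock;
    std::deque<int> port;
    std::atomic<std::int32_t> counter{0};
};

// Client side: queue the message, bump the counter, then wake the server.
void Send(Window& w, Semaphore& sem, int message)
{
    {
        std::lock_guard<std::mutex> guard(w.lock);
        w.port.push_back(message);              // write_port()
    }
    w.counter.fetch_add(1);                     // atomic_add(counter, 1)
    sem.Release();                              // release_sem(event_sem)
}

// Server side: one pass of the loop body. Returns the number of messages handled.
int DrainOnce(std::vector<Window>& windows, Semaphore& sem, std::vector<int>& handled)
{
    sem.Acquire();                              // blocks until some event exists
    int all = 0;
    for (Window& w : windows) {
        std::int32_t count = w.counter.load();
        if (count <= 0)
            continue;
        for (std::int32_t k = 0; k < count; k++) {
            std::lock_guard<std::mutex> guard(w.lock);
            handled.push_back(w.port.front());  // not empty: the push precedes the increment
            w.port.pop_front();
        }
        w.counter.fetch_sub(count);
        all += count;
    }
    if (all > 1)
        sem.Acquire(all - 1);                   // account for the remaining events
    return all;
}
```

After three Send() calls, a single DrainOnce() handles all three messages
and leaves the semaphore count at 0, which is the property the second
acquire in the loop is there to preserve.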
That's it! No polling. Just a few int32 variables will be checked.
Example: if only 5 windows exist, only 5 int32 values are checked, and
that is much cheaper than a failed read_port(), which would be a
kernel call.
The second acquire call (acquire_sem_etc()) makes it possible to process
a larger number of signaled events in one pass. The other option would
be one acquire_sem() call for every single port read event.
When an application creates more than 1024 windows, a second set of
4096-byte memory block, event_sem, and app_server thread will be needed.
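
As a rough illustration of how the sets scale, the arithmetic could look
like this. The names CounterAddress, SetsNeeded and Locate are
hypothetical, and the sketch assumes counters are numbered densely across
sets, which the real allocator need not do once slots are reused.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// int32 counters that fit into one 4096-byte page.
const std::size_t kSlotsPerSet = 4096 / sizeof(std::int32_t);   // 1024

// Hypothetical address of one event counter across several sets.
struct CounterAddress {
    std::size_t set;         // which (memory block, event_sem, thread) set
    std::size_t slot;        // index of the int32 inside that block
    std::size_t byteOffset;  // offset handed to the BWindow
};

// Number of sets needed to serve the given number of windows or bitmaps.
inline std::size_t SetsNeeded(std::size_t windows)
{
    return (windows + kSlotsPerSet - 1) / kSlotsPerSet;
}

// Assumes counters are numbered densely across sets (an assumption of this sketch).
inline CounterAddress Locate(std::size_t n)
{
    CounterAddress a;
    a.set = n / kSlotsPerSet;
    a.slot = n % kSlotsPerSet;
    a.byteOffset = a.slot * sizeof(std::int32_t);
    return a;
}
```

With these assumptions, window number 1024 (the 1025th) lands in set 1 at
byte offset 0, and 1025 windows need two server threads instead of 1025.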
I think this is an easy and low-resource way to allow a huge number
of BWindows while keeping the thread count in the app_server low,
keeping additional resource usage low, and avoiding polling.
It might even be faster and use fewer resources than having individual
threads.
What do you think?
BTW, I also have a number of ideas how to optimize PortLink...
regards
Marcus
- Follow-Ups:
- [interfacekit] Re: Proposal: Only one thread per application (not window) on app_server side without polling
- From: Axel Dörfler
- [interfacekit] Re: Proposal: Only one thread per application (not window) on app_server side without polling
- From: Massimiliano Origgi
- [interfacekit] Re: Proposal: Only one thread per application (not window) on app_server side without polling
- From: Ingo Weinhold