[openbeos] Re: Threading and File Descriptors

  • From: "Michael Phipps" <mphipps1@xxxxxxxxxxxxxxxx>
  • To: openbeos@xxxxxxxxxxxxx
  • Date: Fri, 16 Aug 2002 21:42:08 -0400

>Speaking of threads (and I'm no low level programmer either).  What is the
>maximum number of threads per process planned for OBOS?  Other OSes seem to
>artificially limit the number of threads to something like 32 or 64.  I do
>realize that too many threads can cause thrashing, but is that not more of a
>hardware issue than a software issue?  As hardware capability increases in the
>SMP space, I would hate to see the OS have some small limit on the number of
>threads per process.  If it is a fixed array, could we have it at least 128 or 256?

I can tell you that none of this has been given any real thought, at least by 
me.
*Personally*, 32 or 64 threads per process seems like enough, for now.
The trade-off is space per process, or possibly time. Look at it this way:

Approach 1: Array
        Pros: fast access; simple code
        Cons: fixed size, so space is wasted; poor insertion and deletion

Approach 2: Linked list
        Pros: fast insertion & deletion; no wasted slots; any number possible
        Cons: slower access; wasted space (next pointers); (a little) more complex code

Approach 3: Extensible array
        Pros: fast access
        Cons: more complex code; moderate waste (you don't want to resize the array
        for every insert/delete)

I can't think of any others.
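
For what it's worth, here is a rough sketch in C of what approach 3 could look
like for per-team thread slots. None of this is actual OBOS code; the names
(thread_table and friends) and the doubling policy are just illustrative
assumptions.

/* Rough sketch of approach 3 (extensible array) for per-team thread
 * slots.  Nothing here is real OBOS code; the names and the doubling
 * policy are illustrative assumptions only. */
#include <stdlib.h>
#include <string.h>

struct thread_table {
        void   **slots;     /* slot i points at thread i's data, or NULL if free */
        size_t   capacity;  /* number of slots currently allocated */
};

/* Start small -- most teams never need more than a handful of threads. */
static int
thread_table_init(struct thread_table *table, size_t initial)
{
        table->slots = calloc(initial, sizeof(void *));
        if (table->slots == NULL)
                return -1;
        table->capacity = initial;
        return 0;
}

/* Find a free slot; only when the table is completely full do we double
 * it, so we are not resizing on every insert/delete.  Returns the slot
 * index, or -1 on failure. */
static int
thread_table_insert(struct thread_table *table, void *thread)
{
        size_t i, newCapacity;
        void **grown;

        for (i = 0; i < table->capacity; i++) {
                if (table->slots[i] == NULL) {
                        table->slots[i] = thread;
                        return (int)i;
                }
        }

        newCapacity = table->capacity * 2;
        grown = realloc(table->slots, newCapacity * sizeof(void *));
        if (grown == NULL)
                return -1;

        /* zero the new half so those slots read as free */
        memset(grown + table->capacity, 0, table->capacity * sizeof(void *));
        grown[table->capacity] = thread;
        table->slots = grown;
        i = table->capacity;
        table->capacity = newCapacity;
        return (int)i;
}

The nice part is the common case (a few threads) stays small and O(1)-ish,
and the array only grows when somebody actually asks for more.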

>Also, this same argument goes with file descriptors (and sockets).  What is
>the planned limit per process and can we have it dynamic pool of descriptors
>or some larger fixed array?

Many of the same issues arise here. Most OSes that I know of use arrays for
these things. The thing is that *most* apps won't need more than, say, 5 file
descriptors at the same time, or more than some low number of sockets (1-10 or so).

These are some of the trade-off issues that "force" us to decide what we want to
focus on. Server OSes dedicate a lot more resources, up front, to these things,
because they know they are likely to need them, and memory is likely to be
plentiful. Solaris, for example, has 256 or so file descriptors per process.
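
Just to make that trade-off concrete, the fixed-array style most OSes use looks
roughly like the C below. The names (io_context, etc.) and the 128 default are
made up for illustration, not what OBOS will actually do.

/* Hypothetical per-team descriptor table -- the fixed-array style,
 * with a compile-time limit.  Names and the 128 default are invented. */
#include <stddef.h>

#define DEFAULT_FD_LIMIT 128    /* the #define "cop-out" limit */

struct io_context {
        void *fds[DEFAULT_FD_LIMIT];    /* NULL == free slot */
};

/* O(1) lookup by descriptor number; the whole table is allocated up
 * front, so unused slots are pure overhead for small programs. */
static void *
get_fd(struct io_context *context, int fd)
{
        if (fd < 0 || fd >= DEFAULT_FD_LIMIT)
                return NULL;
        return context->fds[fd];
}

/* Allocate the lowest free descriptor, POSIX-style. */
static int
new_fd(struct io_context *context, void *file)
{
        int fd;

        for (fd = 0; fd < DEFAULT_FD_LIMIT; fd++) {
                if (context->fds[fd] == NULL) {
                        context->fds[fd] = file;
                        return fd;
                }
        }
        return -1;      /* table full: this is exactly the limit in question */
}

Lookups are as fast as they can be, but every team pays for 128 slots whether
it uses 3 of them or all of them, and number 129 is simply refused.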

I have many of the same issues in the VM work that I am doing. The *average*
area will probably be <= 256k. So should I optimize for something that size?
If so, what are the implications for smaller and larger areas? If lookups are
O(1) for areas up to 512k, then O(N) after that, is that OK? Probably not.
OTOH, is it worth the trade-off to dedicate more RAM to every area to make the
*occasional* large area's lookups faster? The largest area I see on my system
is 16 meg. OTOH, it is somewhat shortsighted to think that nothing larger will
ever happen... Since I know the size of the area at the beginning [other than
resizing areas], I am thinking about a scaled index, where there are N index
spots (say, 64). Each index spot represents a number of pages (P). Say you had
an area that was 128 pages (512k); P would be 2. So you could find any page
with an index lookup and (half of the time, statistically) a linked list node
hop. OTOH, for a 640 page area (2.5 meg), finding a page would be, on average,
1 index lookup + 5 hops (640/64 == 10; 10/2 = 5). Still not too bad. But for a
6400 page area (25 meg - say, some mpeg4 movie that someone has mmapped), you
get 1 index lookup + 50 list hops. Probably too many. So then you start to
think that maybe the number of index spots should vary based on the area
size... It all gets complicated.
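
To make the scaled index idea concrete, here is a toy sketch in C. The
structure names are invented for illustration and the real VM code will look
different, but the cost is the point: one O(1) index access plus, on average,
P/2 list hops.

/* Toy sketch of the scaled index: 64 index spots per area, each spot
 * covering P = num_pages / 64 pages, with a short list walk inside a
 * spot.  All names here are invented for illustration. */
#include <stdio.h>

#define INDEX_SPOTS 64

struct vm_page {
        unsigned long   pageNum;   /* page offset within the area */
        struct vm_page *next;      /* next page mapped to the same spot */
};

struct vm_area {
        unsigned long   numPages;
        struct vm_page *index[INDEX_SPOTS];   /* one list head per spot */
};

/* P: how many pages one index spot covers (rounded up so the last spot
 * absorbs any remainder). */
static unsigned long
pages_per_spot(const struct vm_area *area)
{
        return (area->numPages + INDEX_SPOTS - 1) / INDEX_SPOTS;
}

/* One O(1) index access, then on average P/2 list hops. */
static struct vm_page *
lookup_page(const struct vm_area *area, unsigned long pageNum)
{
        struct vm_page *page = area->index[pageNum / pages_per_spot(area)];

        while (page != NULL && page->pageNum != pageNum)
                page = page->next;
        return page;
}

int
main(void)
{
        /* Build a 128 page (512k) area, so P == 2, and look up page 77. */
        static struct vm_page pages[128];
        struct vm_area area = { 128, { NULL } };
        unsigned long i, p = pages_per_spot(&area);

        for (i = 0; i < 128; i++) {
                pages[i].pageNum = i;
                pages[i].next = area.index[i / p];
                area.index[i / p] = &pages[i];
        }

        printf("P = %lu, page 77 %s\n", p,
                lookup_page(&area, 77) != NULL ? "found" : "missing");
        return 0;
}

Plugging in the sizes from above gives P = 2, 10 and 100 for the 128, 640 and
6400 page areas, which is where the 5 and 50 average hop figures come from.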

One advantage that we do have is that we have the source code. If we choose to,
we could rebuild at any time with more sockets, file descriptors, etc. Of
course, that isn't very user friendly. A better solution is to solve all of
these problems with more hard thought and fewer #define cop-outs. That means...
back to work. ;-)

