[openbeos] Re: Threading and File Descriptors

  • From: "Michael Phipps" <mphipps1@xxxxxxxxxxxxxxxx>
  • To: openbeos@xxxxxxxxxxxxx
  • Date: Sat, 17 Aug 2002 00:47:08 -0400

>>      Pros:                            Cons:
>> Approach 1: Array
>> Fast Access                           Fixed amount, space wasted
>> Simple Code                           Poor insertion and deletion
>> 
>> Approach 2: Linked List
>> Fast insertion & deletion             Slower access
>> No wasted slots                       Wasted space (pointers to next)
>> Any number possible                   (little) more complex code
>> 
>> Approach 3: Extensible Arrays
>> Fast Access                           Complex Code
>> Moderate waste (you don't want to extend the array for every insert/delete)
>> 
>> I can't think of any others.
>
>Sorted data structures - trees, skip lists, and other dynamically allocated 
>and automatically balancing data structures.  Not as fast as an array (perhaps 
>10 times slower than an array lookup, which may sound silly but may actually 
>be worth it to get rid of the artificial limits), and they have the advantages 
>of a linked list.  The key could even be a small integer, used as a handle to 
>the data, or a pointer if the nodes in the data structure don't move around.  
>The small integer (used much like you would an array index but never getting 
>reused for deallocated items; always incrementing) is better since invalid 
>key/handles are detected.

Trees and skiplists (both of which I considered) are *way* too big, memory- and 
code-wise, for kernel work.
Think about it - skip lists with 2-10 (or more) pointers per node? Where a node 
is a single socket/file handle?
Trees are possible, but they still have performance issues (O(log2 N) lookups), 
they have to be rebalanced, and they are painful to code. Not to mention the 2 
pointers per node. I think that the solution for these things *has* to be space- 
and code-efficient. I am thinking that maybe allocating a structure like this:
struct fewFileHandles {
        fileHandle small[4];            /* inline slots for the common case */
        struct manyFileHandles *more;   /* overflow chain, NULL until needed */
};

Where manyFileHandles is the same idea, but with, say, 25 fileHandles and a 
pointer to another chunk of the same type. That would be a good approach.
In fact, that sort of thing solves both of these issues. Leaping through the 
list 25 handles at a time is not too bad for file handles and sockets.
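
To make that a little more concrete, here is a rough sketch of the overflow 
chunk and a lookup, assuming a fileHandle type and the 4/25 chunk sizes from 
above; the names and the linear scan are just illustration, not a committed 
design:

typedef int fileHandle;                 /* stand-in for whatever the real handle type ends up being */

struct manyFileHandles {
        fileHandle handles[25];         /* one chunk of 25 handles */
        struct manyFileHandles *more;   /* next chunk, NULL at the end */
};

struct fewFileHandles {
        fileHandle small[4];            /* same struct as above */
        struct manyFileHandles *more;
};

/* Find slot n: check the 4 inline slots first, then hop through the
   overflow chunks 25 slots at a time. Returns NULL if n is out of range. */
static fileHandle *lookup_handle(struct fewFileHandles *table, int n)
{
        struct manyFileHandles *chunk;

        if (n < 0)
                return NULL;
        if (n < 4)
                return &table->small[n];
        n -= 4;
        for (chunk = table->more; chunk != NULL; chunk = chunk->more) {
                if (n < 25)
                        return &chunk->handles[n];
                n -= 25;
        }
        return NULL;
}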
Maybe 50 pages at a time for VM... Or even:
struct vmPages {
        page *pages[50];                /* first 50 pages, tracked directly */
        a100Pages *somePages;           /* NULL unless the area outgrows that */
        a1000Pages *lotsOfPages;        /* NULL unless the area is really big */
};

Where one or both of the pointers will be null...
Hmm - I will have to think about that...
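Purely as a sketch, and guessing the chunk sizes from the names (so this is 
speculation on my own part, with page standing in for the kernel's page struct), 
those two types might look something like:

typedef struct a100Pages {
        page *pages[100];               /* 100 page pointers per chunk */
        struct a100Pages *next;         /* chain on if the area keeps growing */
} a100Pages;

typedef struct a1000Pages {
        page *pages[1000];              /* 1000 page pointers per chunk */
        struct a1000Pages *next;
} a1000Pages;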

>> I have many of the same issues in the VM work that I am doing. The *average* 
>> area will probably be <= 256k. So should I optimize for something that size?
>
>And then there are areas which are hundreds of megabytes big.  Possibly 
>gigabytes if running on a 64 bit processor.  Big images (600 dpi scans of 
>large things).  Big RAM disks (my pet project, 128MB areas aren't big 
>enough!).  Hours of sound or video in an editor.  Maybe even get rid of hard 
>drives and just use RAM (particularly if magnetic nonvolatile RAM becomes 
>affordable).

Gigabyte areas would require a whole different way of looking at things. What 
is optimal for 16k areas is not going to work for 2 gig areas.
I am not sure if I want to try to conquer all of that with a first release, but 
it is definitely something coming up in the future (and not too far off).


