>>                  Pros:                        Cons:
>> Approach 1: Array
>>                  Fast Access                  Fixed amount, space wasted
>>                  Simple Code                  Poor insertion and deletion
>>
>> Approach 2: Linked List
>>                  Fast insertion & deletion    Slower access
>>                  No wasted slots              Wasted space (pointers to next)
>>                  Any number possible          (little) more complex code
>>
>> Approach 3: Extensible Arrays
>>                  Fast Access                  Complex Code
>>                                               Moderate waste (you don't want to
>>                                               extend the array for every
>>                                               insert/delete)
>>
>> I can't think of any others.
>
>Sorted data structures - trees, skip lists, and other dynamically allocated
>and automatically balancing data structures. Not as fast as an array (perhaps
>10 times slower than an array lookup, which may sound silly but may actually
>be worth it to get rid of the artificial limits), and they have the advantages
>of a linked list. The key could even be a small integer, used as a handle to
>the data, or a pointer if the nodes in the data structure don't move around.
>The small integer (used much like you would an array index but never getting
>reused for deallocated items; always incrementing) is better since invalid
>key/handles are detected.

Trees and skip lists (both of which I considered) are *way* too big, memory-
and code-wise, for kernel work. Think about it - skip lists with 2-10 (or
more) pointers per node? Where a node is a single socket/file handle?

Trees are possible, but they still have performance issues (O(log2 N)
lookups), they have to be rebalanced, and they are painful to code. Not to
mention the 2 pointers per node.

I think that the solution for these things *has* to be space and code
efficient. I am thinking that maybe allocating a structure like this:

struct fewFileHandles {
    fileHandle small[4];
    struct manyFileHandles *more;
};

where manyFileHandles is exactly the same, but with maybe 25 fileHandles and a
pointer to another of the same, would be a good approach.

In fact, that sort of thing solves both of these issues. Leaping through the
list, 25 at a time, is not too bad for file handles and sockets. Maybe 50
pages at a time for VM...

Or even:

struct vmPages {
    page *pages[50];
    a100Pages *somePages;
    a1000Pages *lotsOfPages;
};

Where one or both of the pointers will be null...

Hmm - I will have to think about that...

>> I have many of the same issues in the VM work that I am doing. The *average*
>> area will probably be <= 256k. So should I optimize for something that size?
>
>And then there are areas which are hundreds of megabytes big. Possibly
>gigabytes if running on a 64 bit processor. Big images (600 dpi scans of
>large things). Big RAM disks (my pet project, 128MB areas aren't big
>enough!). Hours of sound or video in an editor. Maybe even get rid of hard
>drives and just use RAM (particularly if magnetic nonvolatile RAM becomes
>affordable).

Gigabyte areas would require a whole different way of looking at things. What
is optimal for 16k areas is not going to work for 2 gig areas. I am not sure
if I want to try to conquer all of that with a first release, but it is
definitely something coming up in the future (and not too far off).
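
Going back to the chunked-list idea above, here is a very rough sketch of what
leaping through one of those lists might look like. The names, chunk sizes,
and the lookup function are just placeholders to illustrate the idea, not real
kernel code:

#include <stddef.h>

typedef void *fileHandle;               /* stand-in for whatever a handle really is */

#define FEW_HANDLES   4
#define MANY_HANDLES  25

struct manyFileHandles {
    fileHandle handles[MANY_HANDLES];   /* 25 handles per overflow chunk */
    struct manyFileHandles *more;       /* next chunk, or NULL */
};

struct fewFileHandles {
    fileHandle small[FEW_HANDLES];      /* inline slots for the common case */
    struct manyFileHandles *more;       /* overflow chunks, or NULL */
};

/* Find handle number 'index', leaping through the list one chunk at a time. */
static fileHandle lookup_handle(struct fewFileHandles *table, size_t index)
{
    struct manyFileHandles *chunk;

    if (index < FEW_HANDLES)
        return table->small[index];

    index -= FEW_HANDLES;
    for (chunk = table->more; chunk != NULL; chunk = chunk->more) {
        if (index < MANY_HANDLES)
            return chunk->handles[index];
        index -= MANY_HANDLES;
    }
    return NULL;                        /* past the end of the table */
}

The nice part is that the four inline slots cover the common case with no
extra allocation, and beyond that the overhead is one pointer per 25 handles
instead of one (or more) per handle.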