[haiku-development] Re: Do optional packages need to include everything they do?

On 2009-09-01 at 17:14:52 [+0200], Michael Lotz <mmlr@xxxxxxxx> wrote:
> > "Michael Lotz" <mmlr@xxxxxxxx> wrote:
> > > I'm currently looking through the kernel to check a few problematic
> > > places (i.e. where large reads are split up into page-wise reads).
> > 
> > What is definitely problematic is that the FS overlays used do not
> > support the asynchronous I/O hooks - this should be the main reason why
> > booting from CD is so slow without the delays stippi added.
> Yes, I am working on that. The problem is that the whole IOScheduler
> design isn't really taking such situations into account. I am currently
> experimenting with creating an IOScheduler for the write_overlay volume
> and then running the incoming io_requests through that one to get
> do_io()'s for the write_overlay that then can be translated to reads from
> memory where appropriate or further requests down to the original node.
> At least for the reading case where no changes on the write_overlay
> level are present, the write_overlay scheduler can then be completely
> bypassed, resulting in direct asynchronous IO on the CD as would be the
> case if there were no write_overlay at all.
> One problem I ran into is that the IOScheduler only has one cookie.
> This is fine if you have one single raw device that you cater for, but
> is a problem in the write_overlay case. There I need the info on which
> node the io_request shall take place as I want one IOScheduler per
> overlay volume and not one per node. Currently I work around that
> by re-setting the callback/cookie before scheduling each io_request,
> which does seem very hacky.

You may have misunderstood how the I/O scheduling is supposed to work. An 
IOScheduler caters to a single hardware device only. It serializes and 
reorders all I/O requests for that device to minimize the average seek 
time. Using one in a file system is really not intended, and I don't see 
what good it would do.

For the write overlay, I would expect the io() hook to work like this:

* A write request should be handled completely and synchronously by the 
hook. All data is just copied.

* A read request, if the file has not been written to yet, shall be passed 
straight to the underlying layer.

* A read request, if the file has already been written to, needs to be 
processed by the hook:
  - The request might not intersect with the modified data range, so it can
    be passed straight to the underlying layer.
  - The request might be fully covered by modified data, so it can be
    handled completely (synchronously), by copying out the respective data.
  - The request might be partially covered by modified data, in which case
    the covered ranges need to be served synchronously and sub-requests need
    to be created for the not-covered ranges and be forwarded to the
    underlying layer. An optimization is possible when the beginning or the
    end of the range is covered: then that part could be handled
    synchronously, and the request adjusted and passed on to the underlying
    layer.
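
The three read cases above can be sketched as a single range split. This is 
only an illustration, not the actual overlay code: Range, Piece and 
split_read() are hypothetical names, and the list of modified ranges is 
assumed to be sorted by offset and non-overlapping.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; not taken from the Haiku sources.
struct Range {
	int64_t offset;
	int64_t length;

	int64_t End() const { return offset + length; }
};

struct Piece {
	Range	range;
	bool	modified;	// true: copy from overlay data, false: forward below
};

// Splits `request` into pieces that are either fully covered by modified
// data or not covered at all. Assumes `modified` is sorted by offset and
// its ranges do not overlap.
std::vector<Piece>
split_read(const Range& request, const std::vector<Range>& modified)
{
	std::vector<Piece> pieces;
	int64_t position = request.offset;

	for (const Range& range : modified) {
		if (range.End() <= position)
			continue;	// lies entirely before the remaining request
		if (range.offset >= request.End())
			break;		// lies entirely after the request

		if (range.offset > position) {
			// Gap before this modified range: a sub-request for the
			// underlying layer.
			pieces.push_back({{position, range.offset - position}, false});
			position = range.offset;
		}

		// Covered part: can be served synchronously by copying.
		int64_t end = std::min(range.End(), request.End());
		pieces.push_back({{position, end - position}, true});
		position = end;
	}

	// Whatever is left after the last modified range goes down as well.
	if (position < request.End())
		pieces.push_back({{position, request.End() - position}, false});

	return pieces;
}
```

A single piece with modified == false is the "pass straight down" case, a 
single piece with modified == true is the fully covered case, and anything 
else is the partially covered case that needs sub-requests.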

I don't know how the write overlay is implemented, but to keep things 
simple it could just fully cache a file when it is first written to, which 
avoids the somewhat unwieldy partially covered reads. That's seriously 
suboptimal when only small parts of large files are changed, but I don't 
think that's a particularly common case for the installer CD boot (probably 
not even for a live CD).
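
A rough sketch of that simpler strategy, under loud assumptions: OverlayNode 
is a hypothetical class, the read-only file is modelled as an in-memory 
vector, and the read of an unmodified file is done synchronously here, where 
the real hook would pass the request to the underlying layer instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical overlay node; `original` stands in for the file on the CD.
class OverlayNode {
public:
	explicit OverlayNode(std::vector<uint8_t> original)
		: fOriginal(std::move(original)), fCached(false)
	{
	}

	// Writes are handled completely and synchronously: the data is copied.
	void Write(size_t offset, const void* data, size_t length)
	{
		if (length == 0)
			return;
		if (!fCached) {
			// First write: pull the whole file into the cache, so a later
			// read is never only partially covered.
			fCache = fOriginal;
			fCached = true;
		}
		if (offset + length > fCache.size())
			fCache.resize(offset + length);
		std::memcpy(fCache.data() + offset, data, length);
	}

	// Returns the number of bytes copied into `buffer`.
	size_t Read(size_t offset, void* buffer, size_t length) const
	{
		// Cached: serve from memory. Not cached: this is where the real
		// hook would forward the request unchanged.
		const std::vector<uint8_t>& source = fCached ? fCache : fOriginal;
		if (offset >= source.size())
			return 0;
		size_t toCopy = std::min(length, source.size() - offset);
		std::memcpy(buffer, source.data() + offset, toCopy);
		return toCopy;
	}

	bool IsCached() const { return fCached; }

private:
	std::vector<uint8_t>	fOriginal;
	std::vector<uint8_t>	fCache;
	bool					fCached;
};
```

With this approach the hook only ever has to make one decision per read: 
cached or not.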

Note that when creating sub-requests for an IORequest you might need to 
associate an iteration and/or finished cookie and hook with the request. 
In that case you'll have to replace the original cookie, but save it first 
and call the original hooks later. For reference: do_iterative_fd_io() (in 
vfs_request_io.cpp) does exactly that.
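
The save-and-restore pattern could look roughly like this. It is a 
simplified sketch and not the IORequest API: Request, finished_hook, 
SavedHook, install_overlay_hook() and complete_request() are all made-up 
names, and only the finished hook is shown.

```cpp
#include <cstddef>

// Hypothetical, simplified request type; the real IORequest differs.
typedef void (*finished_hook)(void* cookie, int status,
	size_t bytesTransferred);

struct Request {
	finished_hook	finished;
	void*			cookie;
};

// Keeps the caller's hook and cookie while the overlay's own are installed.
struct SavedHook {
	finished_hook	originalFinished;
	void*			originalCookie;
	// overlay-specific state (e.g. pending sub-requests) would go here
};

static void
overlay_finished(void* cookie, int status, size_t bytesTransferred)
{
	SavedHook* saved = static_cast<SavedHook*>(cookie);

	// ... overlay bookkeeping would happen here ...

	// Copy out what we need, free our state, then call the original hook
	// with the original cookie.
	finished_hook original = saved->originalFinished;
	void* originalCookie = saved->originalCookie;
	delete saved;

	if (original != nullptr)
		original(originalCookie, status, bytesTransferred);
}

// Replaces the request's hook and cookie, saving the originals first.
void
install_overlay_hook(Request& request)
{
	SavedHook* saved = new SavedHook{request.finished, request.cookie};
	request.finished = &overlay_finished;
	request.cookie = saved;
}

// Stands in for the lower layer reporting completion of the request.
void
complete_request(Request& request, int status, size_t bytesTransferred)
{
	if (request.finished != nullptr)
		request.finished(request.cookie, status, bytesTransferred);
}
```

The point is only the ordering: save first, replace, and hand the 
completion on to whoever issued the request originally.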

> The attribute_overlay always just passes through the requests, as it is
> not concerned with file data btw, so it is not a problem there.

I'm not quite sure that's correct. The io() hook operates on nodes. Some 
file systems, I believe including BFS, expose attributes as nodes to the 
VFS/file cache. IIRC you designed the attribute overlay also to be used as 
a write overlay for file systems that actually do support attributes (but 
are read-only), so the io() hook can be called for nodes that are in fact 
attributes. I believe this happens only when requested by a 
{read,write}_attr() file system hook, so this may not be a problem, but not 
knowing the implementations of the overlays I couldn't say for sure.

CU, Ingo
