[haiku-development] Re: [GSoC proposal] IMAP FS - A few queries

  • From: Donn Cave <donn@xxxxxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Thu, 14 Apr 2011 10:23:29 -0700 (PDT)

Quoth Anshul Singhle <xashck@xxxxxxxxx>,
...
> Didn't get you? How can the FS process an I/O request for some data it
> doesn't know exists?

Not at all, I'm talking about something that's very simple.

Among the first data you get for a message, in the first FETCH request,
you will want RFC822.SIZE.  This gives you the full extent of your
file, so among other things you can report this size via stat().
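
As a rough illustration (Python only for brevity, and the helper
name message_size is made up, not anything from Haiku or an IMAP
library), pulling the size out of an untagged FETCH response
might look something like this:

```python
import re

# Matches the RFC822.SIZE data item inside a FETCH response line.
_SIZE_RE = re.compile(rb"\bRFC822\.SIZE (\d+)")


def message_size(fetch_line: bytes) -> int:
    """Return the RFC822.SIZE value from a FETCH response line.

    Hypothetical helper for illustration only.
    """
    match = _SIZE_RE.search(fetch_line)
    if match is None:
        raise ValueError("no RFC822.SIZE in response")
    return int(match.group(1))


# Sample server reply to something like: a1 FETCH 12 (RFC822.SIZE)
size = message_size(b"* 12 FETCH (RFC822.SIZE 44827)")
```

The resulting number is what the filesystem would hand back as
the file size when someone calls stat() on the message.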

Along with stat, of course you also need to support seek and
read operations on the data contents, just as if it were an ordinary
block-device-backed filesystem.  As in fact it is, in view of the
cache, and once the entire file is cached you need no IMAP support
at all to read it.

But when an application posts a read request starting at a point
in the file that isn't cached, the filesystem needs to go out and
get the data to satisfy that request.  That might be at position 0
in the file, if I just do a "cat" on the file and the filesystem
hasn't yet downloaded the full header.
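
Here is a minimal sketch of that read path, under some loud
assumptions: the class name CachedMessage and the fetch_range
callback are invented for illustration, the cache is an in-memory
buffer rather than a real block cache, and fetch_range is assumed
to return exactly the bytes asked for.  With IMAP, I would expect
fetch_range to be built on a partial fetch such as
BODY.PEEK[]<offset.count>.

```python
class CachedMessage:
    """Message contents cached on demand (illustrative sketch only)."""

    def __init__(self, size, fetch_range):
        self.size = size                # full extent, from RFC822.SIZE
        self.fetch_range = fetch_range  # hypothetical: (offset, count) -> bytes
        self.data = bytearray(size)
        self.have = bytearray(size)     # 1 where the byte is already cached

    def read(self, offset, length):
        if offset < 0:
            raise ValueError("negative offset")
        end = min(offset + length, self.size)
        pos = offset
        while pos < end:
            if self.have[pos]:
                pos += 1
                continue
            # Extend over the whole uncached run so it costs one request.
            stop = pos
            while stop < end and not self.have[stop]:
                stop += 1
            chunk = self.fetch_range(pos, stop - pos)
            if len(chunk) != stop - pos:
                raise IOError("short fetch from server")
            self.data[pos:stop] = chunk
            self.have[pos:stop] = b"\x01" * (stop - pos)
            pos = stop
        return bytes(self.data[offset:end])
```

A real filesystem would track cached blocks rather than single
bytes, but the control flow should be about the same: serve what
is cached, fetch only the gaps.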

In a more interesting case, if I know that the message has a MIME
multipart structure, I may "seek" to a certain spot in the file
to read a certain part, and then the filesystem would download
that whole part.  You have to publish the data for that structure -
offset, size, MIME type of each part - in file attributes or
something (note that structure is recursive, by the way!), so
I would know where to seek.  (And we need to test the assumptions
vs. server implementations and make sure that data is reliable
and not compromised by differences in the way part sizes are
reported, separator sizes, etc.)
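
To make the recursive structure concrete, here is a small sketch
that flattens a part tree into one record per part, the sort of
thing that could then be written out as attributes.  Everything
here is hypothetical: the Part type, the flatten function, the
IMAP-style "2.1" part paths, and the offsets and sizes in the
example, which are invented and not computed from any server.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """One MIME part: type plus its byte range in the message file."""
    mime_type: str
    offset: int
    size: int
    children: list = field(default_factory=list)


def flatten(part, prefix=""):
    """Yield (path, mime_type, offset, size) for every nested part.

    Children are numbered from 1 and nested parts get dotted paths,
    so the second child's first child is "2.1".
    """
    for index, child in enumerate(part.children, start=1):
        path = f"{prefix}{index}"
        yield path, child.mime_type, child.offset, child.size
        yield from flatten(child, prefix=path + ".")


# Made-up structure: plain text plus a nested multipart/alternative.
message = Part("multipart/mixed", 0, 5000, [
    Part("text/plain", 420, 300),
    Part("multipart/alternative", 760, 4200, [
        Part("text/plain", 820, 100),
        Part("text/html", 960, 3900),
    ]),
])

records = list(flatten(message))
```

An application that wants the HTML alternative would look up the
"2.2" record and seek to its offset.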

If it isn't feasible to support random access as described
above, it still makes sense to support incremental download
for sequential access.  You don't need to download a part until
I have asked to read it.  If I read "off the end" of the cached
data, but still inside the extent of the file as reported by
RFC822.SIZE, then it's time to fetch more parts.  I don't have
to know the MIME structure of the file to do this, and of course
I don't have to communicate it to the filesystem as such, I just
ask for N bytes of data at file offset T.
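
A sketch of the sequential-only variant, with the same caveats:
SequentialCache and fetch_range are made-up names, and for
simplicity this version pulls fixed-size chunks rather than whole
MIME parts, which is a substitution on my part.  The cache is
always a prefix of the message, and a read past its end just
extends the prefix.

```python
class SequentialCache:
    """Cache that only ever grows from the front (illustrative sketch)."""

    def __init__(self, size, fetch_range, chunk=4096):
        self.size = size                # full extent, from RFC822.SIZE
        self.fetch_range = fetch_range  # hypothetical: (offset, count) -> bytes
        self.chunk = chunk              # assumed download granularity
        self.data = bytearray()         # always a prefix of the message

    def read(self, offset, length):
        if offset < 0:
            raise ValueError("negative offset")
        if offset >= self.size or length <= 0:
            return b""
        end = min(offset + length, self.size)
        # Reading "off the end" of the cache: fetch until it covers `end`.
        while len(self.data) < end:
            start = len(self.data)
            count = min(self.chunk, self.size - start)
            got = self.fetch_range(start, count)
            if not got:
                raise IOError("server returned no data")
            self.data += got
        return bytes(self.data[offset:end])
```

Note that a seek far into the file still works here, it just
costs a download of everything before that point.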

I think it will still be desirable to publish MIME structure,
though, since applications need to know this, and I think you
really have to support seek anyway, so maybe it boils down to
the same implementation either way.  If the difference is
whether parts could be incrementally cached out of order, maybe
that isn't such a good idea anyway, since the part boundaries
won't line up with the block structure of the cache device.
Just guessing on that.
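
To put numbers on the alignment worry, here is a small piece of
arithmetic.  The function name, the 4096-byte block size and the
example offsets are all assumptions for illustration.

```python
def blocks_touched(offset, size, block_size=4096):
    """Return (first_block, last_block, aligned) for a byte range.

    `aligned` is True only when the range starts and ends exactly
    on block boundaries.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    first = offset // block_size
    last = (offset + size - 1) // block_size
    aligned = offset % block_size == 0 and (offset + size) % block_size == 0
    return first, last, aligned


# A made-up part at offset 760, 4200 bytes long.
first, last, aligned = blocks_touched(760, 4200)
```

For that made-up part the range covers blocks 0 and 1 and is not
aligned, so caching it alone would leave both blocks only partly
filled.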

Functions like stat(), seek(), read() are really familiar to
me because I've been using them for decades, and I'm just assuming
they're familiar to you too.  I don't know Haiku filesystem
internals, but I assume the basics are more or less universal.

        Donn
