[haiku] Re: Need Some GSoC Advice

  • From: Truls Becken <truls.becken@xxxxxxxxx>
  • To: haiku@xxxxxxxxxxxxx
  • Date: Mon, 23 Mar 2009 19:54:36 +0100

Ankur Sethi wrote:

> (1) Create an initial index of all the data on the disk. This takes a
> *very* long time and consumes a large amount of CPU. Sometimes,
> seemingly at random, Spotlight will decide to build the entire index
> from scratch, and then there is very little you can do about it.

This is one place where BFS indexing shines. A userspace process
that tries to keep an index of data on the disk needs to reindex from
scratch every time it starts up if it wants to be absolutely sure its
database is up-to-date. The reason is that it cannot know how much
changed on disk while the process was not running. BFS does not have
this problem because, obviously, no attributes can ever be written
without BFS noticing.

It would be interesting if a plugin-based extractor could be triggered
from BFS. It doesn't have to run in kernel space; BFS could just spawn
a userspace process, read a simple key-value format from its stdout,
and set the attributes.
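
To make the idea concrete, here is a rough userspace sketch in Python of
the "spawn, read key-value pairs from stdout" part. Everything in it is
an assumption for illustration: the one-pair-per-line "key=value"
format, the function names, and the stand-in plugin are hypothetical,
and actually writing the attributes to the file is left out.

```python
import subprocess
import sys


def parse_attributes(output):
    """Parse 'key=value' lines into a dict.

    The line format is an assumption; blank lines and lines without
    a key or an '=' separator are skipped.
    """
    attrs = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not key:
            continue
        attrs[key] = value.strip()
    return attrs


def run_extractor(command, path):
    """Spawn a (hypothetical) extractor plugin on `path`.

    `command` is the plugin's argv prefix; the file path is appended
    as the last argument. Returns the parsed key-value pairs.
    """
    result = subprocess.run(
        command + [path], capture_output=True, text=True, check=True
    )
    return parse_attributes(result.stdout)


if __name__ == "__main__":
    # Stand-in plugin: a one-line Python script that prints two pairs.
    fake_plugin = [
        sys.executable,
        "-c",
        "import sys; print('Audio:Artist=Example'); print('path=' + sys.argv[1])",
    ]
    print(run_extractor(fake_plugin, "/tmp/song.mp3"))
```

The caller would then write each returned pair as an attribute on the
file. Keeping the plugin protocol to plain text on stdout means a plugin
can be written in any language.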

There would still be a problem with attributes getting out-of-date
when extractor plugins change. The extractor could solve this by
making a query for existing files with the MIME types handled by the
changed plugins, and updating the attributes on those.
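
As a sketch of that refresh step, the Python below collects the MIME
types handled by the changed plugins and builds a query predicate for
them. The plugin table and function names are hypothetical, and the
predicate assumes that the file type lives in an attribute named
BEOS:TYPE and that terms are combined with "==" and "||"; adjust it to
whatever the real query interface expects.

```python
def mime_types_to_refresh(plugins, changed):
    """Return the sorted, de-duplicated MIME types of the changed plugins.

    `plugins` maps a plugin name to the MIME types it handles;
    `changed` lists the names of plugins that were updated.
    Unknown plugin names are ignored.
    """
    types = set()
    for name in changed:
        types.update(plugins.get(name, ()))
    return sorted(types)


def build_query(mime_types, attribute="BEOS:TYPE"):
    """Build one predicate matching any of the given MIME types.

    The attribute name and operator syntax are assumptions.
    """
    return "||".join(
        '({}=="{}")'.format(attribute, mime) for mime in mime_types
    )


if __name__ == "__main__":
    # Hypothetical plugin registry.
    plugins = {
        "id3": ["audio/mpeg"],
        "exif": ["image/jpeg", "image/tiff"],
        "pdf": ["application/pdf"],
    }
    stale = mime_types_to_refresh(plugins, ["exif", "id3"])
    print(build_query(stale))
```

Each file matched by the query would then be run through the updated
plugin again, so only files of the affected types are reprocessed
instead of the whole volume.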
