[haiku] Re: Need Some GSoC Advice

  • From: Truls Becken <truls.becken@xxxxxxxxx>
  • To: haiku@xxxxxxxxxxxxx
  • Date: Mon, 23 Mar 2009 09:21:15 +0100

Ryan Leavengood wrote:

> On Sun, Mar 22, 2009 at 3:00 PM, Ankur Sethi <get.me.ankur@xxxxxxxxx> wrote:
>> I have been thinking about those 3 ideas, and I'm gravitating towards
>> idea #3.
>
> I think that is a good choice. One thing I'd like to implore right now
> is please be sure the indexer is as efficient and UNOBTRUSIVE as
> possible. In my experience the Mac OS X Spotlight system is very
> unobtrusive, so you might want to research how that works.

Another source of inspiration is SkyOS Index Feeder:
http://skyos.org/?q=node/418

It uses plugins to extract attributes, which are stored in SkyFS (a
fork of OpenBFS), and to index file content into a database.

Efficiently indexing file content is probably the hardest part. It
would be nice if the content index could be stored in a BFS attribute
as well, because you would then avoid having a separate API for this
part of the search. For this to work, attributes would need to support
multiple parts, with each part indexed independently. Something like
that has been brought up before: e.g. the words {"Haiku", "indexing",
"feeder", "plugin"} are stored in one attribute in such a way that
searching for any of them gives a match.
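
To make the idea concrete, here is a minimal Python sketch of a
multi-part attribute whose parts are indexed independently. It is only
a toy in-memory model, not the BFS or SkyFS API: the class and method
names (MultiPartAttributeIndex, add_file, query) and the file paths are
made up for illustration.

```python
from collections import defaultdict


class MultiPartAttributeIndex:
    """Toy stand-in for an index over a multi-part attribute (hypothetical)."""

    def __init__(self):
        # word -> set of file paths whose attribute contains that word
        self._postings = defaultdict(set)

    def add_file(self, path, words):
        # Index every part of the attribute independently.
        for word in words:
            self._postings[word.lower()].add(path)

    def query(self, word):
        # A hit on any single part counts as a match for the file.
        return sorted(self._postings.get(word.lower(), set()))


index = MultiPartAttributeIndex()
index.add_file("/boot/home/notes.txt",
               ["Haiku", "indexing", "feeder", "plugin"])
index.add_file("/boot/home/todo.txt", ["Haiku", "release"])

print(index.query("feeder"))
print(index.query("haiku"))
```

Searching for "feeder" finds only the first file, while "haiku" finds
both, because each word is looked up on its own.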

I don't have any experience implementing full-text indexing, so I
don't know whether this is sufficient, though. How do you handle a
search for "indexing feeder", for instance? With a decent query system
you can quote part of a sentence to match an exact phrase, which seems
to need word order, something a set of independently indexed words
would not record.
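
For what it's worth, one common way to support quoted phrases is a
positional inverted index, which stores where each word occurs in
addition to which files contain it. The Python sketch below is my own
assumption of how that could look, not something SkyOS or BFS is known
to do; the function names and sample documents are hypothetical, and
tokenizing is reduced to splitting on whitespace.

```python
from collections import defaultdict


def build_positional_index(documents):
    """Map each word to {doc_id: [positions of that word in the doc]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word][doc_id].append(position)
    return index


def phrase_search(index, phrase):
    """Return the sorted doc ids that contain the words of phrase in order."""
    words = phrase.lower().split()
    if not words:
        return []
    # Candidate documents must contain every word of the phrase.
    candidates = set(index.get(words[0], {}))
    for word in words[1:]:
        candidates &= set(index.get(word, {}))
    matches = []
    for doc_id in candidates:
        # Try each occurrence of the first word as the start of the phrase
        # and check that the following words sit at consecutive positions.
        for start in index[words[0]][doc_id]:
            if all(start + offset in index[word][doc_id]
                   for offset, word in enumerate(words)):
                matches.append(doc_id)
                break
    return sorted(matches)


documents = {
    "a.txt": "the indexing feeder plugin",
    "b.txt": "feeder for indexing files",
}
index = build_positional_index(documents)
print(phrase_search(index, "indexing feeder"))
```

Both sample documents contain "indexing" and "feeder", but only a.txt
has them next to each other in that order, so only a.txt matches the
quoted phrase. The cost is that positions have to be stored for every
word, which a plain set of words in an attribute would not do.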

-Truls
