Ankur Sethi wrote:
> (1) Create an initial index of all the data on the disk. This takes a
> *very* long time and consumes a large amount of CPU. Sometimes,
> seemingly at random, Spotlight will decide to build the entire index
> from scratch, and then there is very little you can do about it.

This is one place where BFS indexing shines. A userspace process that tries to keep an index of the data on disk has to reindex from scratch every time it starts up if it wants to be absolutely sure its database is up to date, because it cannot know how much changed on disk while it was not running. BFS does not have this problem: no attribute can ever be written without BFS noticing.

It would be interesting if a plugin-based extractor could be triggered from BFS. It doesn't have to run in kernel space; BFS could just spawn a userspace process, read a simple key-value format from its stdout, and set the attributes.

There would still be a problem with attributes going out of date when extractor plugins change. The extractor could solve this by querying for existing files with the MIME types handled by the changed plugins and updating the attributes on those.
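
To make the spawn-and-parse idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration, not an existing Haiku or BFS interface: the function names, the "key=value" line format, and the set_attr callback (which stands in for whatever call would actually write the attribute) are all assumptions made for the example.

```python
import subprocess
import sys


def parse_attributes(output):
    """Parse 'key=value' lines into a dict; blank or malformed lines are skipped."""
    attrs = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not key:
            continue  # no '=' on the line, or an empty key
        attrs[key] = value.strip()
    return attrs


def run_extractor(command, path, timeout=10):
    """Run a (hypothetical) extractor plugin on one file and return its attributes."""
    result = subprocess.run(
        list(command) + [path],  # the file path is passed as the last argument
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # a failing plugin raises instead of yielding partial data
    )
    return parse_attributes(result.stdout)


def apply_attributes(path, attrs, set_attr):
    """Hand each attribute to set_attr, a stand-in for the real attribute write."""
    for name, value in attrs.items():
        set_attr(path, name, value)


if __name__ == "__main__":
    # Stand-in plugin: a Python one-liner that prints two made-up attributes.
    fake_plugin = [
        sys.executable, "-c",
        "import sys; print('Media:Title=Example'); print('Media:Path=' + sys.argv[1])",
    ]
    found = run_extractor(fake_plugin, "/tmp/song.mp3")
    apply_attributes("/tmp/song.mp3", found, lambda p, k, v: print(p, k, v))
```

Keeping the plugin protocol down to plain text on stdout means a plugin can be written in any language, and a crashing or hanging plugin only costs one child process.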