> >> The indexing daemon will have a set of plugins that will convert data
> >> from different file formats (PDF, ODF, DOC etc.) to a format
> >> compatible with the indexing library.

Did you check the patch I posted some time ago about BaseTranslator?
Actually, I noticed a missing break somewhere, so beware.

> > As I'm sure you are aware, this is just like Spotlight (and of course
> > is a fine idea.) In the article I linked above it mentioned that
> > Spotlight (and apparently Google too) only care about the first 100k
> > bytes of each file. So that would probably be a good limit to impose
> > on our system. No reason to completely process multi-megabyte PDFs,
> > etc.
>
> I don't think that's a good idea, since many PDFs are multi-megabyte
> PDFs today (because of embedded graphics and fonts and stuff) and I
> still want to be able to find a word if it is in the last chapter of
> such a PDF.

Well, you'd only index the text from it anyway...

François.
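
To make the distinction above concrete, here is a minimal sketch of a cap applied to the extracted text rather than to the raw file. Everything in it is an assumption for illustration: the names PLUGINS, register, extract_plain_text and text_to_index are hypothetical and belong to no existing API, the plain-text extractor merely stands in for a real PDF/ODF/DOC plugin, and the limit reuses the 100k figure quoted in the thread (counted here in characters, not bytes).

```python
from typing import Callable, Dict

# Hypothetical cap, reusing the 100k figure quoted in the thread.
# It counts characters of extracted text, not bytes of the raw file.
TEXT_LIMIT = 100_000

# Hypothetical plugin registry: file extension -> text extractor.
Extractor = Callable[[bytes], str]
PLUGINS: Dict[str, Extractor] = {}


def register(extension: str, extractor: Extractor) -> None:
    """Associate a file extension with a function that extracts its text."""
    PLUGINS[extension.lower()] = extractor


def extract_plain_text(data: bytes) -> str:
    """Stand-in plugin: treat the file content as UTF-8 text."""
    return data.decode("utf-8", errors="replace")


def text_to_index(filename: str, data: bytes, limit: int = TEXT_LIMIT) -> str:
    """Return the text handed to the indexer, truncated after extraction."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    extractor = PLUGINS.get(extension)
    if extractor is None:
        # No plugin for this format: nothing to index.
        return ""
    # The whole file is passed to the plugin; only its text output is capped,
    # so embedded graphics and fonts do not count against the limit.
    return extractor(data)[:limit]


register("txt", extract_plain_text)
```

With this arrangement a multi-megabyte PDF that contains little actual text would still be indexed in full, while a very long document would be cut off at the limit.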