Michael Pfeiffer <michael.pfeiffer@xxxxxxxxx> wrote on Mon, 18 Feb 2002 17:15:24 +0100:
> Now I have two feature requests:
> * querying files under a certain folder (+ sub folders)

That's a bit awkward, since the indices currently apply to the whole disk. ReiserFS seems to have a plan where they have indices in each directory (so your query is a path composed of multiple queries). We could do something similar. Or just filter out the files that aren't under a particular directory (would be slow, but good enough for a first attempt).

> * searching in the file data
>
> This would make "grep" obsolete.

Along those lines, I was thinking of a quick and simple trick for adding keyword indices as one of my file system experiments. A new keyword attribute type (different from the current plain string attribute type) would automatically break up the attribute data into separate words and add each word to the index individually. Of course, if the attribute changes, the file system removes all the words of the old attribute value from the index and then adds the words of the new value.

Word processors and other tools would create this attribute, making a list of all the unique words used in the file (or non-unique ones, at a slight performance and space cost for adding the same word to the index extra times). To avoid storing the text twice (once in the file, once in the attribute), a special redirection marker attribute could be used instead (but then the file gets reindexed every time you change it slightly).

So, does anyone know which characters are spaces in all alphabets? What does Chinese/Arabic writing do to separate words, if they even have words?

It's not the same as grep, but probably useful enough. For pure greppiness, a trie (a tree of letters that form words as you traverse it) or some other kind of indexing system would be needed. Perhaps the Patricia trees used in the Oxford dictionary project at Waterloo?

- Alex
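
The keyword attribute idea above can be sketched roughly as follows. This is only an illustration in Python, not actual file system code: KeywordIndex, set_attribute, remove_attribute and query are made-up names, the index is a plain in-memory dictionary rather than an on-disk index, and words are assumed to be separated by whitespace and matched case-insensitively (which sidesteps the open question about scripts that don't put spaces between words).

```python
from collections import defaultdict


class KeywordIndex:
    """Toy in-memory stand-in for a keyword index (hypothetical, for illustration)."""

    def __init__(self):
        self.files_by_word = defaultdict(set)  # word -> ids of files containing it
        self.words_by_file = {}                # file id -> words currently indexed

    @staticmethod
    def split_words(text):
        # Assumption: whitespace separates words; duplicates collapse into a set.
        return {word.lower() for word in text.split()}

    def remove_attribute(self, file_id):
        # Take every word of the old attribute value out of the index.
        for word in self.words_by_file.pop(file_id, set()):
            files = self.files_by_word[word]
            files.discard(file_id)
            if not files:
                del self.files_by_word[word]

    def set_attribute(self, file_id, text):
        # On change: remove the old value's words, then add the new value's words.
        self.remove_attribute(file_id)
        words = self.split_words(text)
        for word in words:
            self.files_by_word[word].add(file_id)
        self.words_by_file[file_id] = words

    def query(self, word):
        # Return a copy so callers cannot modify the index by accident.
        return set(self.files_by_word.get(word.lower(), set()))
```

Rewriting a file's attribute with set_attribute drops the file from the entries of words it no longer contains, so a later query for one of those words will not return it. Storing the unique words per file keeps removal cheap, at the cost of holding the word list twice.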