> > If you have a look at BSD's "file", the text magic happens in
> > ascmagic.c - it looks very reasonable to me, and could even
> > identify the charset for StyledEdit (at least in a basic way that
> > should be enough for the Western world).
>
> I intend to incorporate that code into the text sniffer add-on
> directly, stripping as much of the character set stuff as possible.
> But we can certainly provide a library function (e.g. in
> libtextencoding) that guesses the character encoding/set of a given
> buffer.

Emacs does that ... but in lisp :)

More seriously, instead of creating yet another kind of add-on...
I've been thinking for some time about reusing the translators as a
means to index metadata from file content, by adding a SniffAndIndex()
method or so, which would fill in attributes from ID3 tags for example
(not exactly the best example, but well... or get the image width and
height from the file).
I guess we could use that for MIME sniffing as well.

I think translators are probably best suited to know what is inside
the files they support. Also, when converting files they would just
have to call their own SniffAndIndex() method to write the metadata
right away.
That would save people from adding it manually, something we should
have done long ago; other OSes already do it.

There might be a perf issue though if we load every translator each
time...

François.
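
To make the SniffAndIndex() idea above a bit more concrete, here is a rough sketch in plain standard C++. Everything in it is an assumption made for illustration: the SketchTranslator class, the Attributes map and the attribute names are hypothetical and are not part of the existing translator API. It only shows the shape of the idea, using the image width/height case: a translator inspects a buffer, and if it recognizes the format it reports a MIME type and fills in attributes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical attribute store: attribute name -> value as text.
using Attributes = std::map<std::string, std::string>;

// Hypothetical stand-in for a translator that can also index metadata.
class SketchTranslator {
public:
    virtual ~SketchTranslator() = default;
    // Returns true and fills 'attrs' if the buffer is a format this
    // translator understands; returns false and leaves 'attrs' alone otherwise.
    virtual bool SniffAndIndex(const std::vector<uint8_t>& data,
                               Attributes& attrs) const = 0;
};

// Reads a 32-bit big-endian integer starting at 'offset'.
static uint32_t ReadBigEndian32(const std::vector<uint8_t>& data, size_t offset)
{
    return (uint32_t(data[offset]) << 24) | (uint32_t(data[offset + 1]) << 16)
        | (uint32_t(data[offset + 2]) << 8) | uint32_t(data[offset + 3]);
}

// Example translator: recognizes PNG and extracts width and height from
// the IHDR chunk that directly follows the 8-byte signature.
class PngSketchTranslator : public SketchTranslator {
public:
    bool SniffAndIndex(const std::vector<uint8_t>& data,
                       Attributes& attrs) const override
    {
        static const uint8_t kSignature[8]
            = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
        static const uint8_t kIHDR[4] = {'I', 'H', 'D', 'R'};

        // Signature (8) + chunk length (4) + chunk type (4) + width (4) + height (4)
        if (data.size() < 24)
            return false;
        if (!std::equal(kSignature, kSignature + 8, data.begin()))
            return false;
        if (!std::equal(kIHDR, kIHDR + 4, data.begin() + 12))
            return false;

        // Attribute names below are made up for this sketch.
        attrs["mime"] = "image/png";
        attrs["width"] = std::to_string(ReadBigEndian32(data, 16));
        attrs["height"] = std::to_string(ReadBigEndian32(data, 20));
        return true;
    }
};

// Asks each translator in turn; the first one that recognizes the data wins.
// This loop is where the perf concern shows up if every translator has to be
// loaded for every file.
static bool SniffWithAll(const std::vector<const SketchTranslator*>& translators,
                         const std::vector<uint8_t>& data, Attributes& attrs)
{
    for (const SketchTranslator* translator : translators) {
        if (translator->SniffAndIndex(data, attrs))
            return true;
    }
    return false;
}
```

A real version would presumably write the attributes to the file itself and would need some cheap pre-filter (by extension or existing MIME type, say) so that not every translator gets loaded each time.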