[openbeos] Re: Identifying Text Files
- From: "Axel Dörfler" <axeld@xxxxxxxxxxxxxxxx>
- To: openbeos@xxxxxxxxxxxxx
- Date: Fri, 09 Jun 2006 13:13:35 +0200 CEST
Ingo Weinhold <bonefish@xxxxxxxxxxxxxxx> wrote:
> since BeOS seems to have built-in support for recognizing files as text
> files, we want to have the same. I'm about to implement that, missing is
> basically the algorithm deciding whether (or with what probability) a
> buffer of bytes actually contains text.
>
> A simple but maybe a bit ignorant approach would be to check whether the
> buffer contains valid UTF-8 characters only (or more than, say, 95%). But
> maybe someone has better ideas...
I would add special rule semantics for this, i.e. a "text" rule and an
"ascii" rule, where the former would accept UTF-8 and the latter plain
ASCII only, maybe even with a way to specify the minimum share of the
buffer that has to match.
If you have a look at BSD's "file", the text magic happens in
ascmagic.c - it looks very reasonable to me, and could even identify
the charset for StyledEdit (at least in a basic way that should be
enough for the Western world).
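As an illustration only, the two rule types could be sketched like this in Python. Nothing here is taken from an actual implementation or from ascmagic.c: the names text_ratio, matches_text_rule and matches_ascii_rule, the set of tolerated control bytes, and the choice to treat an empty buffer as non-text are all assumptions. The 0.95 default is the figure Ingo mentioned.

```python
# Hypothetical sketch of a "text" rule (UTF-8) and an "ascii" rule.
# Assumption: tab, LF, CR and form feed are the only control bytes
# that count as text; every other control byte counts against the buffer.
ALLOWED_CONTROLS = frozenset(b"\t\n\r\f")


def text_ratio(buf: bytes, ascii_only: bool = False) -> float:
    """Return the share of bytes in buf that look like text (0.0 to 1.0)."""
    if not buf:
        return 0.0  # assumption: an empty buffer is not identified as text
    good = 0
    i = 0
    while i < len(buf):
        b = buf[i]
        if b < 0x80:
            # Plain ASCII: printable characters plus the tolerated controls.
            if b in ALLOWED_CONTROLS or 0x20 <= b < 0x7F:
                good += 1
            i += 1
            continue
        step = 1  # skip one byte if no valid sequence starts here
        if not ascii_only:
            # Try to decode a 2-, 3- or 4-byte UTF-8 sequence at position i,
            # relying on Python's strict decoder for validation.
            for n in (2, 3, 4):
                try:
                    buf[i:i + n].decode("utf-8")
                except UnicodeDecodeError:
                    continue
                good += n
                step = n
                break
        i += step
    return good / len(buf)


def matches_text_rule(buf: bytes, min_ratio: float = 0.95) -> bool:
    """'text' rule: accept UTF-8, tolerating a small share of bad bytes."""
    return text_ratio(buf) >= min_ratio


def matches_ascii_rule(buf: bytes, min_ratio: float = 1.0) -> bool:
    """'ascii' rule: accept plain ASCII only."""
    return text_ratio(buf, ascii_only=True) >= min_ratio
```

With this sketch, a UTF-8 encoded "Dörfler" passes the text rule but not the strict ascii rule, since two of its eight bytes are non-ASCII. A multi-byte character cut off at the end of the buffer counts as invalid, which is one reason a threshold below 100% may be useful.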
Bye,
Axel.
- Follow-Ups:
- [openbeos] Re: Identifying Text Files
- From: Ingo Weinhold
- References:
- [openbeos] Identifying Text Files
- From: Ingo Weinhold