[haiku-commits] Re: haiku: hrev48017 - src/data/mime_db/text

  • From: Adrien Destugues <pulkomandy@xxxxxxxxx>
  • To: haiku-commits@xxxxxxxxxxxxx
  • Date: Tue, 14 Oct 2014 11:22:42 +0200

On Tue, Oct 14, 2014 at 11:13:02AM +0200, Axel Dörfler wrote:
> On 14.10.2014 11:07, pulkomandy@xxxxxxxxxxxxx wrote:
> >+    "0.251 [0:15] (\"<?xml\" | \"<\\000x\\000m\\000l\")"
> That doesn't look right: '<' has no prefix (what is \\000 anyway?), and
> there is no '?'?

I forgot the '?'; it is added in hrev48019.

The rule is a bit tricky but it works:

\\000 is unescaped once by the rdef parser to \000. It is then unescaped
a second time by the MIME sniffing rule parser to a null character.
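As a rough illustration of the two passes, here is a small Python sketch.
Python's own backslash handling stands in for both the rdef parser and the
sniffing rule parser, which is only an approximation of what they really do;
the helper name unescape is made up for this example.

```python
import codecs

def unescape(text):
    # Stand-in for one parser pass: interpret backslash escapes once.
    # (Assumption: Python's escape rules approximate the real parsers.)
    return codecs.decode(text.encode("ascii"), "unicode_escape")

in_rdef_source = r"\\000"               # five characters, as typed in the .rdef file
after_rdef = unescape(in_rdef_source)   # backslash + "000", seen by the rule parser
after_sniffer = unescape(after_rdef)    # a single NUL character, used for matching

print(repr(in_rdef_source), repr(after_rdef), repr(after_sniffer))
```

Each pass removes one level of escaping, which is why the backslash has to be
doubled in the rdef source.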

The first character has no prefix, and the last one has no suffix. This
allows the rule to match UTF-16 no matter which endianness is used (both
"<\0?\0x\0m\0l\0" (little endian) and "\0<\0?\0x\0m\0l" (big endian) are
recognized). The [0:15] range also lets it skip the byte order mark, if any.
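To check this by hand, here is a minimal Python sketch. It assumes that
[0:15] means the pattern may start at any offset from 0 to 15, and it uses
the pattern with the '?' included. The sniff function is a hypothetical
stand-in written for this example, not the actual sniffer code.

```python
import codecs

# Byte pattern of the second alternative, after both unescaping passes.
PATTERN = b"<\x00?\x00x\x00m\x00l"

def sniff(data, pattern=PATTERN, first=0, last=15):
    """Return the first offset in [first, last] where pattern starts, or None."""
    for offset in range(first, last + 1):
        if data[offset:offset + len(pattern)] == pattern:
            return offset
    return None

samples = {
    "UTF-16LE": "<?xml".encode("utf-16-le"),
    "UTF-16BE": "<?xml".encode("utf-16-be"),
    "UTF-16LE + BOM": codecs.BOM_UTF16_LE + "<?xml".encode("utf-16-le"),
    "UTF-16BE + BOM": codecs.BOM_UTF16_BE + "<?xml".encode("utf-16-be"),
}

for name, data in samples.items():
    print(name, sniff(data))
```

The same byte pattern is found in all four samples, just at a different
offset each time. Plain ASCII "<?xml" is not matched by this pattern; that
case is covered by the first alternative of the rule.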

Note that the xhtml rule already uses a similar format and works fine. I
don't think there is currently a more readable way to express this.

