[haiku-development] Re: BString and UTF-8

From: Michael Bridgers <mibrid@xxxxxxx>
To: haiku-development@xxxxxxxxxxxxx
Date: Fri, 02 Dec 2011 20:44:09 -0500



On 12/2/2011 11:08 AM, Axel Dörfler wrote:


This all sounds like very welcomed changes, but this one cannot be done
in a backward compatible manner, unfortunately.
As long as all *Chars() methods deal with invalid UTF-8 correctly (as
they should), there is no security risk either. I would just add a
method IsValidUTF8() or something to that degree which you can use to
test it.

Bye,
Axel.


I'm certain that it CAN be done in a backward-compatible manner.

And *Chars methods, by definition, can't deal correctly with invalid UTF-8 strings. What should be done with a "replace char" method when, for example, the code point has one lead byte and 9 trailing bytes? There is no way that anything meaningful could be done.
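
To make that case concrete, here is a minimal C++ sketch. It assumes a naive counter that treats every byte that is not a trailing byte as the start of a character. `CountCharsNaive` and `MalformedSample` are hypothetical names for illustration only, not the actual BString implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical naive counter (an assumption, not BString code): every byte
// that is not a trailing byte (bit pattern 10xxxxxx) starts a new character.
static size_t
CountCharsNaive(const std::string& bytes)
{
	size_t count = 0;
	for (unsigned char byte : bytes) {
		if ((byte & 0xC0) != 0x80)
			count++;
	}
	return count;
}

// One lead byte of a two-byte sequence (0xC3) followed by nine trailing bytes.
static std::string
MalformedSample()
{
	return std::string("\xC3") + std::string(9, '\xA9');
}
```

With this counter the ten-byte sample is reported as a single "character", so a character-indexed operation on it has no well-defined byte range to work with.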

If you look at how ICU handles things, it doesn't allow invalid strings. If you try to create a UnicodeString with invalid input, it replaces the invalid code points with the replacement character, 0xfffd. Other systems that use Unicode do similar things.
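
The substitution idea can be sketched in plain C++ without ICU. This is only an illustration under stated assumptions: `WellFormedLength` and `ReplaceInvalidUTF8` are hypothetical helpers, and the sketch emits one U+FFFD per offending byte, which may not match ICU's exact substitution policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: returns the byte length (1..4) of the well-formed
// UTF-8 sequence starting at bytes[pos], or 0 if none starts there.
static size_t
WellFormedLength(const std::string& bytes, size_t pos)
{
	auto at = [&](size_t i) { return static_cast<unsigned char>(bytes[i]); };
	unsigned char lead = at(pos);
	if (lead < 0x80)
		return 1;

	size_t length;
	uint32_t codePoint;
	uint32_t minimum;	// smallest code point allowed for this length
	if (lead >= 0xC2 && lead <= 0xDF) {
		length = 2; codePoint = lead & 0x1F; minimum = 0x80;
	} else if (lead >= 0xE0 && lead <= 0xEF) {
		length = 3; codePoint = lead & 0x0F; minimum = 0x800;
	} else if (lead >= 0xF0 && lead <= 0xF4) {
		length = 4; codePoint = lead & 0x07; minimum = 0x10000;
	} else
		return 0;	// stray trailing byte, or a byte never valid as a lead

	if (bytes.size() - pos < length)
		return 0;	// truncated sequence
	for (size_t i = 1; i < length; i++) {
		unsigned char trail = at(pos + i);
		if ((trail & 0xC0) != 0x80)
			return 0;
		codePoint = (codePoint << 6) | (trail & 0x3F);
	}
	// Reject overlong forms, surrogates, and values above U+10FFFF.
	if (codePoint < minimum || codePoint > 0x10FFFF
		|| (codePoint >= 0xD800 && codePoint <= 0xDFFF))
		return 0;
	return length;
}

// Copies well-formed sequences unchanged and substitutes U+FFFD
// (bytes EF BF BD) for every byte that cannot start one.
static std::string
ReplaceInvalidUTF8(const std::string& bytes)
{
	std::string result;
	size_t pos = 0;
	while (pos < bytes.size()) {
		size_t length = WellFormedLength(bytes, pos);
		if (length == 0) {
			result += "\xEF\xBF\xBD";
			pos++;
		} else {
			result.append(bytes, pos, length);
			pos += length;
		}
	}
	return result;
}
```

Whatever the input, the result of `ReplaceInvalidUTF8` is well-formed UTF-8, so character-based methods applied afterwards always see complete sequences.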

I know that some security exploits have passed executable code as a string as a means to breach security of an OS. At the "Internationalization and Unicode Conference" in 2002, there was a paper presented that talked about security considerations with UTF-8. (http://unicode.org/iuc/iuc22/a323.html)

If you will give my changes a chance, I think you will see that everything I'm doing will have a positive effect on the BString class and Haiku.

Also, I'm not a committer. Someone will have to verify what I'm doing before it will be committed. I'm not going to submit something that will break things, because I know it will never make it into the source tree.


And yes, I already have a static IsValidUTF8() method on the BString.
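
The patch itself is not shown in this message, so the following is only a rough sketch of what such a check could look like. The free function `IsValidUTF8` and its signature are assumptions for illustration, not the actual BString method. It follows the well-formed byte ranges listed in the Unicode Standard.

```cpp
#include <cstddef>

// Hypothetical stand-in for a static validity check (signature assumed).
// Returns true only if the buffer is entirely well-formed UTF-8.
static bool
IsValidUTF8(const char* string, size_t length)
{
	const unsigned char* bytes
		= reinterpret_cast<const unsigned char*>(string);
	size_t i = 0;
	while (i < length) {
		unsigned char lead = bytes[i];
		size_t trailing;
		// Allowed range for the byte right after the lead byte; it is
		// narrowed below to exclude overlong forms, surrogates, and
		// code points above U+10FFFF.
		unsigned char low = 0x80;
		unsigned char high = 0xBF;

		if (lead < 0x80) {
			trailing = 0;
		} else if (lead >= 0xC2 && lead <= 0xDF) {
			trailing = 1;
		} else if (lead >= 0xE0 && lead <= 0xEF) {
			trailing = 2;
			if (lead == 0xE0)
				low = 0xA0;		// excludes overlong 3-byte forms
			if (lead == 0xED)
				high = 0x9F;	// excludes surrogates U+D800..U+DFFF
		} else if (lead >= 0xF0 && lead <= 0xF4) {
			trailing = 3;
			if (lead == 0xF0)
				low = 0x90;		// excludes overlong 4-byte forms
			if (lead == 0xF4)
				high = 0x8F;	// excludes values above U+10FFFF
		} else
			return false;	// stray trailing byte or invalid lead byte

		if (length - i - 1 < trailing)
			return false;	// sequence is cut off by the end of the buffer

		for (size_t k = 1; k <= trailing; k++) {
			unsigned char byte = bytes[i + k];
			if (byte < low || byte > high)
				return false;
			// Only the first trailing byte has a narrowed range.
			low = 0x80;
			high = 0xBF;
		}
		i += trailing + 1;
	}
	return true;
}
```

A caller could run this once on incoming data and decide whether to reject or repair the string before any character-based method touches it.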

Michael

