[haiku-development] Re: BString and UTF-8

  • From: Pete Goodeve <pete.goodeve@xxxxxxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Fri, 2 Dec 2011 17:05:45 -0800

On Sat, Dec 03, 2011 at 12:16:28AM +0100, Axel D wrote:
> On 12/03/2011 12:09 AM, Michael Bridgers wrote:
> >Keep in mind that all valid C-strings are a subset of valid UTF-8
> >strings. All valid C-strings will continue to work as they currently do.
> 
> You mix up C strings with ANSI strings. C is just a language, and C 
> strings can hold any byte but a zero value which denotes the end of a C 
> string.
> Therefore, you cannot provide a backwards compatible way to implement this.
> 
Thank you.  You got there before me... (:-/)

This is the sort of cavalier change that is definitely liable to break
things!  A C-string is by definition an array of arbitrary bytes, ending in
a null.  (My Stroustrup says that it can even contain a zero byte -- it's
just that the utility functions won't work on it!)  You can't tell -- or
dictate -- what a string might be used for.
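
To illustrate the point about embedded zero bytes, here is a minimal C++
sketch.  The names kBuffer, LibraryLength and StorageSize are made up for
this example; the array holds eight bytes, but strlen() stops counting at
the first zero it meets.

```cpp
#include <cstddef>
#include <cstring>

// Illustrative data only: "abc", an embedded zero byte, "def", and the
// terminating zero that the compiler appends. Eight bytes in total.
static const char kBuffer[] = "abc\0def";

// Length as the C library sees it: counting stops at the first zero byte.
std::size_t LibraryLength()
{
    return std::strlen(kBuffer);
}

// Size of the underlying array, including both zero bytes.
std::size_t StorageSize()
{
    return sizeof(kBuffer);
}
```

The bytes after the embedded zero are still there in the array; the
str*() functions just never look at them.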

Please don't do this.  By all means add UTF-8 methods to BString;
they would be very useful where appropriate.  Presumably they would report
an error if the string wasn't flagged as valid UTF-8.  But you must
allow other 8-bit encodings -- ISO-8859 for instance.
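
As a rough illustration of why an ISO-8859 string cannot simply be treated
as UTF-8, here is a minimal sketch of a validity check in C++.  IsValidUtf8
is a hypothetical helper written for this example, not an existing BString
method, and it assumes a null-terminated input.

```cpp
#include <cstddef>

// Hypothetical helper (not part of BString): returns true if the
// null-terminated byte string is well-formed UTF-8.
bool IsValidUtf8(const char* string)
{
    // Smallest code point that legitimately needs 0..3 trailing bytes;
    // anything below that is an overlong encoding.
    static const unsigned long kMinimum[] = { 0, 0x80, 0x800, 0x10000 };

    const unsigned char* p = reinterpret_cast<const unsigned char*>(string);
    while (*p != 0) {
        if (*p < 0x80) {
            // Plain 7-bit ASCII byte.
            ++p;
            continue;
        }

        int trailing;
        unsigned long codePoint;
        if ((*p & 0xE0) == 0xC0) {
            trailing = 1;
            codePoint = *p & 0x1F;
        } else if ((*p & 0xF0) == 0xE0) {
            trailing = 2;
            codePoint = *p & 0x0F;
        } else if ((*p & 0xF8) == 0xF0) {
            trailing = 3;
            codePoint = *p & 0x07;
        } else {
            // Stray continuation byte or an invalid lead byte.
            return false;
        }
        ++p;

        for (int i = 0; i < trailing; ++i, ++p) {
            // Each trailing byte must look like 10xxxxxx. This also
            // rejects a sequence cut short by the terminating zero.
            if ((*p & 0xC0) != 0x80)
                return false;
            codePoint = (codePoint << 6) | (*p & 0x3F);
        }

        // Reject overlong forms, UTF-16 surrogates and out-of-range values.
        if (codePoint < kMinimum[trailing] || codePoint > 0x10FFFF
            || (codePoint >= 0xD800 && codePoint <= 0xDFFF))
            return false;
    }
    return true;
}
```

With this check, the ISO-8859-1 bytes "caf\xE9" are a perfectly good
C-string but are rejected as UTF-8, while the UTF-8 encoding "caf\xC3\xA9"
and plain ASCII text both pass.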

Thanks,
        -- Pete --
