[haiku-development] Re: BString and UTF-8

  • From: Oliver Tappe <zooey@xxxxxxxxxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Sat, 03 Dec 2011 13:50:20 +0100

On 2011-12-03 at 11:04:35 [+0100], Stefano Ceccherini 
<stefano.ceccherini@xxxxxxxxx> wrote:
> On 03 Dec 2011 at 10:46, "Siarzhuk Zharski" <zharik@xxxxxx> wrote:
> >
> > On 03.12.2011 at 02:48, Michael Bridgers wrote:
> >>
> >> On 12/2/2011 8:05 PM, Pete Goodeve wrote:
> >>>
> >>>
> > My point is: BString is one of the basic types, and applying any
> > restrictions to its content may have dramatic consequences for
> > developers, who would then have to use other character container
> > classes for their purposes. That "UTF-8 only" restriction looks
> > impractical from my point of view.
> 
> I agree and suggest adding a utf8string class instead of extending
> BString. That way we don't have any restrictions on how it works or on
> the API.

Considering the use of BString in our API, I'd argue the other way around: 
when BString is used in our API (e.g. {BFont,BView}::TruncateString()), it 
is expected to always contain UTF-8 characters (or pure ASCII, but that's 
just a subset of UTF-8).

The BString functions that deal with character indices rely on the string 
containing UTF-8 characters, too. So if one puts an ISO 8859-* character 
stream into a BString, those functions would no longer give correct results.
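
To illustrate the point, here is a minimal sketch of how a UTF-8-aware 
character count typically works. It is not the actual BString code; 
CountUtf8Chars() is a hypothetical helper and std::string merely stands in 
for the byte buffer. Every byte that is not a continuation byte (bit 
pattern 10xxxxxx) is taken to start a new character:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the BString API.
// Counts characters by counting every byte that is NOT a UTF-8
// continuation byte (10xxxxxx). This is only meaningful if the
// input really is UTF-8.
std::size_t
CountUtf8Chars(const std::string& bytes)
{
	std::size_t count = 0;
	for (unsigned char byte : bytes) {
		if ((byte & 0xC0) != 0x80)
			count++;
	}
	return count;
}
```

Under that assumption, "25°C" encoded as UTF-8 (the degree sign is the 
two bytes 0xC2 0xB0) counts as four characters, while the same text in 
ISO 8859-1 (degree sign is the single byte 0xB0) counts as only three, 
because 0xB0 happens to look like a continuation byte.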

The only argument I can see for allowing other charsets into BString is 
binary compatibility: BString did not do any validity checks in BeOS R5, 
so we can't introduce them until we've reached R1. 
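
For reference, a validity check of the kind meant here could look roughly 
like the sketch below. This is only an assumption about what such a check 
might do, not existing or planned BString code; IsValidUtf8() is a 
hypothetical name and std::string stands in for the byte buffer. It 
rejects stray continuation bytes, truncated sequences, overlong encodings, 
surrogates and code points above U+10FFFF:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the BString API.
// Returns true if 'bytes' is a well-formed UTF-8 sequence.
bool
IsValidUtf8(const std::string& bytes)
{
	const std::size_t length = bytes.size();
	std::size_t i = 0;
	while (i < length) {
		unsigned char lead = bytes[i];
		if (lead < 0x80) {
			// plain ASCII, always valid
			i++;
			continue;
		}

		std::size_t extra;       // number of continuation bytes expected
		unsigned long minimum;   // smallest code point for this length
		unsigned long codePoint;
		if ((lead & 0xE0) == 0xC0) {
			extra = 1; minimum = 0x80; codePoint = lead & 0x1F;
		} else if ((lead & 0xF0) == 0xE0) {
			extra = 2; minimum = 0x800; codePoint = lead & 0x0F;
		} else if ((lead & 0xF8) == 0xF0) {
			extra = 3; minimum = 0x10000; codePoint = lead & 0x07;
		} else {
			// stray continuation byte or invalid lead byte
			return false;
		}

		// the sequence must not run past the end of the buffer
		if (length - i <= extra)
			return false;

		for (std::size_t j = 1; j <= extra; j++) {
			unsigned char next = bytes[i + j];
			if ((next & 0xC0) != 0x80)
				return false;
			codePoint = (codePoint << 6) | (next & 0x3F);
		}

		// reject overlong forms, surrogates and out-of-range values
		if (codePoint < minimum || codePoint > 0x10FFFF
			|| (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
			return false;
		}

		i += extra + 1;
	}
	return true;
}
```

Pure ASCII passes such a check unchanged, while a typical ISO 8859-1 
string with a byte like 0xE9 followed by ASCII fails it, which is exactly 
why the check could break existing binaries that store such data.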

After R1, we should really change that and make BString UTF-8-only. If 
there's enough demand, we could provide something like BByteString with a 
similar (but smaller) API that doesn't care about characters at all.

cheers,
        Oliver
