[haiku-development] Re: BString and UTF-8

  • From: pete.goodeve@xxxxxxxxxxxx
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Sat, 3 Dec 2011 11:22:11 -0800

On Sat, Dec 03, 2011 at 01:50:20PM +0100, Oliver Tappe wrote:
> 
> On 2011-12-03 at 11:04:35 [+0100], Stefano Ceccherini 
> <stefano.ceccherini@xxxxxxxxx> wrote:
> > That "UTF-8 only" restriction looks impractical from my point of view.
> > 
> > I agree and suggest adding a utf8string class instead of extending
> > BString. That way we don't have any restriction on how it works or on the
> > API.
> 
> Considering the use of BString in our API, I'd argue the other way around: 
> when BString is used in our API (e.g. {BFont,BView}::TruncateString()), it 
> is expected to always contain UTF-8 characters (or pure ASCII, but that's 
> just a subset of UTF-8).
> 
> The BString functions that deal with character indices rely on the string to 
> contain UTF-8 characters, too.
 
Hunh?  Where do you get that from?  The indexing methods (ByteAt() and
the [] operator) both explicitly return char.  I can see nothing in the
current BString API that has any concept of multibyte codes.
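
To make the byte-versus-character distinction concrete, here is a minimal
sketch. It uses std::string as a stand-in so it compiles without Haiku
headers; ByteAtSketch() and CountCodePointsSketch() are hypothetical names,
not part of BString, and the out-of-range behaviour is an assumption of the
sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for byte-wise indexing on a string buffer.
// std::string is used so the sketch does not depend on Haiku headers.
char ByteAtSketch(const std::string& text, std::size_t index)
{
	// Out-of-range reads yield 0 here (an assumption of this sketch).
	return index < text.size() ? text[index] : 0;
}

// Counts UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumes the input is well-formed UTF-8; no validation is done.
std::size_t CountCodePointsSketch(const std::string& text)
{
	std::size_t count = 0;
	for (unsigned char byte : text) {
		if ((byte & 0xC0) != 0x80)
			++count;
	}
	return count;
}
```

For the three-byte string "n\xC3\xA9" (an 'n' followed by a two-byte
e-acute), index 1 yields the lead byte 0xC3 rather than a whole character,
while the code point count is 2.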

I think it should be possible to add "CodePoint(int32 index)" and similar
methods, but it would be extremely unwise to try to revamp the current
assumptions. When a BString is intended for display (as in your
TruncateString()), sure, it should be UTF-8, but BString is meant as a
general utility object, and you can't anticipate what someone might want to
use it for.
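
A rough sketch of what such a code point accessor could look like, written
as a free function over std::string so that it stands alone. The name
CodePointSketch(), the -1 error return, and the decision to index by
character rather than by byte are all assumptions for illustration, not a
proposal for the actual signature. Overlong encodings and surrogates are not
rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: returns the code point of the index-th UTF-8
// character, or -1 if the index is out of range or the bytes are malformed.
// The existing byte-wise accessors would stay untouched alongside it.
int32_t CodePointSketch(const std::string& text, int32_t index)
{
	if (index < 0)
		return -1;

	std::size_t pos = 0;
	while (pos < text.size()) {
		unsigned char lead = static_cast<unsigned char>(text[pos]);
		std::size_t length;
		int32_t value;

		// The lead byte determines the sequence length and its payload bits.
		if (lead < 0x80) {
			length = 1;
			value = lead;
		} else if ((lead & 0xE0) == 0xC0) {
			length = 2;
			value = lead & 0x1F;
		} else if ((lead & 0xF0) == 0xE0) {
			length = 3;
			value = lead & 0x0F;
		} else if ((lead & 0xF8) == 0xF0) {
			length = 4;
			value = lead & 0x07;
		} else {
			return -1;	// stray continuation byte or invalid lead byte
		}

		if (pos + length > text.size())
			return -1;	// truncated sequence

		// Each continuation byte contributes six more bits.
		for (std::size_t i = 1; i < length; ++i) {
			unsigned char next = static_cast<unsigned char>(text[pos + i]);
			if ((next & 0xC0) != 0x80)
				return -1;
			value = (value << 6) | (next & 0x3F);
		}

		if (index == 0)
			return value;
		--index;
		pos += length;
	}
	return -1;	// fewer characters than requested
}
```

Note that this walks the string from the start on every call, so lookup by
character index is linear in the string length, unlike byte indexing.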

        -- Pete --

