[haiku-development] Re: BString and UTF-8

  • From: pete.goodeve@xxxxxxxxxxxx
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Sat, 3 Dec 2011 18:12:58 -0800

On Sun, Dec 04, 2011 at 12:22:45AM +0100, Oliver Tappe wrote:
> 
> On 2011-12-03 at 20:22:11 [+0100], pete.goodeve@xxxxxxxxxxxx wrote:
> > On Sat, Dec 03, 2011 at 01:50:20PM +0100, Oliver Tappe wrote:
> [ ... ]
> > > 
> > > The BString functions that deal with character indices rely on the string
> > > to contain UTF-8 characters, too.
> >  
> > Hunh?  Where do you get that from?  The indexing methods (ByteAt() and
> > the [] operator) both explicitly return char.  I can see nothing in the
> > current BString API that has any concept of multibyte codes.
> 
> Hm, let me think, where did I get this from? Ah, right, String.h. If you 
> would have taken care to look, you would have noticed that it declares 
> several ...Chars() methods that take character based lengths and indices as 
> parameters. Those support (and expect) UTF-8 encoding.
>  
Well, I have to admit I was looking at the docs [I'm a great believer in
docs.. (:-)] but it's quite likely that if I had looked at the header
I might have missed them...  Now if you had said "methods *added* to
the Haiku implementation", it would have been a bit more obvious.

Another problem of course is the ambiguous meaning of "Character".
I'm liable to be in automatic "C-mode" and equate: character = char = byte.
One probably should be strict and -- as David notes -- talk about
"Code Points".
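
To make the "char = byte" versus "Code Point" distinction concrete, here is
a minimal sketch in plain C++. It uses std::string rather than BString so it
can be compiled anywhere, and CountCodePoints() is a hypothetical helper, not
part of any Haiku API. It assumes the input is well-formed UTF-8 and does no
validation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Haiku API): counts UTF-8 code points by
// counting every byte that is NOT a continuation byte (bit pattern 10xxxxxx).
// Assumes well-formed UTF-8; malformed input is not detected.
std::size_t CountCodePoints(const std::string& text)
{
    std::size_t count = 0;
    for (unsigned char byte : text) {
        // Lead bytes and plain ASCII bytes each start a new code point.
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the string "na\xC3\xAFve" (the word "naive" with a diaeresis on the i,
encoded as the two bytes 0xC3 0xAF), size() reports 6 bytes while
CountCodePoints() reports 5 code points, which is exactly the gap between the
two readings of "character".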

But, as you say, it's all moot for now.

        -- Pete --

