[haiku-appserver] moreUTF8.h
- From: "Stephan Assmus" <superstippi@xxxxxx>
- To: haiku-appserver@xxxxxxxxxxxxx
- Date: Wed, 15 Jun 2005 19:39:55 +0200 CEST
Hi,
static inline bool
IsInsideGlyph(uchar ch)
{
	return (ch & 0xC0) == 0x80;
}
This code returns true for the following pattern, right?
10?? ????
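To make the mask concrete, here is a minimal sketch that checks a few sample bytes at compile time. It assumes uchar is a plain unsigned char (the typedef below stands in for the system header), and constexpr is added only so that static_assert can evaluate the function; neither is taken from moreUTF8.h itself.

```cpp
// Assumption: uchar is an unsigned 8-bit type; this typedef stands in
// for whatever header normally provides it.
typedef unsigned char uchar;

// Same test as in the snippet above. constexpr is an addition for this
// sketch so the checks below can run at compile time.
static inline constexpr bool
IsInsideGlyph(uchar ch)
{
	// 0xC0 = 1100 0000 keeps the top two bits; 0x80 = 1000 0000.
	return (ch & 0xC0) == 0x80;
}

// Bytes of the form 10?? ???? match...
static_assert(IsInsideGlyph(0x80), "1000 0000 should match");
static_assert(IsInsideGlyph(0xBF), "1011 1111 should match");

// ...while 0???, 110? and 1110 bytes do not.
static_assert(!IsInsideGlyph(0x7F), "0111 1111 should not match");
static_assert(!IsInsideGlyph(0xC3), "1100 0011 should not match");
static_assert(!IsInsideGlyph(0xE2), "1110 0010 should not match");
```

So the test is true for exactly the 64 byte values 0x80 through 0xBF.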
This code...
const char *ptr = text;
do {
	ptr++;
} while (IsInsideGlyph(*ptr));
return ptr - text;
...increments ptr once and then tests IsInsideGlyph(), which returns
true only when the top two bits of the byte are 10 (high bit set, next
bit clear). So how does this work for three-byte glyphs?
A three byte glyph looks like this (correct me if I'm wrong):
1110 ????
110? ????
10?? ????
So when IsInsideGlyph() tests the second byte, it would return false, no?
That would mean moreUTF8.h only works for two-byte glyphs. Can someone
confirm? If my observation is correct, I'm going to fix the problem
with count_utf8_bytes() that I introduced in my last commit. If there
is a better way, speak up! :-)
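One way to check this is to run the loop over known byte sequences. The sketch below wraps the loop quoted above in a function called GlyphLength(), a hypothetical name used only here; it is not claimed to be the name in moreUTF8.h. The sample strings follow the UTF-8 layout from RFC 3629, where every byte after the lead byte has the form 10xxxxxx, so a three-byte sequence is 1110xxxx 10xxxxxx 10xxxxxx. If that layout is what the input uses, the loop should step over all trailing bytes.

```cpp
// Assumption: uchar is an unsigned 8-bit type.
typedef unsigned char uchar;

static inline bool
IsInsideGlyph(uchar ch)
{
	return (ch & 0xC0) == 0x80;
}

// Hypothetical wrapper around the quoted loop: returns the number of
// bytes in the glyph that starts at text. text must be NUL-terminated
// and must not be empty.
static inline int
GlyphLength(const char *text)
{
	const char *ptr = text;
	do {
		ptr++;
	} while (IsInsideGlyph(*ptr));	// stops at the next lead byte or NUL
	return static_cast<int>(ptr - text);
}

// Sample inputs, one glyph each, encoded per RFC 3629:
//   "a"                 U+0061  0110 0001
//   "\xC3\xA9"          U+00E9  1100 0011  1010 1001
//   "\xE2\x82\xAC"      U+20AC  1110 0010  1000 0010  1010 1100
//   "\xF0\x9F\x98\x80"  U+1F600 1111 0000  1001 1111  1001 1000  1000 0000
```

Calling GlyphLength() on the four samples gives 1, 2, 3 and 4 respectively, because each trailing byte starts with 10 and the terminating NUL ends the loop.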
Best regards,
-Stephan
- Follow-Ups:
- [haiku-appserver] Re: moreUTF8.h
- From: Axel Dörfler