[openbeos] Re: [sort of OT,flame-ish] Re: Re: AW: Re: AW: Locale Kit

Pascal Goguey <pascal@xxxxxxxxxx> wrote:
> U < 128 : 1 byte
> 128 <= U < 2^12 : 2 bytes (e.g. accented letters)
> 2^12 <= U < 2^18 : 3 bytes (e.g. Japanese characters)
> 2^18 <= U <= MAX_UNICODE : 4 bytes

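That used to be correct, but is no longer, since the Unicode character 
set now extends beyond 16 bits (code points run up to U+10FFFF); UCS-2 
is "outdated" (or inappropriate for certain uses) as it can only 
represent a subset of the Unicode set, the Basic Multilingual Plane. 
UTF-8 needs up to 4 bytes for a character these days, and UTF-16 needs 
two 16-bit units (a surrogate pair) for anything outside that subset 
(although it might not have been adopted everywhere).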
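
For reference, here is a minimal sketch in Python of the byte counts as 
the current UTF-8 and UTF-16 definitions give them. The function names 
are made up for illustration, and note that the boundaries (0x80, 0x800, 
0x10000) are not the same as the ones in the table quoted above.

```python
def utf8_len(cp: int) -> int:
    """Bytes needed to encode code point `cp` in UTF-8 (hypothetical helper)."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        # Outside the Unicode code space, or a surrogate (not encodable).
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:        # 7 bits: ASCII
        return 1
    if cp < 0x800:       # 11 bits: e.g. accented Latin letters
        return 2
    if cp < 0x10000:     # 16 bits: rest of the BMP, e.g. kana and kanji
        return 3
    return 4             # supplementary planes, up to U+10FFFF


def utf16_len(cp: int) -> int:
    """Bytes needed to encode code point `cp` in UTF-16 (hypothetical helper)."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    # One 16-bit unit inside the BMP, a surrogate pair (two units) outside it.
    return 2 if cp < 0x10000 else 4


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x3042, 0x1D11E):
        print(f"U+{cp:04X}: UTF-8 {utf8_len(cp)} bytes, UTF-16 {utf16_len(cp)} bytes")
```

The results can be checked against Python's own encoders, e.g. 
len(chr(cp).encode("utf-8")).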

Bye,
   Axel.

