[openbeos] Re: [sort of OT, flame-ish] Re: Re: AW: Re: AW: Locale Kit

  • From: Pascal Goguey <pascal@xxxxxxxxxx>
  • To: openbeos@xxxxxxxxxxxxx
  • Date: Tue, 23 Dec 2003 22:55:25 +0900


That used to be correct, but is no longer, since the Unicode character
set now spans 4 bytes; UCS-2 is "outdated" (or inappropriate for
certain uses) as it can only represent a subset of the Unicode set.
UTF-8 and UTF-16 sequences can grow much longer these days (although that
might not have been adopted everywhere).

Unicode actually spans (at least in the last version I saw a few months ago,
3.2?) from 0 to 10FFFF hex, which is a little over 20 bits (21 bits are needed).
The assigned characters are all under 2^20, and the extra 10000 hex code points
above that are reserved for private use. They are unassigned, and any application
can define its own characters there (control characters, for instance). That's
basically the idea. There are other private use areas under 2^20 and even under
2^16; the one I know runs from E000 to F8FF.
For more info, see the Unicode home page.
As for the 4 bytes, maybe you were talking about UTF-32, not Unicode itself.
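
To make the numbers concrete, here is a rough Python sketch (not anything from
the Locale Kit; the names MAX_CODE_POINT, PRIVATE_USE_RANGES, is_private_use and
encoded_sizes are made up for illustration). It assumes the private use ranges
as the Unicode standard lists them, and shows that the code space needs 21 bits
while the number of bytes depends only on the encoding form chosen:

```python
# Illustrative sketch only; all names below are hypothetical helpers.

# Last valid Unicode code point: 17 planes of 0x10000 code points each.
MAX_CODE_POINT = 0x10FFFF

# Private use ranges (the last two code points of each plane are
# noncharacters, hence ...FFFD rather than ...FFFF).
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),      # inside the first 2^16 code points
    (0xF0000, 0xFFFFD),    # plane 15, still under 2^20
    (0x100000, 0x10FFFD),  # plane 16, the "extra 10000 hex" above 2^20
]


def is_private_use(cp: int) -> bool:
    """Return True if the code point lies in one of the ranges above."""
    return any(lo <= cp <= hi for lo, hi in PRIVATE_USE_RANGES)


def encoded_sizes(cp: int) -> dict:
    """Return the number of bytes one code point takes in each encoding form."""
    ch = chr(cp)
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # "-be" avoids the 2-byte BOM
        "utf-32": len(ch.encode("utf-32-be")),  # "-be" avoids the 4-byte BOM
    }


if __name__ == "__main__":
    print(MAX_CODE_POINT.bit_length())  # bits needed for the whole code space
    print(is_private_use(0xE000), is_private_use(0x41))
    print(encoded_sizes(0x41))      # LATIN CAPITAL LETTER A
    print(encoded_sizes(0x1D11E))   # MUSICAL SYMBOL G CLEF, outside the first 2^16
```

So "4 bytes" is a property of UTF-32 (always) or of UTF-8 and UTF-16 for code
points above FFFF, not of the character set as such.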



Bye, Axel.
