[haiku-development] B_UNICODE_CONVERSION vs UTF-8

From: "François Revol" <revol@xxxxxxx>
To: haiku-development@xxxxxxxxxxxxx
Date: Tue, 17 Mar 2009 20:39:50 +0100 CET

B_UNICODE_CONVERSION is actually UCS-2 (or maybe UTF-16, not even
sure...).

Vision has a special case to handle B_UNICODE_CONVERSION as UTF-8 (it
just skips calling convert_*_utf8()); however, this lets invalid
UTF-8 strings through.

IMO we should support a B_UTF8_CONVERSION and rename
B_UNICODE_CONVERSION to B_UCS2_CONVERSION or whichever, to avoid
misunderstanding. That would also allow using convert_*_utf8() to
validate, or eventually correct, broken strings by converting them
from ISO Latin-1 as a fallback (it seems ZETA's version does this when
it finds broken UTF-8 as input).
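
A minimal sketch of the validate-then-fall-back idea, in plain C++ with
no Haiku headers. The helper names (is_valid_utf8, latin1_to_utf8,
sanitize_utf8) are hypothetical and are not part of the Haiku or ZETA
API; this only illustrates the behaviour described above, assuming that
any input which is not well-formed UTF-8 should be reinterpreted as
ISO Latin-1.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Haiku API): returns true if `s` is well-formed
// UTF-8. Rejects truncated sequences, stray continuation bytes, overlong
// forms, UTF-16 surrogates, and code points above U+10FFFF.
static bool is_valid_utf8(const std::string& s)
{
	const unsigned char* p = reinterpret_cast<const unsigned char*>(s.data());
	const size_t n = s.size();
	// Smallest code point that legitimately needs a sequence of each length.
	static const unsigned long kMinCodePoint[] = {0, 0, 0x80, 0x800, 0x10000};

	size_t i = 0;
	while (i < n) {
		const unsigned char c = p[i];
		size_t len;
		unsigned long cp;
		if (c < 0x80) {
			i++;
			continue;
		} else if ((c & 0xE0) == 0xC0) {
			len = 2; cp = c & 0x1F;
		} else if ((c & 0xF0) == 0xE0) {
			len = 3; cp = c & 0x0F;
		} else if ((c & 0xF8) == 0xF0) {
			len = 4; cp = c & 0x07;
		} else {
			return false;	// stray continuation byte or invalid lead byte
		}
		if (n - i < len)
			return false;	// sequence is cut off at the end of the string
		for (size_t k = 1; k < len; k++) {
			if ((p[i + k] & 0xC0) != 0x80)
				return false;	// expected a 10xxxxxx continuation byte
			cp = (cp << 6) | (p[i + k] & 0x3F);
		}
		if (cp < kMinCodePoint[len])
			return false;	// overlong encoding
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return false;	// surrogate halves are not valid in UTF-8
		if (cp > 0x10FFFF)
			return false;	// beyond the Unicode range
		i += len;
	}
	return true;
}

// Hypothetical helper: every ISO Latin-1 byte maps to the code point with
// the same value, so bytes >= 0x80 become a two-byte UTF-8 sequence.
static std::string latin1_to_utf8(const std::string& s)
{
	std::string out;
	out.reserve(s.size() * 2);
	for (unsigned char c : s) {
		if (c < 0x80) {
			out += static_cast<char>(c);
		} else {
			out += static_cast<char>(0xC0 | (c >> 6));
			out += static_cast<char>(0x80 | (c & 0x3F));
		}
	}
	return out;
}

// Hypothetical helper: pass valid UTF-8 through unchanged, otherwise assume
// the input was ISO Latin-1 and convert it.
static std::string sanitize_utf8(const std::string& s)
{
	return is_valid_utf8(s) ? s : latin1_to_utf8(s);
}
```

With this sketch, a client that currently skips conversion for UTF-8
input could call sanitize_utf8() instead. Falling back to Latin-1 is a
guess about the sender's real encoding, so a real implementation might
prefer to report the failure to the caller.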

Comments ?

François.

Follow-Ups:
- [haiku-development] Re: B_UNICODE_CONVERSION vs UTF-8
  - From: Ingo Weinhold
