[gmpi] Re: GMPI req's draft 1 for review.

  • From: Steve Harris <S.W.Harris@xxxxxxxxxxxxxxx>
  • To: gmpi@xxxxxxxxxxxxx
  • Date: Wed, 8 Dec 2004 10:48:39 +0000

On Tue, Dec 07, 2004 at 03:28:43PM +0100, Sébastien Métrot wrote:
> Steve Harris wrote:
> 
> >On Tue, Dec 07, 2004 at 01:45:57 +1300, Jeff McClintock wrote:
> > 
> >
> >>Hi,
> >>My point is, both the Windows and Mac APIs use 16-bit Unicode.  If you pass my
> >>plugin an 8-bit string I've got to convert it, and I've got to allocate memory
> >>to hold the longer 16-bit representation.  Then if I need to pass it back
> >>I've got to convert it back to multibyte.
> >
> >Actually Windows uses UTF-16, and I'm pretty sure OSX uses UTF-8 natively.
> >
>
> No, Windows uses UCS-2. UTF-16 is a multibyte (variable length) 

Not in XP and newer:
http://msdn.microsoft.com/netframework/programming/bcl/faq/systemiofaq.aspx
It used to be UCS-2, but they ran into the usual fixed-width problem: 16
bits cannot cover every Unicode character. They seem to be recommending
UTF-8 for I/O, and the MS tools generate UTF-8 files unless told otherwise.
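
To make the fixed-width point concrete, here is a minimal Python sketch. It
is not from the thread, and the helper names utf16_code_units and
fits_in_ucs2 are made up for illustration. It counts how many 16-bit code
units UTF-16 needs for a string: any character outside the Basic
Multilingual Plane takes two (a surrogate pair), which a fixed-width UCS-2
API cannot represent.

```python
def utf16_code_units(text: str) -> int:
    """Return the number of 16-bit code units needed to encode text as UTF-16."""
    # "utf-16-le" emits no byte-order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-le")) // 2


def fits_in_ucs2(text: str) -> bool:
    """True if every character has a code point that fits in 16 bits."""
    return all(ord(ch) <= 0xFFFF for ch in text)


bmp_text = "M\u00e9trot"     # every character is inside the BMP
clef = "\U0001D11E"          # MUSICAL SYMBOL G CLEF, outside the BMP

print(utf16_code_units(bmp_text), fits_in_ucs2(bmp_text))  # 6 True
print(utf16_code_units(clef), fits_in_ucs2(clef))          # 2 False
```

Code that assumes one 16-bit unit per character would treat the clef as two
separate characters, which is the kind of bug the move from UCS-2 to UTF-16
exposes.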

> The problem with Unicode encodings other than UTF-8 is that they are not 
> ASCII compatible, which means that you'll need conversion routines a lot 
> if you want to continue using 8-bit APIs (think about the number of 
> dependencies one can get in a plugin: libjpeg, freetype2, libpng, zlib, 
> libsndfile, etc.). UTF-8 is much more legacy-code friendly, even if 
> you still have some work to do to adapt your code without making 
> mistakes (glyph separation is a good example, as is hyphenation).
> 
> I've done quite a lot of Japanese, Chinese, Arabic, Hebrew, Korean, 
> Russian localization at some point in my career. I personally found that 
> UTF-8 often is a good solution for existing code. If you master all the 
> build chain then wchar_t (UCS-2 on Mac and Win32, UCS-4 on unices) may 
> be a good solution but then you'll need unicode fonts and all that jazz.

I've also done a fair bit of I18N work, and my experiences match yours;
UTF-8 is the least painful option.
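
A rough sketch of why that tends to hold for existing 8-bit code (my own
illustration, not something from the thread; the function name and the
sample file names are assumptions): ASCII text is byte-for-byte identical
in UTF-8, and non-ASCII characters only ever produce bytes >= 0x80, so
NUL-terminated C string APIs keep working. UTF-16 puts a zero byte next to
every ASCII character.

```python
def has_embedded_nul(data: bytes) -> bool:
    """True if a NUL-terminated C API would see this byte string cut short."""
    return b"\x00" in data


ascii_name = "preset01.png"
mixed_name = "pr\u00e9r\u00e9glage.png"   # contains two non-ASCII characters

# Pure ASCII text is unchanged by UTF-8 encoding.
assert ascii_name.encode("utf-8") == ascii_name.encode("ascii")

# Non-ASCII characters become multi-byte sequences made only of bytes >= 0x80,
# so they can never be mistaken for NUL, '/', '.' or any other ASCII byte.
utf8 = mixed_name.encode("utf-8")
high_bytes = [b for b in utf8 if b >= 0x80]
print(len(mixed_name), len(utf8), len(high_bytes))

# UTF-8 has no embedded NULs; UTF-16 has them even for plain ASCII.
print(has_embedded_nul(utf8), has_embedded_nul(ascii_name.encode("utf-16-le")))
```

The flip side, as noted in the quoted text, is that byte length and
character count no longer match, so code that splits or truncates strings
still needs attention.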

The Unices I work with (Solaris, Linux) use UTF-8 internally, as do most
databases and XML engines.

- Steve 

