[gmpi] Re: string encoding in the API (UTFs)

  • From: thockin@xxxxxxxxxx
  • To: gmpi@xxxxxxxxxxxxx
  • Date: Wed, 14 Dec 2005 09:01:47 -0800

On Wed, Dec 14, 2005 at 09:38:12PM +1300, Jeff McClintock wrote:
> I understand as a Linux programmer you prefer UTF-8.
> 
> I'm just saying, for me, wide-chars are easier.  Fixed-size ones. I 
> guess that would be UCS-16 on Windows.

16-bit fixed-width characters *can't* represent all of Unicode.  If we're
going to support Unicode, shouldn't we support all of Unicode?

>   Mostly it seems a waste for both plugin and host to use 16-bit, but 
> to double-convert each string to UTF-8 and back as it crosses the GMPI API.

Even if that's the case, how much string mangling do you really expect to
be doing?

> I already support Windows-98 ( ASCII ) and Windows-XP from the same 
> source code, by using the macro TCHAR to represent either "char" or 
> "wchar_t".

So you have two builds and two environments?

>   I'm not sure why you're against this, considering on Linux you will 
> only ever use TCHAR=char and will therefore not be affected by it.

Hey, I would like to localize, too.  Linux can't just be char.  We're all
going to be affected by the decision.

I'm not necessarily against anything.  But you haven't addressed any of
the shortcomings of a 16-bit encoding:

- It's still variable length, and if you don't handle that your code is
  broken.

- It's not ASCII compatible, which FORCES all hosts to handle it
  explicitly.

- It's got byte-ordering problems if it ever goes cross-system.

- It causes a source-code portability mess (how do you declare a string
  literal?  "foo" or L"foo"?).
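
The first three points can be sketched in a few lines of C.  This is only
an illustration under my own assumptions: utf16_encode and utf16_put are
hypothetical helper names, not part of any GMPI proposal.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
 * Returns the number of 16-bit code units written (1 or 2),
 * or 0 if cp is not a valid Unicode scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF) return 0;                /* outside Unicode */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* reserved for surrogates */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                  /* fits in one unit */
        return 1;
    }
    cp -= 0x10000;                              /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate */
    return 2;
}

/* Hypothetical helper: serialize one code unit in an explicit byte
 * order, which is needed as soon as the data leaves the machine. */
void utf16_put(uint16_t unit, int big_endian, unsigned char out[2])
{
    if (big_endian) {
        out[0] = (unsigned char)(unit >> 8);
        out[1] = (unsigned char)(unit & 0xFF);
    } else {
        out[0] = (unsigned char)(unit & 0xFF);
        out[1] = (unsigned char)(unit >> 8);
    }
}
```

U+1D11E (MUSICAL SYMBOL G CLEF) comes out as the two units D834 DD1E, so
code that assumes one unit per character is already wrong.  Plain 'A'
serializes as 41 00 or 00 41 depending on byte order, and either way it
contains a zero byte that char-based code will read as a terminator.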

It seems to me that UTF-16 has all the problems of UTF-8 and more, with
none of the advantages.  The *only* thing it has going for it is Windows
and Mac.  That shouldn't be ignored, but neither should the accompanying
drawbacks.

To be intellectually honest:
        http://www.unicode.org/notes/tn12/
        http://www.linux.com/howtos/Unicode-HOWTO-1.shtml
        http://www.open-std.org/jtc1/sc22/wg20/docs/n830-utf-16-c.txt

I'm doing some reading now on how I would use UTF-16 in a classic C
environment: how do I declare string literals, call sprintf, etc.?
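
For reference, here is a rough sketch of what standard C offers for narrow
versus wide strings.  The function names label_narrow and label_wide are
made up for illustration.  Note that this is wchar_t, not UTF-16 as such:
wchar_t is 16 bits on Windows but typically 32 bits on Linux, so L"foo"
does not give you the same encoding everywhere.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical example: format "name=value" into a narrow buffer.
 * n is the buffer size in chars. */
int label_narrow(char *buf, size_t n, const char *name, int value)
{
    return snprintf(buf, n, "%s=%d", name, value);
}

/* Same thing with wide characters.  The literal needs the L prefix,
 * the format uses %ls for a wide string argument, and n is the
 * buffer size in wchar_t elements, not bytes. */
int label_wide(wchar_t *buf, size_t n, const wchar_t *name, int value)
{
    return swprintf(buf, n, L"%ls=%d", name, value);
}
```

Every literal and every formatting call has to pick one form or the
other, which is where the TCHAR-style macros come from.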

Tim

----------------------------------------------------------------------
Generalized Music Plugin Interface (GMPI) public discussion list
Participation in this list is contingent upon your abiding by the
following rules:  Please stay on topic.  You are responsible for your own
words.  Please respect your fellow subscribers.  Please do not
redistribute anyone else's words without their permission.

Archive: //www.freelists.org/archives/gmpi
Email gmpi-request@xxxxxxxxxxxxx w/ subject "unsubscribe" to unsubscribe
