[gmpi] Re: string encoding in the API (UTFs)

  • From: thockin@xxxxxxxxxx
  • To: gmpi@xxxxxxxxxxxxx
  • Date: Wed, 14 Dec 2005 11:43:35 -0800

On Thu, Dec 15, 2005 at 07:30:12AM +1300, Jeff McClintock wrote:
> > You're still ignoring all the other issues.
> 
> Well I decided to use wide-chars (UCS-2) based on this article...
> 
> "The Absolute Minimum Every Software Developer Absolutely, Positively 
> Must Know About Unicode and Character Sets (No Excuses!)"
> 
> http://www.joelonsoftware.com/articles/Unicode.html
> 
> It discusses most of the points you raise.

It doesn't discuss that wchar_t is not actually guaranteed to be any
specific width.
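
To make the width problem concrete, here is a minimal sketch in C.  The
two function names are made up for illustration and are not part of any
proposed GMPI API.  It only reports what the current compiler does with
wchar_t; the C standard does not fix the answer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: width of wchar_t in bits for this compiler. */
unsigned wchar_bits(void)
{
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}

/* Hypothetical helper: bytes occupied by a wide string, excluding the
 * terminator.  The result depends on the compiler's wchar_t width. */
size_t wide_name_bytes(const wchar_t *s)
{
    assert(s != NULL);
    return wcslen(s) * sizeof(wchar_t);
}
```

With a 16-bit wchar_t, wide_name_bytes(L"gain") is 8; with a 32-bit
wchar_t it is 16.  Two objects built with different compilers would
disagree on the layout of the same buffer.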

It does address byte-order marks, but are we really going to suggest that
is a useful thing to do on every string?

It doesn't address the fact that Windows' implementation (UCS-2) can't
represent all of Unicode: anything above U+FFFF is out of its reach.

It doesn't discuss the lack of any standards-based UTF-16 support.

He equates UCS-2 and UTF-16 which is FLAT WRONG.
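
The difference shows up as soon as a code point above U+FFFF is encoded.
The following is a minimal sketch of UTF-16 encoding for a single code
point; utf16_encode is a name invented for this example.  UTF-16 emits
two 16-bit units (a surrogate pair) for such code points, and UCS-2 has
no equivalent mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16 (hypothetical helper).
 * Writes 1 or 2 code units to out and returns how many were written,
 * or 0 if cp is not a valid Unicode scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                       /* out of range or a lone surrogate */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;          /* BMP: same value as in UCS-2 */
        return 1;
    }
    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For example, U+1D11E (MUSICAL SYMBOL G CLEF) comes out as the pair
0xD834 0xDD1E, which a strict UCS-2 consumer would treat as two
meaningless characters.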

> I guess you may not agree that UCS-2 (16 bit wide-char) is the way to 
> go. But I hope it at least explains my reasoning (better than I can).

The best reasoning for it is "that's what Windows does".  But if we want
to support Unicode, UCS-2 doesn't cut it.  If you want to convert from
proper Unicode (whether that's UTF-8, UTF-16 or UTF-32) into UCS-2 in your
host, you should absolutely feel free.  But you do so at the peril of
getting some characters wrong.
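
As a rough illustration of that peril, here is a sketch of what a host
doing such a conversion might look like.  The function name and the
choice to substitute U+FFFD (the Unicode replacement character) are
assumptions made for this example; a real host could pick a different
policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side conversion from UTF-32 to UCS-2.
 * Code points that do not fit in UCS-2 are replaced with U+FFFD.
 * out must have room for n units.  Returns the number of characters
 * that were lost. */
size_t utf32_to_ucs2_lossy(const uint32_t *in, size_t n, uint16_t *out)
{
    size_t lost = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int is_surrogate = (in[i] >= 0xD800 && in[i] <= 0xDFFF);
        if (in[i] <= 0xFFFF && !is_surrogate) {
            out[i] = (uint16_t)in[i];   /* representable as-is */
        } else {
            out[i] = 0xFFFD;            /* information is gone */
            lost++;
        }
    }
    return lost;
}
```

Converting the three code points U+0041, U+1D11E and U+266D this way
keeps the first and last and loses the middle one.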

I'm still leaning towards UTF-8 as the single standards-based option which
completely covers Unicode.  We have to pass strings between objects from
different compilers as well as possibly across networks.  We want to share
source code as well as translation databases between platforms.  The only
answer I see that meets all of those is UTF-8.
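
For reference, a minimal sketch of UTF-8 encoding for one code point
(utf8_encode is a name invented here, not a proposed API).  The output
is a plain byte sequence, so it has no byte-order or wchar_t-width
dependence, and it reaches every code point up to U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 (hypothetical helper).
 * Writes 1 to 4 bytes to out and returns how many were written,
 * or 0 if cp is not a valid Unicode scalar value. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {                    /* ASCII passes through unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

U+1D11E, which UCS-2 cannot hold at all, becomes the four bytes
F0 9D 84 9E, identical on every platform and compiler.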

I'm willing to be convinced otherwise, but the more reading I do, the more
I think UTF-8 is right.

Tim

----------------------------------------------------------------------
Generalized Music Plugin Interface (GMPI) public discussion list
Participation in this list is contingent upon your abiding by the
following rules:  Please stay on topic.  You are responsible for your own
words.  Please respect your fellow subscribers.  Please do not
redistribute anyone else's words without their permission.

Archive: //www.freelists.org/archives/gmpi
Email gmpi-request@xxxxxxxxxxxxx w/ subject "unsubscribe" to unsubscribe
