[haiku-i18n] Re: Language code for zh_hans Fwd: Catkeys update from HTA wanted?

  • From: Rimas Kudelis <rq@xxxxxx>
  • To: haiku-i18n@xxxxxxxxxxxxx
  • Date: Sat, 17 Dec 2011 10:39:53 +0200

2011.12.17 10:11, Niels Sascha Reedijk wrote:
Hi,

On Fri, Dec 16, 2011 at 11:11 PM, Adrien Destugues
<pulkomandy@xxxxxxxxxxxxxxxxx>  wrote:
Which would be preferred: the dash or an underscore? Perhaps
we can use the dash in this case because there is no relation between
zh-Hans and zh.

  And capitalization, should we use the ISO version? So zh-Hans? Or
shall we capitalize it completely, like pt_BR?


Don't mix them. pt_BR is Portuguese, the variant spoken in Brazil.
zh-Hans and zh-Hant are, for localization purposes, entirely different
languages (they read the same, but use different alphabets). Another example
of that would be the two writings of Norwegian - Nynorsk and Bokmål (I hope
I get the spelling right...).

The result is that there is no 'zh' language at all. It's either zh-Hans or
zh-Hant. Go with ISO, and that should be fine as it's also what ICU uses?
ICU seems to use the normal dash even for language-country locales,
like pt-BR [1]. So in this sense it might have to do with the demands
that POSIX makes, but this really goes into foreign territory. KDE
uses zh_CN for simplified Chinese. I understand the difference and why
zh-Hans is better, but I am wondering whether we should keep
compatibility in mind.

Unless, of course, LC_ALL and the Haiku locale are unrelated.

Regards,

N>


[1] http://www.iana.org/assignments/language-subtag-registry

Currently, the user chooses his language in one tab in Locale prefs, and country in the other. I think LANG and all the LC_* variables should be composed (and I believe they are) by joining these two preferences, that is:

  * if I choose English language and China region, the locale would be en-CN (or en_CN)
  * if I choose Chinese Simplified language and China region, the locale would be zh-Hans-CN (or with underscores)
  * if I choose Chinese Simplified language and do not specify the region, the locale would be zh-Hans
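The composition rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Haiku's Locale Kit actually builds these variables; the function name `compose_locale` and its parameters are hypothetical.

```python
def compose_locale(language, region=None, separator="-"):
    """Join a language preference and an optional region into one locale ID.

    Hypothetical helper for illustration only; not Haiku's implementation.
    """
    # The language preference may already carry a script subtag
    # (e.g. "zh-Hans"), so split it on either separator before joining.
    parts = language.replace("_", "-").split("-")
    if region:
        parts.append(region.upper())
    return separator.join(parts)


print(compose_locale("en", "CN"))                      # en-CN
print(compose_locale("zh-Hans", "CN"))                 # zh-Hans-CN
print(compose_locale("zh-Hans"))                       # zh-Hans
print(compose_locale("zh-Hans", "CN", separator="_"))  # zh_Hans_CN
```

The separator is left as a parameter so the same rule covers both the dash and the underscore spelling.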

So, CN (or TW) is quite likely to appear in those variables anyway, and since Haiku is an OS on its own, I'm not sure it makes sense to require that backwards compatibility. Linux may move on from its current scheme one day too, you never know...

WRT dash vs. underscore... I'd think if BCP47 specifies and ICU uses dashes, and we don't have backwards compatibility to stick to, then why not use dashes? I suspect that for command-line applications ported from, say, Linux, simply renaming their .po files to our scheme would do (haven't tested though), and even if not, it should probably be considered a bug in GNU gettext, not in our code...
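To make the renaming idea concrete, here is a rough Python sketch that rewrites a locale name into the dash form with the casing BCP 47 recommends (lowercase language, title-case four-letter script, uppercase two-letter region). The function name `normalize_tag` is made up for this example, and it does no validation against the subtag registry.

```python
def normalize_tag(tag, separator="-"):
    """Rewrite a locale name using BCP 47's recommended casing.

    Hypothetical helper; it only adjusts case and separators and does not
    check that the subtags are actually registered.
    """
    subtags = tag.replace("_", "-").split("-")
    result = []
    for index, subtag in enumerate(subtags):
        # Subtags at the start, or right after a one-character singleton,
        # stay lowercase.
        after_singleton = index > 0 and len(subtags[index - 1]) == 1
        if index == 0 or after_singleton:
            result.append(subtag.lower())
        elif len(subtag) == 2:
            result.append(subtag.upper())       # region, e.g. BR
        elif len(subtag) == 4:
            result.append(subtag.capitalize())  # script, e.g. Hans
        else:
            result.append(subtag.lower())
    return separator.join(result)


print(normalize_tag("pt_BR"))       # pt-BR
print(normalize_tag("zh_hans_cn"))  # zh-Hans-CN
```

Passing `separator="_"` gives the underscore spelling instead, should that turn out to be needed for ported tools.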

Rimas
