[haiku-i18n] Re: Language code for zh_hans Fwd: Catkeys update from HTA wanted?

From: Rimas Kudelis <rq@xxxxxx>
To: haiku-i18n@xxxxxxxxxxxxx
Date: Sun, 18 Dec 2011 11:20:12 +0200

2011.12.17 23:20, Adrien Destugues wrote:

On 17/12/2011 09:39, Rimas Kudelis wrote:
2011.12.17 10:11, Niels Sascha Reedijk wrote:
Hi,

On Fri, Dec 16, 2011 at 11:11 PM, Adrien Destugues
<pulkomandy@xxxxxxxxxxxxxxxxx>  wrote:
What would be more preferred. Use the dash? Or an underscore? Perhaps
we can use the dash in this case because there is no relation between
the zh-Hans and zh.

And capitalization: should we use the ISO version, so zh-Hans? Or
shall we capitalize it completely, like pt_BR?
Don't mix them. pt_BR is Portuguese, the variant spoken in Brazil.
zh-Hans and zh-Hant are, for localization purposes, entirely different
languages (they read the same, but use different alphabets). Another
example of that would be the two written forms of Norwegian, Nynorsk
and Bokmål (I hope I get the spelling right...).
The result is that there is no 'zh' language at all. It's either
zh-Hans or zh-Hant. Go with ISO, and that should be fine as it's also
what ICU uses?
ICU seems to use the normal dash even for language-country locales,
like pt-BR [1]. So in this sense it might have to do with the demands
that POSIX makes, but this really goes into foreign territory. KDE
uses zh_CN for simplified Chinese. I understand the difference and why
zh-Hans is better, but I am wondering whether we should keep
compatibility in mind.

Unless, of course, LC_ALL and the Haiku locale are unrelated.

Regards,

N>


[1] http://www.iana.org/assignments/language-subtag-registry
Currently, the user chooses his language in one tab in Locale prefs,
and country in the other. I think LANG and all the LC_* variables
should be composed (and I believe they are) by joining these two
preferences, that is:
* if I choose English language and China region, the locale would be
  en-CN (or en_CN)
* if I choose Chinese Simplified language and China region, the
  locale would be zh-Hans-CN (or with underscores)
* if I choose Chinese Simplified language and do not specify the
  region, the locale would be zh-Hans
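The joining rule described above could be sketched roughly as follows. This is only an illustration of the idea, not the Locale Kit's actual code; compose_locale and its sep parameter are hypothetical names invented for this example.

```python
def compose_locale(language, region=None, sep="-"):
    """Join a language tag and an optional region into one locale ID.

    Hypothetical helper: `language` is what the Language tab would
    yield (e.g. "zh-Hans"), `region` what the other tab would yield.
    """
    # Normalise the language part so one separator is used throughout.
    parts = language.replace("_", "-").split("-")
    if region:
        parts.append(region)
    return sep.join(parts)


print(compose_locale("en", "CN"))             # en-CN
print(compose_locale("zh-Hans", "CN"))        # zh-Hans-CN
print(compose_locale("zh-Hans"))              # zh-Hans
print(compose_locale("zh-Hans", "CN", "_"))   # zh_Hans_CN
```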
So, CN (or TW) is quite likely to appear in those variables anyway,
and since Haiku is an OS on its own, I'm not sure it makes sense to
require that backwards compatibility. Linux may also move on from
their current scheme one day too, you never know...
WRT dash vs. underscore... I'd think if BCP47 specifies and ICU uses
dashes, and we don't have backwards compatibility to stick to, then
why not use dashes? I suspect that for command-line applications
ported from say Linux, simply renaming their .po files to our scheme
would do (haven't tested though), and even if not, it should probably
be considered a bug in GNU Gettext, not with us then...
Rimas
LC_ALL is unrelated. The code we're talking about here is the one for
catalogs, which does not use it.

Yeah, but I assume it's the Locale Kit that sets those variables that
can later be used by e.g. KDE apps. That's all I was saying.

You could add a country-specific code to zh-Hans, giving something
like zh-Hans_CN for China. That is helpful if there is some other
country using a variant of the language, same as pt_BR is different
from pt_PT and fr_CA is not exactly like fr_FR.
The implementation allows "pt_BR" to fall back to "pt" when there is no
pt_BR string available.
Note that this language+country is chosen in the "language" tab, and
is unrelated to the country code in the formatting tab. It is not
possible to make up arbitrary codes such as en_FR (as English is not a
language usually spoken in France) for the language selection.


Ah, I didn't know that. Thanks.

The thing to remember is:
* _ is the marker for fallback. zh-Hans_CN can thus fall back to
  zh-Hans, but not to zh nor zh_CN or whatever else.
* - is a normal character, used as a separator in some language codes.
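As a rough illustration of the two rules above (a sketch only, not the Locale Kit's real implementation; fallback_chain is a made-up name), the list of catalog codes to try can be derived by cutting at underscores and leaving dashes alone:

```python
def fallback_chain(code):
    """Return the catalog codes to try, most specific first.

    Only "_" marks a fallback point; "-" is treated as an ordinary
    character inside a language code.
    """
    parts = code.split("_")
    return ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]


print(fallback_chain("zh-Hans_CN"))  # ['zh-Hans_CN', 'zh-Hans']
print(fallback_chain("pt_BR"))       # ['pt_BR', 'pt']
print(fallback_chain("zh-Hans"))     # ['zh-Hans']
```

Note that "zh" never shows up in the chain for zh-Hans_CN, which matches the behaviour described above.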

Hm, you said if the Locale Kit doesn't do something that it should, then
it should be fixed. I would guess that treating dash as a fallback
marker could be one of those things, if what we want is to be closer to
BCP47 and ICU. Unless there's a really big reason not to.

And regarding Chinese in particular, I don't think there's anything to
be afraid of with the fallback mechanism: since no catalogs for zh
without modifiers will exist, the fallback mechanism should simply fall
back to the next language in the list, e.g. English (and it's a bug if
it does not). So I don't think there's much of an argument for treating
script modifiers differently from language modifiers.
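To make that concrete, here is a minimal sketch of the lookup I have in mind, assuming both dash and underscore count as fallback points. The names candidates and pick_catalog and the set of available catalogs are hypothetical, made up for this example; this is not how the Locale Kit is written today.

```python
def candidates(code):
    """List a code and its shorter forms, cutting at the last '-' or '_'."""
    out = [code]
    while True:
        cut = max(code.rfind("-"), code.rfind("_"))
        if cut < 0:
            return out
        code = code[:cut]
        out.append(code)


def pick_catalog(preferred, available):
    """Return the first available catalog for the user's language list."""
    for lang in preferred:
        for cand in candidates(lang):
            if cand in available:
                return cand
    return None  # no catalog at all; the caller keeps the original strings


available = {"zh-Hans", "zh-Hant", "en"}  # illustrative only
print(pick_catalog(["zh-Hans-CN", "en"], available))  # zh-Hans
print(pick_catalog(["zh", "en"], available))          # en
```

Since no plain zh catalog exists, a request that gets trimmed down to zh finds nothing there and the search moves on to the next preferred language.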

By the way, I just looked a bit closer at locale preferences, and I'm a
bit surprised that for each language that can be typed in multiple
scripts, multiple entries exist (one for the language itself and one for
that language written in each script). For example, for Chinese, there
are Chinese, Chinese Simplified and Chinese Traditional entries. I don't
think this makes sense, does it? I would suggest the following scheme:

* for each language with a Suppress-Script attribute in BCP47 (such as
  Punjabi), don't provide an option for "That language (That script)",
  leaving only "That language" instead, and listing countries under it
* for each language without a Suppress-Script attribute (such as
  Chinese), don't provide an option for "That language" without a script
  modifier at all, leaving only the options with the script modifier set
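The scheme above could be sketched like this. The LANGUAGES table is a tiny hand-written stand-in for the real registry data, and menu_entries is a hypothetical name; it only illustrates which entries would be kept in the list.

```python
# Illustrative data only: a tiny stand-in for the language subtag registry.
LANGUAGES = {
    "pa": {"name": "Punjabi", "suppress_script": "Guru",
           "scripts": {"Guru": "Gurmukhi"}},
    "zh": {"name": "Chinese", "suppress_script": None,
           "scripts": {"Hans": "Simplified", "Hant": "Traditional"}},
}


def menu_entries(languages):
    """Build (code, label) pairs for the language list."""
    entries = []
    for code, info in sorted(languages.items()):
        if info["suppress_script"]:
            # Script is implied: offer only the plain language.
            entries.append((code, info["name"]))
        else:
            # No implied script: offer only the script-qualified forms.
            for script, label in sorted(info["scripts"].items()):
                entries.append((f"{code}-{script}",
                                f"{info['name']} ({label})"))
    return entries


for code, label in menu_entries(LANGUAGES):
    print(code, label)
```

With this data the list contains pa, zh-Hans and zh-Hant, and no plain zh entry.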

This would quickly get rid of a few useless options in our Locale
preflet, which can only be good, IMO.


Rimas

