[haiku-development] Re: Non-latin names in AboutSystem

  • From: Timothy Brown <stimut@xxxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Mon, 2 Jan 2012 13:04:27 -0800

On 2 January 2012 02:59, Ingo Weinhold <ingo_weinhold@xxxxxx> wrote:
>
> On 2011-12-30 at 23:31:26 [+0100], Donn Cave <donn@xxxxxxxxxxx> wrote:
> > Quoth "Ingo Weinhold" <ingo_weinhold@xxxxxx>,
> > [... re automatic translation ...]
> >
> > > Does that actually work in general, i.e. also for scripts that aren't
> > > character-based (like Chinese)?
> >
> > Work at all, or work well?  I don't know about that specific software,
> > but for example Google translation comes with transliteration to pinyin,
> > so of course if you have good enough data and good enough software, it's
> > theoretically possible.
>
> Sure, automatic transcription to pinyin definitely works fine in principle
> (and the reverse for the input method). I was wondering whether ICU
> implements that and the mentioned class in particular.

I just wanted to clarify some things about Mandarin since I've been
studying it and know a little about it. Nothing technical in here, so
skip if you want.

Automatic conversion from characters to pinyin should work OK, but be
aware that the mapping is not one-to-one. Some characters have more
than one pinyin equivalent (or pronunciation), depending on the
character's meaning in that particular context. In the other
direction, most pinyin syllables correspond to many different
characters (there are only ~1600 different sounds, including the
different tones, but tens of thousands of characters). I expect it
would be very difficult to build an automated system that goes from
pinyin to characters, even with context (the pinyin input method
requires the user to select which characters they meant from the
pinyin they typed).
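
To make the two directions concrete, here is a minimal Python sketch.
The table is a tiny hand-picked sample and the names (CHAR_TO_PINYIN,
invert, ambiguous_chars) are made up for illustration; this is not how
ICU or any real input method stores its data.

```python
from collections import defaultdict

# Tiny hand-picked sample, for illustration only. A real table would
# have thousands of entries and would need context to pick a reading.
CHAR_TO_PINYIN = {
    "行": ["xíng", "háng"],   # "to walk" vs. "row, profession"
    "长": ["cháng", "zhǎng"], # "long" vs. "to grow"
    "乐": ["lè", "yuè"],      # "happy" vs. "music"
    "是": ["shì"],
    "事": ["shì"],
    "市": ["shì"],
    "马": ["mǎ"],
    "码": ["mǎ"],
}


def invert(table):
    """Build the reverse mapping: pinyin syllable -> list of characters."""
    reverse = defaultdict(list)
    for char, readings in table.items():
        for reading in readings:
            reverse[reading].append(char)
    return dict(reverse)


def ambiguous_chars(table):
    """Characters with more than one reading (characters -> pinyin is not 1-1)."""
    return sorted(char for char, readings in table.items() if len(readings) > 1)


PINYIN_TO_CHAR = invert(CHAR_TO_PINYIN)

if __name__ == "__main__":
    print("Characters with several readings:", ambiguous_chars(CHAR_TO_PINYIN))
    for syllable, chars in PINYIN_TO_CHAR.items():
        print(syllable, "->", " ".join(chars))
```

Even in this small sample, a single syllable such as "shì" maps back to
several characters, which is why an input method has to ask the user
to choose.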

> > But I just wanted to point out that Chinese is also an example with more
> > than one possible transliteration.  Someone from Taiwan might use the
> > older Wade-Giles system, which is radically different, and even if you
> > supported both I don't know how you'd guess.
>
> Indeed, hanyu pinyin is not a transliteration but a phonetic transcription
> system. AFAIK it is used for Mandarin but not for Cantonese and others.

Correct. Pinyin is only used for Mandarin. But as Donn says, in
Taiwan, where they use Mandarin as well, they have several different
romanizations. Apparently as of 2008 though, the official romanization
system is pinyin, so I think it would be safe to just use pinyin and
not worry about the others.

As an example of different romanizations, the capital of China used to
be transcribed as "Peking", but is now transcribed as "Beijing" (using
pinyin). The pronunciation in Mandarin is still the same, of course,
but speakers of English will now probably pronounce it in a way that
is closer to the actual Mandarin.

Regards,
Tim
