[openbeos] Re: AW: Re: AW: Locale Kit

  • From: Pascal Goguey <pascal@xxxxxxxxxx>
  • To: openbeos@xxxxxxxxxxxxx
  • Date: Mon, 15 Dec 2003 18:08:10 +0900


The Romanian language uses Latin characters (it is a Latin language) and has some
special characters, but they are optional. I mean you can sort it the way you
said you can for French. The special characters are special forms of i, t, s
and a.

In the case of French, if I understood Axel's method correctly, let's take an example: illettré, île and iliaque cannot be sorted by a plain sort function, because î is outside ASCII and therefore compares greater than any of the unaccented letters. A plain sort would put île after zythum.
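To make the problem concrete, here is a minimal Python sketch (Python is my choice for illustration; it is not part of the Locale Kit) that sorts these words by raw code point. The word list is the one from the paragraph above; the NFC normalisation step is an assumption added so that "î" is a single code point regardless of how the text was typed.

```python
import unicodedata

words = ["zythum", "illettré", "île", "iliaque"]

# Assumption: normalise to NFC so "î" is the single code point U+00EE.
words = [unicodedata.normalize("NFC", w) for w in words]

# sorted() on str compares strings code point by code point.
plain = sorted(words)
print(plain)  # ['iliaque', 'illettré', 'zythum', 'île']
```

"île" lands last because U+00EE is greater than the code point of "z" (U+007A).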

So the proposed method (apparently) consists in first stripping
these strings down to temporary ASCII strings, sorting those, and then
putting the original strings in the same order.
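As I understand it, the strip-then-sort idea can be sketched as follows in Python. This is only my reading of the proposal, not Axel's actual code: strip_accents is a hypothetical helper that decomposes each string (NFD) and drops the combining marks, and the stripped string is used as the sort key for the original.

```python
import unicodedata

def strip_accents(s):
    """Hypothetical helper: decompose, then drop combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

words = ["zythum", "illettré", "île", "iliaque"]

# Sort the originals by their stripped form.
by_stripped = sorted(words, key=strip_accents)
print(by_stripped)  # ['île', 'iliaque', 'illettré', 'zythum']
```

For this word list the result looks right, since no two words share the same stripped form.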

But there is a logical mistake here. Let's call:
Strip: a function that removes accents and the like.
A_Order: ASCII order
F_Order: French dictionary order

A_Order(Strip(s1), Strip(s2)) can be deduced from F_Order(s1, s2)
F_Order(s1, s2) cannot be deduced from A_Order(Strip(s1), Strip(s2))

Here is an example:

These two words, cote and côte, should appear in this order:
côte should come after cote.

If you perform an ASCII sort of the stripped strings, you end up
sorting cote and cote, and since the strings are equal, you cannot
decide which of the original strings comes first. No surprise here,
you lose information by stripping.
It's a good quick approximation, but not a fully working method.
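The tie can be shown with a short Python sketch. strip_accents and two_level_key are hypothetical names; the second key component (the NFC form of the original, compared by code point) is just one possible tie-breaker and an assumption of mine, not the full French dictionary rules.

```python
import unicodedata

def strip_accents(s):
    """Hypothetical helper: decompose, then drop combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

def two_level_key(s):
    """Primary key: stripped string. Secondary key: the original, to break ties."""
    original = unicodedata.normalize("NFC", s)
    return (strip_accents(original), original)

# Both words strip to "cote", so the stripped key cannot tell them apart.
print(strip_accents("cote") == strip_accents("côte"))  # True

# sorted() is stable, so with equal keys the input order is simply kept.
print(sorted(["côte", "cote"], key=strip_accents))  # ['côte', 'cote']

# With a secondary key the result no longer depends on the input order.
print(sorted(["côte", "cote"], key=two_level_key))  # ['cote', 'côte']
```

The point is only that a second comparison level has to look at the accents that stripping threw away.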

In Japanese, it is even a bit trickier, since all the characters
that can be ordered come in at least two versions (hiragana and katakana).
The Chinese characters used in Japanese cannot really be sorted.
Well, of course they can, but the Chinese character ordering is not
used in dictionaries, for instance. At least not in the dictionaries I know.


regards, Valentin

-----Original Message-----
From: Axel Dörfler [mailto:axeld@xxxxxxxxxxxxxxxx]
Sent: Monday, 15 December 2003 00:28
To: openbeos@xxxxxxxxxxxxx
Subject: [openbeos] Re: AW: Locale Kit

"Chira, Valentin" <Chira@xxxxxxxxxxxxxxxx> wrote:
I speak Romanian, so I could do the translations if you want
me to.

That would be very nice, although we still need information like "how to sort Cyrillic letters" - for example, I don't know if you can just use their Unicode values for that (you can in English, but not in other European languages like French). So if you have information like this for Romanian or other languages you know about, I'd be very happy to hear about it. For example, you can sort French correctly when you remove all diacritics from the letters ("à" -> "a", "ç" -> "c", etc.).

Thanks in advance!


