Romanian uses Latin characters (it is a Latin language) and has some
special characters, but they are optional. I mean you can sort it the way
you said you can for French. The special characters are special forms of
i, t, s and a.
In the case of French, if I understood Axel's method correctly, well, let's take an example: illettré, île, iliaque cannot be sorted by a plain sort function, because î is outside of ASCII and therefore compares greater than any of the unaccented letters. The regular sort would put île after zythum.
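To make the problem concrete, here is a small sketch in Python (my choice of language, nothing from the Locale Kit itself). It assumes the strings are in NFC form, so that î is the single code point U+00EE, and sorts them by plain code point value:

```python
import unicodedata

# Assumption: compare precomposed (NFC) strings, so "î" is one code point (U+00EE).
words = [unicodedata.normalize("NFC", w)
         for w in ["zythum", "illettré", "île", "iliaque"]]

# A plain sort compares code points; U+00EE is greater than "z" (U+007A).
plain = sorted(words)
print(plain)  # île ends up last, after zythum
```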
So the proposed method (apparently) consists of first stripping these strings down to temporary ASCII strings, sorting those, and then putting the original strings in the same order.
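As I understand it, the method could be sketched like this in Python. The helper name strip_accents is mine, and removing combining marks after an NFD decomposition is only one possible way to do the stripping, not necessarily what Axel has in mind:

```python
import unicodedata

def strip_accents(text):
    """Hypothetical 'Strip' step: decompose, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

words = ["zythum", "illettré", "île", "iliaque"]

# Sort the originals by their stripped form instead of by raw code points.
by_stripped = sorted(words, key=strip_accents)
print(by_stripped)  # île now sorts among the other i-words, before zythum
```

Using the stripped string as a sort key avoids having to build the temporary list and map it back to the originals by hand.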
But there is a logical mistake here. Let's call:
  Strip:   a function that removes accents and the like
  A_Order: ASCII order
  F_Order: French dictionary order
These two words, cote and côte, should appear in this sequence in F_Order: côte should come after cote.
If you perform an ASCII sort (A_Order) of the stripped strings, you end up sorting cote and cote, and since the strings are equal, you cannot decide which of the original strings comes first. No surprise here: you lose information by stripping. It's a good quick approximation, but not a fully working method.
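Here is the tie shown in Python, again with my own hypothetical strip_accents helper. The second half adds a tie-break on the original string; that is only an assumption about how one might patch the method, and code point order between accented variants is not guaranteed to match real French dictionary rules:

```python
import unicodedata

def strip_accents(text):
    """Hypothetical 'Strip' step: decompose, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Both words collapse to the same key, so the sort cannot tell them apart.
same_key = strip_accents("cote") == strip_accents("côte")

# Python's sort is stable: tied items just keep whatever order they came in.
a = sorted(["côte", "cote"], key=strip_accents)
b = sorted(["cote", "côte"], key=strip_accents)

def two_level_key(word):
    """Assumed patch: stripped form first, original (NFC) form as tie-break."""
    return (strip_accents(word), unicodedata.normalize("NFC", word))

c = sorted(["côte", "cote"], key=two_level_key)
print(same_key, a, b, c)
```

With the tie-break, cote comes before côte regardless of input order, but this is still an approximation and not F_Order.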
In Japanese, it is even a bit trickier, since all the characters that can be ordered come in at least 2 versions (hiragana and katakana). The Chinese characters used in Japanese cannot really be sorted. Well, of course, they can, but the Chinese character ordering is not used in dictionaries, for instance. At least not in the dictionaries I know.
-----Original Message-----
From: Axel Dörfler [mailto:axeld@xxxxxxxxxxxxxxxx]
Sent: Monday, 15 December 2003 00:28
To: openbeos@xxxxxxxxxxxxx
Subject: [openbeos] Re: AW: Locale Kit
"Chira, Valentin" <Chira@xxxxxxxxxxxxxxxx> wrote:
I speak Romanian, so I could do the translations if you want me to.
That would be very nice, although we still need information like "how to sort Cyrillic letters" - for example, I don't know if you can just use their Unicode value for that (you can in English, but not in other European languages like French). So if you have information like this for Romanian or other languages you know about, I'd be very happy to hear about it. For example, you can sort French correctly when you remove all diacritics from the letters ("à" -> "a", "ç" -> "c", etc.).
Thanks in advance!