
[openbeos] Re: AW: Re: AW: Locale Kit
- From: Pascal Goguey <pascal@xxxxxxxxxx>
- To: openbeos@xxxxxxxxxxxxx
- Date: Mon, 15 Dec 2003 18:08:10 +0900
Hello,
The Romanian language uses Latin characters (it is a Latin language) and
has some special characters, but they are optional. I mean you can sort it
the way you said you can for French. The special characters are special
forms of i, t, s and a.
In the case of French, if I understood Axel's method correctly,
well, let's take an example: illettré, île, iliaque cannot be
sorted by a plain sort function that compares character codes,
because î is outside of ASCII and therefore greater than any of
the other letters. The regular sort would put île after zythum.
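
To make the problem concrete, here is a minimal sketch in Python, whose default string comparison works on Unicode code points. The word list is the one from the example above; the variable names are only illustrative, and the strings are normalized to precomposed form (NFC) so that î is a single character.

```python
import unicodedata

# Words from the example; NFC makes sure "î" and "é" are single code points.
words = [unicodedata.normalize("NFC", w)
         for w in ["zythum", "illettré", "île", "iliaque"]]

# Python compares strings code point by code point.
# "î" is U+00EE, which is greater than "z" (U+007A).
plain_sorted = sorted(words)
print(plain_sorted)  # ['iliaque', 'illettré', 'zythum', 'île']
```

The word île ends up last, after zythum, which is not what a French dictionary does.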
So the proposed method (apparently) consists of first stripping
these strings to temporary ASCII strings, sorting those, and then
putting the original strings in the same order.
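
A rough sketch of that strip-then-sort idea, as I understand it. This is not Axel's actual code: the function name strip and the use of Unicode decomposition (NFD) to drop accents are assumptions made for illustration.

```python
import unicodedata

def strip(s):
    """Remove accents: decompose, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

words = ["zythum", "illettré", "île", "iliaque"]

# Sort the original strings by the order of their stripped forms.
by_stripped = sorted(words, key=strip)
print(by_stripped)  # ['île', 'iliaque', 'illettré', 'zythum']
```

For this word list the result matches the order one would expect.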
But there is a logical mistake here. Let's call:
strip: a function that removes accents and the like.
A_Order: ASCII order
F_Order: French dictionary order
A_Order(strip(s1), strip(s2)) can be deduced from F_Order(s1, s2)
BUT:
F_Order(s1, s2) cannot be deduced from A_Order(strip(s1), strip(s2))
Here is an example:
These two words, cote and côte, should appear in this sequence:
côte should come after cote.
If you perform an ASCII sort of the stripped strings, you end up
sorting cote and cote, and since the strings are equal, you cannot
decide which of the original strings comes first. No surprise here:
you lose information by stripping.
It's a good quick approximation, but not a fully working method.
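
The tie can be shown with a short Python sketch. The helper names strip and two_level_key are hypothetical, and the tie-break used here (falling back to the original string when the stripped forms are equal) is only one simple way to put the lost information back; it is not claimed to be the full French dictionary order.

```python
import unicodedata

def strip(s):
    """Remove accents: decompose, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def two_level_key(s):
    """Compare stripped forms first; on a tie, compare the accented originals."""
    return (strip(s), unicodedata.normalize("NFC", s))

# Both words strip to the same string, so the stripped sort cannot order them:
# the result just keeps whatever order the input happened to have.
print(strip("cote") == strip("côte"))       # True
print(sorted(["côte", "cote"], key=strip))  # ['côte', 'cote']

# With the second level, cote reliably comes before côte.
print(sorted(["côte", "cote"], key=two_level_key))  # ['cote', 'côte']
```

The first level is exactly the stripped comparison, so the quick approximation is kept and only the ties are resolved differently.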
In Japanese, it is even a bit more tricky, since all the characters
that can be ordered come in at least 2 versions (hiragana and katakana).
The Chinese characters used in Japanese cannot really be sorted.
Well, of course they can, but the Chinese character ordering is not
used in dictionaries, for instance. At least not in the dictionaries I know.
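
For the kana part, a small sketch of one possible approach: fold katakana onto hiragana before comparing, so that both versions of a character sort together. The function name fold_katakana is hypothetical, the sketch relies on the fact that the basic katakana block in Unicode sits 0x60 above the hiragana block, and it ignores Chinese characters, long-vowel marks and other details a real implementation would need.

```python
def fold_katakana(s):
    """Map basic katakana (U+30A1..U+30F6) onto the matching hiragana."""
    out = []
    for ch in s:
        cp = ord(ch)
        if 0x30A1 <= cp <= 0x30F6:
            out.append(chr(cp - 0x60))
        else:
            out.append(ch)
    return "".join(out)

words = ["カメラ", "かさ", "あめ", "イヌ"]

# A code point sort keeps all hiragana words before all katakana words.
print(sorted(words))  # ['あめ', 'かさ', 'イヌ', 'カメラ']

# Folding first interleaves the two scripts; the original string breaks ties.
kana_sorted = sorted(words, key=lambda w: (fold_katakana(w), w))
print(kana_sorted)  # ['あめ', 'イヌ', 'かさ', 'カメラ']
```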
Pascal
regards,
Valentin
-----Original Message-----
From: Axel Dörfler [mailto:axeld@xxxxxxxxxxxxxxxx]
Sent: Monday, 15 December 2003 00:28
To: openbeos@xxxxxxxxxxxxx
Subject: [openbeos] Re: AW: Locale Kit
"Chira, Valentin" <Chira@xxxxxxxxxxxxxxxx> wrote:
I speak Romanian, so I could do the translations if you want
me to.
That would be very nice, although we still need information like "how
to sort Cyrillic letters" - for example, I don't know if you can just
use their Unicode value for that (you can in English, but not in other
European languages like French).
So if you have information like this for Romanian or other languages
you know about, I'd be very happy to hear about it. For example, you can
sort French correctly when you remove all diacritics from the letters
("à" -> "a", "ç" -> "c", etc.).
Thanks in advance!
Bye,
Axel.
Pascal