Hi Joseph,

I think there are several other languages which have compound characters as well, such as Tamil. Normally, a single character is only represented by one Unicode character. However, in Tamil, a compound character, even though it only looks like one character visually, is actually represented by multiple Unicode characters. I'm not sure if this is how Unicode handles all such languages.
As a sidenote, this actually causes a problem concerning speak typed characters for Tamil users:
http://www.nvda-project.org/ticket/1428

Strangely, your first Korean example (ㄱㅏ, ga) is two Unicode characters, but the second (관, gwan) is only one. So, for the first, doing as you request isn't too difficult. For the second, we have to decompose the character. It looks like we can do this in Python fairly easily (unicodedata.normalize with form NFD). So, if I'm correct, 관 (U+AD00) becomes the three-character sequence U+1100 U+116A U+11AB, which looks the same on screen.
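A minimal sketch of the NFD decomposition described above, written for Python 3 (an assumption; the NVDA code base at the time of this thread may differ). The helper name code_points is hypothetical and only there to make the result visible, since the decomposed text renders the same as the original.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return ["U+%04X" % ord(ch) for ch in text]

precomposed = "\uad00"  # 관 (gwan) stored as one precomposed Hangul syllable
decomposed = unicodedata.normalize("NFD", precomposed)

print(code_points(precomposed))  # ['U+AD00']
print(code_points(decomposed))   # ['U+1100', 'U+116A', 'U+11AB']

# The first example (ㄱㅏ) is already two characters and NFD leaves it alone.
two_jamo = "\u3131\u314f"
print(unicodedata.normalize("NFD", two_jamo) == two_jamo)  # True
```

Note that NFD produces conjoining jamo (U+1100 and so on), not the compatibility jamo (U+3131 and so on) used in the ㄱㅏ example, so any description lookup would need entries for whichever set is used.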
The question is whether this is desirable for other languages. Also, this will affect European languages as well; e.g. Ç (C cedilla, U+00C7) decomposes to C followed by a combining cedilla (U+0043 U+0327). At a guess, I'd think it's not desirable for some languages. The problem is that it'd be difficult for NVDA to know which characters to do it for and which not. I guess we could make it a config option, but that kinda sucks for new Korean users.
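One possible way around the "which characters" problem, sketched here as an assumption rather than anything decided in this thread, is to decompose only characters in the Unicode Hangul Syllables block (U+AC00 to U+D7A3) and pass everything else through untouched. The function name split_hangul_only is hypothetical.

```python
import unicodedata

HANGUL_FIRST = 0xAC00  # first precomposed Hangul syllable
HANGUL_LAST = 0xD7A3   # last precomposed Hangul syllable

def split_hangul_only(text):
    """Decompose precomposed Hangul syllables; leave all other characters as-is."""
    parts = []
    for ch in text:
        if HANGUL_FIRST <= ord(ch) <= HANGUL_LAST:
            # NFD of a single syllable yields its two or three conjoining jamo.
            parts.extend(unicodedata.normalize("NFD", ch))
        else:
            parts.append(ch)
    return parts

print(len(split_hangul_only("\uad00")))  # 3 (관 is split into its jamo)
print(len(split_hangul_only("\u00c7")))  # 1 (Ç is left precomposed)
```

This keeps European text unaffected without a config option, but it does nothing for Tamil or other scripts, which would need their own rules.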
The other question is what to do when a user presses speak current word (numpad5) thrice, which spells the word with character descriptions. Do we split the compound characters there as well?
I would file a ticket for this, at least for Korean. We can then determine from this email thread whether other languages will benefit.
Jamie

On 8/11/2012 4:54 PM, Joseph Lee wrote:
Hi folks,

I’m copying both the translations and dev_asia groups to get your feedback on the following: Are there languages besides Korean that require multiple character components when constructing a single character?

At least in Korean, there are character components that go into creating a single character (not a word). A single Korean character consists of an initial consonant, one or two vowels and zero or more final consonants. For example, the character “ga” (written as ㄱㅏ in Korean) has an initial consonant (G, pronounced “gi-yug”) and a vowel (ah). Likewise, the character “gwan” (written as 관 in Korean, meaning a crown) has the initial consonant “G”, the vowel “wa” and the final consonant “n” (pronounced “ni-eun”).

As of 2012.3, when the character description script is invoked (numpad2 twice quickly while the review cursor is on the character), the character itself is announced again if it is a Hangul character. The ideal behavior, requested by Korean users, is to announce the components of such a character when this script is executed.

For example, suppose the user puts the review cursor on the character “ga” and then does the following:

Current behavior under 2012.3:
· Presses numpad2: NVDA says “ga”.
· Presses numpad2 a second time: NVDA says “ga”.

Ideal behavior (being investigated and researched with Korean users for 2013.1):
· Presses numpad2: NVDA says “ga”.
· Presses numpad2 a second time: NVDA says “gi-yuk, ah”.

A naïve solution would be to map all 10773 possible consonant/vowel/consonant combinations in characterDescriptions.dic, at the risk of slower performance. A fellow Korean translator says he found a Python script that can calculate the components of a Korean character.
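The component calculation mentioned above can be done with plain arithmetic instead of a large dictionary. The following is a sketch in Python 3, not the script the translator found: the constants come from the Unicode Hangul syllable layout (19 initials, 21 vowels, 28 final slots including "no final"), while the function name hangul_components and the choice to return compatibility jamo are assumptions made for illustration.

```python
S_BASE = 0xAC00  # code point of the first precomposed syllable
V_COUNT = 21     # number of vowels
T_COUNT = 28     # number of finals, counting "no final" as index 0

# Compatibility jamo, listed in the order Unicode uses to build syllables.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

def hangul_components(ch):
    """Return (initial, vowel, final) for one Hangul syllable, else None."""
    index = ord(ch) - S_BASE
    if not 0 <= index < len(INITIALS) * V_COUNT * T_COUNT:
        return None  # not a precomposed Hangul syllable
    initial, rest = divmod(index, V_COUNT * T_COUNT)
    vowel, final = divmod(rest, T_COUNT)
    return INITIALS[initial], VOWELS[vowel], FINALS[final]

print(hangul_components("관"))  # ('ㄱ', 'ㅘ', 'ㄴ')
```

With this approach characterDescriptions.dic would only need entries for the individual jamo. A compound vowel such as “wa” comes back as a single jamo (ㅘ), and an empty string stands for a missing final consonant.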
I feel that if this is unique to Korean, then it’s something that we Korean users can work on ourselves; however, if there are other languages that use this kind of component system for constructing a character, they could give us some test scenarios for improving the character description module in the future to take this case into account. If you want, I’ll create a ticket for this case later this month.

Thanks.
//JL

_______________________________________________
Nvda-dev-asia mailing list
Nvda-dev-asia@xxxxxxxxxxxxxxxxxx
http://lists.nvaccess.org/listinfo/nvda-dev-asia
--
James Teh
Director, NV Access Limited
Email: jamie@xxxxxxxxxxxx
Web site: http://www.nvaccess.org/
Phone: +61 7 5667 8372