[liblouis-liblouisxml] Re: Table writing questions for back translation, sign vs always

  • From: "Joseph Lee" <joseph.lee22590@xxxxxxxxx>
  • To: <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Tue, 3 Feb 2015 02:25:51 -0800

Hi,
Ah yes, those were carried over from BRLTTY.
Cheers,
Joseph

-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx 
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Harri Pasanen
Sent: Tuesday, February 3, 2015 2:20 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: Table writing questions for back 
translation, sign vs always

Hi,

Indeed, in my experiments it seems that the first definition wins.

Another thing that confuses me is the duplicate definitions. For example, in 
ko.cti there are these two lines for dots 134-1235:

sign  134-1235
sign ᅱ 134-1235

My PC might lack the character displayed in the second row, but the question 
remains: why the two lines?

More generally, I wrote a small Python program that pulls in the includes and 
reports the "conflicts": if a dot pattern is already defined, it notifies me 
about that. As the first definition seems to win, I fail to see how subsequent 
definitions would matter, at least for back translation.
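
A minimal sketch of such a conflict checker follows. It is not the program 
mentioned above, only an illustration under stated assumptions: table contents 
are passed in as an in-memory dict (the file names and rules are hypothetical 
demo data), only a small subset of opcodes of the shape "opcode characters 
dots" is recognized, and a "conflict" simply means that a dot pattern was 
already seen earlier in read order.

```python
# Opcodes of the shape "<opcode> <characters> <dots>". This is a small subset
# picked for illustration, not the full set of liblouis opcodes.
SIMPLE_OPCODES = {"sign", "letter", "lowercase", "uppercase",
                  "digit", "punctuation", "always", "word"}


def iter_rules(name, tables, seen=None):
    """Yield (file, line_no, opcode, chars, dots) in the order files are read.

    `tables` maps a file name to its text. An `include` line is followed
    immediately, so rules from the included file come before later lines of
    the including file. A missing include raises KeyError in this sketch.
    """
    seen = set() if seen is None else seen
    if name in seen:  # guard against include cycles
        return
    seen.add(name)
    for line_no, line in enumerate(tables[name].splitlines(), start=1):
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment line
        opcode = fields[0].lower()
        if opcode == "include" and len(fields) >= 2:
            yield from iter_rules(fields[1], tables, seen)
        elif opcode in SIMPLE_OPCODES and len(fields) >= 3:
            # Anything after the third field (e.g. a trailing comment) is ignored.
            yield name, line_no, opcode, fields[1], fields[2]


def find_conflicts(root, tables):
    """Return (first_rule, later_rule) pairs that share a dot pattern."""
    first = {}
    conflicts = []
    for rule in iter_rules(root, tables):
        dots = rule[4]
        if dots in first:
            conflicts.append((first[dots], rule))
        else:
            first[dots] = rule
    return conflicts


# Hypothetical demo tables; only the "$" and "가" rules mirror this thread.
tables = {
    "chardefs-demo.cti": "# character definitions\nsign $ 1246\n",
    "ko-demo.ctb": "include chardefs-demo.cti\nalways 가 1246\nalways x 1346\n",
}

conflicts = find_conflicts("ko-demo.ctb", tables)
for first_rule, later_rule in conflicts:
    print("dots", later_rule[4], "first defined at", first_rule[:2],
          "and again at", later_rule[:2])
```

To run it against real tables, the dict could be filled by reading the files 
from the liblouis tables directory; that part is left out here to keep the 
sketch self-contained.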

These observations are generic, but for Korean in particular I see that ko.cti 
pulls in chardefs.cti, which includes latinLetterDef8Dots.uti, and as a result 
there is a large number of overlapping rules.  chardefs.cti and 
latinLetterDef8Dots.uti are read early, so they win and many rules in ko.cti 
seem to go unused.
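
To get a rough feel for how many rules go unused, one can tally the shadowed 
rules per file. The sketch below assumes the "first definition wins" behaviour 
seen in my experiments, which is an observation and not documented liblouis 
semantics. The rule list and file names are made-up demo data, given in the 
order the files would be read.

```python
from collections import Counter


def count_shadowed(rules):
    """Tally rules that lose to an earlier definition of the same dots.

    `rules` is an iterable of (file, chars, dots) tuples in read order.
    Returns (winners, shadowed): winners maps dots -> (file, chars) of the
    first definition, shadowed maps file -> number of its rules that lost.
    """
    winners = {}
    shadowed = Counter()
    for file, chars, dots in rules:
        if dots in winners:
            shadowed[file] += 1
        else:
            winners[dots] = (file, chars)
    return winners, shadowed


# Hypothetical rules in read order: files included early come first.
rules = [
    ("chardefs-demo.cti", "$", "1246"),
    ("latin-demo.uti", "a", "1"),
    ("ko-demo.cti", "가", "1246"),   # loses to "$" under the assumption
    ("ko-demo.cti", "x", "1346"),
]

winners, shadowed = count_shadowed(rules)
for file, count in shadowed.items():
    print(file, "has", count, "shadowed rule(s)")
```

A file with a high count is a candidate for reordering, or for a separate 
back-translation table.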

Perhaps a set of tables specific to Korean back translation would help here?

Thanks,

Harri

On 03/02/2015 10:31, Joseph Lee wrote:
> Hi Harri,
> Usually, rules defined in a file that gets included in another file take 
> precedence (Christian, John, Bert, am I right?). For example, suppose we have:
> File a.ctb:
> always x 1346
> Then:
> b.ctb:
> include a.ctb
> always x 13456
> The rule from a.ctb will be looked up first.
> This is why I divided the Korean table set into those files: ko.cti is the 
> base table, while ko-g1 and ko-g2 hold the rules specific to each grade and 
> each include ko.cti. Unlike other tables, ko-g2.ctb is not a superset of 
> ko-g1.ctb; otherwise some rules from grade 2 would not be applied. In Korean 
> braille (at least in forward translation), certain letters are brailled like 
> numbers, so we put a space between the letters and the number (in print there 
> is no space, but in braille there is), and this is what ko-g1 is for. ko-g2 
> does not include ko-g1 because of ambiguous rules regarding letters followed 
> immediately by numbers (no spaces), and this is due to the nature of the 
> contraction definitions: Korean grade 2 contractions are shorthand for grade 
> 1 braille letter definitions.
> To explain how Korean braille works, it is necessary to go over how Korean 
> letters are formed. Each Korean letter (letter as in compound character) is 
> composed of a consonant, a vowel and zero or one final consonant. In braille, 
> each consonant and vowel has its own braille dot combination, and each 
> consonant has two representations depending on where it appears. Although 
> each consonant uses four dots, the initial form uses dots 1256, while the 
> final form uses dots 2356 (shifted down) in order to distinguish consonant 
> forms.
> In grade 1, all letter forms are shown, but in grade 2, some letters omit 
> vowel dot combinations, and in the case of two consonants, they have 
> different dot combinations (the example you've provided is one of them). 
> However, in both grades, certain consonants have the same dot combinations as 
> numbers (there are seven of them), and since grades 1 and 2 differ in how 
> they handle vowels and final consonant forms, the number+letter rules are 
> defined in both grades. This is complicated by the fact that certain final 
> consonant forms have the same dot combinations as punctuation, which makes 
> back translation a bit difficult (what HIMS did with Braille Sense was to use 
> eight-dot combinations to denote punctuation for back translation purposes).
> Lastly, in 2006, the Ministry of Culture, Tourism and Sports in South Korea 
> promulgated a newer revision of the braille rules which attempts to use 
> English dot combinations for certain punctuation marks. This, and the fact 
> that many braille readers in Korea prefer the old table set, forced me to 
> redesign the Korean braille set (waiting for pull request/merge). In the new 
> table set design, ko-g1 and ko-g2 don't contain rules at all: they now act as 
> interface files, including just the needed files. Also, the basic design 
> behind the Korean braille set remains the same: prepare for extensions by 
> deciding which rules are global (ko.cti) and which are grade-specific.
> Hope this helps.
> Cheers,
> Joseph
>
> -----Original Message-----
> From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx 
> [mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Harri 
> Pasanen
> Sent: Tuesday, February 3, 2015 12:42 AM
> To: liblouis-liblouisxml@xxxxxxxxxxxxx
> Subject: [liblouis-liblouisxml] Table writing questions for back 
> translation, sign vs always
>
> Hi,
>
> I'm playing with back translation again and got some questions.
> This time looking at Korean grade 2.
>
> If I have
>
> sign $ 1246
>
> and later have
>
> always 가 1246
>
> it seems that the "sign" takes precedence over the "always".
>
> Is there a way to "undefine" signs, or what is the recommended course of 
> action?
>
> More generally, is there a way to rewrite a rule later in file?
>
> These questions arose because I was told that 1246 should back translate to 
> 가 in Korean grade 2, but it back translates to "ed" because en-us-g2 is 
> included.  I thought commenting out en-us-g2 would fix it, but then I got 
> back $, which is defined in the included chardefs.cti.
>
> Regards,
>
> Harri
>
>
>
> For a description of the software, to download it and links to project 
> pages go to http://www.abilitiessoft.com