[liblouis-liblouisxml] Bug in Czech table (text_nabcc.dis obsolete)?

  • From: Boris Dusek <dusek@xxxxxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Sun, 16 Jan 2011 21:33:06 +0100

Hello,

I wrote a program that dumps the (unicode, dot) pairs and (dot, unicode) pairs
for a given table.  Then I tried it on the Czech table.  However, it returns the
wrong result for the letter \x00ED (i with acute): it returns dots 1468, but the
correct value is dots 34.  I looked at the table and found these relevant
declarations in Cz-Cz-g1.utb (in the order they are encountered during
processing):

1. Cz-Cz-g1.utb:4: include text_nabcc.dis
2. text_nabcc.dis:325: display \X00ED 1468
3. Cz-Cz-g1.utb:125: uplow \x00CD\x00ED 34

It seems that i with acute gets wrongly defined by the "display" opcode in
step 2 and is then not corrected by the "uplow" opcode in step 3.  I tried
commenting out the include of text_nabcc.dis, and with that change everything
seems to work: characters like i with acute are now displayed correctly.
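
To make the suspected behaviour concrete, here is a toy model of it.  This is
not liblouis code: ToyTable, define and the "first definition wins" rule are
all assumptions made up for illustration, and whether liblouis really behaves
this way is exactly what I am asking.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: NOT liblouis code. It assumes, purely for illustration,
// that the first definition of a character wins and later ones are ignored.
struct ToyTable {
    std::map<char32_t, std::string> charToDots;

    // Returns true if the definition was stored, false if an earlier one
    // already existed (std::map::emplace does not overwrite).
    bool define(char32_t ch, const std::string &dots) {
        return charToDots.emplace(ch, dots).second;
    }
};

// Replays steps 2 and 3 for U+00ED only and returns the dots that end up
// stored under the first-wins assumption.
std::string replayCzechTable() {
    ToyTable t;
    bool stored = t.define(0x00ED, "1468");  // step 2: display \X00ED 1468
    assert(stored);                          // first definition is accepted
    stored = t.define(0x00ED, "34");         // step 3: uplow \x00CD\x00ED 34
    assert(!stored);                         // ignored in this toy model
    return t.charToDots.at(0x00ED);
}
```

Under that assumption the stored value stays at the one from step 2, which
matches what my dump shows.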

Now I could be doing something wrong, since I don't understand liblouis
internals.  I am traversing the table in this way (you can ignore the "found"
variable; the line with ftor means that `cd' is processed, e.g. output to a
file):

 const TranslationTableOffset *tbl = (dir==char2dot)
     ? &m_table->charToDots[0]
     : &m_table->dotsToChar[0];
 for (int i = 0; (i < HASHNUM) && !found; ++i) {
     TranslationTableOffset bucket = tbl[i % HASHNUM];
     while (bucket && !found) {
         const CharOrDots & cd =
             reinterpret_cast<CharOrDots &>(m_table->ruleArea[bucket]);
         found = ftor(cd);
         bucket = cd.next;
     }
 }
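
For anyone who wants to check the traversal logic without liblouis at hand,
the same walk can be sketched on a self-contained toy structure.  Everything
here (ToyOffset, ToyEntry, ToyTable, kToyHashNum, walk) is a hypothetical
stand-in and not the liblouis definitions; it only mirrors the shape of the
loop above: an array of bucket heads, entries chained by a `next' offset, and
0 meaning end of chain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins, NOT the liblouis definitions.
using ToyOffset = std::uint32_t;        // 0 means "end of chain"
constexpr std::size_t kToyHashNum = 4;  // toy value, not liblouis' HASHNUM

struct ToyEntry {
    ToyOffset next;       // offset of the next entry in the same bucket
    std::uint32_t key;    // character or dot pattern being looked up
    std::uint32_t value;  // what it maps to (stored as a plain number here)
};

struct ToyTable {
    ToyOffset buckets[kToyHashNum] = {};            // chain heads, 0 if empty
    std::vector<ToyEntry> area{ToyEntry{0, 0, 0}};  // slot 0 reserved as "null"

    // Prepend a new entry to the chain of its bucket.
    void insert(std::uint32_t key, std::uint32_t value) {
        const std::size_t b = key % kToyHashNum;
        area.push_back(ToyEntry{buckets[b], key, value});
        buckets[b] = static_cast<ToyOffset>(area.size() - 1);
    }
};

// Visit every entry; stop early when the visitor returns true.
// Returns the number of entries visited.
template <typename Visitor>
std::size_t walk(const ToyTable &t, Visitor visit) {
    std::size_t visited = 0;
    for (std::size_t i = 0; i < kToyHashNum; ++i) {
        for (ToyOffset off = t.buckets[i]; off != 0; off = t.area[off].next) {
            assert(off < t.area.size());  // offsets must stay inside the area
            ++visited;
            if (visit(t.area[off])) return visited;
        }
    }
    return visited;
}

// Small demo: insert three entries and count them with a visitor that
// never asks to stop.
std::size_t demoCountAll() {
    ToyTable t;
    t.insert(0x00ED, 34);
    t.insert(0x00CD, 34);
    t.insert(0x0041, 1);
    return walk(t, [](const ToyEntry &) { return false; });
}
```

If the real structures are laid out like this, the loop itself should visit
every entry exactly once, so the wrong value would come from what is stored
rather than from how it is walked.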

So either I am doing something wrong, or there is a bug in the Czech
translation table.  I also noticed that text_nabcc.dis is marked with the
comment "# This file is obsolete. Do not use!".

Could someone please explain what is going on?

Thanks,
Boris
