[liblouis-liblouisxml] Re: Unified English Braille table set: current state of UEB tables in Liblouis

  • From: Ken Perry <kperry@xxxxxxx>
  • To: "liblouis-liblouisxml@xxxxxxxxxxxxx" <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Thu, 26 Jun 2014 17:21:55 +0000

Where is the documentation on how the test tables work, and are there examples 
of how to write them?

Ken

From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx 
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Joseph Lee
Sent: Thursday, June 26, 2014 10:05 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Unified English Braille table set: current 
state of UEB tables in Liblouis

Hi folks, mostly table and code maintainers and UEB readers:
I'd like to present the current state of the Liblouis table set for Unified 
English Braille (UEB), along with some concerns and suggestions for solving 
these problems.
Currently, we have three sets of UEB tables: the old (current) UEBC table set 
written by Tom Johnston, who passed away; the newer UEB table set included in 
the master branch; and the rulebook-based rewrite of the contracted UEB table, 
developed in the Bitbucket repo for Liblouis. Of these, it was proposed that we 
switch to the newer table in the master branch, with some sections coming from 
the rulebook-based table and with table tests added.
As of 2014, the master UEB set (based on the United States English braille 
code) implements the majority of the literary UEB standard, with some Unicode 
symbols included. The Bitbucket table set also implements the majority of the 
rules, with the table content reorganized according to the rulebook. Each table 
set includes rules which are missing from the other: certain contraction rules 
are missing from the Bitbucket set, and other rules are missing from the master 
table set (it'll take a while to list which rules are missing from which 
table).
However, what I'm more concerned about is the fact that the character 
definitions are out of date, which may explain back translation issues reported 
by Ken a few days ago. For example, in the master table set, certain symbols 
are defined as mathematical characters (such as less-than, plus, greater-than 
sign, etc.), which may pose an issue for back translating strings with those 
characters included. Without remedying this issue, we may see more back 
translation errors, which would defeat the intention of our UEB implementation.
Another concern is the persistent notion that UEB requires computer braille. 
This isn't the case (UEB does not require a dedicated computer braille code). 
This is another reason to take a look at the en-ueb-chardefs file and remove 
any references to computer braille symbols, which will take at least a month to 
do (especially with testing involved, as Ken and I do with our respective 
projects: Braille Plus 18 for Ken, NVDA for me).
Thus, based on these findings, and given that UEB is mostly a literary 
standard, I'm beginning to worry that our implementation of UEB might not be 
stable, or at worst, incomplete (I might add that UEB can never be completely 
implemented by any computerized braille translation program, because there are 
some rules which require human intervention). Add to this the adoption of UEB 
in one of the largest markets, the United States, in 2016, and you'll see the 
magnitude of this problem.
But I believe we should not think about problems alone: there are possible 
solutions we could try, via both table and code modifications, that may allow a 
substantial implementation of UEB. Here are some major issues to be solved and 
implemented:

*         Unified Liblouis braille table set for Unified English Braille code: 
this is by far the most critical hindrance to continuing the UEB 
implementation. Based on the work we've done, and for ease of future extension 
and debugging, I propose adopting the Bitbucket table set after examining which 
rules can be ported from the current master set (the master set contains some 
rules which are missing in the Bitbucket set, namely working with contracted 
lower braille dots which are part of a word such as "in").

*         Capital passage indicator: perhaps reusing the current method for 
determining emphasis passages would be useful.

*         Grade 1 braille embedded in grade 2: there are UEB-specific signs 
which allow embedding grade 1 braille within grade 2. This is used more by 
transcribers than automated tools, but just in case a document asks for such a 
scenario (via XML or other markup), we should be prepared to handle such cases.

*         Exceptions to contraction rules: there are words which cannot be 
contracted for various reasons ("dayan" is a good example). So far, the master 
table set implements this well. In order to solve this, a dictionary of such 
words should be defined.

*         Rewriting major portions of chardefs: this is needed in order to 
prevent further back translation problems and to make sure the UEB table set 
uses correct dots for punctuation, thereby freeing the tables from reliance on 
computer braille derived symbols once and for all.

*         Organization of the tables according to major rules or sections: this 
might be handy if we're debugging the table via checktable or for ease of 
future extensions (in case UEB changes).

*         Testing by users and organizations: what may allow UEB to be 
implemented well in this project is collaboration with users and organizations 
willing to test our UEB implementation and give us feedback. There are at least 
four routes for testing: a firmware for Braille Plus 18, and third-party 
snapshots for Orca, NVDA and Braille Blaster. However, testing should not be 
limited to users and organizations: we need feedback from transcribers and from 
the people who are actually drafting the UEB standard (that is, the 
International Council on English Braille, or ICEB, and its member 
organizations).

*         Test data: a few days ago, Mesar mentioned that some tables need more 
test data. UEB is no exception, and if people are willing to learn how the test 
file works, we'd be able to come up with common and not so common test cases to 
stress the UEB implementation to its limits (or beyond its limits) so we can 
prove that our UEB table sets are stable.
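
To make the test data idea above a bit more concrete, here is a rough Python 
sketch of how a list of test cases could be checked in both directions. It is 
not the Liblouis test file format, and none of the names in it (Case, 
run_cases, round_trip_failures, toy_forward, toy_back) come from Liblouis; 
they are made up for illustration. The toy translator only changes letter 
case and stands in for a real forward/back translation call, so the output is 
not braille.

```python
from dataclasses import dataclass


@dataclass
class Case:
    text: str      # print input
    expected: str  # expected translator output for that input


def run_cases(forward, cases):
    """Return (text, expected, actual) for every case that does not match."""
    failures = []
    for case in cases:
        actual = forward(case.text)
        if actual != case.expected:
            failures.append((case.text, case.expected, actual))
    return failures


def round_trip_failures(forward, back, texts):
    """Return (text, restored) wherever back(forward(text)) differs from text."""
    failures = []
    for text in texts:
        restored = back(forward(text))
        if restored != text:
            failures.append((text, restored))
    return failures


# Hypothetical stand-ins for a real translator; they are NOT UEB.
def toy_forward(text):
    return text.upper()


def toy_back(text):
    return text.lower()


CASES = [Case("abc", "ABC"), Case("in", "IN")]

if __name__ == "__main__":
    for text, expected, actual in run_cases(toy_forward, CASES):
        print(f"FAIL {text!r}: expected {expected!r}, got {actual!r}")
    for text, restored in round_trip_failures(toy_forward, toy_back, ["Abc"]):
        print(f"ROUND TRIP {text!r} came back as {restored!r}")
```

The round-trip check is there because of the back translation concern: the toy 
translator discards capitalization going forward, so "Abc" cannot be restored, 
which is the kind of loss a test set should catch.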
Thanks.
Cheers,
Joseph
