Where is the documentation on how the test tables work, and are there examples of how to write them?

Ken

From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx [mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Joseph Lee
Sent: Thursday, June 26, 2014 10:05 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Unified English Braille table set: current state of UEB tables in Liblouis

Hi folks (mostly table and code maintainers and UEB readers),

I'd like to present the current state of the liblouis table set for Unified English Braille (UEB), along with some concerns and suggestions for solving them.

Currently, we have three sets of UEB tables: the old (current) UEBC table set written by Tom Johnston, who passed away; the newer UEB table set included in the master branch; and the rulebook-based rewrite of the contracted UEB table, developed in the Bitbucket repo for liblouis. It was proposed that we switch to the newer table set in the master branch, with some sections coming from the rulebook-based table and with table tests added.

As of 2014, the master UEB set (based on the United States English braille code) implements the majority of the literary UEB standard, including some of the Unicode symbols. The Bitbucket table set also implements the majority of the rules, with the table content reorganized to follow the rulebook. Each set includes rules that are missing from the other: certain contraction rules are missing from the Bitbucket set, and other rules are missing from the master set (it will take a while to list exactly which rules are missing from which table). However, what concerns me more is that the character definitions are out of date, which may explain the back-translation issues Ken reported a few days ago.
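To make the chardefs concern concrete, here is a rough sketch of the kind of change I have in mind. The opcodes (math, sign) are existing liblouis table opcodes, but the dot patterns below are illustrative only, not authoritative UEB assignments, and the file name is hypothetical:

```
# Hypothetical excerpt from a chardefs file (dots illustrative only).
# Old, computer-braille-derived definition of less-than as a math sign:
math < 126
# Possible UEB-style replacement: an ordinary two-cell symbol
# (dot 4 prefix followed by dots 126):
sign < 4-126
```

Definitions like the first line are what can throw off back-translation of ordinary literary text containing these characters.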
For example, in the master table set, certain symbols (less-than, plus, greater-than, and so on) are defined as mathematical characters, which can cause problems when back-translating strings that contain them. Without remedying this, we may see more back-translation errors, which would defeat the purpose of our UEB implementation.

Another concern is the persistent notion that UEB requires computer braille. That isn't the case: UEB has no dedicated computer braille code. This is another reason to go through the en-ueb-chardefs file and remove any references to computer braille symbols, which will take at least a month (especially with testing involved, as Ken and I do with our respective projects: Braille Plus 18 for Ken, NVDA for me).

Based on these findings, combined with the fact that UEB is mostly a literary standard, I'm beginning to worry that our implementation of UEB might not be stable, or worse, incomplete. (I might add that UEB can never be completely implemented by any automated braille translation program, because some rules require human intervention.) Consider also that one of the largest markets, the United States, adopts UEB in 2016, and you'll see the magnitude of the problem.

But we should not dwell on problems alone: there are possible solutions, via both table and code changes, that could give us a substantially complete UEB implementation. Here are the major issues to be solved:

* Unified liblouis braille table set for UEB: by far, this is the critical hindrance to continued UEB implementation.
Based on the work we've done, I propose adopting the Bitbucket table set, for ease of future extension and debugging, after examining which rules can be ported over from the current master set (the master set contains some rules missing from the Bitbucket set, namely handling of contracted lower signs that form part of a word, such as "in").

* Capital passage indicator: the current method for determining emphasis passages might be reusable here.

* Grade 1 braille embedded in grade 2: UEB defines signs for embedding grade 1 braille within grade 2 text. This is used more by transcribers than by automated tools, but in case a document calls for it (via XML or other markup), we should be prepared to handle it.

* Exceptions to contraction rules: some words cannot be contracted, for various reasons ("dayan" is a good example). The master table set currently implements this best. To solve it generally, a dictionary of such words should be defined.

* Rewriting major portions of the chardefs: this is needed to prevent further back-translation problems and to ensure the UEB table set uses the correct dots for punctuation, freeing the tables from reliance on computer-braille-derived symbols once and for all.

* Organizing the tables according to major rules or sections: this would help when debugging the tables via checktable and would ease future extension (in case UEB changes).

* Testing by users and organizations: collaborating with users and organizations willing to test our UEB implementation and give us feedback would go a long way toward a solid implementation. There are at least four routes for testing: firmware for the Braille Plus 18, and third-party snapshots of Orca, NVDA, and BrailleBlaster.
However, testing should not be limited to users and organizations: we also need feedback from transcribers and from the people actually drafting the UEB standard (the International Council on English Braille, or ICEB, and its member organizations).

* Test data: a few days ago, Mesar mentioned that some tables need more test data. UEB is no exception. If people are willing to learn how the test files work, we can come up with common and not-so-common test cases that stress the UEB implementation to its limits (or beyond), so we can show that our UEB table sets are stable.

Thanks.

Cheers,
Joseph
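P.S. For the exceptions-to-contraction-rules item above, a minimal sketch of what such an exception dictionary could look like in table syntax. The nocont opcode is an existing liblouis opcode that suppresses contraction; the file name and word list are purely illustrative:

```
# Hypothetical exception dictionary, e.g. en-ueb-exceptions.uti.
# nocont tells the translator to leave the given text uncontracted.
nocont dayan
# further exception words would be listed here, one per line
```

A table like this could then be pulled into the main contracted table with an include rule, keeping the exception list easy to review and extend.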