[liblouis-liblouisxml] Re: ISO language code to table name mapping API

  • From: Michael Whapples <mwhapples@xxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Mon, 09 Nov 2009 23:06:44 +0000

Hello,
You're right that the last one (en-unified-g1-UEBC.utb) looks a bit odd. My first thought (possibly wrong, as I don't know UEBC fully) is that it duplicates information: doesn't the UEBC part already say it is unified braille for English? It also supports the argument that file names are not enough. As I understand it, the UEBC table would apply to more than one country (e.g. Australia and Canada), so it can apply to more than one locale. Putting the information in the table would let you represent more than one locale, either by creating multiple table_info structs or by letting one table_info struct hold more than one locale (the first might be easier). Also, if you create a table that is not meant to be used directly but serves as a common base for other tables, checking file names gives no way to hide it (unless you add that to the naming convention), whereas with the information in the table you would naturally hide it by simply leaving the info out.
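To make the multi-locale idea concrete, here is a rough sketch of the second option (one struct holding several locales). Everything in it is an assumption for illustration: the names locale_tag, MAX_LOCALES and table_matches are hypothetical and not part of liblouis, and the code fields are sized to hold a terminating NUL. A table that declares no locales is treated as a base table and never matches.

```c
#include <string.h>

#define MAX_LOCALES 8 /* hypothetical upper bound */

/* Hypothetical layout; none of these names exist in liblouis. */
struct locale_tag {
    char language[3]; /* two-letter ISO 639-1 code plus NUL, e.g. "en" */
    char region[3];   /* two-letter ISO 3166-1 code plus NUL, e.g. "AU" */
};

struct table_info {
    const char *file_name;
    struct locale_tag locales[MAX_LOCALES];
    int locale_count; /* 0 means: base table, never offered to the user */
};

/* Return 1 if the table declares the given locale, 0 otherwise. */
static int table_matches(const struct table_info *t,
                         const char *language, const char *region)
{
    for (int i = 0; i < t->locale_count; i++) {
        if (strcmp(t->locales[i].language, language) == 0 &&
            strcmp(t->locales[i].region, region) == 0)
            return 1;
    }
    return 0;
}
```

A UEBC table would then list both en/AU and en/CA, while a common base table would leave locale_count at 0 and so never turn up in a search.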

As for the number of dots issue, why not just add that to the info? Also, given some of my thoughts on automatic table selection, or on ordering tables with the most likely wanted one first, things like the code revision or revision date would be useful.
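As a sketch of how those extra fields might be used for automatic selection: the struct below is cut down to the fields under discussion, and the field names, the yyyymmdd date encoding and the function pick_newest are all assumptions of mine, not an agreed format.

```c
#include <stddef.h>

/* Hypothetical fields; only the ones discussed here are shown. */
struct table_info {
    const char *file_name;
    int dots;           /* 6 or 8 */
    long revision_date; /* assumed encoding: yyyymmdd, e.g. 20091109 */
};

/* Return the most recently revised table with the requested number of
 * dots, or NULL if no table has that many dots. */
static const struct table_info *pick_newest(const struct table_info *tables,
                                            size_t count, int dots)
{
    const struct table_info *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (tables[i].dots != dots)
            continue; /* wrong cell size, skip */
        if (best == NULL || tables[i].revision_date > best->revision_date)
            best = &tables[i];
    }
    return best;
}
```

Picking the newest revision is only one possible policy; the point is that the choice can be made from the info alone, without guessing from the file name.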

Michael Whapples
On 09/11/09 21:03, James Teh wrote:
On 9/11/2009 10:41 PM, Christian Egli wrote:
First we will have to agree on the format of a header line in
tables containing the necessary information.
The question is what constitutes the necessary information. If we agree
on the struct then the format of the header line can be deduced.
While I have no problem with the header line, a question I have raised in the past is why we couldn't include all of the necessary information in the filename. This would certainly make scanning much faster. See below for an example. Note that we would still need to include the table's descriptive name in the header line; e.g. "Unified English Braille Code grade 1".

   struct table_info {
     char iso_language[2]; /* ISO 639-1 */
     char iso_region[3];   /* ISO 3166  */
     short grade;
I don't think grade is enough here. Computer braille is also a form of "grade" which doesn't fit into the normal idea of grades. Also, in English, computer braille is sometimes called grade 0, whereas this is not the case in some other languages; e.g. if I recall correctly, grade 0 in German is not computer braille. Having said that, we could just always give computer braille a grade value of -1. Another thing to consider is that I think the concept of computer, uncontracted and contracted braille is often more useful than grades.

Given this structure, we can have filenames as follows:
iso_language-iso_region-grade[-extra].extension
where extension is: ttb for computer (text) braille, utb for uncontracted braille and ctb for contracted braille. extra is for differentiating two similar tables; e.g. braille codes from different years.
Examples:
* en-US-comp6.ttb
* en-US-comp8.ttb
* en-US-g1.utb
* en-US-g2.ctb
* en-unified-g1-UEBC.utb (Note the use of unified in place of the ISO region)
This last one looks a bit obscure, but it's a start...
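For illustration, splitting a name that follows this convention could look roughly like the sketch below. The struct parsed_name and the function parse_table_name are hypothetical, the buffer sizes are arbitrary, and the sketch assumes that extra is a single component with no hyphen in it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result type; each field is a NUL-terminated string. */
struct parsed_name {
    char language[16];
    char region[16];
    char grade[16];
    char extra[16];    /* empty string when absent */
    char extension[8];
};

/* Split "language-region-grade[-extra].extension" into its parts.
 * Return 1 on success, 0 if the name does not fit the pattern. */
static int parse_table_name(const char *name, struct parsed_name *out)
{
    char stem[64];
    size_t stem_len;
    int fields;
    int consumed = 0;
    const char *dot;

    memset(out, 0, sizeof *out);
    dot = strrchr(name, '.');
    if (dot == NULL || dot[1] == '\0' ||
        strlen(dot + 1) >= sizeof out->extension)
        return 0;
    strcpy(out->extension, dot + 1);

    stem_len = (size_t)(dot - name);
    if (stem_len >= sizeof stem)
        return 0;
    memcpy(stem, name, stem_len);
    stem[stem_len] = '\0';

    /* %15[^-] reads up to 15 characters that are not a hyphen;
     * %n records how much of the stem has been consumed so far. */
    fields = sscanf(stem, "%15[^-]-%15[^-]-%15[^-]%n-%15[^-]%n",
                    out->language, out->region, out->grade, &consumed,
                    out->extra, &consumed);

    /* Need at least three parts, and nothing may be left over. */
    return fields >= 3 && stem[consumed] == '\0';
}
```

With this, en-US-g2.ctb yields language en, region US, grade g2 and an empty extra, while en-unified-g1-UEBC.utb yields region unified and extra UEBC.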

The question remains how we deal with 6- and 8-dot tables, with computer
braille and with UEBC.
See examples above.

Jamie


