[liblouis-liblouisxml] ISO language code to table name mapping API

  • From: Christian Egli <christian.egli@xxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Fri, 06 Nov 2009 14:52:52 +0100

Hi all
    
I would like to advance the integration of Liblouis in tools such as
OpenOffice and in XML tool chains such as the Daisy Pipeline. When
generating Braille from an XML or an OpenOffice document, I would
assume that a user generally doesn't have a concept of a translation
table. They know that their text is in a certain language, they probably
know that they would like to have the Braille in a certain grade, and
maybe they know a few other characteristics. Right now, however, the
user has to name a specific table when invoking xml2brl or any of the
liblouis translation services.

So, what I'd like to do is to add some abstraction where the user
specifies a language (ISO 639-1), optionally a country code (ISO 3166-1),
a grade and maybe some additional information such as 6- or 8-dot
Braille. From that information Liblouis would automatically pick the
right table.

A first step in that direction would be a mapping between ISO language
code, grade and table name. This mapping could have a public API so that
NVDA could use it. The next step would probably be an extension of the
current API where you could invoke the translation functions given an
ISO language code and a grade instead of a table name. And the third
step would be to add support in liblouisxml for the xml:lang attribute,
in the sense that it would switch to the right table based on the value
of xml:lang.
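
For the third step, the value of xml:lang would first have to be split
into the language and region parts that the mapping works with. Here is
a minimal sketch of that, assuming only the simple forms "ll" and
"ll-RR" need to be handled (real xml:lang values can be longer language
tags). The name split_xml_lang is a hypothetical helper for
illustration, not an existing liblouisxml function:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of liblouis or liblouisxml): split an
 * xml:lang value such as "de-CH" into a two-letter language code and an
 * optional two-letter region code. Both output buffers hold two letters
 * plus the terminating NUL. Returns 1 on success, 0 otherwise. */
int
split_xml_lang (const char *xml_lang, char language[3], char region[3])
{
  size_t len, i;

  language[0] = '\0';
  region[0] = '\0';
  if (xml_lang == NULL)
    return 0;

  /* Assumption: only "ll" and "ll-RR" are accepted. */
  len = strlen (xml_lang);
  if (len != 2 && !(len == 5 && xml_lang[2] == '-'))
    return 0;

  /* Every position except the separator must be a letter. */
  for (i = 0; i < len; i++)
    if (i != 2 && !isalpha ((unsigned char) xml_lang[i]))
      return 0;

  /* Normalise case: lower-case language, upper-case region. */
  language[0] = (char) tolower ((unsigned char) xml_lang[0]);
  language[1] = (char) tolower ((unsigned char) xml_lang[1]);
  language[2] = '\0';
  if (len == 5)
    {
      region[0] = (char) toupper ((unsigned char) xml_lang[3]);
      region[1] = (char) toupper ((unsigned char) xml_lang[4]);
      region[2] = '\0';
    }
  return 1;
}
```

The region is left as an empty string when xml:lang carries only a
language, so the result can be handed to a language/region/grade lookup
unchanged.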

But let's take this step by step. For the ISO language code, grade and
table name mapping I envision a simple API as follows:

  typedef struct {
    char iso_language[3]; /* ISO 639-1, two letters plus terminating NUL */
    char iso_region[3];   /* ISO 3166-1, empty string if unspecified */
    short grade;
    char * table_name;
  } table_info;

  char * lou_getTableName (char *iso_language, char *iso_region, short grade);

Possibly a function to query all the available languages would be needed
as well. The implementation would simply be along the following lines:

static const table_info table_list[] = {
  {"en", "US", 1, "en-us-g1.ctb"},
  {"de", "", 0, "de-de-g0.utb"},
  ...
};

char * lou_getTableName (char *iso_language, char *iso_region, short grade) 
{
  /* Find a table name for the given criteria; return NULL if none is found. */
}
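
To make the lookup a bit more concrete, here is a minimal sketch of how
the function body might look. Several things in it are my assumptions
rather than settled design: the const qualifiers, the language field
being sized for a terminating NUL so that strcmp works, and above all
the fallback rule (an entry without a region matches any requested
region, and a request without a region takes the first entry for that
language and grade). The list contains only the two example entries
from above:

```c
#include <stddef.h>
#include <string.h>

typedef struct {
  char iso_language[3];   /* ISO 639-1, NUL-terminated */
  char iso_region[3];     /* ISO 3166-1, "" if unspecified */
  short grade;
  const char *table_name;
} table_info;

/* Only the two example entries; a real list would be longer. */
static const table_info table_list[] = {
  {"en", "US", 1, "en-us-g1.ctb"},
  {"de", "", 0, "de-de-g0.utb"},
};

const char *
lou_getTableName (const char *iso_language, const char *iso_region,
                  short grade)
{
  size_t count = sizeof (table_list) / sizeof (table_list[0]);
  const char *fallback = NULL;
  size_t i;

  if (iso_language == NULL)
    return NULL;
  if (iso_region == NULL)
    iso_region = "";

  for (i = 0; i < count; i++)
    {
      const table_info *t = &table_list[i];

      /* Language and grade must always match exactly. */
      if (strcmp (t->iso_language, iso_language) != 0 || t->grade != grade)
        continue;
      /* An exact region match wins immediately. */
      if (strcmp (t->iso_region, iso_region) == 0)
        return t->table_name;
      /* Assumed fallback rule: remember the first entry that is
         region-neutral, or any entry if no region was requested. */
      if (fallback == NULL
          && (t->iso_region[0] == '\0' || iso_region[0] == '\0'))
        fallback = t->table_name;
    }
  return fallback;   /* NULL if nothing matched */
}
```

With this list, asking for ("de", "CH", 0) falls back to "de-de-g0.utb",
while ("en", "GB", 1) yields NULL because the only English entry is
tied to the US region. Whether that is the behaviour we want is exactly
the kind of thing I'd like feedback on.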

To begin with, we could simply hard-code the list of tables. As a bonus
(and probably fairly simple), a Perl script could generate the
table_list. That would mean that the tables have to contain the relevant
information in an easy-to-parse form.
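
To illustrate what "easy to parse" could mean, here is a rough sketch of
the generation step, written in C only to match the rest of this message
(a Perl script would do the same job). The metadata line format
"# locale: en-US grade: 1" is invented purely for illustration; the
tables do not currently contain anything like it, and
format_table_entry is a hypothetical name:

```c
#include <stdio.h>

/* Hypothetical: turn one metadata comment line from a table, e.g.
 *   "# locale: en-US grade: 1"
 * into one initializer line for table_list. The line format is an
 * assumption made for this sketch. Returns 1 on success, 0 if the line
 * is not a metadata line or the output buffer is too small. */
int
format_table_entry (const char *meta_line, const char *file_name,
                    char *out, size_t out_size)
{
  char language[3] = "";
  char region[3] = "";
  int grade = 0;
  int n;

  /* Try "ll-RR" first, then fall back to a bare "ll". */
  if (sscanf (meta_line, "# locale: %2[a-z]-%2[A-Z] grade: %d",
              language, region, &grade) != 3)
    {
      region[0] = '\0';
      if (sscanf (meta_line, "# locale: %2[a-z] grade: %d",
                  language, &grade) != 2)
        return 0;
    }

  n = snprintf (out, out_size, "  {\"%s\", \"%s\", %d, \"%s\"},",
                language, region, grade, file_name);
  return n > 0 && (size_t) n < out_size;
}
```

A generator would call this once per table file and write the resulting
lines between the braces of table_list.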

Before I go ahead and try to hack this stuff up, there are a few open
questions: How do we deal with 6- and 8-dot Braille? What do we do with
computer braille? How do we handle the UEBC tables?

I'd like to know whether this approach makes sense and whether the API
looks OK, and I'd welcome any other feedback.

Thanks
Christian
-- 
Christian Egli
Swiss Library for the Blind and Visually Impaired
Grubenstrasse 12, CH-8045 Zürich, Switzerland

For a description of the software and to download it go to
http://www.jjb-software.com
