[liblouis-liblouisxml] Liblouis table header

  • From: Bert Frees <bertfrees@xxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Mon, 20 Oct 2014 16:49:34 +0200

Hi all,

I want to bring this subject up again because we've been discussing it so many
times and I think it's about time we finally do something. To recap, we need to
develop a header format that can contain metadata about the table, and an API
for extracting metadata and querying tables.

Greg's proposal with the single-line comment on the first line was a good
start. But I'd like to have something a bit more flexible/extensible. I have
worked out something and I'd like to have your opinions and suggestions.

For my DAISY Pipeline 2 work I have been nurturing the idea of selecting
translators based on some kind of "translator query". The use case is quite the
same as we have here for liblouis. The syntax I am proposing for DAISY Pipeline
is inspired by CSS media queries. A query is basically a list of key-value pairs
or keywords. I'm not proposing to use exactly the same syntax for liblouis, but
I believe we need something similar/mappable.

Let's take Greg's example:

    #afr#1#Afrikaans Uncontracted#za#Afrikaans ongekontrakteerde

It consists of 5 metadata fields. 3 of them can be used for automatic table
selection. The other two are for pretty printing in graphical user
interfaces. Combining the two locale fields (language and country) into a single
tag, the corresponding CSS query would look like this:

    (locale:af-ZA) (grade:1)
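
As a rough illustration of that mapping, the sketch below turns a single-line header of the kind shown above into the query string. The field order (language, grade, English name, country, native name) is inferred from the example, and the function name and the tiny language-code table are hypothetical.

```python
# Illustrative only: a real implementation would need a complete
# ISO 639-2 to ISO 639-1 mapping, not just this one entry.
_ISO639_2_TO_1 = {"afr": "af"}


def legacy_header_to_query(line):
    """Convert '#lang#grade#name_en#country#name' into a query string.

    Assumes exactly five '#'-separated fields in that order.
    """
    fields = line.strip().lstrip("#").split("#")
    language, grade, _name_en, country, _name = fields
    # Fall back to the original code if no two-letter form is known.
    lang = _ISO639_2_TO_1.get(language, language)
    return "(locale:%s-%s) (grade:%s)" % (lang, country.upper(), grade)


header = "#afr#1#Afrikaans Uncontracted#za#Afrikaans ongekontrakteerde"
print(legacy_header_to_query(header))
```

The two display names are ignored here because they are not meant for table selection.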

The same key-value pairs could be put in liblouis table headers. I like the idea
of having the metadata in special comments, in order to ensure backwards
compatibility of new tables with old library versions, and so that
implementations of the liblouis table format can choose whether or not they
support metadata.

A possible syntax could be `#+<KEY>: <value>`

    #+LOCALE: af-ZA
    #+GRADE: 1
    #+PRETTY_NAME: Afrikaans ongekontrakteerde
    #+PRETTY_NAME_EN: Afrikaans Uncontracted

(This is org-mode's syntax by the way.) The keywords would be
case-insensitive. Some keywords will be standard, such as LOCALE, GRADE and
PRETTY_NAME, but I wouldn't restrict the allowed keywords in any way, in order
to keep the system flexible.
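
To make the proposed syntax concrete, here is a minimal sketch of how such header lines could be read. It is not part of liblouis; the function name, the regular expression and the rule that the header ends at the first non-comment line are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for the proposed "#+<KEY>: <value>" comments.
_HEADER_RE = re.compile(r"^#\+([A-Za-z0-9_-]+):\s*(.*?)\s*$")


def parse_table_header(text):
    """Collect metadata from the leading comment lines of a table."""
    metadata = {}
    for line in text.splitlines():
        match = _HEADER_RE.match(line)
        if match:
            key, value = match.groups()
            # Keywords are case-insensitive, so normalise to upper case.
            metadata[key.upper()] = value
        elif line.strip() and not line.startswith("#"):
            # Assumption: the first real table rule ends the header.
            break
    return metadata


table = """\
#+LOCALE: af-ZA
#+grade: 1
#+PRETTY_NAME: Afrikaans ongekontrakteerde
# an ordinary comment
include latinLetterDef6Dots.uti
"""
print(parse_table_header(table))
```

Because unknown keywords are simply stored, nothing in this sketch restricts which keys a table author may use.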

The CSS query syntax also allows single keywords, without a value. For example:

    (ueb) (grade:2)

This could be reflected in a `#+TAGS` field with a list of space-separated
"tags":
    
    #+LOCALE: en-US
    #+GRADE: 2
    #+TAGS: ueb
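
The correspondence between the query syntax and the header fields could be sketched roughly as follows. The parser splits a query into key-value pairs and bare keywords, and a helper reads the tags from already-parsed header metadata. All names are hypothetical, and treating tags as case-insensitive is an assumption.

```python
import re

# One parenthesised term: "(key:value)" or a bare "(keyword)".
_TERM_RE = re.compile(r"\(\s*([^():\s]+)\s*(?::\s*([^()]*?)\s*)?\)")


def parse_query(query):
    """Return (pairs, tags) for a query such as '(ueb) (grade:2)'."""
    pairs, tags = {}, set()
    for match in _TERM_RE.finditer(query):
        key, value = match.group(1), match.group(2)
        if value is None:
            tags.add(key.lower())       # bare keyword, e.g. (ueb)
        else:
            pairs[key.upper()] = value  # key-value pair, e.g. (grade:2)
    return pairs, tags


def header_tags(metadata):
    """Split a space-separated TAGS value into a set of tags."""
    return {tag.lower() for tag in metadata.get("TAGS", "").split()}


pairs, tags = parse_query("(ueb) (grade:2)")
metadata = {"LOCALE": "en-US", "GRADE": "2", "TAGS": "ueb"}
print(pairs, tags, tags <= header_tags(metadata))
```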

I haven't really thought about the API yet. It would be nice if one could
provide a list of key-value pairs and/or single keywords, together with a table
path, and get a table name back. The keys could possibly be sorted by
importance, and the API could possibly return some kind of "matching quotient"
that indicates whether the table is a good match for the query or not.
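
One possible shape for such a "matching quotient" is sketched below, purely as a starting point for discussion. The query is an ordered list with the most important term first, and the 1/(rank+1) weighting, the exact-match comparison and the function names are all assumptions, not a proposal for the final API.

```python
def match_quotient(query, metadata):
    """Score how well table metadata matches a query, from 0.0 to 1.0.

    query: list of (key, value) pairs, or (keyword, None) for bare
    keywords, ordered from most to least important.
    metadata: dict of upper-case header keys to values.
    """
    tags = {tag.lower() for tag in metadata.get("TAGS", "").split()}
    total = matched = 0.0
    for rank, (key, value) in enumerate(query):
        weight = 1.0 / (rank + 1)  # earlier terms count more
        total += weight
        if value is None:
            hit = key.lower() in tags
        else:
            hit = metadata.get(key.upper(), "").lower() == value.lower()
        if hit:
            matched += weight
    return matched / total if total else 0.0


def best_table(query, tables):
    """Return the name of the best-scoring table, or None if empty."""
    if not tables:
        return None
    return max(tables, key=lambda name: match_quotient(query, tables[name]))


tables = {
    "table-a": {"LOCALE": "en-US", "GRADE": "2", "TAGS": "ueb"},
    "table-b": {"LOCALE": "en-US", "GRADE": "1"},
}
query = [("locale", "en-US"), ("grade", "2"), ("ueb", None)]
print(best_table(query, tables))
```

Here `tables` stands in for whatever the library would collect by scanning the table path; the table names are placeholders.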


Thoughts?
Bert
