[liblouis-liblouisxml] Re: Liblouis table header

  • From: Bert Frees <bertfrees@xxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Mon, 20 Oct 2014 20:51:40 +0200

Hammer Attila writes:

> Bert, this is a good idea in my opinion.
> Currently, for example, the Orca screen reader marks some liblouis table
> names for translation on the Orca side, but many others are not.
> If Orca in future requests the table list from the louis Python3 binding
> and the table list function returns the localized table names, Joanie
> will be able to fill the contraction table combo box with translated
> table names.
> If I understand your examples correctly, the localized table name is of
> course only presented when the matching system locale is in use.
> For example, if I use the Hungarian locale and, as in your example, the
> Afrikaans table's pretty-name is "Afrikaans ongekontrakteerde", will it be
> possible to translate the Afrikaans table name for the Hungarian locale as
> "afrikai", or will simply the English table name be presented when I
> select a table from the contraction table combo box in the Orca
> preferences dialog?
> Would it be possible to extend the pretty_name tag to a
> pretty-name[locale] variant?
> The header could then carry translations in the following style:
> pretty-name[af]="Afrikaans ongekontrakteerde"
> pretty-name[hu]="afrikai (irodalmi)"
> These are examples only.

`pretty-name` could be treated as a special metadata field with a special API
call associated, named something like "get_localized_pretty_name(char* table,
char* locale)". For mapping locales to strings in the table header, I had
proposed to use keywords of the form "#+pretty-name-hu", but your idea
"#+pretty-name[hu]" would work equally well and looks a bit better.

I've come to realize now that we're trying to cover two completely different use
cases here, namely "table discovery" vs. "table name localization". Although
they are both related to metadata, I wonder if it's such a good idea to mix the
two.
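
To make the localized lookup concrete, here is a minimal Python sketch. Only
the function name comes from the suggestion above; the "#+pretty-name[hu]:"
header syntax, the fallback order and the way the header text is passed in are
assumptions for illustration, not an existing liblouis API.

```python
import re

# Hypothetical header syntax: "#+pretty-name[hu]: afrikai (irodalmi)".
# Nothing here is an existing liblouis API; it is a sketch only.
_FIELD = re.compile(
    r"^#\+pretty-name(?:\[([A-Za-z_-]+)\])?:\s*(.*?)\s*$", re.IGNORECASE
)


def _normalize(locale):
    """Lower-case a locale and use "_" as the separator."""
    return locale.lower().replace("-", "_")


def get_localized_pretty_name(table_text, locale):
    """Return the pretty name for `locale`, falling back to the language
    part of the locale and then to the unlocalized pretty-name."""
    names = {}
    for line in table_text.splitlines():
        m = _FIELD.match(line)
        if m:
            names[_normalize(m.group(1) or "")] = m.group(2)
    wanted = _normalize(locale)
    for candidate in (wanted, wanted.split("_")[0], ""):
        if candidate in names:
            return names[candidate]
    return None


header = """\
#+pretty-name: Afrikaans Uncontracted
#+pretty-name[af]: Afrikaans ongekontrakteerde
#+pretty-name[hu]: afrikai (irodalmi)
"""
print(get_localized_pretty_name(header, "hu_HU"))  # afrikai (irodalmi)
```

With this fallback order, a locale that has no translation in the header
(e.g. "de") would simply get the unlocalized name.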

Michael Whapples writes:

> Hello,
> Much of that sounds quite good.
>
> I have some questions.
> 1. What if one is doing a partial search of a value (eg. If asking for 
> (locale:en) which I might take to mean return any English table 
> regardless of country). The reverse may also be desired, where a less 
> specific match would be acceptable (eg. (locale:fr_FR) but if that 
> specific one cannot be matched then (locale:fr) will also be checked). 
> Locales come to mind because this comes up in other things (eg. Java 
> applications choosing locale resource bundles), but other criteria may 
> need this partial matching on values.

Good point. I would say we treat `locale` as a special keyword and implement
some kind of fallback mechanism. E.g. when the query is "(locale:fr_FR)", tables
with locale "fr_FR" will get the most points, then "fr_FR_*" (a variant,
e.g. "fr_FR_1694acad"), then "fr", and then possibly "fr_*"
(e.g. "fr_CA"). Applications can still check the actual locale of a matching
table and decide to not consider it a match after all.
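
The ranking described above could be sketched roughly like this in Python. The
function name and the point values are arbitrary assumptions; only the order
(exact match, variant, language only, other country) comes from the text.

```python
def locale_score(query, table_locale):
    """Score how well `table_locale` matches `query`.

    Higher is better, 0 means no match. The point values are arbitrary
    assumptions; only their order matters.
    """
    q = query.replace("-", "_").lower().split("_")
    t = table_locale.replace("-", "_").lower().split("_")
    if t == q:
        return 4  # exact match, e.g. fr_FR
    if t[:len(q)] == q:
        return 3  # variant of the queried locale, e.g. fr_FR_*
    if t == q[:1]:
        return 2  # language only, e.g. fr
    if t[0] == q[0]:
        return 1  # same language, other country, e.g. fr_*
    return 0


locales = ["fr_CA", "fr", "de", "fr_FR_1694acad", "fr_FR"]
ranked = sorted(
    (loc for loc in locales if locale_score("fr_FR", loc) > 0),
    key=lambda loc: -locale_score("fr_FR", loc),
)
print(ranked)  # ['fr_FR', 'fr_FR_1694acad', 'fr', 'fr_CA']
```

Note that with a language-only query such as "(locale:fr)", every "fr_*" table
counts as a variant and scores 3, which covers the "any English table" case.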

An alternative is for applications to make several query calls and implement the
fallback mechanism themselves. (To illustrate, in CSS, several media queries
can be combined in a comma-separated list. If one or more of the queries match,
the whole list matches, otherwise not.)
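
On the application side, such a fallback could look roughly like the Python
sketch below: the queries are tried from most to least specific and the first
one that matches anything wins. The table names, the metadata dictionaries and
both function names are made up for illustration.

```python
def matches(query, metadata):
    """True if every key-value pair of `query` is present in `metadata`."""
    return all(metadata.get(key) == value for key, value in query.items())


def find_with_fallback(queries, tables):
    """Try each query in order and return the names of the tables that
    match the first query with at least one hit."""
    for query in queries:
        found = sorted(
            name for name, meta in tables.items() if matches(query, meta)
        )
        if found:
            return found
    return []


# Made-up table names and metadata, for illustration only.
tables = {
    "fr-generic": {"locale": "fr", "grade": "1"},
    "fr-ca": {"locale": "fr_CA", "grade": "1"},
}
queries = [
    {"locale": "fr_FR", "grade": "1"},  # preferred
    {"locale": "fr", "grade": "1"},     # fallback
]
print(find_with_fallback(queries, tables))  # ['fr-generic']
```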

We could also allow applications to override the "matching function" for a
particular keyword, although that's pretty advanced usage already and I would
rather keep it as simple as possible.

Yet another approach is to allow multiple locales in a table header. For
example, a translation table for Spanish braille could possibly also be used for
Catalan. There's no way an automatic fallback mechanism would cover this case.

> 2. Where you mention doing matches with single keywords, why not just be 
> like XPath's attribute matching, just check for a key regardless of its 
> value instead of a separate tags field?

The single keywords were meant for things that can't really have a value (apart
maybe from values that evaluate to true or false). For this reason I wanted to
also treat them differently from key-value pairs. I wanted to avoid things like
"(locale)" matching all tables that have a locale value. (In CSS media queries,
features, as they are called there, without a value actually kind of behave like
this. E.g. "(color)" means "(min-color: 1)". But this only really makes sense if
the required value type is an integer.)

But I also see where you're coming from. It might simplify things if we could
eliminate the special-purpose keyword "#+tags".

What if we make "(some-tag)" match tables with the field "#+some-tag" but not
tables with the field "#+some-tag: some-value"?
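
As a rough Python sketch of that rule: a bare "(some-tag)" term only matches a
header field without a value, while "(key:value)" requires an equal value. The
regular expressions and function names are assumptions; the "#+key: value"
header form and the parenthesized query form are the ones discussed in this
thread.

```python
import re

# "#+key" or "#+key: value" at the start of a line (keys are lower-cased).
_HEADER = re.compile(r"^#\+([\w-]+)(?::\s*(.*?))?\s*$")
# "(key)" or "(key:value)" terms inside a query string.
_TERM = re.compile(r"\(\s*([\w-]+)\s*(?::\s*([^)]*?)\s*)?\)")


def parse_header(text):
    """Map each header key to its value, or to None for a bare tag."""
    fields = {}
    for line in text.splitlines():
        m = _HEADER.match(line)
        if m:
            fields[m.group(1).lower()] = m.group(2)
    return fields


def parse_query(query):
    """Turn "(ueb) (grade:2)" into [("ueb", None), ("grade", "2")]."""
    return [(m.group(1).lower(), m.group(2)) for m in _TERM.finditer(query)]


def table_matches(query, header_text):
    """A bare term matches only a bare header field; a key-value term
    matches only a field with exactly that value."""
    fields = parse_header(header_text)
    return all(
        key in fields and fields[key] == value
        for key, value in parse_query(query)
    )


header = "#+locale: en-US\n#+grade: 2\n#+ueb\n"
print(table_matches("(ueb) (grade:2)", header))  # True
print(table_matches("(locale)", header))         # False
```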

> 3. I imagine the API would have a query for tables function (IE. I give 
> it a set of key, value pairs and it gives me a table which matches). 
> There may be queries which could give multiple tables (eg. (grade:2)) so 
> the function may return multiple tables. I think this would be better 
> than just a single table (eg. the first matching table) as then the 
> application could present the options to the user.

OK, makes perfect sense.

> 4. Thinking back to question 1, I think the indicator value may be 
> useful, even with the query for tables function (IE. one could list the 
> tables in order of best match). Also a function to check a named table 
> against criteria may also be useful (IE. if the application wants to 
> take more control over table handling, eg. caching query results).

OK.
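
Points 3 and 4 together could be sketched in Python as follows: one function
that checks a single table against the criteria and returns an indicator
value, and one that returns all matching tables, best match first. The names,
the table data and the "fraction of satisfied pairs" scoring are assumptions
chosen to keep the example short.

```python
def match_quotient(query, metadata):
    """Fraction of the query's key-value pairs that the table satisfies
    (1.0 is a full match, 0.0 is no match at all)."""
    if not query:
        return 1.0
    hits = sum(1 for key, value in query.items() if metadata.get(key) == value)
    return hits / len(query)


def find_tables(query, tables):
    """Return the names of all tables with a non-zero quotient, best first.
    Ties are broken alphabetically to keep the order stable."""
    scored = [(match_quotient(query, meta), name) for name, meta in tables.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for score, name in scored if score > 0]


# Made-up table names and metadata, for illustration only.
tables = {
    "table-a": {"locale": "en_US", "grade": "2"},
    "table-b": {"locale": "en_US", "grade": "1"},
    "table-c": {"locale": "fr", "grade": "1"},
}
query = {"locale": "en_US", "grade": "2"}
print(find_tables(query, tables))                # ['table-a', 'table-b']
print(match_quotient(query, tables["table-b"]))  # 0.5
```

An application that wants to cache results or apply its own threshold can call
the per-table check directly instead of the list function.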

> On 20/10/2014 15:49, Bert Frees wrote:
>> Hi all,
>>
>> I want to bring this subject up again because we've been discussing it so many
>> times and I think it's about time we finally do something. To recap, we need to
>> develop a header format that can contain metadata about the table, and an API
>> for extracting metadata and querying tables.
>>
>> Greg's proposal with the single-line comment on the first line was a good
>> start. But I'd like to have something a bit more flexible/extensible. I have
>> worked out something and I'd like to have your opinions and suggestions.
>>
>> For my DAISY Pipeline 2 work I have been nurturing the idea of selecting
>> translators based on some kind of "translator query". The use case is quite the
>> same as we have here for liblouis. The syntax I am proposing for DAISY Pipeline
>> is inspired by CSS media queries. A query is basically a list of key-value pairs
>> or keywords. I'm not proposing to use exactly the same syntax for liblouis, but
>> I believe we need something similar/mappable.
>>
>> Let's take Greg's example:
>>
>>      #afr#1#Afrikaans Uncontracted#za#Afrikaans ongekontrakteerde
>>
>> It consists of 5 metadata fields. 3 of them can be used for automatic table
>> selection. The other two are for pretty printing in graphical user
>> interfaces. Combining the two locale fields (language and country) into a single
>> tag, the corresponding CSS query would look like this:
>>
>>      (locale:af-ZA) (grade:1)
>>
>> The same key-value pairs could be put in liblouis table headers. I like the idea
>> of having the metadata in special comments, in order to assure backwards
>> compatibility of new tables with old library versions, and so that
>> implementations of the liblouis table format can choose whether or not they
>> support metadata.
>>
>> A possible syntax could be `#+<KEY>: <value>`
>>
>>      #+LOCALE: af-ZA
>>      #+GRADE: 1
>>      #+PRETTY_NAME: Afrikaans ongekontrakteerde
>>      #+PRETTY_NAME_EN: Afrikaans Uncontracted
>>
>> (This is org-mode's syntax by the way.) The keywords would be
>> case-insensitive. Some keywords will be standard, such as LOCALE, GRADE and
>> PRETTY_NAME, but I wouldn't restrict the allowed keywords in any way, in order
>> to keep the system flexible.
>>
>> The CSS query syntax also allows single keywords, without a value. For example:
>>
>>      (ueb) (grade:2)
>>
>> This could be reflected in a `#+TAGS` field with a list of space-separated
>> "tags":
>>      
>>      #+LOCALE: en-US
>>      #+GRADE: 2
>>      #+TAGS: ueb
>>
>> I haven't really thought about the API yet. It would be nice if one could
>> provide a list of key-value pairs and/or single keywords, together with a table
>> path, and get a table name back. The keys could possibly be sorted by
>> importance, and the API could possibly return some kind of "matching quotient"
>> that indicates whether the table is a good match for the query or not.
>>
>>
>> Thoughts?
>> Bert