[liblouis-liblouisxml] Contractions skipped inside emphasis element

  • From: "John Brugge" <johnbrugge@xxxxxxxxxxxx>
  • To: "liblouis-liblouisxml@xxxxxxxxxxxxx" <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Tue, 26 Jul 2011 08:11:02 -0700

I'm seeing inconsistent behavior when translating XML documents: some 
contractions are applied, and others are not, inside "em" (emphasis) elements. I 
noticed this when comparing the output of conversions using our current 
installation of liblouisxml, v2.1.0, against the latest version, 2.4.0. We are 
eventually headed to liblouisutdml, so I compared its output as well.

I've been able to duplicate the issue with a small test case. The input is some 
short text, like the title of a book: <em>For the Win</em>.

My semantic action file for this test has one line:
Italicx em

My configuration file is simple as well (attached).

The test is a command line conversion:
 echo "<em>For the Win</em>" | xml2brl -f test.cfg

Here is what I get with various versions of liblouisxml.
With v2.1.0, I get no output on stdout with this command, but in our full 
conversion process, I get this:
.,= .! .,W9

With v2.2.0 (we aren't looking to move to this, but I had it installed and so 
tried it):
.,FOR .THE .,W9

With v2.4.0:
.,= .THE .,W9

With liblouisutdml v1.9.0:
,=! ,W9

So I am wondering if there are configuration options that I haven't tried that 
would give more consistent results, or that might trigger the contraction of 
"the" in the v2.4.0 version.

I'm also curious whether the missing emphasis indicators in the liblouisutdml 
output can be restored through configuration.

I should also say that translating "For the Win" with lou_translate, in all 
versions, gives a consistent result, ",=! ,w9". Similarly, outside of the "em" 
elements, "the" does get contracted correctly in XML content.
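
To make the differences easier to see at a glance, the outputs quoted above can 
be tabulated with a short script. This is only a rough sketch in Python and is 
not part of liblouis or liblouisxml. It assumes the output is North American 
ASCII braille, where "=" is the "for" contraction, "!" is "the", "9" is "in", 
and "." is the italic indicator. The names OUTPUTS, summarize and report are 
made up for this illustration.

```python
# Braille outputs copied verbatim from this message, keyed by tool/version.
OUTPUTS = {
    "liblouisxml 2.1.0": ".,= .! .,W9",
    "liblouisxml 2.2.0": ".,FOR .THE .,W9",
    "liblouisxml 2.4.0": ".,= .THE .,W9",
    "liblouisutdml 1.9.0": ",=! ,W9",
    "lou_translate (all versions)": ",=! ,w9",
}


def summarize(braille):
    """Report which contractions and indicators appear in one output line.

    Assumption: North American ASCII braille, so a plain substring check
    for "=", "!", "9" and "." is enough for this particular test phrase.
    """
    return {
        "for_contracted": "=" in braille,
        "the_contracted": "!" in braille,
        "in_contracted": "9" in braille,
        "italic_marked": "." in braille,
    }


def report(outputs=OUTPUTS):
    """Map each version label to its summary dictionary."""
    return {version: summarize(line) for version, line in outputs.items()}


if __name__ == "__main__":
    for version, flags in report().items():
        print(version, flags)
```

Run as a script, it prints one line per version, which shows that only v2.2.0 
leaves "for" uncontracted, that v2.2.0 and v2.4.0 both leave "the" 
uncontracted, and that the liblouisutdml and lou_translate outputs carry no 
italic indicator.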

Thanks for any hints,
John Brugge

