[liblouis-liblouisxml] Re: [liblouisutdml] push by bertfrees - Add test for John Brugge's issue... on 2014-02-25 22:35 GMT

  • From: "John Brugge" <johnbrugge@xxxxxxxxxxxx>
  • To: "liblouis-liblouisxml@xxxxxxxxxxxxx" <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Thu, 27 Feb 2014 20:47:06 +0000

My point is not about poorly formed source documents; the problem I see
occurs even when the source is perfectly legal and appropriate. Let me see
if I can clarify the situation as I understand it.

I'll reuse the two simple fragments that I posted before. The first is a
paragraph with emphasized text in it, and a span that has a period
immediately following it:

<p>
<em>The quick brown fox</em> <span>jumped</span>. <em>What a surprise.</em>
</p>

The second is a paragraph with a span that has a space after it, since it
is in the middle of a sentence.

<p>
<em>The quick brown fox</em> <span>jumped</span> <em>over the lazy
dog.</em>
</p>

Here are the options that I have tried for converting this:
1. Have a simple "no span" in a semantic action file. The result for the
first example is correct output, with no space between "jumped" and the
period. The result for the second example is incorrect output, where
"jumped" and "over" are run together. (This is the situation that appears
to me to be the bug.)

2. Have a "no span" with a third column of "\*\s". The result for the
first example is incorrect this time, as there is now a space between
"jumped" and the period. The result for the second example is correct this
time, with a space between "jumped" and "over".

3. Same as above and add "compress.cti" to my list of liblouis tables.
There is no difference in the results.
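
To pin down what "correct" means for the two fragments, here is a minimal
Python sketch. It is not liblouisutdml code and makes no claim about how
the semantic-action rules work internally; it only uses the standard
library's xml.etree.ElementTree to show the plain text one would expect if
<span> were a true no-op and whatever follows it were left alone. The name
flatten_inline is made up for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# The two fragments from this message, verbatim.
FIRST_FRAGMENT = (
    "<p>\n"
    "<em>The quick brown fox</em> <span>jumped</span>. <em>What a surprise.</em>\n"
    "</p>"
)
SECOND_FRAGMENT = (
    "<p>\n"
    "<em>The quick brown fox</em> <span>jumped</span> <em>over the lazy\n"
    "dog.</em>\n"
    "</p>"
)


def flatten_inline(xml_fragment: str) -> str:
    """Return the text of a fragment with all inline markup dropped.

    Hypothetical helper: text inside and after each element is kept in
    document order, then runs of whitespace are collapsed to one space.
    """
    root = ET.fromstring(xml_fragment)
    text = "".join(root.itertext())  # includes the text after each child
    return re.sub(r"\s+", " ", text).strip()


print(flatten_inline(FIRST_FRAGMENT))
print(flatten_inline(SECOND_FRAGMENT))
```

With this reading, "jumped" is followed directly by the period in the first
fragment and by a single space in the second, which is the pair of results
that neither option 1 nor option 2 produces together.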

Does that make any more sense, or am I still missing the workable
solution that lets both examples convert appropriately, so that I don't
have to choose the lesser of two bad translations? Perhaps we have
some other odd interaction with semantic action rules in our config that
I'm not aware of, but I don't think that we're doing anything funky, at
least not on purpose. I know we haven't touched any of the liblouis table
definitions.
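
One possible workaround, along the lines of the preprocessing idea in the
quoted reply below, would be to unwrap the <span> elements in the source
before it reaches liblouisutdml, so that no semantic-action rule for span
is involved at all. The Python sketch below is only an assumption about
how that could look; it has not been tested against liblouisutdml, the
helper names unwrap and _append_before are hypothetical, and it assumes
the root element is not itself a <span>.

```python
import xml.etree.ElementTree as ET


def _append_before(parent, index, text):
    """Attach text just before position `index` among parent's children."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + text


def unwrap(root, tag):
    """Remove every <tag> element below root, keeping its content in place.

    The element's own text, its children and the text that follows it
    (its tail) are all preserved, so whitespace and punctuation after the
    closing tag are left exactly as they were in the source.
    """
    for parent in list(root.iter()):
        if parent.tag == tag:
            continue  # handled when visiting the element that contains it
        index = 0
        while index < len(parent):
            child = parent[index]
            if child.tag != tag:
                index += 1
                continue
            _append_before(parent, index, child.text)
            inner = list(child)
            parent.remove(child)
            for offset, node in enumerate(inner):
                parent.insert(index + offset, node)
            _append_before(parent, index + len(inner), child.tail)
            # index is not advanced: hoisted children are examined next
    return root


def unwrap_to_string(xml_fragment, tag="span"):
    """Parse a fragment, unwrap `tag`, and serialize it again."""
    root = unwrap(ET.fromstring(xml_fragment), tag)
    return ET.tostring(root, encoding="unicode")


print(unwrap_to_string("<p><em>fox</em> <span>jumped</span>. <em>Hm.</em></p>"))
print(unwrap_to_string("<p><em>fox</em> <span>jumped</span> <em>over</em></p>"))
```

Because the text after the closing tag is carried over untouched, the
period stays attached to "jumped" in the first case and the space survives
in the second. Whether the resulting markup then translates as intended
would still need to be checked with the actual tables and configuration.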

Thanks,
John

On 2/27/14 2:09 PM, "John J. Boyer" <john.boyer@xxxxxxxxxxxxxxxxx> wrote:

>Unfortunately, source documents do not always follow good practices.
>There should be spaces where spaces are needed. I have seen many
>Bookshare documents where there are spaces before periods. The only way
>to fix this would be to edit the documents or perhaps preprocess them.
>The best thing to do is to choose the lesser of two evils. Having words
>run together is worse than an occasional space before a period, unless
>that space will occur a great many times.
>
>I would be glad to hear of a way to handle such inconsistencies in
>source documents automatically.
>
>John
> 
>On Thu, Feb 27, 2014 at 06:44:42PM +0000, John Brugge wrote:
>> John,
>> 
>> So if this is not a bug, then how do I get the correct behavior when I
>> do _not_ want a space after the span? The example I gave was one where
>> there was punctuation immediately following the end of a span tag, such
>> as a period ending a sentence:
>> 
>> <p>
>> The quick brown fox <span>jumped</span>. What a surprise.
>> </p>
>> 
>> In essence, I want to preserve the original following content that was
>> there, be it whitespace or punctuation or characters. It seems to me
>> that it is a bug because if I have "no span" in my semantic action, it
>> will go further than "do nothing", and will remove the whitespace that
>> follows, if there is some. I don't want that to happen, and the third
>> column solution will force it to happen.
>> 
>> Am I reading this wrong, or is there some other way to get the desired
>> effect?
>> Thanks,
>> John
>> 
>> On 2/27/14 11:56 AM, "John J. Boyer" <john.boyer@xxxxxxxxxxxxxxxxx> wrote:
>> 
>> >The issue with the span tag is not a bug. The solution is to use the
>> >third column of an entry in the semantic-action file and a liblouis
>> >table that compresses multiple spaces into one. This was explained in
>> >my replies to the original message. It has been in the code for years.
>> >
>> >John
>> >
>> >On Thu, Feb 27, 2014 at 05:58:55PM +0100, Bert Frees wrote:
>> >> Hi John,
>> >> 
>> >> I added this test in the hopes someone else would find it useful for
>> >> debugging. I don't think I will try to debug it myself (sorry :/)
>> >> because it's not an issue for me in the Pipeline. My approach is to
>> >> translate all the inline stuff beforehand and I only use
>> >> liblouisutdml to format the (pre-translated) blocks. All the <span>s
>> >> and <em>s are flattened out, like so:
>> >> 
>> >> <p>
>> >>     .,! .FOX JUMP$ OV] .'! LAZY DOG.-
>> >>  </p>
>> >> 
>> >> AFAIK, there was no consensus on a bug tracking system so we kept on
>> >> using the one on Google Code:
>> >> https://code.google.com/p/liblouisutdml/issues/list
>> >> 
>> >> Cheers,
>> >> Bert
>> >> 
>> >> 
>> >> 
>> >> 2014-02-27 17:01 GMT+01:00 John Brugge <johnbrugge@xxxxxxxxxxxx>:
>> >> 
>> >> > Thanks for creating a test to reproduce this, Bert.
>> >> >
>> >> > I haven't noticed whether there was any conclusion in the group on
>> >> > whether a formal bug-tracking system would be put in place, so
>> >> > should I just watch commits to track when this might be resolved?
>> >> >
>> >> > Thanks much,
>> >> > John
>> >> >
>> >> >
>> >
>> >-- 
>> >John J. Boyer; President, Chief Software Developer
>> >Abilitiessoft, Inc.
>> >http://www.abilitiessoft.com
>> >Madison, Wisconsin USA
>> >Developing software for people with disabilities
>> >
>> >For a description of the software, to download it and links to
>> >project pages go to http://www.abilitiessoft.com
>> 
>
>-- 
>John J. Boyer; President, Chief Software Developer
>Abilitiessoft, Inc.
>http://www.abilitiessoft.com
>Madison, Wisconsin USA
>Developing software for people with disabilities
>
