[liblouis-liblouisxml] Re: [liblouisutdml] push by bertfrees - Add test for John Brugge's issue... on 2014-02-25 22:35 GMT

  • From: "John Gardner" <john.gardner@xxxxxxxxxxxx>
  • To: <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Fri, 28 Feb 2014 07:49:22 -0800

I am no XML expert, but I had thought that XML formatters ignored leading
and trailing white space and that there is an algorithm for inserting spaces
between sections of text separated by span, emphasis, etc. tags.  If so, then
braille should do the same thing imho.
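
For illustration only, here is a minimal Python sketch of that kind of
whitespace handling: flatten a block, collapse each run of whitespace to a
single space, and trim the ends. It is not liblouisutdml code. The function
name block_text and the use of Python's xml.etree.ElementTree are assumptions
made for the example; the two fragments are the ones quoted further down in
this thread.

```python
import re
import xml.etree.ElementTree as ET


def block_text(block_xml: str) -> str:
    """Flatten one block element and normalize its whitespace.

    Hypothetical helper: inline tags (em, span, ...) are ignored and
    only the text content of the block is kept.
    """
    root = ET.fromstring(block_xml)
    # itertext() yields every text node in document order, including the
    # whitespace-only nodes that sit between inline elements.
    raw = "".join(root.itertext())
    # Collapse runs of spaces/newlines to one space, then trim the block.
    return re.sub(r"\s+", " ", raw).strip()


first = (
    "<p>\n"
    "<em>The quick brown fox</em> <span>jumped</span>. "
    "<em>What a surprise.</em>\n"
    "</p>"
)
second = (
    "<p>\n"
    "<em>The quick brown fox</em> <span>jumped</span> "
    "<em>over the lazy\ndog.</em>\n"
    "</p>"
)

print(block_text(first))
print(block_text(second))
```

Under this rule both fragments come out as expected: no space before the
period in the first, and one space between "jumped" and "over" in the second.
It relies on normalizing whitespace per block rather than per text node.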

John G

-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of John J.
Boyer
Sent: Friday, February 28, 2014 7:34 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: [liblouisutdml] push by bertfrees - Add
test for John Brugge's issue... on 2014-02-25 22:35 GMT

Text nodes with only whitespace are ignored. This was a decision made to
avoid having to deal with whitespace introduced by pretty-printing, which
indents different levels in a document by different amounts. It is the
usual way of handling XML.
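
To make the trade-off concrete, here is a simplified Python model (not the
actual liblouisutdml code) of what dropping whitespace-only text nodes does
to the second example from this thread. The helper names are hypothetical,
and xml.etree.ElementTree stands in for the real parser.

```python
import xml.etree.ElementTree as ET


def text_nodes(block_xml):
    """Return the text nodes of a block in document order (hypothetical helper)."""
    return list(ET.fromstring(block_xml).itertext())


def drop_whitespace_only(nodes):
    """Discard every node that consists solely of whitespace."""
    return [node for node in nodes if node.strip()]


second = (
    "<p>\n"
    "<em>The quick brown fox</em> <span>jumped</span> "
    "<em>over the lazy\ndog.</em>\n"
    "</p>"
)

nodes = text_nodes(second)
kept = drop_whitespace_only(nodes)

# The pretty-printing newlines and the single spaces between inline
# elements are all whitespace-only nodes, so this filter removes both kinds.
print([repr(node) for node in nodes])
print("".join(kept))
```

In this model the newline nodes that come from indentation and the single
spaces between the inline elements are filtered out together, which is why
"jumped" and "over" end up run together. Note that this naive model also loses
the space between "fox" and "jumped", which according to this thread
liblouisutdml does keep, so the real logic is evidently more involved.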

Please stop arguing and come up with an algorithm that will handle these
examples and not produce side effects.

John

On Fri, Feb 28, 2014 at 11:12:30AM +0100, Bert Frees wrote:
> Come on John, let's be serious. There is absolutely nothing wrong with
> these examples if you ask me.
> 
> The space between the closing span tag and the opening em tag is
> significant, so liblouisutdml should clearly not drop it. The space between
> "fox" and "jumped" isn't dropped, so it can't be impossible to handle the
> situation correctly between "jumped" and "over", right?
> 
> Also, it's a formatting thing IMO, not a translation thing, so one
> shouldn't rely on liblouis tables to trim whitespace.
> 
> If this would concern me at all I would have a look at it, but as I said in
> the Pipeline I use liblouisutdml for formatting flattened blocks only.
> 
> 
> Bert
> 
> 
> 
> 2014-02-28 9:47 GMT+01:00 John J. Boyer <john.boyer@xxxxxxxxxxxxxxxxx>:
> 
> > In the second example there should be a space before over. The example is
> > syntactically correct, but that does not guarantee that it is semantically
> > correct. The situation is similar to a program which
> > compiles without errors but does not produce the right results.
> >
> > liblouisutdml is not an editor. It has limited facilities for compensating
> > for improper text in documents. We have been discussing similar situations
> > in the work on BrailleBlaster. It does have editing
> > facilities for print and will have them for braille also.
> >
> > The way to deal with situations like this in Bookshare is to warn the
> > proofreaders about them. If you can come up with a way in which
> > liblouisutdml could handle these two examples I will be glad to hear
> > it.
> >
> > John
> >
> > On Thu, Feb 27, 2014 at 08:47:06PM +0000, John Brugge wrote:
> > > I guess my point is not about the source documents; the problem I see is
> > > in cases where the source is perfectly legal and appropriate. Let me see
> > > if I can clarify what the situation seems like.
> > >
> > > I'll use again two simple fragments that I had before. The first is a
> > > paragraph with emphasized text in it, and a span that has a period
> > > immediately following it:
> > >
> > > <p>
> > > <em>The quick brown fox</em> <span>jumped</span>. <em>What a surprise.</em>
> > > </p>
> > >
> > > The second is a paragraph with a span that has a space after it, since it
> > > is in the middle of a sentence.
> > >
> > > <p>
> > > <em>The quick brown fox</em> <span>jumped</span> <em>over the lazy
> > > dog.</em>
> > > </p>
> > >
> > > Here are the options that I have tried for converting this:
> > > 1. Have a simple "no span" in a semantic action file. The result for the
> > > first example is correct output, with no space between "jumped" and the
> > > period. The result for the second example is incorrect output, where
> > > "jumped" and "over" are run together. (This is the situation that appears
> > > to me to be the bug).
> > >
> > > 2. Have a "no span" with a third column of "\*\s". The result for the
> > > first example is incorrect this time, as there is now a space between
> > > "jumped" and the period. The result for the second example is correct this
> > > time, with a space between "jumped" and "over".
> > >
> > > 3. Same as above and add "compress.cti" to my list of liblouis tables.
> > > There is no difference in the results.
> > >
> > > Does that make any more sense, or am I still missing what the workable
> > > solution is, so that both examples will convert appropriately and I don't
> > > have to choose between the lesser of bad translations? Perhaps we have
> > > some other odd interaction with semantic action rules in our config that
> > > I'm not aware of, but I don't think that we're doing anything funky, at
> > > least not on purpose. I know we haven't touched any of the liblouis table
> > > definitions.
> > >
> > > Thanks,
> > > John
> > >
> >

-- 
John J. Boyer; President, Chief Software Developer
Abilitiessoft, Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities

For a description of the software, to download it and links to
project pages go to http://www.abilitiessoft.com
