I am no XML expert, but I had thought that XML formatters ignored leading
and trailing white space and that there is an algorithm for inserting
spaces between sections of text separated by span, emphasis, etc. tags. If
so, then braille should do the same thing, imho.

John G

-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx [mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of John J. Boyer
Sent: Friday, February 28, 2014 7:34 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: [liblouisutdml] push by bertfrees - Add test for John Brugge's issue... on 2014-02-25 22:35 GMT

Text nodes with only whitespace are ignored. This was a decision made to
avoid having to deal with whitespace introduced by pretty-printing, which
indents different levels in a document by different amounts. It is the
usual way of handling XML. Please stop arguing and come up with an
algorithm that will handle these examples and not produce side effects.

John

On Fri, Feb 28, 2014 at 11:12:30AM +0100, Bert Frees wrote:
> Come on John, let's be serious. There is absolutely nothing wrong with
> these examples if you ask me.
>
> The space between the closing span tag and the opening em tag is
> significant, so liblouisutdml should clearly not drop it. The space between
> "fox" and "jumped" isn't dropped, so it can't be impossible to handle the
> situation correctly between "jumped" and "over", right?
>
> Also, it's a formatting thing IMO, not a translation thing, so one
> shouldn't rely on liblouis tables to trim whitespace.
>
> If this would concern me at all I would have a look at it, but as I said in
> the Pipeline I use liblouisutdml for formatting flattened blocks only.
>
>
> Bert
>
>
>
> 2014-02-28 9:47 GMT+01:00 John J. Boyer <john.boyer@xxxxxxxxxxxxxxxxx>:
>
> > In the second example there should be a space before "over". The example is
> > syntactically correct, but that does not guarantee that it is semantically
> > correct.
> > The situation is similar to a program which
> > compiles without errors but does not produce the right results.
> >
> > liblouisutdml is not an editor. It has limited facilities for compensating
> > for improper text in documents. We have been discussing similar situations
> > in the work on BrailleBlaster. It does have editing
> > facilities for print and will have them for braille also.
> >
> > The way to deal with situations like this in Bookshare is to warn the
> > proofreaders about them. If you can come up with a way in which
> > liblouisutdml could handle these two examples I will be glad to hear
> > it.
> >
> > John
> >
> > On Thu, Feb 27, 2014 at 08:47:06PM +0000, John Brugge wrote:
> > > I guess my point is not about the source documents; the problem I see is
> > > in cases where the source is perfectly legal and appropriate. Let me see
> > > if I can clarify what the situation seems like.
> > >
> > > I'll use again two simple fragments that I had before. The first is a
> > > paragraph with emphasized text in it, and a span that has a period
> > > immediately following it:
> > >
> > > <p>
> > > <em>The quick brown fox</em> <span>jumped</span>. <em>What a surprise.</em>
> > > </p>
> > >
> > > The second is a paragraph with a span that has a space after it, since it
> > > is in the middle of a sentence.
> > >
> > > <p>
> > > <em>The quick brown fox</em> <span>jumped</span> <em>over the lazy
> > > dog.</em>
> > > </p>
> > >
> > > Here are the options that I have tried for converting this:
> > > 1. Have a simple "no span" in a semantic action file. The result for the
> > > first example is correct output, with no space between "jumped" and the
> > > period. The result for the second example is incorrect output, where
> > > "jumped" and "over" are run together. (This is the situation that appears
> > > to me to be the bug).
> > >
> > > 2. Have a "no span" with a third column of "\*\s".
> > > The result for the
> > > first example is incorrect this time, as there is now a space between
> > > "jumped" and the period. The result for the second example is correct this
> > > time, with a space between "jumped" and "over".
> > >
> > > 3. Same as above and add "compress.cti" to my list of liblouis tables.
> > > There is no difference in the results.
> > >
> > > Does that make any more sense, or am I still missing what the workable
> > > solution is, so that both examples will convert appropriately and I don't
> > > have to choose between the lesser of bad translations? Perhaps we have
> > > some other odd interaction with semantic action rules in our config that
> > > I'm not aware of, but I don't think that we're doing anything funky, at
> > > least not on purpose. I know we haven't touched any of the liblouis table
> > > definitions.
> > >
> > > Thanks,
> > > John
> > >
> >

--
John J. Boyer; President, Chief Software Developer
Abilitiessoft, Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities

For a description of the software, to download it and links to project
pages go to http://www.abilitiessoft.com
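
For reference, here is a minimal sketch of one whitespace rule that handles both of John Brugge's fragments. It is written in Python with the standard xml.etree.ElementTree module, and it is not liblouisutdml code. The function names drop_blank_nodes and collapse_whitespace are made up for illustration. The sketch assumes the block has already been flattened so that it contains only inline elements such as em and span, and it ignores xml:space="preserve". The first function shows a naive "ignore every whitespace-only text node" rule (which may not match exactly what liblouisutdml does internally); the second keeps every text node, collapses each run of whitespace to a single space, and trims only the edges of the block.

```python
import re
import xml.etree.ElementTree as ET

# The two fragments from the thread, with pretty-printing newlines inside <p>.
FIRST = ("<p>\n<em>The quick brown fox</em> <span>jumped</span>. "
         "<em>What a surprise.</em>\n</p>")
SECOND = ("<p>\n<em>The quick brown fox</em> <span>jumped</span> "
          "<em>over the lazy dog.</em>\n</p>")


def drop_blank_nodes(block):
    """Naive rule (hypothetical): skip every text node that is only whitespace.

    This also throws away the single space between two inline elements,
    so adjacent words end up run together.
    """
    return "".join(t for t in block.itertext() if t.strip())


def collapse_whitespace(block):
    """Alternative rule (hypothetical): keep all text nodes of the block,
    collapse each whitespace run to one space, and trim the block edges.

    Indentation newlines at the start and end of the block disappear,
    while a space between inline elements survives as one space.
    """
    text = "".join(block.itertext())
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    for fragment in (FIRST, SECOND):
        block = ET.fromstring(fragment)
        print(repr(drop_blank_nodes(block)))
        print(repr(collapse_whitespace(block)))
```

With the collapsing rule there is no space before the period in the first fragment, because the source has none, and there is exactly one space between "jumped" and "over" in the second. Extending this to documents with nested block-level elements would need a list of which elements count as inline, which is a separate design decision.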