Only text nodes with nothing but whitespace are ignored. For one thing,
this avoids a lot of meaningless <brl> tags in UTDML.

John B

On Fri, Feb 28, 2014 at 07:49:22AM -0800, John Gardner wrote:
> I am no XML expert, but I had thought that XML formatters ignored leading
> and trailing white space and that there is an algorithm for inserting spaces
> between sections of text separated by span, emphasis, etc tags. If so, then
> braille should do the same thing imho.
>
> John G
>
> -----Original Message-----
> From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx
> [mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of John J.
> Boyer
> Sent: Friday, February 28, 2014 7:34 AM
> To: liblouis-liblouisxml@xxxxxxxxxxxxx
> Subject: [liblouis-liblouisxml] Re: [liblouisutdml] push by bertfrees - Add
> test for John Brugge's issue... on 2014-02-25 22:35 GMT
>
> Text nodes with only whitespace are ignored. This was a decision made to
> avoid having to deal with whitespace introduced by pretty-printing,
> which indents different levels in a document by different amounts. It is
> the usual way of handling XML.
>
> Please stop arguing and come up with an algorithm that will handle these
> examples and not produce side effects.
>
> John
>
> On Fri, Feb 28, 2014 at 11:12:30AM +0100, Bert Frees wrote:
> > Come on John, let's be serious. There is absolutely nothing wrong with
> > these examples if you ask me.
> >
> > The space between the closing span tag and the opening em tag is
> > significant, so liblouisutdml should clearly not drop it. The space between
> > "fox" and "jumped" isn't dropped, so it can't be impossible to handle the
> > situation correctly between "jumped" and "over", right?
> >
> > Also, it's a formatting thing IMO, not a translation thing, so one
> > shouldn't rely on liblouis tables to trim whitespace.
> >
> > If this would concern me at all I would have a look at it, but as I said in
> > the Pipeline I use liblouisutdml for formatting flattened blocks only.
> >
> >
> > Bert
> >
> >
> >
> > 2014-02-28 9:47 GMT+01:00 John J. Boyer <john.boyer@xxxxxxxxxxxxxxxxx>:
> >
> > > In the second example there should be a space before "over". The example is
> > > syntactically correct, but that does not guarantee that it is semantically
> > > correct. The situation is similar to a program which
> > > compiles without errors but does not produce the right results.
> > >
> > > liblouisutdml is not an editor. It has limited facilities for compensating
> > > for improper text in documents. We have been discussing similar situations
> > > in the work on BrailleBlaster. It does have editing
> > > facilities for print and will have them for braille also.
> > >
> > > The way to deal with situations like this in Bookshare is to warn the
> > > proofreaders about them. If you can come up with a way in which
> > > liblouisutdml could handle these two examples I will be glad to hear
> > > it.
> > >
> > > John
> > >
> > > On Thu, Feb 27, 2014 at 08:47:06PM +0000, John Brugge wrote:
> > > > I guess my point is not about the source documents; the problem I see is
> > > > in cases where the source is perfectly legal and appropriate. Let me see
> > > > if I can clarify what the situation seems like.
> > > >
> > > > I'll use again two simple fragments that I had before. The first is a
> > > > paragraph with emphasized text in it, and a span that has a period
> > > > immediately following it:
> > > >
> > > > <p>
> > > > <em>The quick brown fox</em> <span>jumped</span>. <em>What a surprise.</em>
> > > > </p>
> > > >
> > > > The second is a paragraph with a span that has a space after it, since it
> > > > is in the middle of a sentence.
> > > >
> > > > <p>
> > > > <em>The quick brown fox</em> <span>jumped</span> <em>over the lazy dog.</em>
> > > > </p>
> > > >
> > > > Here are the options that I have tried for converting this:
> > > > 1. Have a simple "no span" in a semantic action file. The result for the
> > > > first example is correct output, with no space between "jumped" and the
> > > > period. The result for the second example is incorrect output, where
> > > > "jumped" and "over" are run together. (This is the situation that appears
> > > > to me to be the bug).
> > > >
> > > > 2. Have a "no span" with a third column of "\*\s". The result for the
> > > > first example is incorrect this time, as there is now a space between
> > > > "jumped" and the period. The result for the second example is correct this
> > > > time, with a space between "jumped" and "over".
> > > >
> > > > 3. Same as above and add "compress.cti" to my list of liblouis tables.
> > > > There is no difference in the results.
> > > >
> > > > Does that make any more sense, or am I still missing what the workable
> > > > solution is, so that both examples will convert appropriately and I don't
> > > > have to choose between the lesser of bad translations? Perhaps we have
> > > > some other odd interaction with semantic action rules in our config that
> > > > I'm not aware of, but I don't think that we're doing anything funky, at
> > > > least not on purpose. I know we haven't touched any of the liblouis table
> > > > definitions.
> > > >
> > > > Thanks,
> > > > John
> > > >
> > > >
> --
> John J. Boyer; President, Chief Software Developer
> Abilitiessoft, Inc.
> http://www.abilitiessoft.com
> Madison, Wisconsin USA
> Developing software for people with disabilities
>
> For a description of the software, to download it and links to
> project pages go to http://www.abilitiessoft.com
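
The disagreement in this thread comes down to two whitespace rules, and a short script can make the difference concrete. The Python sketch below is not liblouisutdml code and makes no claim about how the library is implemented; the function names (text_nodes, drop_whitespace_only, collapse_whitespace) are hypothetical. It applies both rules to John Brugge's two fragments. The first rule skips every text node that contains only whitespace. The second keeps every text node, collapses each run of whitespace to a single space, and trims the ends of the block. Note that in this toy model the first rule also loses the space between "fox" and "jumped", which the thread says liblouisutdml preserves, so it illustrates the rule itself rather than reproducing the library's output.

```python
import re
import xml.etree.ElementTree as ET

# The two fragments from John Brugge's message.
FIRST = (
    "<p>\n"
    "<em>The quick brown fox</em> <span>jumped</span>. <em>What a surprise.</em>\n"
    "</p>"
)
SECOND = (
    "<p>\n"
    "<em>The quick brown fox</em> <span>jumped</span> <em>over the lazy dog.</em>\n"
    "</p>"
)


def text_nodes(elem):
    """Yield every text node under elem in document order."""
    if elem.text is not None:
        yield elem.text
    for child in elem:
        yield from text_nodes(child)
        # ElementTree stores the text that follows a child in child.tail.
        if child.tail is not None:
            yield child.tail


def drop_whitespace_only(block):
    """Rule 1 (simplified model): skip nodes that are only whitespace."""
    return "".join(t for t in text_nodes(block) if t.strip())


def collapse_whitespace(block):
    """Rule 2 (assumed alternative): keep all nodes, collapse runs, trim the block."""
    joined = "".join(text_nodes(block))
    return re.sub(r"\s+", " ", joined).strip()


if __name__ == "__main__":
    for source in (FIRST, SECOND):
        block = ET.fromstring(source)
        print("drop    :", repr(drop_whitespace_only(block)))
        print("collapse:", repr(collapse_whitespace(block)))
```

Under these assumptions the collapsing rule yields the expected text for both fragments, and trimming at the block edges discards the newlines that pretty-printing adds just inside the <p> tags. The sketch treats each paragraph as a single flattened block; indentation between block-level elements is outside what it models.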