[brailleblaster] Re: Unit test for index values

  • From: Keith Creasy <kcreasy@xxxxxxx>
  • To: "brailleblaster@xxxxxxxxxxxxx" <brailleblaster@xxxxxxxxxxxxx>
  • Date: Fri, 17 May 2013 13:29:28 +0000

Yes, I know it is supposed to, but something is still wrong. I see a lot of 
white space in certain documents that shouldn't be there to begin with. Here 
is an example:

<p>
        <strong>Page 8</strong>
        <strong>Sample Presentations</strong>
</p>


The problem here is that, according to the DTBook DTD, a paragraph can contain 
character data (#PCDATA), so the extra white space, obviously put there for 
visual formatting in a text editor, becomes part of the document content.
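
To make the effect concrete, here is a minimal sketch in Python using only the 
standard library's xml.etree.ElementTree (it is not BrailleBlaster or 
liblouisutdml code, and the helper name text_nodes is just for illustration). 
It parses the fragment above and lists every text node the parser hands to the 
application, including the indentation:

```python
import xml.etree.ElementTree as ET

# The fragment from the message, indented the way a text editor would save it.
SOURCE = (
    "<p>\n"
    "        <strong>Page 8</strong>\n"
    "        <strong>Sample Presentations</strong>\n"
    "</p>"
)


def text_nodes(elem):
    """Return every text node under elem in document order, whitespace included."""
    return list(elem.itertext())


nodes = text_nodes(ET.fromstring(SOURCE))
for node in nodes:
    # repr() makes the newlines and indentation visible.
    print(repr(node))
```

Three of the five text nodes are pure formatting white space, and a translator 
that receives them has no way to tell them apart from real content.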

It should be:

<p><strong>Page 8 Sample Presentations</strong></p>

You also can't arbitrarily throw out spaces. For example, here a space between 
elements is valid and relevant:

<p><strong>Page 8</strong> <strong>Sample Presentations</strong></p>

This would typically be done by a word processor like Word, where it marked up 
the text correctly but the space, since it really can't have a "strong" style, 
was left alone between the elements.


There are a lot of variations, and in order to make the conversion to braille 
smooth we may want to pre-process documents to provide some consistency. The 
DAISY Pipeline does have a DTBook fixer. I may try it to see how much it helps.
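
One possible shape for such a pre-processing step, as a rough sketch only: 
collapse each run of white space inside a paragraph to a single space, drop it 
entirely at the start and end of the paragraph, and keep one space where it 
separates inline elements. This is plain Python with xml.etree.ElementTree; 
the names normalize_block and _slots are made up for the example, it is not 
what the DAISY Pipeline fixer does, and it ignores xml:space="preserve" and 
does not merge adjacent <strong> elements.

```python
import re
import xml.etree.ElementTree as ET

_WS = re.compile(r"\s+")


def _slots(elem):
    """Yield (node, attribute) for each text slot inside elem, in document order."""
    yield elem, "text"
    for child in elem:
        yield from _slots(child)
        yield child, "tail"


def normalize_block(block):
    """Collapse white space inside one block element (hypothetical helper)."""
    slots = list(_slots(block))
    at_boundary = True  # True at block start or right after a kept space
    for node, attr in slots:
        value = getattr(node, attr)
        if not value:
            continue
        value = _WS.sub(" ", value)  # any run of white space becomes one space
        if at_boundary:
            value = value.lstrip(" ")  # no leading or doubled spaces
        setattr(node, attr, value)
        if value:
            at_boundary = value.endswith(" ")
    # Remove the single space that may remain at the very end of the block.
    for node, attr in reversed(slots):
        value = getattr(node, attr)
        if value:
            setattr(node, attr, value.rstrip(" "))
            break
    return block


indented = ET.fromstring(
    "<p>\n        <strong>Page 8</strong>\n"
    "        <strong>Sample Presentations</strong>\n</p>"
)
cleaned = ET.tostring(normalize_block(indented), encoding="unicode")
print(cleaned)
```

With the indented sample, the result keeps the two <strong> elements and the 
single space between them, which is the Word-style form shown earlier, and a 
paragraph already in that form passes through unchanged.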
 



Keith Creasy
Software Developer
American Printing House for the Blind
KCreasy@xxxxxxx
Phone: 502.895.2405
Skype: keith537


-----Original Message-----
From: brailleblaster-bounce@xxxxxxxxxxxxx 
[mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of John J. Boyer
Sent: Friday, May 17, 2013 9:10 AM
To: brailleblaster@xxxxxxxxxxxxx
Subject: [brailleblaster] Re: Unit test for index values

Preferences.cfg uses the table list compress.cti,en-us-g2.ctb. The 
compress.cti table gets rid of extraneous whitespace, but it might affect the 
index values.
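
As a rough illustration of why removing whitespace can shift index values, 
here is a small Python sketch. It is not how compress.cti or liblouis work 
internally; collapse_with_map is a made-up helper that collapses whitespace 
runs and records, for each character it keeps, the offset that character had 
in the original text.

```python
def collapse_with_map(text):
    """Collapse whitespace runs to one space; return (collapsed, positions).

    positions[i] is the offset in `text` of the character kept as collapsed[i].
    """
    out = []
    positions = []
    in_run = False
    for offset, ch in enumerate(text):
        if ch.isspace():
            if not in_run:
                # Keep one space for the whole run, at the run's first offset.
                out.append(" ")
                positions.append(offset)
                in_run = True
        else:
            out.append(ch)
            positions.append(offset)
            in_run = False
    return "".join(out), positions


original = "Page 8\n        Sample"
collapsed, positions = collapse_with_map(original)

where = collapsed.index("Sample")
print(where, positions[where])  # offset after collapsing, offset in the original
```

Any index computed against the collapsed text is off by the number of 
characters removed before it unless a map like this is carried along.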

John

On Fri, May 17, 2013 at 12:10:40PM +0000, Keith Creasy wrote:
> Hi John.
> 
> I wasn't suggesting you do it. I was hoping to get someone else to take an 
> interest.
> 
> I understand how it works but they are still coming out wrong. It seems to be 
> mostly related to white space. I've just about decided that we'll have to 
> pre-process the XML to get rid of extraneous white space before we translate 
> the text. This seems especially true of DTBook documents produced in Word or 
> the DAISY Pipeline rtf2dtbook conversion.
> 
> 
> -----Original Message-----
> From: brailleblaster-bounce@xxxxxxxxxxxxx 
> [mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of John J. Boyer
> Sent: Friday, May 17, 2013 7:27 AM
> To: brailleblaster@xxxxxxxxxxxxx
> Subject: [brailleblaster] Re: Unit test for index values
> 
> Unit testing would be a good idea, but I'm not familiar with it, and I want 
> to concentrate on the code of liblouis and liblouisutdml.
> The index values are provided by liblouis. If you have a paragraph with only 
> one block of text they are unaltered by liblouisutdml.
> 
> John
> 
> On Thu, May 16, 2013 at 03:26:17PM +0000, Keith Creasy wrote:
> >    Everyone.
> > 
> > 
> > 
> >    It would be really great if someone could try and put together a unit
> >    test for the index attribute values. Basically just a test document like
> >    the one John B. sent out. The test could run the file through
> >    LibLouisUTDML and compare the output with known correct values, printing
> >    a report on any errors. This is still the most critical aspect of this
> >    whole project and we've made some great progress. I'd like to get as
> >    close as we can to tying up the remaining loose ends.
> > 
> > 
> > 
> >    I wonder if it might even be possible to test the values produced by
> >    LibLouisUTDML against values from LibLouis when the same text is
> >    processed without the extra XML markup.
> > 
> > 
> > 
> > 
> > 
> >    All of us who are involved in this are pretty covered up so if one or two
> >    others could jump in it would help a lot.
> > 
> > 
> > 
> >    Thanks!
> > 
> > 
> > 
> >    Keith
> 
> --
> John J. Boyer; President, Chief Software Developer Abilitiessoft, Inc.
> http://www.abilitiessoft.com
> Madison, Wisconsin USA
> Developing software for people with disabilities
> 
> 

--
John J. Boyer; President, Chief Software Developer Abilitiessoft, Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities


