[bksvol-discuss] Re: The Broker--strengths and weaknesses

  • From: Guido Corona <guidoc@xxxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Mon, 17 Jan 2005 08:13:58 -0600

Dr. Cross, this is a very good point.  The problem is that the Bookshare 
automated header stripper is far too radical for its own good.
It removes both chapter headers and page headers completely, without even 
leaving a virtual indicator of where page 1 of the book begins.
A kinder, gentler stripper would be easy to implement from a purely 
technical point of view, as demonstrated by the 'carefully' option in 
Kurzweil 1000 9.0.  Even that stripper is not perfect, though, because it 
has no option to leave uncorrupted page numbers intact while removing 
redundant portions of page headers.

I have already suggested to the Bookshare staff that while page numbers 
should probably be removed from the raw body text, they should be 
inserted into page header tags, so that a Daisy reader can voice them or 
ignore them at will.  Furthermore, there should be a mechanism for 
volunteers to manually tag page 1 of the book, so that the Bookshare 
system can sequence the pages correctly.  If page 1 were taggable, the 
Bookshare system could then easily create page header tags without any 
need to preserve the original page headers.
Needless to say, the stripper should not remove chapter and section 
headers altogether; instead it should be smart enough to recognize them 
and wrap them in the appropriate tags.
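
A minimal sketch of the page-1 tagging idea, not a description of the 
actual Bookshare system.  It assumes the scan is already split into one 
string per physical page, that a volunteer supplies the index of the page 
that is page 1, and that numbering is consecutive from there.  The 
function name and the <pagenum> tag (modeled loosely on the DTBook element 
of the same name) are illustrative assumptions.

```python
def number_pages(pages, page_one_index):
    """Prefix each page with a generated page tag.

    pages          -- list of strings, one per scanned page (assumption)
    page_one_index -- index of the page a volunteer marked as page 1

    Pages before page_one_index (front matter) are left unnumbered.
    """
    tagged = []
    for i, body in enumerate(pages):
        if i < page_one_index:
            # Front matter: keep the text, generate no page tag.
            tagged.append(body)
        else:
            # Page number is derived from position, so the original
            # page headers do not need to survive the stripper.
            n = i - page_one_index + 1
            tagged.append(f"<pagenum>{n}</pagenum>\n{body}")
    return tagged
```

Because the numbers are generated from position, a reader could voice or 
skip the tags without any page number remaining in the body text.  A book 
with unnumbered plates or skipped pages would need more than a single 
page-1 marker.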

In the meantime, one way to get around the problem is to split each 
chapter header across two consecutive lines to defeat the stripper.
For those books where access to the print page numbers is important, 
omitting the blank line between the page number and the body text will 
also defeat the stripper.
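
To illustrate why these workarounds might work, here is a toy stripper.  
Its rule is purely a guess inferred from the workarounds above: it drops 
any single non-blank line that has a blank line (or the start or end of 
the file) on both sides.  The real Bookshare stripper may well work 
differently, and the function name is hypothetical.

```python
def strip_isolated_lines(lines):
    """Toy stripper (assumed heuristic, not Bookshare's actual code).

    Drops every non-blank line that is isolated, meaning the lines
    directly before and after it are blank or do not exist.
    """
    kept = []
    for i, line in enumerate(lines):
        prev_blank = i == 0 or not lines[i - 1].strip()
        next_blank = i == len(lines) - 1 or not lines[i + 1].strip()
        if line.strip() and prev_blank and next_blank:
            continue  # looks like a header or page number: removed
        kept.append(line)
    return kept


# An isolated chapter header is removed.
plain = ["", "Chapter 1", "", "Body text", "continues here."]
# A header split over two lines is no longer isolated, so it survives.
split = ["", "Chapter", "1", "", "Body text", "continues here."]
# A page number with no blank line before the body text also survives.
joined = ["17", "Body text", "continues here."]
```

Under this assumed rule, strip_isolated_lines(plain) loses the header, 
while split and joined pass through unchanged.  Such a crude rule would 
also delete any genuine one-line paragraph.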


Guido Dante Corona
IBM Accessibility Center,  Austin Tx.
Research Division,
Phone:  512. 838. 9735.
Email: guidoc@xxxxxxxxxxx
Web:  http://www.ibm.com/able

"Kenneth A. Cross" <crossk@xxxxxxxxxxxx> 
Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxx
01/17/2005 03:31 AM
Please respond to


[bksvol-discuss] The Broker--strengths and weaknesses

Congratulations to the BookShare staff for the timely availability of THE 
BROKER, John Grisham's newest book.  For the person who reads for 
pleasure, it is great and most enjoyable.  But the determination to 
provide the book without pagination is a real problem to some users. 
For example, since I was once an English teacher, I spend a fair amount of 
time either in book discussion groups or running such groups for young 
adults.  Since the page numbers have been carefully removed, I can't ask 
the members of my groups to consult specific pages.  And since there are 
no chapter headings, I can't use those either--I don't even know without 
great effort whether they occur. 
For the same reasons, I can't write about the book, or check the writing 
of others who have provided footnotes to it.  I can't even consult the 
print book quickly to check possible errors in the copy.  For example, I 
am pasting here a sentence which I think is in error:

Everyone sw r allowed hard and waited for the words to escape through the 
heating vents.

Now if I had any idea what page of the print book that was from, I could 
check it almost instantly, but all I know is that it was on page 17 of the 
BookShare copy.
What makes this disturbing to me is that, in the scanning process, the 
page numbers were there.  They had to be carefully eliminated.  And that 
careful elimination means that the book has some real limitations outside 
general reading. 
Paradoxically, the initial submissions of books are probably much more 
useful to someone doing work in teaching and research than are the books 
which have gone through an editing process.  Is there not some way we 
could preserve information and, simultaneously, not plague the general 
