[bksvol-discuss] Re: question for member bookshare readers re tables of contents

  • From: "Mayrie ReNae" <mayrierenae@xxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Fri, 15 Jan 2010 11:56:05 -0800

Hi Lynn,
Page numbers will be picked up whether they are at the bottom or top of
pages.  However, on pages where chapter headings occur, it has been
recommended by Carrie that we still put the page number above the chapter
heading.  For some reason, chapter headings were still being stripped even
with the new converter, unless the page number was above them and separated
from them by a blank line.
Chapter headings need to be larger than the font of the book by four points
in order for them to be navigated as a heading in the daisy files.  
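
For anyone who wants to check a heading against that rule, here is a minimal sketch. The function name and the default gap are made up for illustration, and reading "by four points" as "at least four points" is an assumption; this is not Bookshare's converter logic.

```python
def is_navigable_heading(heading_pt, body_pt, min_gap_pt=4):
    """Return True if the heading is at least min_gap_pt points larger than the body text.

    Hypothetical helper: the four-point gap comes from the message above;
    treating it as a minimum rather than an exact difference is an assumption.
    """
    return heading_pt - body_pt >= min_gap_pt


# With 12-point body text, a 16-point chapter heading meets the rule,
# while a 14-point minor heading does not.
print(is_navigable_heading(16, 12))
print(is_navigable_heading(14, 12))
```
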
Getting rid of extra blank lines is something that just makes reading easier
for either a braille-reading or a sighted proofreader.  As Monica said,
Bookshare's tools will remove them if the proofreader doesn't.
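
If you would rather clean up blank lines yourself on a plain-text copy, something like the following would do it. This is only a sketch with a made-up function name, not the Bookshare tool; it keeps a single blank line wherever there was a run of them, so a page number separated from a chapter heading by one blank line stays separated.

```python
def collapse_blank_lines(text):
    """Reduce every run of blank (or whitespace-only) lines to one blank line."""
    kept = []
    previous_blank = False
    for line in text.splitlines():
        blank = not line.strip()
        if blank and previous_blank:
            # Skip the second and later blank lines in a run.
            continue
        kept.append("" if blank else line)
        previous_blank = blank
    return "\n".join(kept)


sample = "12\n\n\n\nChapter One\n \n\nIt was a dark night."
print(collapse_blank_lines(sample))
```
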
As for page numbers being separated from the text by a blank line, the
powers that be say that we do not need to do this.  However, because I have
encountered books with missing page numbers, or page numbers incorporated in
the text of the book, rather than being separated from it as they should be,
I have chosen, as have others, to continue separating them from the text by
a blank line, whether the page numbers appear at the top or bottom of pages.
Bold and italics are indeed preserved in bookshare's books.  And as you have
said, when chapter headings are sometimes bolded and sometimes not in a
scan, I opt to bold them.  
Does that answer most of your questions?


From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Lynn Zelvin
Sent: Friday, January 15, 2010 11:40 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: question for member bookshare readers re
tables of contents

OK. I am both relieved and confused by this answer. I guess it means I'll
stop removing blank lines. I was trying to make sure I only removed those
which were just space between paragraphs and not those which might be
intentional breaks, but if they're all going away anyway, I can stop
thinking about it. Since the guidelines now really say to read the whole
book, it's no big deal to get rid of blank lines; it's just the extra
attention to this. Removing line breaks within paragraphs, at least those we
can be sure about, takes longer, even doing search-replace 27 times.

What confuses me more is what's a change we absolutely have to make for the
books to convert properly. Like, this question of ellipses in tables of
contents. It sounded like Bookshare volunteers were trying to set a
standard, rather than the ellipses being something that was
needed for the book to be processed properly.  It sounded like volunteers
had decided on page numbering at the top with blank lines before and after
them rather than it being something needed - will the automation work just
as well if page numbers are at the bottom of the page with or without blank
lines before them?  In the manual the checklist talks about standardizing
the font and using 16 point for chapter headings. So, I did this, using 14
point for minor headings within chapters. But it also seems to me that doing
this removes font changes the author may have intended. There wasn't
anything about this. Is there actually a font size that is used by the
automation process? Or is that another volunteer agreement? Is bold text in
daisy books just where bolding was used for emphasis in the original? Is it
added or removed in headers? Some of the minor headings in this book I'm
working with now were in bold and a few were not and the difference didn't
seem to be in meaning, so I bolded them all, figuring that occasionally the
scanner just didn't pick up the difference. But really, it sounds like
nothing there matters because it will just be removed or added. I guess,
that if we're not putting in codes, I'd want to know which formatting
indicators the automation process actually uses, whether it's font size or
bolding or just shorter lines with blank lines before and after, or what,
and if nothing we do for a specific type of item matters, I'd like to know
that also and I'll not bother. I have a similar question about footnotes,
but will put that in a different message since it's sort of a different

I wanted to think I could follow instructions without asking a million
boring questions on this list, but now I feel really confused. I used to do
formatting for braille, both literary and textbook formatting, and it felt
much clearer and easier because the rules were detailed and specific. I
think the work is faster and easier when the rules are clear. And I'd
personally vote for the option to mark sidebars and captions and any other
feature that will actually be used, like, any indications that will make
final processing of tables work better, rather than having to just leave
them unclear. Certainly, if some books had features that others lacked, an
end user would still benefit from having it when it's there. 


At 03:45 PM 1/14/2010, you wrote:

Hi Lynn. Bookshare does automate a lot of things like page numbering,
chapter indication, and removal of extra spaces and blank lines. Some
volunteers are choosing to do things like delete blank lines by choice, not
because they have to. It does make proofreading more comfortable for our
sighted and Braille reading volunteers. Since I don't know how my
proofreader will be working when I submit a book, I usually take a minute to
do this step. However, I only do it because I can do so in less than 30
seconds. If it took much longer than that, I wouldn't bother since the
Bookshare tool can do it during conversion. I definitely wouldn't do it if I
had to do it manually, deleting line by line.
Bookshare's processor tool will number pages without page numbers. However,
if a submitter submits a book with no page numbers, it can make proofreading
and identifying missing pages more difficult. With no page numbers, you
don't know for sure which pages to ask someone to scan for you if they're
missing from the book. You have to guess and try to piece the text together.
That's why Bookshare asks us to submit books with page numbers if they scan
well enough. So that's a clarity issue rather than an automation problem.
As for the formatting, Bookshare daisy files do have font size changes,
bolded text, as well as page and often chapter navigation. The catch is that
not all daisy players work in the same way. Some older players like the
Maestro don't handle chapter navigation at all. On the flip side, the now
free Freedom Scientific daisy reader for JAWS and Pac Mate uses chapter
navigation very well. Since there is such a wide range of functionality
among daisy players, Bookshare has chosen to write their code to validate
against the standards from the Daisy Consortium instead of writing for
specific devices. So the old cliché "results vary" applies here. (smile)
Interestingly, the daisy format can support a caption element. However, when
Bookshare staff asked if we'd be willing to use it to mark captions or
sidebars, about half of the volunteers said that it would be too burdensome.
It became a bit of a controversial issue. The idea was dropped at that
point. I was disappointed because I thought it would go a long way toward
making things like sidebars and text boxes more distinguishable when they
interrupt the flow of text in a book. I'd like to see the staff revisit the
issue, making it possible for those of us who want to label captions to do
so. If they made it optional, I think people would gradually begin trying
it, especially after seeing the improvements in books they read with better
navigation. Those that felt uncomfortable with the process could skip it
with no pressure or anything.
Monica Willyard
"The best way to predict the future is to create it." -- Peter Drucker


From: bksvol-discuss-bounce@xxxxxxxxxxxxx [
mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Lynn Zelvin
Sent: Thursday, January 14, 2010 1:07 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: question for member bookshare readers re
tables of contents

There was a point where just being able to read a book at all was considered
wonderful enough and better scanning quality was the best we could hope for.
I'm appreciating all the thought it seems is now being put into formatting
and other improvements.  That being said, we're really working with very
inadequate tools in trying to do better with this. We're asking the question
as to the best compromise solution considering that there are probably a
half dozen different formats in which people are reading these books -
braille with several different page widths, text-to-speech with some people
just sitting back and listening and others actively moving through the text
as they read, some with a screen reader on their computer and some with a
DAISY player on a computer or stand-alone device. Some use enlarged print,
where some enlarge the text, meaning again we have different page widths, and
others leave the original text and use software to magnify their screen
image. Some are using combinations, both looking at the words and listening.
I've occasionally had things I did with both speech and braille, although I
don't know if anyone actually reads that way. There are probably more that
I'm not thinking about. 

It's impossible to find one best way for all of these. The answer is in
being able to use formatting and style codes, or at least in being able to
standardize, and then for the final formats to make use of those codes. So
if you code something as a page number, when converting to braille, it can
be, for example, placed in the top right corner regardless of how wide the
page is, could be spaced differently for different presentations of
enlargement, and could contain a code that lets the daisy player actually
know it's a page number. A line of dots in tables of contents could be
present in visual and braille presentations, adjusted for page width, and
be active links to the actual page.  

I'd thought Bookshare was doing some of that, but as I do tend to ignore
formatting when I read, I can't say I've noticed. I don't use daisy
players or anything else fancy as I don't like the speech engines they use.
I'm sure they must certainly be doing this with the NIMAC books that we
aren't allowed to access. I was going to go poking around in some of the
books I already have, but it would be easier for someone who already knows
to give an answer. Even though they don't ask volunteers to add in codes,
I'd assumed they did some things by automation, like coding as page numbers
sequential numbers that appear at the beginning and end of pages.  If
they're not, considering all the work volunteers are now putting into these
books, it seems we should ask them for a few codes we can use.  Validators
who chose could then properly code tables of contents, chapter headings,
page numbers, and footnotes, at the least. The volunteer manual I saw did
recommend enough standardization of such things that it does seem
Bookshare could be making use of such efforts in the conversion process.
Maybe they're afraid not enough people would validate books if they were
expected to do this, but since some *are* doing it, maybe we could get some
guidance from them. Maybe if they are not making full use of our efforts, we
could prod them? 

Am I correct in my new reckoning that there is a gap between volunteers and
paid staff, that people making decisions about what to automate and how to
convert books are not interacting with people doing scanning and validating?
Is it that the hopes, which I share, are pinned on getting text from
publishers in the future and thus not needing to go through all this? Well,
even then we'll still need these tools to include older books in the
collection. In the past validators were asked to do simpler things like make
sure all the pages seem to be there. It's a lot more now and I think that's
good, but what a shame for us here making such compromise decisions when we
could do something that will really be used properly. Has this been
discussed already? 

