[bksvol-discuss] Re: New Volunteer questions

  • From: "Monica Willyard" <rhyami@xxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Mon, 24 Nov 2008 02:17:37 -0500

Hi, Cindy. The stripper was developed because Bookshare needed a way to
normalize the text in its books. It was written almost seven years ago,
though, and it needs an overhaul. I'll explain it as best I know how. It
makes sure that each page is given a number, removes multiple blank lines,
tabs, and spaces, and removes headers or footers from the pages. Most print
books have a running header at the top of each page or a footer along the
bottom, and scanning picks these up. While these are fine in print, they can drive us
bonkers because our speech or Braille reads them right along with the text
of the book. It's distracting at best and downright confusing or misleading
at worst. The stripper removes those pesky headers and footers so we
don't hear the title or author's name each time we move to the next page.

The problem we run into is that the stripper isn't very smart. Sometimes it
thinks the name of a new chapter at the top of a page is a header and
removes it. That's not what we want. So we have settled on a work-around, a
truce until the new site is launched. This work-around is optional since it
does add a little time to your scanning or validating. People are smarter
than the stripper and do a better job of handling headers. Doing it by hand
produces the best quality, and it is something I almost always do. The
stripper looks for page numbers, and someone figured out that putting a page
number on the top of the page would help the stripper understand that a
chapter name is supposed to be there. We delete the header text, make sure
the page number is at the top of the page, follow it with a blank line, and
then start the first line of text. Doing this makes sure that things like
chapter names are protected. As a side benefit, this helps you do a page
check where you'll quickly notice if a page is missing from your file. The
header removal is my first step when I validate a book.

This sounds more time-consuming than it actually is. Once you get a rhythm
going, it doesn't really take any brain power. It's like counting stitches
while knitting or snapping green beans after you've picked them. In fact, I
get on the phone with a friend and strip headers while we chat. If I can't
find a friend to talk to, I put on an audio book or some music. The time
goes by pretty quickly, and I reward myself with hazelnut coffee with lots
of cream when I'm finished. (grin)
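
If you would like a quick way to double-check the layout described above
(page number at the top, a blank line, then the text), here is a minimal
Python sketch. It is only an illustration and says nothing about how the
stripper itself works. It assumes you already have each page as a separate
string. The function names page_number and missing_pages are made up for
this example, and it only understands plain digits, so front matter numbered
with roman numerals would be reported as not following the layout.

```python
import re


def page_number(page):
    """Return the page number if the page follows the work-around layout.

    Layout checked (an assumption based on the description above):
    line 1 holds only digits, line 2 is blank, line 3 has text.
    Returns None when the page does not match.
    """
    lines = page.splitlines()
    if len(lines) < 3:
        return None
    first = lines[0].strip()
    second = lines[1].strip()
    third = lines[2].strip()
    if re.fullmatch(r"[0-9]+", first) and second == "" and third != "":
        return int(first)
    return None


def missing_pages(pages):
    """List page numbers that are absent between the lowest and highest found."""
    found = {n for n in (page_number(p) for p in pages) if n is not None}
    if not found:
        return []
    return [n for n in range(min(found), max(found) + 1) if n not in found]


if __name__ == "__main__":
    # Hypothetical sample pages; page 3 is deliberately left out.
    sample = [
        "1\n\nChapter One\nIt was a dark night.",
        "2\n\nThe rain kept falling.",
        "4\n\nChapter Two\nMorning came.",
    ]
    print(missing_pages(sample))  # prints [3]
```

A page that still has its header above the page number comes back as None
from page_number, so it is easy to spot pages that were skipped.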

Monica Willyard

"The best way to predict the future is to create it." -- Peter Drucker
