[bksvol-discuss] Becoming A Black Belt With openbook

  • From: "Monica Willyard" <rhyami@xxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Wed, 17 Dec 2008 15:08:34 -0500

Hi, everyone. I wrote an email about getting really clear scans and am
modifying it for one of our volunteers who is using Openbook. I'm
posting it publicly in case someone on this list might benefit
from it. It's a little on the long side. I hope something in it will
help you. If I've said anything confusing, please ask me about it. I
know we have volunteers at all levels of computer knowledge, so I'm
focusing on things that may not have occurred to you yet. I'll call
them my top ten Openbook scanning tips. (grin) They work in my
experience, but you may find that you need to experiment to find
something that works well for you. I have access to Openbook 6 and 7.
Most of this should apply for people using version 8 and even version
5 if you still use that. WYNN Wizard users may find some of this to be
useful, but some of it won't apply. I have Openbook 7 installed right
now and have used it for several years, so I'll do my best to help you
translate these to other versions of Openbook if that's what you need.

I got a lot of these ideas from volunteers I've been fortunate enough
to work with over the past 2 years. Jim Baugh, Louise, Pratik, Jake,
Scott, Shelley, and Gerald taught me so much about good scanning.
Thanks, guys. (smile) You rock!

1. Start with some solid settings in Openbook that will work most of
the time. You may know your way around Openbook well. I don't know if
you've thought to work on these settings though, since they're not
obvious. Under the
settings menu, in the scanner settings tab, make sure that your
despeckle setting is unchecked. In addition, uncheck the option to
scan white text on a black background. These options work well when
scanning newspapers and hardcover books that have a small decoration
around the text. For most books, these settings will actually degrade
performance. In Openbook 7 and later, turn off the language analyst
too. It can introduce OCR errors into your document. Once you have
settings you like, save them as default so you can start scanning
without worrying about them each time you start Openbook.

2. Prepare your book for scanning, and you'll get better results from
the start. Before you begin to scan a book, run your fingers lightly
through the pages to remove any ink, dust, or other particles that may
be on them. If the book is a library book, flip through it in sections
of about fifteen pages or so, gently pressing your fingers along the
inner spine to encourage the book to lie flat. If the book belongs to
you, especially if it's a paperback, flip through sections as with a
library book, but bend the book back so that its outer covers almost
touch. You're giving your book some flexibility stretches while not
breaking its spine. This is especially important for thick books or
when you use two-page scanning mode, and it will keep you from having
to push down as hard on books while you scan.

3. Optimize and verify settings for your book. Openbook doesn't have
an optimization feature like Kurzweil, but you can do this yourself
before scanning a whole book. Start with a base of good settings. Use
the resolution setting of 300 DPI for best results. Don't worry about
turning on color scanning unless you're doing a magazine or really
glossy book with lots of photos. Color scanning will just slow you
down if you don't need it. Before scanning a book, open to the center
and do several test scans, adjusting the contrast setting until you
like what you hear. Scanners do have personalities, and they tend to
have a certain contrast setting that works best most of the time. If
you have a high-quality scanner like an Epson, Opticbook, or HP, the
auto contrast feature may work really well for you. If you're using
something like a Canon or Visioneer, you will need to spend more time
adjusting the contrast setting. My old Canon seemed to do best with
the lighten page option. My old HP did best with the darken page
option for most books. Testing 4 or 5 pages in your book will help you
decide which contrast option to use. Once you have figured this out,
please save this as a settings file with the same name as your book.
If you skip this step, you'll have to start over with adjusting
settings when you start Openbook. If you save the settings and only
scan half of your book, you can start Openbook again and load your
settings. Giving them the same name as the book you're scanning will
help you locate the settings file quickly.

4. If someone suggests that you use greyscale, smile politely and
discard the idea. Openbook doesn't implement greyscale correctly, so
automatic contrast is probably your best choice if a scan isn't coming
out well.

5. Catch bad scans as they happen. There is a friendly debate among
submitters about whether to scan in batches or to scan pages and
recognize them one at a time. There are pros and cons on both sides. I
think this is one area where Openbook makes a submitter's job harder
than it has to be. Since Openbook has no feature to tell you about the
scan quality as you're working, your best bet is to either scan and
proofread as you go or scan 10 to 20 pages at a time and then read
them to make sure your scan is still coming out OK. Nothing is more
frustrating than scanning a 300-page book and discovering that over
half of the pages are a mess. Rescanning is no fun at all!

6. Your scanner needs regular TLC too. Books can be dirty or dusty
sometimes. Mass market paperbacks can leave a residue of ink dust on
your scanner. Keep the scanner glass clean by using a dry, lint-free
cloth. Never use anything wet like an alcohol pad or baby wipe. That
will create little bubbles under the scanner glass and will cause
problems in future scans.

7. When scanning a book in batch mode, do a spot check every 15 or 20
pages. Look at the last page or two of the file to make sure the
settings are still producing accurate results.

8. After doing a scan, run your spellchecker. It will show you the
spelling errors and let you fix them more quickly than reading through
the document and fixing errors individually. If you find some words
that Openbook doesn't know, you may want to add them to your word list
so they won't be flagged in future scans. I don't do this for proper
names unless it's a name that will keep cropping up in future books. I
do add words that are valid but that Openbook doesn't have in its
internal word list. You'll find that doing this over time helps
Openbook do a better job for you when you're cleaning up your scans.
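
If you end up with a plain text copy of a scan outside Openbook, the
same "which words are worth adding" question can be roughed out with a
few lines of Python. This is only a sketch and has nothing to do with
Openbook's own spellchecker: the function name, the tiny word list, and
the sample sentence are all made up for illustration. It counts how
often each unknown word shows up, so a name that keeps cropping up
stands out from a one-off.

```python
import re
from collections import Counter


def unknown_word_counts(text, known_words):
    """Count words in text that are not in known_words (case-insensitive)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    return Counter(word for word in words if word not in known_words)


# Hypothetical word list; a real one would have many thousands of entries.
KNOWN = {"the", "captain", "spoke", "to", "and", "again"}

sample = "The captain spoke to Spock. Spock spoke to the captain again and Zorvax."

# Most frequent unknown words first: these are the candidates to add.
for word, count in unknown_word_counts(sample, KNOWN).most_common():
    print(word, count)
```

Here "spock" is counted twice and "zorvax" once, so the first is the
better candidate for a word list.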

9. Do all of your page rescanning, adding pages, spellchecking,
reading, or editing that you care to do in Openbook. Then save your
file as an rtf. Once you've saved it as an rtf, do not keep editing it
in Openbook because Openbook won't save it properly. So once it's an
rtf file, switch to Word or WordPad to continue editing or whatever.
To save as an rtf file, press Alt+F for the File menu, then the letter
A for Save As. Tab over to the file type list and choose rtf. Pressing
the letter R in the list should take you right to the rtf option. By
default, Openbook puts files in its library directory. You may want to
navigate to the My Documents folder before saving your file. Then tab
over to the Save button and press Enter.

10. Using auto-corrections when scanning is another area where there
is debate. I believe it can be a good thing if used carefully. I
should note that Gerald has pointed out that Openbook has some
auto-corrections that cause problems with books and should be fixed by
users of that program. Here are a few auto-corrections I have added to
my auto-correction list, with the misread word first and the word it
should be second.

dirough for through
diough for though
diought for thought
diey for they
diere for there
dieir for their
cornpany for company
cornfortable for comfortable
tiiing for thing
rnany for many
anydiing for anything
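
For anyone who wants to run these same corrections over a text file
after it leaves Openbook, here is a minimal Python sketch. It is not
how Openbook handles its list internally; the function name and the
sample sentence are made up for illustration. It only replaces whole
words, so a correction never fires in the middle of a longer word.

```python
import re

# Misread words from the list above, mapped to the intended words.
CORRECTIONS = {
    "dirough": "through",
    "diough": "though",
    "diought": "thought",
    "diey": "they",
    "diere": "there",
    "dieir": "their",
    "cornpany": "company",
    "cornfortable": "comfortable",
    "tiiing": "thing",
    "rnany": "many",
    "anydiing": "anything",
}

# One pattern that matches any listed misreading as a whole word.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, CORRECTIONS)) + r")\b",
    re.IGNORECASE,
)


def apply_corrections(text):
    """Replace each listed misreading, keeping a leading capital letter."""
    def fix(match):
        word = match.group(0)
        replacement = CORRECTIONS[word.lower()]
        if word[0].isupper():
            replacement = replacement.capitalize()
        return replacement

    return _PATTERN.sub(fix, text)


print(apply_corrections("Diey walked dirough the cornpany lobby."))
# prints: They walked through the company lobby.
```

Every entry here is a string that isn't a real word, which is what
makes blanket replacement reasonably safe.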

If you use Openbook, you may want to remove a few of the corrections
in its default list. I regularly find these in books scanned in
Openbook and have to fix them as I read.

modem for modern
tom for torn
glock for clock
morn for mom
bum for burn
com for corn

That last one causes problems for anyone scanning Star Trek books
because Kirk presses his corn badge to talk to the ship. (grin) If a
word like command
is hyphenated between two pages, you get corn-mand. Meanwhile, Batman
dials into the internet with his modern, tries to stop a crook named
torn from shooting him with a clock, and puts the dirty burn in cuffs
until mom-ing. See how auto-corrections can go wrong if you're not
careful?
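
Here is a small Python sketch of why those rules misfire. It is an
illustration only, not Openbook's code: the function names and the
tiny word set are hypothetical, and a real check would load a full
dictionary. The idea is that a rule is risky whenever the word it
replaces is itself a real word, and that whole-word matching still
catches the first half of a hyphenated word.

```python
import re

# Hypothetical stand-in for a real dictionary of valid words.
KNOWN_WORDS = {"modem", "modern", "com", "corn", "through", "company"}


def risky_rules(rules, known_words):
    """Return the rules whose 'wrong' spelling is itself a valid word."""
    return {
        wrong: right
        for wrong, right in rules.items()
        if wrong.lower() in known_words
    }


def apply_rule(text, wrong, right):
    """Replace whole-word occurrences of wrong with right."""
    return re.sub(r"\b" + re.escape(wrong) + r"\b", right, text)


rules = {
    "dirough": "through",   # safe: "dirough" is not a word
    "cornpany": "company",  # safe: "cornpany" is not a word
    "modem": "modern",      # risky: "modem" is a real word
    "com": "corn",          # risky: "com" turns up in real text
}

print(risky_rules(rules, KNOWN_WORDS))
# prints: {'modem': 'modern', 'com': 'corn'}

print(apply_rule("Kirk pressed his com badge and took com-mand.", "com", "corn"))
# prints: Kirk pressed his corn badge and took corn-mand.
```

The hyphen counts as a word break, which is how com-mand turns into
corn-mand while an unbroken "command" is left alone.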

Whew! We've made it to the end. (grin) I hope some of this makes your
scans easier to work with. It'll give you a foundation to start from
anyhow. Clean-up tips will be another email and will take some
thought. I'm better at doing than explaining things. I do have a
system I use though. I just haven't really written it down. Anyone got
a cold Dr. Pepper to share?



-- 
Monica Willyard
Visit my blog at http://www.scannersguild.com