[bksvol-discuss] Re: pages that won't scan right.

  • From: "Gerald Hovas" <GeraldHovas@xxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Tue, 5 Jun 2007 10:33:04 -0500

Laura,

Having columns enabled when recognizing tables will cause information which
should appear on the same line to be placed further down the page.  For
example, when recognizing a table of contents, if columns are enabled, then
all of the page numbers will appear after all of the chapter titles instead
of each page number being placed on the same line as the chapter title.
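
To make the ordering problem concrete, here is a minimal Python sketch.  It is
a toy model only, not how FineReader, ScanSoft, or Kurzweil work internally;
the `toc` data and both function names are made up for illustration.  Each
entry is a piece of text with an assumed column and row position on the page.

```python
# Toy model: (text, column_index, row_index) for a three-line table of contents.
# These positions are invented for illustration, not produced by any OCR engine.
toc = [
    ("Chapter 1", 0, 0), ("3", 1, 0),
    ("Chapter 2", 0, 1), ("17", 1, 1),
    ("Chapter 3", 0, 2), ("42", 1, 2),
]


def read_with_columns(cells):
    """Columns enabled: read all of one column before starting the next."""
    ordered = sorted(cells, key=lambda c: (c[1], c[2]))
    return [text for text, _, _ in ordered]


def read_without_columns(cells):
    """Columns disabled: read straight across each line, top to bottom."""
    rows = {}
    for text, _col, row in sorted(cells, key=lambda c: (c[2], c[1])):
        rows.setdefault(row, []).append(text)
    return [" ".join(rows[r]) for r in sorted(rows)]


if __name__ == "__main__":
    print(read_with_columns(toc))
    print(read_without_columns(toc))
```

In this sketch the first function returns every chapter title followed by every
page number, while the second keeps each page number on the same line as its
chapter title.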

BTW, FineReader is up to version 8.

As for reasons to use one OCR engine over another, each has its own
strengths and weaknesses.  For example, it was my experience that FineReader
7 did a better job at recognizing punctuation, and the previous version of
ScanSoft did a better job at recognizing larger text or text in headers.  My
personal preference then was to use FineReader, since I would rather hunt for
misread text than for punctuation errors.  I've only had the current versions
of the OCR packages for a couple of weeks, so I can't say what improvements
have been made to either package.

I think the best advice I can give you is to try both.  Then either settle
on the one that tends to do the best job on the type of books you like to
scan, switching to the other only if you run into a problem, or give each a
try every time you start a new scanning project.  BTW, K-1000's Optimization
feature is meant to help you with the latter, if you happen to have K-1000.

HTH

Gerald

-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Laura Glowacki
Sent: Tuesday, June 05, 2007 8:58 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: pages that won't scan right.

Two questions:

Why do you disable columns?  Is it advisable to do that all of the time?
I'm a newbie scanner so I'm learning invaluable tricks through this thread.

Also, why FineReader 7?  I see the different engines, but can anyone give me
an explanation of the differences between them and why one might be
better than another in different situations?

Thanks,
Laura
----- Original Message ----- 
From: "Donna Smith" <donnafsmith@xxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Monday, June 04, 2007 6:45 PM
Subject: [bksvol-discuss] Re: pages that won't scan right.


> Monica has already given some very good advice and she let us know that
> you're using Kurzweil.
>
> First, I don't think you get the best scan when you set your program to scan
> and recognize and read all at the same time.  I, too, like to read while I
> scan, but I use something other than my computer to read to me, such as a
> good old-fashioned book on tape or some other form of digitized book via a
> reader like BookPort, etc.  If all else fails, I just put on the radio or
> TV.  Not only do I think I get a better scan, but it's much quicker if you
> do each function separately.
>
> So here's what I do when I use Kurzweil.
>
> First I test scan a few pages in different parts of the book just to see
> what kind of scan I will get.  If it's a funky scan, I play around with the
> optimizing settings or I change the recognition settings, but I'm getting
> ahead of myself.
>
> To scan I do the following:
>
> 1.  I go to settings and choose scanning.
> 2.  I arrow down once to set image scanning only.
> 3.  I tab once and set the page orientation such as right of scanner which
> is typical for me when scanning two pages at once.
> 4.  I tab over and set the delay between scans to 3 seconds, but this should
> be set for whatever makes you comfortable.
> 5.  I tab to OK and press enter.
>
> Now I'm ready to scan.  I scan the whole book like this, or until I get
> tired of scanning which may happen halfway through or whenever.
>
> For the recognition phase, I do the following:
>
> 1.  I go to settings and arrow down to recognition and press enter.
> 2.  I disable column identification.
> 3.  I set it to recognize two pages.
> 4.  I tab over to text quality and leave it on normal if the test pages were
> good, or set it for degraded if it was a poor scan.  I've never had to set
> one for draft quality, yet.
> 5.  I tab over to set partial columns to ignore.
> 6.  I set it to retain blank pages.
> 7.  (And this may be the most important part), I choose FineReader 7 as my
> selected recognition engine.
> 8.  I then tab to OK and press enter.
>
> To start the recognition, I go back to settings and scanning, arrow down to
> recognize only, shift tab to OK and then start the recognition.
>
> Still, no matter how well you set things up, sooner or later you'll need to
> rescan a few pages or add back in pages that got skipped.  I find these by
> checking through the scan using the page down key and checking lines
> periodically, discover the page numbers of the pages that need correction,
> and then make my best guess as to where those might be in the book.  I use
> the standard scan and recognize feature to spot scan pages until I find the
> right pages, then I go back through all the settings above to get the best
> scan possible.  This usually does the trick.  If it doesn't, then it's
> usually because there is a table or diagram or something funky on the page
> other than straight text.  That takes some trial and error to fix depending
> on what it is and sometimes it takes a good ole sighted person to help you
> figure it out!
>
> Hope this helps.
>
> Donna
>
> To unsubscribe from this list send a blank Email to
> bksvol-discuss-request@xxxxxxxxxxxxx
> put the word 'unsubscribe' by itself in the subject line.  To get a list 
> of available commands, put the word 'help' by itself in the subject line.
> 
