[bksvol-discuss] Re: Scans with Spaces in the Middle of Words

  • From: "Gerald Hovas" <GeraldHovas@xxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Sun, 2 Jul 2006 15:15:48 -0500

I've seen problems with spaces inside words before.  From what I was told,
the submitter was using ScanSoft.  Unless someone knows of a book with this
problem that was not scanned with ScanSoft, it's possible that it's a bug in
that OCR software, or in an earlier version of it if you happen to have a
recent version that doesn't have the problem.

 

Whether it's related to a specific piece of OCR software or not, it's
possible that it happens when a word is split at the end of a line in the
printed book and the OCR software fails to recognize the hyphen at the end
of the first half.  Without the hyphen, the software doesn't see a word
split across a line break, so it leaves the two halves as separate words.
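
For the cases a spell checker can't catch, where both halves are real words
on their own, one option is to flag every pair of adjacent words that joins
into a dictionary word and then review those by hand.  Below is a rough
Python sketch of that idea, not a tested tool.  The function name
find_split_candidates and the tiny word list are made up for illustration;
a real check would load a full dictionary, and it will also flag legitimate
pairs such as "in to", so treat the output as a list of places to look.

```python
import re

def find_split_candidates(text, dictionary):
    """Return (left, right, joined) for adjacent words that join into a known word."""
    # Pull out runs of letters and apostrophes; punctuation is ignored,
    # so pairs can also be formed across sentence boundaries.
    words = re.findall(r"[A-Za-z']+", text)
    candidates = []
    for left, right in zip(words, words[1:]):
        joined = (left + right).lower()
        if joined in dictionary:
            candidates.append((left, right, joined))
    return candidates

# Hypothetical stand-in word list; replace with a real dictionary file.
sample_words = {"i", "will", "not", "for", "get", "forget", "the", "day"}
sample_text = "I will not for get the day."

for left, right, joined in find_split_candidates(sample_text, sample_words):
    print(f"'{left} {right}' might be '{joined}'")
```

This only narrows down where to look; it doesn't decide whether the space
is wrong.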

 

HTH

 

Gerald

  _____  

From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Bud Schwab
Sent: Sunday, July 02, 2006 10:07 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: Scans with Spaces in the Middle of Words

 

I have the same trouble.  Hope somebody comes up with a solution or at
least an explanation.  I'll be watching.

Bud

At 05:11 AM 7/2/2006, you wrote:



I'm validating a book now that has a random space in the middle of words,
perhaps four or five times a page.  The spell checker will catch most of
these, but if the space appears in such a place that the characters before
and after the space both form words, I'll never know.
 
I suspect there's no way around this but reading the book through, which I
may not have time to do.
 
Any idea what causes the OCR to do something like that?
 
Just curious,
 
Lora
 



                          
Bud Schwab               
W 6 Z Y P
Malibu, California
                 
