[bksvol-discuss] Re: Scans with Spaces in the Middle of Words

  • From: Stephen Baum <steve@xxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Thu, 06 Jul 2006 10:54:44 -0400

For a while, one version of Kurzweil 1000 had a problem when using ScanSoft as the recognition engine that sounds like it would produce the kind of defect you are describing: when an end-of-line hyphen was removed, a space was left in its place. This was fixed quite a while ago in a patch.
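To make the defect concrete, here is a minimal sketch of a de-hyphenation step. It is not Kurzweil's or ScanSoft's actual code; the function names are made up for illustration, and the only assumption is that the OCR text has a hyphen immediately before the line break where a word was split.

```python
def dehyphenate_buggy(text):
    # Illustrative only: the hyphen and line break are replaced by a space,
    # so the two halves of the word stay apart.
    return text.replace("-\n", " ").replace("\n", " ")


def dehyphenate_fixed(text):
    # The hyphen and line break are dropped together, so the halves rejoin.
    # Remaining line breaks become ordinary spaces.
    return text.replace("-\n", "").replace("\n", " ")


sample = "the recog-\nnition engine\nread the page"
print(dehyphenate_buggy(sample))
print(dehyphenate_fixed(sample))
```

Note that even the fixed version is only a sketch: a compound that is genuinely hyphenated and happens to break at the hyphen (for example "well-" at the end of one line and "known" on the next) would lose its hyphen, so a real tool would need a dictionary check as well.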

Stephen

At 04:15 PM 7/2/2006, you wrote:
I've seen spaces inside words before. From what I was told, the submitter was using ScanSoft. Unless someone knows of a book with this problem that was not scanned with ScanSoft, it may be a bug in that OCR software, or in an earlier version if you have a recent version that doesn't show the problem.

Whether or not it's tied to a specific piece of OCR software, the cause may be a word that was split across two lines in the printed book. If the OCR software doesn't recognize the hyphen at the end of the first half, the word doesn't look like one split by a line break, and the two halves are left apart.

HTH

Gerald

----------
From: bksvol-discuss-bounce@xxxxxxxxxxxxx [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Bud Schwab
Sent: Sunday, July 02, 2006 10:07 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: Scans with Spaces in the Middle of Words


I have the same trouble. Hope somebody comes up with a solution or at least an explanation. I'll be watching.

Bud

At 05:11 AM 7/2/2006, you wrote:

I'm validating a book now that has a random space in the middle of words, perhaps four or five times a page. The spell checker will catch most of these, but if the space appears in such a place that the characters before and after the space both form words, I'll never know.
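The case the spell checker misses, where both halves are themselves words, can be narrowed down with a rough heuristic: flag every pair of adjacent words whose concatenation is also a dictionary word. The sketch below assumes a word list is available as a Python set; `find_split_candidates` and the tiny `dictionary` are hypothetical names used only for illustration.

```python
import re


def find_split_candidates(text, dictionary):
    """Return (left, right, joined) for adjacent words that join into a known word."""
    words = re.findall(r"[A-Za-z]+", text)
    hits = []
    for left, right in zip(words, words[1:]):
        joined = (left + right).lower()
        if joined in dictionary:
            hits.append((left, right, joined))
    return hits


# Hypothetical miniature word list; a real check would load a full dictionary.
dictionary = {"the", "car", "pet", "carpet", "was", "in", "formation", "information"}
for hit in find_split_candidates("The car pet was in formation", dictionary):
    print(hit)
```

This only produces candidates for a human to review. Legitimate pairs such as "in to" or "can not" will also be flagged, so it shortens the read-through rather than replacing it.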

I suspect there's no way around this but reading the book through, which I may not have time to do.

Any idea what causes the OCR to do something like that?

Just curious,

Lora




                                Bud Schwab              W 6 Z Y P
Malibu, California

