[bksvol-discuss] Re: Scans with Spaces in the Middle of Words

Cindy, the K1000 folks were just on the list saying this was a bug in Kurzweil that can be fixed with a patch.


At 01:55 PM 7/6/2006, you wrote:

One possibility occurs to me--if the scanner just
removes the end-of-line hyphens but leaves a space
where each one was, then the spaces would
remain. I don't think one can do a global replace,
though, because then some words that should be
hyphenated would be run together instead--of course that
can be corrected by the validator, if he/she reads the
book, but not if he/she just does the minimum
required. Still, having those words together without a
hyphen probably is preferable to having spaces inside
words where they don't belong. smile

Cindy
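
As a rough illustration of why a blanket global replace is risky, here is a minimal Python sketch of a dictionary-based check. It is only an assumption about how one might approach the cleanup, not how Kurzweil 1000 or ScanSoft actually work. The function name rejoin_split_words and the tiny WORDS set are hypothetical; a real run would need a full spelling dictionary. Two neighbouring fragments are joined only when the joined form is a known word and at least one fragment is not. When both halves are words on their own, the pair is only reported, since that is the case a spell checker cannot catch.

```python
import string

def rejoin_split_words(text, dictionary):
    """Join fragments such as 'scan ner'; report pairs like 'some one'.

    Returns (fixed_text, ambiguous_pairs). Line breaks and runs of
    spaces in the input are collapsed to single spaces.
    """
    tokens = text.split()
    fixed = []
    ambiguous = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            left, right = tokens[i], tokens[i + 1]
            # Look words up in lower case, ignoring trailing punctuation.
            l_key = left.lower()
            r_key = right.lower().rstrip(string.punctuation)
            joined = l_key + r_key
            if left.isalpha() and r_key.isalpha() and joined in dictionary:
                if l_key in dictionary and r_key in dictionary:
                    # Both halves are real words: leave for a human reader.
                    ambiguous.append((left, right))
                else:
                    fixed.append(left + right)
                    i += 2
                    continue
        fixed.append(tokens[i])
        i += 1
    return " ".join(fixed), ambiguous

# Hypothetical stand-in for a real spelling dictionary.
WORDS = {"the", "scanner", "left", "a", "space", "in", "every", "book",
         "some", "one", "someone", "called"}

print(rejoin_split_words("The scan ner left a space in every bo ok.", WORDS))
print(rejoin_split_words("some one called", WORDS))
```

The second call leaves the text unchanged and returns the pair ('some', 'one') for review, which is the situation Lora describes: only reading the passage can settle it.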

--- Bud Schwab <budschwab@xxxxxxxxxxx> wrote:

> I get quite a few of those breaks in the middle of a word and it's
> not necessarily at the end of a line.  The next time I find that I
> will do the same page over with each engine and see what happens.  If
> it's not the engine I don't know what it could be.
>
> Bud
>
> At 07:54 AM 7/6/2006, you wrote:
> >One version of Kurzweil 1000 had, for a while, a problem when using
> >ScanSoft as the recognition engine that sounds like it would produce
> >the kind of defects you are describing. When an end-of-line hyphen
> >was removed, a space was left in its place. This was fixed quite a
> >while ago in a patch.
> >
> >Stephen
> >
> >At 04:15 PM 7/2/2006, you wrote:
> >>I've seen problems with spaces inside of words before.  From what I
> >>was told, the submitter was using ScanSoft.  Unless someone knows
> >>of a book with this problem that was not scanned using ScanSoft,
> >>then it's possible that it's a bug in that OCR software, or a bug
> >>in an earlier version if you happen to have a recent version which
> >>doesn't have the problem.
> >>
> >>Whether it's related to a specific piece of OCR software or not,
> >>it's possible that it's due to the word being split at the end of a
> >>line in the printed book and the OCR software not recognizing the
> >>hyphen at the end of the first half of the word, so the word
> >>doesn't appear to be a word split by the end of a line.
> >>
> >>HTH
> >>
> >>Gerald
> >>
> >>----------
> >>From: bksvol-discuss-bounce@xxxxxxxxxxxxx
> >>[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Bud Schwab
> >>Sent: Sunday, July 02, 2006 10:07 AM
> >>To: bksvol-discuss@xxxxxxxxxxxxx
> >>Subject: [bksvol-discuss] Re: Scans with Spaces in the Middle of Words
> >>
> >>I have the same trouble.  Hope somebody comes up with a solution
> >>or at least an explanation.  I'll be watching.
> >>
> >>Bud
> >>
> >>At 05:11 AM 7/2/2006, you wrote:
> >>
> >>I'm validating a book now that has a random space in the middle of
> >>words, perhaps four or five times a page.  The spell checker will
> >>catch most of these, but if the space appears in such a place that
> >>the characters before and after the space both form words, I'll never know.
> >>
> >>I suspect there's no way around this but reading the book through,
> >>which I may not have time to do.
> >>
> >>Any idea what causes the OCR to do something like that?
> >>
> >>Just curious,
> >>
> >>Lora
> >>
> >>Bud Schwab
> >>W 6 Z Y P
> >>Malibu, California
> >
> >To unsubscribe from this list send a blank Email to
> >bksvol-discuss-request@xxxxxxxxxxxxx
> >put the word 'unsubscribe' by itself in the subject line.  To get a
> >list of available commands, put the word 'help' by itself in the subject line.
> >
>
>
> Bud Schwab
> W 6 Z Y P
> Malibu, California

