[bksvol-discuss] Re: Quandries of an inexperienced proofreader

  • From: Debby Franson <the.bee@xxxxxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Sun, 26 May 2013 14:39:29 -0500

Hi John!

Evan's suggestions were quite helpful.

His comment about italicized words reminded me of two things, one having to do with italics, and the other not.

Sometimes the OCR will recognize the italicized pronoun "I" as a / (that's a slash). It's obvious that the slash should be "I".

The thing I thought of that doesn't have to do with italics is that very occasionally, probably due to the font used in the hard copy, the OCR will take out the space between the pronoun "I" and the next word, so you could have "Iwould", "Icould", "Iam", "Ihave" and "Idid", for example. They are obvious scannos, too. Scannos that are easily deciphered, such as my examples in this post, I call "light scannos". The ones that I cannot figure out, such as your question about what was really supposed to be "hmmmm", I call "heavy scannos". Just my terms (smile).
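For anyone who wants to hunt for these two kinds of light scannos automatically, here is a minimal Python sketch. It is only an illustration, not a feature of K1000 or any other OCR package: the function name find_light_scannos and the short FOLLOWERS word list are made up for this example, and the list would need extending for a real book. It flags a pronoun "I" fused to the next word and a slash standing alone between spaces.

```python
import re

# Hypothetical, deliberately short list of words that often follow "I".
# Extend it to suit the book being proofread.
FOLLOWERS = ("would", "could", "am", "have", "did")

# "I" fused to one of the follower words, e.g. "Iwould".
FUSED_I = re.compile(r"\bI(" + "|".join(FOLLOWERS) + r")\b")

# A slash with whitespace on both sides, a likely misread italic "I".
LONE_SLASH = re.compile(r"(?<=\s)/(?=\s)")


def find_light_scannos(text):
    """Return sorted (position, found, suggestion) tuples for likely scannos."""
    hits = []
    for m in FUSED_I.finditer(text):
        hits.append((m.start(), m.group(0), "I " + m.group(1)))
    for m in LONE_SLASH.finditer(text):
        hits.append((m.start(), "/", "I"))
    return sorted(hits)


if __name__ == "__main__":
    sample = "Iwould go, but / cannot. Idid try."
    for position, found, suggestion in find_light_scannos(sample):
        print(position, repr(found), "->", repr(suggestion))
```

The sketch only reports candidates and does not change the text, because a lone slash can be legitimate (in a fraction written with spaces, for instance) and each hit should still be checked against the printed book.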

Debby

At 01:37 PM 5/26/2013, Evan Reese wrote
Hi John,

The braille translator will put a dot 4 in front of the e in café to indicate that it's an e acute. You should leave those as they are in the book.

As for the compound words, there's quite a bit of variability in how those are written. My K1000 Ranked Spelling flags those quite a lot, at least until I add them to its vocabulary. Unfortunately, they should be left as written, even if they are not pronounced properly by some screen readers. My experience is that OCR usually doesn't remove spaces between words where they exist. The only exception to this is sometimes in italicized text, where letters slant to the right, and the space between words can be a bit smaller than usual. That will sometimes fool the OCR and cause the removal of a space between words, but with normal text, I find this doesn't happen.

Don't know what OCR you're using there, but if it is set to remove end of line hyphens, it should work pretty well at doing that. However, with the FineReader OCR engine in my K1000, which is what I use most of the time, if the first word of a compound word is at the end of a line, then my OCR will generally leave the hyphen in. Most of those won't be at the end of a line though, so if you see very many of these without hyphens, chances are that that is how they were printed. Those should be left as written as well.
HTH


Evan

----- Original Message -----
From: john.falter <john.falter@xxxxxxxxxxx>
To: bksvol-discuss@xxxxxxxxxxxxx
Sent: Sunday, May 26, 2013 2:12 PM
Subject: [bksvol-discuss] Quandries of an inexperienced proofreader

the word café contains the é.
How is this handled in Braille?
Are we supposed to convert é to plain e?
I find words like love seat written as loveseat (alternate spelling).
I can't tell if the space is left out by the author/editor or the scanner/ocr.
When those two words lack the space, the pronunciation is badly affected.
I find many words that I think should be hyphenated but aren't.
Again I can't tell if the scanner/ocr has done this, and these words also aren't pronounced very well.
What do you experienced people think?

--
The tongue of the wise uses knowledge rightly, But the mouth of fools pours forth foolishness.
Proverbs 15:2 NKJV

“Teach me, and I will hold my tongue; Cause me to understand wherein I have erred.”
Job 6:24 NKJV

