[bksvol-discuss] Re: Finding Soft Hyphens

  • From: Mike <mlsestak@xxxxxxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Wed, 19 Mar 2008 18:43:15 -0700

Are you sure they are soft hyphens? They may be real hyphens. Soft hyphens should only show up at the ends of lines. Back in the 1960s and before, many publishers split a word with a hyphen when the whole word didn't fit at the end of a line, so that the right margin would look straighter. When these books are scanned, the split words can end up in the middle of a line, because the scanned lines are not necessarily the same width as the lines in the book. Have you tried searching for regular hyphens?


Misha

Lora wrote:
I'm validating a book that has numerous words broken up by soft hyphens. I'd like to fix these, especially because they're not words at the end of lines or pages, just random words in the middle of various lines. I tried selecting the soft hyphen and copying it into the Find box so I could locate other occurrences, but it won't copy and paste. Is there an easy way to find this soft hyphen?

This book poses another problem. There are lots of the common scanos in it: for instance, 1 for I, rime for time, etc. I've fixed the ones I found as I skimmed the book, but many of these won't register in the spell checker. I suppose it means I should read them through. The real tricky one is "me" for "the." I suspect, if the OCR did that, it did some other funky stuff for the letters th.

This is a long book. It's Buddhism in Action. Knowing I can be a slow validator, should I return it, or take the time to read through it? I'm interested in the book, and will gladly read it cover to cover, but at over 400 pages, this could take me a while. The scan is generally excellent, but there are lots of these little OCR quirks. Advice is welcome.

Thanks,
Lora
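
For anyone who would rather hunt these down with a script than with the Find box, here is a minimal Python sketch of the two searches discussed in this thread. It assumes the book has been saved as plain text and that the soft hyphens are stored as the Unicode character U+00AD; neither is confirmed in the messages above. The names find_hyphen_suspects and strip_soft_hyphens are made up for this example. The search for regular hyphens will also flag legitimate compounds such as "well-known", so the hits still need a human eye.

```python
import re

# Assumption: soft hyphens are stored as U+00AD. The character is often
# invisible on screen, which may be why it is hard to select and paste.
SOFT_HYPHEN = "\u00ad"

# Any run of non-space characters that contains a soft hyphen.
SOFT_HYPHEN_WORD = re.compile(r"\S*\u00ad\S*")

# A regular hyphen between lowercase letters, e.g. "vali-dator".
# This also matches real compounds, so every hit needs review.
MID_WORD_HYPHEN = re.compile(r"[a-z]+-[a-z]+")


def find_hyphen_suspects(text):
    """Return (line_number, kind, word) tuples for hyphenated words."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SOFT_HYPHEN_WORD.finditer(line):
            hits.append((lineno, "soft", match.group()))
        for match in MID_WORD_HYPHEN.finditer(line):
            hits.append((lineno, "hard", match.group()))
    return hits


def strip_soft_hyphens(text):
    """Remove every soft hyphen, rejoining the words they split."""
    return text.replace(SOFT_HYPHEN, "")


if __name__ == "__main__":
    sample = "The medi\u00adtation hall was well-known.\nNo issues here."
    for lineno, kind, word in find_hyphen_suspects(sample):
        # repr() makes the otherwise invisible soft hyphen show up as \xad
        print(lineno, kind, repr(word))
```

Soft hyphens can usually be stripped in one pass, since they carry no meaning once the line breaks have moved. Regular hyphens are better fixed one at a time from the list of hits, because some of them belong in the text.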

