[bksvol-discuss] Re: Fw: clearing out line breaks

  • From: talmage@xxxxxxxxxx
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Mon, 14 Mar 2005 13:53:25 -0500

Hi Gerald,

Regarding that com for corn entry: as Sarah pointed out, it really shouldn't make a difference, but since I removed it, I have myself convinced that I don't get such errors anywhere near as often. I've gone back and checked scans made before and after removing it from the exception dictionary, and the frequency of occurrence seems to have gone from many to nearly none. Of course, the scans were all of separate titles, though some were in the same series by the same author and publisher. I realize this is not empirical evidence of consistent behavior, but it makes me suspect that there is a bug in OpenBook. Perhaps someday I'll be motivated to scan the same book twice, with the only difference in settings being the inclusion or exclusion of the com - corn entry in the exception dictionary, but I'm not motivated enough at this point, and they aren't paying me the big bucks to test their software.
The tom for torn entry is certainly a brain-dead entry, however, when you consider that they didn't check the box for case sensitive. In my experience, I come across a lot more books where a character is named Tom than books where torn is erroneously identified as tom.
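
To illustrate why the case sensitive box matters, here is a minimal Python sketch of a whole-word correction dictionary. It is only an illustration, not how OpenBook actually works; the function name apply_corrections and the sample sentence are made up for the example.

```python
import re

def apply_corrections(text, corrections, case_sensitive=True):
    """Replace whole words listed in `corrections` (wrong -> right).

    Hypothetical helper: this is NOT OpenBook's implementation, just a
    sketch of what a correction dictionary entry might do.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    for wrong, right in corrections.items():
        # \b limits the match to whole words.
        pattern = r"\b" + re.escape(wrong) + r"\b"
        text = re.sub(pattern, right, text, flags=flags)
    return text

sample = "Tom saw the tom page."

# Case-insensitive entry: the character's name is damaged too.
print(apply_corrections(sample, {"tom": "torn"}, case_sensitive=False))

# Case-sensitive entry: only the lowercase misread is changed.
print(apply_corrections(sample, {"tom": "torn"}, case_sensitive=True))
```

With the case-insensitive entry, the name Tom is rewritten along with the genuine misread, which is the kind of damage described above.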


Dave

At 01:09 PM 3/14/2005, you wrote:
Dave,

I've already got a couple of yours saved.  I've been meaning to tell you
that OpenBook's been working better since I turned off its speech and
switched to using JAWS with it.  I saved the one about switching to Natural
Voices in OpenBook, too.

I didn't bother to save the message, but I've already gone in and deleted
COM to CORN in the OCR correction dictionary, too.  Now I know why I had so
much of a problem with that in the Honorverse short stories and Generation
Warriors.  I need to take some time and go through the dictionary to see if
there are other entries that are giving me problems.  I suspect that there's a
NEEDLER for NEEDIER in there since I kept having to change NEEDIER back to
NEEDLER in the three books I submitted this weekend.

Gerald


-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx]On Behalf Of
talmage@xxxxxxxxxx
Sent: Monday, March 14, 2005 11:45 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: Fw: clearing out line breaks


Hi Gerald,

Thanks for the info.
I have to admit, my recommendations, and cautions, for doing global
replaces are generic in nature.  While I use MS Word, I do so only
reluctantly.  I haven't really found a word processor that I could say I
really like since DOS days and Word Perfect 5.1P.  So ok, I'm a dinosaur.
I will however, move your message to my saved mail mailbox in case I want
to refer to it in the future.

Dave

At 12:07 PM 3/14/2005, you wrote:
>OK, I tried to stay out of this because I knew if I got started, I'd have
>trouble stopping <smile>.
>
>There aren't always Manual Line Breaks at the end of lines.  Sometimes it's
>Paragraph Markers.  That's why you need to look to see what you're dealing
>with like Kelly said. You can do this with Ctrl-Shift-8 in Word.
>
>Paragraph Markers (^p) are what you get when you press Enter in Word.  You
>get Manual Line Breaks (^l) by pressing Shift-Enter in Word.
>
>Some other handy symbols to know are ^m for Manual Page Breaks (Ctrl-Enter),
>^t for tab, and ^w for whitespace.  For those of you who aren't familiar
>with the term whitespace, it's the invisible characters between words like
>spaces and tabs.
>
>If you decide to remove line feeds, then you may want to remove the
>whitespace at the beginning and end of lines before removing ^p's or ^l's
>from the end of the lines to prevent you from getting multiple spaces
>between words.  Also, since you can't be guaranteed that there are spaces at
>the end of lines without checking every individual line, then it's better to
>remove the whitespace at the beginning and end of the line, then replace
>the ^l or ^p with a Space instead of nothing.  This will keep you from
>concatenating two words by accident and getting multiple spaces between
>words.  To remove whitespace only at the beginning or end of a line when
>Paragraph Markers are at the end of lines, replace ^p^w with ^p or ^w^p with
>^p.  You could also replace ^w^p^w with ^p if you want to do both at the
>same time.
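
For anyone working outside Word, the same two-step idea can be sketched in Python. This is only a rough equivalent under assumptions: a plain-text file where "\n" plays the role of ^p or ^l, with spaces and tabs standing in for ^w. The function names are invented for the example.

```python
import re

def trim_line_edges(text):
    # Roughly like replacing ^w^p and ^p^w with ^p: drop the spaces and
    # tabs that sit directly before or after each line break.
    return re.sub(r"[ \t]*\n[ \t]*", "\n", text)

def join_lines(text):
    # Replace each remaining break with one space, not with nothing,
    # so the last word of one line never fuses with the first of the next.
    return trim_line_edges(text).replace("\n", " ")

scanned = "The quick brown \n  fox jumps\nover the lazy dog."
print(join_lines(scanned))
```

Trimming first and then inserting exactly one space avoids both problems described above: doubled spaces and accidentally joined words.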
>
>If you do remove the whitespace at the beginning or end of lines, then you
>will not have whitespace on blank lines and can remove multiple blank lines
>by doing something like replacing ^p^p^p with ^p^p.  This doesn't work for
>blank lines at the top of a page.  For that you'll need to do something like
>replacing ^m^p^p with ^m^p.  Any time you set out to remove multiple blank
>lines, you will want to continue doing the search and replace until Word
>tells you that no changes were made.
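
The need to repeat the replacement until nothing changes can be seen in a small Python sketch. It assumes plain text where "\n" stands in for ^p; the function name and the pass counter are just for illustration.

```python
def collapse_blank_lines(text):
    """Repeat the ^p^p^p -> ^p^p style replacement until nothing changes.

    Returns the cleaned text and the number of passes that were needed.
    """
    passes = 0
    while "\n\n\n" in text:
        text = text.replace("\n\n\n", "\n\n")
        passes += 1
    return text, passes

# Six consecutive breaks (five blank lines) between two lines of text.
cleaned, passes = collapse_blank_lines("a" + "\n" * 6 + "b")
print(repr(cleaned), passes)
```

A single pass only shortens each run of breaks; here it takes three passes before one blank line is left, which is why you keep going until no changes are reported.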
>
>The reason I used the Paragraph Markers in the example above is that if
>you're dealing with Paragraph Markers at the end of lines, and you remove
>multiple blank lines, then you can replace two consecutive Paragraph Markers
>with Manual Line Breaks (^p^p with ^l^l), then remove the Paragraph Markers
>at the end of the lines, then change the Manual Line Breaks back to
>Paragraph Markers.  Of course, this would only work if your document happens
>to have blank lines between paragraphs.
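
The placeholder trick described above can be sketched in Python as well. The sketch assumes plain text where "\n" stands in for ^p, that line edges are already trimmed, and that multiple blank lines have already been collapsed to one. The placeholder character and function name are arbitrary choices for the example.

```python
# Stand-in for the temporary Manual Line Break marker; any character
# that never appears in the text would do.
PLACEHOLDER = "\x00"

def unwrap_paragraphs(text):
    # Step 1: protect the real paragraph breaks (one blank line each).
    text = text.replace("\n\n", PLACEHOLDER)
    # Step 2: remove the remaining end-of-line breaks, leaving a space.
    text = text.replace("\n", " ")
    # Step 3: turn the placeholders back into paragraph breaks.
    return text.replace(PLACEHOLDER, "\n\n")

page = "First line\nof paragraph one.\n\nSecond paragraph\ncontinues here."
print(unwrap_paragraphs(page))
```

As in Word, this only works when blank lines separate the paragraphs; without them there is nothing to protect in step 1 and everything runs together.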
>
>In case you didn't realize it, you can also add a blank line to the top and
>bottom of a page by replacing ^m with ^p^m^p.  This may leave you with
>multiple blank lines at the top or bottom of a page, but
>you should already be able to deal with that from an earlier example.  If
>you do decide to add or delete blank lines at the top or bottom of pages,
>then I'd suggest doing this before you strip headers so you can check every
>page as you go.
>
>If you do decide to make changes using a global search and replace, then
>take precautions like Kelly said and make a backup copy of the file before
>you start experimenting with it.  Also, like Sarah said, read through it,
>(or at the very least skim through it and spot check a large part of it)
>after making global search and replaces.  Also, spell check it again after
>doing the global search and replaces, like Tony said.  And, not to leave out
>Dave, you can get yourself into trouble quickly if you're not careful.  As
>he said, there's no more accurate way than doing it manually.  The only
>problem with that is that it's tedious.  OK, did I forget to acknowledge
>anyone? <smile>
>
>Also, you don't have to hit Alt-E E to get to the Search and Replace dialog
>box in Word.  Ctrl-H will also take you there.
>
>See, I told you I'd have a problem stopping once I got started <smile>.
>
>OK,  Sue, I think you wanted to live dangerously and cause a lot of trouble
>this week.  This should help you get started. <grin>
>
>Gerald
>-----Original Message-----
>From: bksvol-discuss-bounce@xxxxxxxxxxxxx
>[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx]On Behalf Of Tony Baechler
>Sent: Monday, March 14, 2005 2:13 AM
>To: bksvol-discuss@xxxxxxxxxxxxx
>Subject: [bksvol-discuss] Re: Fw: clearing out line breaks
>
>
>Hi list.  I'll just add that to do a similar thing with Word, or at least
>Word 2000, search for ^l to find the line breaks.  One really easy way to
>fix split words is to go into the find and replace dialogue with Alt, E,
>E.  Search for the following in the find edit box:
>
>-^l
>
>Replace with nothing.  Instantly your split words are gone.  This also
>takes out the line break, so you might have lines with only one or two
>words on them.  I don't have a good way to solve that.  Also make sure you
>do a spell check because some compound words that should be hyphenated will
>need to have the dash put back in, like "twentyone," which should be
>"twenty-one."

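The split-word fix can also be sketched in Python, along with a way to list the words it would create so they can be checked first. This assumes plain text where "\n" stands in for ^l; both function names are made up for the example.

```python
import re

def find_joined_words(text):
    # List the words the replacement would create, so compound words
    # that really need their hyphen can be spotted before changing anything.
    return [a + b for a, b in re.findall(r"(\w+)-\n(\w+)", text)]

def rejoin_split_words(text):
    # Roughly like searching for -^l and replacing with nothing.
    return text.replace("-\n", "")

scan = "It was a remark-\nable day. He was twenty-\none."
print(find_joined_words(scan))
print(rejoin_split_words(scan))
```

The second word in the list shows the problem mentioned above: a genuine compound loses its hyphen, so a spell check afterwards is still needed.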
