[bksvol-discuss] Re: Proofing a scanned book

  • From: Melissa Smith <mdsmith25@xxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Thu, 30 Jun 2011 09:54:19 -0500

Are there paragraph marks at the end of the short lines? If so, and the line is not the end of a paragraph, they need to be removed. I believe there is a section in the manual that covers this, but I couldn't find it, so below is the process I use. It may not catch every single one, but it does a pretty darn good job.


How to remove any extra carriage returns inadvertently inserted by OCR:

This involves using the find and replace command in a special way that uses what are called wildcards.

Before you begin, make sure that you do not lose the blank lines at the tops of pages. Go to the beginning of the book, or to the beginning of the text if you are sure the preliminary pages are in excellent condition, and do the following:

In the find box, enter ^p^p

In the replace box, enter two instances of a character that is not likely to appear in the book. A good choice is $$, that is, dollar dollar.

Then execute a "replace all."

Next, protect the page breaks and the paragraph mark that follows each one. In the find box, enter ^m^p (^m is a manual page break).

In the replace box, enter two instances of a new unique character that is not likely to appear in the book. A good choice is %%, that is, percent percent. Just make sure it is not the same character that you chose in the previous step.

Then execute a "replace all."

Now, temporarily replace all the remaining paragraph marks in the entire book.

In the find box, enter ^p

In the replace box, enter ~ , that is, a tilde

Then execute a "replace all."
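Since the placeholders only work if they do not already occur in the book, it can be worth checking first. Here is a minimal Python sketch of that check, assuming you have the book text available as a plain string outside of Word. The function name `unused_placeholders` is my own, made up for illustration.

```python
def unused_placeholders(text, candidates=("$$", "%%", "~")):
    """Return the candidate placeholders that do not already occur in text."""
    return [c for c in candidates if c not in text]


# Hypothetical sample: "$$" and "~" already appear, so only "%%" is safe.
sample = "It cost $$ back then ~ or so he said."
print(unused_placeholders(sample))  # prints ['%%']
```

If a placeholder you planned to use is missing from the returned list, pick a different one before you start replacing.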

Now you are going to look for paragraph marks that shouldn't be there. To do this, you use a special kind of search, using the "use wildcards" box in the find and replace dialogue. In the Find and Replace dialogue box, click on the button that is marked "More." This will expand the options that are available in the Find and Replace box to include a new list of Search Options. In the list of Search Options, check the box for "use wildcards" (you can also do this while in the Search box by typing Alt+U).

Next, in the find box, enter ~([a-z]) that is, tilde left-parenthesis left-square-bracket lowercase-a hyphen lowercase-z right-square-bracket right-parenthesis

In the replace box, enter a space followed by \1, that is, space backslash numeral one. Make sure the space is there, because it becomes the space between the two joined words.

Then execute a "replace all." This joins every line break that is followed by a lowercase letter, since a new paragraph would normally start with a capital.
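To see what this wildcard step does and does not catch, here is a small Python sketch. Word's wildcard syntax is not the same thing as Python's regular expressions, but this particular pattern happens to read the same way in both. The function name and the sample text are made up for illustration.

```python
import re


def join_lowercase_continuations(text):
    """Replace a tilde followed by a lowercase letter with a space plus that letter."""
    return re.sub(r"~([a-z])", r" \1", text)


# Tildes stand for the paragraph marks that were replaced in the previous step.
sample = 'the quick brown~fox jumped over~Tim\'s fence and~"ran" away.~Next paragraph.'
print(join_lowercase_continuations(sample))
```

Only the tilde before "fox" is joined. The breaks before "Tim's" and before the opening quotation mark are left alone, because the next character is not a lowercase letter. Those are the kind of stray breaks this process will miss, so they still need to be fixed by hand.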

Now you will undo the temporary replacements, starting with the remaining paragraph marks and the blank lines.

First, make sure you uncheck the use wildcards box in the find and replace dialogue. You don't want it to remain checked because it will affect other searches you will make later.

In the find box, enter ~ , that is, a tilde

In the replace box, enter ^p

Execute "replace all."

In the find box, enter $$ (or whatever other character you used to take the place of your blank lines)

In the replace box, enter ^p^p

Execute "replace all."

Now you will undo the search and replace you used to preserve page breaks and the blank space after them, and restore them.

In the find box, enter %% (or whatever other character you used to take the place of your page breaks)

In the replace box, enter ^m^p

Execute "replace all."
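For anyone who wants to see the whole process in one place, here is a rough Python sketch of the same sequence of replacements on plain text. It is not what Word does internally, just the same steps in order. It assumes "\n" stands in for ^p and "\f" (form feed) stands in for ^m, and that the placeholders do not already occur in the text. The function name is made up.

```python
import re

PARA = "\n"        # assumption: stands in for Word's ^p
PAGE_BREAK = "\f"  # assumption: stands in for Word's ^m


def remove_stray_breaks(text):
    """Join lines broken mid-sentence while keeping blank lines and page breaks."""
    text = text.replace(PARA + PARA, "$$")        # protect blank lines
    text = text.replace(PAGE_BREAK + PARA, "%%")  # protect page breaks
    text = text.replace(PARA, "~")                # remaining paragraph marks
    text = re.sub(r"~([a-z])", r" \1", text)      # join lowercase continuations
    text = text.replace("~", PARA)                # restore paragraph marks
    text = text.replace("$$", PARA + PARA)        # restore blank lines
    text = text.replace("%%", PAGE_BREAK + PARA)  # restore page breaks
    return text


sample = "The cat sat\non the mat.\nThen it left.\n\nNew page\f\nmore text"
print(repr(remove_stray_breaks(sample)))
# prints 'The cat sat on the mat.\nThen it left.\n\nNew page\x0c\nmore text'
```

In the sample, the break before "on the mat" is joined, while the blank line and the page break come back exactly as they were, even though the text after the page break starts with a lowercase letter.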


Melissa Smith

On 6/30/2011 6:50 AM, Tim Syfert wrote:
Hi everyone,

Could Scott or anyone tell me if text should go across the full width of the 8 1/2 page? Some lines do and others are short, just as in the book itself.

Thanks.

Tim