[bksvol-discuss] Re: Rejecting a great scan for too many page breaks

  • From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Wed, 10 Jun 2009 15:09:32 -0400

I need to amend the message below slightly, because after writing it I recalled one occurrence of this issue when I was using OpenBook. In the summer of 2006, one book I scanned with OpenBook came out with a hard return at the end of each line. The version was 7.01 or 7.1; I forget where the 1 went, but it was just above version 7. I didn't notice the problem until I checked the book after it got into the collection. There was a tip on Jake's site on how to remove unwanted hard returns, but the volunteer tips are no longer available there; I just checked. I didn't understand the instructions anyhow, so I didn't know whether I could apply them correctly or would just make matters worse. So I rescanned the book, doing something differently, and this time the hard returns were not there.


Unfortunately, I do not remember how the hard returns got in there in the first place, nor do I recall how I kept them from getting into the rescan of that book or the other books I scanned with OpenBook. However, from reading Lori's message, there are ways to at least reduce the problem, and to eliminate it entirely if you are willing to do the search and replace regimen she follows. Now that she mentioned it, I do seem to recall that turning column recognition on did help with this.

Evan

----- Original Message ----- From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 1:07 PM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


I used to use WordPerfect. I have to say here that I really liked that program. Anyway, the "hrt" you mention here is the hard return character, the ^p in MS Word, which is the paragraph symbol. I do not think you want to do a global search and replace to get rid of those, as your book would then become one very long paragraph. Somehow, you need to find out why your OCR software is putting those hard returns at the end of each line and get it to just put them at the ends of paragraphs in the book.

Is there a setting in OpenBook that has something to do with respecting line endings in the book? If so, then you need to turn it off. If there is no such setting, then I'm not sure what you can do. I used to use OpenBook 7 about two years ago and I didn't have this issue come up.

Evan

----- Original Message ----- From: "Robert Peters" <rpet@xxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 10:13 AM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


Jamie,
These are instances in books I have scanned and had rejected. According to WordPerfect 12.0, they are line breaks: [HRt] is the character it gives me, and there is one at the end of each line. I do not have a Braille display, and the rejection notes I have gotten on all four books said line breaks. What is the method in Microsoft Word to get JAWS to tell me a line break character is there, so I could try a global replace? I'm sorry to have so many questions; all my scanning in the past has just been for me, and I have used a BookPort for the last several years to read the results. BookPort doesn't care about the line breaks, but the Victor Stream (which I just got) appears to, as it makes the reading jerky.
Robert





Jamie Yates <mirxtech@xxxxxxxxx> 6/9/2009 7:24 PM >>>
Also, Robert, you might not be dealing with the ^l line break. You might be
dealing with paragraph breaks at the end of each line, so that the text
looks like it does in the printed book.

When I have seen that, what I have tried to do is determine if there are 2
or more paragraph breaks ^p between the paragraphs. If so, I have changed
all instances of ^p^p^p to a qqq, or if there aren't usually 3 together then
just the ^p^p to a qqq. Then I change each single ^p to a space. Then I
change all the qqq back to ^p^p^p (or ^p^p, whichever I replaced), and I
make sure to change all of the ^m to ^p^m^p, because that gives you the
blank lines before and after hard page breaks.

This will ONLY work if paragraphs are separated by at least two paragraph
breaks. If they aren't, you're better off rejecting the book than you are
trying to change only the correct ^p to a space.
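
For anyone working with a plain-text export rather than inside Word, the same
placeholder trick can be sketched in Python. This is only an illustration
under stated assumptions, not a tested Bookshare tool: it assumes each ^p
shows up in the text file as a newline and each ^m (hard page break) as a form
feed character, and the function names and the placeholder string are made up
for the example. It also restores paragraph breaks as two returns rather than
three.

```python
import re

# Hypothetical marker that plays the role of "qqq"; it must not occur in the book.
PLACEHOLDER = "\x00PARA\x00"


def has_paragraph_gaps(text: str) -> bool:
    """Return True if at least one run of two or more returns exists.

    Without such runs the method cannot tell paragraph ends from line ends.
    """
    return "\n\n" in text.replace("\r\n", "\n").replace("\r", "\n")


def unwrap_hard_returns(text: str) -> str:
    """Join the lines inside each paragraph; keep paragraph and page breaks."""
    # Normalise line endings so that "\n" stands in for Word's ^p (assumption).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Step 1: protect runs of two or more returns (the real paragraph breaks).
    text = re.sub(r"\n{2,}", PLACEHOLDER, text)
    # Step 2: any return still left is a line ending inside a paragraph.
    text = text.replace("\n", " ")
    # Step 3: put the paragraph breaks back (here as two returns).
    text = text.replace(PLACEHOLDER, "\n\n")
    # Step 4: surround each form feed (assumed to be ^m) with returns.
    text = re.sub(r"[ \n]*\f[ \n]*", "\n\f\n", text)
    return text


if __name__ == "__main__":
    sample = "First line\nsecond line.\n\nNew paragraph\ncontinues.\n\fNext page."
    if has_paragraph_gaps(sample):
        print(repr(unwrap_hard_returns(sample)))
```

As with the manual method, the check in has_paragraph_gaps matters: if it
returns False, running the replacement would turn the whole book into one long
paragraph.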

--
Jamie in Michigan
Currently Reading: False Prophet by Faye Kellerman
www.michrxtech.com/books.html

To unsubscribe from this list send a blank Email to
bksvol-discuss-request@xxxxxxxxxxxxx
put the word 'unsubscribe' by itself in the subject line. To get a list of available commands, put the word 'help' by itself in the subject line.


