[bksvol-discuss] Re: Rejecting a great scan for too many page breaks

  • From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Thu, 11 Jun 2009 16:58:11 -0400

That's good to hear. By the way, the soft returns are of no account in the rtf file. They have no effect on paragraphs in the Bookshare books.


Evan

----- Original Message ----- From: "Robert Peters" <rpet@xxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Thursday, June 11, 2009 11:35 AM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


Evan,
I think it is turned on; a book I had scanned and had rejected did not have any hyphenated words at the end of lines. I also noticed that where there might have been hyphenated words, a soft rather than a hard carriage return goes in the file.
RKP



Attention: The information contained in this message and/or attachments is intended only for the person or entity to which it is addressed and may contain confidential and/or nonpublic material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any system and destroy any copies.


"EVAN REESE" <mentat3@xxxxxxxxxxx> 6/10/2009 6:51 PM >>>
Replacing the hyphen ^p with nothing should join up the hyphenated word. But
OpenBook does have an end of line hyphen feature that I believe should
already be turned on. But if it isn't, turn it on and it may reduce--if not
eliminate--end of line hyphens.

Evan
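
For anyone who would rather do this outside Word, here is a minimal Python sketch of the same hyphen-plus-paragraph-mark replacement. It assumes the book has been saved as plain text, so that Word's ^p shows up as a newline character. The function name is made up for illustration, and limiting the match to lowercase letters on both sides of the break is my own added safeguard, not part of the advice above.

```python
import re


def join_hyphenated_line_ends(text: str) -> str:
    """Delete a hyphen that is immediately followed by a line ending.

    Only fires when a lowercase letter sits on each side of the break,
    which is an assumption meant to leave things like "A-\nB" alone.
    """
    return re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)


if __name__ == "__main__":
    sample = "The scan-\nner split this word.\nA new line."
    print(join_hyphenated_line_ends(sample))
```

Like the Word replacement, this cannot tell a word split by the scanner from a real compound that happened to break at its hyphen, so "well-" at the end of one line and "known" at the start of the next would come out as "wellknown". A spell check afterwards should catch most of those.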


----- Original Message ----- From: "Robert Peters" <rpet@xxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 7:16 PM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


Thank you for this.  I'll try it.
I suppose that for hyphenated words, one would replace the hyphen ^p with
nothing?
Robert





"Lori Castner" <loralee.castner@xxxxxxxxxxxxx> 6/10/2009 1:08 PM >>>
Robert and Evan,

I suspected that the line ending character was the paragraph mark (caret+p,
typed as shift+6 followed by p).

I scan with OpenBook 7.02 and sometimes have this character entered
throughout a book and sometimes it does not show at all. I find that the
problem occurs less often with recognize columns turned on. I turn exact
view (not exact wording) off and still have the problem. I have used
different recognition engines with no effect.

I now follow Mayrie's recommendation when I have converted the file to .rtf
and am editing in Word.

In the find box, type caret+p followed by the lower case letter a, and in
the replace box type a space followed by a. Be sure to tab to the more
button and press space, then tab to match case and check that. I then
shift+tab back to replace all and press enter.

I follow this sequence with every lower case letter in the alphabet. After
a few books, the sequences take about five minutes.
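
The twenty-six passes above can also be sketched as a single step in Python. This is only a rough equivalent, assuming the file has been saved as plain text so that ^p is a newline; the function name is invented for illustration. The lowercase-only pattern plays the role of the match case checkbox.

```python
import re


def join_lines_before_lowercase(text: str) -> str:
    """Replace a line ending with a space when the next line starts
    with a lowercase letter a-z (same effect as the 26 Word passes)."""
    return re.sub(r"\n(?=[a-z])", " ", text)


if __name__ == "__main__":
    sample = "The quick\nbrown fox.\nNext paragraph."
    print(join_lines_before_lowercase(sample))
```

The same limits apply as in Word: a line that happens to break just before a capitalized word or a digit stays broken, and a paragraph that really does begin with a lowercase letter gets joined to the one before it.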

Also, Robert, in OpenBook be sure to turn off despeckle, white on black,
and language analyst. These features are on by default; turning them off
will greatly improve your scans.

Cat Lover Lori

----- Original Message ----- From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 10:07 AM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


I used to use WordPerfect. I have to say here that I really liked that
program. Anyway, the "hrt" you mention here is the hard return character,
the ^p in MS Word, which is the paragraph symbol. I do not think you want
to do a global search and replace to get rid of those, as your book would
then become one very long paragraph. Somehow, you need to find out why
your OCR software is putting those hard returns at the end of each line
and get it to just put them at the ends of paragraphs in the book.

Is there a setting in OpenBook that has something to do with respecting
line endings in the book? If so, then you need to turn it off. If there is
no such setting, then I'm not sure what you can do. I used to use OpenBook
7 about two years ago and I didn't have this issue come up.

Evan

----- Original Message ----- From: "Robert Peters" <rpet@xxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 10:13 AM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


Jamie,
These are instances in books I have scanned and had rejected. According to
WordPerfect 12.0, they are line breaks: [HRt] is the character it gives
me, and there is one at the end of each line. I do not have a Braille
display, and the rejection notes I have gotten on the four books all said
line breaks.
What is the method in Microsoft Word to get JAWS to tell me a line break
character is there so I could try a global replace?
I'm sorry to have so many questions; all my scanning in the past has just
been for me, and I have used a BookPort for the last several years to read
the scans. BookPort doesn't care about the line breaks, but the Victor
Stream (which I just got) appears to, as it makes the reading jerky.
Robert





Jamie Yates <mirxtech@xxxxxxxxx> 6/9/2009 7:24 PM >>>
Also, Robert, you might not be dealing with the ^l line break. You might be
dealing with paragraph breaks at the end of each line, so that the text
looks like it does in the printed book.

When I have seen that, what I have tried to do is determine if there are 2
or more paragraph breaks ^p between the paragraphs. If so, I have changed
all instances of ^p^p^p to a qqq, or if there aren't usually 3 together
then just the ^p^p to a qqq. Then I change single ^p to a space. Then I
change all the qqq back to ^p^p^p and I make sure to change all of the ^m
to a ^p^m^p because changing those gives you the blank lines before and
after hard page breaks.

This will ONLY work if paragraphs are separated by at least two paragraph
breaks. If they aren't, you're better off rejecting the book than you are
trying to change only the correct ^p to a space.
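
The placeholder steps above can be sketched in Python roughly as follows. Several things here are assumptions rather than anything stated in the message: the text is taken to be plain text where ^p is a newline and the ^m hard page break is a form feed character, the function name is invented, and runs of two or more paragraph marks are all restored as exactly two, which is a simplification of the ^p^p^p handling described above.

```python
import re

PLACEHOLDER = "qqq"  # the marker from the message; must not occur in the book


def unwrap_lines(text: str) -> str:
    """Join per-line paragraph marks while keeping real paragraph breaks."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the text")
    # Step 1: protect real paragraph boundaries (two or more marks in a row).
    text = re.sub(r"\n{2,}", PLACEHOLDER, text)
    # Step 2: any mark still left is only a line ending, so make it a space.
    text = text.replace("\n", " ")
    # Step 3: put the paragraph boundaries back (assumed here to be two marks).
    text = text.replace(PLACEHOLDER, "\n\n")
    # Step 4: blank line before and after each hard page break (assumed "\f").
    return text.replace("\f", "\n\f\n")


if __name__ == "__main__":
    sample = "First line\nsecond line\n\nNew paragraph\nwraps here"
    print(unwrap_lines(sample))
```

The check at the top is there because the whole trick depends on the marker being unique; if "qqq" ever appeared in the book itself, step 3 would insert a paragraph break in the middle of it. As the message says, none of this helps when paragraphs are separated by only a single paragraph mark.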

--
Jamie in Michigan
Currently Reading: False Prophet by Faye Kellerman
www.michrxtech.com/books.html

To unsubscribe from this list send a blank Email to
bksvol-discuss-request@xxxxxxxxxxxxx
put the word 'unsubscribe' by itself in the subject line. To get a list
of available commands, put the word 'help' by itself in the subject
line.

