[bksvol-discuss] Re: Rejecting a great scan for too many page breaks

  • From: "Lori Castner" <loralee.castner@xxxxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Wed, 10 Jun 2009 17:31:59 -0700

Evan, that is interesting. I always get the extra lines between paragraphs when I convert to .rtf.


I have never used the launch feature, and I'll try it.

Lori

----- Original Message ----- From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 1:23 PM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


Lori, something else I recall which may or may not be relevant is that when I converted OpenBook scans from their native file formats into the rtf format directly in the Save As menu, I got blank lines between every paragraph, and multiple blank lines where there was one in the book. I found out that if I went to the Launch menu and opened MS Word, the book would be converted to rtf, or doc, I forget which, without all those blank lines.

I was just thinking that if OpenBook was inserting all those hard returns at the ends of paragraphs, it might be inserting them within paragraphs. I do not know that that was ever the case, but it's something worth looking at. (I know that Bookshare removes extra blank lines, but I save everything I scan and proofread, and I just didn't want all those extra blank lines cluttering up my files.) What is your experience with the process of converting from arc to rtf? I've also heard that scanning directly into rtf can do odd things to files, but I have never verified this personally when I was using OpenBook. Have you tried this? Is it possible that that is a cause of extra hard returns if people are doing that?
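For anyone who saves plain-text copies and wants to tidy the extra blank lines by script rather than by hand, here is a minimal sketch. It assumes the text has already been exported to a plain .txt file with ordinary newline characters. The function name collapse_blank_lines is made up for illustration; it is not an OpenBook or Bookshare feature.

```python
import re

def collapse_blank_lines(text: str) -> str:
    """Reduce any run of two or more blank lines to a single blank line.

    A "blank" line here may contain spaces or tabs (an assumption about
    what the converted file looks like).
    """
    # One newline followed by 2+ whitespace-only lines -> exactly one blank line.
    return re.sub(r"\n(?:[ \t]*\n){2,}", "\n\n", text)


if __name__ == "__main__":
    sample = "Chapter One\n\n\n\nIt was a dark night.\n\nThe end."
    print(collapse_blank_lines(sample))
```

A single blank line between paragraphs is left alone, so paragraph boundaries survive.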

Evan

----- Original Message ----- From: "Lori Castner" <loralee.castner@xxxxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 3:30 PM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


Just as a point of interest, Evan, I did use Jake's method once; basically it solved the problem, but it was not a perfect solution. And it was daunting and a bit cumbersome for me.

I'm sorry to know that his tips are no longer accessible/available because there was some good stuff there.

I feel more confident using Mayrie's method.

Not to muddy the waters, but there is one more thing OpenBook users should be aware of. Quotation marks in OpenBook show up as what may be called "smart quotes". So, after I convert a book to .rtf, I find one of the quotes, copy it, then paste it into the Find field of the Find and Replace box. In the Replace With field I type a regular quote (Shift+Apostrophe), then press Enter on Replace All. This method works for me to get the correct quotation marks into the document.

I hope I don't start a long thread on the difference between smart quotes and other quotes, or on whether I am using the right name for the OpenBook quotation mark, but the quotes do need to be changed.
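The same find-and-replace can be sketched in a few lines of Python for a file already saved as plain text. This assumes the marks really are the Unicode curly quotes (U+201C and U+201D), which is a guess about what OpenBook produces. The single curly quotes are included as an extra assumption, and the name straighten_quotes is hypothetical.

```python
# Map curly ("smart") quotes to their plain keyboard equivalents.
SMART_TO_PLAIN = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark (assumed also present)
    "\u2019": "'",  # right single quotation mark / curly apostrophe
}

def straighten_quotes(text: str) -> str:
    """Replace every curly quote in text with the plain version."""
    return text.translate(str.maketrans(SMART_TO_PLAIN))


if __name__ == "__main__":
    print(straighten_quotes("\u201cHello,\u201d she said. It\u2019s fine."))
```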

Cat Lover Lori

----- Original Message ----- From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 12:09 PM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


I need to amend the below slightly, because after writing it, I did recall one occurrence of this issue when I was using OpenBook. In the summer of 2006, I did have the hard-return-at-the-end-of-each-line issue occur in one book I scanned with OpenBook version 7.01, or 7.1, I forget where the 1 went, but it was just above version 7. I didn't notice it until I checked the book after it got into the collection. There was a tip on Jake's site on how to remove unwanted hard returns, but the volunteer tips are no longer available there. I just checked. I didn't understand the instructions anyhow, so I didn't know if I could apply them correctly or would just make matters worse. So I rescanned the book, doing something differently, and this time the hard returns were not there.

Unfortunately, I do not remember how the hard returns got in there in the first place, nor do I recall how I kept them from getting into the rescan of that book or the other books I scanned with OpenBook. However, from reading Lori's message, there are ways to at least reduce the problem, and to eliminate it entirely if you are willing to do the search and replace regimen she follows. Now that she mentioned it, I do seem to recall that turning column recognition on did help with this.

Evan

----- Original Message ----- From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 1:07 PM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


I used to use WordPerfect. I have to say here that I really liked that program. Anyway, the "hrt" you mention here is the hard return character, the ^p in MS Word, which is the paragraph symbol. I do not think you want to do a global search and replace to get rid of those, as your book would then become one very long paragraph. Somehow, you need to find out why your OCR software is putting those hard returns at the end of each line and get it to just put them at the ends of paragraphs in the book.

Is there a setting in OpenBook that has something to do with respecting line endings in the book? If so, then you need to turn it off. If there is no such setting, then I'm not sure what you can do. I used to use OpenBook 7 about two years ago and I didn't have this issue come up.

Evan

----- Original Message ----- From: "Robert Peters" <rpet@xxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 10:13 AM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


Jamie,
These are instances in books I have scanned and had rejected. According to WordPerfect 12.0, they are line breaks: [HRt] is the character it gives me, and there is one at the end of each line. I do not have a Braille display, and the rejection notes I have gotten on the four all said line breaks. What is the method in Microsoft Word to get JAWS to tell me a line break character is there, so I could try a global replace? I'm sorry to have so many questions; all my scanning in the past has just been for me, and I have used a BookPort for the last several years to read the books. BookPort doesn't care about the line breaks, but the Victor Stream (which I just got) appears to, as it makes the reading jerky.
Robert





Jamie Yates <mirxtech@xxxxxxxxx> 6/9/2009 7:24 PM >>>
Also, Robert, you might not be dealing with the ^l line break. You might be dealing with paragraph breaks at the end of each line, so that the text looks like it does in the printed book.

When I have seen that, what I have tried to do is determine if there are 2 or more paragraph breaks (^p) between the paragraphs. If so, I have changed all instances of ^p^p^p to qqq, or, if there aren't usually 3 together, just the ^p^p to qqq. Then I change every single ^p to a space. Then I change all the qqq back to ^p^p^p (or ^p^p, whichever was replaced), and I make sure to change all of the ^m to ^p^m^p, because that gives you the blank lines before and after hard page breaks.

This will ONLY work if paragraphs are separated by at least two paragraph breaks. If they aren't, you're better off rejecting the book than you are trying to change only the correct ^p to a space.
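The placeholder steps above can be sketched in Python for a plain-text copy of the book. This is an illustration under stated assumptions, not an exact copy of the Word procedure: it treats "\n" as the ^p paragraph break and the form feed "\f" as the ^m page break, it puts back exactly two breaks for any run of two or more, and the name unwrap_lines is made up.

```python
import re

PLACEHOLDER = "qqq"  # stand-in marker; must not already occur in the book

def unwrap_lines(text: str) -> str:
    """Join lines broken by single hard returns, keeping paragraph breaks.

    Only safe when paragraphs are separated by at least two newlines.
    """
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the text; pick another")
    # Step 1: protect real paragraph breaks (runs of 2+ newlines).
    text = re.sub(r"\n{2,}", PLACEHOLDER, text)
    # Step 2: every remaining newline is a mid-paragraph break; make it a space.
    text = text.replace("\n", " ")
    # Step 3: restore the paragraph breaks (two newlines, by assumption).
    text = text.replace(PLACEHOLDER, "\n\n")
    # Step 4: put a break before and after each page break (form feed).
    text = re.sub(r" *\f *", "\n\f\n", text)
    return text


if __name__ == "__main__":
    sample = "The quick\nbrown fox.\n\nSecond\nparagraph."
    print(unwrap_lines(sample))
```

The check at the top is there because a book that happens to contain the letters qqq would gain false paragraph breaks in step 3.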

--
Jamie in Michigan
Currently Reading: False Prophet by Faye Kellerman
www.michrxtech.com/books.html

To unsubscribe from this list send a blank Email to
bksvol-discuss-request@xxxxxxxxxxxxx
put the word 'unsubscribe' by itself in the subject line. To get a list of available commands, put the word 'help' by itself in the subject line.


