[bksvol-discuss] Re: Rejecting a great scan for too many page breaks

  • From: "Robert Peters" <rpet@xxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Wed, 10 Jun 2009 17:20:34 -0500

Folks,
        OpenBook does not have an "ignore end of line" option.  I am unfamiliar 
with the term "smart quotes", though I think that means a single quote instead 
of the more common double quotation mark.
        Have any of you heard of OmniPage Pro, and do you know how it works?
                        Robert


 
Attention:  The information contained in this message and/or attachments is 
intended only for the person or entity to which it is addressed and may contain 
confidential and/or nonpublic material.  Any review, retransmission, 
dissemination or other use of, or taking of any action in reliance upon, this 
information by persons or entities other than the intended recipient is 
prohibited.  If you received this in error, please contact the sender and 
delete the material from any system and destroy any copies.


>>> "EVAN REESE" <mentat3@xxxxxxxxxxx> 6/10/2009 3:23 PM >>>
Lori, something else I recall which may or may not be relevant is that when 
I converted OpenBook scans from their native file formats into the rtf 
format directly in the Save As menu, I got blank lines between every 
paragraph, and multiple blank lines where there was one in the book. I found 
out that if I went to the Launch menu and opened MS Word, the book would be 
converted to rtf, or doc, I forget which, without all those blank lines.

I was just thinking that if OpenBook was inserting all those hard returns at 
the ends of paragraphs, it might be inserting them within paragraphs. I do 
not know that that was ever the case, but it's something worth looking at. 
(I know that Bookshare removes extra blank lines, but I save everything I 
scan and proofread, and I just didn't want all those extra blank lines 
cluttering up my files.)
What is your experience with the process of converting from arc to rtf? I've 
also heard that scanning directly into rtf can do odd things to files, but I 
have never verified this personally when I was using OpenBook. Have you 
tried this? Is it possible that that is a cause of extra hard returns if 
people are doing that?
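
For anyone who wants to tidy the extra blank lines in a personal copy, here is a minimal sketch. It assumes the book has been exported to plain text (not rtf), and the function name collapse_blank_lines is made up for illustration; it is not part of OpenBook, Word, or Bookshare's tools.

```python
import re

def collapse_blank_lines(text, keep=1):
    """Reduce every run of blank lines to at most `keep` blank lines."""
    # Normalize Windows line endings so only "\n" has to be handled.
    text = text.replace("\r\n", "\n")
    # Strip trailing spaces and tabs so whitespace-only lines count as blank.
    text = re.sub(r"[ \t]+\n", "\n", text)
    # A run of keep + 2 or more newlines means more than `keep` blank lines;
    # shrink it to keep + 1 newlines, which leaves exactly `keep` blank lines.
    return re.sub(r"\n{%d,}" % (keep + 2), "\n" * (keep + 1), text)

sample = "Para one.\n\n\n\nPara two.\n \n\nPara three."
cleaned = collapse_blank_lines(sample)
```

Calling it with keep=0 removes blank lines altogether, which is only safe if paragraphs are marked some other way.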

Evan

----- Original Message ----- 
From: "Lori Castner" <loralee.castner@xxxxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Wednesday, June 10, 2009 3:30 PM
Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks


> Just as a point of interest, Evan, I did use Jake's method once; basically 
> it solved the problem, but it was not a perfect solution.  And it was 
> daunting and a bit cumbersome for me.
>
> I'm sorry to know that his tips are no longer accessible/available because 
> there was some good stuff there.
>
> I feel more confident using Mayrie's method.
>
> Not to muddy the waters, but there is one more thing OpenBook users should 
> be aware of.  Quotation marks in OpenBook show up as what may be called 
> "smart quotes".  So, after I convert a book to .rtf, I find one of the 
> quotes, copy it, then paste it into the Find field of the Find and Replace 
> box.  In the Replace With field I type a regular quote (shift+apostrophe), 
> then press Enter on Replace All.  This method works for me to get the 
> correct quotation marks into the document.
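
"Smart quotes" usually means the curly typographic quotation marks, which are separate characters from the straight quote typed with shift+apostrophe. The sketch below does the same find-and-replace on plain text. It assumes the marks in the file are the standard Unicode curly quotes, which nothing in this thread confirms for OpenBook output, and it also straightens curly single quotes, which goes beyond the double-quote replacement described above. The name straighten_quotes is hypothetical.

```python
# Map each curly quotation mark to its straight keyboard equivalent.
SMART_TO_PLAIN = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark, also used as an apostrophe
}

def straighten_quotes(text):
    """Replace curly quotes with straight ones, like a Replace All per mark."""
    return text.translate(str.maketrans(SMART_TO_PLAIN))

sample = "\u201cHello,\u201d she said. \u2018It\u2019s fine.\u2019"
plain = straighten_quotes(sample)
```

If a file uses some other character for its quotes, adding that character to the table is the equivalent of copying it into the Find field.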
>
> I hope I don't start a long thread on the difference between smart quotes 
> and other quotes or whether I am using the right name for the OpenBook 
> quotation mark, but the quotes do need to be changed.
>
> Cat Lover Lori
>
> ----- Original Message ----- 
> From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
> To: <bksvol-discuss@xxxxxxxxxxxxx>
> Sent: Wednesday, June 10, 2009 12:09 PM
> Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks
>
>
>> I need to amend the below slightly, because after writing it, I did recall 
>> one occurrence of this issue when I was using OpenBook. In the summer of 
>> 2006, I did have the hard-return-at-the-end-of-each-line issue occur in 
>> one book I scanned with OpenBook version 7.01, or 7.1, I forget where the 
>> 1 went, but it was just above version 7. I didn't notice it until I 
>> checked the book after it got into the collection. There was a tip on 
>> Jake's site on how to remove unwanted hard returns, but the volunteer tips 
>> are no longer available there. I just checked. I didn't understand the 
>> instructions anyhow, so I didn't know if I could apply them correctly or 
>> would just make matters worse. So I rescanned the book, doing something 
>> differently, and this time the hard returns were not there.
>>
>> Unfortunately, I do not remember how the hard returns got in there in the 
>> first place, nor do I recall how I kept them from getting into the rescan 
>> of that book or the other books I scanned with OpenBook. However, from 
>> reading Lori's message, there are ways to at least reduce the problem, 
>> and to eliminate it entirely if you are willing to do the search and 
>> replace regimen she follows. Now that she mentioned it, I do seem to 
>> recall that turning column recognition on did help with this.
>>
>> Evan
>>
>> ----- Original Message ----- 
>> From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
>> To: <bksvol-discuss@xxxxxxxxxxxxx>
>> Sent: Wednesday, June 10, 2009 1:07 PM
>> Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks
>>
>>
>>> I used to use WordPerfect. I have to say here that I really liked that 
>>> program. Anyway, the "hrt" you mention here is the hard return character, 
>>> the ^p in MS Word, which is the paragraph symbol. I do not think you want 
>>> to do a global search and replace to get rid of those, as your book would 
>>> then become one very long paragraph. Somehow, you need to find out why 
>>> your OCR software is putting those hard returns at the end of each line 
>>> and get it to just put them at the ends of paragraphs in the book.
>>>
>>> Is there a setting in OpenBook that has something to do with respecting 
>>> line endings in the book? If so, then you need to turn it off. If there 
>>> is no such setting, then I'm not sure what you can do. I used to use 
>>> OpenBook 7 about two years ago and I didn't have this issue come up.
>>>
>>> Evan
>>>
>>> ----- Original Message ----- 
>>> From: "Robert Peters" <rpet@xxxxxxx>
>>> To: <bksvol-discuss@xxxxxxxxxxxxx>
>>> Sent: Wednesday, June 10, 2009 10:13 AM
>>> Subject: [bksvol-discuss] Re: Rejecting a great scan for too many page breaks
>>>
>>>
>>>> Jamie,
>>>> These are instances in books I have scanned and had rejected. According 
>>>> to WordPerfect 12.0, they are line breaks: [HRt] is the code it gives 
>>>> me, and there is one at the end of each line.  I do not have a Braille 
>>>> display, and the rejection notes I have gotten on all four said line 
>>>> breaks.
>>>> What is the method in Microsoft Word to get JAWS to tell me a line break 
>>>> character is there so I could try a global replace?
>>>> I'm sorry to have so many questions; all my scanning in the past has 
>>>> just been for me, and I have used a BookPort for the last several years 
>>>> to read the scans. BookPort doesn't care about the line breaks, but the 
>>>> Victor Stream (which I just got) appears to, as it makes the reading 
>>>> jerky.
>>>> Robert
>>>>
>>>>
>>>>>>> Jamie Yates <mirxtech@xxxxxxxxx> 6/9/2009 7:24 PM >>>
>>>> Also, Robert, you might not be dealing with the ^l line break. You might
>>>> be dealing with paragraph breaks at the end of each line, so that the
>>>> text looks like it does in the printed book.
>>>>
>>>> When I have seen that, what I have tried to do is determine if there are
>>>> 2 or more paragraph breaks ^p between the paragraphs. If so, I have
>>>> changed all instances of ^p^p^p to a qqq, or if there aren't usually 3
>>>> together then just the ^p^p to a qqq. Then I change single ^p to a space.
>>>> Then I change all the qqq back to ^p^p^p and I make sure to change all of
>>>> the ^m to a ^p^m^p because changing those gives you the blank lines
>>>> before and after hard page breaks.
>>>>
>>>> This will ONLY work if paragraphs are separated by at least two paragraph
>>>> breaks. If they aren't, you're better off rejecting the book than you are
>>>> trying to change only the correct ^p to a space.
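
The placeholder steps above can be sketched in Python for a book saved as plain text. This is an illustration under stated assumptions, not Word's own find and replace: a paragraph mark (^p) is taken to be "\n", a hard page break (^m) is taken to be a form feed "\f", and protected breaks are restored as a single blank line rather than ^p^p^p. The function name unwrap_hard_returns is hypothetical.

```python
import re

def unwrap_hard_returns(text, placeholder="qqq"):
    """Join hard-wrapped lines while keeping blank-line paragraph breaks."""
    # Normalize Windows line endings so only "\n" has to be handled.
    text = text.replace("\r\n", "\n")
    # The placeholder must not collide with anything already in the book.
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")
    # Step 1: protect real paragraph breaks (two or more returns in a row).
    text = re.sub(r"\n{2,}", placeholder, text)
    # Step 2: every remaining return is an end-of-line return; use a space.
    text = text.replace("\n", " ")
    # Step 3: restore the protected breaks as one blank line.
    text = text.replace(placeholder, "\n\n")
    # Step 4: put exactly one return before and after each page break.
    text = re.sub(r"\s*\f\s*", "\n\f\n", text)
    return text

sample = ("The quick brown\nfox jumps over\nthe lazy dog."
          "\n\nA second\nparagraph here.")
fixed = unwrap_hard_returns(sample)
```

As with the manual method, this only helps when paragraphs are separated by at least two returns. If every line and every paragraph ends in a single return, step 2 would merge the whole book into one paragraph.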
>>>>
>>>> -- 
>>>> Jamie in Michigan
>>>> Currently Reading: False Prophet by Faye Kellerman
>>>> www.michrxtech.com/books.html 
>>>>
>>>> To unsubscribe from this list send a blank Email to
>>>> bksvol-discuss-request@xxxxxxxxxxxxx 
>>>> put the word 'unsubscribe' by itself in the subject line.  To get a 
>>>> list of available commands, put the word 'help' by itself in the 
>>>> subject line.
>>>>