[bksvol-discuss] Re: proofreading questions

  • From: "Judy s." <cherryjam@xxxxxxxxxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Thu, 26 Jul 2012 20:03:51 -0500

Hi John,

I can answer the bit about books.google.com. It's unlikely that you would want to change the page breaks in the scanned .rtf to match the Google copy. Your edition is probably a different edition from the one on Google Books. I run into this all the time. Books can have 20 or 30 different editions, or even more, and the pagination will be different in each one.

The missing dash problem could have come from a whole bunch of causes. I don't have an easy fix for you as a proofreader, unless there is a specific, consistent pattern you can use to do a 'search and replace' for the missing em dashes. One idea, from the pattern you are seeing, is to search for two spaces in a row and replace each one that appears to be a missing em dash with two hyphens. Someone else here may have a better idea, though.
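If the book's text can be saved as plain text, the two-space search could be sketched in Python roughly as follows. This is only an illustration: the function names, the " -- " replacement, and the rule of skipping double spaces that follow a period, question mark, exclamation point, or colon are all assumptions, not anything the scanning or OCR tools provide. Each hit would still need a human look before replacing.

```python
import re

# Two or more spaces between non-space characters. Runs that follow
# sentence-ending punctuation are skipped, on the assumption that those
# are ordinary sentence spacing rather than a lost dash.
DOUBLE_SPACE = re.compile(r"(?<=[^\s.!?:]) {2,}(?=\S)")


def find_candidates(text, context=30):
    """Return (offset, snippet) pairs for each suspect run, for manual review."""
    results = []
    for match in DOUBLE_SPACE.finditer(text):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        results.append((match.start(), text[start:end]))
    return results


def restore_dashes(text, replacement=" -- "):
    """Replace every suspect run with the chosen dash marker."""
    return DOUBLE_SPACE.sub(replacement, text)
```

Running find_candidates first and reading the snippets is the safer route; restore_dashes changes every match at once, so it only makes sense if the pattern proves consistent.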

I'm also a sighted volunteer with limitations that don't allow me to manipulate a print book. There are a few of us volunteers here in that boat. I use both Google Books and Amazon peek when available, and when I can't get an answer I need there or from the person who scanned the book, I ask on the list if someone can find the book, scan the page, and send it to me.

Hope that helps,

Judy s.

On 7/26/2012 7:06 PM, John Simpson wrote:

I have several questions about the book that I am currently proofing. First off, words that are followed by an "'s" have the apostrophe over the penultimate letter (e.g. Martin̓s). While this is not a showstopper, it does require a fair number of corrections. I guess my question is what causes this kind of construction? Is it a function of the scan volunteer, the scanner hardware, or the OCR software?
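For the misplaced apostrophe, a bulk correction might look something like the Python sketch below. It assumes the stray mark is the Unicode combining comma above (U+0313) sitting directly before a word-final "s", and that the text is available as plain text; both are guesses from the example, and the function name is made up for illustration.

```python
import re

# Assumed code point of the mark that lands over the preceding letter.
COMBINING_COMMA_ABOVE = "\u0313"

# A letter, the combining mark, then an "s" that ends the word.
MISPLACED = re.compile(r"(\w)" + COMBINING_COMMA_ABOVE + r"(s\b)")


def fix_possessives(text):
    """Replace the combining mark with a plain apostrophe before a final s."""
    return MISPLACED.sub(r"\1'\2", text)
```

If the mark turns out to be a different character, only the COMBINING_COMMA_ABOVE constant would need to change.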

Secondly, I have gone to books.google.com to take a look at this book. My question here is whether Google has a fair representation of the book. I know that all but one page are present, but within the first several chapters, the page breaks in the scanned version .rtf are not in the same place as they are in Google's copy. I certainly don't want to have to go through the entire book changing pagination based on Google. I do have a hold at my local library for the print copy that will help answer this question. Any other advice would be greatly appreciated.

The third question is that in the scanned version that I have from BookShare there are frequent instances of two spaces, rather than one. The sense of the book is that there should be a comma where the first space is. However, when looking at the Google version, this separator is an em dash surrounded by spaces. All of these dashes have been removed. Again, my question is whether this is a function of the scan volunteer, the scanner hardware, or the OCR software. Again, I do not wish to go through the entire print book looking for dashes that I need to replace, or even to do a find on two spaces and see if the meaning indicates a dash.

I am a sighted volunteer with physical limitations that do not allow me to manipulate a print book. While I don't mind getting occasional assistance to go to a specific page to verify my proofreading, I'm not able to scan a print book and compare my scan to the BookShare .rtf version. If the Google representation is accurate relative to the print book, I will be happy to use that as a resource wherever possible.

Thanks for any and all suggestions.

John Simpson


