Wow, Mayrie, I'm impressed. I only have a little bit to add to that.

If you have any questions or need any pages rescanned, don't be afraid to e-mail the submitter. If the submitter didn't leave an e-mail address in the comments, I believe you can e-mail Pavi and ask her to forward a message for you. I do mostly scanning now but, when I validated, my favorite submitters were those who responded to my questions about their books. Most of the time, they keep the books until they're in the collection, so any questions can be quickly answered.
Something else I do, and this is a smaller thing, is to remove the spaces between periods. I do this when there is a period, space, period, space, period. I know JAWS reads this strangely and it's confusing so I just replace that string with three periods in a row.
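For anyone who would rather script this than do it by hand, here is a minimal Python sketch of the same replacement. It assumes the book text is available as a plain string; the function name is made up for illustration and is not part of any Bookshare or Kurzweil tool.

```python
def collapse_spaced_periods(text: str) -> str:
    """Replace 'period, space, period, space, period' with three periods in a row."""
    # A plain string replacement mirrors the find and replace described above.
    return text.replace(". . .", "...")


if __name__ == "__main__":
    print(collapse_spaced_periods("He paused. . . then left."))
    # prints: He paused... then left.
```

Ordinary sentence breaks such as "End. Next" are left alone, because only the exact five-character string is replaced.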
I will also reiterate how important it is not only to read the book but also to fix any mistakes that appear. Yes, it takes more time and effort, but it also means that Bookshare gets a better quality book in the collection.
Synopses are also important. Someone thought the book was good enough to scan and someone else was interested enough to validate it. Make the rest of us want to read it too by adding a synopsis if there isn't one. Even if it's only a couple of sentences, tell us what the book is about.
Christina

----- Original Message -----
From: "Mayrie ReNae" <mayrierenae@xxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Monday, November 10, 2008 12:40 PM
Subject: [bksvol-discuss] Re: How to be a black belt validater?
Hi Cait!

I posted this some time during the summer. I think it deals with everything that you asked about. It's long, so get ready!

Before I paste my document below, I should say that Bookshare only has three requirements for books. At least 90% of page breaks must be present and properly located in the book, and the title page and copyright information must be present and legible in the book. All the rest is extra, but makes a book as perfect as I know how to make it.

For all of you who saw my list of prevalidation instructions before, just hit the delete button now. See below.

Mayrie

COMPLETE PREVALIDATION PROCESS

Okay, I did everything that I do to a book to prepare it for submission to Bookshare today, short of reading it, and documented the time it took. The total, minus actually reading the book, was four and a half hours. One hour of that was spent in recognition, which I did while eating and doing laundry. Probably not necessary to count that, laugh. But I included it anyway. So, here you go.

The book had 292 pages including back cover, jacket flaps, preliminary pages, and, of course, the text of the book. I'll tell you in general what I did and how long it took, then elaborate on the particular process. As has been said before, not everyone's process is the same, and there are probably at least three ways to achieve any given result. This is what I did with this particular book, and my process might vary slightly from book to book.

I was using Kurzweil 1000, and one of the find and replace parts can easily be done in Microsoft Word. In Kurzweil the paragraph mark is represented by \n. In Word, that character is represented in the find and replace dialogue by ^p. That might help folks validating using Word instead of Kurzweil.

1. Scan: 90 minutes.

I am using an OpticBook 3600 scanner in single-page mode. Scanner settings are as follows: scan to images, automatic page orientation, gray-scale data, resolution at 300 DPI.
Recognition settings were: column identification disabled, one page recognized per scan, speckle removal disabled, text quality normal, partial columns kept, suspicious regions kept, blank pages kept, recognition engine FineReader 8.0, English will be recognized.

Reading settings: line endings will be ignored by the editor and tables will not be identified. I do not identify tables in straight fiction because junk sometimes scans as a table and is more of a pain to remove that way, more time consuming. I have to know when I'll need table recognition so I can enable it.

While scanning to images, I am always reading another book that I have run through this process, to catch errors that ranked spelling didn't.

2. Recognize images: 1 hour.

I do this when off eating, or doing laundry, or sleeping, something that doesn't require my computer to be doing anything else. This time may vary a lot depending upon how hardy your computer is, or how lame mine is.

3. Save the file under the name of the book. No time taken.

4. Clean up preliminary pages and confirm accurate page count: 15 minutes.

Label: [From The Back Cover], [From The Front Flap], [From The Back Flap], and [This Page is blank.] if any blank pages exist. Read through all preliminary pages and correct all scannos. Determine where the publisher thought page one should go and set an operator-defined page number there as page 1. Check that the last page in the book is numbered properly, which tells you that you do not have any missing or duplicated pages. If the numbers don't match, either rescan and insert pages that you missed, or delete duplicated pages.

5. Remove headers, protect chapter headings, number and label any blank pages, get rid of end-of-line hyphens, and ensure that blank lines at the tops of pages will be preserved: 30 minutes.

Protect all chapter headings by placing the page number followed by a blank line above the chapter heading.

Remove all headers.
Do this only after protecting chapter headings, as very often the absence of a running header is the only indication of where a poorly scanned chapter heading should go.

Page down through the document, numbering and labeling all blank pages and looking at the first word on each page to be sure that it is a complete word. Reconnect any word hyphenated across a page break so that it is whole on one page. On each page beginning with a lower case letter, insert a space before that initial lower case letter. This will help later.

6. Insert page numbers at the tops of all pages: 30 minutes.

Delete all page numbers at the bottom of pages. These don't always scan at all, so they can't be counted upon to be there in the page numbering for DAISY navigation, and especially in the HTML of the Bookshare final copy in the collection. Insert page numbers at the tops of all pages not already numbered above chapter headings, followed by two carriage returns.

Remove all extra blank lines by using the find and replace dialogue as follows:

In the "find" box insert \n\n\n\n\n\n (\n is the character string that will search for a carriage return.)
In the replace box type \n\n

Do this with the replace box remaining the same, but with five, then four, then three carriage return symbols each successive time in the "find" box. This will get rid of all instances of more than one blank line between any blocks of text, or between page numbers and chapter headings or text on a page.

7. Remove any extra carriage returns inadvertently inserted by the OCR: 5 minutes.

This involves using the find and replace command 27 times.

In the find box type " " (that is, quotation mark followed by space followed by quotation mark).
In the replace box type "\n" (that is, quotation mark followed by \n followed by quotation mark).

This will separate any paragraphs between speakers that might not have been separated by the OCR program. This does happen regularly.

Now you are going to look for paragraph marks that shouldn't be there.
You will do this with each letter of the alphabet in lower case.

In the find box type \na (that is, backslash followed immediately by the lower case letters n and a).
In the replace box type a space followed immediately by the lower case letter a.
Replace all.

Inserting a space at the tops of pages before each occurring lower case letter allows your carefully inserted blank lines between page numbers and text on the page to be preserved now.

8. Run ranked spelling: 20 minutes with this book.

I started out with a 99.28% accuracy rating. Correct all scannos as ranked spelling or the spell checker finds them.

9. At this point I read the book and correct any errors that the spell checker or ranked spelling didn't find. Hopefully I catch them all.

10. Convert to RTF and close the file. No time taken.

11. In Microsoft Word, protect page numbers and page breaks, standardize fonts and margins, and convert em dashes to double hyphens: 5 minutes. (This is a generous estimate of how much time it takes.)

Open the file in Microsoft Word. Standardize the font and justify the margins. If validating someone else's submission, make sure that there are no smart quotes in the document and that all quotation marks are standard quotes. OpenBook tends to produce inaccurate quotation marks in my experience.

Protect page numbers and page breaks by using the find and replace dialogue as follows:

In the find box type: ^m
In the replace box type: ^p^m^p
Replace all.

Convert em dashes to double hyphens by using the find and replace dialogue as follows:

In the find box type: ^+
In the replace box type: -- (That is two hyphens or two dashes, depending upon what you call that key to the right of the zero on the number row.)

Save the file.

NOW YOU'RE DONE!
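For anyone who prefers a script to the find and replace dialogue, the text clean-up in steps 6, 7, and 11 can be sketched in Python roughly as follows. This is only a sketch under stated assumptions: it assumes the book has been exported as plain text with \n as the paragraph mark, the function names are invented for illustration and are not part of Kurzweil or Word, and regular expressions stand in for the repeated passes (one pattern instead of six, five, four, then three carriage returns, and one pattern instead of a pass per lower case letter). The Word page break step (^m) is not covered.

```python
import re


def collapse_blank_lines(text: str) -> str:
    """Step 6: leave at most one blank line (two \\n in a row) anywhere."""
    return re.sub(r"\n{3,}", "\n\n", text)


def split_dialogue(text: str) -> str:
    """Step 7, first pass: quote-space-quote becomes quote-newline-quote."""
    return text.replace('" "', '"\n"')


def rejoin_broken_paragraphs(text: str) -> str:
    """Step 7, letter passes: \\n directly before a lower case letter becomes a space.

    A page that starts with ' word' (the space inserted in step 5) is not
    touched, because the character after the \\n is a space, not a letter.
    """
    return re.sub(r"\n([a-z])", r" \1", text)


def normalize_punctuation(text: str) -> str:
    """Step 11: smart quotes to standard quotes, em dashes to double hyphens."""
    table = str.maketrans({
        "\u201c": '"', "\u201d": '"',   # curly double quotes
        "\u2018": "'", "\u2019": "'",   # curly single quotes
        "\u2014": "--",                 # em dash
    })
    return text.translate(table)


def prevalidate(text: str) -> str:
    """Apply the passes in the same order as the numbered steps."""
    text = collapse_blank_lines(text)
    text = split_dialogue(text)
    text = rejoin_broken_paragraphs(text)
    return normalize_punctuation(text)


if __name__ == "__main__":
    sample = '12\n\n\n\nThe dog\nbarked. "Hi." "Hello."\n\n13\n\n went home\u2014late.'
    print(prevalidate(sample))
```

In the sample, the blank line after page number 13 survives because the page text begins with a space, which is the reason for inserting that space back in step 5.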
-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of My Nickels Worth
Sent: Monday, November 10, 2008 8:46 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] How to be a black belt validater?

Ok, I am wondering if we can put together a list of things a validator can/should do to make a book a really good one for the collection. This is a sort of companion thread for the older one about being a black belt submitter.

What are some things which a validator should always do, and what are things which are extras, according to Bookshare's requirements, but which could really improve the book for the collection?

Maybe an experienced validator could give an example of what they do with a book, from downloading it off step one to uploading it for publication.

Thanks,
Caitlyn

To unsubscribe from this list send a blank Email to bksvol-discuss-request@xxxxxxxxxxxxx
put the word 'unsubscribe' by itself in the subject line. To get a list of available commands, put the word 'help' by itself in the subject line.