No, Louise. In Kurzweil, to scan only, you select 'scan images only' in scanner settings. When you have finished the scanning job, go back into scanner settings and select 'recognize images', then press Enter to accept the new setting. Now simply press F9 and all images in the current batch will be recognized.

Guido

Guido Dante Corona
IBM Accessibility Center, Austin Tx.
Research Division,
Phone: 512. 838. 9735.
Email: guidoc@xxxxxxxxxxx
Web: http://www.ibm.com/able


"Louise" <lougou@xxxxxxxxxxxxxx>
Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxx
12/29/2004 09:55 AM
Please respond to bksvol-discuss
To: <bksvol-discuss@xxxxxxxxxxxxx>
cc:
Subject: [bksvol-discuss] Re: txt page breaks redux

Guido,

After you scan a book to image and are ready to recognize, do you just go to the images folder, select the first image, and then do a select all or Shift+Control+End to select all of the images? I've never done it this way before, but it's worth a try.

----- Original Message -----
From: Guido Corona
To: bksvol-discuss@xxxxxxxxxxxxx
Sent: Tuesday, December 28, 2004 7:03 PM
Subject: [bksvol-discuss] Re: txt page breaks redux

By the way, if I scanned at 300 DPI instead of 400 DPI, the same 450-page paperback scanning job would take just over 45 minutes. I do not use any document feeder. I just scan 2 pages per scan in continuous scanning mode, with a 5-second delay between scans.

The same strategy should be possible with the mainstream ABBYY FineReader Professional 7.0 software used by our very own revered Donna Smith, the Divine ABBYY Scanning Mistress!

Guido


Guido Corona/Austin/IBM@IBMUS
Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxx
12/28/2004 06:50 PM
Please respond to bksvol-discuss
To: bksvol-discuss@xxxxxxxxxxxxx
cc:
Subject: [bksvol-discuss] Re: txt page breaks redux

Cindy,

I am using a lowly EPSON 1660 with the Kurzweil 9.02 software.
As I scan at 400 DPI in grayscale, the OCR engine would take a lot longer to recognize if I scanned and recognized pages simultaneously. So I scan images only, then submit the entire stack of images for recognition at the end. Recognition will take anything from 20 minutes to two or even three hours, depending on the length of the book and the degree of difficulty encountered in the recognition process. No skin off my back though; I can do a lot of other things during that time, since my attention is not needed.

As I recall, books with an error rate of 0.7% or less are deemed excellent. Between 0.7% and 1.5% they are 'good'. Submissions with an error rate above 1.5% are deemed fair. But the reviewer has the opportunity to exercise judgment and override the system evaluation, up or down. If a book seems to have holes, with frequent missing or corrupted words, I nuke it, no matter what the system evaluates, and add a comment for the administrator. If the book has a bunch of 'the' misspelled as 'die', I try to fix them one at a time, unless that turns out to be a thankless task because it is just one aspect of a much bigger problem.

So Cindy, I think you know my opinion by now.

Guido


Cindy <popularplace@xxxxxxxxx>
Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxx
12/28/2004 06:24 PM
Please respond to bksvol-discuss
To: bksvol-discuss@xxxxxxxxxxxxx
cc:
Subject: [bksvol-discuss] Re: txt page breaks redux

Guido,

You must have a wonderful scanner!!!! There's no way I can scan a book that quickly, since I have to scan a page at a time, and can convert 8 - 12 pages at a time. Anyway, since so many other people are happier scanning, I'm leaving that to them -- but unless everybody submits in rtf format, that still leaves the problem of hard page breaks in txt books, which apparently some of you can put in and others of us cannot.
My solution, as things stand now, is to download a txt file, reject it, and re-submit it as an rtf file -- which means someone else will then have to validate it.

But you bring up something I've been wondering about: Should a book that is spell-checked only, with garbage removed, be approved as being in excellent condition, or good? What about scanning errors that pass the spell-check, like "be" for "he," the number one for capital I, "lie" for "the," etc.? And words that are missing completely from sentences? I know the excellent rating allows for some errors, but how many before it becomes Good instead of Excellent?

I recently worked on a book that had a lot of missing words. I would suspect that the omissions wouldn't have made much difference to the reader, and I suppose that in the cases of the other examples I gave any reader could make changes as he/she read, but I wonder if it wouldn't be better for books that haven't been read and corrected by the validator to have a Good rating and leave the Excellent for books that have been done more carefully.

Cindy

--- Guido Corona <guidoc@xxxxxxxxxx> wrote:

> I know this will sound so dreadfully heartless, no Charitable Seasonal
> spirit and all the rest. But it takes a grand total of just 1 hour and 5
> minutes to scan an entire 450 page paperback book, page breaks, font
> info, and all the rest. Then it takes about 90 minutes to do some basic
> cleanup, and finally an average of a couple of hours to spellcheck it.
>
> I really do not understand why we are even bothering to discuss salvage
> operations for DOA submissions, when the culling ax and a quick rescan is
> the only merciful course of action for most of these runts.
>
> Guido Dante Corona
> IBM Accessibility Center, Austin Tx.
> Research Division,
> Phone: 512. 838. 9735.
> Email: guidoc@xxxxxxxxxxx
> Web: http://www.ibm.com/able
>
> "Marissa Mika" <Marissa.M@xxxxxxxxxxxx>
> Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxx
> 12/28/2004 05:35 PM
> Please respond to bksvol-discuss
> To: <bksvol-discuss@xxxxxxxxxxxxx>
> cc:
> Subject: [bksvol-discuss] Re: txt page breaks redux
>
> Hi Cindy,
>
> We're still working on it. (Gotta love consensus, huh?) Look for a
> message from me by the end of the week.
>
> Did everyone have a good Christmas?
>
> Marissa
>
> -----Original Message-----
> From: bksvol-discuss-bounce@xxxxxxxxxxxxx
> [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Cindy
> Sent: Wednesday, December 22, 2004 9:21 PM
> To: bksvol-discuss@xxxxxxxxxxxxx
> Subject: [bksvol-discuss] txt page breaks redux
>
> Hi, Marissa,
>
> Thanks for the new list.
>
> Is there any word yet on what to do with txt files, or if they will be
> accepted without hard breaks, with spaces and page numbers instead? That
> doesn't prevent the breaks Word puts in in the wrong places, but by
> adding line spaces or changing font the file can probably be made to
> coincide with the book.
>
> When I finish Johnny Tremain I'm thinking of fixing one of those
> troublesome romances, since I found a copy. As things stand now, I think
> the best thing for me to do is to reject the txt file and submit a new
> rtf file with page breaks.
>
> Cindy
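
For anyone who wants to sanity-check a rating, the thresholds Guido recalls in this thread (0.7% or less is excellent, up to 1.5% is good, anything above is fair) can be sketched in a few lines of Python. This is only a rough illustration: the function name `rate_submission` is made up, and measuring the error rate as errors per word and treating the boundaries as inclusive are assumptions, not a description of how the system actually evaluates a book.

```python
def rate_submission(error_count: int, word_count: int) -> str:
    """Return 'excellent', 'good', or 'fair' for a scanned book.

    Hypothetical helper. The 0.7% and 1.5% thresholds are the ones
    recalled in the thread; counting errors per word is an assumption.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    if error_count < 0:
        raise ValueError("error_count must not be negative")

    # Compare in whole numbers to avoid floating-point rounding:
    # error_count / word_count <= 0.7%  <=>  error_count * 1000 <= 7 * word_count
    if error_count * 1000 <= 7 * word_count:
        return "excellent"
    # error_count / word_count <= 1.5%  <=>  error_count * 1000 <= 15 * word_count
    if error_count * 1000 <= 15 * word_count:
        return "good"
    return "fair"


if __name__ == "__main__":
    # Example: 900 errors in a 90,000-word book is a 1% error rate.
    print(rate_submission(900, 90_000))
```

The result is only a starting point, since the reviewer can still override the evaluation up or down.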