Paul and Noel,

Since you've both expressed the same questions, let me ask something in return. How do you characterize your results? Do you generally gauge the accuracy of your results via the self-reported recognition stats, or by looking at the number of spelling mistakes per page when you change various settings?

The recognition stats presented by Kurzweil are often misleading. What we are given is often obtained directly from the OCR engine in question. With those types of self-reports, there is always a matter of accuracy and reliability. There is also a matter of validity. I have experimented with this issue a bit and have found that even with the same settings, if you scan a page several times, the supplied stats are different each time, in some cases by a large margin. However, when this issue is looked at from the actual accuracy perspective, the results are quite consistent across different scans of the page with the same settings. Even when the Optimize Scanning feature comes up with different settings for the same page when using that function several times, the accuracy is not affected too often.

Grayscale at 400 DPI does make a large difference. Even when I use Optimize Scanning, I make sure that at the end I compare results by using grayscale at 400 DPI.

Pratik

Pratik Patel
Managing Director
CUNY Assistive Technology Services
The City University of New York
(718) 997-3775
ppatel@xxxxxx

-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Edwards, Paul
Sent: Thursday, April 29, 2004 4:25 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: text quality

Actually, I knew that. So of course, the question becomes: if it makes no difference, why do different values turn up when you do "optimize scan"? Perhaps there is a Kurzweil guru lurking. Would you care to emerge from the lurk and answer the question?
Paul

Paul Edwards, Director
Access Services, North Campus
Phone: (305) 237-1146
Fax: (305) 237-1831
TTY: (305) 237-1413
Email: pedwards@xxxxxxxx
home email: edwpaul@xxxxxxxxxxx

-----Original Message-----
From: Guido Corona [mailto:guidoc@xxxxxxxxxx]
Sent: Thursday, April 29, 2004 4:02 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: text quality

Paul,

I also use grayscale at 400 DPI most of the time with Kurzweil 8.0. If you find it is rather slow, scan images only, then turn on pure recognition before going to sleep. Your book will be ready when you wake in the morning, no matter how large it is. By the way, with grayscale, brightness makes no difference.

Guido

Guido D. Corona
IBM Accessibility Center, Austin Tx.
IBM Research
Phone: (512) 838-9735
Email: guidoc@xxxxxxxxxxx
Visit my weekly Accessibility WebLog at: http://www-3.ibm.com/able/weblog/corona_weblog.html

"Edwards, Paul" <pedwards@xxxxxxxx>
Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxx
04/29/2004 02:42 PM
Please respond to bksvol-discuss
To: <bksvol-discuss@xxxxxxxxxxxxx>
cc:
Subject: [bksvol-discuss] Re: text quality

This is a difficult issue. I take the approach of carefully checking the first few pages at the beginning of a scan. If there are errors I can adjust for, I do that. I also rescan pages whose value in Kurzweil comes back lower than ninety. I do not tend to rescan pages that score ninety to ninety-five because I usually cannot make much of a difference, and we are often dealing with a screwed-up heading or something. However, I recently scanned a book which was a hardcover and which should have scanned like a dream, and it came out as pure dreck.

I have found that optimizing scanning is, for the most part, worth doing. The results do not always make me happy, in that I am now scanning a book using gray scale and sixty, which takes forever to scan. By the way, it is Legends II, edited by Robert Silverberg.
Paul

Paul Edwards, Director
Access Services, North Campus
Phone: (305) 237-1146
Fax: (305) 237-1831
TTY: (305) 237-1413
Email: pedwards@xxxxxxxx
home email: edwpaul@xxxxxxxxxxx

-----Original Message-----
From: Kellie Hartmann [mailto:kellhart@xxxxxxxxxx]
Sent: Thursday, April 29, 2004 1:04 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] text quality

Hi all.

Even with the wonderful new scanning software available, there are a few kinds of things that are very difficult to get a good scan from. For example, linguistics books are often very graphical in nature and contain symbols that the OCR packages don't recognize: things like r-underring, turned V, etc. Also, some cheap paperbacks do have places where they seem to be blurred. I scanned a novel that I was assigned to read in French class, and when I found illegible passages I tried rescanning them. I rescanned several times, changing various settings, but certain passages absolutely refused to scan. I don't really plan to submit it to Bookshare anyway, but I would prefer this scan, with a couple of blurred lines every 20 pages or so, to no scan. I'm able to use this in class with no problems, so in my opinion this is far better than nothing.

Finally, I have another French book which has very glossy pages and lots of flashy graphical design. Again, even with a lot of work experimenting with different settings, my results were not encouraging. This I definitely won't submit to Bookshare because I can't get it in good enough shape; the effort required would be far beyond the benefits.

I agree that careless scanning is unreasonable, and think that validating is important. It always takes me much longer to validate something than to scan it because I read the whole book and fix every error that can possibly be fixed. Not every validator is going to do that, and certain books, such as enormous textbooks, really would require a great investment in time to proof thoroughly.
So it isn't realistic to expect every book to be flawless. What I would really like eventually, and I know this isn't realistic either, would be to have all the fair-quality books rescanned.

Kellie