Hi, Julia,

I was sure this setting was still somewhere in Version 11, but I just looked
and can't find it. I searched the manual and found nothing about it either.
I wonder if the program somehow doesn't need it anymore.

Paula

----- Original Message -----
From: "Julia Kulak" <julia.kulak@xxxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Friday, August 15, 2008 8:18 PM
Subject: [bksvol-discuss] Re: Become A Black Belt Submitter

> Hi, I think Kurzweil eliminated one setting in version 11: there doesn't
> appear to be a setting for recognition of light text on a dark
> background. Will this mess up the book? Should I downgrade to version 10
> for this feature, or is there an equivalent setting in version 11?
> Julia
> ----- Original Message -----
> From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
> To: <bksvol-discuss@xxxxxxxxxxxxx>
> Sent: Friday, August 15, 2008 2:40 PM
> Subject: [bksvol-discuss] Re: Become A Black Belt Submitter
>
> > Hi Jim, I have a personal copy; but you, and anyone else here, can find
> > the article on Jake's site at:
> >
> > http://www.jbrownell.com/bks/tip.asp?id=29
> >
> > Evan
> >
> > ----- Original Message -----
> > From: <james.homme@xxxxxxxxxxxx>
> > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > Sent: Friday, August 15, 2008 7:49 AM
> > Subject: [bksvol-discuss] Re: Become A Black Belt Submitter
> >
> >> Hi,
> >> Where is the stuff Pratik wrote about this?
> >>
> >> Thanks.
> >>
> >> Jim
> >>
> >> James D Homme, Usability Engineering, Highmark Inc.,
> >> james.homme@xxxxxxxxxxxx, 412-544-1810
> >>
> >> "The difference between those who get what they wish for and those who
> >> don't is action. Therefore, every action you take is a complete
> >> success, regardless of the results." -- Jerrold Mundis
> >> Highmark internal only: For usability and accessibility:
> >> http://highwire.highmark.com/sites/iwov/hwt093/
> >>
> >> From: "EVAN REESE" <mentat3@verizon.net>
> >> Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxg
> >> To: bksvol-discuss@xxxxxxxxxxxxx
> >> Sent: 08/14/2008 07:27 PM
> >> Subject: [bksvol-discuss] Re: Become A Black Belt Submitter
> >> Please respond to: bksvol-discuss@freelists.org
> >>
> >> Thanks for sending this up. This is all very useful stuff.
> >>
> >> I do use Scan Repeatedly, and just hit the Cancel key twice if I get a
> >> confidence number below the threshold, which on my K1000 is set to
> >> 98.7%. If I can go twenty or fifty pages without getting a page below
> >> that number, then it saves me from having to hit the F9 key twenty or
> >> fifty times.
> >>
> >> I also use autocorrection, but haven't compared a scan with and without
> >> it, so I cannot take sides in that debate.
> >>
> >> According to Pratik's excellent monograph on getting the best
> >> recognition of mass market paperbacks, grayscale and 400 dots per inch
> >> can sometimes produce better results than static optimized. So your
> >> point here about grayscale is a good one, but increasing the resolution
> >> from 300 to 400, especially for poor quality print such as you'd get
> >> with cheap paperbacks, can give even better recognition sometimes. Of
> >> course, increasing the resolution from the usual 300 will also slow
> >> down the scan and the recognition; but the extra time invested up front
> >> is very likely to be more than offset by the time saved cleaning up the
> >> scan afterward.
> >>
> >> I have scanned the same material with Suspicious Regions kept and
> >> ignored, and it can really make a difference in the amount of junk you
> >> get. So this is another good point you make here.
> >>
> >> Thanks again.
> >>
> >> Evan
> >>
> >> ----- Original Message -----
> >> From: Monica Willyard
> >> To: Bookshare Volunteers
> >> Sent: Thursday, August 14, 2008 6:19 PM
> >> Subject: [bksvol-discuss] Become A Black Belt Submitter
> >>
> >> Hi, everyone. I wrote an email about getting really clear scans for one
> >> of our volunteers, and it occurred to me that someone on this list
> >> might benefit from it. It's a little on the long side. I hope something
> >> in it will help you. If I've said anything confusing, please ask me
> >> about it. I know many of you have done a lot of scanning, so I'm
> >> focusing on things that may not have occurred to you. I'll call them my
> >> top ten scanning tips. (grin) They work from my experience, and you may
> >> find that you need to experiment to find something that works well for
> >> you. Also, I use Kurzweil for scanning. Openbook users may find some of
> >> this to be useful, but some of it won't apply. I do have Openbook 7 and
> >> used it for several years. So I'll do my best to help you translate
> >> these to Openbook if that's what you need.
> >>
> >> I got a lot of these ideas from volunteers I've been fortunate enough
> >> to work with over the past 2 years. Jim Baugh, Louise, Pratik, Jake,
> >> Scott, Shelley, and Gerald taught me so much about good scanning.
> >> Thanks guys. (smile) You rock!
> >>
> >> 1. Start with some solid settings in Kurzweil that will work most of
> >> the time. You may know your way around Kurzweil well. I don't know if
> >> you've thought to work on these settings though since they're not
> >> obvious. Under the settings menu, in the general tab, make sure that
> >> your confidence threshold is set to at least 98.5. Why? Kurzweil
> >> defaults to 95 percent, and that means that it optimizes scans for a
> >> lower level of accuracy. That means you won't get the best results from
> >> optimization. That also means more clean-up on the backside, and that's
> >> a pain in the neck.
> >> The other setting in general that you may want to turn on if you have
> >> some disk space is the option to keep scanned images. This feature lets
> >> you re-recognize pages if they have issues. Sometimes just changing
> >> something like detect columns will make that page come out right
> >> without you having to totally rescan the page. Once you've read through
> >> the book, Kurzweil will let you remove the scanned images from the book
> >> to reduce the file size.
> >>
> >> There are three final settings that you may find useful for scanning
> >> most fiction. These work well for me, especially with library books.
> >> They're all under the recognition tab. Column identification should be
> >> enabled. Partial columns should be ignored, and suspicious regions
> >> should be ignored. This flies in the face of what Nick has recommended
> >> on the Kurzweil list, so I'd better explain. When scanning books, it's
> >> somewhat common to get a shadow from the spine of the book. It often
> >> makes a narrow column of a tab character and a random group of numbers
> >> or letters. If you turn off column identification, these random letters
> >> are mingled with the regular text. Turning on the column detection
> >> separates this garbage from the text, and ignoring partial columns and
> >> suspicious regions removes it during OCR. If a page needs column
> >> detection turned off due to a table, and you have retained images of
> >> the scanned page, you can easily change the recognition settings and
> >> just re-recognize the page from the scanned image. Do you see how this
> >> could save you time and hassle?
> >>
> >> Once you have settings you like, save them as default so you can start
> >> scanning without worrying about them each time you start Kurzweil.
> >>
> >> 2. Prepare your book for scanning, and you'll get better results from
> >> the start.
> >> Before you begin to scan a book, run your fingers lightly through the
> >> pages to remove any possible ink, dust, or other particles that may be
> >> on the pages. If the book is a library book, flip through the book in
> >> sections of about fifteen pages or so, gently pressing your fingers
> >> along the inner spine to encourage the book to lie flat. If the book
> >> belongs to you, especially if it's a paperback, flip through sections
> >> as with a library book, but bend the book back so that its outer covers
> >> almost touch. You're giving your book some flexibility stretches while
> >> not breaking its spine. This is especially important for thick books or
> >> two-page scanning mode and will keep you from having to push down as
> >> hard on books while you scan.
> >>
> >> 3. Optimize and verify settings for your book. Before scanning a book,
> >> open to the center and use the optimize feature. The Kurzweil staff
> >> says that optimization should be used in one-page mode so it can get
> >> the best idea of how the print works in your book. Scan four or five
> >> pages after optimization to determine if any adjustments in settings
> >> need to be made. Kurzweil does a fairly good job picking the optimal
> >> settings to scan a particular book unless the print quality is
> >> exceptionally bad. If you're planning to scan in two-page mode, you can
> >> turn this back on once you're finished with optimization.
> >>
> >> 4. When in doubt, go for grey-scale. Grey-scale is the best and most
> >> reliable thing to try when optimization doesn't produce the quality
> >> that you need. Try grey-scale with brightness of around 65 and a
> >> resolution of 300 DPI. It's really great for scanning mass market
> >> paperbacks. Grey-scale will make your scans slower, and its scanned
> >> images are larger than those made with static thresholding. It gives
> >> the best page representation though, compared to other forms of
> >> thresholding.
> >> If you're using a Canon or Visioneer scanner, grey-scale will save your
> >> bacon! (grin) Please note that Openbook 7 doesn't implement grey-scale
> >> correctly, so automatic contrast is probably your best choice.
> >>
> >> 5. Catch bad scans as they happen. There is a friendly debate among
> >> submitters about whether to scan in batches or to scan pages and
> >> recognize them one at a time. There are pros and cons on both sides. I
> >> do a sort of modified batch style. I scan a book while on the phone or
> >> doing something else but don't use the scan repeatedly feature for one
> >> reason. I want to catch badly scanned pages as they happen. It saves me
> >> from hunting for a page to rescan it later. So I scan a page and let my
> >> scan recognize while I'm turning to the next page. I wait for Kurzweil
> >> to tell me its confidence number. I make this really easy because I've
> >> turned off the progress messages for Kurzweil's scanning and
> >> recognition and have it set to play a chime when scanning and
> >> recognition are finished. So if Kurzweil says something, it's the
> >> confidence number letting me know that the page scanned below the
> >> accuracy threshold I've set. If the statistics say 97 percent
> >> confidence level or less, rescan the page to try for a better scan.
> >> Otherwise, you will have to struggle with many errors on the page.
> >>
> >> 6. Your scanner needs TLC too. Books can be dirty or dusty sometimes.
> >> Mass market paperbacks can leave a residue of ink dust on your scanner.
> >> Keep the scanner glass clean by using a dry, lint-free cloth. Never use
> >> anything wet like an alcohol pad or baby wipe. That will create little
> >> bubbles under the scanner glass and will cause problems in future
> >> scans.
> >>
> >> 7. When scanning a book, do a spot check every 15 or 20 pages. Look at
> >> the last page or two of the file to make sure the settings are still
> >> producing accurate results.
> >>
> >> 8. After doing a scan, run rank spelling. It will let you see your
> >> spelling errors and will put them in the order of their prevalence in
> >> your scan. If you find some words that Kurzweil doesn't know, you may
> >> want to add them to your word list so they won't be flagged in future
> >> scans. I don't do this for proper names unless it's a name that will
> >> keep cropping up in future books. I do add words that are valid but
> >> that Kurzweil doesn't have in its internal word list. You'll find that
> >> doing this over time helps Kurzweil do a better job for you when you're
> >> cleaning up your scans.
> >>
> >> 9. Keep the de-speckle setting turned off for most books. You may need
> >> it with hardcover books because they sometimes have a text decoration
> >> on the pages. Otherwise, de-speckle can interfere with OCR and actually
> >> cause more errors than it solves.
> >>
> >> 10. Whether to use auto-corrections when scanning is another point of
> >> debate. I believe it can be a good thing if used carefully. I should
> >> note that Gerald has pointed out that Openbook has some
> >> auto-corrections that cause problems with books and should be fixed by
> >> users of that program. Kurzweil seems to do a good job for me, and it
> >> makes my work easier. I loaded up a bunch of my older scans that have
> >> been lurking on my hard drive for over a decade and ran auto-correction
> >> on them. What an improvement! I might actually get to submit some of
> >> them now. Here are a few auto-corrections I have added to my Kurzweil
> >> list.
> >>
> >> dirough for through
> >> diough for though
> >> diought for thought
> >> diey for they
> >> diere for there
> >> dieir for their
> >> cornpany for company
> >> cornfortable for comfortable
> >> tiiing for thing
> >> rnany for many
> >> anydiing for anything
> >>
> >> If you use Openbook, you may want to remove a few of the corrections in
> >> its default list.
> >> I regularly find these in books scanned in Openbook and have to fix
> >> them as I read. In each pair, Openbook replaces the first word with the
> >> second.
> >>
> >> modem for modern
> >> tom for torn
> >> glock for clock
> >> morn for mom
> >> bum for burn
> >> com for corn
> >>
> >> That last one causes problems for anyone scanning Star Trek books
> >> because Kirk presses his corn badge to talk to the ship. (grin) If a
> >> word like command is hyphenated between two pages, you get corn-mand.
> >> Meanwhile, Batman dials into the internet with his modern, tries to
> >> stop a crook named torn from shooting him with a clock, and puts the
> >> dirty burn in cuffs until mom-ing. See how auto-corrections can go
> >> wrong if you're not careful?
> >>
> >> Whew! We've made it to the end. (grin) I hope some of this makes your
> >> scans easier to work with. It'll give you a foundation to start from
> >> anyhow. Clean-up tips will be another email and will take some thought.
> >> I'm better at doing than explaining things. I do have a system I use
> >> though. I just haven't really written it down. Anyone got a cold Dr.
> >> Pepper to share?
> >>
> >> --
> >> Monica Willyard
> >>
> >> To unsubscribe from this list send a blank Email to
> >> bksvol-discuss-request@xxxxxxxxxxxxx
> >> put the word 'unsubscribe' by itself in the subject line. To get a list
> >> of available commands, put the word 'help' by itself in the subject
> >> line.
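
For anyone who cleans up scans outside Kurzweil or Openbook, the auto-correction idea in tip 10 can be sketched in a few lines of Python. This is only an illustration, not how either program works internally: the names CORRECTIONS and autocorrect are made up for this example, and the word pairs are the ones from the Kurzweil list in the message. The sketch assumes corrections should apply to whole words only, so a real word such as "corn" or "modem" is left alone.

```python
import re

# OCR error -> intended word, taken from the Kurzweil list in the message.
CORRECTIONS = {
    "dirough": "through",
    "diough": "though",
    "diought": "thought",
    "diey": "they",
    "diere": "there",
    "dieir": "their",
    "cornpany": "company",
    "cornfortable": "comfortable",
    "tiiing": "thing",
    "rnany": "many",
    "anydiing": "anything",
}

# One pattern matching any listed error as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, CORRECTIONS)) + r")\b",
    re.IGNORECASE,
)


def _replace(match):
    word = match.group(0)
    fixed = CORRECTIONS[word.lower()]
    # Keep a leading capital, e.g. "Diey" becomes "They".
    return fixed.capitalize() if word[0].isupper() else fixed


def autocorrect(text):
    """Apply whole-word corrections; words not in the list are untouched."""
    return _PATTERN.sub(_replace, text)


if __name__ == "__main__":
    print(autocorrect("Diey went dirough the cornpany corn field."))
```

Because every entry is a string that is never a real word, a list like this is fairly safe. Entries whose first word is valid English, such as the Openbook pairs quoted in the message, are the ones that damage good text.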