[bksvol-discuss] Re: Become A Black Belt Submitter

  • From: "Chela Robles" <cdrobles693@xxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Fri, 15 Aug 2008 20:00:21 -0700

No, it is already built in. If you go to Optimize, it should do it all.

On 8/15/08, Paula and James Muysenberg <outofsightlife@xxxxxxxxx> wrote:
> Hi, Julia,
>
>     I was sure this setting was still somewhere in Version 11, but I just
> looked and can't find it. I searched the manual, and also found nothing
> about it. I wonder if the program somehow doesn't need it anymore.
>
> Paula
>
> ----- Original Message -----
> From: "Julia Kulak" <julia.kulak@xxxxxxxxxxxx>
> To: <bksvol-discuss@xxxxxxxxxxxxx>
> Sent: Friday, August 15, 2008 8:18 PM
> Subject: [bksvol-discuss] Re: Become A Black Belt Submitter
>
>
>> Hi, I think Kurzweil eliminated one setting in version 11. There doesn't
>> appear to be a setting for recognition of light text on a dark background.
>> Will this mess up the book? Should I downgrade to version 10 for this
>> feature, or is there an equivalent setting in version 11?
>> Julia
>> ----- Original Message -----
>> From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
>> To: <bksvol-discuss@xxxxxxxxxxxxx>
>> Sent: Friday, August 15, 2008 2:40 PM
>> Subject: [bksvol-discuss] Re: Become A Black Belt Submitter
>>
>>
>> > Hi Jim, I have a personal copy; but you, and anyone else here, can find
>> > the article on Jake's site at:
>> >
>> > http://www.jbrownell.com/bks/tip.asp?id=29
>> >
>> > Evan
>> >
>> > ----- Original Message -----
>> > From: <james.homme@xxxxxxxxxxxx>
>> > To: <bksvol-discuss@xxxxxxxxxxxxx>
>> > Sent: Friday, August 15, 2008 7:49 AM
>> > Subject: [bksvol-discuss] Re: Become A Black Belt Submitter
>> >
>> >
>> >> Hi,
>> >> Where is the stuff Pratik wrote about this?
>> >>
>> >> Thanks.
>> >>
>> >> Jim
>> >>
>> >> James D Homme, Usability Engineering, Highmark Inc.,
>> >> james.homme@xxxxxxxxxxxx, 412-544-1810
>> >>
>> >> "The difference between those who get what they wish for and those who
>> >> don't is action. Therefore, every action you take is a complete
>> >> success, regardless of the results." -- Jerrold Mundis
>> >> Highmark internal only: For usability and accessibility:
>> >> http://highwire.highmark.com/sites/iwov/hwt093/
>> >>
>> >>
>> >>
>> >> From: "EVAN REESE" <mentat3@verizon.net>
>> >> Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxg
>> >> To: bksvol-discuss@xxxxxxxxxxxxx
>> >> Date: 08/14/2008 07:27 PM
>> >> Subject: [bksvol-discuss] Re: Become A Black Belt Submitter
>> >> Please respond to: bksvol-discuss@freelists.org
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> Thanks for sending this up. This is all very useful stuff.
>> >>
>> >> I do use Scan Repeatedly, and just hit the Cancel key twice if I get a
>> >> confidence number below the threshold, which on my K1000 is set to
>> >> 98.7%. If I can go twenty or fifty pages without getting a page below
>> >> that number, then it saves me from having to hit the F9 key twenty or
>> >> fifty times.
>> >>
>> >> I also use autocorrection, but haven't compared a scan with and without
>> >> it, so I cannot take sides in that debate.
>> >>
>> >> In his excellent monograph on getting the best recognition of mass
>> >> market paperbacks, Pratik wrote that grayscale and 400 dots per inch
>> >> can sometimes produce better results than static optimized. So your
>> >> point here about grayscale is a good one, but increasing the resolution
>> >> from 300 to 400, especially for poor quality print such as you'd get
>> >> with cheap paperbacks, can give even better recognition sometimes. Of
>> >> course, increasing the resolution from the usual 300 will also slow
>> >> down the scan and the recognition; but the extra time invested up front
>> >> is very likely to be more than offset by the time saved cleaning up the
>> >> scan afterward.
>> >>
>> >> I have scanned the same material with Suspicious Regions kept and
>> >> ignored, and it can really make a difference in the amount of junk you
>> >> get. So this is another good point you make here.
>> >>
>> >> Thanks again.
>> >>
>> >> Evan
>> >>
>> >> ----- Original Message -----
>> >> From: Monica Willyard
>> >> To: Bookshare Volunteers
>> >> Sent: Thursday, August 14, 2008 6:19 PM
>> >> Subject: [bksvol-discuss] Become A Black Belt Submitter
>> >>
>> >> Hi, everyone. I wrote an email about getting really clear scans for one
>> >> of our volunteers, and it occurred to me that someone on this list
>> >> might benefit from it. It's a little on the long side. I hope something
>> >> in it will help you. If I've said anything confusing, please ask me
>> >> about it. I know many of you have done a lot of scanning, so I'm
>> >> focusing on things that may not have occurred to you. I'll call them my
>> >> top ten scanning tips. (grin) They work from my experience, and you may
>> >> find that you need to experiment to find something that works well for
>> >> you. Also, I use Kurzweil for scanning. Openbook users may find some of
>> >> this to be useful, but some of it won't apply. I do have Openbook 7 and
>> >> used it for several years. So I'll do my best to help you translate
>> >> these to Openbook if that's what you need.
>> >>
>> >> I got a lot of these ideas from volunteers I've been fortunate enough
>> >> to work with over the past 2 years. Jim Baugh, Louise, Pratik, Jake,
>> >> Scott, Shelley, and Gerald taught me so much about good scanning.
>> >> Thanks guys. (smile) You rock!
>> >>
>> >> 1. Start with some solid settings in Kurzweil that will work most of
>> >> the time. You may know your way around Kurzweil well. I don't know if
>> >> you've thought to work on these settings though since they're not
>> >> obvious. Under the settings menu, in the general tab, make sure that
>> >> your confidence threshold is set to at least 98.5. Why? Kurzweil
>> >> defaults to 95 percent, and that means that it optimizes scans for a
>> >> lower level of accuracy. That means you won't get the best results from
>> >> optimization. That also means more clean-up on the backside, and that's
>> >> a pain in the neck. The other setting in general that you may want to
>> >> turn on if you have some disk space is the option to keep scanned
>> >> images. This feature lets you re-recognize pages if they have issues.
>> >> Sometimes just changing something like detect columns will make that
>> >> page come out right without you having to totally rescan the page. Once
>> >> you've read through the book, Kurzweil will let you remove the scanned
>> >> images from the book to reduce the file size.
>> >>
>> >> There are three final settings that you may find useful for scanning
>> >> most fiction. These work well for me, especially with library books.
>> >> They're all under the recognition tab. Column identification should be
>> >> enabled. Partial columns should be ignored, and suspicious regions
>> >> should be ignored. This flies in the face of what Nick has recommended
>> >> on the Kurzweil list, so I'd better explain. When scanning books, it's
>> >> somewhat common to get a shadow from the spine of the book. It often
>> >> makes a narrow column of a tab character and a random group of numbers
>> >> or letters. If you turn off column identification, these random letters
>> >> are mingled with the regular text. Turning on the column detection
>> >> separates this garbage from the text, and ignoring partial columns and
>> >> suspicious regions removes it during OCR. If a page needs column
>> >> detection turned off due to a table, and you have retained images of
>> >> the scanned page, you can easily change the recognition settings and
>> >> just re-recognize the page from the scanned image. Do you see how this
>> >> could save you time and hassle?
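To make the spine-shadow reasoning concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how Kurzweil works internally: the Column class, the pixel coordinates, and the 15 percent cutoff are all hypothetical. Once text has been split into columns, a very narrow column can be dropped on its own instead of being mingled with the real text.

```python
from dataclasses import dataclass


@dataclass
class Column:
    left: int    # left edge in pixels (hypothetical layout data)
    right: int   # right edge in pixels
    text: str    # text recognized inside this column


def drop_narrow_columns(columns, page_width, min_fraction=0.15):
    """Keep only columns at least min_fraction of the page wide.

    The 0.15 cutoff is an assumption chosen for illustration.
    """
    keep = [c for c in columns
            if (c.right - c.left) >= min_fraction * page_width]
    return "\n".join(c.text for c in keep)


# A sliver of spine shadow next to one real column of text.
page = [
    Column(0, 40, "\t7 l 1"),
    Column(60, 1700, "It was a dark night."),
]
clean_text = drop_narrow_columns(page, page_width=1800)
```

Without the column split there would be nothing to drop, because the stray characters would already be mixed into the lines of the story.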
>> >>
>> >> Once you have settings you like, save them as default so you can start
>> >> scanning without worrying about them each time you start Kurzweil.
>> >>
>> >> 2. Prepare your book for scanning, and you'll get better results from
>> >> the start. Before you begin to scan a book, run your fingers lightly
>> >> through the pages to remove any possible ink, dust, or other particles
>> >> that may be on the pages. If the book is a library book, flip through
>> >> the book in sections of about fifteen pages or so, gently pressing your
>> >> fingers along the inner spine to encourage the book to lie flat. If the
>> >> book belongs to you, especially if it's a paperback, flip through
>> >> sections as with a library book, but bend the book back so that its
>> >> outer covers almost touch. You're giving your book some flexibility
>> >> stretches while not breaking its spine. This is especially important
>> >> for thick books or two-page scanning mode and will keep you from having
>> >> to push down as hard on books while you scan.
>> >>
>> >> 3. Optimize and verify settings for your book. Before scanning a book,
>> >> open to the center and use the optimize feature. The Kurzweil staff
>> >> says that optimization should be used in one-page mode so it can get
>> >> the best idea of how the print works in your book. Scan four or five
>> >> pages after optimization to determine if any adjustments in settings
>> >> need to be made. Kurzweil does a fairly good job picking the optimal
>> >> settings to scan a particular book unless the print quality is
>> >> exceptionally bad. If you're planning to scan in two-page mode, you can
>> >> turn this back on once you're finished with optimization.
>> >>
>> >> 4. When in doubt, go for grey-scale. Grey-scale is the best and most
>> >> reliable thing to try when optimization doesn't produce the quality
>> >> that you need. Try grey-scale with brightness of around 65 and a
>> >> resolution of 300 DPI. It's really great for scanning mass market
>> >> paperbacks. Grey-scale will make your scans slower, and its scanned
>> >> images are larger than those made with static thresholding. It gives
>> >> the best page representation though, compared to other forms of
>> >> thresholding. If you're using a Canon or Visioneer scanner, grey-scale
>> >> will save your bacon! (grin) Please note that Openbook 7 doesn't
>> >> implement grey-scale correctly, so automatic contrast is probably your
>> >> best choice.
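For a rough sense of why grey-scale images are larger and slower, here is a small Python sketch of the arithmetic. The assumptions are mine: a 6 by 9 inch page, 8 bits per pixel for grey-scale, 1 bit per pixel for a black-and-white thresholded image, and no compression. Real files saved by a scanning program will differ.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw image size in bytes for one scanned page, ignoring compression."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Hypothetical 6 x 9 inch page.
grey_300 = uncompressed_bytes(6, 9, 300, 8)    # grey-scale at 300 DPI
bw_300 = uncompressed_bytes(6, 9, 300, 1)      # thresholded at 300 DPI
grey_400 = uncompressed_bytes(6, 9, 400, 8)    # grey-scale at 400 DPI

print(grey_300, bw_300, grey_400)
print(grey_300 / bw_300)    # grey-scale vs. thresholded, same DPI
print(grey_400 / grey_300)  # 400 DPI vs. 300 DPI, same bit depth
```

Under these assumptions grey-scale carries eight times the raw data of a thresholded page, and going from 300 to 400 DPI adds roughly another 78 percent, which lines up with the slower scans described in this thread.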
>> >>
>> >> 5. Catch bad scans as they happen. There is a friendly debate among
>> >> submitters about whether to scan in batches or to scan pages and
>> >> recognize them one at a time. There are pros and cons on both sides. I
>> >> do a sort of modified batch style. I scan a book while on the phone or
>> >> doing something else but don't use the scan repeatedly feature for one
>> >> reason. I want to catch badly scanned pages as they happen. It saves me
>> >> from hunting for a page to rescan it later. So I scan a page and let my
>> >> scan recognize while I'm turning to the next page. I wait for Kurzweil
>> >> to tell me its confidence number. I make this really easy because I've
>> >> turned off the progress messages for Kurzweil's scanning and
>> >> recognition and have it set to play a chime when scanning and
>> >> recognition are finished. So if Kurzweil says something, it's the
>> >> confidence number letting me know that the page scanned below the
>> >> accuracy threshold I've set. If the statistics say 97 percent
>> >> confidence level or less, rescan the page to try for a better scan.
>> >> Otherwise, you will have to struggle with many errors on the page.
>> >>
>> >> 6. Your scanner needs TLC too. Books can be dirty or dusty sometimes.
>> >> Mass market paperbacks can leave a residue of ink dust on your scanner.
>> >> Keep the scanner glass clean by using a dry, lint-free cloth. Never use
>> >> anything wet like an alcohol pad or baby wipe. That will create little
>> >> bubbles under the scanner glass and will cause problems in future
>> >> scans.
>> >>
>> >> 7. When scanning a book, do a spot check every 15 or 20 pages. Look at
>> >> the last page or two of the file to make sure the settings are still
>> >> producing accurate results.
>> >>
>> >> 8. After doing a scan, run rank spelling. It will let you see your
>> >> spelling errors and will put them in the order of their prevalence in
>> >> your scan. If you find some words that Kurzweil doesn't know, you may
>> >> want to add them to your word list so they won't be flagged in future
>> >> scans. I don't do this for proper names unless it's a name that will
>> >> keep cropping up in future books. I do add words that are valid but
>> >> that Kurzweil doesn't have in its internal word list. You'll find that
>> >> doing this over time helps Kurzweil do a better job for you when you're
>> >> cleaning up your scans.
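For anyone cleaning up a plain-text export outside Kurzweil, the idea behind rank spelling can be sketched in a few lines of Python. This is not Kurzweil's own feature, just an approximation: the word list here is a tiny made-up set, and a real one would be much larger.

```python
import re
from collections import Counter


def rank_unknown_words(text, known_words):
    """Return (word, count) pairs for unknown words, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in known_words)
    return counts.most_common()


# Hypothetical word list and a scrap of badly recognized text.
known = {"said", "would", "go", "the", "door"}
sample = "Diey said diey would go dirough the door"
print(rank_unknown_words(sample, known))
```

Adding a valid word to the known set removes it from the ranking on the next run, which is the same effect as adding it to your word list.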
>> >>
>> >> 9. Keep the de-speckle setting turned off for most books. You may need
>> >> it with hardcover books because they sometimes have a text decoration
>> >> on the pages. Otherwise, de-speckle can interfere with OCR and actually
>> >> cause more errors than it solves.
>> >>
>> >> 10. Using auto-corrections when scanning is another issue where there
>> >> is debate. I believe it can be a good thing if used carefully. I should
>> >> note that Gerald has pointed out that Openbook has some
>> >> auto-corrections that cause problems with books and should be fixed by
>> >> users of that program. Kurzweil seems to do a good job for me, and it
>> >> makes my work easier. I loaded up a bunch of my older scans that have
>> >> been lurking on my hard drive for over a decade and ran auto-correction
>> >> on them. What an improvement! I might actually get to submit some of
>> >> them now. Here are a few auto-corrections I have added to my Kurzweil
>> >> list.
>> >>
>> >> dirough for through
>> >> diough for though
>> >> diought for thought
>> >> diey for they
>> >> diere for there
>> >> dieir for their
>> >> cornpany for company
>> >> cornfortable for comfortable
>> >> tiiing for thing
>> >> rnany for many
>> >> anydiing for anything
>> >>
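The list above can also be applied to a plain-text file outside Kurzweil. Here is a minimal Python sketch, assuming whole-word matching and simple handling of a capitalized first letter. It is not how Kurzweil applies its corrections, only a way to try the same pairs on text you already have.

```python
import re

# Misrecognized form -> intended word, taken from the list above.
CORRECTIONS = {
    "dirough": "through",
    "diough": "though",
    "diought": "thought",
    "diey": "they",
    "diere": "there",
    "dieir": "their",
    "cornpany": "company",
    "cornfortable": "comfortable",
    "tiiing": "thing",
    "rnany": "many",
    "anydiing": "anything",
}

# \b keeps the match to whole words so longer words are left alone.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in CORRECTIONS) + r")\b",
    re.IGNORECASE,
)


def apply_corrections(text):
    """Replace each listed misrecognition with its intended word."""
    def fix(match):
        word = match.group(0)
        fixed = CORRECTIONS[word.lower()]
        # Preserve a leading capital, e.g. at the start of a sentence.
        return fixed.capitalize() if word[0].isupper() else fixed
    return _PATTERN.sub(fix, text)


print(apply_corrections("Diey went dirough the cornpany gate."))
```

Every left-hand entry here is a string that is not a real English word, which is what makes these pairs safe to apply everywhere.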
>> >>
>> >> If you use Openbook, you may want to remove a few of the corrections in
>> >> its default list. I regularly find these in books scanned in Openbook
>> >> and have to fix them as I read.
>> >>
>> >> modem for modern
>> >> torn for tom
>> >> glock for clock
>> >> morn for mom
>> >> bum for burn
>> >> corn for com
>> >>
>> >> That last one causes problems for anyone scanning Star Trek books
>> >> because Kirk presses his corn badge to talk to the ship. (grin) If a
>> >> word like command is hyphenated between two pages, you get corn-mand.
>> >> Meanwhile, Batman dials into the internet with his modern, tries to
>> >> stop a crook named torn from shooting him with a clock, and puts the
>> >> dirty burn in cuffs until mom-ing. See how auto-corrections can go
>> >> wrong if you're not careful?
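One way to spot risky auto-corrections before they bite is to check whether the text being replaced is itself a real word. The Python sketch below assumes each rule maps the replaced text to its replacement, with directions inferred from the Batman and Star Trek examples above. The list of valid words is a tiny hypothetical stand-in for a real dictionary.

```python
def risky_rules(rules, valid_words):
    """Return the rules whose replaced text is itself a legitimate word."""
    return {old: new for old, new in rules.items() if old in valid_words}


# Hypothetical rule set: replaced text -> replacement.
rules = {
    "modem": "modern",     # breaks books that mention a modem
    "com": "corn",         # breaks "com badge" and hyphenated "com-mand"
    "dirough": "through",  # safe: "dirough" is never a real word
}

# Stand-in for a real dictionary of valid words.
valid = {"modem", "modern", "com", "corn", "through"}

print(risky_rules(rules, valid))
```

A rule that only ever fires on a non-word, like the "dirough" one, cannot damage text that was recognized correctly, so it never shows up in the result.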
>> >>
>> >> Whew! We've made it to the end. (grin) I hope some of this makes your
>> >> scans easier to work with. It'll give you a foundation to start from
>> >> anyhow. Clean-up tips will be another email and will take some thought.
>> >> I'm better at doing than explaining things. I do have a system I use
>> >> though. I just haven't really written it down. Anyone got a cold Dr.
>> >> Pepper to share?
>> >>
>> >> --
>> >> Monica Willyard
>> >>
>> >>
>> >>
>> >> To unsubscribe from this list send a blank Email to
>> >> bksvol-discuss-request@xxxxxxxxxxxxx
>> >> put the word 'unsubscribe' by itself in the subject line.  To get a
>> >> list of available commands, put the word 'help' by itself in the
>> >> subject line.
>> >>
>> >
>> >
>>
>>
>>
>>
>>
>
>
>


-- 
Chela
E-Mail: cdrobles693@xxxxxxxxx
WindowsLiveMessenger Only (PLEASE E-Mail ME BEFORE ADDING ME TO YOUR
CONTACTS!): cdrobles693@xxxxxxxxxxx

Blog: http://www.1jtcdr.wordpress.com
Facebook: http://www.facebook.com/profile.php?id=690550695 (PLEASE
E-MAIL ME BEFORE I EITHER ADD OR REJECT YOUR FACEBOOK REQUESTS!)
MySpace: http://www.myspace.com/chelarobles (AGAIN, PLEASE E-MAIL ME
BEFORE YOUR REQUEST IS EITHER REJECTED OR ACCEPTED!)
Skype: jazzytrumpet (E-Mail ME ABOUT ADDING YOU ONTO MY SKYPE CONTACTS
BEFORE I GIVE A CONFIRMATION OF YES OR NO!)
Mobile Phone: (925) 250-5955 (ONLY FOR THOSE WHO WORK WITH ME, OR WHO KNOW ME)
