[bksvol-discuss] Re: Openbook Settings

  • From: "Evan Reese" <mentat1@xxxxxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Mon, 12 Jun 2006 07:57:39 -0700

As far as column detection goes, I kept it on while scanning a book and got hard returns at the end of every line. So I think it should be turned off unless you have text in columns where you would certainly need it, such as an index. My guess is that keeping it on makes the scanner treat every line of text as part of a column that must stay separate from the text beside it on the same line, so it puts in hard returns to keep the columns apart. But that's just a guess. What I do know, having tried it, is that this is not what you want in your text unless it really is in columns.
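
If a scan has already come out with a hard return at the end of every line, the text can be repaired afterwards. Here is a minimal sketch in Python, assuming paragraphs are separated by blank lines in the scanned text. The name `unwrap_hard_returns` is a hypothetical helper, not something OpenBook provides.

```python
import re


def unwrap_hard_returns(text: str) -> str:
    """Join the lines inside each paragraph into one line.

    Assumption: a blank line marks a paragraph break. If the scan has no
    blank lines between paragraphs, this will merge everything together.
    """
    # Split the text into paragraphs at one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    joined = []
    for para in paragraphs:
        # Drop the hard returns by joining the non-empty lines with spaces.
        lines = [line.strip() for line in para.splitlines() if line.strip()]
        joined.append(" ".join(lines))
    # Keep one blank line between paragraphs.
    return "\n\n".join(joined)
```

This does not rejoin words that were hyphenated across a line break, so those would still need checking by hand.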

----- Original Message -----
From: "Gerald Hovas" <GeraldHovas@xxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Sunday, June 11, 2006 10:23 PM
Subject: [bksvol-discuss] Re: Openbook Settings



Monica,

I keep all of them turned off.

It's my understanding that Despeckle can interfere with accent marks.

There's really no reason to have White on Black turned on unless you know
you're going to need it, and in my experience, that's rare. It's possible,
too, that having this on can cause the OCR software to try to find text in
places it shouldn't, like places where a library book has gotten a bit
dirty.


Also, unless you know you have a foreign language in the book and know which
to expect, there doesn't seem to be any advantage for having Language
Analysis turned on. I have a suspicion, too, that Language Analysis also
uses a form of AutoCorrection like OpenBook uses which can cause similar
problems because I sometimes seem to get better results having it off. If
you like to scan SciFi or Fantasy books or other books with unusual writing
styles like the Mitford books, then it probably does more harm than good.


BTW, I have OCR Correction turned off as well because you can't load
different dictionaries for different writing styles or classes of books, and
a single one in my opinion can't handle the job for as many different types
of books as I like to scan. It also has some known incorrect entries, so I
don't trust it even though I've fixed the ones I know about. When you get
right down to it, it's a form of global search and replace, and blindly
making those can bite you unless you really know what you're replacing, and
as I mentioned above, what is safe in one type of book might not be safe in
all types of books.
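
To illustrate the global search and replace point, here is a rough Python sketch. The correction entries are made up for illustration and are not taken from OpenBook's actual OCR Correction dictionary, and both function names are hypothetical. It shows a blind replacement next to a preview that only counts the matches so they can be reviewed first.

```python
import re

# Hypothetical correction entries, invented for this example only.
CORRECTIONS = {"arid": "and", "modem": "modern"}


def apply_corrections(text, corrections):
    """Replace every whole-word match of each entry, with no review step."""
    for wrong, right in corrections.items():
        text = re.sub(r"\b" + re.escape(wrong) + r"\b", right, text)
    return text


def preview_corrections(text, corrections):
    """Count whole-word matches per entry without changing the text."""
    return {
        wrong: len(re.findall(r"\b" + re.escape(wrong) + r"\b", text))
        for wrong in corrections
    }


# The same entry fixes one sentence and damages another.
print(apply_corrections("salt arid pepper", CORRECTIONS))  # salt and pepper
print(apply_corrections("an arid plain", CORRECTIONS))     # an and plain
print(preview_corrections("an arid plain", CORRECTIONS))   # {'arid': 1, 'modem': 0}
```

The entry that repairs a misread "and" in one book quietly breaks a legitimate "arid" in another, which is why a single dictionary applied to every type of book is risky.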


All of these settings just cause the OCR Software to have to think harder
and longer and can cause as many errors as they solve if not used under the
right conditions.


I also recommend keeping Column Recognition turned on because someone has
suggested that not having it on is the reason why some books have a Hard
Return added at the end of every line in the printed book. Still need to
confirm that, though. Besides, it can bail you out if you're scanning two
pages at once and the OCR software fails to realize that you've scanned two
pages and not a two-column page plus a blank page. You'll need to turn it
off, though, when scanning pages with tables, like the Table of Contents.


That's all I can think of off the top of my head.  Perhaps Jake or Pratik
has some other thoughts on the subject.

HTH

Gerald

-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Monica Willyard
Sent: Sunday, June 11, 2006 11:35 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Openbook Settings

I know there are several Openbook users on this list who turn out
some good scans.  I've found 3 settings in the scanning settings area
that I'd like to know how you handle.  There is a checkbox for
despeckle and another for white on black.  Finally, there is a
setting called language analyst.  I currently have all 3 of these
turned on.  Should I have these on by default, or are some of these
settings meant for specific and somewhat rare situations?  I scan
books exclusively, no newspapers or faxes.  I appreciate any feedback
anyone has to offer.  (smile)


Monica
Visit my blog at: http://plumlipstick.livejournal.com

To unsubscribe from this list send a blank Email to
bksvol-discuss-request@xxxxxxxxxxxxx
put the word 'unsubscribe' by itself in the subject line. To get a list of
available commands, put the word 'help' by itself in the subject line.

