[bksvol-discuss] Re: text quality

  • From: "Pratik Patel" <pratikp1@xxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Thu, 29 Apr 2004 16:41:30 -0400

Paul and Noel,
 
Since you've both expressed the same questions, let me ask something in
return.  How do you characterize your results?  Do you generally gauge
the accuracy of your results via the self-reported recognition stats, or
by looking at the number of spelling mistakes per page when you change
various settings?  The recognition stats presented by Kurzweil are often
misleading.  What is given to us is often taken directly from the OCR
engine in question.  With those types of self-reports, there is always a
matter of accuracy and reliability.  There is also a matter of validity.
I have experimented with this issue a bit and have found that even with
the same settings, if you scan the same page several times, the
supplied stats are different each time, in some cases by a large margin.
However, when you look at the actual accuracy, the results are quite
consistent across different scans of a page with the same settings.
Even when the Optimize Scanning feature comes up with different settings
for the same page on repeated runs, the accuracy is not affected too
often.  The gray scale with 400 DPI does make a large difference.  Even
when I use Optimize Scanning, I make sure that at the end I compare the
results against a scan using gray scale at 400 DPI.
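
As a rough illustration of judging scans by counted errors rather than
by the engine's self-reported figure, here is a minimal Python sketch.
The function names, the tiny word list, and the sample page text are all
made up for illustration; nothing here comes from Kurzweil itself, and a
real comparison would need a proper dictionary and the exported text of
each scan.

```python
import re


def misspelling_rate(page_text, known_words):
    """Return the fraction of words on a page that are not in known_words."""
    words = re.findall(r"[A-Za-z']+", page_text.lower())
    if not words:
        return 0.0
    unknown = [w for w in words if w.strip("'") not in known_words]
    return len(unknown) / len(words)


def compare_scans(scans, known_words):
    """Return (label, rate) pairs sorted from fewest to most unknown words."""
    rates = {label: misspelling_rate(text, known_words)
             for label, text in scans.items()}
    return sorted(rates.items(), key=lambda item: item[1])


# Hypothetical word list and hypothetical OCR output for the same page.
known = {"the", "quick", "brown", "fox", "jumps"}
scans = {
    "gray scale, 400 DPI": "The quick brown fox jumps",
    "optimized settings": "Tbe quick brovvn fox jumps",
}

for label, rate in compare_scans(scans, known):
    print(f"{label}: {rate:.0%} of words not in the word list")
```

A count like this is only a proxy, since proper names and unusual words
will show up as "unknown" even when they were recognized correctly, but
it stays the same for the same text, which the self-reported stats do
not.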
 
Pratik
 
 

Pratik Patel 
Managing Director 
CUNY Assistive Technology Services 
the City University of New York 
(718) 997-3775 
ppatel@xxxxxx 

-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Edwards, Paul
Sent: Thursday, April 29, 2004 4:25 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: text quality


Actually, I knew that.  So of course, the question becomes: if it makes
no difference, why do different values turn up when you do "optimize
scan"?  Perhaps there is a Kurzweil guru lurking.  Would you care to
emerge from the lurk and answer the question?
 
Paul
 
 

Paul Edwards, Director
Access Services, North Campus
Phone: (305) 237-1146
Fax: (305) 237-1831
TTY: (305) 237-1413
Email: pedwards@xxxxxxxx
home email: edwpaul@xxxxxxxxxxx 

-----Original Message-----
From: Guido Corona [mailto:guidoc@xxxxxxxxxx]
Sent: Thursday, April 29, 2004 4:02 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: text quality



Paul, I also use grayscale at 400 DPI most of the time with Kurzweil
8.0.  If you find it is rather slow, scan images only, then turn on
pure recognition before going to sleep.  Your book will be ready when
you wake in the morning, no matter how large it is.  By the way, with
grayscale, brightness makes no difference.

Guido 


Guido D. Corona
IBM Accessibility Center,  Austin Tx.
IBM Research,
Phone:  (512) 838-9735
Email: guidoc@xxxxxxxxxxx

Visit my weekly Accessibility WebLog at:
http://www-3.ibm.com/able/weblog/corona_weblog.html





"Edwards, Paul" <pedwards@xxxxxxxx> 
Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxx
Date: 04/29/2004 02:42 PM
Please respond to: bksvol-discuss
To: <bksvol-discuss@xxxxxxxxxxxxx>
Subject: [bksvol-discuss] Re: text quality

This is a difficult issue.  I take the approach of carefully checking
the first few pages at the beginning of a scan.  If there are errors I
can adjust for, I do that.  I also rescan pages whose value in Kurzweil
comes back lower than ninety.  I do not tend to rescan pages that score
ninety to ninety-five because I usually cannot make much of a
difference, and we are often dealing with a screwed over heading or
something.

However, I recently scanned a book which was a hardcover and which
should have scanned like a dream, and it came out as pure dreck.

I have found that optimizing scanning is, for the most part, worth
doing.  The results do not always make me happy, in that I am now
scanning a book using gray scale and sixty, which takes forever to scan.
By the way, it is Legends II, edited by Robert Silverberg.

Paul


Paul Edwards, Director
Access Services, North Campus
Phone: (305) 237-1146
Fax: (305) 237-1831
TTY: (305) 237-1413
Email: pedwards@xxxxxxxx
home email: edwpaul@xxxxxxxxxxx

-----Original Message-----
From: Kellie Hartmann [mailto:kellhart@xxxxxxxxxx]
Sent: Thursday, April 29, 2004 1:04 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] text quality


Hi all.
Even with the wonderful new scanning software available there are a few
kinds of things that are very difficult to get a good scan from. For
example, linguistics books are often very graphical in nature and
contain symbols that the OCR packages don't recognize; things like
r-underring and turned V etc. Also some cheap paperbacks do have places
where they seem to be blurred. I scanned a novel that I was assigned to
read in French class, and when I found illegible passages I tried
rescanning them. I rescanned several times changing various settings,
but certain passages absolutely refused to scan. I don't really plan to
submit it to Bookshare anyway, but I would prefer this scan, with a
couple of blurred lines every 20 pages or so, to no scan. I'm able to
use this in class with no problems, so in my opinion this is far better
than nothing.

Finally, I have another French book which has very glossy pages and lots
of flashy graphical design. Again, even with a lot of work on
experimenting with different settings my results were not encouraging.
This I definitely won't submit to Bookshare because I can't get it in
good enough shape; the effort required would be far beyond the benefits.

I agree that careless scanning is unreasonable, and think that
validating is important. It always takes me much longer to validate
something than to scan it because I read the whole book and fix every
error that can possibly be fixed. Not every validator is going to do
that, and certain books, such as enormous textbooks, really would
require a great investment in time to proof thoroughly. So it isn't
realistic to expect every book to be flawless. What I would really like
eventually, and I know this isn't realistic either, would be to have all
the fair-quality books rescanned.
Kellie





