[bksvol-discuss] Re: Question on Scan quality for blind vs other types of disabled bookshare readers

  • From: Guido Corona <guidoc@xxxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Fri, 8 Feb 2008 14:27:50 -0600

Evan,  what version of Kurzweil do you use, which engine?

G.
 

Guido Dante Corona
IBM Research,
Human Ability & Accessibility Center,   (HA&AC)
Austin Tx.
Phone:  512. 838. 9735.
Email: guidoc@xxxxxxxxxxx
Web:  http://www.ibm.com/able

". . . Maybe it was only those who were most certain they were right who 
were guaranteed to be wrong. And that maybe, just maybe, those who 
questioned the most were in the end those who came closest to being wise."
[David Poyer, The Command]




"EVAN REESE" <mentat3@xxxxxxxxxxx> 
Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxx
02/07/2008 04:10 PM
Please respond to
bksvol-discuss@xxxxxxxxxxxxx


To
<bksvol-discuss@xxxxxxxxxxxxx>
cc

Subject
[bksvol-discuss] Re: Question on Scan quality for blind vs other types of 
disabled bookshare readers






I also agree with this.
 
Besides, when Pratik advocates not changing fonts in the book, he is also 
making a rather large, unstated assumption, namely that OCR software is 
accurately recognizing font style, type and size. This may be a valid 
assumption in many cases, but in many others, it is clearly not valid. 
I've seen - and heard of - examples of K1000 changing its mind about the 
font size right in the middle of text, or after a rescanning of the same 
text with different settings. How could I possibly know which is the more 
accurate rendering? And what should I do if I am told that many of my 
page numbers are a size 2 font? That happened with one of my first scans 
with K1000. (I actually posted about this, and was told that this is an 
unreadably small font size, and I could at least tell with my Optacon that 
the page numbers were not really that small. Should I have left them as 
they were?) I often get page numbers being recognized at different sizes 
on adjacent pages when they are clearly the same size in the book. The 
same with the size of the font with other text, if I should close a scan 
session before finishing the book, or change scanner or OCR settings.
 
I'm sure many others could amplify these examples by a very large number. 
In fact, this thread was begun by a sighted person, pointing out 
readability problems that were almost certainly caused by OCR error.
 
So leaving the book as is with respect to font is a great idea in theory, 
but it assumes that the OCR software renders the book as it really 
appears, which it often does not. So I can do one of two things: I can 
either change the book to one font size, except for things like section 
headings, which may improve readability for sighted persons and may also 
very well return the book to something more like its original appearance, 
or I can let the chips fall where they may when it comes to font 
recognition and hope for the best. In many cases, probably, doing the 
latter will result in accurate recognition of font size and style, making 
for a good reading experience for sighted readers. But in many other 
cases, it may well make reading more difficult for sighted readers for 
reasons that have little or nothing to do with the appearance of the 
printed book but are due to misrecognition of the font size or style. I 
tend to do the latter anyway, just because I read through every word of 
each book I submit or validate and just don't want to spend the time 
bothering with fonts after having spent all that time trying to make sure 
that the text is first rate. But I have done the former on at least one 
occasion, and have increasing sympathy for doing it in the future, as I 
have become more aware of how whimsically changeable font recognition 
sometimes seems to be. But I - like Monica - am not going to check 
every font line by line and fix it. I read through my submissions and 
validations on a braille display, and don't get that info on the Pac Mate 
without manual checking, and have no desire to read them through again 
with font messages enabled with Speech on the PC, just to make sure every 
font is properly rendered. And besides, having no vision and only the use 
of an Optacon - something most people with no vision do not have access to 
- except in cases like the absurdly small font size I mentioned, how could 
I possibly know if the font I get from K1000 is really the one in the 
book? If the font size suddenly changes by a few points for no apparent 
reason, which it sometimes does, which should I take as the one that is 
really in the book?
 
In the case of the absurdly small page numbers, I did fix those. I went 
through this 500 plus page book and checked the font of every page number 
by hand and made the very small ones a size 9 or 10 or something like 
that. There weren't a huge number of size 2 font page numbers, but how 
could I know if I didn't check them all? (I hadn't thought of selecting 
the whole book and making it all one reasonable font size at that time.) 
But I have never done that kind of checking or fixing again, and don't 
plan to any time soon. I will either leave them entirely alone - except 
for section headings - and hope for the best, or make them all one size - 
except for section headings.
 
All of the above applies to formatting as well, which also can create 
readability problems, as I believe Judy mentioned in her original message. 
OCR software - including K1000 - does not always accurately render the 
formatting of a book, and sometimes makes changes for no apparent reason, 
which, I believe, someone else mentioned here. In addition to occasionally 
fixing fonts, I have also left justified the text in at least one or two 
books that I can recall, except for section headings, which I centered if 
they were centered in the book. Or I just let the formatting come out as 
the OCR software renders it, and again, hope for the best. But here again, 
I am in increasing sympathy with the notion of left justifying everything, 
except for section headings, and making some reasonable indent for the 
paragraphs, at least for fiction, which is usually just paragraphs of text 
anyway; but I have not made a habit of this kind of changing as of yet. 
However, once again, leaving the formatting as it is may also detract from 
the readability of a book by sighted readers, not because of the 
formatting of the original book, but because of inaccuracy of the 
recognition of that formatting by the software, or of its changing its 
mind as to what that formatting is for no apparent reason.
 
I recognize that what to do about these kinds of issues is something that 
reasonable people can disagree on. But merely saying that books should be 
left as they are is not adequate if they are not necessarily rendered as 
they are by even the best OCR software available.
 
Evan
 
----- Original Message ----- 
From: Jamie Yates, CPhT 
To: bksvol-discuss@xxxxxxxxxxxxx 
Sent: Wednesday, February 06, 2008 10:36 PM
Subject: [bksvol-discuss] Re: Question on Scan quality for blind vs other 
types of disabled bookshare readers

Lora, I agree with you. And here is the thing, I can see. So I know that 
the font might be different than the font in the book. The majority of 
volunteers are blind. How do they know if what they scanned isn't 
identical to the print book? So how can it matter so much? Because I can 
see, it is wrong for me to KNOWINGLY change the font of a book I scan or 
validate. But if I couldn't see, it wouldn't be wrong for me to 
UNKNOWINGLY change the font or size.

Right? 
 
But isn't the whole point of Bookshare to make books ACCESSIBLE to 
everybody? In a large print book the publisher changes the font size. In 
an audio book the publisher changes many things about the book including 
eliminating headers and page numbers. It seems to me that Bookshare should 
have some leniency here. But it also seems a simple solution for the 
sighted print disabled Bookshare member would be for that member to change 
the font size in his or her own digital copy of the book to a workable 
font for that user.


Jamie in Michigan 
Currently Reading - When death comes stealing : a Tamara Hayle mystery / 
Valerie Wilson Wesley 
