[bksvol-discuss] Re: Question on Scan quality for blind vs other types of disabled bookshare readers

  • From: "EVAN REESE" <mentat3@xxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Thu, 07 Feb 2008 17:10:43 -0500

I also agree with this.

Besides, when Pratik advocates not changing fonts in the book, he is also 
making a rather large, unstated assumption, namely that OCR software is 
accurately recognizing font style, type and size. This may be a valid 
assumption in many cases, but in many others, it is clearly not valid. I've 
seen - and heard of - examples of K1000 changing its mind about the font size 
right in the middle of text, or after a rescanning of the same text with 
different settings. How could I possibly know which is the more accurate 
rendering? And what should I do if I am told that many of my page numbers are 
a size 2 font, as happened with one of my first scans with K1000? (I 
actually posted about this, and was told that this is an unreadably small font 
size, and I could at least tell with my Optacon that the page numbers were not 
really that small. Should I have left them as they were?) I often get page 
numbers being recognized at different sizes on adjacent pages when they are 
clearly the same size in the book. The same happens with the font size of 
other text if I close a scan session before finishing the book, or change 
scanner or OCR settings.

I'm sure many others could add a great many examples of their own. In fact, 
this thread was begun by a sighted person pointing out readability problems 
that were almost certainly caused by OCR error.

So leaving the book as is with respect to font is a great idea in theory, but 
it assumes that the OCR software renders the book as it really appears, which 
it often does not. So I can do one of two things: I can either change the book 
to one font size, except for things like section headings, which may improve 
readability for sighted persons and may also very well return the book to 
something more like its original appearance, or I can let the chips fall where 
they may when it comes to font recognition and hope for the best. In many 
cases, probably, doing the latter will result in accurate recognition of font 
size and style, making for a good reading experience for sighted readers. But 
in many other cases, it may well make reading more difficult for sighted 
readers for reasons that have little or nothing to do with the appearance of 
the printed book but are due to misrecognition of the font size or style. I 
tend to do the latter anyway, just because I read through every word of each 
book I submit or validate and just don't want to spend the time bothering with 
fonts after having spent all that time trying to make sure that the text is 
first rate. But I have done the former on at least one occasion, and have 
increasing sympathy for doing it in the future, as I have become more aware of 
how whimsically changeable font recognition sometimes seems to be. 
But I - like Monica - am not going to check every font line by line and fix it. 
I read through my submissions and validations on a braille display, and don't 
get that info on the Pac Mate without manual checking, and have no desire to 
read them through again with font messages enabled with Speech on the PC, just 
to make sure every font is properly rendered. And besides, having no vision and 
only the use of an Optacon - something most people with no vision do not have 
access to - except in cases like the absurdly small font size I mentioned, how 
could I possibly know if the font I get from K1000 is really the one in the 
book? If the font size suddenly changes by a few points for no apparent reason, 
which it sometimes does, which should I take as the one that is really in the 
book?

In the case of the absurdly small page numbers, I did fix those. I went through 
this 500 plus page book and checked the font of every page number by hand and 
made the very small ones a size 9 or 10 or something like that. There weren't a 
huge number of size 2 font page numbers, but how could I know if I didn't check 
them all? (I hadn't thought of selecting the whole book and making it all one 
reasonable font size at that time.) But I have never done that kind of checking 
or fixing again, and don't plan to any time soon. I will either leave them 
entirely alone - except for section headings - and hope for the best, or make 
them all one size - except for section headings.
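
For anyone who would rather automate that kind of check than do it by hand, 
here is a minimal sketch in Python. It is only an illustration under stated 
assumptions: it treats the book as a list of (text, font size) pairs, which is 
a made-up representation and not anything K1000 exports, and the 6-point 
cutoff and the function names are likewise hypothetical. It leaves larger 
sizes such as section headings alone and resets only implausibly small runs to 
the size that covers most of the text.

```python
from collections import Counter

# Assumed cutoff: any run smaller than this is treated as an OCR misreading.
MIN_READABLE_PT = 6


def dominant_size(runs):
    """Return the font size that covers the most characters of text."""
    weight = Counter()
    for text, size in runs:
        weight[size] += len(text)
    return weight.most_common(1)[0][0]


def fix_tiny_runs(runs, min_pt=MIN_READABLE_PT):
    """Reset implausibly small runs to the dominant size; keep everything else."""
    if not runs:
        return []
    body = dominant_size(runs)
    return [(text, body if size < min_pt else size) for text, size in runs]


# Hypothetical sample: a heading, two body lines, and a mis-sized page number.
runs = [
    ("Chapter One", 16),
    ("It was a dark and stormy night.", 11),
    ("12", 2),
    ("The rain fell in torrents.", 11),
]
fixed = fix_tiny_runs(runs)
```

In this sample the page number "12" would come back at size 11 while the 
size 16 heading is left untouched. This only catches sizes that are obviously 
impossible; it cannot tell which of two plausible sizes is the one really in 
the book.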

All of the above applies to formatting as well, which also can create 
readability problems, as I believe Judy mentioned in her original message. OCR 
software - including K1000 - does not always accurately render the formatting 
of a book, and sometimes makes changes for no apparent reason, which I 
believe someone else mentioned here. In addition to occasionally fixing fonts, 
I have also left justified the text in at least one or two books that I can 
recall, except for section headings, which I centered if they were centered in 
the book. Or I just let the formatting come out as the OCR software renders it, 
and again, hope for the best. But here again, I am in increasing sympathy with 
the notion of left justifying everything, except for section headings, and 
making some reasonable indent for the paragraphs, at least for fiction, which 
is usually just paragraphs of text anyway; but I have not made a habit of this 
kind of changing as of yet. However, once again, leaving the formatting as it 
is may also detract from the readability of a book by sighted readers, not 
because of the formatting of the original book, but because of inaccuracy of 
the recognition of that formatting by the software, or of its changing its mind 
as to what that formatting is for no apparent reason.

I recognize that what to do about these kinds of issues is something that 
reasonable people can disagree on. But merely saying that books should be left 
as they are is not adequate if they are not necessarily rendered as they are by 
even the best OCR software available.

Evan

  ----- Original Message ----- 
  From: Jamie Yates, CPhT 
  To: bksvol-discuss@xxxxxxxxxxxxx 
  Sent: Wednesday, February 06, 2008 10:36 PM
  Subject: [bksvol-discuss] Re: Question on Scan quality for blind vs other 
types of disabled bookshare readers


  Lora, I agree with you. And here is the thing, I can see. So I know that the 
font might be different than the font in the book. The majority of volunteers 
are blind. How do they know if what they scanned isn't identical to the print 
book? So how can it matter so much? Because I can see, it is wrong for me to 
KNOWINGLY change the font of a book I scan or validate. But if I couldn't see, 
it wouldn't be wrong for me to UNKNOWINGLY change the font or size.

  Right? 

  But isn't the whole point of Bookshare to make books ACCESSIBLE to 
everybody? In a large print book the publisher changes the font size. In an 
audio book the publisher changes many things about the book including 
eliminating headers and page numbers. It seems to me that Bookshare should have 
some leniency here. But it also seems a simple solution for the sighted print 
disabled Bookshare member would be for that member to change the font size in 
his or her own digital copy of the book to a workable font for that user.


  Jamie in Michigan 
  Currently Reading - When death comes stealing : a Tamara Hayle mystery / 
Valerie Wilson Wesley 
