Hi Monica,
I don't have any real answers to your dilemma as to how the book was
made, just a few general thoughts and comments.
I thought Kurzweil was supposed to open TIF files. Based on the release
notes, I know they fixed a problem with opening those files at one point.
Also, it's been a while since I pulled out a copy of IrfanView, but I think
it has an option for splitting a multipage TIF into single-page TIFs. I know
that JPG is a lossy format, so converting to it might lose some image
quality, whereas keeping the pages in TIF format, which doesn't have to be
lossy, might give better OCR results. Just some thoughts, though, and
something to think about trying if this strange phenomenon ever happens
again.
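In case it helps next time, here is a rough Python sketch (standard library
only) for checking how many pages a TIF really holds. A multipage TIF stores
one image directory per page, chained one after another, and a viewer that
reads only the first directory would show just the cover, which may be what
happened here. The function name count_tiff_pages is my own invention, and
the sketch assumes a classic TIFF (not the newer BigTIFF variant) and ignores
any nested sub-images, so treat it as an illustration rather than a
finished tool.

```python
import struct


def count_tiff_pages(data: bytes) -> int:
    """Count the top-level image directories (pages) in classic TIFF bytes."""
    # The first two bytes say whether numbers are little- or big-endian.
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")

    # Next come the magic number 42 and the offset of the first directory.
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file (magic number is not 42)")

    pages = 0
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("corrupt TIFF: directory chain loops")
        seen.add(offset)
        # Each directory: 2-byte entry count, 12 bytes per entry,
        # then a 4-byte offset to the next directory (0 means the end).
        (entry_count,) = struct.unpack_from(endian + "H", data, offset)
        next_pos = offset + 2 + 12 * entry_count
        (offset,) = struct.unpack_from(endian + "I", data, next_pos)
        pages += 1
    return pages
```

To try it, read the file in binary mode and pass the bytes in, for example
count_tiff_pages(open("book.tif", "rb").read()), where book.tif is just a
placeholder name. A result greater than 1 would suggest splitting the file
into single-page TIFs before running OCR.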
Glad you got something out of that file *grin*
I've been validating a strange file of unknown origins and am curious as
to how it was produced. I went to investigate the HTML question and
downloaded a book (Felicity Learns a Lesson by Valerie Tripp). This was
listed as HTML, but when I unzipped the file, it had a .tif extension.
Kurzweil thought it was gobbledygook. Since .tif is a graphics format, I
opened it with several programs and mostly got a picture of the cover page.
While Word tried unsuccessfully to open it, I noticed a bunch of things that
appeared to be page breaks. One program, Photoshop Essentials, gave me an
error message saying the file was designed to be viewed on a video monitor.
How strange. I was finally able to open it in IrfanView and convert all the
pages to .jpg. They appear to be unrecognized photocopies. So, I'm in the
process of OCRing and cleaning up the text. I'd sure like to know how it was
made so if I run into something like this again, it won't take me so long to
figure out how to process it.
Monica