I also use FineReader, but with different settings. With paperbacks I'll set the resolution at 600 dpi. If I'm using the automatic document feeder, I'll do everything at 600 if I'm not in a hurry. When you start getting blobs on pages in the same place, it's probably time to clean the glass on the scanner.

I let FineReader set the brightness automatically. If the paper is dark and you need more contrast, changing the document type from black and white to greyscale or color might help. Do you let it automatically Analyze Page Layout? It sets the OCR area very close to the print margins, thereby eliminating a lot of junk characters. I also use the ABBYY interface instead of the TWAIN interface.

One trick I use to get rid of non-text junk like random vertical bars after sending the document to Word is to save it as .rtf, open it in Kurzweil 1000, then save it as .rtf in Kurzweil. When I open it in Word again, the junk graphic elements are all gone because Kurzweil will strip the graphics while preserving the text formatting.

FineReader does a few other things when going to Word that enhance the visual layout of documents but are not needed for us. For example, extra white space after something like a chapter number is controlled by "Space Before" in the Paragraph section of the Format menu. Headings, page numbers, and text that's set apart from the main body are often set up as separate columns and/or section breaks. Changing or eliminating columns can result in really garbled text. Many editing changes in Word are automatically applied only to the current "section", so a problem you thought you had fixed by selecting the whole document mysteriously appears again. None of these oddities are difficult to fix once the cause is puzzled out.

I'm not sure there isn't a fairy living in the OCR engine, though, because its recognition is like magic.

M in M (Monica in Maryland)
--- Begin Message ---
- From: "GenePoole" <captinlogic@xxxxxxxxx>
- To: <bksvol-discuss@xxxxxxxxxxxxx>
- Date: Wed, 24 Jan 2007 02:14:09 -0500
Is there a decent way of eliminating or greatly reducing junk characters during a scan? It seems no matter how flat the book is on the scanner, I always get 1's and i's and j's and brackets in the oddest places, and big chunks of white space at the ends of lines. Ideas? Thanks. Oh, I'm using FineReader 8 with a resolution of 300 and manual brightness adjustment.
--- End Message ---
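
For anyone who ends up with plain text and still sees the stray 1's, i's, j's, and brackets described in this thread, here is a rough Python sketch of a cleanup pass. It is not something FineReader, Word, or Kurzweil provides; the names JUNK_TOKEN, clean_line, and clean_text are made up for illustration, and the choice of which characters count as junk is only an assumption based on the examples in the question. It removes white space at the ends of lines and drops junk-only fragments found at the very start or end of a line.

```python
import re

# Assumption: a token made up only of these characters is probably OCR noise.
# The set comes from the examples in the thread (1, i, j, brackets, vertical bars).
JUNK_TOKEN = re.compile(r"[1ij|\[\]{}]+")


def clean_line(line):
    """Remove edge white space and junk-only tokens at the start or end of a line."""
    # split() with no arguments discards leading/trailing white space
    # and also collapses runs of spaces between words.
    tokens = line.split()
    # Drop junk tokens from the end of the line.
    while tokens and JUNK_TOKEN.fullmatch(tokens[-1]):
        tokens.pop()
    # Drop junk tokens from the start of the line.
    while tokens and JUNK_TOKEN.fullmatch(tokens[0]):
        tokens.pop(0)
    return " ".join(tokens)


def clean_text(text):
    """Apply clean_line to every line of a block of text."""
    return "\n".join(clean_line(line) for line in text.splitlines())
```

A caution on the design: because the pattern includes the digit 1, a genuine line such as "Chapter 1" would lose its number, so a pass like this is best treated as a first sweep on a copy of the file, followed by a normal proofread. Junk in the middle of a line is left alone on purpose, since that is where a real "1" is most likely to appear.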