It's perhaps worth noting that K1000 will attempt to identify "real"
paragraph endings in a text file that is formatted such that each line is a
single-line paragraph. It's not perfect at doing so - end-of-paragraph
decisions are actually quite hard to make without understanding the text
that is being read - but it's better than nothing. So, you can open a text
file that is formatted in that manner, and then save the opened file as,
say, RTF, which is a more intelligent format in terms of differentiating
between lines and paragraphs. Or you can save it again as text, after
setting the maximum line length in the general settings dialog to some very
large number. That will cause each paragraph, or at least the K1000's
notion of a paragraph, to be written on a single line, which is probably
what you want if you must use text.
Stephen
At 08:06 AM 6/8/2005, you wrote:
Monica, check whether real paragraphs are perhaps denoted by two paragraph marks in sequence. If that is the case, do a mass replacement of paragraph mark pairs with a unique word. Then do a mass deletion of all remaining paragraph marks. Finally, replace the unique word with paragraph mark pairs again.
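For anyone who would rather script those three steps than run them in a word processor, here is a minimal sketch in Python. It assumes the file has been saved as plain text, so that each paragraph mark is a newline and a real paragraph break is two newlines in a row. The function name join_wrapped_lines and the placeholder @@PARA@@ are made up for illustration. One deliberate difference from the steps above: the leftover single marks are replaced with a space rather than deleted outright, so that the last word of one line does not run into the first word of the next.

```python
def join_wrapped_lines(text, placeholder="@@PARA@@"):
    """Join hard-wrapped lines, keeping breaks marked by a blank line."""
    # Normalize Windows and old Mac line endings to a plain newline.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # The "unique word" must not already occur in the document.
    if placeholder in text:
        raise ValueError("placeholder already appears in the text")

    # Step 1: protect real paragraph breaks (pairs of marks).
    text = text.replace("\n\n", placeholder)
    # Step 2: remove the remaining single marks (a space is an assumption here).
    text = text.replace("\n", " ")
    # Step 3: put the real paragraph breaks back.
    return text.replace(placeholder, "\n\n")


sample = "This paragraph was\nsplit by the OCR.\n\nSo was\nthis one."
print(join_wrapped_lines(sample))
```

If the OCR output already ends each line with a space, this will leave double spaces behind, which a further replace of two spaces with one would tidy up.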
Guido
Guido Dante Corona IBM Accessibility Center, Austin Tx. Research Division, Phone: 512. 838. 9735. Email: guidoc@xxxxxxxxxxx Web: http://www.ibm.com/able
"Monica Ballard" <MBallard1@xxxxxxxxxxx> Sent by: bksvol-discuss-bounce@xxxxxxxxxxxxx
06/08/2005 05:58 AM Please respond to bksvol-discuss
To <bksvol-discuss@xxxxxxxxxxxxx> cc Subject [bksvol-discuss] Validating and Paragraph Marks
OCR puts those paragraph symbols in?
I've seen lots of files like that. Sometimes a line-by-line match is important, so it must be common for many OCRs. To get around the tediousness of not being able to do a global replace on them, sometimes it's faster to go through the document and paste a unique word at every genuine paragraph break, then do a global replace to get rid of all paragraph marks. Finally, I'll go back and replace my unique word with paragraph marks.
Monica