[liblouis-liblouisxml] Re: Paragraphs in Text Files

  • From: Keith Creasy <kcreasy@xxxxxxx>
  • To: "liblouis-liblouisxml@xxxxxxxxxxxxx" <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Fri, 28 Mar 2014 11:58:08 +0000

Why not add it to BrailleBlaster in Java? Why Python considering that nothing 
else is a Python script in this project?


I considered this and might just do it. Probably would take me about an hour. :)

K


-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx 
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Michael Whapples
Sent: Friday, March 28, 2014 7:34 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: Paragraphs in Text Files

John, yes, that is a good example about the PDF-to-text conversion tools. My 
experience of those is also that they put in hard line breaks, because they are 
trying to maintain the page layout. (It's a good job those tools do: for some 
really awful PDF documents, the only way I could read them was to maintain the 
page layout and read using a Braille display so I could find the various columns.)

Also OCR text output might be that way as well, again to maintain page layout.

Considering all these possible variations and the time spent on discussing 
this, I feel a more constructive option might be just to do external filters 
with the handling you want. How long would it take to write a Python script 
which reads a text file, wrapping the whole document in 
<html><body>...</body></html> tags, and then wrapping each line in <p>...</p> 
tags? I think not long, possibly a similar time to writing a few emails on this 
topic (e.g. I probably could have done it in the time I have spent discussing 
this topic already). A Java filter would probably take a little more time.
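
A minimal sketch of such a filter, for illustration only. The function names 
text_to_html and filter_stream are hypothetical, and two choices here are 
assumptions not stated in the thread: blank lines are skipped rather than 
turned into empty paragraphs, and the characters &, < and > are escaped so the 
output stays well-formed.

```python
import html


def text_to_html(text):
    """Wrap every non-blank line of `text` in <p>...</p> inside <html><body>."""
    paragraphs = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:  # assumption: blank lines produce no paragraph
            # Escape &, < and > so the line cannot break the markup.
            paragraphs.append("<p>" + html.escape(stripped) + "</p>")
    return "<html><body>\n" + "\n".join(paragraphs) + "\n</body></html>\n"


def filter_stream(src, dst):
    """Read plain text from file object `src`, write the HTML to `dst`."""
    dst.write(text_to_html(src.read()))
```

To use it as an external filter, one could call 
filter_stream(sys.stdin, sys.stdout) after importing sys, and pass the result 
on to the XML/HTML input path instead of the text-file path.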

Michael Whapples
On 28/03/2014 11:21, John J. Boyer wrote:
> Again, I see lots of text files with line breaks within paragraphs. 
> For example, when I convert a PDF for Scientific American to text using Adobe 
> Reader, I get short lines with blank lines between paragraphs. Also TeX uses 
> blank lines between paragraphs.
>
> A configuration setting is a good idea, but the default should be the current 
> behavior. We do have to keep in mind what users of the software expect.
>
>                 John
>
> On Fri, Mar 28, 2014 at 07:26:19AM +0000, Keith Creasy wrote:
>> Hi Michael. It is also a very old editor, previously known as Pico 
>> and most likely older than you. :)
>>
>> It is used by a relatively small number of mostly Linux programmers and 
>> administrators and not much if at all for general content authoring. It is 
>> my editor of choice for Linux scripts but I don't even use it for much else 
>> but that and writing commit log messages from within Git and Mercurial.
>>
>> Again, I'm fine with a configuration to make it optional, but the default 
>> should be that new-line chars mark the beginning of a paragraph, with the 
>> configuration option reserved for the rare occasions when someone might 
>> actually be using an obscure text file that contains hard line breaks. My 
>> opinion is that the current implementation is wrong based on current conventions.
>>
>> Keith
>>
>>
>> Yes, I regard Nano as an isolated exception
>>
>> -----Original Message-----
>> From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx 
>> [mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of 
>> Michael Whapples
>> Sent: Thursday, March 27, 2014 4:05 PM
>> To: liblouis-liblouisxml@xxxxxxxxxxxxx
>> Subject: [liblouis-liblouisxml] Re: Paragraphs in Text Files
>>
>> I don't know how widespread text files with hard line breaks are, but I 
>> believe that in some text editors (e.g. nano on Linux), if one uses line 
>> wrap it will insert hard line breaks. Maybe nano is an isolated example.
>>
>> The ideal for the user would be to add a configuration option.
>>
>> Michael Whapples
>> On 27/03/2014 18:05, Keith Creasy wrote:
>>> OK. Does anyone use this for text files with hard line breaks in them? Can 
>>> we just change it so it uses new-line chars to start new paragraphs as the 
>>> default? I haven't seen a text file with hard line breaks in it since about 
>>> 1986.
>>>
>>> Keith
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx
>>> [mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of John 
>>> J. Boyer
>>> Sent: Thursday, March 27, 2014 11:36 AM
>>> To: liblouis-liblouisxml@xxxxxxxxxxxxx
>>> Subject: [liblouis-liblouisxml] Re: Paragraphs in Text Files
>>>
>>> Not yet. Besides a new configuration setting, it would be necessary to 
>>> modify the functions in transcriber.c that deal with text files.
>>>
>>> John
>>>
>>> On Thu, Mar 27, 2014 at 02:51:59PM +0000, Keith Creasy wrote:
>>>> Hello.
>>>>
>>>> Is there a setting that would cause LibLouisUTDML to treat each new-line 
>>>> in a text file as the beginning of a new paragraph? By default it seems to 
>>>> require a blank line, that is two new-line characters, to do this. Most 
>>>> text editors these days use a new-line to begin a new paragraph.
>>>>
>>>> Thanks.
>>>>
>>>> Keith
>>>>
>>> --
>>> John J. Boyer; President, Chief Software Developer Abilitiessoft, Inc.
>>> http://www.abilitiessoft.com
>>> Madison, Wisconsin USA
>>> Developing software for people with disabilities
>>>
>>> For a description of the software, to download it and links to 
>>> project pages go to http://www.abilitiessoft.com
