[liblouis-liblouisxml] Re: output from file2brl

  • From: Greg Kearney <gkearney@xxxxxxxxx>
  • To: "liblouis-liblouisxml@xxxxxxxxxxxxx" <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Mon, 19 May 2014 17:51:34 -0700

I am using the default configuration file.

Sent from my iPhone

Greg Kearney
Commonwealth Braille and Talking Book Cooperative

> On 19 May 2014, at 4:32 pm, "Michael Whapples" <dmarc-noreply@xxxxxxxxxxxxx> 
> (Redacted sender "mwhapples@xxxxxxx" for DMARC) wrote:
> 
> A gut feeling looking at the output is that it could be an encoding issue in 
> reading the input file.
> 
> Could you send the configuration file being used?
> 
> Going by what you sent me, or perhaps more accurately what I received, there 
> should be no issue if you are using the default configuration files.
> 
> The HTML file I received is UTF-8 encoded. Is this correct, or might an email 
> client have changed it somewhere along the line?
> 
> Whilst the encoding is what I suspect, it seems unlikely if you are using 
> defaults and if the email system did not change the HTML encoding.
> 
> To explain the output: the notation '\zXXXXXXXX' (as in '\z004e004e') means 
> that liblouis did not recognise a character whose ordinal is the hexadecimal 
> number shown.
> 
> Putting some of the values into UTF16 encoding leads to something which 
> resembles the actual text of the file content.
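
A minimal Python sketch of the decoding described above. It assumes the wrapped output lines have been rejoined by hand (the trailing hyphens come from line breaking), and that the low 16 bits of each value hold the first character, as they would on a little-endian machine. The function name decode_z_escapes is only illustrative.

```python
import re

def decode_z_escapes(text):
    """Split each '\\zXXXXXXXX' escape into two 16-bit code units.

    Assumption: the low half of the 32-bit value is the first character
    (little-endian packing); swap the halves for a big-endian machine.
    """
    chars = []
    for match in re.finditer(r"\\z([0-9a-fA-F]{8})", text):
        value = int(match.group(1), 16)
        low, high = value & 0xFFFF, value >> 16
        chars.append(chr(low) + chr(high))
    return "".join(chars)

# Values taken from the start of the reported output, with wrapping removed.
sample = r"'\z00680054''\z00200065''\z0061004e''\z00690074''\z006e006f''\z006c0061'"
print(decode_z_escapes(sample))  # prints "The National"
```
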
> 
> Thinking about this further, though, UTF-16 would not lead to such high 
> values by itself. My feeling is that liblouis is compiled for UCS4 (32-bit 
> Unicode) while liblouisutdml, and thus file2brl, is compiled for UCS2 (16-bit 
> Unicode). So when file2brl reads the HTML, it packs the characters into 
> 16-bit wide characters, but liblouis then reads those as 32-bit wide 
> characters. Out pops nonsense, because it recognises none of the characters.
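
The suspected mismatch can be imitated in a few lines of Python. This is only a simulation under stated assumptions, not liblouis code: it stores text as 16-bit units, then reads the same bytes back as 32-bit little-endian values and prints them in the '\zXXXXXXXX' style. The function name misread_utf16_as_32bit is hypothetical.

```python
import struct

def misread_utf16_as_32bit(text):
    """Pack text as 16-bit units, then read the bytes as 32-bit values."""
    data = text.encode("utf-16-le")          # what a UCS2 build would store
    if len(data) % 4:
        data += b"\x00\x00"                  # pad an odd number of units
    count = len(data) // 4
    values = struct.unpack("<%dI" % count, data)  # what a UCS4 build would read
    return "".join("'\\z%08x'" % v for v in values)

print(misread_utf16_as_32bit("The National"))
```

Each pair of characters collapses into a single 32-bit value that matches no real character, which is the pattern seen in the reported output.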
> 
> Are you sure your file2brl and liblouisutdml were compiled against the 
> particular version of liblouis you are using?
> 
> Michael Whapples
>> On 19/05/2014 23:12, Greg Kearney wrote:
>> From a python program:
>> 
>> os.system("file2brl -t " + newname + " " + outfile)
>> 
>> which would be in normal usage:
>> 
>> file2brl -t file.html file.brf
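
As a side note, a sketch of the same call using subprocess with an argument list, which avoids building a shell string and so copes with spaces in file names. The helper names are illustrative only; the -t option and argument order are taken from the command line quoted above.

```python
import subprocess

def build_file2brl_command(infile, outfile):
    """Return the argument list for: file2brl -t infile outfile"""
    return ["file2brl", "-t", infile, outfile]

def run_file2brl(infile, outfile):
    """Run file2brl; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_file2brl_command(infile, outfile), check=True)

# Only the command is printed here; call run_file2brl(...) to execute it.
print(build_file2brl_command("file.html", "file.brf"))
```
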
>> 
>> 
>> Here is the HTML file: 
>> 
>> 
>> 
>> 
>> And here is what I get:
>> 
>> 
>> 
>> 
>> Commonwealth Braille & Talking Book Cooperative
>> Greg Kearney, General Manager
>> 605 Robson Street, Suite 850
>> Vancouver BC V6B 5J3
>> CANADA
>> Email: info@xxxxxxxxx
>> 
>> U.S. Address
>> 21908 Almaden Av.
>> Cupertino, CA 95014
>> UNITED STATES
>> Email: gkearney@xxxxxxxxx
>> 
>> 
>> 
>> On May 19, 2014, at 3:03 PM, John J. Boyer <john.boyer@xxxxxxxxxxxxxxxxx> 
>> wrote:
>> 
>>> What is your command line? Please give more detail when you have a problem.
>>> 
>>> John
>>> 
>>> On Mon, May 19, 2014 at 02:56:06PM -0700, Greg Kearney wrote:
>>>> When I run file2brl I am getting this kind of output, why?
>>>> 
>>>> '\z004e004e''\z-
>>>> 004c
>>>>  '\z00680054''\z-
>>>> 00200065''\z006-
>>>> 1004e''\z0069007-
>>>> 4''\z006e006f''-
>>>> \z006c0061''\z00-
>>>> 4e0020''\z00740-
>>>> 065''\z006f0077'-
>>>> '\z006b0072''\z-
>>>> 006f0020''\z0020-
>>>> 0066''\z0071004-
>>>> 5''\z00690075''\-
>>>> z00610074''\z00-
>>>> 6c0062''\z002000-
>>>> 65''\z0069004c'-
>>>> '\z00720062''\z0-
>>>> 0720061''\z0020-
>>>> 0079''\z00650053-
>>>> ''\z00760072''\-
>>>> z00630069''\z002-
>>>> 00065''\z004e00-
>>>> 28''\z0045004e''-
>>>> \z0053004c''\z0-
>>>> 0200029''\z00730-
>>>> 069''\z00610020-
>>>> ''\z00200020''\z-
>>>> 006f0077''\z006-
>>>> c0072''\z002d006-
>>>> 4''\z00690066''-
>>>> \z00730072''\z00-
>>>> 
>>>> 
>>>> For a description of the software, to download it and links to
>>>> project pages go to http://www.abilitiessoft.com
>>> -- 
>>> John J. Boyer; President, Chief Software Developer
>>> Abilitiessoft, Inc.
>>> http://www.abilitiessoft.com
>>> Madison, Wisconsin USA
>>> Developing software for people with disabilities
> 
