[liblouis-liblouisxml] Re: liblouis treats valid UTF-8 sequences as invalid

  • From: "John J. Boyer" <john.boyer@xxxxxxxxxxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Sun, 9 Sep 2012 22:33:34 -0500

I couldn't find a suitable algorithm, so I devised my own. It is used in
liblouisutdml, where it works perfectly, though it may not be the optimal
algorithm. The Windows compiler seems to be compiling something differently
than gcc does.
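
For reference, a conventional UTF-8 decoding step can be sketched as below.
This is not the liblouis code discussed in this thread, which uses an
independently devised algorithm; it is only a minimal sketch of the usual
lead-byte/continuation-byte approach. The function name utf8_decode_one and
its signature are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of liblouis: decode one UTF-8 sequence
 * starting at s, with len bytes available. On success, store the code
 * point in *cp and return the number of bytes consumed (1 to 4).
 * Return 0 if the sequence is invalid or truncated. */
size_t utf8_decode_one(const unsigned char *s, size_t len, uint32_t *cp)
{
    /* Smallest code point that legitimately needs 1, 2, 3 or 4 bytes;
     * a decoded value below this is an overlong encoding. */
    static const uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t need, i;
    uint32_t c;

    assert(s != NULL && cp != NULL);
    if (len == 0)
        return 0;

    /* The lead byte determines the sequence length and carries the
     * high-order payload bits. */
    if (s[0] < 0x80) {
        need = 1; c = s[0];
    } else if ((s[0] & 0xE0) == 0xC0) {
        need = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        need = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        need = 4; c = s[0] & 0x07;
    } else {
        return 0; /* stray continuation byte or invalid lead byte */
    }

    if (len < need)
        return 0; /* truncated sequence */

    /* Each continuation byte must look like 10xxxxxx and contributes
     * six payload bits. */
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates and values past U+10FFFF. */
    if (c < min_cp[need] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;

    *cp = c;
    return need;
}
```

Given the bytes \xc3\x97 this sketch yields U+00D7 (×), and given \xc3\xb7
it yields U+00F7 (÷), which agrees with the encodings reported in the
original message quoted below. The input is taken as unsigned char so that
bytes of 0x80 and above are never sign-extended; whether signedness has
anything to do with the Windows behaviour in this thread is not established
here.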

John

On Mon, Sep 10, 2012 at 01:25:06PM +1000, James Teh wrote:
> Do you have a reference for the UTF-8 parsing algorithm you used? I 
> don't understand the code, so debugging it is tricky.
> 
> Jamie
> 
> On 10/09/2012 1:03 PM, John J. Boyer wrote:
> >I think this will have to be traced with the Windows debugger. There is
> >no error in Linux. Use compileError as the breakpoint. The loop appears
> >to be going too far.
> >
> >John
> >
> >On Mon, Sep 10, 2012 at 09:30:06AM +1000, James Teh wrote:
> >>Hi all,
> >>
> >>When I try to use the UEBC-g1.utb table in Windows, I get the following
> >>problems:
> >>uebc-g1.utb:309: warning: invalid UTF-8. Assuming Latin-1.
> >>uebc-g1.utb:309: error: Character '\x0017' is not defined
> >>uebc-g1.utb:310: warning: invalid UTF-8. Assuming Latin-1.
> >>uebc-g1.utb:310: error: Character '\x0017' is not defined
> >>uebc-g1.utb:311: warning: invalid UTF-8. Assuming Latin-1.
> >>uebc-g1.utb:311: warning: invalid UTF-8. Assuming Latin-1.
> >>uebc-g1.utb:311: warning: invalid UTF-8. Assuming Latin-1.
> >>uebc-g1.utb:311: error: Character '\x0007' is not defined
> >>uebc-g1.utb:312: warning: invalid UTF-8. Assuming Latin-1.
> >>uebc-g1.utb:312: warning: invalid UTF-8. Assuming Latin-1.
> >>uebc-g1.utb:312: warning: invalid UTF-8. Assuming Latin-1.
> >>uebc-g1.utb:312: error: Character '\x0007' is not defined
> >>8 warnings issued
> >>4 errors found.
> >>
> >>Lines 309 and 310 are for the × character, while 311 and 312 are for the
> >>÷ character. The file is correctly UTF-8 encoded; × is encoded as
> >>\xc3\x97 and ÷ is encoded as \xc3\xb7, which are both correct.
> >>
> >>Any ideas as to what's going on here?
> >>
> >>Thanks,
> >>Jamie
> >>
> >>--
> >>James Teh
> >>Director, NV Access Limited
> >>Email: jamie@xxxxxxxxxxxx
> >>Web site: http://www.nvaccess.org/
> >>Phone: +61 7 5667 8372
> >>For a description of the software, to download it and links to
> >>project pages go to http://www.abilitiessoft.com
> >
> 

-- 
John J. Boyer; President, Chief Software Developer
Abilitiessoft, Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities

