[liblouis-liblouisxml] Re: widechar vs utf-16/utf-32

  • From: James Teh <jamie@xxxxxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Tue, 10 Jul 2012 19:07:07 +1000

Hi.

First, the size of widechar in liblouis depends on how liblouis was built. I'm going to assume it is 2 bytes, which I think is the default; it certainly is on Windows.

As I understand it, widechar in liblouis isn't UTF-16. UTF-16 can represent Unicode characters above U+FFFF (using a surrogate pair, i.e. two 16 bit units per character), but liblouis does not; for those characters you have to use a 4 byte widechar type. However, for Unicode characters that fit in 16 bits, yes, UTF-16 and widechar should be equivalent.
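
To make the difference concrete, here is a minimal sketch in C. It is not liblouis code: the names widechar16, utf16_encode and fits_in_widechar16 are made up for illustration, and widechar16 is only a stand-in for widechar under the assumption of a 2 byte build. It shows that a character at or below U+FFFF becomes exactly one 16 bit unit in UTF-16 (so the two representations agree), while anything above needs two units, which a 2 byte widechar cannot hold as one character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 2 byte build. widechar16 is a hypothetical stand-in
   for liblouis' widechar, not the real typedef. */
typedef uint16_t widechar16;

/* Encode one Unicode code point as UTF-16.
   Returns the number of 16 bit units written to out (1 or 2),
   or 0 if cp is not a valid Unicode scalar value. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    /* Surrogate code points and values past U+10FFFF are not encodable. */
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;

    /* U+0000..U+FFFF: a single unit, identical to the code point. */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Above U+FFFF: split the remaining 20 bits into a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low surrogate */
    return 2;
}

/* Returns 1 if cp can be stored as a single 16 bit widechar, else 0. */
static int fits_in_widechar16(uint32_t cp)
{
    widechar16 units[2];
    return utf16_encode(cp, units) == 1;
}
```

For example, U+2801 (braille dot 1) encodes to the single unit 0x2801, whereas U+1F600 encodes to the pair 0xD83D 0xDE00. So UTF-16 output from a converter is only safe to treat as 2 byte widechars if every character in the input passes a check like fits_in_widechar16.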

Jamie

On 2/07/2012 8:44 PM, Christian Egli wrote:
Hi all

I was trying to enhance the test code in test/brl_checks.c so that test
code can use utf-8 strings instead of ascii with \xhhhh encodings. For
that I was trying some code from gnulib to convert utf-8 to utf-16 as I
thought widechar was basically ucs-2 (which was superseded by
utf-16[1]). Now my code doesn't work. This could have many reasons but
the question is of course: is the widechar type really utf-16? Or is it
maybe utf-16 with little endian? Or is it something else altogether?

Thanks
Christian

Footnotes:
[1]  http://en.wikipedia.org/wiki/UTF-16


--
James Teh
Director, NV Access Limited
Email: jamie@xxxxxxxxxxxx
Web site: http://www.nvaccess.org/
Phone: +61 7 5667 8372
For a description of the software, to download it and links to
project pages go to http://www.abilitiessoft.com
