Also, the definition of widechar changes according to whether UCS = 2 or 4
in configure.

John

On Wed, Aug 07, 2013 at 02:16:44PM +0200, Bert Frees wrote:
> 2013/8/6 Michael Whapples <mwhapples@xxxxxxx>
>
> > That still does not fully answer my question.
> >
> > My main concern is that the 16-bit encoding of liblouis is always
> > referred to as UCS2, which is a fixed-width encoding for 16-bit Unicode
> > code points (i.e. characters between \x0000 and \xffff). UTF-16, on the
> > other hand, while being based on 16-bit code units, is not fixed width,
> > as it can represent characters up to \x10ffff by using surrogate pairs.
> >
> > For some details on what I am getting at, maybe read
> > http://en.wikipedia.org/wiki/UCS2
> >
> > So my question is, what happens should a UCS2 build of liblouis be passed
> > one of these surrogate pairs for characters between \xffff and \x10ffff?
> >
> > Python and Java, from what I can tell, do not seem to have a codec for
> > UCS2, and the Wikipedia article seems to suggest that UTF-16 supersedes
> > UCS2 in version 2.0 of the Unicode standard. Thus if I use the UTF-16
> > encoding to prepare inbuf, I could easily end up with one of these
> > surrogate pairs.
> >
> > Is the use of UCS2 in liblouis terminology accurate (i.e. being fixed
> > width and not accepting the surrogate pairs), or is the term UCS2 just
> > used for historic or other reasons when the encoding is actually UTF-16?
> >
> > What will happen should I pass one of these surrogate pairs in inbuf?
> >
> >
> Hi Michael,
>
> I don't think it would work, because the parameters typeform,
> inputPositions, outputPositions, hyphens, etc. all assume that each 2 bytes
> represent one character/position in the string. If we wanted to handle
> UTF-16 encoded strings, we would need to do something similar to what we do
> with UTF-8 encoded strings.

--
John J. Boyer; President, Chief Software Developer
Abilitiessoft, Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities

For a description of the software, to download it and links to
project pages go to http://www.abilitiessoft.com
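
To make the surrogate-pair concern in the quoted messages concrete, here is a minimal Python 3 sketch. It does not call liblouis at all. It only shows that a character above \xffff becomes two 16-bit units when encoded as UTF-16, so a buffer prepared that way no longer has one unit per character. The helper names utf16_units and needs_surrogates are made up for this illustration and are not part of any liblouis binding.

```python
def utf16_units(text):
    """Return the 16-bit code units of text encoded as UTF-16 (little-endian, no BOM)."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def needs_surrogates(text):
    """Return True if any character lies above \\xffff and so needs a surrogate pair."""
    return any(ord(ch) > 0xFFFF for ch in text)


bmp_text = "abc"
astral_text = "a\U0001D11Eb"  # U+1D11E lies above \xffff

for sample in (bmp_text, astral_text):
    units = utf16_units(sample)
    # For astral_text the unit count exceeds the character count, so
    # per-position arrays sized by character would no longer line up.
    print(len(sample), len(units), [hex(u) for u in units], needs_surrogates(sample))
```

A caller targeting a 16-bit build could run a check like needs_surrogates on the text before filling inbuf, and decide for itself how to handle such characters. How liblouis itself reacts to them is not shown here.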