[liblouis-liblouisxml] Re: Conflicting priorities

  • From: Keith Creasy <kcreasy@xxxxxxx>
  • To: "liblouis-liblouisxml@xxxxxxxxxxxxx" <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Tue, 14 Jan 2014 12:41:26 +0000

Hello all.

In my more recent C programming, actually C++, I've moved toward using the ANSI 
String rather than throwing chars around. Would this be a good move for when we 
get to refactoring LibLouis and LibLouisUTDML after the March release? Maybe 
that would eliminate some of the confusion and tedium of various character 
representations. 

Keith


-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx 
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Michael Whapples
Sent: Tuesday, January 14, 2014 4:48 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: Conflicting priorities

UTF-8, UTF-16 and UTF-32 are byte encodings of Unicode characters. They define 
how a given Unicode character is stored in bytes. They are used for low-level 
operations like storing and transmitting Unicode strings.
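
To make the distinction concrete, here is a minimal sketch in Python (chosen 
only for brevity; it is not taken from liblouis) that stores the same Unicode 
character in each of the three encodings. The variable names are illustrative.

```python
# One Unicode character: U+00E9, LATIN SMALL LETTER E WITH ACUTE.
ch = "\u00e9"

# The same character stored as bytes in three encodings (big-endian, no BOM).
utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")
utf32 = ch.encode("utf-32-be")

for name, data in (("UTF-8", utf8), ("UTF-16", utf16), ("UTF-32", utf32)):
    print(f"{name}: {data.hex()} ({len(data)} bytes)")
```

The character is the same in all three cases; only the byte layout differs.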

Aaron is talking about Unicode characters which combine with other characters 
(e.g. an accent sign which applies to the previous character) vs. the single 
precomposed accented Unicode character. Both cases deal with characters, and 
you may store them in any encoding.
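
As a small illustration, assuming "e with acute accent" as the example 
character, the two forms are different sequences of code points even though 
they display the same. This is a Python sketch, not liblouis code.

```python
# Precomposed form: one code point, U+00E9.
precomposed = "\u00e9"

# Combining form: the letter "e" followed by U+0301, COMBINING ACUTE ACCENT.
combining = "e\u0301"

# Both render as an accented e, but they are not the same string.
print([f"U+{ord(c):04X}" for c in precomposed])
print([f"U+{ord(c):04X}" for c in combining])
print(precomposed == combining)
```

Software that compares code points one by one, without normalising first, 
treats these as two unrelated inputs.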

If liblouis does no normalisation (which I suspect it does not) then you would 
have to define the character combinations in the tables.

If liblouis did do normalisation then there would only be one representation to 
define in the tables.
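
A rough sketch of what normalisation would buy, using Python's standard 
`unicodedata` module. This only illustrates the idea; whether and how 
liblouis would normalise is an open question in this thread.

```python
import unicodedata

precomposed = "\u00e9"   # single accented character
combining = "e\u0301"    # letter followed by a combining accent

# NFC composes sequences into precomposed characters where one exists.
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", combining)

# After normalisation both inputs have the same representation, so a
# table would only need an entry for that one form.
print(nfc_a == nfc_b)
```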

PS. I wish you would stop referring to liblouis as using UTF-16. This is 
factually incorrect: the 16-bit Unicode that liblouis can be compiled for is a 
fixed-width encoding, so it is limited in which characters it can represent, 
and is therefore UCS-2. UTF-16 is a variable-width encoding and is capable of 
representing the full Unicode set.
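
The difference shows up for any character above U+FFFF. The following Python 
sketch uses a mathematical italic letter as the example; `fits_in_ucs2` is a 
hypothetical helper written for this illustration, not a liblouis function.

```python
def fits_in_ucs2(ch):
    """True if ch fits in one fixed-width 16-bit unit, i.e. its code
    point is at most U+FFFF (the surrogate range is ignored here)."""
    return ord(ch) <= 0xFFFF

bmp_char = "\u2200"          # U+2200 FOR ALL
astral_char = "\U0001D44E"   # U+1D44E MATHEMATICAL ITALIC SMALL A

# Number of 16-bit units UTF-16 needs for each character.
units_bmp = len(bmp_char.encode("utf-16-be")) // 2
units_astral = len(astral_char.encode("utf-16-be")) // 2

print(fits_in_ucs2(bmp_char), units_bmp)
print(fits_in_ucs2(astral_char), units_astral)
```

UTF-16 represents the second character as a surrogate pair (two 16-bit 
units), which a fixed-width 16-bit encoding cannot do.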

Michael Whapples
On 14/01/2014 00:54, John J. Boyer wrote:
> I'm still confused about what kind of Unicode Aaron is talking about. 
> liblouis itself uses either UTF-16 or UTF-32 depending on how it is compiled. 
> It does not recognize a letter followed by the accent Unicode value, although 
> this could be handled with a translation table. liblouisutdml requires UTF-8.
>
> John
>
> On Mon, Jan 13, 2014 at 03:23:57PM -0800, John Gardner wrote:
>> John, I sure am interested in math.  And I'd like to have the choices 
>> of graphics placement, enlargement, etc that I've described.  
>> ViewPlus embossers do the graphics to dots transformation though, and 
>> my expectation is that other embosser manufacturers are gonna have to 
>> do the same.  Since they won't, then my recommendation is that the 
>> tactile graphics be either placed at the end so they can be split off 
>> and put into whatever crappy software that embosser manufacturers 
>> make, or better still, be split into a completely separate folder.  I 
>> think this should be done in utdml, and I am working on a proposal 
>> for improving it, so my preference is to postpone that project for the 
>> moment.
>>
>> I'm not very interested in back translation, but getting emphasis 
>> right and getting other math languages working is a priority for me.  
>> Is this your question?
>>
>> You missed the point about those spaces.  White space is supposed to 
>> be ignored in token content.  However I have discovered many 
>> instances where regular spaces are used in mtext tokens.  I may be 
>> wrong, but I think this is wrong.  However in LEAN, I have filtered 
>> the white space out of tokens except for the mtext ones.  I am 
>> finding a lot of usage of extra spaces in mi and mn elements, 
>> presumably for readability.  And the Nemeth is leaving in those 
>> spaces, so the equations are wrong.  They aren't wrong if I make them from 
>> LEAN.
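
The whitespace filtering described above can be sketched roughly as follows. 
This is not LEAN's code, which is not shown in this thread; the function 
names and the choice of `mi`, `mn` and `mo` as the stripped token elements 
are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Token elements whose whitespace is removed in this sketch (an assumption);
# mtext is deliberately left untouched.
STRIPPED_TOKENS = {"mi", "mn", "mo"}

def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def strip_token_whitespace(root):
    """Remove all whitespace from the text of the selected token elements."""
    for el in root.iter():
        if local_name(el.tag) in STRIPPED_TOKENS and el.text:
            el.text = "".join(el.text.split())
    return root

sample = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mi> x </mi><mo> + </mo><mn> 12 </mn><mtext> for all x</mtext>"
    "</math>"
)

root = strip_token_whitespace(ET.fromstring(sample))
tokens = [(local_name(el.tag), el.text) for el in root]
print(tokens)
```

Running the filter before translation means the braille translator never 
sees the readability spaces inside `mi` and `mn`.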
>>
>> John G
>>
>> -----Original Message-----
>> From: John J. Boyer [mailto:john.boyer@xxxxxxxxxxxxxxxxx]
>> Sent: Monday, January 13, 2014 2:25 PM
>> To: John Gardner
>> Subject: Conflicting priorities
>>
>> John,
>>
>> APH is most interested in emphasis for BrailleBlaster and in 
>> back-translation for the Braille Plus 18. They are interested in 
>> Nemeth also. Personally, my real interest is math and tactile 
>> graphics. I'm guessing that this is also your greatest interest.
>>
>> I hope you have solved the problem of getting the wrong Unicode values.
>> If you tell me the Unicode value for the unwanted space I can modify 
>> the nemeth.cti table and send it to you as an attachment.
>>
>> John
>>
>> --
>> John J. Boyer; President, Chief Software Developer Abilitiessoft, Inc.
>> http://www.abilitiessoft.com
>> Madison, Wisconsin USA
>> Developing software for people with disabilities
>>

For a description of the software, to download it and links to project pages go 
to http://www.abilitiessoft.com
