[liblouis-liblouisxml] Re: [liblouis] r715 committed - the last batch of files converted to utf-8.

  • From: "John J. Boyer" <john.boyer@xxxxxxxxxxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Tue, 3 Jul 2012 15:13:17 -0500

To clarify, tables are "source" files and should be human-readable, just 
as program source code is human-readable. When you need a non-ASCII 
character in Java, for example, you use the \uhhhh encoding. liblouis 
just uses x instead of u.
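
As a rough illustration of the \xhhhh form, here is a minimal Python sketch 
that turns a single character into its four-hex-digit escape. The helper 
name to_table_escape is made up for this example and is not part of 
liblouis; the sketch only covers characters that fit in four hex digits.

```python
def to_table_escape(ch: str) -> str:
    # Hypothetical helper, not a liblouis function.
    # ord() gives the Unicode code point of a one-character string.
    code_point = ord(ch)
    if code_point > 0xFFFF:
        # Four hex digits cannot hold this value; this sketch does not handle it.
        raise ValueError("code point does not fit in four hex digits")
    # Backslash, x, then the code point as four lowercase hex digits.
    return "\\x{:04x}".format(code_point)


if __name__ == "__main__":
    print(to_table_escape("\u2122"))  # the trademark sign
```

The same number is what Java would write after \u, so a Java escape can be 
carried over to a table by swapping the letter.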

I have not seen anyone except Mesar advocating for UTF-8 in the tables. 

There is a liblouis table called unicode.cti which can be used to find 
the hex values of Unicode characters. It contains comments which explain 
verbally what each character is.
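
If unicode.cti is not at hand, the same kind of information (hex value plus 
a verbal description) can be sketched with Python's standard unicodedata 
module. This is only an illustration and does not read the table; the 
function name describe is an assumption made for the example.

```python
import unicodedata


def describe(ch: str) -> tuple:
    # Hypothetical helper: returns (hex value, Unicode character name).
    hex_value = "{:04x}".format(ord(ch))
    # The second argument is a fallback for characters without a name.
    name = unicodedata.name(ch, "<unnamed>")
    return hex_value, name


if __name__ == "__main__":
    for ch in "\u2122\u00e9":
        hex_value, name = describe(ch)
        print("\\x" + hex_value, name)
```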

John

On Tue, Jul 03, 2012 at 06:29:28PM +0100, Mesar Hameed wrote:
> Hi Vic,
> 
> On Tue 03/07/12,12:16, Vic Beckley wrote:
> > So in my example of the trademark symbol, which shows up as a^ in UTF-8,
> 
> No, it should not show up as a followed by a caret symbol; it should simply 
> be the trademark symbol itself.
> Open the file with Notepad++.
> 
> 
> > how
> > would you correctly write this using the \xhhhh format? If you were writing
> > the table and wanted to define this symbol using UTF-8, 
> > how would you find out what it was?
> 
> Just a small correction: \xhhhh, sometimes also written as U+hhhh, is called 
> the Unicode code point of the symbol.
> How this is stored on the computer is called the encoding.
> 
> So UTF-7, UTF-8, UTF-16 and UTF-32 are all different formats for 
> encoding Unicode; they differ in how many bytes 
> are used for the minimal representation of each code point.
> For a more detailed explanation, please have a look on Wikipedia, at 
> "Unicode", "UTF-8" etc.
> 
> So to your question:
> If you use a screen reader, it probably has a shortcut key for 
> telling you the code point of the character your cursor is 
> currently on.
> 
> For NVDA and Orca, in desktop layout, this is numpad 2, pressed three 
> times quickly.
> For example, Orca is telling me: trademark, 2122.
> If you are using a braille display and the character is not defined in your 
> current table, you will see \x2122.
> 
> If you are a sighted table writer, you will probably have to go to the online 
> Unicode standard, and look in the long list of characters for the 
> \xhhhh representation of the character you want.
> 
> Hope this helps,
> Mesar
> For a description of the software, to download it and links to
> project pages go to http://www.abilitiessoft.com
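
The difference between a code point and its encoding, as described in the 
quoted message, can be sketched in Python as follows. The code point of the 
trademark sign is always 2122, while the stored bytes depend on the 
encoding. The last step guesses at why the symbol might look like an 
accented a followed by other characters: it assumes the UTF-8 bytes were 
read as Windows-1252, which is only one possible explanation.

```python
trademark = "\u2122"

# One code point, several byte representations.
utf8_bytes = trademark.encode("utf-8")
utf16_bytes = trademark.encode("utf-16-be")
utf32_bytes = trademark.encode("utf-32-be")

# Assumption: an editor that reads the UTF-8 bytes as Windows-1252
# shows three unrelated characters instead of the trademark sign.
misread = utf8_bytes.decode("cp1252")

if __name__ == "__main__":
    print("code point:", "{:04x}".format(ord(trademark)))
    print("UTF-8 :", utf8_bytes.hex())
    print("UTF-16:", utf16_bytes.hex())
    print("UTF-32:", utf32_bytes.hex())
    print("UTF-8 bytes misread as cp1252:", misread)
```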

-- 
John J. Boyer; President, Chief Software Developer
Abilitiessoft, Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities
