[liblouis-liblouisxml] Re: [liblouis] r715 committed - the last batch of files converted to utf-8.

  • From: Mesar Hameed <mesar.hameed@xxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Tue, 3 Jul 2012 18:29:28 +0100

Hi Vic,

On Tue 03/07/12,12:16, Vic Beckley wrote:
> So in my example of the trademark symbol, which shows up as a^ in UTF-8,

No, it should not show up as an a followed by a caret symbol; it should simply 
be the trademark symbol itself.
Open the file with Notepad++, which handles UTF-8.
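
A likely cause of seeing an a with a caret is that the UTF-8 bytes are being 
read as a legacy single-byte encoding. Here is a minimal Python sketch of that 
effect, assuming the viewer falls back to Windows-1252 (that codec is a guess 
on my part, not something confirmed in this thread):

```python
# The trademark sign is U+2122; UTF-8 stores it as three bytes.
tm = "\u2122"
utf8_bytes = tm.encode("utf-8")

# Assumption: the viewer decodes with Windows-1252 instead of UTF-8.
# Each byte then becomes its own character, the first one an a with circumflex.
misread = utf8_bytes.decode("cp1252")

# Decoding with the correct codec gives back the single symbol.
correct = utf8_bytes.decode("utf-8")

print(utf8_bytes.hex())   # the three stored bytes, as hex
print(ascii(misread))     # the three wrong characters, shown as escapes
print(correct == tm)      # whether the round trip is lossless
```

The file itself is fine in this case; only the way it is displayed is wrong.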


> how would you correctly write this using the \xhhhh format? If you were
> writing the table and wanted to define this symbol using UTF-8, how would
> you find out what it was.

Just a small correction: \xhhhh, sometimes also written as U+hhhh, is called 
the Unicode code point of the symbol.
How this is then stored on the computer is called the encoding.
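
To make the distinction concrete, here is a small Python sketch (Python is 
only my choice of a convenient tool, and the helper name encoded_bytes is made 
up for illustration). The code point is a single number, while the stored 
bytes depend on which encoding is used:

```python
def encoded_bytes(ch, codecs=("utf-8", "utf-16-be", "utf-32-be")):
    """Hypothetical helper: map each codec name to the hex bytes it stores for ch."""
    return {codec: ch.encode(codec).hex() for codec in codecs}

tm = "\u2122"  # trademark sign

# The code point is the same no matter how the character is stored.
code_point = ord(tm)
print(f"U+{code_point:04X}")

# The stored bytes differ from one encoding to the next.
stored = encoded_bytes(tm)
for codec, hex_bytes in stored.items():
    print(codec, len(hex_bytes) // 2, "bytes:", hex_bytes)
```

The big-endian variants are used here only so that no byte order mark is 
added to the output.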

So UTF-7, UTF-8, UTF-16 and UTF-32 are all different formats for encoding 
Unicode; the number is the size in bits of the smallest unit 
used to represent a code point.
For a more detailed explanation, please have a look at the Wikipedia articles 
on "Unicode", "UTF-8", etc.

So to your question:
If you use a screen reader, it probably has a shortcut key for 
telling you the code point of the character your cursor is 
currently on.

For NVDA and Orca, in desktop layout, this is numpad 2, pressed three times 
quickly.
For example, Orca is telling me "trademark, 2122".
If you are using a braille display and the character is not defined in your 
current table, you will see \x2122.

If you are a sighted table writer, you will probably have to go to the online 
Unicode standard and look through the long list of characters for the 
\xhhhh representation of the character you want.
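
If Python happens to be available, its standard unicodedata module can save 
some of that searching. This is only a sketch: the helper name louis_escape 
is made up for illustration, it covers only the four-digit case discussed 
here, and the lowercase hex digits are my assumption.

```python
import unicodedata

def louis_escape(ch):
    """Hypothetical helper: return the \\xhhhh form of one character."""
    cp = ord(ch)
    if cp > 0xFFFF:
        # Four hex digits are not enough for this character.
        raise ValueError("code point does not fit in four hex digits")
    return f"\\x{cp:04x}"

# Going from the character itself to its name and escape.
ch = "\u2122"
print(unicodedata.name(ch), louis_escape(ch))

# Going from the official Unicode name to the escape.
by_name = unicodedata.lookup("TRADE MARK SIGN")
print(louis_escape(by_name))
```

Pasting the character into the script in place of the "\u2122" literal works 
too, provided the script file itself is saved as UTF-8.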

hope this helps,
Mesar
For a description of the software, to download it and links to
project pages go to http://www.abilitiessoft.com
