[liblouis-liblouisxml] UEB: Unicode char representation

  • From: Joseph Lee <joseph.lee22590@xxxxxxxxx>
  • To: liblouis-liblouisxml <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Mon, 19 Nov 2012 02:04:05 -0800

Hi John and others,
This is something that we may need to go over before continuing with
UEB table improvements:
As you may know, UEB assigns many Unicode chars to dot patterns. This
includes Greek and Latin letters, math symbols, transcriber notes and
shapes.
At this time, LibLouis does not handle Unicode chars well - the
current (old) UEBC table does not even show Greek signs properly,
since they lie beyond the range of ASCII chars. If you read a passage
containing Unicode chars, the current UEBC code shows hex values for
Unicode chars beyond 255. This might be a stumbling block for
languages that need to show these Unicode characters (including
UEBC) using correct braille dot patterns.
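To see which characters in a passage would fall into this "beyond 255"
group, a small Python sketch like the one below may help. It is not
part of LibLouis; the function name chars_above_255 and the sample
sentence are made up for illustration, and the \xhhhh formatting is
only meant to resemble the hex values mentioned above.

```python
def chars_above_255(text):
    """Return (character, hex escape) pairs for code points above 255.

    Hypothetical helper for illustration only; it does not call LibLouis.
    Each code point is formatted with at least four hex digits.
    """
    found = sorted({ch for ch in text if ord(ch) > 255})
    return [(ch, "\\x{:04x}".format(ord(ch))) for ch in found]


# Sample passage with three Greek letters (an assumed example).
sample = "The angle α plus β equals π."
for ch, escape in chars_above_255(sample):
    print(ch, escape)
```

Characters such as "é" (code point 233) are not reported, since they
are still within the 0-255 range.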
For now, I have been experimenting with encodings to see which one
would suit UEBC well - ANSI (works okay, but does not show Unicode
chars above 255), or UTF-8 with or without a BOM. It seems that one of
these two UTF-8 encodings would be best suited for UEBC. However, I
feel we need to do extensive testing to make sure that the UEBC table
does what it is supposed to do: display complex Unicode symbols using
correct dot patterns, which would be useful for braille readers who
need to access technical materials using correct UEBC signs.
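As a rough illustration of how these encodings differ, here is a
minimal Python sketch. It assumes "ANSI" means Windows code page 1252
(it may be a different code page on other systems), and the function
name encode_report is hypothetical. It only shows what bytes each
encoding produces for a Greek letter and says nothing about how
LibLouis itself reads its tables.

```python
import codecs


def encode_report(text):
    """Encode text in three encodings; None means it cannot be encoded.

    Assumption: "ANSI" is taken to be Windows code page 1252.
    """
    results = {}
    for name in ("cp1252", "utf-8", "utf-8-sig"):
        try:
            results[name] = text.encode(name)
        except UnicodeEncodeError:
            results[name] = None
    return results


report = encode_report("α")  # Greek small letter alpha, U+03B1
for name, data in report.items():
    print(name, data)

# The "utf-8-sig" form is plain UTF-8 with the three-byte BOM in front.
has_bom = report["utf-8-sig"].startswith(codecs.BOM_UTF8)
print("BOM present:", has_bom)
```

The only difference between the two UTF-8 variants is the three BOM
bytes at the start of the file, so testing would mainly need to confirm
that those bytes are not mistaken for table content.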
Thanks.
//JL
For a description of the software, to download it and links to
project pages go to http://www.abilitiessoft.com
