[liblouis-liblouisxml] Re: UEB: Unicode char representation

  • From: Joseph Lee <joseph.lee22590@xxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Mon, 19 Nov 2012 19:10:11 -0800

Hi,
Yes, although some dot patterns need to be changed.
The overall structure of the en-ueb series follows that of the old UEBC tables:
* en-ueb-g1.ctb: required UEB symbols and other Unicode chars. Same as
en-us-g1.ctb, except that these additional Unicode chars are added.
* en-ueb-g2.ctb: includes en-ueb-g1.ctb and adds grade 2 translation rules.
This is the plan I thought of unless others have any other comments. Thanks.
//JL


On 11/19/12, Greg Kearney <gkearney@xxxxxxxxx> wrote:
> Would a table something like this help with the UEB unicode issues:
>
> # potential conflicts with regular punctuation signs
> letter                \x002C          3456-2                  comma & set numeric & grade 1 word modes [3.5]
> letter                \x002E          3456-256                dot (decimal point) & set numeric & grade 1 word modes [3.5]
> # End of conflicting signs.
>
> letter                \x2225          3456-123                parallel to [3.16]
> letter                \x021D          3456-13456              Latin small letter yogh [2009-9-20]
> letter                \x221E          3456-123456             Latin small letter thorn [2009-9-20]
> letter                \x00F0          3456-1246               Latin small letter eth [2009-9-20]
> letter                \x01BF          3456-2456               Latin letter wynn (wen) [2009-9-20]
>
>
> letter                \x22A5          3456-36                 up tack (= perpendicular) [3.16]
> letter                \x22BE          3456-456-246            right angle with arc (or similar figure with "squared off" arc) [3.16]
> letter                \x0040          4-1                     at-sign [3.17]
> letter                \x00A2          4-14                    cent sign [3.25]
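
A quick way to double-check entries like these is to compare each code point with its name in the Unicode character database. The sketch below is only an illustration and not part of liblouis: the helper name `mismatches` and the sample rows are assumptions modeled on the entries quoted above. Python's standard `unicodedata` module reports U+221E as INFINITY, so a row describing \x221E as thorn gets flagged, as does a transposed-digit typo such as \x0400 standing in for \x0040.

```python
import unicodedata

# Sample rows (illustrative): code point and a keyword that should
# appear in the official Unicode name of that code point.
ROWS = [
    (0x2225, "parallel to"),
    (0x021D, "yogh"),
    (0x221E, "thorn"),          # U+221E is INFINITY in the Unicode database
    (0x00F0, "eth"),
    (0x0400, "commercial at"),  # transposed digits; U+0040 is the at-sign
]


def mismatches(rows):
    """Return (escape, keyword, unicode_name) for rows whose keyword
    does not occur in the Unicode name of the code point."""
    bad = []
    for cp, keyword in rows:
        name = unicodedata.name(chr(cp), "<unnamed>")
        if keyword.upper() not in name:
            bad.append((f"\\x{cp:04X}", keyword, name))
    return bad


if __name__ == "__main__":
    for escape, keyword, name in mismatches(ROWS):
        print(f"{escape}: described as '{keyword}', Unicode name is {name}")
```

The keyword match is deliberately loose, so it only catches gross mismatches between a code point and its description; it says nothing about whether the dot patterns are right.
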
>
> I'm taking the symbols from
> https://spreadsheets.google.com/pub?key=td22ZEbcYFZRtoMixEN63ZQ&output=html
>
>
> Gregory Kearney | Manager Accessible Media
> Association for the Blind of WA - Guide Dogs WA
> PO Box 101, Victoria Park WA 6979 | 61 Kitchener Ave, Victoria Park WA 6100
> Tel: 08 9311 8246 | Fax: 08 9361 8696 | www.guidedogswa.com.au
> Tel: 307-224-4022 (North America)
> Email: greg.kearney@xxxxxxxxxxxxxxxxxx
> Email: gkearney@xxxxxxxxx
>
> Everyone has the right to freedom of opinion and expression; this right
> includes freedom to hold opinions without interference and to seek, receive
> and impart information and ideas through any media and regardless of
> frontiers.
> Article 19 of the UN Universal Declaration of Human Rights
>
> On 20/11/2012, at 1:44 AM, Joseph Lee <joseph.lee22590@xxxxxxxxx> wrote:
>
>> Hi,
>> Thanks - I'll investigate. Or perhaps I could include the contents of the
>> Nemeth table in en-ueb-g1.ctb (encoded in UTF-8) to see what happens.
>> Cheers,
>> Joseph
>> On 11/19/12, John J. Boyer <john.boyer@xxxxxxxxxxxxxxxxx> wrote:
>>> liblouis does handle Unicode characters. Look at nemeth.ctb and its
>>> included files. They contain Greek letters, mathematical symbols, etc.
>>> written in the form \xhhhh. For some months now you have been able to
>>> write characters in UTF-8 encoding, if you have a keyboard that can do
>>> it.
>>>
>>> John
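
For anyone converting between the two notations, here is a minimal sketch of going from a typed character to the four-digit \xhhhh form and back. The function names are hypothetical, and the sketch assumes only the four-digit form mentioned above, so it rejects anything beyond U+FFFF.

```python
def to_escape(ch):
    """Return the four-digit \\xhhhh escape for a single character."""
    cp = ord(ch)
    if cp > 0xFFFF:
        raise ValueError(f"U+{cp:X} does not fit in four hex digits")
    return f"\\x{cp:04X}"


def from_escape(text):
    """Turn a \\xhhhh escape back into the character it names."""
    if len(text) != 6 or not text.startswith("\\x"):
        raise ValueError(f"not a \\xhhhh escape: {text!r}")
    return chr(int(text[2:], 16))


if __name__ == "__main__":
    print(to_escape("\u2225"))    # prints \x2225 (parallel to)
    print(from_escape("\\x00A2") == "\u00a2")  # prints True (cent sign)
```
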
>>>
>>> On Mon, Nov 19, 2012 at 02:04:05AM -0800, Joseph Lee wrote:
>>>> Hi John and others,
>>>> This is something that we may need to go over before continuing with
>>>> UEB table improvements:
>>>> As you may know, UEB assigns many Unicode chars to dot patterns. These
>>>> include Greek and Latin letters, math symbols, transcriber notes and
>>>> shapes.
>>>> At this time, LibLouis does not handle Unicode chars well - the
>>>> current (old) UEBC table does not even show Greek signs properly,
>>>> as they are beyond the range of ASCII chars. If you read a passage
>>>> containing Unicode chars, the current UEBC code shows hex values for
>>>> Unicode chars beyond 255. This might be a stumbling block
>>>> for languages which need to show these Unicode characters (including
>>>> UEBC) using correct braille dot patterns.
>>>> Right now, I have decided to experiment with encodings to see which one
>>>> would suit UEBC well - ANSI (works okay, but does not show Unicode
>>>> chars above 255), or UTF-8 with or without BOM. It seems that one of
>>>> these two UTF-8 encodings would be best suited for UEBC. However, I
>>>> feel we need to do extensive testing to make sure that the UEBC table
>>>> does what it is supposed to do: display complex Unicode symbols using
>>>> correct dot patterns, which would be useful for braille readers who
>>>> need to access technical materials using correct UEBC signs.
>>>> Thanks.
>>>> //JL
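
To make the encoding comparison concrete, here is a small sketch using Python's standard codecs. It assumes that "ANSI" here means Windows code page 1252 (the thread does not say which code page was used), and the sample string is made up for illustration.

```python
import codecs

# Greek small alpha and the infinity sign: both lie above code point 255.
sample = "\u03b1 \u221e"


def try_encode(text, encoding):
    """Encode text, or return None if the encoding cannot represent it."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


ansi = try_encode(sample, "cp1252")         # assumed meaning of "ANSI"
utf8 = try_encode(sample, "utf-8")          # UTF-8 without BOM
utf8_bom = try_encode(sample, "utf-8-sig")  # UTF-8 with BOM

if __name__ == "__main__":
    print("cp1252:", ansi)
    print("utf-8:", utf8.hex(" "))
    print("utf-8 with BOM:", utf8_bom.hex(" "))
    print("BOM is the only difference:", utf8_bom == codecs.BOM_UTF8 + utf8)
```

The two UTF-8 variants differ only in the three-byte EF BB BF prefix; whether the table compiler tolerates that prefix is something to test rather than assume.
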
>>>> For a description of the software, to download it and links to
>>>> project pages go to http://www.abilitiessoft.com
>>>
>>> --
>>> John J. Boyer; President, Chief Software Developer
>>> Abilitiessoft, Inc.
>>> http://www.abilitiessoft.com
>>> Madison, Wisconsin USA
>>> Developing software for people with disabilities
>>>
