[liblouis-liblouisxml] Re: Updating ar-ar-g1.utb (adding arabic numerals)

  • From: Mesar Hameed <mesar.hameed@xxxxxxxxx>
  • To: Christian Egli <christian.egli@xxxxxx>
  • Date: Tue, 26 Apr 2011 13:25:16 +0100

Hi,

Please find the attached patch.
I decided to break the paragraph up into a list, to make it easier to read.
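
For illustration only, the line rules in that list can be sketched as a small Python classifier. This is not liblouis code and not part of the patch; the function name `classify_table_line` and the three labels are assumptions made up for this sketch.

```python
def classify_table_line(line: str) -> str:
    """Classify one table line as 'blank', 'comment' or 'entry'.

    Hypothetical helper that mirrors the rules listed in the patch;
    it is not how liblouis itself is implemented.
    """
    # Leading and trailing blanks and tabs are ignored.
    stripped = line.strip(" \t\r\n")
    if not stripped:
        return "blank"
    # A '#' or '<' as the first non-blank character marks a comment line.
    if stripped[0] in "#<":
        return "comment"
    # Anything else defines a table entry.
    return "entry"


if __name__ == "__main__":
    for sample in ["", "# This is a comment.", "  <html>", "always world 456-2456"]:
        print(repr(sample), "->", classify_table_line(sample))
```

Note that a number sign later in the line is not handled here: whether it starts a comment depends on the opcode, as the manual text explains.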

If it is accepted, then the encoding of the files in svn should be changed.
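
In case it helps anyone converting existing tables, here is a minimal Python sketch of rule 2.5 from the summary quoted below, turning a character into its \xhhhh form. It is not part of liblouis or of this patch; the helper name `to_louis_escape` is hypothetical, and the sketch assumes only code points up to U+FFFF, since the summary describes exactly four hex digits.

```python
def to_louis_escape(ch: str) -> str:
    # Hypothetical helper: format one character as a backslash, an 'x',
    # and four hex digits, as described in the summary.
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    code_point = ord(ch)
    if code_point > 0xFFFF:
        # Assumption: characters beyond four hex digits are out of scope here.
        raise ValueError("code point does not fit in four hex digits")
    return "\\x{:04x}".format(code_point)


if __name__ == "__main__":
    # Arabic-Indic digit two, the character used in the patch example.
    print(to_louis_escape("\u0662"))
```

The result can then be pasted into the operand, with the literal character kept only in the trailing comment.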

Thank you.

On Tue 26/04/2011 at 11:13:01, Christian Egli wrote:
> Mesar Hameed <mesar.hameed@xxxxxxxxx> writes:
> 
> > I think we got two things intermixed.
> >
> > This is what I understood from the conversation.
> >
> > 1. The files (utb, cti, ctb, etc) should be utf8 encoded.
> > 2. All liblouis statements and arguments should be in ascii.
> > 2.5. If the louis statement is defining a Unicode character, it should be
> > written as \xhhhh, where hhhh is the Unicode hex code of the character in
> > question.
> > 3. Comments may have unicode text.
> >
> > I hope this is correct?
> 
> I think this is a very good summary. The manual contains a brief section
> on encoding of the braille tables[1], which I think is OK. However, if
> you'd manage to integrate your concise description in there I'd be very
> happy to accept a patch.
> 
> Thanks
> Christian
> 
> Footnotes:
> [1] http://code.google.com/p/liblouis/source/browse/trunk/doc/liblouis.texi#452
> -- 
> Christian Egli
> Swiss Library for the Blind, Visually Impaired and Print Disabled
> Grubenstrasse 12, CH-8045 Zürich, Switzerland
> 
> -----
> The SBS cordially invites you:
> Open house on 25 June 2011 from 9 a.m. to 4 p.m.
> More information is available at http://www.sbs.ch/offenetuer
Index: doc/liblouis.texi
===================================================================
--- doc/liblouis.texi   (revision 479)
+++ doc/liblouis.texi   (working copy)
@@ -3,6 +3,7 @@
 @setfilename liblouis.info
 @include version.texi
 @settitle Liblouis User's and Programmer's Manual
+@documentencoding UTF-8
 
 @dircategory Misc
 @direntry
@@ -449,36 +450,52 @@
 
 The names used for files containing translation tables are completely
 arbitrary. They are not interpreted in any way by the translator.
-Contraction tables may be 8-bit ASCII files, 16-bit big-endian Unicode
-files or 16-bit little-endian Unicode files. Blank lines are ignored.
-Any leading and trailing whitespace (any number of blanks and/or tabs)
-is ignored. Lines which begin with a number sign or hatch mark
-(@samp{#}) are ignored, i.e. they are comments. If the number sign is
-not the first non-blank character in the line, it is treated as an
-ordinary character. If the first non-blank character is less-than
-(@samp{<}) the line is also treated as a comment. This makes it possible
-to mark up tables as xhtml documents. Lines which are not blank or
-comments define table entries. The general format of a table entry is:
+The files (utb, cti, ctb, etc.) should be either ASCII or UTF-8 encoded.
+All opcodes and operands must be in ASCII.
+If the louis statement is defining a Unicode character, it should
+be written as \xhhhh where hhhh is the
+Unicode hex code of the character in question, see below.
+You may write comments in your own language and/or insert the Unicode symbol being defined as part of the comment.
 
+The files are of this form:
+
+@table @kbd
+@item Blank lines
+Are ignored.
+@item Leading and trailing whitespace
+(any number of blanks and/or tabs) is ignored. 
+@item Lines which begin with a number sign or hatch mark (@samp{#})
+Are ignored, i.e. they are comments.
+@item If the number sign is not the first non-blank character in the line
+it is treated as an ordinary character.
+@item If the first non-blank character is less-than (@samp{<})
+the line is also treated as a comment. This makes it possible to mark up tables as xhtml documents. 
+@item Lines which are not blank or comments
+define table entries.
+@end table
+
+The general format of a table entry is:
+
 @example
 opcode operands comments
 @end example
 
 Table entries may not be split between lines. The opcode is a mnemonic
-that specifies what the entry does. The operands may be character
+that specifies what the entry does. The operands are written using ASCII, and they may define character
 sequences, braille dot patterns or occasionally something else. They
-are described for each opcode. With some exceptions, opcodes expect a
+are described for each opcode; see @ref{Opcode Index}.
+With some exceptions, opcodes expect a
 certain number of operands. Any text on the line after the last
 operand is ignored, and may be a comment. A few opcodes accept a
 variable number of operands. In this case a number sign begins a
-comment unless it is preceded by a backslash (@samp{\}). @xref{Opcode
-Index}, for a list of opcodes, with a link to each one.
+comment unless it is preceded by a backslash (@samp{\}). 
 
 Here are some examples of table entries.
 
 @example
 # This is a comment.
 always world 456-2456 A word and the dot pattern of its contraction
+digit \x0662 12 # Arabic numeral 2 (٢)
 @end example
 
 Most opcodes have both a "characters" operand and a "dots" operand,
