[bookshare-discuss] New Release of xml2brl With Some Math

  • From: "John J. Boyer" <john@xxxxxxxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx, <bookshare-discuss@xxxxxxxxxxxxx>
  • Date: Tue, 7 Sep 2004 13:49:47 -0500 (CDT)

And it does real nice on Bookshare.org xml files. I just downloaded and
translated "Cat's Cradle" by Kurt Vonnegut. It has print page separator
lines and the print page numbers in the upper right-hand corner of each
braille page. Below is a detailed description and instructions for
downloading.

Computers to Help People, Inc. is proud to announce release 0.2 of the
xml2brl program, which translates files in xml or plain text into brf
files suitable for direct printing on a braille embosser. It now handles
some math. Below is the README file for the program, which gives details
of usage. To download the program, go to www.chpi.org/whatsnew.html.
This is Open Source, so anybody can download and compile it. Currently
it runs only on Linux, but we are looking into a Windows port.

---------------------


                              THE xml2brl PROGRAM

   This  is Release 0.2 of the xml2brl program. Changes from the previous
   release are listed in the file ChangeLog. This README file details any
   changes  in  usage.  The  most notable is that the program now handles
   some MathML.

   If  you are reading the plain-text README file, you may find it useful
   to  load  README.html  into  your  browser. This will enable you to go
   directly  to the sites where you can download the libraries upon which
   this  software  depends. Once you have installed xml2brl, you can also
   get  a  braille copy by running README.html through the program. It is
   written in xhtml, which is an xml flavor.

   The  braille  translation  part  of  the  xml2brl  program is based on
   BRLTTY.  All  the necessary BRLTTY files for use in the U.S. have been
   included  in  the  package. However, if you need different contraction
   tables or different text tables, you must obtain them from BRLTTY. You
   can download the latest version of BRLTTY from
   [1]http://dave.mielke.cc/brltty.

   Besides BRLTTY, this software depends on the following libraries:
   glib [2]ftp://ftp.gtk.org/pub/gtk/v2.4
   libxml2 [3]http://www.xmlsoft.org
   gdome2 [4]http://gdome2.cs.unibo.it

   You  must  download the latest versions of these libraries and install
   them in the above sequence.

   The  program  accepts  input  files in either xml or plain text and in
   many  natural languages (which may be in UTF-8 Unicode) and produces a
   brf  file  suitable for printing directly on an embosser. The brf file
   has  the  same  format  as  the files on Web-Braille and should behave
   exactly the same.

   xml  files must be well-formed. They are transcribed as specified by a
   semantic-actions  (.sem) file. If no such file exists for a given root
   element,  a  prototype  file  is created. Its name is formed by adding
   ".sem"  to  the  name of the root element, for example, "dtbook3.sem".
   The  user must then edit this file to obtain proper transcription. The
   program  will  print a warning message if the editing step is omitted.
   Instructions  on  how  to  do  this  editing  are given in the section
   "SEMANTIC-ACTIONS FILES" in this document.

   The  program  tests  whether a file is xml. If not, it assumes a plain
   text file. In this file, lines may be of any length. Paragraphs should
   be  separated by blank lines. Lines within paragraphs are concatenated
   before  translation, with blanks in place of newlines. If a blank line
   is desired in the output, use three blank lines.
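
   The plain-text rules above can be illustrated with a short Python
   sketch. It is not taken from the xml2brl sources (the program itself
   is written in C). The function name join_paragraphs is hypothetical,
   and the sketch assumes that "three blank lines" means a run of three
   or more consecutive blank lines.

```python
def join_paragraphs(text):
    """Split plain text into paragraphs the way the README describes.

    Returns a list of strings. An empty string stands for a blank line
    that the author requested by leaving three blank lines in the input.
    """
    paragraphs = []
    current = []     # lines of the paragraph being collected
    blank_run = 0    # consecutive blank lines seen since the last text line
    for line in text.splitlines():
        if line.strip():
            # Assumption: a run of three or more blank lines requests
            # one blank line in the output.
            if blank_run >= 3 and paragraphs:
                paragraphs.append("")
            blank_run = 0
            current.append(line.strip())
        else:
            if current:
                # A blank line ends the paragraph; its lines are joined
                # with single blanks in place of the newlines.
                paragraphs.append(" ".join(current))
                current = []
            blank_run += 1
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


sample = "First line\nsecond line\n\nNext paragraph\n\n\n\nAfter gap\n"
result = join_paragraphs(sample)
```

   Here result holds four items: the two joined paragraphs, an empty
   string for the requested blank line, and the final paragraph.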

   Whether  the  file  is  xml or plain text, paragraphs are indented two
   spaces.  There is a braille page number in the lower right-hand corner
   of each page. If an xml file contains print page numbers, and this has
   been specified in the semantic-actions file, a page-separator line is
   placed between print pages, and the print page number appears in the
   upper right-hand corner, preceded by the letters a, b, etc.

   The command line is:
   xml2brl inputfile outputfile
   If  you  omit  both  inputfile  and  outputfile  the program acts as a
   filter,  taking input from stdin and delivering output to stdout. This
   enables  xml2brl to be used in a chain of printer drivers, with output
   directly  to an embosser, if desired. If you wish to specify an output
   file  but  take  input  from  stdin,  use  a  minus  sign  in place of
   inputfile.  Options  are  set in a configuration file discussed in the
   section "CONFIGURATION FILE".
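
   For anyone calling xml2brl from a script, the three invocation forms
   described above can be summarized in a small Python sketch. The helper
   name xml2brl_argv is hypothetical. The README does not describe a form
   with an input file but no output file, so the sketch refuses that case
   rather than guessing.

```python
def xml2brl_argv(inputfile=None, outputfile=None):
    """Build the argument list for one xml2brl run.

    Hypothetical helper: it produces only the three call forms that the
    README describes.
    """
    if inputfile is None and outputfile is None:
        # Filter mode: input from stdin, output to stdout.
        return ["xml2brl"]
    if outputfile is None:
        # The README does not describe a one-argument form.
        raise ValueError("an output file is needed when an input file is given")
    if inputfile is None:
        # A minus sign in place of inputfile: read stdin, write a named file.
        return ["xml2brl", "-", outputfile]
    return ["xml2brl", inputfile, outputfile]


filter_form = xml2brl_argv()
stdin_form = xml2brl_argv(outputfile="book.brf")
file_form = xml2brl_argv("book.xml", "book.brf")
```

   The file names book.xml and book.brf are placeholders. The resulting
   list can be handed to subprocess.run in the Python standard library.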

   The author wishes to acknowledge his debt to the BRLTTY team. To learn
   more about BRLTTY, go to its official website,
   [5]http://dave.mielke.cc/brltty. The section "FILES" below tells which
   files have been copied from BRLTTY.

   Like BRLTTY, this software is under the GNU General Public License
   (GPL). The non-BRLTTY portions are copyright by the author, John J.
   Boyer, director@xxxxxxxx. The libraries listed previously are all part
   of the GNOME project and are under the GNU Lesser General Public
   License (LGPL). Details are given in the file COPYING.

INSTALLATION

   This  is  an  alpha  release. Therefore, it is best to install it in a
   subdirectory  of  your  home  directory.  To  do  this,  download  the
   distribution tarball into your home directory, then type
   tar xfz xml2brl-xxx.tar.gz
   This  will  create  the  directory xml2brl-xxx, where xxx is a version
   number.   After   installing   any  necessary  libraries,  go  to  the
   xml2brl-xxx  directory  and  type "make". This will create the xml2brl
   program. If you wish to re-create the program, first type "make clean"
   and then "make".

   Before  you try to run the program, execute the following statement at
   the command prompt:
   export LD_LIBRARY_PATH='/usr/local/lib'
   You may wish to add this command to your .bashrc script.

SEMANTIC-ACTIONS FILES

   These  files  tell  xml2brl how to handle your documents. Whenever the
   program encounters a new root element, it creates a prototype semantic
   actions file. Each line in this file has two columns. The first column
   is  the  word  "no",  signifying  that  no  semantic  action  has been
   specified.  The  second  column  may  contain one of the following: an
   element  name;  an  element  name, followed by a comma, followed by an
   attribute  name;  an element name, followed by a comma, followed by an
   attribute  name,  followed  by  a  comma,  followed  by  the first few
   characters  of an attribute value. The program prints a message saying
   it  is  creating this file, then terminates. Semantic files have names
   composed of the root element name and '.sem'.

   To  get  xml2brl  to transcribe your document correctly, you must edit
   the semantic file, replacing the word "no" in the first column with an
   appropriate  semantic action, such as "para" for paragraph, "heading1"
   for  the  main  heading,  etc.  The file sem_enum.h contains a list of
   valid  semantic  actions, most of which should be self-explanatory. If
   you  rerun  the  program without editing the semantic-actions file, it
   prints  a  message saying that the output will be unformatted. You can
   add  comments  to  the  file  by  using a number sign (#) as the first
   non-blank character in a line.
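
   The file layout described in this section can be illustrated with a
   short Python sketch of a reader. It is not the parser in semantics.c.
   The function name parse_sem is hypothetical, the columns are assumed
   to be separated by whitespace, and the element and attribute names in
   the sample are made up. Only "no", "para" and "heading1" come from
   this README.

```python
def parse_sem(text):
    """Read the text of a semantic-actions file.

    Returns (actions, unedited). actions maps a key of the form
    (element, attribute, value_prefix) to its semantic action, with
    None for parts that are absent. unedited lists the keys whose
    action is still "no".
    """
    actions, unedited = {}, []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue                      # blank line or comment
        fields = stripped.split(None, 1)  # assumption: whitespace-separated columns
        if len(fields) != 2:
            continue                      # not a two-column entry
        action, spec = fields
        parts = [p.strip() for p in spec.split(",", 2)]
        element, attribute, value = (parts + [None, None])[:3]
        key = (element, attribute, value)
        actions[key] = action
        if action == "no":
            unedited.append(key)
    return actions, unedited


sample = """\
# hypothetical entries; the element names are made up for illustration
heading1 h1
para p
no span,class
no div,class,chap
"""
actions, unedited = parse_sem(sample)
```

   In this sample the first two entries have been edited, and the last
   two are still waiting for a semantic action.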

   If  you transcribe a new document with the same root element, but with
   additional  element  names,  attribute  names or values, these will be
   added to the end of the semantic-actions file, preceded by the
   comment "#appended entries". You may then edit the new entries. If you
   wish the program to continue to take no action for an entry, leave it
   unchanged. Do not comment it out, because the program will then add it
   to the end of the file again as a new entry.
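
   Why a commented-out entry comes back can be seen in a small Python
   sketch of the append step. This is an illustration under stated
   assumptions, not the code in semantics.c: the function name
   entries_to_append is hypothetical, columns are assumed to be separated
   by whitespace, and the element names are made up.

```python
def entries_to_append(sem_text, found_specs):
    """Return the lines to add to the end of a semantic-actions file.

    sem_text is the current file content. found_specs lists the second
    column values (element, or element,attribute, and so on) met in the
    document being transcribed.
    """
    known = set()
    for line in sem_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue            # a commented-out entry is invisible here
        fields = stripped.split(None, 1)
        if len(fields) == 2:
            known.add(fields[1].strip())
    new = [spec for spec in found_specs if spec not in known]
    if not new:
        return []
    return ["#appended entries"] + ["no " + spec for spec in new]


existing = "para p\n#no span,class\n"
appended = entries_to_append(existing, ["p", "span,class", "em"])
```

   Because the entry for span,class was commented out, it is treated as
   unknown and is appended again together with the new entry for em.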

   Several semantic-actions files are provided with the program: one each
   for dtbook3 files, such as those produced by Bookshare.org; xhtml
   files, with or without included MathML; Microsoft Word files exported
   as xml; and docbook files.

CONFIGURATION FILE

   As   mentioned   previously,   options   for  xml2brl  are  set  by  a
   configuration  file.  This  file is called "xml2brl.cfg" and resembles
   the semantic-actions files. Each line has two columns: a keyword,
   such as CellsPerLine, and a value, such as 40. Comments are preceded
   by "#". The keywords should be self-explanatory.
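
   A minimal Python sketch of a reader for this format follows. It is not
   readconfig.c. The function name parse_config is hypothetical, columns
   are assumed to be separated by whitespace, and only lines that begin
   with "#" are treated as comments. CellsPerLine and 40 are the only
   keyword and value taken from this README. Values are kept as strings
   because the README does not say how each one is interpreted.

```python
def parse_config(text):
    """Read keyword/value pairs from the text of a configuration file."""
    settings = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue                      # blank line or comment
        fields = stripped.split(None, 1)  # assumption: whitespace-separated columns
        if len(fields) == 2:
            keyword, value = fields
            settings[keyword] = value.strip()
    return settings


sample = "# page layout\nCellsPerLine 40\n"
settings = parse_config(sample)
```

   After this runs, settings maps the keyword CellsPerLine to the string
   "40".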

FILES

   The  following  files  have  been  copied  without change from BRLTTY:
   brldefs.h  brl.h  countries.cti  ctb_compile.c ctb_definitions.h ctb.h
   ctb_translate.c  en-us-g2.ctb  misc.h tbl.c tbl.h text.nabcc.tbl. Note
   the   following   exceptions:  The  line  "include  countries.cti"  in
   en-us-g2.ctb  has  been  changed  to "include specsym.cti". The misc.c
   file  was  cut  down to only the functions needed by xml2brl and these
   functions were considerably modified.

   The following files were produced by the author:

   brffilt.c:  A  small filter for viewing brf files on a braille display
   with  translation  mode in BRLTTY turned off. It can also be used as a
   prototype  for  writing  other filters. To compile it, use the command
   line "gcc -o brffilt -O2 -Wall brffilt.c"

   ChangeLog: log of changes made from release to release

   COPYING: Detailed license

   dtbook3.sem: Semantic-actions file for books from Bookshare.org

   en-us-mathtext.ctb: Translation table for math documents

   examine_document.c: Traverses the DOM tree to determine
   characteristics of the document, such as whether it contains math.
   Also does preprocessing.

   html.sem: Semantic-actions file for xhtml documents

   Makefile: For compiling the whole program.

   readconfig.c: Reads and processes the configuration file

   readconfig.h: Header file for above

   README: Plain-text version of the following

   README.html: This file.

   read_TextTable.c: Basically a wrapper for the functions in tbl.c

   semantics.c: Contains functions for handling semantic-actions files
   and tables

   semantics.h: Header file for semantics.c; includes sem_enum.h

   sem_enum.h: List of valid semantic actions. Note that if you change
   this file you must recompile the entire program.

   sem_rout.c: Contains non-trivial semantic routines or routines that may
   vary with natural language

   sem_rout.h: Header file for above

   specsym.cti: Special symbols needed in translation of xml files

   transcribe_chemistry.c: Handles chemical formulas in DOM tree

   transcribe_document.c: This is the basic transcription routine, which
   traverses the DOM tree and calls transcribe_paragraph,
   transcribe_math, etc., as needed.

   transcribe_graphic.c: Handles SVG graphics in the DOM tree

   transcribe_math.c: Handles MathML and other xml math notations

   transcribe_music.c: transcribes music notation expressed in xml

   transcribe_paragraph.c:  Handles  "paragraphs", including headings, in
   the DOM tree

   transcriber.c:   Contains   the   low-level   transcription  routines,
   including the routine for transcribing plain text.

   transcriber.h: Header file for above

   w_wordDocument.sem: Semantic-actions file for Microsoft Word documents
   exported as xml

   xml2brl.c: The main program.

   xml2brl.cfg: Configuration file

   xml2brl.h: Header file for main program

References

   1. http://dave.mielke.cc/brltty
   2. ftp://ftp.gtk.org/pub/gtk/v2.4
   3. http://www.xmlsoft.org/
   4. http://gdome2.cs.unibo.it/
   5. http://dave.mielke.cc/brltty

-- 
John J. Boyer, Executive Director
godtouches Digital  Ministry, Inc.
www.godtouches.org
825 East Johnson; Madison, WI 53703


