The Swedish Braille Authority eliminated contractions in Swedish Braille a
number of years ago and found that the elimination did not add significantly
to the length of the books.
How much simpler would our efforts be without contractions to think about?
On Jun 30, 2015, at 10:31 AM, Ken Perry <kperry@xxxxxxx> wrote:
I want to point out this is why I have felt the current way braille is being
created and this includes UEB is wrong. We should be working towards full
backtranslation because the computer world is where braille is going. This
means when UEB is worked on the rules should be fully forward and back
translatable without context sensitivity. This could be accomplished if the
working groups involved thought about it first and about other concerns
second. This will continue to bite us in years to come.
Ken
-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Susan
Jolly
Sent: Saturday, June 27, 2015 3:28 PM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] General comments on backtranslation
I too fully agree with Bue who wrote "Actually, [automatic] backtranslation
is much harder than one would think."
I would like to explain why I believe that completely accurate automatic
backtranslation of braille is very difficult. In my opinion it is not
reasonable to expect a fully accurate general solution applicable to a large
number of different braille systems. There are several braille-specific
technical reasons why this is the case.
Before discussing particular reasons for the difficulties of automatic
backtranslation or interpretation of braille it is useful to consider some
general background. Interpretation of braille text requires a two-step
process known in computer science as lexical analysis followed by parsing.
Lexical analysis is a process of tokenization whereby text consisting of a
sequence of characters is converted to a sequence of tokens. A token is one
or more adjacent characters that represent a single symbol.
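To make the idea of a token concrete, here is a minimal sketch of a lexer in Python. It is only an illustration, not how liblouis works. It assumes the braille is given as lowercase North American Braille ASCII, where "#" is the number sign (dots-3456) and "," is the capital indicator (dot 6). The token kinds (SPACE, NUMBER, CAPS, CELL) and the function name are hypothetical names chosen for this example.

```python
import re

# Hypothetical token kinds for a toy subset of literary braille.
# Input is assumed to be lowercase braille ASCII.
TOKEN_PATTERN = re.compile(
    r"(?P<SPACE>\s+)"        # one or more whitespace characters
    r"|(?P<NUMBER>#[a-j]+)"  # number sign followed by digit cells a-j
    r"|(?P<CAPS>,{1,2})"     # single or double capital indicator
    r"|(?P<CELL>.)"          # any other single cell
)

def tokenize(braille_ascii):
    """Convert a string of braille ASCII into a list of (kind, text) tokens."""
    return [(m.lastgroup, m.group()) for m in TOKEN_PATTERN.finditer(braille_ascii)]
```

Note that the NUMBER token spans several cells, which is the sense in which a token is one or more adjacent characters representing a single symbol.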
Parsing is the determination of the structure or meaning of a sequence of
tokens. In the case of translating simple literary braille, parsing would
typically begin by separating a sequence of tokens into shorter sequences
representing either whitespace or individual braille words. The rules of the
target braille system would then be used to backtranslate the words.
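The first parsing step described above can be sketched as follows. This is an illustration under stated assumptions only: tokens are taken to be (kind, text) pairs with the kind "SPACE" marking whitespace, which is a hypothetical representation, and the function name is made up for this example. The actual backtranslation of each word by the rules of the braille system is not shown.

```python
from itertools import groupby

def split_into_words(tokens):
    """Group (kind, text) tokens into ('SPACE', text) and ('WORD', tokens) items."""
    items = []
    # Consecutive tokens are grouped by whether or not they are whitespace.
    for is_space, group in groupby(tokens, key=lambda t: t[0] == "SPACE"):
        group = list(group)
        if is_space:
            items.append(("SPACE", "".join(text for _, text in group)))
        else:
            items.append(("WORD", group))
    return items
```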
You can read more about parsing here:
http://stackoverflow.com/questions/2933192/whats-the-best-way-to-explain-parsing-to-a-new-programmer
By the way, if you've studied parsing you know it isn't easy. If you have a
background in computer science but haven't spent much time on parsing, my
experience is that gaining facility with parsing can take a significant
amount of time.
Braille systems are especially difficult to parse automatically because they
are based on context-dependent grammars. Context in braille includes the
relative position of braille cells and the use of semantic indicators.
Context-dependent rules typically have to be taken into account to perform
both the lexical analysis and interpretation steps when backtranslating
braille.
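As a small illustration of position-dependent meaning, consider the dots-256 cell in EBAE, which reads as "dis" at the start of a word, "dd" in the middle of a word and a period at the end. The sketch below is deliberately simplified. It assumes braille ASCII input, where "4" stands for dots-256, and it ignores the other uses of this cell (such as the dollar sign) and all other rules. The function name is hypothetical.

```python
DOTS_256 = "4"  # braille ASCII character for the dots-256 cell

def interpret_dots_256(word, index):
    """Return the print reading of the dots-256 cell at word[index].

    Simplified: only the position of the cell within the word is considered.
    """
    if word[index] != DOTS_256:
        raise ValueError("cell at index is not dots-256")
    if index == len(word) - 1:
        return "."    # final position: period
    if index == 0:
        return "dis"  # initial position: "dis" contraction
    return "dd"       # medial position: "dd" contraction
```

The same cell therefore cannot be turned into a token without first looking at its neighbours.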
The standard way to write a translator is to specify the appropriate grammar
in a special source language. One or more generators are then used to
automatically convert the grammar to custom software which is used to perform
the translation. Lex and Yacc are probably the best known generators but
modern generators, such as ANTLR v4, are much more powerful.
That liblouis backtranslation works at all is a tour de force and I
congratulate John on this achievement! I say this after having spent more
than a year trying to develop accurate backtranslation software just for
EBAE. Handling ambiguities was of course one of the big issues. I wrote this
software back when EBAE still used the dots-34 cell for both a slash and the
"st" contraction and I still remember the difficulty of resolving items like
instructor/guide and postmaster/postmistress. I'd first assume that all the
dots-34 cells in an item represented the contraction and check the
translation in a hash table of print words. If it wasn't found, I'd then try
various combinations involving one or more slashes until all the separate
parts were found in the table.
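The search just described can be sketched roughly as follows. This is a reconstruction for illustration, not the original software. It assumes that every cell other than dots-34 has already been backtranslated to print letters, that dots-34 is written as "/" (its braille ASCII character), and that the hash table of print words is a Python set. The function name is hypothetical.

```python
from itertools import combinations

DOTS_34 = "/"  # braille ASCII character for the dots-34 cell

def resolve_dots_34(item, print_words):
    """Decide for each dots-34 cell whether it is the "st" contraction or a slash.

    Returns the print text, or None if no reading is found in print_words.
    """
    pieces = item.split(DOTS_34)
    gaps = range(len(pieces) - 1)  # one gap per dots-34 cell
    # Try zero slashes first (all "st"), then one slash, then two, and so on.
    for n_slashes in range(len(pieces)):
        for slash_gaps in combinations(gaps, n_slashes):
            candidate = pieces[0]
            for i in gaps:
                candidate += ("/" if i in slash_gaps else "st") + pieces[i + 1]
            # Accept the reading only if every slash-separated part is a known word.
            if all(part in print_words for part in candidate.split("/")):
                return candidate
    return None
```

For example, with a table containing "instructor" and "guide", the item "in/ructor/guide" is first tried as "instructorstguide", which fails, and is then resolved as "instructor/guide". The number of combinations grows quickly with the number of dots-34 cells, which is part of why this kind of ambiguity is costly.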
Best wishes,
SusanJ
For a description of the software, to download it and links to project
pages go to http://liblouis.org