Hi Bue,
Thanks for the interesting feedback. And thanks for explaining that you
don't necessarily use a hyphenation table per se but a related approach.
By the way, your memory is correct. Frank Liang was the person who
apparently finalized the TeX hyphenation algorithm for American English.
Here's a link to his thesis from August 1983. I believe that other people
later developed the corresponding algorithms for UK English and for other
languages.
https://tug.org/docs/liang/liang-thesis.pdf
It's not surprising to me that the most efficient algorithms for braille
systems for different natural languages would be different. The only
forward braille translators I've spent much time developing have been ones
written in Java for EBAE. All of these ran so fast that I could translate
ordinary but lengthy literary fiction in less than a minute. (It's been a
while since I've done any timing tests, so I don't remember the exact
numbers. Plus computers keep getting faster!) I suspect there is quite a bit of
computational overhead in a single translation app that handles dozens of
different braille systems and multiple input formats. And even in a single
braille system there are the rules for indicators along with special cases
such as alphanumeric items or email addresses that add overhead.
I was surprised that you wrote that a previous attempt at a dictionary-based
program for Danish was abandoned because it was too slow. My experience was
that searching a table (dictionary) using the java.util.HashMap class was
faster than my translation table implementation, but that result could have
been implementation dependent. It's also possible that a Danish document has
many more different words than an English document on a similar subject. If
so, that factor could mean that what is best for Danish braille translation
differs from what is best for English braille translation.
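To make the dictionary idea concrete, here is a minimal sketch of the kind of
whole-word lookup I have in mind, using java.util.HashMap. The class and
method names are made up for this email, the two sample entries are only for
illustration, and passing unknown words through unchanged is just a
placeholder for whatever fallback a real translator would use.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: whole-word dictionary lookup backed by a HashMap.
final class DictionaryLookup {
    private final Map<String, String> entries = new HashMap<>();

    // Stores a translation under the lower-cased form of the word.
    void put(String word, String braille) {
        entries.put(word.toLowerCase(Locale.ROOT), braille);
    }

    // Returns the stored translation, or null if the word is unknown.
    String translateWord(String word) {
        return entries.get(word.toLowerCase(Locale.ROOT));
    }

    // Translates whitespace-separated words. Unknown words are passed
    // through unchanged (an assumption; a real program needs a fallback).
    String translateText(String text) {
        StringBuilder out = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is blank
            }
            String braille = translateWord(word);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(braille != null ? braille : word);
        }
        return out.toString();
    }
}
```

Each lookup is a single hash probe on average, regardless of how many entries
the dictionary holds, which is probably why it was fast in my tests. The cost
that grows with the dictionary is memory, not lookup time.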
I think your approach of maintaining a large accurate dictionary that you
can use to ensure that changes don't introduce unexpected errors is a really
good idea.
If I understood what you wrote, you already have the ability to
automatically compare the braille translations in your dictionary with the
results from your "hyphenation table." It thus appears to me that, if you
haven't already done so, you could extend this software to act as a simple
approximate braille translator for plain-text Danish, ignoring any need for
indicators. This should allow you to use separate runs of the same software
to compare the relative efficiency of the dictionary and "hyphenation table"
methods for translation. (You could use a prepass to delete any items from
the input text that aren't included in the dictionary so both approaches
could handle the same input.)
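In case it helps, the comparison I'm describing could be sketched roughly as
follows. This is only an outline under my own assumptions: I don't know how
your software is structured, so the table-based method is represented here by
a generic function from word to braille, and all of the names are
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.UnaryOperator;

// Hypothetical harness; none of these names come from liblouis or from
// any existing software.
final class MethodComparison {

    // Prepass: drop every word the dictionary does not contain, so that
    // both methods are asked to translate exactly the same input.
    static List<String> keepKnownWords(List<String> words,
                                       Map<String, String> dictionary) {
        List<String> kept = new ArrayList<>();
        for (String word : words) {
            if (dictionary.containsKey(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    // Counts the words for which the table-based result differs from the
    // dictionary entry.
    static int countMismatches(List<String> words,
                               Map<String, String> dictionary,
                               UnaryOperator<String> tableTranslator) {
        int mismatches = 0;
        for (String word : words) {
            if (!Objects.equals(dictionary.get(word), tableTranslator.apply(word))) {
                mismatches++;
            }
        }
        return mismatches;
    }

    // Crude wall-clock timing of one method over the whole word list.
    static long timeNanos(List<String> words,
                          UnaryOperator<String> translator,
                          int repetitions) {
        long checksum = 0; // use the results so the loop is not optimized away
        long start = System.nanoTime();
        for (int r = 0; r < repetitions; r++) {
            for (String word : words) {
                String braille = translator.apply(word);
                if (braille != null) {
                    checksum += braille.length();
                }
            }
        }
        long elapsed = System.nanoTime() - start;
        if (checksum < 0) {
            System.out.println(checksum); // never true; keeps checksum live
        }
        return elapsed;
    }
}
```

For the dictionary side, dictionary::get can be passed as the translator.
Timings like this are rough on the JVM because of just-in-time compilation,
so it would be worth running each method several times and ignoring the first
few runs before comparing the numbers.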
Best wishes,
SusanJ