[liblouis-liblouisxml] Re: A comment on contraction use and print syllables in English braille

  • From: "Susan Jolly" <easjolly@xxxxxxxxxxxxx>
  • To: <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Tue, 12 Jan 2016 16:20:17 -0700

Hi Bue,

Thanks for the interesting feedback. And thanks for explaining that you don't necessarily use a hyphenation table per se but a related approach.

By the way, your memory is correct. Frank Liang was the person who apparently finalized the TeX hyphenation algorithm for American English. Here's a link to his thesis from August 1983. I believe that other persons later developed the corresponding algorithms for UK English and for other languages.
https://tug.org/docs/liang/liang-thesis.pdf

It's not surprising to me that the most efficient algorithms for braille systems for different natural languages would be different. The only forward braille translators I've spent much time developing have been ones written in Java for EBAE. All of these ran so fast that I could translate ordinary but lengthy literary fiction in less than a minute. (It's been a while since I've done any timing tests, so I don't remember the exact numbers. Plus computers keep getting faster!) I suspect there is quite a bit of computational overhead in a single translation app that handles dozens of different braille systems and multiple input formats. And even in a single braille system there are the rules for indicators, along with special cases such as alphanumeric items or email addresses, that add overhead.

I was surprised that you wrote that a previous attempt at a dictionary-based program for Danish was abandoned because it was too slow. My experience was that searching a table (dictionary) using the java.util.HashMap class was faster than my translation table implementation, but that result could of course have been implementation dependent. It's also possible that a Danish document has many more different words than an English document on a similar subject. If so, that factor could mean that what is best for Danish braille translation differs from what is best for English braille translation.
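
A minimal sketch of the kind of whole-word HashMap lookup described above, not the original translator. The class name, the method names, and the three toy entries (print word mapped to ASCII braille) are assumptions made for illustration; capitalization and all other indicators are ignored, and unknown words simply pass through unchanged.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: whole-word dictionary lookup with java.util.HashMap.
class DictionaryTranslatorSketch {

    // Toy dictionary: print word -> ASCII braille. Illustrative entries only.
    static Map<String, String> toyDictionary() {
        Map<String, String> dict = new HashMap<>();
        dict.put("the", "!");
        dict.put("and", "&");
        dict.put("people", "p");
        return dict;
    }

    // Translate whitespace-separated words one at a time.
    // Words missing from the dictionary are copied through unchanged.
    static String translate(Map<String, String> dict, String text) {
        StringBuilder out = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is empty or all whitespace
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            // Lookup is case-insensitive; no capital indicator is emitted.
            out.append(dict.getOrDefault(word.toLowerCase(Locale.ROOT), word));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate(toyDictionary(), "the people and the cat"));
    }
}
```

Each word costs one hash lookup on average, regardless of how many entries the dictionary holds, which is one plausible reason this approach can outrun a rule table that is scanned for matching patterns.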

I think your approach of maintaining a large accurate dictionary that you can use to ensure that changes don't introduce unexpected errors is a really good idea.

If I understood what you wrote, you already have the ability to automatically compare the braille translations in your dictionary with the results from your "hyphenation table." It thus appears to me that if you haven't already done so you could extend this software to act as a simple approximate braille translator for plain text Danish ignoring any need for indicators. This should allow you to use separate runs of the same software to compare the relative efficiency of the dictionary and "hyphenation table" methods for translation. (You could use a prepass to delete any items from the input text that aren't included in the dictionary so both approaches could handle the same input.)
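
The comparison suggested above could be sketched roughly as follows. Everything here is hypothetical: the class and method names are invented, the toy dictionary stands in for the real Danish one, and the second translator is only a placeholder for the "hyphenation table" method, passed in as a plain function. The sketch shows the three steps: the prepass that keeps only words the dictionary knows, a count of words where the two methods disagree, and a crude wall-clock timing of each method over the same input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical harness for comparing two word-level translators.
class TranslatorComparisonSketch {

    // Toy stand-in for the large reference dictionary.
    static Map<String, String> toyDictionary() {
        Map<String, String> dict = new HashMap<>();
        dict.put("the", "!");
        dict.put("and", "&");
        dict.put("people", "p");
        return dict;
    }

    // Prepass: drop every word the dictionary lacks, so both methods see the same input.
    static List<String> keepKnownWords(List<String> words, Map<String, String> dict) {
        List<String> kept = new ArrayList<>();
        for (String w : words) {
            if (dict.containsKey(w)) {
                kept.add(w);
            }
        }
        return kept;
    }

    // Count the words on which the two translators produce different output.
    static int countMismatches(List<String> words,
                               Function<String, String> a,
                               Function<String, String> b) {
        int mismatches = 0;
        for (String w : words) {
            if (!a.apply(w).equals(b.apply(w))) {
                mismatches++;
            }
        }
        return mismatches;
    }

    // Crude timing: total nanoseconds to translate the word list `repeats` times.
    static long timeNanos(List<String> words, Function<String, String> translator, int repeats) {
        long sink = 0; // accumulate results so the work cannot be optimized away
        long start = System.nanoTime();
        for (int r = 0; r < repeats; r++) {
            for (String w : words) {
                sink += translator.apply(w).length();
            }
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.out.println(sink); // never true; keeps `sink` live
        }
        return elapsed;
    }

    public static void main(String[] args) {
        Map<String, String> dict = toyDictionary();
        Function<String, String> byDictionary = dict::get;
        // Placeholder for the table-driven method; a real one would apply the table's rules.
        Function<String, String> byTable = w -> w.replace("the", "!").replace("and", "&");

        List<String> input = keepKnownWords(Arrays.asList("the", "cat", "people", "and"), dict);
        System.out.println("words compared: " + input.size());
        System.out.println("mismatches: " + countMismatches(input, byDictionary, byTable));
        System.out.println("dictionary ns: " + timeNanos(input, byDictionary, 100000));
        System.out.println("table ns: " + timeNanos(input, byTable, 100000));
    }
}
```

Timings taken this way are only indicative, since JVM warm-up and garbage collection can distort short runs; repeating each measurement several times on a large text would give a fairer comparison.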

Best wishes,
SusanJ


For a description of the software, to download it and links to
project pages go to http://liblouis.org
