[liblouis-liblouisxml] Re: Hyphenation

  • From: lars@xxxxxxxxxxxx (Lars Bjørndal)
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Tue, 19 May 2009 21:15:24 +0200

Just a little follow-up:

"John J. Boyer" <johnjboyer@xxxxxxxxxxxxx> writes:

> By private e-mail Lars sent me an example of incorrect hyphenation. The
> hyphenation algorithm is complicated. First, the word at the end of the
> line is checked for length. Words of less than 5 characters are not
> candidates for hyphenation. The word in the sample was much longer than
> this. Next, the word is back-translated so the hyphenation algorithm,
> which was derived from OpenOffice, can be applied to it. The result is
> stripped of leading and trailing punctuation and submitted to the
> algorithm. This produces a string of digits, with odd digits indicating
> where hyphenation may occur. The word is then forward translated, with
> position-tracking, so that positions in the translated word can be
> correlated with the positions where hyphenation may occur. The position
> nearest to the end of the line is chosen. 
>
> Hyphenation has always been rather uncertain. Since position-tracking 
> has been tweaked since the hyphenation algorithm was written, it is 
> probably time to revisit it. Work on math codes is taking priority at 
> the moment.
>
> I wonder if someone can find the program which produces the hyphenation 
> tables that we use. I did find the original paper describing it (by 
> Wang, I think), but it was a pdf and consisted mostly of page images. 
> Only very incomplete OCR had been done.

I'm sorry that I cannot help you with info about the program you asked
about. Did you get any other reply about this? To me, hyphenation
is somewhat important.
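
To make the steps John describes easier to discuss, here is a rough
Python sketch of how I read them. It is only an illustration and not
the liblouisxml code. The pattern list, the contraction table and all
function names are made up for the example, the back-translation step
is skipped (the function takes the print word directly), and the way a
break inside a contraction is rejected is my own assumption.

```python
import string

MIN_WORD_LENGTH = 5  # words of fewer than 5 characters are not candidates

# Toy data, invented for this sketch. Real tables hold thousands of patterns.
TOY_PATTERNS = ["y1p", "n1a", "a1t", "1he", "p2h"]
TOY_CONTRACTIONS = {"en": "5", "ation": ",n"}


def parse_pattern(pattern):
    """Split a pattern such as 'n1a' into letters 'na' and weights [0, 1, 0]."""
    letters, weights = [], [0]
    for ch in pattern:
        if ch.isdigit():
            weights[-1] = int(ch)
        else:
            letters.append(ch)
            weights.append(0)
    return "".join(letters), weights


def hyphen_digits(word, patterns):
    """One digit per letter; an odd digit allows a break after that letter."""
    padded = "." + word.lower() + "."
    scores = [0] * (len(padded) + 1)  # scores[i] is the weight before padded[i]
    for pattern in patterns:
        letters, weights = parse_pattern(pattern)
        start = padded.find(letters)
        while start != -1:
            for k, weight in enumerate(weights):
                scores[start + k] = max(scores[start + k], weight)
            start = padded.find(letters, start + 1)
    digits = [scores[i + 2] for i in range(len(word))]
    digits[-1] = 0  # never break after the last letter
    return "".join(str(d) for d in digits)


def forward_translate(text, contractions):
    """Toy translator with position tracking.

    Returns (output, out_pos), where out_pos[i] is the index of the first
    output cell produced from text[i].
    """
    out, out_pos, i = [], [], 0
    while i < len(text):
        for src, dst in contractions.items():
            if text.startswith(src, i):
                out_pos.extend([len(out)] * len(src))
                out.extend(dst)
                i += len(src)
                break
        else:  # no contraction matched: copy one character
            out_pos.append(len(out))
            out.append(text[i])
            i += 1
    return "".join(out), out_pos


def choose_break(print_word, cells_left, patterns, contractions):
    """Return (head_with_hyphen, tail) in translated form, or None."""
    core = print_word.strip(string.punctuation)
    if len(core) < MIN_WORD_LENGTH:  # simplification: checked after stripping
        return None
    digits = hyphen_digits(core, patterns)
    translated, out_pos = forward_translate(core, contractions)
    candidates = []
    for i, digit in enumerate(digits[:-1]):
        # Assumption: a break is dropped if it falls inside one contraction.
        if int(digit) % 2 == 1 and out_pos[i + 1] > out_pos[i]:
            candidates.append(out_pos[i + 1])
    fitting = [p for p in candidates if p + 1 <= cells_left]  # +1 for the hyphen
    if not fitting:
        return None
    cut = max(fitting)  # the position nearest to the end of the line
    return translated[:cut] + "-", translated[cut:]


print(hyphen_digits("hyphenation", TOY_PATTERNS))
# prints 01200110000
print(choose_break("hyphenation", 6, TOY_PATTERNS, TOY_CONTRACTIONS))
# prints ('hyph5-', ',n')
print(choose_break("hyphenation", 4, TOY_PATTERNS, TOY_CONTRACTIONS))
# prints ('hy-', 'ph5,n')
```

In this toy run the odd digit after the "a" is ignored because that
break would fall inside the "ation" contraction. If the position
tracking reports slightly different offsets than the hyphenation code
expects, the chosen break shifts, which may be the kind of error I saw.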

Best regards,
Lars

For a description of the software and to download it go to
http://www.jjb-software.com
