[liblouis-liblouisxml] Re: SV: Re: SV: Proposal for next liblouis release date

  • From: Bert Frees <bertfrees@xxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Wed, 8 Jan 2014 12:07:26 +0100

Hi Bue,

Aha! That explains a lot. I knew that libhyphen patterns are not the same
as TeX patterns, but I was assuming you were using the correct format. The
table at
https://code.google.com/p/liblouis/source/browse/trunk/tables/hyph_da_DK.dic
looks all right.

The patgen program in combination with the Perl script seems like a good
solution to me, or are there any drawbacks? Do you expect to get better
results from a "direct" implementation?
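
For readers following the thread, the pattern idea that both TeX-style and
libhyphen-style files build on can be sketched as below. This is a minimal
Python illustration of Liang-style pattern matching, not the liblouis or
libhyphen implementation. The function names, the toy patterns, and the
`left_min`/`right_min` parameters are assumptions made up for the example,
and it does not reproduce either file format's header or any preprocessing
a particular library may require.

```python
def parse_pattern(pattern):
    """Split a pattern such as 'ab2c' into letters 'abc' and weights [0, 0, 2, 0]."""
    letters = []
    weights = [0]  # weights[i] belongs to the slot before letters[i]
    for ch in pattern:
        if ch.isdigit():
            weights[-1] = int(ch)
        else:
            letters.append(ch)
            weights.append(0)
    return "".join(letters), weights


def hyphenation_points(word, patterns, left_min=2, right_min=2):
    """Return indices k where a break before word[k] is allowed (hypothetical helper)."""
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."  # '.' marks the word boundaries
    scores = [0] * (len(padded) + 1)   # scores[j] is the slot before padded[j]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            weights = table.get(padded[start:end])
            if weights is None:
                continue
            # Every matching pattern contributes; the highest weight per slot wins.
            for offset, weight in enumerate(weights):
                slot = start + offset
                scores[slot] = max(scores[slot], weight)
    # Odd weight allows a break, even weight forbids it.
    return [k for k in range(left_min, len(word) - right_min + 1)
            if scores[k + 1] % 2 == 1]


def hyphenate(word, patterns, left_min=2, right_min=2):
    """Join the pieces of the word with '-' at the allowed break points."""
    pieces, prev = [], 0
    for point in hyphenation_points(word, patterns, left_min, right_min):
        pieces.append(word[prev:point])
        prev = point
    pieces.append(word[prev:])
    return "-".join(pieces)
```

With the toy patterns `a1b` and `b1c`, the word "abcd" may break after "a"
and after "b"; adding `ab2c` suppresses the second break because the even
weight 2 outranks the odd weight 1 in that slot. A hyphenator that misses
some of the matching patterns would get such cases wrong without reporting
any error.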

Bert


2014/1/6 Bue Vester-Andersen <bue@xxxxxxxxxxxxxxxxxx>

> Hello Bert,
>
>
>
> Thank you, but I don't think that would be an option in my case. Also,
> that would mean that Liblouis didn't really support Danish contraction
> without the use of an additional hyphenation tool. That might be a
> showstopper for Danish contraction in many situations where Liblouis is
> currently being used successfully for making contracted Braille in other
> languages.
>
>
>
> However, I think we have found the source of the problem. I currently have
> a list of more than 33,000 manually hyphenated words. Until now, I have
> been generating the hyphenation pattern file using the patgen program from
> TeX Live. But apparently, Liblouis needs another flavor of pattern file,
> namely the type of pattern file used in OpenOffice and libhyphen. I
> thought the two were compatible, but apparently they are not. I have just
> read today that if you happen to use a TeX-style pattern file in this
> situation, you will end up with a lot of bad hyphenations, which is exactly
> what we are getting. It will simply "fail silently".
>
>
>
> I have found a Perl script that can convert a TeX-style hyphenation file
> into an OpenOffice-style file. I am currently looking for a patgen program
> that can generate the right type of hyphenation file straight away,
> preferably one that can run under Windows or be compiled to run under
> Windows.
>
> If you have any suggestions, I would be very thankful indeed.
>
>
>
> Kind regards
>
> Bue
>
>
>
>
>
> *From:* liblouis-liblouisxml-bounce@xxxxxxxxxxxxx [mailto:
> liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] *On behalf of *Bert Frees
> *Sent:* 6 January 2014 11:04
> *To:* liblouis-liblouisxml@xxxxxxxxxxxxx
> *Subject:* [liblouis-liblouisxml] Re: SV: Proposal for next liblouis release
> date
>
>
>
> Hello Bue,
>
>
>
> Would it be an option for you to do the hyphenation beforehand, using a
> standard tool? Liblouis has a function `translatePrehyphenated'. You give
> it the input text and input hyphenation points, and it gives you the
> translated text and output hyphenation points back. See
> https://github.com/liblouis/liblouis/blob/master/liblouis/lou_translateString.c#L278.
> We use this approach at SBS.
>
>
>
> Kind regards,
>
> Bert
>
>
>
> 2013/12/25 Bue Vester-Andersen <bue@xxxxxxxxxxxxxxxxxx>
>
> Hi Christian,
>
> I would very much like to get the hyphenation algorithm fixed before we
> release the next version. At the moment, using the current Danish tables,
> about half of the words from our test suite are hyphenated differently from
> what would be expected with the given set of tables. The tables are now
> excellent and produce very good hyphenation using standard hyphenation
> tools. However, Liblouis does not seem to be able to interpret the
> hyphenation tables correctly. So Danish contraction still looks very bad,
> because it depends extensively on correct hyphenation. Other languages also
> depend on the hyphenation.
>
> To me it looks like some loop counter getting out of sync or something
> like it. Mesar is working hard on this issue, but any help or suggestions
> would be greatly appreciated. We have a test suite of more than 30,000
> manually hyphenated words. As I said, they hyphenate correctly using the
> generated tables and standard hyphenation tools, but not with the Liblouis
> hyphenator. If we don't get it fixed before this release, we may have to
> wait another half year before we can have decent Danish contraction.
>
> Best regards and merry Christmas
> Bue
>
