[liblouis-liblouisxml] Re: Use of the lou_hyphenate function

  • From: "John J. Boyer" <johnjboyer@xxxxxxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Mon, 8 Jun 2009 13:11:15 -0500

Michael,

Thanks for finding the bugs! Hyphenation is of interest to others on the 
list. The new lou_checkhyphens tool will help me get the bugs out of the 
woodwork.

John

On Mon, Jun 08, 2009 at 06:00:38PM +0100, Michael Whapples wrote:
> Hello,
> I am making more progress: I can't fault hyphenation when I do it with
> the original text (i.e. mode=0). I had been making a silly mistake, which
> I spotted while trying to make more sense of the transcriber.c file in
> liblouisxml: I had been checking for the numerical values 1 and 0
> rather than the chars '1' and '0'.
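
A minimal sketch of the mix-up described above, written in Python rather than the Java of the bindings and without calling liblouis. The sample buffer and both function names are hypothetical. The only facts it relies on are the ASCII codes of '0' and '1' (48 and 49), which is why a printed value of 48 is simply the character '0'.

```python
def syllable_starts(hyphens: bytes) -> list[int]:
    """Return the positions holding the character '1' (byte value 49)."""
    return [i for i, value in enumerate(hyphens) if value == ord("1")]


def numeric_check(hyphens: bytes) -> list[int]:
    """The mistaken check: compares against the number 1, not the char '1'."""
    return [i for i, value in enumerate(hyphens) if value == 1]


# Hypothetical hyphens buffer for a six-letter word (not real liblouis output).
sample = b"001000"

print(ord("0"), ord("1"))       # 48 49
print(syllable_starts(sample))  # [2]
print(numeric_check(sample))    # []
```

Comparing against the character rather than "any non-zero value" matters here, because '0' itself is non-zero when read as a number.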
> 
> When I use hyphenation with a translated string (i.e. mode=1), the hyphens
> array contains values other than '1' and '0'. It seems that it might
> sometimes be correct (when checking only for the value '1'), but I am
> uncertain. I can commit some work to Mercurial to show the values in
> hyphens if that would be useful (or, if you prefer, I can just capture the
> output and post it here).
> 
> Michael Whapples
> On 08/06/09 16:37, John J. Boyer wrote:
> >Michael,
> >
> >You have some good points. The hyphens string returned by lou_hyphenate
> >should contain only 0's and 1's. It is a good idea to return a string of
> >all 0s if the word cannot be hyphenated. You have discovered a bug.
> >Thanks for the suggestion. I'll let you know when I have made the fixes.
> >
> >John
> >
> >On Sun, Jun 07, 2009 at 12:35:26PM +0100, Michael Whapples wrote:
> >   
> >>Hello,
> >>I have made some progress now: I can get something which seems like
> >>correct behaviour out of lou_hyphenate. One thing which slightly caught
> >>me out is that the docs say a 1 is at the beginning of a syllable and 0
> >>elsewhere, so I was getting my code to check for 1s. However, printing
> >>out the values from hyphens reveals that it contains values other than 0
> >>and 1 (e.g. 48). If I assume any non-zero value instead of 1, I think this
> >>makes sense. Is this correct?
> >>
> >>Also, I have noticed that certain character sequences can cause
> >>lou_hyphenate to return 0 (i.e. fail hyphenation). One such string is
> >>"adder", but if that sequence is part of a larger word such as "ladder",
> >>lou_hyphenate works fine. So does lou_hyphenate returning 0 mean more
> >>than an error (i.e. no hyphenation possible)? If the word cannot be
> >>hyphenated, I would expect hyphens to contain just zeros and
> >>lou_hyphenate to return 1 (success), as the function didn't hit an error;
> >>it's just that the word can't be hyphenated, as shown by the contents of
> >>hyphens.
> >>
> >>Michael Whapples
> >>On 07/06/09 04:36, John J. Boyer wrote:
> >>     
> >>>Your inferences from the liblouisxml code are correct. You definitely
> >>>must have a hyphenation table. It is placed after the translation table
> >>>name, separated by a comma. For example, en-us-g2.ctb,hyph_en_US.dic
> >>>
> >>>The en-GB-g2.ctb table should work with this hyphenation table as well.
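
A small sketch of the table-list convention described above: translation table first, then the hyphenation table, separated by a comma. It only builds and splits the string and does not call liblouis. The helper names are hypothetical; the table names are the ones given in the message.

```python
def table_list(translation_table: str, hyphenation_table: str) -> str:
    """Join the two table names with a comma, translation table first."""
    return f"{translation_table},{hyphenation_table}"


def split_table_list(tables: str) -> list[str]:
    """Split a comma-separated table list back into its table names."""
    return [name.strip() for name in tables.split(",") if name.strip()]


tables = table_list("en-us-g2.ctb", "hyph_en_US.dic")
print(tables)                    # en-us-g2.ctb,hyph_en_US.dic
print(split_table_list(tables))  # ['en-us-g2.ctb', 'hyph_en_US.dic']
```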
> >>>
> >>>John
> >>>
> >>>On Sat, Jun 06, 2009 at 11:28:07PM +0100, Michael Whapples wrote:
> >>>
> >>>       
> >>>>Not being a C person I haven't given the source code of liblouisxml
> >>>>great attention. However I did have a quick look at the very specific
> >>>>part of the code you pointed to and this is what I gathered:
> >>>>
> >>>>* liblouisxml seems to split the text into words before passing it to
> >>>>the lou_hyphenate function.
> >>>>* liblouisxml deals with some of the hyphenation itself (e.g. if a hyphen
> >>>>is already in the word).
> >>>>* The rest of what I could gather was already known from the liblouis
> >>>>documentation.
> >>>>
> >>>>So going with the first point of single words I tried passing in just
> >>>>one word, but still get lou_hyphenate returning 0. I don't seem to get
> >>>>any log messages produced from liblouis.
> >>>>
> >>>>Do you have a minimal example of using lou_hyphenate which I could
> >>>>examine? Ideally one where it is easy to see which parameters are
> >>>>being passed into lou_hyphenate.
> >>>>
> >>>>Is there any way I can get details of why liblouis is returning 0?
> >>>>
> >>>>I still wonder about the table I am using: should en-us-g2.ctb work? I
> >>>>was unable to gather this from looking at the liblouisxml source.
> >>>>
> >>>>Michael Whapples
> >>>>On 06/06/09 17:06, John J. Boyer wrote:
> >>>>
> >>>>         
> >>>>>The lou_hyphenate function is tricky, as is hyphenation in general. For
> >>>>>an example of its use look at the hyphenate function in the liblouisxml
> >>>>>module transcriber.c.
> >>>>>
> >>>>>John
> >>>>>
> >>>>>On Sat, Jun 06, 2009 at 04:26:43PM +0100, Michael Whapples wrote:
> >>>>>
> >>>>>
> >>>>>           
> >>>>>>Hello,
> >>>>>>I have tried to add support for the lou_hyphenate function to my Java
> >>>>>>bindings, but I only seem to get the value 0 returned (i.e. it's failing
> >>>>>>to complete). Unfortunately I don't know why it fails. I am using the
> >>>>>>en-us-g2.ctb translation table, as I understand that the en-GB-g2.ctb
> >>>>>>table isn't so well developed. I also tried passing in the string
> >>>>>>"en-us-g2.ctb,hyph_en_US.dic" as the translation table, to see whether
> >>>>>>specifying a hyphenation dictionary would help, but still no success.
> >>>>>>
> >>>>>>I guess the first thing to check is whether I am using a suitable
> >>>>>>table. If not, what would be a correct value for trantab?
> >>>>>>
> >>>>>>Also, for the Java developers: what would be your preferred return
> >>>>>>type? I plan to have it return a byte array with the values given by
> >>>>>>lou_hyphenate in the hyphens parameter. An alternative I can think of
> >>>>>>is to return an int array in which each value is the index of a 1 in
> >>>>>>the hyphens parameter of lou_hyphenate (i.e. by iterating over the
> >>>>>>return value you would get the index of the beginning of each syllable,
> >>>>>>which could be used on the string you passed into the method).
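
The second return type proposed above (an array of syllable-start indices) could be derived from the first roughly as follows. This is a sketch in Python, not the Java bindings themselves, and it does not call liblouis. The hyphens string used for "ladder" is a hypothetical value chosen for illustration, not actual lou_hyphenate output, and both function names are made up.

```python
def hyphen_indices(hyphens: str) -> list[int]:
    """Return each index at which the hyphens string holds the char '1'."""
    return [i for i, flag in enumerate(hyphens) if flag == "1"]


def split_syllables(word: str, indices: list[int]) -> list[str]:
    """Cut the word at each index, treating it as the start of a syllable."""
    parts, start = [], 0
    for index in indices:
        if index > start:  # skip index 0, which would yield an empty part
            parts.append(word[start:index])
            start = index
    parts.append(word[start:])
    return parts


word = "ladder"
hyphens = "000100"  # hypothetical: one '0' or '1' per character of the word

indices = hyphen_indices(hyphens)
print(indices)                                   # [3]
print("-".join(split_syllables(word, indices)))  # lad-der
```

A string of all zeros gives an empty index list and leaves the word in one piece, which matches the "cannot be hyphenated" case discussed elsewhere in the thread.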
> >>>>>>
> >>>>>>Michael Whapples
> >>>>>>For a description of the software and to download it go to
> >>>>>>http://www.jjb-software.com

-- 
John J. Boyer; President, Chief Software Developer
JJB Software, Inc.
http://www.jjb-software.com
Madison, WI USA
Developing software for people with disabilities
