[liblouis-liblouisxml] Re: Use of the lou_hyphenate function

  • From: "John J. Boyer" <johnjboyer@xxxxxxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Mon, 8 Jun 2009 10:37:52 -0500

Michael,

You have some good points. The hyphens string returned by lou_hyphenate 
should contain only 0s and 1s. It is a good idea to return a string of 
all 0s if the word cannot be hyphenated. You have discovered a bug. 
Thanks for the suggestion. I'll let you know when I have made the fixes.
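
One possible reading of the 48, offered as an assumption rather than 
confirmed behaviour: 48 is the ASCII code for the character '0' and 49 
for '1', so the buffer may hold characters rather than numbers. If so, a 
"non-zero" test would match every position. Below is a minimal Java 
sketch under that assumption. HyphenPoints, syllableStarts and 
withHyphens are hypothetical names, not part of liblouis or any existing 
binding, and the marker string is written by hand, not taken from real 
lou_hyphenate output.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not part of liblouis or of any existing binding. */
final class HyphenPoints {

    /**
     * Returns the index of every position marked '1'.
     * Assumption: each element is the ASCII character '0' (48) or '1' (49),
     * not the numeric value 0 or 1.
     */
    static int[] syllableStarts(byte[] hyphens) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i < hyphens.length; i++) {
            if (hyphens[i] == '1') {
                starts.add(i);
            }
        }
        int[] result = new int[starts.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = starts.get(i);
        }
        return result;
    }

    /** Inserts '-' before each marked position, skipping position 0. */
    static String withHyphens(String word, byte[] hyphens) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            if (i > 0 && i < hyphens.length && hyphens[i] == '1') {
                sb.append('-');
            }
            sb.append(word.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hand-written marker string for illustration only.
        byte[] marks = "000100".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.toString(syllableStarts(marks)));
        System.out.println(withHyphens("ladder", marks));
    }
}
```

The int array form is the second of the two return types discussed 
further down the thread. A word that cannot be hyphenated would simply 
give an empty array.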

John

On Sun, Jun 07, 2009 at 12:35:26PM +0100, Michael Whapples wrote:
> Hello,
> I have made some progress now: I can get something which seems like 
> correct behaviour out of lou_hyphenate. One thing which slightly caught 
> me out is that the docs say a 1 is at the beginning of a syllable and 0 
> elsewhere, so I was getting my code to check for 1s. However, printing 
> out the values from hyphens reveals that it contains values other than 
> 0 and 1 (e.g. 48). If I assume any non-zero value instead of 1, I think 
> this makes sense. Is this correct?
> 
> Also, I have noticed that certain character sequences can cause 
> lou_hyphenate to return 0 (i.e. fail hyphenation). One such string is 
> "adder", but if that sequence is part of a larger word such as "ladder", 
> lou_hyphenate works fine. So does lou_hyphenate returning 0 mean more 
> than an error (i.e. can it also mean no hyphenation is possible)? I 
> would expect that if the word cannot be hyphenated, hyphens should 
> contain just zeros and lou_hyphenate should return 1 (success), as the 
> function didn't hit an error; it's just that the word can't be 
> hyphenated, as shown in the hyphens content.
> 
> Michael Whapples
> On 07/06/09 04:36, John J. Boyer wrote:
> >Your inferences from the liblouisxml code are correct. You definitely
> >must have a hyphenation table. It is placed after the translation table
> >name, separated by a comma. For example, en-us-g2.ctb,hyph_en_US.dic
> >
> >The en-GB-g2.ctb table should work with this hyphenation table as well.
> >
> >John
> >
> >On Sat, Jun 06, 2009 at 11:28:07PM +0100, Michael Whapples wrote:
> >   
> >>Not being a C person I haven't given the source code of liblouisxml
> >>great attention. However I did have a quick look at the very specific
> >>part of the code you pointed to and this is what I gathered:
> >>
> >>* liblouisxml seems to split the text into words before passing it to
> >>the lou_hyphenate function.
> >>* liblouisxml deals with some of the hyphenation itself (e.g. if a hyphen
> >>is already in the word).
> >>* The rest which I could gather was already known from the liblouis
> >>documentation.
> >>
> >>So going with the first point of single words I tried passing in just
> >>one word, but still get lou_hyphenate returning 0. I don't seem to get
> >>any log messages produced from liblouis.
> >>
> >>Do you have a minimal example for using lou_hyphenate which I could
> >>examine? Ideally one where it is easy to see what the parameters are
> >>which are being passed into lou_hyphenate.
> >>
> >>Is there any way I can get details of why liblouis is returning 0?
> >>
> >>I still wonder about the table I am using, should en-us-g2.ctb work? I
> >>was unable to gather this from looking at the liblouisxml source.
> >>
> >>Michael Whapples
> >>On 06/06/09 17:06, John J. Boyer wrote:
> >>     
> >>>The lou_hyphenate function is tricky, as is hyphenation in general. For
> >>>an example of its use look at the hyphenate function in the liblouisxml
> >>>module transcriber.c.
> >>>
> >>>John
> >>>
> >>>On Sat, Jun 06, 2009 at 04:26:43PM +0100, Michael Whapples wrote:
> >>>
> >>>       
> >>>>Hello,
> >>>>I have tried to add support for the lou_hyphenate function into my Java
> >>>>bindings, but I seem to only get the value 0 returned (i.e. it's failing
> >>>>to complete). Unfortunately I don't know why it fails to complete. I am
> >>>>using the en-us-g2.ctb translation table as I understand that the
> >>>>en-GB-g2.ctb table isn't so well developed. I also tried passing in the
> >>>>following string for translation table to see if specifying a
> >>>>hyphenation dictionary would help "en-us-g2.ctb,hyph_en_US.dic" but
> >>>>still no success.
> >>>>
> >>>>I guess the first thing to check is whether I am using a suitable table.
> >>>>If not, what would be a correct value for trantab?
> >>>>
> >>>>Also, for the Java developers: what would be your preferred return type?
> >>>>I plan to have it return a byte array with values as given by
> >>>>lou_hyphenate in the hyphens parameter. An alternative I can think of is
> >>>>to return an int array with each value being the index of a 1 value in
> >>>>the hyphens parameter of lou_hyphenate (i.e. by iterating over the return
> >>>>value you would get each index of the beginning of a syllable, which
> >>>>could be used on the string you passed into the method).
> >>>>
> >>>>Michael Whapples
> >>>>For a description of the software and to download it go to
> >>>>http://www.jjb-software.com
> >>>>

-- 
My websites:
GodTouches Digital Ministry, Inc. http://www.abilitiessoft.com/godtouches
Abilitiessoft, Inc. http://www.abilitiessoft.com
Location: Madison, WI, USA
