Michael,

You have some good points. The hyphens string returned by lou_hyphenate
should contain only 0s and 1s. It is a good idea to return a string of
all 0s if the word cannot be hyphenated. You have discovered a bug.
Thanks for the suggestion. I'll let you know when I have made the fixes.

John

On Sun, Jun 07, 2009 at 12:35:26PM +0100, Michael Whapples wrote:
> Hello,
> I have made some progress now: I can get something which seems like
> correct behaviour out of lou_hyphenate. One thing which slightly caught
> me out is that the docs say a 1 is at the beginning of a syllable and 0
> elsewhere, so I was getting my code to check for 1s. However, printing
> out the values from hyphens reveals it to contain values other than 0
> and 1 (e.g. 48). If I assume any non-zero value instead of 1 I think
> this makes sense. Is this correct?
>
> Also I have noticed that certain character sequences can cause
> lou_hyphenate to return 0 (i.e. fail hyphenation). One such string is
> "adder", but if that sequence is part of a larger word such as "ladder",
> lou_hyphenate works fine. So does lou_hyphenate returning 0 mean more
> than an error (i.e. no hyphenation possible)? I would expect that if the
> word cannot be hyphenated then hyphens should contain just zeros and
> lou_hyphenate should return 1 (success), as the function didn't hit an
> error; it's just that the word can't be hyphenated, as shown in the
> hyphens content.
>
> Michael Whapples
> On 07/06/09 04:36, John J. Boyer wrote:
> >Your inferences from the liblouisxml code are correct. You definitely
> >must have a hyphenation table. It is placed after the translation table
> >name, separated by a comma. For example, en-us-g2.ctb,hyph_en_US.dic
> >
> >The en-GB-g2.ctb table should work with this hyphenation table as well.
> >
> >John
> >
> >On Sat, Jun 06, 2009 at 11:28:07PM +0100, Michael Whapples wrote:
> >
> >>Not being a C person I haven't given the source code of liblouisxml
> >>great attention.
> >>However I did have a quick look at the very specific
> >>part of the code you pointed to and this is what I gathered:
> >>
> >>* liblouisxml seems to split the text into words before passing it to
> >>the lou_hyphenate function.
> >>* liblouisxml deals with some of the hyphenation itself (e.g. if a hyphen
> >>is already in the word).
> >>* The rest which I could gather was already known from the liblouis
> >>documentation.
> >>
> >>So going with the first point of single words I tried passing in just
> >>one word, but still get lou_hyphenate returning 0. I don't seem to get
> >>any log messages produced from liblouis.
> >>
> >>Do you have a minimal example for using lou_hyphenate which I could
> >>examine? Ideally one where it is easy to see what the parameters are
> >>which are being passed into lou_hyphenate.
> >>
> >>Is there any way I can get details of why liblouis is returning 0?
> >>
> >>I still wonder about the table I am using: should en-us-g2.ctb work? I
> >>was unable to gather this from looking at the liblouisxml source.
> >>
> >>Michael Whapples
> >>On 06/06/09 17:06, John J. Boyer wrote:
> >>
> >>>The lou_hyphenate function is tricky, as is hyphenation in general. For
> >>>an example of its use look at the hyphenate function in the liblouisxml
> >>>module transcriber.c.
> >>>
> >>>John
> >>>
> >>>On Sat, Jun 06, 2009 at 04:26:43PM +0100, Michael Whapples wrote:
> >>>
> >>>>Hello,
> >>>>I have tried to add support for the lou_hyphenate function into my Java
> >>>>bindings, but I seem to only get the value 0 returned (i.e. it's failing
> >>>>to complete). Unfortunately I don't know why it fails to complete. I am
> >>>>using the en-us-g2.ctb translation table as I understand that the
> >>>>en-GB-g2.ctb table isn't so well developed. I also tried passing in the
> >>>>following string for the translation table to see if specifying a
> >>>>hyphenation dictionary would help, "en-us-g2.ctb,hyph_en_US.dic", but
> >>>>still no success.
> >>>>
> >>>>I guess the first thing to check is whether I am using a suitable
> >>>>table. If not, what would be a correct value for trantab?
> >>>>
> >>>>Also, for the Java developers: what would be your preferred return type?
> >>>>I plan to have it return a byte array with values as given by
> >>>>lou_hyphenate in the hyphens parameter. An alternative I can think of is
> >>>>to return an int array with each value being the index of a 1 value in
> >>>>the hyphens parameter of lou_hyphenate (i.e. by iterating over the return
> >>>>value you would get each index of the beginning of a syllable, which
> >>>>could be used on the string you passed into the method).
> >>>>
> >>>>Michael Whapples
> >>>>For a description of the software and to download it go to
> >>>>http://www.jjb-software.com

--
My websites:
GodTouches Digital Ministry, Inc. http://www.abilitiessoft.com/godtouches
Abilitiessoft, Inc. http://www.abilitiessoft.com
Location: Madison, WI, USA
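
The two Java return types Michael weighs in the thread (the raw byte array
versus an int array of syllable-start indices) could be related roughly as
in the sketch below. It is only an illustration: HyphenPoints,
syllableStarts and withHyphens are hypothetical names, not part of liblouis
or of the Java bindings, and the buffer in main is made up rather than real
lou_hyphenate output. The sketch also assumes that the 48 seen in hyphens
is the ASCII code for the character '0', so it accepts a break marker
either as the number 1 or as the character '1'. The thread does not confirm
that reading.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not part of liblouis or of any existing binding. */
final class HyphenPoints {

    private HyphenPoints() {}

    /**
     * Converts a raw hyphens buffer into the indices where a syllable begins.
     * Assumption: a break is marked by numeric 1 or by the character '1'
     * (ASCII 49); anything else, including '0' (ASCII 48), is "no break".
     */
    static int[] syllableStarts(byte[] hyphens) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i < hyphens.length; i++) {
            if (hyphens[i] == 1 || hyphens[i] == '1') {
                starts.add(i);
            }
        }
        int[] result = new int[starts.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = starts.get(i);
        }
        return result;
    }

    /**
     * Inserts '-' before each index in starts (expected in ascending order).
     * An index of 0 adds nothing, since a word cannot begin with a break.
     */
    static String withHyphens(String word, int[] starts) {
        StringBuilder sb = new StringBuilder();
        int next = 0;
        for (int i = 0; i < word.length(); i++) {
            if (next < starts.length && starts[next] == i) {
                if (i > 0) {
                    sb.append('-');
                }
                next++;
            }
            sb.append(word.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up buffer for "ladder"; NOT actual lou_hyphenate output.
        byte[] hyphens = "000100".getBytes(StandardCharsets.US_ASCII);
        int[] starts = syllableStarts(hyphens);
        System.out.println(Arrays.toString(starts));        // prints [3]
        System.out.println(withHyphens("ladder", starts));  // prints lad-der
    }
}
```

With this shape, a buffer of all zeros simply yields an empty index array,
which fits the suggestion that a word with no hyphenation points should not
be reported as a failure.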