I can now confirm it's not to do with UCS4. I recompiled liblouis for UCS2 and set my LD_LIBRARY_PATH environment variable to point at this UCS2 build. I confirmed the right library is picked up: when I tried it with python it reported the correct version and gave the scrambled translation output I would expect from combining UCS2 and UCS4. I then ran my example with lou_allround (I couldn't use python as I don't have a UCS2 version of python and didn't fancy compiling one from source) and still got the same result.
I have also tried the en-GB-g2.ctb file, and that seems to make bold take effect from index 1 (i.e. for the string "hello world" with typeforms 44444444444 I got h..ello world.' as the translation).
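For anyone wanting to repeat the test without typing the typeforms by hand, here is a rough sketch of how they could be generated. The helper names (bold_typeform, typeform_digits, translate_bold) are made up for illustration and are not part of the bindings; the value 4 for bold and the translateString call are taken from the example quoted further down, and translate_bold assumes the louis python bindings and the table are installed.

```python
BOLD = 4  # typeform value used for bold in the examples in this thread


def bold_typeform(text):
    """Return one typeform entry per character, all marked bold."""
    return [BOLD] * len(text)


def typeform_digits(typeform):
    """Render a typeform list as the digit string typed into lou_allround."""
    return "".join(str(value) for value in typeform)


def translate_bold(table, text):
    """Translate text with every character marked bold.

    Hypothetical wrapper: the import is deferred so the helpers above
    still work on a machine without the louis bindings.
    """
    import louis
    return louis.translateString([table], text, bold_typeform(text), 0)
```

typeform_digits(bold_typeform(u'hello world')) gives the same 11-digit string I typed into lou_allround, and translate_bold('en-us-g2.ctb', u'hello world') should make the same call as my python example.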
I have still to look at trying to use the XML files for testing. Initially I hadn't noticed you had said liblouisxml so I was puzzling over where this XML file was.
Even though I will try and get the XML test done, is my example correct for trying to do what I am expecting?
Michael Whapples
On 03/07/09 20:53, John J. Boyer wrote:
Michael,
It is possible that UCS4 is affecting it. I have just improved the emphasis.xml file in the tests directory of liblouisxml to test for bold. The dtbook3.sem file in the lbx_files directory was also changed. If you can get these files from svn and put them in the right places, you might see if there is still a problem with emphasis.
Thanks,
John
On Fri, Jul 03, 2009 at 05:51:18PM +0100, Michael Whapples wrote:
Hello,
This is interesting, it doesn't seem to work here. I have to compile with UCS4 as python is UCS4 on debian. I am quite confident that, although I am making the typeform by hand, I am getting it correct. Here is what I do:
Using python:
>>> import louis
>>> louis.translateString(['en-us-g2.ctb'], u'hello world', [4]*11, 0)
u'hello __w'
I would expect the following (I don't know the US code, but I am assuming that _ means the word is bold, as that is what it uses on world; I am using the US table as it is meant to be more reliable than en-GB-g2.ctb):
u'_hello __w'
Now using lou_allround:
Press t for table, enter: en-us-g2.ctb
Press e for emphasis and enter: 44444
Press r for run and enter: hello
I just get it outputting:
hello
for the forward translation.
Trying the en-GB-g2.ctb table doesn't work properly either; it does seem to insert some bold marks, but not where I would expect them.
Do you get correct behaviour when using the above steps to try and reproduce the bug, or do you find as I do? Do my above steps seem to be correct for what I want to do, or am I doing something wrong?
Michael Whapples
On 03/07/09 16:35, John J. Boyer wrote:
Michael,
Writing a typeform string by hand is quite a chore. I made up a simple xml file to test bold emphasis with liblouisxml. It worked correctly.
Thanks,
John
On Thu, Jul 02, 2009 at 04:58:38PM +0100, Michael Whapples wrote:
Hello,
I noticed this when creating some examples for using the python bindings. If you set the typeforms to bold (possibly other emphasis as well) then bold indicators are not added at the beginning.
I tried this with lou_allround as well and found the same results, so it's not a bug with the python bindings. The typeforms string used in lou_allround is "44444444444", using table en-us-g2.ctb and the string "hello world"; only the word "world" seems to get the bold indicator.
Michael Whapples
For a description of the software and to download it go to http://www.jjb-software.com