[liblouis-liblouisxml] Re: Error in applying emphasis

  • From: Michael Whapples <mwhapples@xxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Fri, 03 Jul 2009 23:00:00 +0100

I can now confirm it's not to do with UCS4. I recompiled liblouis for UCS2 and set my LD_LIBRARY_PATH environment variable to point at this UCS2 build. I confirmed that the right version is picked up: when I tried it with Python, it reported the correct version and gave the scrambled translation output I would expect from combining UCS2 and UCS4. When I then ran my example with lou_allround, I still got the same result. (I couldn't use Python for this test as I don't have a UCS2 version of Python and I didn't fancy compiling one from source.)


I have tried using the en-GB-g2.ctb file, and that seems to start the bold at index 1 (i.e. for the string "hello world" with typeforms 44444444444 I got h..ello world.' as the translation).

I have still to look at trying to use the XML files for testing. Initially I hadn't noticed you had said liblouisxml so I was puzzling over where this XML file was.

Even though I will try and get the XML test done, is my example correct for trying to do what I am expecting?

Michael Whapples
On 03/07/09 20:53, John J. Boyer wrote:
Michael,

It is possible that ucs4 is affecting it. I have just improved the
emphasis.xml file in the tests directory of liblouisxml to test for
bold. The dtbook3.sem file in the lbx_files directory was also changed.
If you can get these files from svn and put them in the right places,
you might see if there is still a problem with emphasis.

Thanks,
John

On Fri, Jul 03, 2009 at 05:51:18PM +0100, Michael Whapples wrote:
Hello,
This is interesting; it doesn't seem to work here. I have to compile
with UCS4, as Python is UCS4 on Debian.

I am quite confident that, although I am making the typeform by hand, I
am getting it correct. Here is what I do:

Using python:
import louis
louis.translateString(['en-us-g2.ctb'], u'hello world', [4]*11, 0)
u'hello __w'

I would expect the following (I don't know the US code, but I assume _
marks a bold word, as that is what it puts on "world"; I am using the US
table as it is meant to be more reliable than en-GB-g2.ctb):
u'_hello __w'
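
Since the typeform is being written by hand here, a small helper can make the intent explicit. The following is only a sketch: make_typeform and its (start, end, value) span format are hypothetical names introduced for illustration and are not part of the louis bindings. The value 4 for bold is taken from the calls in this thread.

```python
BOLD = 4  # typeform value used for bold in this thread


def make_typeform(text, spans):
    """Return one typeform value per character of text.

    spans is a list of (start, end, value) tuples, with end exclusive.
    Hypothetical helper, not part of the louis module.
    """
    typeform = [0] * len(text)
    for start, end, value in spans:
        # Clamp the end so a span cannot run past the text.
        for i in range(start, min(end, len(text))):
            typeform[i] |= value
    return typeform


text = u'hello world'

# Whole string in bold: the same list as [4]*11 in the call above.
typeform = make_typeform(text, [(0, len(text), BOLD)])

# Digit string as typed at the lou_allround emphasis prompt.
# Joining like this only makes sense while every value is a single digit.
emphasis_digits = ''.join(str(v) for v in typeform)
```

The resulting list could then be passed as the third argument to louis.translateString, in place of the hand-written [4]*11.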

Now using lou_allround:

Press t for table, enter:
en-us-g2.ctb

Press e for emphasis and enter:
44444

Press r for run and enter:
hello

I just get it outputting:
hello

for the forward translation.

Also trying the en-GB-g2.ctb table doesn't work properly but it does
seem to insert some bold marks but not where I would expect them.

Do you get correct behaviour when using the above steps to try and
reproduce the bug or do you find as I do?

Do my above steps seem to be correct for what I want to do or am I doing
something wrong?

Michael Whapples
On 03/07/09 16:35, John J. Boyer wrote:
Michael,

Writing a typeform string by hand is quite a chore. I made up a simple
xml file to test bold emphasis with liblouisxml. It worked correctly.

Thanks,
John

On Thu, Jul 02, 2009 at 04:58:38PM +0100, Michael Whapples wrote:

Hello,
I noticed this when creating some examples for using the Python
bindings. If you set the typeforms to bold (possibly other emphasis as
well) then bold indicators are not added at the beginning. I tried this
with lou_allround as well and found the same results, so it's not a bug
in the Python bindings. The typeforms string used in lou_allround is
"44444444444", using table en-us-g2.ctb and the string "hello world";
only "world" seems to get the bold indicator.

Michael Whapples
For a description of the software and to download it go to
http://www.jjb-software.com

