[liblouis-liblouisxml] Re: Patch for python backTranslateString

  • From: Michael Whapples <mwhapples@xxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Fri, 12 Feb 2010 20:33:13 +0000

OK, I think most of what you said is fine. The one thing I am not sure about is the description of typeform. As I understand the liblouis documentation, in back-translation typeform is used to retrieve typeform data about the back-translated text, rather than to indicate the emphasis on the Braille (inbuf). So I think the translateString description does not quite match how backTranslateString uses typeform.
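
To illustrate the distinction, here is a minimal sketch of typeform used as an output list. It assumes the wrapper overwrites the caller's list in place, the same way the existing translateString code does with "typeform[:] = typeformbuf.value". The function back_translate_stub and its upper-casing "translation" are hypothetical stand-ins, not the real binding or real liblouis behaviour.

```python
def back_translate_stub(braille, typeform=None):
    """Hypothetical stand-in for a back-translation wrapper (not liblouis)."""
    # Placeholder "translation": the real call would go through liblouis.
    text = braille.upper()
    if typeform is not None:
        # Fill the caller's list in place: one entry per character of the
        # *output* text. 0 is used here to mean "no emphasis" (assumption).
        typeform[:] = [0] * len(text)
    return text


emphasis = []                       # caller supplies an empty list
result = back_translate_stub("abc", emphasis)
print(result, emphasis)
```

Read this way, typeform describes the returned text, so its length follows the output rather than inbuf.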


As for doctests, I was planning to do that next, but as a separate commit to SVN, since they are two different pieces of work (one adds the new functions, the other just adds tests).
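
For reference, a doctest that both tests and documents usage could look roughly like the sketch below. It uses a hypothetical stand-in function (back_translate_stub) so that it runs without liblouis installed; the real doctest would call louis.backTranslateString with an actual table, and its expected output would have to come from a real run.

```python
import doctest


def back_translate_stub(braille, typeform=None):
    """Hypothetical stand-in for a back-translation wrapper (not liblouis).

    >>> back_translate_stub("abc")
    'ABC'
    >>> tf = []
    >>> back_translate_stub("ab", tf)
    'AB'
    >>> tf
    [0, 0]
    """
    text = braille.upper()          # placeholder for the real translation
    if typeform is not None:
        typeform[:] = [0] * len(text)
    return text


# Run only this function's docstring examples and collect the results.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    back_translate_stub.__doc__,
    {"back_translate_stub": back_translate_stub},
    "back_translate_stub", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

In a module, the usual shortcut is doctest.testmod() or running "python -m doctest" on the file; the explicit runner above is used only to keep the sketch self-contained.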

I'll look at other commit messages and try to make mine follow a similar form. However, I do intend that the docstrings and other docs be kept up to date, so that users don't need to trawl through commit messages to find out how to use something.

Michael Whapples

On 02/12/2010 08:26 AM, Christian Egli wrote:
Hi Michael

Thanks for all your contributions! I have some comments inline in the
patch below. Also it would be great if you'd add a little comment in the
Changelog file as your changes are quite interesting and improve the
API. See other entries in the Changelog file for examples.

Michael Whapples <mwhapples@xxxxxxx> writes:

Here is a patch (attached) for a Python wrapper for
lou_backTranslateString. It all seems to work fine for me.

Can you add some doc tests to your code? Feel free to use the example
code you have below.

I haven't been able to get the typeform data to give any emphasis data
but I don't think this is a python bindings issue.

An example call is:
import louis
l = []
louis.backTranslateString(["en-us-g2.ctb"], ',hello .my _w', l)

At this point l contains the character "0" for each character in the
returned string.

Are you saying that "hello my" should be in italics? If so I guess the
typeform should reflect this.

Is this a liblouis problem or a problem in my code? If my code is fine
I will commit this and move on to lou_backTranslate.

Hm, my guess would be that this is a liblouis problem. I don't know if
anyone has tested typeform with back-translation. To test liblouis I
guess we'd have to write a small test case in C (see the code in the
test directory).

Index: python/louis/__init__.py.in
===================================================================
--- python/louis/__init__.py.in   (revision 333)
+++ python/louis/__init__.py.in   (working copy)
@@ -50,6 +50,10 @@
           POINTER(c_int), POINTER(c_char), POINTER(c_char),
           POINTER(c_int), POINTER(c_int), POINTER(c_int), c_int)

+liblouis.lou_backTranslateString.argtypes = (
+         c_char_p, c_wchar_p, POINTER(c_int), c_wchar_p,
+         POINTER(c_int), POINTER(c_char), POINTER(c_char), c_int)
+
  liblouis.lou_hyphenate.argtypes = (
           c_char_p, c_wchar_p, c_int, POINTER(c_char), c_int)

@@ -135,6 +139,38 @@
          typeform[:] = typeformbuf.value
      return outbuf.value

+def backTranslateString(tran_tables, inbuf, typeform=None, mode=0):
+    """Back translate from Braille.
+    @param tran_tables: A list of translation tables.
+        First table in the list must be a full pathname, unless tables in liblouis tables directory.

I've been meaning to change this, but this is no longer true, i.e. the
first table name doesn't have to be a full pathname. If it is, then it
will look for the other tables in that path, but I don't think we really
need to explain this here in the bindings. This should be (and AFAIK is)
in the user documentation.

+    @type tran_tables: list of str
+    @param inbuf: The Braille to back translate.
+    @type inbuf: unicode
+    @param typeform: The list you want the typeform data put in.
+        If you don't want typeform data then give None
+    @type typeform: list

I would use the same documentation here as for translate, i.e.

     @param typeform: A list of typeform constants indicating the typeform for each position in inbuf,
         C{None} for no typeform information.

+    @param mode: The translation mode
+    @type mode: int
+    @returns: The string found by back translation.
+    @rtype: unicode

I'd be more to the point here (but remember I'm not a native speaker)

@returns: The back translated inbuf

The rest is all fine. I'd just enjoy some doc tests that test it and at
the same time explain the usage.

Thanks
Christian

