[liblouis-liblouisxml] [patch] Correct input to output position mapping for removed repeated text

  • From: James Teh <jamie@xxxxxxxxxxx>
  • To: liblouis/liblouisxml mailing list <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Wed, 17 Sep 2008 13:04:16 +1000

Hi all,

Attached is a patch to do the following:
* liblouis/lou_translateString.c: When handling input which is removed due to a "repeated" opcode, correctly update the mapping from input to output positions; i.e. the outputPos array and the returned cursorPos.

The test case is a string such as:
"a  "
Translating this with en-us-g2.ctb yields:
"a "
Observe that the second space has been eliminated due to the "repeated" opcode for spaces. Previously, the mapping from input to output positions was:
[0, 1, 0]
This is incorrect: it indicates that position 2 of the input maps to position 0 of the output.
It should be:
[0, 1, 1]
indicating that positions 1 and 2 of the input both map to position 1 of the output. The returned cursorPos has the same problem when the cursor falls on a removed character. This patch fixes both.
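
To make the expected mapping concrete, here is a minimal standalone sketch. It is not liblouis code: collapseSpaces is a hypothetical helper that hard-codes "drop a space that follows a space" as a stand-in for the "repeated" opcode, and it uses plain char rather than widechar. The point is only the line that maps a removed input character to the last output character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of liblouis. Copies input to output,
   dropping every space that directly follows another space, and records
   in outputPos[i] the output index that input index i maps to.
   output must have room for inLen + 1 chars; outputPos for inLen ints.
   Returns the number of characters written, excluding the terminator. */
size_t
collapseSpaces (const char *input, size_t inLen, char *output, int *outputPos)
{
  size_t src;
  size_t dest = 0;
  assert (input != NULL && output != NULL && outputPos != NULL);
  for (src = 0; src < inLen; src++)
    {
      if (src > 0 && input[src] == ' ' && input[src - 1] == ' ')
        {
          /* Removed character: map it to the last character written,
             instead of leaving the entry at its initial value.
             dest is at least 1 here because the first space of the
             run has already been written. */
          outputPos[src] = (int) dest - 1;
          continue;
        }
      outputPos[src] = (int) dest;
      output[dest++] = input[src];
    }
  output[dest] = '\0';
  return dest;
}
```

Calling collapseSpaces ("a  ", 3, out, pos) leaves "a " in out and fills pos with 0, 1, 1, which is the mapping described above.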

Note that the patch passes a string to for_updatePositions even though it passes an output length of 0, so the contents of the string are never used. I pass it anyway because passing memcpy a NULL src is undefined according to the C standard, even when the length is 0 (though most implementations don't complain).
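
As an aside on the memcpy point, the alternative to passing a dummy pointer would be to skip the call when there is nothing to copy. The sketch below shows that approach; copyIfNonEmpty is a hypothetical wrapper written for illustration, not something in liblouis or in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper, not part of liblouis. Behaves like memcpy, but
   never hands memcpy a pointer when n is 0, so a NULL src is harmless
   in that case. */
void *
copyIfNonEmpty (void *dest, const void *src, size_t n)
{
  if (n == 0)
    return dest;                /* nothing to copy; src is not touched */
  assert (dest != NULL && src != NULL);
  return memcpy (dest, src, n);
}
```

With this wrapper, copyIfNonEmpty (buf, NULL, 0) simply returns buf. The patch instead passes a valid pointer, which keeps the change to a single line.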

I have already committed this to the svn repository of the liblouis Google Code project.

Jamie

--
James Teh
Email: jamie@xxxxxxxxxxx
WWW: http://www.jantrid.net/
MSN Messenger: jamie@xxxxxxxxxxx
Jabber: jteh@xxxxxxxxxx
Yahoo: jcs_teh
Index: liblouis/lou_translateString.c
===================================================================
--- liblouis/lou_translateString.c      (revision 34)
+++ liblouis/lou_translateString.c      (revision 35)
@@ -1891,6 +1891,7 @@
                   && compareChars (&transRule->charsdots[0],
                                    &currentInput[src], transCharslen, 0))
              {
+               for_updatePositions(transRule->charsdots[0], transCharslen, 0);
                src += transCharslen;
              }
            break;
