[liblouis-liblouisxml] patch: Incorrect input/outputPosition values for undefined characters

  • From: James Teh <jamie@xxxxxxxxxxxx>
  • To: liblouis/liblouisxml mailing list <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Fri, 30 Jul 2010 06:19:47 +1000

Hi all,

If the input contains undefined characters, the offsets recorded for them in inputPositions/outputPositions are incorrect. For example, for the input string:
"a\u2022b"
The output string is:
"a'\\x2022'b"
inputPos is as follows:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 2]
This is incorrect. It should be:
[0, 1, 1, 1, 1, 1, 1, 1, 1, 2]
outputPos is as follows:
[0, 0, 9]
This is incorrect. It should be:
[0, 1, 9]
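
To make the expected values concrete, here is a minimal standalone sketch of the bookkeeping. It is not liblouis code: demo_char, escape_char and translate_with_positions are hypothetical names, escape_char only imitates the '\xHHHH' form shown above, every non-ASCII character is treated as undefined, and srcMapping is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical character type for this sketch (not liblouis's widechar). */
typedef unsigned int demo_char;

/* Hypothetical stand-in for showString: writes '\xHHHH' for one character
   and returns the number of characters written. */
static int escape_char(demo_char c, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "'\\x%04x'", c);
}

/* Copies ASCII characters through and expands everything else to its
   escape sequence, recording both position arrays along the way.
   outputPos[i] = index in out where input character i starts.
   inputPos[j]  = index of the input character that produced out[j].
   Returns the output length, or -1 if out is too small. */
static int translate_with_positions(const demo_char *in, int inlen,
                                    char *out, int outmax,
                                    int *inputPos, int *outputPos)
{
    int dest = 0;
    assert(in && out && inputPos && outputPos);
    for (int src = 0; src < inlen; src++) {
        char display[16];
        int n;
        if (in[src] < 0x80) {
            display[0] = (char)in[src];
            n = 1;
        } else {
            n = escape_char(in[src], display, sizeof display);
        }
        if (dest + n > outmax)
            return -1;
        /* One entry per input character: where its output begins. */
        outputPos[src] = dest;
        for (int k = 0; k < n; k++) {
            /* One entry per output character: which input produced it. */
            inputPos[dest] = src;
            out[dest++] = display[k];
        }
    }
    return dest;
}
```

Calling translate_with_positions on the three characters 'a', 0x2022, 'b' fills out with the ten characters shown above and records both arrays inside the same loop that emits the escape sequence, which is the bookkeeping the undefined-character path was missing.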

The attached patch fixes this. I'm sending it here before committing because I'm not sure how the new srcMapping code works, and I'd appreciate it if someone who understands it could take a look in case I got it wrong.

Thanks!

Jamie

--
James Teh
Vice President
NV Access Inc, ABN 61773362390
Email: jamie@xxxxxxxxxxxx
Web site: http://www.nvaccess.org/
Index: liblouis/lou_translateString.c
===================================================================
--- liblouis/lou_translateString.c      (revision 364)
+++ liblouis/lou_translateString.c      (working copy)
@@ -1542,8 +1542,14 @@ undefinedCharacter (widechar c)
   char *display = showString (&c, 1);
   if ((dest + strlen (display)) > destmax)
     return 0;
+  if (outputPositions != NULL)
+    outputPositions[srcMapping[src]] = dest;
   for (k = 0; k < strlen (display); k++)
+  {
+    if (inputPositions != NULL)
+      inputPositions[dest] = srcMapping[src];
     currentOutput[dest++] = getDotsForChar (display[k]);
+  }
   return 1;
 }
 
