[liblouis-liblouisxml] Re: [patch] Correct input to output position mapping for removed repeated text

Hi John,

Arrrg! I discovered that the patch I sent causes a regression. With compbrlAtCursor enabled, doing something like:
"a  a"
with cursorPos set to 2 causes the second "a" to disappear. I suspect this is because my call to for_updatePositions could also call doCompTrans, which probably isn't a good idea there. I have written code to manually update outputPositions and cursorPos instead.

However, this raises an interesting question about one of Eitan's test cases. Using en-us-g2, if we take the string:
"greetings  "
(Note the double space at the end.)
Position 9 is the first space, position 10 is the second space. If one translates with compbrlAtCursor and the cursorPos set to 10 (the second space), should the resulting output produce two spaces (because the second space was expanded) or just one? Eitan's test expects that there should be two. However, this is tricky because the elimination of repeats doesn't take computer braille start into account. It probably really should - for spaces, this isn't a big problem, but for other characters, it may well be.

I'm attaching the updated patch. Sorry about that. Note that this patch does the former (does not include the repeated space). I'm not quite sure how to handle computer braille start with this without breaking future translation.

Jamie

John J. Boyer wrote:
Jamie,

Thanks for the patch. I will apply it to my development copy.
Unfortunately, it is not the same as the copy in cood.google. I've been
experimenting and testing. I just fixed the problem with dots 7-8 in
tables like en-us-comp8.ctb. now, if your table does not contain the
capsign opcode but does define dot 7 for capital letters, it will be
preserved. You will have to obtain the new liblouis and liblouisxml from
www.jjb-software.com . Sorry to be out of phrase with the Google site.

I'm putting the finishing touches on the next release. You will receive
a message on the list when it is available.

John

On Wed, Sep 17, 2008 at 01:04:16PM +1000, James Teh wrote:
Hi all,

Attached is a patch to do the following:
* liblouis/lou_translateString.c: When handling input which is removed
due to a "repeated" opcode, correctly update the mapping from input to
output positions; i.e. the outputPos array and the returned cursorPos.

The test case is a string such as:
"a"
Translating this with en-us-g2.ctb yields:
"a"
Observe that the second space has been eliminated due to the "repeated"
opcode for spaces. Previously, the mapping from input to output
positions was:
[0, 1, 0]
This is incorrect; this indicates that position 2 of the input maps to
position 0 of the output.
It should be:
  [0, 1, 1]
indicating that both position 1 and 2 of the input map to position 1 of
the output. Similar issues exist with cursorPos. This patch fixes both.

Note that the patch passes a string to for_updatePositions, but passes
the output length as being 0, so the string is unnecessary. However,
passing memcpy a NULL src is undefined according to the spec (though
most implementations don't complain).

I have already committed this to the svn for the liblouis Google Code
project.

Jamie

--
James Teh
Email: jamie@xxxxxxxxxxx
WWW: http://www.jantrid.net/
MSN Messenger: jamie@xxxxxxxxxxx
Jabber: jteh@xxxxxxxxxx
Yahoo: jcs_teh

Index: liblouis/lou_translateString.c
===================================================================
--- liblouis/lou_translateString.c      (revision 34)
+++ liblouis/lou_translateString.c      (revision 35)
@@ -1891,6 +1891,7 @@
                &&  compareChars (&transRule->charsdots[0],
                                &currentInput[src], transCharslen, 0))
              {
+               for_updatePositions(transRule->charsdots[0], transCharslen, 0);
                src += transCharslen;
              }
            break;



--
James Teh
Email: jamie@xxxxxxxxxxx
WWW: http://www.jantrid.net/
MSN Messenger: jamie@xxxxxxxxxxx
Jabber: jteh@xxxxxxxxxx
Yahoo: jcs_teh
Index: liblouis/lou_translateString.c
===================================================================
--- liblouis/lou_translateString.c      (revision 34)
+++ liblouis/lou_translateString.c      (working copy)
@@ -1891,6 +1891,17 @@
                   && compareChars (&transRule->charsdots[0],
                                    &currentInput[src], transCharslen, 0))
              {
+               if (outputPositions != NULL)
+               {
+                 int tcc;
+                 for (tcc = 0; tcc < transCharslen; tcc++)
+                   outputPositions[tcc + k] = dest - 1;
+               }
+               if (!cursorStatus && src <= cursorPosition && cursorPosition < 
src + transCharslen)
+               {
+                 cursorStatus = 1;
+                 cursorPosition = dest - 1;
+               }
                src += transCharslen;
              }
            break;

Other related posts: