[liblouis-liblouisxml] Re: [patch] Correct input to output position mapping for removed repeated text

  • From: "John J. Boyer" <johnjboyer@xxxxxxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Wed, 17 Sep 2008 05:54:52 -0500

Jamie,

Thanks for the updated patch and for studying the code so thoroughly. 
I'll have to look at the patch and think about the points you raise. 
Meanwhile, the new release of liblouis (and liblouisxml) is already on
http://www.jjb-software.com with the original patch. There should not be
a problem except in rare cases, and people are waiting for the new
release. So I'll let it stand and put up an update shortly.

John

 On Wed, Sep 17, 2008 at 08:29:07PM +1000, James 
Teh wrote:
> Hi John,
> 
> Arrrg! I discovered that the patch I sent causes a regression. With 
> compbrlAtCursor enabled, doing something like:
> "a  a"
> with cursorPos set to 2 causes the second "a" to disappear. I suspect 
> this is because my call to for_updatePositions could also call 
> doCompTrans, which probably isn't a good idea there. I have written code 
> to manually update outputPositions and cursorPos instead.
> 
> However, this raises an interesting question about one of Eitan's test 
> cases. Using en-us-g2, if we take the string:
> "greetings  "
> (Note the double space at the end.)
> Position 9 is the first space, position 10 is the second space. If one 
> translates with compbrlAtCursor and the cursorPos set to 10 (the second 
> space), should the resulting output produce two spaces (because the 
> second space was expanded) or just one? Eitan's test expects that there 
> should be two. However, this is tricky because the elimination of 
> repeats doesn't take computer braille start into account. It probably 
> really should - for spaces, this isn't a big problem, but for other 
> characters, it may well be.
> 
> I'm attaching the updated patch. Sorry about that. Note that this patch 
> does the former (does not include the repeated space). I'm not quite 
> sure how to handle computer braille start with this without breaking 
> future translation.
> 
> Jamie
> 
> John J. Boyer wrote:
> >Jamie,
> >
> >Thanks for the patch. I will apply it to my development copy.
> >Unfortunately, it is not the same as the copy in cood.google. I've been
> >experimenting and testing. I just fixed the problem with dots 7-8 in
> >tables like en-us-comp8.ctb. now, if your table does not contain the
> >capsign opcode but does define dot 7 for capital letters, it will be
> >preserved. You will have to obtain the new liblouis and liblouisxml from
> >www.jjb-software.com . Sorry to be out of phrase with the Google site.
> >
> >I'm putting the finishing touches on the next release. You will receive
> >a message on the list when it is available.
> >
> >John
> >
> >On Wed, Sep 17, 2008 at 01:04:16PM +1000, James Teh wrote:
> >>Hi all,
> >>
> >>Attached is a patch to do the following:
> >>* liblouis/lou_translateString.c: When handling input which is removed
> >>due to a "repeated" opcode, correctly update the mapping from input to
> >>output positions; i.e. the outputPos array and the returned cursorPos.
> >>
> >>The test case is a string such as:
> >>"a"
> >>Translating this with en-us-g2.ctb yields:
> >>"a"
> >>Observe that the second space has been eliminated due to the "repeated"
> >>opcode for spaces. Previously, the mapping from input to output
> >>positions was:
> >>[0, 1, 0]
> >>This is incorrect; this indicates that position 2 of the input maps to
> >>position 0 of the output.
> >>It should be:
> >>  [0, 1, 1]
> >>indicating that both position 1 and 2 of the input map to position 1 of
> >>the output. Similar issues exist with cursorPos. This patch fixes both.
> >>
> >>Note that the patch passes a string to for_updatePositions, but passes
> >>the output length as being 0, so the string is unnecessary. However,
> >>passing memcpy a NULL src is undefined according to the spec (though
> >>most implementations don't complain).
> >>
> >>I have already committed this to the svn for the liblouis Google Code
> >>project.
> >>
> >>Jamie
> >>
> >>--
> >>James Teh
> >>Email: jamie@xxxxxxxxxxx
> >>WWW: http://www.jantrid.net/
> >>MSN Messenger: jamie@xxxxxxxxxxx
> >>Jabber: jteh@xxxxxxxxxx
> >>Yahoo: jcs_teh
> >
> >>Index: liblouis/lou_translateString.c
> >>===================================================================
> >>--- liblouis/lou_translateString.c  (revision 34)
> >>+++ liblouis/lou_translateString.c  (revision 35)
> >>@@ -1891,6 +1891,7 @@
> >>            &&  compareChars (&transRule->charsdots[0],
> >>                            &currentInput[src], transCharslen, 0))
> >>          {
> >>+           for_updatePositions(transRule->charsdots[0], transCharslen, 
> >>0);
> >>            src += transCharslen;
> >>          }
> >>        break;
> >
> >
> 
> -- 
> James Teh
> Email: jamie@xxxxxxxxxxx
> WWW: http://www.jantrid.net/
> MSN Messenger: jamie@xxxxxxxxxxx
> Jabber: jteh@xxxxxxxxxx
> Yahoo: jcs_teh

> Index: liblouis/lou_translateString.c
> ===================================================================
> --- liblouis/lou_translateString.c    (revision 34)
> +++ liblouis/lou_translateString.c    (working copy)
> @@ -1891,6 +1891,17 @@
>                  && compareChars (&transRule->charsdots[0],
>                                   &currentInput[src], transCharslen, 0))
>             {
> +             if (outputPositions != NULL)
> +             {
> +               int tcc;
> +               for (tcc = 0; tcc < transCharslen; tcc++)
> +                 outputPositions[tcc + k] = dest - 1;
> +             }
> +             if (!cursorStatus && src <= cursorPosition && cursorPosition < 
> src + transCharslen)
> +             {
> +               cursorStatus = 1;
> +               cursorPosition = dest - 1;
> +             }
>               src += transCharslen;
>             }
>           break;


-- 
My websites:
http://www.godtouches.org
http://www.jjb-software.com
Location: Madison, WI, USA

For a description of the software and to download it go to
http://www.jjb-software.com

Other related posts: