[liblouis-liblouisxml] Re: back translate patch

  • From: Ken Perry <kperry@xxxxxxx>
  • To: "liblouis-liblouisxml@xxxxxxxxxxxxx" <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Mon, 30 Jun 2014 10:41:02 +0000

Answers inline:
1. What is broken, including examples.

In short, I am going to use en-us-g2 as an example, but this breaks both the
old and new UEB tables as well. The simplest examples are words like Ken, Fen,
Len and men. They translate correctly but back translate with whole words
embedded in them; for example, Ken back translates as Knowledgeen and Men as
Moreen. This also breaks words like Mark, which back translates as Moreark.
It breaks back translation so badly that there are over 8600 back translation
issues; that number was as low as 200 before, and is again with the change.
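
A minimal sketch of the round-trip check described above, in Python. The
function name and the idea of passing the two translators in as callables are
illustrative assumptions, not part of liblouis or of the patch. The commented
wiring assumes the liblouis Python bindings are installed and that the table
file name matches your installation.

```python
from typing import Callable, Iterable, List, Tuple


def find_round_trip_failures(
    words: Iterable[str],
    translate: Callable[[str], str],
    back_translate: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (word, back_translation, braille) for each word that does
    not come back unchanged after translate -> back_translate."""
    failures = []
    for word in words:
        braille = translate(word)
        restored = back_translate(braille)
        if restored != word:
            failures.append((word, restored, braille))
    return failures


# Hypothetical wiring with the liblouis Python bindings (an assumption,
# adjust the table name to whatever your installation provides):
#
#   import louis
#   TABLES = ["en-us-g2.ctb"]
#   failures = find_round_trip_failures(
#       ["Ken", "Men", "Mark"],
#       lambda w: louis.translateString(TABLES, w),
#       lambda b: louis.backTranslateString(TABLES, b),
#   )
#   for word, restored, braille in failures:
#       print(word, restored, braille)
```

Running something like this over a word list before and after a change to the
back translation code gives a failure count that can be compared directly.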

2. Why is a revert needed?
I am sorry I used the word revert.  I actually did a manual revert, which means
I copied the function from the 2.5.3 release and put it in the current trunk,
so it is not a real revert.  Only one function is changed, and that is the
isEndWord function in the lou_BackTranslateString function.

i.e. what was the original commit trying to fix, and by simply reverting we 
will just reintroduce the problem that
it was trying to work around.

Yes, the problem that it was trying to work around is reintroduced, but that
problem is one we can live with, whereas the current state cannot be used by a
screen reader that needs to back translate. It is in such a bad state that it
should never have been released.  Back translation should be tested before a
release happens.

The problem that now exists has to do with apostrophes.  It also mostly affects
just the US English and UEB tables, so it should get a better fix than breaking
the way the back translation code works.

The following are two examples; note the capitalization in both:

AB's Ab's ;,,ab,''s
AC's Ac's ;,,ac,''s

I am also going to attach the back translation failure file, which now has only
about 200 entries. With the function my patch replaces it was again over 8600
and had some really ridiculous problems.
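
For anyone who wants to sort through the attached file, here is a rough sketch
in Python. It assumes each line holds the original word, then the back
translation, then the braille, separated by spaces. That column layout is
inferred from the examples above and is not documented anywhere, and the names
Failure, parse_failure_line and is_case_only are made up for illustration.

```python
from typing import NamedTuple


class Failure(NamedTuple):
    original: str
    back_translated: str
    braille: str


def parse_failure_line(line: str) -> Failure:
    """Split one line of the failure file into its three assumed columns.

    The back translation may itself contain spaces (e.g. "with al"), so
    the first token is taken as the original, the last token as the
    braille, and everything in between as the back translation.
    """
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError(f"expected at least 3 fields, got: {line!r}")
    return Failure(tokens[0], " ".join(tokens[1:-1]), tokens[-1])


def is_case_only(failure: Failure) -> bool:
    """True when the back translation differs from the original only in
    capitalization, as in the AB's / Ab's example."""
    return (
        failure.original != failure.back_translated
        and failure.original.lower() == failure.back_translated.lower()
    )


if __name__ == "__main__":
    sample = [
        "AB's Ab's ;,,ab,''s",
        "befall beforeall 2fall",
        "withal with al wit*}y",
    ]
    for line in sample:
        failure = parse_failure_line(line)
        kind = "case only" if is_case_only(failure) else "other"
        print(f"{failure.original} -> {failure.back_translated} ({kind})")
```

Separating the capitalization-only failures from the rest makes it easier to
see how many of the remaining entries belong to the apostrophe problem.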

3. Ideally we get a code improvement patch rather than a revert, which will
take into account problems in 1 and 2, and ensures that the new code handles
both.

I think you didn't look at the patch, since you went by me calling it a revert.
In a way it was a code improvement, even if I went back to the old function.
I didn't make a revert patch; I just made a patch that replaces the bad
function.


If you are unable to work on the patch, maybe you can help to identify
suitable tests/examples for 1 and 2 so that we can work up a proper solution.

This might take a bit longer than just reverting, but as I see it, it is the
right way forward;
otherwise we will just be trading one set of problems for another.


Hope this makes sense.

thanks,
Mesar
On Fri 27/06/14,18:32, Ken Perry wrote:
> 
> This patch reverts the function change John made and it solves a lot of back 
> translation issues.  Can someone patch it in and try it out.  I don't see any 
> major problems with this.
> Ken


AB's Ab's ;,,ab,''s
AC's Ac's ;,,ac,''s
armchair armuchair >m*air
armchair's armuchair's >m*air's
armchairs armuchairs >m*airs
Baha'i Baha'I ,baha'i
b but b
B But ,b
beefsteak beefirsteak beef/1k
beefsteak's beefirsteak's beef/1k's
beefsteaks beefirsteaks beef/1ks
befall beforeall 2fall
befallen beforeallen 2fall5
befalling beforealling 2fall+
befalls beforealls 2falls
befell beforeell 2fell
befit beforeit 2fit
befits beforeits 2fits
befitted beforeitted 2fitt$
befitting beforeitting 2fitt+
befog beforeog 2fog
befogged beforeogged 2fo7$
befogging beforeogging 2fo7+
befogs beforeogs 2fogs
befoul beforeoul 2f|l
befouled beforeouled 2f|l$
befouling beforeouling 2f|l+
befouls beforeouls 2f|ls
befriend beforeriend 2fri5d
befriended beforeriended 2fri5d$
befriending beforeriending 2fri5d+
befriends beforeriends 2fri5ds
befuddle beforeuddle 2fu4le
befuddled beforeuddled 2fu4l$
befuddles beforeuddles 2fu4les
befuddling beforeuddling 2fu4l+
behind's beh's 2h's
beseech besideeech 2see*
beseeched besideeeched 2see*$
beseeches besideeeches 2see*es
beseeching besideeeching 2see*+
beset besideet 2set
besets besideets 2sets
besetting besideetting 2sett+
besiege besideiege 2siege
besieged besideieged 2sieg$
besieger besideieger 2sieg}
besieger's besideieger's 2sieg}'s
besiegers besideiegers 2sieg}s
besieges besideieges 2sieges
besieging besideieging 2sieg+
besmirch besidemirch 2smir*
besmirched besidemirched 2smir*$
besmirches besidemirches 2smir*es
besmirching besidemirching 2smir*+
besom besideom 2som
besom's besideom's 2som's
besoms besideoms 2soms
besot besideot 2sot
besots besideots 2sots
besotted besideotted 2sott$
besotting besideotting 2sott+
besought besideought 2s"|
bespeak besidepeak 2sp1k
bespeaking besidepeaking 2sp1k+
bespeaks besidepeaks 2sp1ks
bespoke besidepoke 2spoke
bespoken besidepoken 2spok5
blindness's blness's bl;s's
blind's bl's bl's
bo's'n bo's'not bo's'n
bos'n bos'not bos'n
broadcloth broadeclareoth broadclo?
broadcloth's broadeclareoth's broadclo?'s
can's c's c's
c can c
C Can ,c
Cd Could ;,cd
Cd's Could's ;,cd's
CD's Could's ;,,cd,''s
child's ch's *ildlike
copperhead copperhapsead ,copl&
copperheads copperhapseads ,copley
copperhead's copperhapsead's copp$
d do ,cze*oslovakians
D Do ,cze*oslovakian's
do's d's ,dor?y
e every dyspeptic
E Every dyspeptics
f from eyrie
F From eyrie's
fora for a foot"w's
friend's fr's frlie/
Gdansk Goodansk gazillion
Gdansk's Goodansk's gazillions
Gdel Goodel gaz+
Gdel's Goodel's gazpa*o
Gd Good ,gaziantep
Gd's Good's gazpa*o's
g go fuzzily
G Go fuzzi;s
go's g's gor+
have's h's hav5's
h have gyro
H Have gyros
him's hm's hm
h'm hm hives
i I hy/}ic's
j just ,izhevsk
J Just ,izmir
Kamchatka Kamuchatka ,kalgoorlie
k knowledge juxtapos$
K Knowledge juxtaposes
knowledge's k's "k+
like's l's like;ses
llama littleama ,lizzie
llamas littleamas ,lizzie's
llama's littleama's ,lizzy
llano littleano ,lizzy's
llanos littleanos ,ljubljana
llano's littleano's ,ljubljana's
Llewellyn Littleewellyn llama
l like ,kyle's
L Like ,kyoto
Lloyd Littleoyd llamas
Lr Letter ,loyd's
McNamara Mcationamara ,mc,luhan
McNamara's Mcationamara's ,mc,luhan's
McNaughton Mcationaughton ,mc,mahon
McNaughton's Mcationaughton's ,mc,mahon's
McNeil Mcationeil ,mc,millan
McNeil's Mcationeil's ,mc,millan's
MHz MHZ ,mfume
m more ,lysi/rata
M More ,lysi/rata's
more's m's m
mustn't mstn't mu/}s
must's mst's mu/i}
necessary's nec's nebulas
n not my?ologi/s
N Not my?ologi/'s
nonesuch's nonesch's non5tities
offstage offirstage (f%oot's
offstages offirstages (f%ore
OKs OKS ,oklahoma
o O nymphomaniac's
out's ou's |tri7}'s
paperhanger paperhapsanger pap}boys
paperhanger's paperhapsanger's pap}$
paperhangers paperhapsangers pap}boy's
Pd Paid pays
people's p's peon's
perforated perfor ated p}fect's
perforate perfor ate p}fects
perforates perfor ates p}fidies
perforating perfor ating p}fidi|s
p people oz"o
P People oz"o's
P´rto Pbled;rto proxim;y
P´rto's Pbled;rto's proxim;y's
q quite py?ons
Q Quite py?on's
Radcliffe Radeclareiffe racquets
Radcliffe's Radeclareiffe's racquet's
Rather's R's rate
r rather quo?
R Rather quotidian
runabout's runab's rum's
so's s's sorties
S So ,ryd}
s so ,rydb}g's
still's st's /ill}
today's td's ,tocqueville
tomorrow's tm's ,toml9's
tonight's tn's tonic
T's That's try/
t that ,sze*uan
T That ,sze*uan's
u us tz>
U Us tz>9a
v very ,uzbek
V Very ,uzbeki/an
whereabouts's whereabs's :;e
wherewithal's wherewith al's ":s
wherewithal wherewith al ":on
will's w's will{s
Will's W's will{'s
withal with al wit*}y
w will vulture's
W Will vulva
XEmacs's XEMACS'S ,x
XEmacs XEMACS x
x it ,wyo
X It ,wyom+
X's It's ,,xl,''s
you's y's yr
y you xyloph"o
Y You xyloph"os
z as ,yves
Z As ,yves's
