[liblouis-liblouisxml] SV: Re: SV: Re: SV: 8 dots contracted with caps was: are the swap opcodes broken?
- From: Bue Vester-Andersen <bue@xxxxxxxxxxxxxxxxxx>
- To: <liblouis-liblouisxml@xxxxxxxxxxxxx>
- Date: Thu, 19 Jan 2017 23:37:01 +0100
Hi Bert,
Technical or computer unfriendly? Probably a bit of both, but not impossible, I
think.
I will try to give an example where back-translation might go wrong:
Take the string “TXT”. Never mind that it is also a computer term and should
probably therefore not be contracted in the first place.
If capsnocont is in effect, it will be translated as either ,,txt or ,t,x,t
depending on the status of capsword (plain TXT in 8 dots). So far, so good. No
contraction anyway.
Back-translating ,,txt you get TXT because the begcapsword tells liblouis to
not use contraction rules when back-translating.
However, back-translating ,t,x,t or TXT, you get TMmT, unless liblouis knows
that it should use the capsnocont rule whenever it sees two consecutive caps,
or unless the x had a letsign in addition to the capslettersign.
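
To make the failure concrete, here is a minimal toy sketch in Python. It is not liblouis code and does not use the liblouis API: the function names, the flag, and the single contraction rule (x standing for "mm", inferred from the TMmT result above) are assumptions for illustration only. It handles single capital signs and plain 8-dot capitals, not the ,, capsword sign.

```python
CAPS_SIGN = ","                 # ASCII braille capital letter sign
CONTRACTIONS = {"x": "mm"}      # toy rule, inferred from the TMmT example


def split_cells(braille):
    """Split the input into (is_capital, letter) pairs."""
    cells = []
    i = 0
    while i < len(braille):
        if braille[i] == CAPS_SIGN and i + 1 < len(braille):
            # 6-dot style: a capital sign followed by the letter
            cells.append((True, braille[i + 1]))
            i += 2
        else:
            # 8-dot style: the capital is carried by the cell itself
            cells.append((braille[i].isupper(), braille[i].lower()))
            i += 1
    return cells


def back_translate(braille, caps_run_disables_contractions):
    """Back-translate, optionally skipping contractions inside a run of capitals."""
    cells = split_cells(braille)
    out = []
    for idx, (is_cap, letter) in enumerate(cells):
        prev_cap = idx > 0 and cells[idx - 1][0]
        next_cap = idx + 1 < len(cells) and cells[idx + 1][0]
        in_caps_run = is_cap and (prev_cap or next_cap)
        blocked = caps_run_disables_contractions and in_caps_run
        if letter in CONTRACTIONS and not blocked:
            text = CONTRACTIONS[letter]
        else:
            text = letter
        out.append(text.capitalize() if is_cap else text)
    return "".join(out)


print(back_translate(",t,x,t", False))  # contraction rule applied to x
print(back_translate(",t,x,t", True))   # run of capitals suppresses it
```

With the flag off, the x is expanded even though it sits between two capitals; with the flag on, two consecutive capitals are enough to switch contractions off, which is the behaviour I am asking for.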
The rules for letsigns in this context might be different from language to
language, hence the computer unfriendliness. The Danish rules are unclear on
this, but I think most people would use a letsign in a case like this one.
So, it is mainly a question of securing the correct back-translation, even if
there is no begcapsword sign to indicate clearly that contraction rules should
not be used here.
Hope it makes more sense now.
Bue
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On behalf of Bert Frees
Sent: 19 January 2017 09:55
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: SV: Re: SV: 8 dots contracted with caps was:
are the swap opcodes broken?
2017-01-18 21:17 GMT+01:00 Bue Vester-Andersen <bue@xxxxxxxxxxxxxxxxxx>:
Hi Bert,
Regarding your first "btw" I don't quite understand what the problem is.
Maybe you are overthinking it?
The problem is that the back-translator could apply contraction rules because
it does not know that it is in a no-contractions state. A German example would
be the letters that are also used as partword contractions, i.e. q, x, and y.
In Danish, we have similar letters: q, w, x, and z. If capsnocont is defined
and the back-translator sees a begcapsword, it knows that contraction rules
should not be applied. But if no begcapsword is used, it should react to seeing
two or more capital letters. A similar problem occurs with the nocont opcode,
where a certain text string triggers the no-contractions state, e.g. http://,
.txt, or .zip. Hope it makes sense.
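
A rough Python sketch of the trigger-string idea, for illustration only. It is not how liblouis implements nocont; the function name is made up, and the assumption that the no-contractions state covers the whole whitespace-separated word containing the trigger is mine.

```python
# Trigger strings taken from the examples above.
NOCONT_TRIGGERS = ("http://", ".txt", ".zip")


def nocont_word_flags(text):
    """Return (word, contraction_allowed) pairs for whitespace-separated words."""
    flags = []
    for word in text.split():
        triggered = any(trigger in word.lower() for trigger in NOCONT_TRIGGERS)
        flags.append((word, not triggered))
    return flags


print(nocont_word_flags("see readme.txt now"))
```

The point for back-translation is the same as with capitals: the back-translator has to recognise the trigger in the braille itself, because nothing else marks the no-contractions state.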
Sorry, didn't realize you were talking about backward translation at first. But
I'm still not sure whether this is about a technical difficulty (that can be
solved), about a non-computer-friendly braille code, or about a fundamental
problem in the braille code? Some real examples would be nice. (Sorry I must
sound stupid but some things are just not so easy to grasp without proper
braille knowledge). Thanks.
Regarding your second btw, yes perhaps you are right. But in which category
fall words that are not fully uppercase, but also not only the first letter?
Hmm, good question. I don’t know about the rules for this in other languages,
but I would say that mixed caps should be treated like all caps. Otherwise, you
could have some very confusing combinations of contracted and uncontracted
braille within the same word. The alternative is to have three separate
opcodes: singlecapsnocont, mixedcapsnocont, and allcapsnocont. I think that
would be overkill, but of course I might be proven wrong. :)
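
To pin down the categories, here is a small Python sketch of the classification I have in mind. The function names and category labels are hypothetical, not liblouis identifiers, and treating mixed caps like all caps is only my suggestion from above.

```python
def classify_caps(word):
    """Classify a word as 'none', 'single', 'mixed', or 'all' capitals."""
    letters = [ch for ch in word if ch.isalpha()]
    upper_count = sum(1 for ch in letters if ch.isupper())
    if upper_count == 0:
        return "none"
    if upper_count == 1 and letters[0].isupper():
        return "single"   # only the first letter is capitalized
    if upper_count == len(letters):
        return "all"
    return "mixed"


def contraction_allowed(word):
    """Suggested rule: mixed caps are treated like all caps (no contraction)."""
    return classify_caps(word) in ("none", "single")


for sample in ("text", "Text", "TXT", "McDonald", "iPhone"):
    print(sample, classify_caps(sample), contraction_allowed(sample))
```

With three separate opcodes, each of the 'single', 'mixed', and 'all' categories would get its own switch; with one opcode, the last two are simply lumped together.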
Yes I agree extra opcodes would probably be overkill. Just wanted to know how
mixed caps should be treated. Documentation should make this clear.
2017-01-17 20:56 GMT+01:00 Bue Vester-Andersen <bue@xxxxxxxxxxxxxxxxxx>:
Btw: Testing backwards made me aware of a little snag: If capsnocont has been
defined, contraction rules should of course not be used when in capsword mode.
This should be easy enough when begcapsword/endcapsword are also defined.
However, if begcapsword/endcapsword are not defined, we have to assume a
capsword situation and activate capsnocont if capital letters or contractions
appear after each other.
Btw: according to the manual, capsnocont only affects all caps words, not words
with only the first letter capitalized. This is fine for the current purpose,
but I think there are languages where you cannot contract words with first cap
either. Until recently, this was the case in Danish 6 dots grade 2, but the
rules have been changed, so that it now behaves more like English in this
respect. Perhaps “allcapsnocont” would be a better name with respect to what it
does. If we then need an opcode to stop contraction of single caps, we could
use the name capsnocont. What do you say?