[liblouis-liblouisxml] Re: Status ueb integration

  • From: Davy Kager <DavyKager@xxxxxxxxxx>
  • To: "'liblouis-liblouisxml@xxxxxxxxxxxxx'" <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Mon, 20 Jul 2015 11:46:58 +0000

Hi,

After running the yaml tests and looking at the existing tables I have revised
my ideas about decpoint and numericmodechars somewhat.

The liblouis 2.6.3 behavior appears to be as follows:
* decpoint implies 'numeric mode' for the specified char. This holds whenever
the char is directly followed by a digit, regardless of the preceding text.
* midnum implies 'numeric mode' for the specified char.
* begnum and endnum do not imply 'numeric mode'.

Keeping the above in mind I updated my copy of the UEB patches for the Dutch
table. The branch is here:
https://github.com/snaekobbi/liblouis/tree/dkager_dutch_with_patches

Now both decpoint and midnum cause CTC_NumericMode to be added to the specified
chars. This means that the last line in the following definition is now
redundant:
decpoint , 2
midnum . 256
numericmodechars ,. # redundant

As far as I can see this restores the behavior of liblouis 2.6.3. It shouldn't
break en-ueb-g1.ctb because that table defines everything in numericmodechars
as either decpoint or midnum.

You may still need numericmodechars if you want something beyond decpoint and
midnum, such as only a begnum definition. I feel this is flexible enough. It
means old tables do not need updating but you also don't have to 'abuse'
decpoint to get numeric mode for chars that aren't a decimal point at all.
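
To make the rule concrete, here is a minimal Python sketch of it. This is only
a model for illustration, not the C code in the patch: the function name
numeric_mode_chars and the set IMPLIES_NUMERIC_MODE are made up for this
example, the parsing is deliberately naive (it ignores escape sequences and
include files), and the "begnum # 4" line is a hypothetical table entry.

```python
# Opcodes that add the numeric-mode attribute (CTC_NumericMode in the patch)
# to their characters, as described above. begnum and endnum are deliberately
# absent.
IMPLIES_NUMERIC_MODE = {"decpoint", "midnum", "numericmodechars"}


def numeric_mode_chars(table_lines):
    """Return the set of characters that end up with numeric mode."""
    chars = set()
    for line in table_lines:
        tokens = line.split()
        # Skip blank lines, comment lines and opcodes without an operand.
        if len(tokens) < 2 or tokens[0].startswith("#"):
            continue
        opcode, operand = tokens[0], tokens[1]
        if opcode in IMPLIES_NUMERIC_MODE:
            # The operand lists the affected characters one after another.
            chars.update(operand)
    return chars


with_redundant_line = [
    "decpoint , 2",
    "midnum . 256",
    "numericmodechars ,. # redundant",
]
without_redundant_line = with_redundant_line[:2]

# Dropping the numericmodechars line does not change the result.
print(numeric_mode_chars(with_redundant_line)
      == numeric_mode_chars(without_redundant_line))

# A begnum-only char (hypothetical entry) needs numericmodechars explicitly.
print(numeric_mode_chars(["begnum # 4"]))
print(numeric_mode_chars(["begnum # 4", "numericmodechars #"]))
```

In this model the first comparison is true because decpoint and midnum already
cover both chars, while the begnum-only char only gets numeric mode once it is
listed in numericmodechars.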

Feedback is appreciated, as usual.

Davy

-----Original message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On behalf of Christian Egli
Sent: Wednesday 15 July 2015 14:54
To: liblouis-liblouisxml <liblouis-liblouisxml@xxxxxxxxxxxxx>
Subject: [liblouis-liblouisxml] Status ueb integration

Hi all

Just a heads-up on the status of the integration of the ueb changes into
liblouis main line:

From Mike Gray's repository I have pulled the latest changes into the
feature/ueb_update branch. On a separate branch (integration/ueb_merge) I
merged this with the yaml tests (feature/yaml_tests).

On feature/ueb_update `make check` passes with no failures (ueb_test_data.pl is
marked as expected failure).

On feature/yaml_tests the tests (including the yaml-based harness tests) pass.

On the merged branch, i.e. on integration/ueb_merge, I get 14 failures.
The output is attached. I suspect that this confirms many of the regressions
that Davy has been talking about. In particular, Davy mentioned regressions
with numbers. I suspect the following excerpt from the test log confirms this:

FAIL: ar-ar-g1_harness
======================

Input: '4.0'
Expected: '⠼⠙⠨⠚' (length 4)
Received: '⠼⠙⠨⠼⠚' (length 5)

Or from the Korean table:

FAIL: ko-2006-g2_harness
========================

./ko-2006-g2_harness.yaml:5 Failure
Input: '2000년'
Expected: '⠼⠃⠚⠚⠚ ⠉⠡' (length 8)
Received: '⠼⠃⠚⠚⠚⠉⠱⠒' (length 8)
Diff: Expected ' ' but received '⠉' in index 5

I will try to investigate this some more when I get to it.

Thanks
Christian



