[liblouis-liblouisxml] Re: Difficulty with the context opcode.

  • From: "Michael Whapples" <dmarc-noreply@xxxxxxxxxxxxx> (Redacted sender "mwhapples" for DMARC)
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Wed, 4 Jan 2017 11:50:50 +0000

Firstly, I think your suggested rule of:

pass2 %englishLetter. @56*

does not solve the question, because it needs to apply only to the first character of a string; at the least, a ` prefix would be needed. Also see my later notes on the issues with @56*, which is why I cannot recommend it.


Well, I have been reporting various bugs to this list over the last few days/weeks. Do bug reports sent to the list not get noticed? Where should bug reports go so that they are noted?


1. @56* does not always work. It seems to work in pass2 but not in context. When it is applied in a context rule, I get a space instead of the content that * would copy.

2. The use of grouping characters seems to differ depending on removal or replacement. For a rule like:

pass2 [{mygroup]@1}mygroup ?

Both the opening and closing grouping characters are removed. A rule like:

pass2 [{mygroup]@1}mygroup @56

would only lead to the opening grouping character being replaced; the closing one remains in the translation.

3. The classes defined by $ are not always applied in pass2 rules:

math \xf32e @12e

pass2 [@12e-36]$d @36-3456 # Rule1

pass2 @12e ? # Rule2

For a string like:

\xf32e-3

Whilst I would expect the rule commented # Rule1 to be applied, I find that # Rule2 is applied. I expect # Rule1 because it has the longer match. I conclude that the $d is at fault because a rule like:

pass2 [@12e-36]@25 @36-3456

will be applied. The documentation does not say that $ classes cannot/should not be used in pass2 rules.
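
The expectation in point 3 can be sketched as a toy longest-match selection. This is not how liblouis is implemented; the function names and the cell representation are hypothetical, and dots 25 for the digit 3 is an assumption taken from the alternative rule above.

```python
# Toy model only: NOT liblouis code. All names here are hypothetical.

def match_length(pattern, cells):
    """Return how many cells the pattern consumes from the start, or 0 if it fails."""
    if len(pattern) > len(cells):
        return 0
    for (kind, value), (dots, classes) in zip(pattern, cells):
        if kind == "dots" and dots != value:
            return 0
        if kind == "class" and value not in classes:
            return 0
    return len(pattern)


def pick_longest(rules, cells):
    """Choose the rule whose pattern consumes the most cells."""
    best_name, best_len = None, 0
    for name, pattern in rules:
        length = match_length(pattern, cells)
        if length > best_len:
            best_name, best_len = name, length
    return best_name


rules = [
    ("Rule1", [("dots", "12e"), ("dots", "36"), ("class", "d")]),
    ("Rule2", [("dots", "12e")]),
]

# Pass-1 output for \xf32e-3, assuming the last cell still carries class d.
with_class = [("12e", set()), ("36", set()), ("25", {"d"})]
# The same cells if the class information were not available in pass2.
without_class = [("12e", set()), ("36", set()), ("25", set())]

print(pick_longest(rules, with_class))
print(pick_longest(rules, without_class))
```

With the class attached, the longer Rule1 is chosen; without it, only Rule2 can match. That mirrors the observed outcome but is not a diagnosis of the actual cause.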


I could go on, but this does not seem the place.

My point, though, is that the documentation is so slim that one relies on undocumented behaviour, so would it be considered a bug if that behaviour changes without notice? As an example, you suggest using @56*, but the documentation does not say what happens when doing this, and my observation is that the behaviour is not the same for context and pass2. In fact, in an earlier mail John said that @56* should not be used: * or ? should be the only thing in the third column when they are used.

Michael Whapples


On 04/01/2017 10:56, Bert Frees wrote:

Multipass opcodes aren't that difficult. I don't know of any bugs, but it is possible that there are some; I haven't used things like "*", "_" and "!" much. The issue with the zillion (256) dots 56 isn't a bug. You just end up in a loop because you're starting with the empty brackets, so in the next iteration you don't move forward. 256 is apparently a hard limit; we should probably throw an error if we reach that number.
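
The loop described above can be illustrated with a toy model. This is only a sketch of the idea (a match that consumes nothing never advances), not liblouis's matcher; the function and parameter names are hypothetical, and only the limit of 256 comes from this thread.

```python
# Toy model only: NOT liblouis's matcher. Names and structure are hypothetical.
MAX_ITERATIONS = 256  # the hard limit mentioned in this thread


def run_rule(text, consumed_per_match, replacement="@56"):
    """Apply a rule repeatedly, advancing by the number of characters it consumed."""
    output = []
    position = 0
    iterations = 0
    while position < len(text) and iterations < MAX_ITERATIONS:
        output.append(replacement)
        position += consumed_per_match  # empty brackets consume nothing
        iterations += 1
    return output, iterations


# Empty brackets: zero characters consumed, so the position never advances.
_, stuck = run_rule("a", consumed_per_match=0)
# A class inside the brackets: one character consumed per match.
_, moved = run_rule("a", consumed_per_match=1)
print(stuck, moved)
```

In this model the zero-width case stops only when the cap is reached, while the case that consumes a character finishes after one step.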

Anyway, this seems to work for me:

pass2 %englishLetter. @56*


Can you try that?



2017-01-04 11:19 GMT+01:00 Michael Whapples <dmarc-noreply@xxxxxxxxxxxxx>:

    I am aware that no answer has yet been given to your original
    question. Having done a bit more work using context and multipass
    opcodes for some tables I am working on, I find that how these
    rules work is questionable, and the opcodes feel unreliable. The
    documentation is so slim, and there are so many cases beyond what
    it specifies, that one is reliant on undocumented and probably
    undefined behaviour; who knows if it will change in the future
    without notice.


    I think I now understand why Mike Gray decided to create the match
    opcode to replace these. I am not sure if that match opcode has
    been included into the standard liblouis or if it is still an APH
    specific feature. I am not sure if he added details of the match
    opcode to the documentation but here is a link to an old mailing
    list post where the match opcode was described
    //www.freelists.org/post/liblouis-liblouisxml/new-opcodes

    Also here is a link to the issue for merging the documentation for
    the match opcode
    https://github.com/liblouis/liblouis/pull/189/files


    Maybe this will offer a possible solution.


    Michael Whapples


    On 02/01/2017 16:30, Dave Mielke wrote:

        [quoted lines by Michael Whapples on 2017/01/02 at 10:38 +0000]

            OK, I wasn't certain, and now that you mention getting
            repeated dots 56 for that first rule, I think I had a
            similar issue when creating a different rule.

        For that case (empty brackets), lou_trace gives me:

            1.      lowercase       a       1
            2.      context `[]$w   @56
            3.      lowercase       a       1
            4.      context `[]$w   @56

        And so on. The log made it easier to count. It looped 256 times.

            I have done a bit more looking at it and my original
            suggestion of @56* was wrong; it appears that it can be
            used in the third column. So yes, your original suggestion
            looks correct.

        This is what lou_trace gives for my original method (class
        name within brackets, and @56*):

            1.      lowercase       a       1
            2.      context `[$w]   @56*
            3.      lowercase       b       12
            4.      lowercase       c       14
            5.      space           0
            6.      lowercase       d       145
            7.      context _!$l[$w]        @56*
            8.      lowercase       e       15
            9.      lowercase       f       124


    For a description of the software, to download it and links to
    project pages go to http://liblouis.org


