[liblouis-liblouisxml] Re: LibLouis bug with largesign

  • From: Michael Whapples <mwhapples@xxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Wed, 12 Mar 2014 14:01:16 +0000

Well, 0xffff is hardly an escape sequence, as it's a single character; it might be a control character.


An escape sequence has an escape character which puts the software or hardware into another mode, so that the following characters are interpreted differently until a terminating condition is met (usually either a fixed number of characters has been read or a terminating character is found).

Just using a single character as a control character relies upon that character never being used for anything else.

Having done a search, it appears 0xffff is defined in Unicode as being for internal use. Thus, so long as liblouis does not change the meaning of this character, we are fine.
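
As a rough illustration of the point about relying on a single character, the sketch below checks that U+FFFF is reported as unassigned by Python's Unicode database and rejects input that already contains it. The names SEPARATOR, is_unassigned and check_segments are hypothetical and are not part of liblouis or liblouisutdml.

```python
import unicodedata

SEPARATOR = "\uffff"  # the code point discussed in this thread


def is_unassigned(ch: str) -> bool:
    # "Cn" is the general category for code points with no assigned
    # character, which includes noncharacters such as U+FFFF.
    return unicodedata.category(ch) == "Cn"


def check_segments(segments):
    # Hypothetical guard: refuse input that already contains the
    # separator, since it could no longer be told apart from a boundary.
    for index, segment in enumerate(segments):
        if SEPARATOR in segment:
            raise ValueError(f"segment {index} already contains U+FFFF")
    return segments
```

A guard like this only protects the input side; it does nothing if the translator itself drops the character.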

Michael Whapples
On 12/03/2014 13:22, John J. Boyer wrote:
0xffff is the escape sequence. It is the largest value that fits in the
minimum width of widechar. It was chosen because it applies to both input
characters and output dot patterns.

John
On Wed, Mar 12, 2014 at 11:46:58AM +0000, Michael Whapples wrote:
I guess one solution to having inline delimiters is not using single
characters but having an escape sequence. Thinking of other systems, the
delimiter might be marked with a backslash (\), and then a literal
backslash would be written as two backslashes (\\).

This obviously adds an escaping step to ensure backslashes get
preserved.
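
A minimal sketch of the backslash scheme described above, assuming a visible "|" as the delimiter purely for illustration. ESC, DELIM and the three functions are hypothetical names; nothing here is liblouis code, and a real translator would still have to pass the escaped text through untouched.

```python
ESC = "\\"    # escape character
DELIM = "|"   # hypothetical segment delimiter, chosen only for this sketch


def escape(segment: str) -> str:
    # Double the escape character first, then protect the delimiter.
    return segment.replace(ESC, ESC + ESC).replace(DELIM, ESC + DELIM)


def join_segments(segments):
    return DELIM.join(escape(s) for s in segments)


def split_segments(text: str):
    segments, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == ESC and i + 1 < len(text):
            # Escaped character: take the next character literally.
            current.append(text[i + 1])
            i += 2
        elif ch == DELIM:
            # Unescaped delimiter: close the current segment.
            segments.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    segments.append("".join(current))
    return segments
```

With this scheme the delimiter can appear in the text itself, at the cost of an escaping pass before translation and an unescaping pass after it.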

Michael Whapples
On 12/03/2014 11:38, Keith Creasy wrote:
Michael.

Great work finding the problem.

Using 0xffff to demarcate segments seems like a hack, really. For one
thing, it assumes, I suppose, that the value isn't used for something else.

The problem is the need to translate in context but keep the Braille
associated with the correct text node. At the moment I can't think of a
fool-proof way of doing that without such a delimiter.

Maybe John can do something that prevents LibLouis from stripping the
delimiter.

-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Michael
Whapples
Sent: Wednesday, March 12, 2014 7:22 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx; brailleblaster@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] LibLouis bug with largesign

Hello,
In tracking down the bug where liblouisutdml is putting the wrong Braille
content in brl nodes, I have now worked out the actual code which is
causing this issue.

The problem seems to be that LibLouisUTDML processes a paragraph by adding
each text segment of the paragraph into a single buffer, separating the
segments with \xffff characters. Once the buffer contains the full
paragraph text, it translates it in one go and then splits the translation
back into segments by searching for \xffff.

The trouble arises when text processed by a largesign opcode rule sits on
both sides of the \xffff end-of-segment separator (e.g. "and
\xffffthe"): liblouis then removes the \xffff from between the largesign
translations. When liblouisutdml takes the translation and splits it into
segments for inserting into the appropriate brl nodes, it misses one of the
split points.
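
The failure described above can be sketched as follows. translate_stub is a hypothetical stand-in that mimics only the reported symptom (dropping whatever lies between "and" and "the"); it is not liblouis, and the joined output "andthe" is a placeholder rather than real Braille.

```python
import re

SEP = "\uffff"  # segment separator used in the shared buffer

# Matches "and" and "the" with only spaces and/or separators between them.
_LARGESIGN_PAIR = re.compile(r"\band[ " + SEP + r"]+the\b")


def translate_stub(text: str) -> str:
    # Hypothetical translator: joins the two words and, as a side effect,
    # discards any separator that sat between them.
    return _LARGESIGN_PAIR.sub("andthe", text)


def translate_paragraph(segments):
    buffer = SEP.join(segments)          # one buffer for the whole paragraph
    translated = translate_stub(buffer)  # translate in context, in one go
    return translated.split(SEP)         # recover per-segment output


segments = ["cats and ", "the dogs", " play"]
result = translate_paragraph(segments)
# Three segments went in; compare len(result) with len(segments) to see
# whether every split point survived translation.
```

Comparing the number of segments before and after translation is one cheap way to detect the lost split point, though it does not say where the boundary should have been.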

Could someone point me at the code in liblouis which strips the characters
between largesign words? Then I could make a fix fairly quickly.

I guess a longer term question I want to raise is whether this way of
processing paragraphs is really a good plan?

Michael Whapples
For a description of the software, to download it and links to project
pages go to http://www.abilitiessoft.com