[liblouis-liblouisxml] Re: Crash in Python hyphenation

  • From: James Teh <jamie@xxxxxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Thu, 07 Jun 2012 20:05:41 +1000

On 7/06/2012 7:42 PM, James Teh wrote:
> 1. The documentation for lou_hyphenate needs to be clarified to indicate
> that the hyphens string should be of size inlen + 1 (to account for the
> NULL terminator);

The Python bindings of course need to be updated accordingly, i.e. using inlen.value + 1.

> 2. Somewhere, the code is overrunning to inlen + 2. We haven't quite
> managed to track this down yet.

Christian and I think we might have found the culprit: the loop in hyphenate() at line 267. The story starts a few lines back:

261   j = 0;
262   prepWord[j++] = '.';
// Now j = 1
263   for (i = 0; i < wordSize; i++)
264     prepWord[j++] = (findCharOrDots (word[i], 0))->lowercase;
// At end of loop, j = wordSize + 1
265   prepWord[j++] = '.';
// Now j = wordSize + 2
266   prepWord[j] = 0;

The point of this code seems to be to create a temporary lowercase copy of the word with a "." character at each end. However:

267   for (i = 0; i < j; i++)
268     hyphens[i] = '0';

This code seems to be initialising hyphens. However, hyphens holds wordSize characters (hyphens[wordSize] is the NULL terminator), and since j is wordSize + 2 at this point, the loop writes up to index wordSize + 1 instead of stopping at wordSize - 1. This is our buffer overrun. If I'm correct, this should instead be:

267   for (i = 0; i < wordSize; i++)
268     hyphens[i] = '0';
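
To make the arithmetic concrete, here is a minimal standalone sketch. It is not liblouis code: init_hyphens and bytes_past_end are hypothetical helpers that only mimic the initialisation loop, and the 64-byte arena with '#' guard bytes is an assumption used to detect writes past the end of the caller's buffer without actually corrupting memory.

```c
#include <assert.h>
#include <string.h>

#define GUARD '#'

/* Mimics the loop at lines 267-268: fills hyphens[0 .. bound-1] with '0'.
   Hypothetical helper, not the liblouis function. */
static void init_hyphens(char *hyphens, int bound)
{
    int i;
    for (i = 0; i < bound; i++)
        hyphens[i] = '0';
}

/* Pretends the caller allocated callerSize bytes at the start of a larger
   guarded arena, runs the loop with the given bound, and returns how many
   bytes beyond the caller's buffer were overwritten. */
static int bytes_past_end(int callerSize, int bound)
{
    char arena[64];
    int i, overrun = 0;

    assert(callerSize >= 0 && callerSize <= (int) sizeof arena);
    assert(bound >= 0 && bound <= (int) sizeof arena);

    memset(arena, GUARD, sizeof arena);
    init_hyphens(arena, bound);

    /* Any byte at or after index callerSize that lost its guard value
       was written outside the caller's buffer. */
    for (i = callerSize; i < (int) sizeof arena; i++)
        if (arena[i] != GUARD)
            overrun++;
    return overrun;
}
```

For a 5-character word, bytes_past_end(6, 5 + 2) reports one stray byte even when the caller allocates inlen + 1, and bytes_past_end(5, 5 + 2) reports two when the caller allocates only inlen bytes. With the proposed bound, bytes_past_end(6, 5) reports none.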

John, does this look right to you?

Jamie

--
James Teh
Director, NV Access Limited
Email: jamie@xxxxxxxxxxxx
Web site: http://www.nvaccess.org/
Phone: +61 7 5667 8372


