[liblouis-liblouisxml] Re: ctypes and bad trailing pad byte

  • From: James Teh <jamie@xxxxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Thu, 04 Feb 2010 09:57:59 +1000

Hi.

On 3/02/2010 9:09 PM, Christian Egli wrote:
> I've been plagued with segmentation faults in the liblouis Python
> interface the last couple of days.
> ...
> Debug memory block at address p=0x9ace5a8:
>      41 bytes originally requested
>      The 4 pad bytes at p-4 are FORBIDDENBYTE, as expected.
>      The 4 pad bytes at tail=0x9ace5d1 are not all FORBIDDENBYTE (0xfb):

Ack. I think this is due to a bug in the Python bindings, which I have fixed in r331. On input, typeform needs to contain inlen bytes of typeform data, but liblouis writes outlen bytes of data back to it (this is documented). The Python bindings weren't allocating outlen bytes for typeform, so liblouis wrote past the end of the buffer. I've never seen the bug before because we're not using typeform data yet. Let me know if this fixes the crash.
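
To illustrate the sizing rule, here is a minimal sketch, not the actual r331 change. The helper name make_typeform_buffer and the use of c_ubyte are assumptions for illustration only; the point is just that the buffer has to hold the larger of inlen and outlen.

```python
import ctypes

def make_typeform_buffer(typeform, inlen, outlen):
    # Hypothetical helper, not part of the liblouis bindings.
    # The buffer carries inlen bytes in and receives outlen bytes back,
    # so it must be large enough for whichever is bigger.
    size = max(inlen, outlen)
    values = list(typeform or [])[:inlen]
    # Elements not covered by the initialisers are zero-filled by ctypes.
    return (ctypes.c_ubyte * size)(*values)

# 3 bytes of input typeform, room for 8 bytes of output.
buf = make_typeform_buffer([1, 1, 0], inlen=3, outlen=8)
```

Allocating only inlen elements here would reproduce the overrun as soon as the output is longer than the input.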

This also presents another issue: the bindings aren't returning typeform to the user. They need to be reworked to present more data. I think this could be best done with a namedtuple (a dict is a bit of a pain), but namedtuples aren't in Python 2.5 and earlier, so we might need to invent our own class.
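
Something along these lines might do; treat it as a rough sketch rather than a proposed API. The class name TranslationResult and the field names text and typeform are made up for illustration, and the fallback class is only one way to imitate a namedtuple on older interpreters.

```python
try:
    from collections import namedtuple  # available from Python 2.6
    TranslationResult = namedtuple("TranslationResult", "text typeform")
except ImportError:
    class TranslationResult(tuple):
        """Fallback for Python 2.5 and earlier: a tuple with named fields."""
        __slots__ = ()

        def __new__(cls, text, typeform):
            return tuple.__new__(cls, (text, typeform))

        text = property(lambda self: self[0])
        typeform = property(lambda self: self[1])

# Hypothetical return value from a translate call.
result = TranslationResult("hello", [0, 0, 0, 0, 0])
text, typeform = result  # still unpacks like a plain tuple
```

Either way, callers can use result.text and result.typeform, and code that expects a tuple keeps working.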

> 1) When invoking with the Python interpreter that has debugging symbols
> the translation is different (__>USA _TODA'Y instead of _>USA __TODA'Y).
> I have no idea why this could be.

Nor I. Any change with this fix?

> Is there something in the ctypes definition that we need
> to specify that a Unicode string is returned from translateString?

Already done; see the argtypes stuff at the top.

> Or is
> this maybe a problem with UCS2 and UCS4 when building the two versions
> of the Python interpreter.

Nope; you'd see a lot worse problems if that were the case.

Jamie

--
James Teh
Email/MSN Messenger/Jabber: jamie@xxxxxxxxxxx
Web site: http://www.jantrid.net/
