[liblouis-liblouisxml] Re: SV: Re: Python, please leave my Braille in peace

  • From: "Michael Whapples" <dmarc-noreply@xxxxxxxxxxxxx> (Redacted sender "mwhapples@xxxxxxx" for DMARC)
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Mon, 14 Sep 2015 20:57:12 +0100

I don't think Python only uses 7 bits. Certain encodings may restrict you to 7 bits, but the bytes type in Python 3 holds 8-bit values fine.

As for writing a codec, maybe the best way to deal with it is to take the nearest existing Python codec and modify or add to it so that it handles the additional characters.
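A minimal sketch of that idea, assuming cp1252 is the nearest codec (as the original question suggests). The byte-to-character pairs in EXTRA and the function names are hypothetical placeholders, not the real Danish 8-dot assignments, which would have to come from the Braille table itself.

```python
import codecs
import encodings.cp1252 as cp1252

# Hypothetical extra mappings: byte value -> Unicode character.
# Placeholders only; 0x81 and 0x8D are undefined in Python's cp1252.
EXTRA = {0x81: "\u2801", 0x8D: "\u2803"}

# Start from the cp1252 decoding table (a 256-character string) and
# overwrite the positions that need a different meaning.
table = list(cp1252.decoding_table)
for byte, char in EXTRA.items():
    table[byte] = char
DECODING_TABLE = "".join(table)

# Build the reverse (str -> bytes) mapping from the decoding table.
ENCODING_TABLE = codecs.charmap_build(DECODING_TABLE)


def encode_braille(text, errors="strict"):
    """Encode a str to bytes using the extended table."""
    return codecs.charmap_encode(text, errors, ENCODING_TABLE)[0]


def decode_braille(data, errors="strict"):
    """Decode bytes to a str using the extended table."""
    return codecs.charmap_decode(data, errors, DECODING_TABLE)[0]
```

This keeps every cp1252 mapping intact and only fills in the positions listed in EXTRA, so it can be used as plain functions without registering a named codec.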

From a Python 3 point of view your code is also wrong, because you are calling decode on a str object (in Python 2 these were unicode objects). It should be encode, which converts the str into a bytes object. Python 2 may have allowed decode on a unicode object, but that is not good practice as it won't work in Python 3.

Your code should look more like:
return louis.translateString([tableString], inString).encode("cp437")
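A minimal sketch of the encode-then-write step in Python 3. It does not call liblouis: the string below is a stand-in for the value returned by louis.translateString. It also swaps in latin-1 instead of cp437, on the assumption that every character in the output is below U+0100 (the u'\x98' in the traceback hints at this, but it is not confirmed). The output file name is hypothetical.

```python
import os
import tempfile

# Stand-in for the str returned by louis.translateString(...);
# the real contents depend on the table in use.
translated = "abc\u0098\u00e6"

# latin-1 maps code points U+0000..U+00FF to the same byte values,
# so nothing is remapped as long as every character is below U+0100.
# A character outside that range raises UnicodeEncodeError.
data = translated.encode("latin-1")

# Open the file in binary mode so the bytes are written untouched.
out_path = os.path.join(tempfile.gettempdir(), "braille_out.brl")
with open(out_path, "wb") as f:
    f.write(data)
```

If the output does contain characters above U+00FF, latin-1 will not work and a table-based codec is needed instead.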

Michael Whapples

On 14/09/2015 15:34, Bue Vester-Andersen wrote:

Hi Michael,

Thanks for the suggestion.
I hope it won't be necessary to write a new encoder from scratch. I don't
really want to map something to something else. I just want to convince
Python that I am dealing with 8-bit ASCII strings, and not have Python try
to remap them to Unicode. It seems that in Python, everything is either
Unicode or 7-bit ASCII.

Bue


-----Original message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On behalf of Michael
Whapples
Sent: 14 September 2015 15:34
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: Python, please leave my Braille in peace

If I understand correctly, would writing a custom string encoder help?

Here is a question about writing a custom encoder on Stack Overflow:
http://stackoverflow.com/questions/5819586/how-do-i-write-a-custom-encoding-in-python-to-clean-up-my-data

Michael Whapples

On 14/09/2015 14:24, Bue Vester-Andersen wrote:
Hi,

Please, I hope someone can help me solve this Python problem:

When handling Danish Braille output from Liblouis, I need to not just
handle 7-bit ASCII, but the full 8-bit range, perhaps except \x00 and \x7f
(null and delete).
Actually, the Danish 8 dot character set is built on cp1252, but with some
additional characters that are not defined in cp1252. So, I need to make
Python think that the Braille output is just some 8-bit ASCII of some code
page or other, and then make it keep its hands off my Braille, so that I
can output it to a file as it is.

I have tried code like:

return louis.translateString([tableString], inString).decode("cp437")

Then I get the following error:

Traceback (most recent call last):
...
File "c:\python27\lib\encodings\cp437.py", line 15, in decode
return codecs.charmap_decode(input,errors,decoding_table)
UnicodeEncodeError: 'ascii' codec can't encode character u'\x98' in position 0: ordinal not in range(128)

Does anybody have a good idea?

Bue


For a description of the software, to download it and links to
project pages go to http://liblouis.org