[liblouis-liblouisxml] Re: AW: Re: Python package for easy installation of liblouis - announcing Transcribo, a Braille type-setting system - feedback and help wanted

  • From: Michael Whapples <mwhapples@xxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Sun, 19 Jul 2009 12:19:29 +0100

I've just remembered something that could be a big catch: the UCS2 versus UCS4 issue. It makes me think it would be better to use ./configure and make where possible. I believe ./configure and make work with mingw, but I don't know whether they work with MSVC (although I believe mingw can produce output compatible with MSVC). However, I imagine most users on windows do not have a C compiler, so maybe a binary dll should be provided, although that brings us back to UCS2 and UCS4 again.


Maybe I should briefly say what the UCS2 and UCS4 problem is. Basically, python can be compiled for 16-bit or 32-bit unicode, and so can liblouis (16-bit unicode is UCS2 and 32-bit unicode is UCS4). If we have a 16-bit unicode version of python, then for the bindings to work we need a UCS2 build of liblouis; if we have a 32-bit unicode version of python, we need a UCS4 build of liblouis. We cannot mix them (i.e. 16-bit python with UCS4 liblouis will not work, and neither will 32-bit python with UCS2 liblouis). If python and liblouis use different unicode sizes, then at best the output from liblouis will be nonsense, and I think in the worst case python can crash with no way for python apps to recover (I am not quite sure whether it is a segmentation fault, but it is something just as serious).
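
To give a feel for the "nonsense" case, here is a rough analogy in pure python. It does not touch liblouis or the bindings at all; it only shows what happens when bytes written with one character width are read back with the other, which is my assumption about what goes wrong underneath.

```python
# Text written as 32-bit units but read back as 16-bit units:
# every character is followed by a spurious NUL character.
as_ucs4 = u"ab".encode("utf-32-le")
misread = as_ucs4.decode("utf-16-le")
print(repr(misread))

# Text written as 16-bit units but read back as 32-bit units:
# two characters get fused into one value that is not a valid
# code point, so the decoder refuses it.
as_ucs2 = u"abcd".encode("utf-16-le")
try:
    as_ucs2.decode("utf-32-le")
    reverse_decoded = True
except UnicodeDecodeError:
    reverse_decoded = False
print(reverse_decoded)
```

In python the second case fails cleanly with an exception; a C library handed the wrong width has no such safety net, which is why I would expect garbage or a crash there.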

So my thought is that, if such a setup.py script is to be written, we do the following. On windows, provide a dll for the binary version of python (i.e. whichever unicode size is used in the official python builds). We would detect this by checking the platform and checking sys.maxunicode (which is 65535 for 16-bit unicode and larger for 32-bit unicode). We could provide a second dll for the other unicode size, but this obviously starts increasing the package size, or we could just try to compile. If the platform is not windows, I believe the compile process is the ./configure and make procedure, so we could just do this. We can pass the configure script the correct option for the unicode size of the python being used (again, this can be found by checking sys.maxunicode).
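
A minimal sketch of that decision logic, as it might look inside setup.py. The function names are made up for illustration. I am assuming that the configure option for 32-bit characters is --enable-ucs4 (check ./configure --help before relying on it) and that the bundled dll would be built for 16-bit unicode; both are assumptions, not something I have tested.

```python
import sys


def unicode_width(maxunicode=None):
    """Return 2 for a 16-bit (UCS2) python, 4 for a 32-bit (UCS4) one."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # 16-bit builds report 0xFFFF (65535); 32-bit builds report more.
    return 4 if maxunicode > 0xFFFF else 2


def configure_args(width):
    """Options to pass to ./configure for the given character width."""
    # Assumption: --enable-ucs4 selects 32-bit characters and the
    # default build uses 16-bit characters.
    return ["--enable-ucs4"] if width == 4 else []


def choose_strategy(platform, width, bundled_dll_width=2):
    """Decide between using the bundled dll and compiling from source."""
    # Assumption: the package ships one dll, built for bundled_dll_width.
    if platform.startswith("win") and width == bundled_dll_width:
        return "bundled-dll"
    return "compile"


if __name__ == "__main__":
    width = unicode_width()
    print(choose_strategy(sys.platform, width), configure_args(width))
```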

Also, I would say the above should be considered a source package; I don't think it would be possible to create a binary package (due to the UCS2 and UCS4 problem).

The only other thing I will say is that on Linux, where there are advanced package management tools (such as apt on debian), easy_install is very basic and would not be considered a preferred choice. Therefore users of linux will probably get liblouis via their distribution's package system, where the UCS2 and UCS4 issues are already dealt with. Also, on debian, liblouis and its bindings are packaged separately to give users choice.

Michael Whapples

On 19/07/09 01:07, Leo wrote:
I haven't tried, and I hope I won't need to. If you knew how poor my
knowledge on C compilers is... but I think what you write is a very good
starting point.

Here are some further thoughts to increase confusion:

The whole thing has to be portable. So if the configure script runs on all
platforms with all compilers (eg. mingw and MSVC on Windows), there is
probably nothing to object against your distutils-free approach which is
easier to maintain as you rightly point out. A no-brainer would probably
call 'make' on Unix-like OS's and use the ready-made DLL on win32.
Perfectionists would probably use setuptools as it abstracts from all the
platform and compiler specificities. If liblouis' configure script does that
job, I don't know. I would assume that smooth compiling with mingw and MSVC
should be the perfectionist's bottom line. But others on this list are much
better placed to judge this.

Leo

-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Michael
Whapples
Sent: Sunday, 19 July 2009 01:28
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: Python package for easy installation
of liblouis - announcing Transcribo, a Braille type-setting system -
feedback and help wanted


Further to my thoughts yesterday, I have now managed to do what you are
asking (well not quite yet for liblouis, but for a very, and I do mean
very, basic example of a shared library). There is another solution I
could guarantee to work for liblouis but may not sit well for a python
developer, you could always have the setup script execute the configure
script and make file.
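
Roughly, that fallback could look like the sketch below. The function names are hypothetical, and I am assuming a Unix-like system with sh and make on the path; it is not tested against the liblouis sources.

```python
import subprocess


def build_commands(configure_args=()):
    """Return the commands to run, in order, from the source directory."""
    # Running configure through sh avoids depending on its executable bit.
    return [["sh", "./configure"] + list(configure_args), ["make"]]


def run_build(source_dir, configure_args=()):
    """Run configure and make in source_dir, stopping at the first failure."""
    for command in build_commands(configure_args):
        # check_call raises CalledProcessError on a non-zero exit status.
        subprocess.check_call(command, cwd=source_dir)
```

The setup script would call run_build() with the directory holding the unpacked sources before it goes on to install the python bindings.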

Anyway back to getting setuptools to actually perform the compile.

So taking the C file mylib.c:

#include "mylib.h"
int addNums(int val1, int val2) {
      return val1 + val2;
}

and the header file mylib.h:

int addNums(int val1, int val2);

and then creating the setup.py script:

from setuptools import setup
from setuptools.extension import Library
setup(name="mylib",
      version="1.0",
      description="An example of compiling a library",
      ext_modules=[Library("mylib", ["mylib.c"], include_dirs=["."])]
)

Now run:

python setup.py build

You should find a shared object file in the build directory (look at the
output from the setup script to get the exact file name). I checked that
the shared object file worked as a proper shared object file by using
ctypes in python to load it and use the addNums function.
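
For reference, a check along those lines with ctypes might look like this. The helper names are made up, and the search simply takes the first file under build/ whose name contains "mylib" and ends in a shared-library suffix, since the exact file name depends on the platform. On windows the function may also need to be exported explicitly from the dll, which this sketch does not cover.

```python
import ctypes
import os

SHARED_SUFFIXES = (".so", ".dll", ".dylib")


def find_built_library(build_dir, stem="mylib"):
    """Return the path of the first shared library found, or None."""
    for root, _dirs, files in os.walk(build_dir):
        for name in sorted(files):
            if stem in name and name.endswith(SHARED_SUFFIXES):
                return os.path.join(root, name)
    return None


def add_nums(library_path, val1, val2):
    """Load the library and call addNums(val1, val2)."""
    lib = ctypes.CDLL(os.path.abspath(library_path))
    # Declare the C signature: int addNums(int, int).
    lib.addNums.argtypes = [ctypes.c_int, ctypes.c_int]
    lib.addNums.restype = ctypes.c_int
    return lib.addNums(val1, val2)


if __name__ == "__main__":
    path = find_built_library("build")
    if path is None:
        print("no shared library found; run 'python setup.py build' first")
    else:
        print(add_nums(path, 2, 3))
```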

I don't think this makes use of the build_clib step I mentioned
yesterday, I think setuptools.extension.Library is a replacement for an
Extension object which deals with stand alone C libraries and so
compilation happens as part of the build_ext step. Whether this has any
effect on needing to be careful about the order in which
extensions/libraries are listed I don't know.

Also I am unsure whether defining all the compilation stuff like this in
setup.py is a good idea, IE. we would have two versions of the build
system, one using make the other using setuptools, and so both would
need maintaining and could get out of sync.

Does the above help at all?

Michael Whapples

On 17/07/09 22:36, Leo wrote:
Hello,

I am new to this list. So let me briefly explain who I am, why I've joined
and what I want.

1. I am using Braille in different languages and contexts, mainly English
and German, simple text and music, both on refreshable displays and paper.
None of the software transforming something into ready-to-emboss plain text
appeals to me as it is either closed-source, costly, inflexible, inaccurate,
complicated or a combination of these. Admittedly I haven't tried out
liblouisxml. But here, already the name is complicated and I anticipate
difficulties compiling it on Windows.

2. I like Python for its almost ideal combination of clear syntax,
conciseness, user-friendliness and speed. My first project has been
PyHyphen, a Python wrapper around a C library for multilingual hyphenation
that is used, eg, in OpenOffice.org (see
http://pypi.python.org/pypi/PyHyphen/).

3. At some point I took a look at reStructuredText (rST), a light-weight,
extensible markup language that is predominantly used to write software
documentation, eg. for the entire Python distribution (see
http://en.wikipedia.org/wiki/ReStructuredText). reStructuredText is very
easy to learn, powerful and clear to read. I am convinced it could serve as
an excellent input format for high-end Braille layout. Its features include
sections, bullet and enumerated lists, definition lists, tables, references
such as auto-numbered footnotes, tables of contents, bibliographical
information to name but a few. What's more, rST can be extended through
custom directives and so-called interpreted text roles. Hence, it seemed
possible to use rST to mark-up text such that the output back-end would use
different Braille translators as required, eg. for text including math,
music etc.

4. The reference implementation to process rST sources is Docutils
(http://docutils.sourceforge.net/). It can generate HTML, LaTeX, Beamer,
pdf, OpenOffice and other output formats from rST sources. Why not
ready-to-emboss plain text?

5. So a few months ago I started Transcribo.
Homepage: http://transcribo.berlios.de
Download daily snapshots from the Mercurial repository at:
http://hg.berlios.de/repos/transcribo/archive/tip.tar.bz2

Transcribo is currently a plain text back-end for Docutils. However, its
three-tier architecture makes it open to other input formats such as LaTeX,
odf, RTF, xml, plain text or whatever. The core of Transcribo is a rendering
package that generates a tree structure of frames. A frame can be thought of
as a freely placeable, rectangular area on paper. The frames API is flexible
enough to represent all kinds of lists, tables, multiple columns, centered
headings, and much more. Each frame may contain objects carrying content of
any type. Each content object may be given dedicated translator instances,
wrappers with or without hyphenation etc. In particular, Transcribo supports
liblouis as a translator for content to populate frames. Finally, the
frame-tree representation of the input file is assembled to form a plain
text file.

The bridge between Docutils and the frame renderer (in Docutils terminology
this is called a writer) supports a subset of reStructuredText. Current
features include headings, paragraphs, hyperlinks, emphasized text style,
multi-level bullet lists and enumerations. Adding new features is often a
matter of a few lines of code. Transcribo's renderer is configured through
Python dictionaries. Future versions may prefer other formats such as JSON
or xml. The Docutils writer is mainly configured using the Docutils
configuration system, i.e. a config file and command line options. But this
is still somewhat rudimentary. However, a command line option to choose the
default translator is already implemented.

6. While Transcribo works with various translators, liblouis is currently
the most important one as it supports so many languages and math.

7. Transcribo might benefit from some refinement, testing and bug-fixing
before the first public release. Also, I'd like to make sure that users can
easily install liblouis. When I tried to install it, I had some problems:
- finding the dll which is not on googlecode. John kindly pointed me to the
  page.
- copying the dll manually into the Windows/system32 directory
- downloading the liblouis sources
- installing the Python bindings
- copying some tables to a reasonable place

8. I'd like to see liblouis on the Python package index (pypi) so it can be
installed automatically using setuptools. To this end, the dll needs to be
bundled with the Python bindings, some tables and the C sources. On Unix
systems, the sources would need to be compiled; on Win32, the dll needs to
be installed, preferably in the package directory rather than the
windows/system32 dir as users do not always have admin privileges. It would
be just great if the Python gurus on this list could make an effort.
Clearly, I would help write the setup script, although I don't know offhand
how to tell distutils to compile a shared library that is not a C extension
module.
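
To illustrate what I mean by the package directory: the bindings could look for the library next to their own Python files instead of relying on the system search path. This is only a sketch; the function name and the file names liblouis.dll and liblouis.so are placeholders I made up, not the names the real build produces.

```python
import os
import sys


def bundled_library_path(package_dir, platform=None):
    """Return where a library shipped inside the package would live."""
    if platform is None:
        platform = sys.platform
    # Placeholder file names, chosen for illustration only.
    name = "liblouis.dll" if platform.startswith("win") else "liblouis.so"
    return os.path.join(package_dir, name)
```

Inside the package, the bindings would then pass bundled_library_path(os.path.dirname(os.path.abspath(__file__))) to ctypes.CDLL, so no admin privileges are needed to install or load the dll.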

Also, I would welcome any feedback and/or help on Transcribo. There is a
mailing list (see the homepage). It is not yet in use though. So feel free
to join.

Warm regards

Leo



For a description of the software and to download it go to
http://www.jjb-software.com
