[liblouis-liblouisxml] Re: Issue with emphasis

  • From: "Michael Whapples" <dmarc-noreply@xxxxxxxxxxxxx> (Redacted sender "mwhapples@xxxxxxx" for DMARC)
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Thu, 20 Nov 2014 12:03:55 +0000

Yes, it is true that plain ASCII text files cannot carry that sort of information; it takes a format like XHTML to provide it.


I think the question here is whether liblouis, as the translator, should handle text attributes or whether that belongs outside it. Are text attributes so core that they should be a feature of the main translation system, or are they something for document translation tools to handle?

One thing is for sure: internally, liblouis has to have its own construct to handle text attributes rather than using something standard. At the moment this construct is an additional array of bytes of the same length as the input string. The value of each byte in that array determines the attributes of the character at the same position in the input array.
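
As a rough illustration of the parallel-array idea, here is a minimal sketch in C. The attribute names, bit values and helper functions are made up for this example and are not taken from the liblouis headers; the only point is that typeform[i] describes the character at text[i].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute bits for this sketch only. */
enum {
    ATTR_PLAIN     = 0,
    ATTR_ITALIC    = 1 << 0,
    ATTR_UNDERLINE = 1 << 1,
    ATTR_BOLD      = 1 << 2
};

/* Set the given attribute bits on characters [start, end).
   typeform has len entries, one per input character. */
void mark_range(unsigned char *typeform, size_t len,
                size_t start, size_t end, unsigned char attrs)
{
    assert(start <= end && end <= len);
    for (size_t i = start; i < end; i++)
        typeform[i] |= attrs;
}

/* Count the characters that carry all of the given attribute bits. */
size_t count_with(const unsigned char *typeform, size_t len,
                  unsigned char attrs)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if ((typeform[i] & attrs) == attrs)
            n++;
    return n;
}
```

An application holding the 11-character text "a bold word" would zero an 11-byte array and call mark_range(typeform, 11, 2, 6, ATTR_BOLD) to flag "bold" before handing both arrays to the translator.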

An alternative implementation could have been a struct containing a character (widechar or Unicode character) and a typeforms field, with an array of that struct (i.e. the struct represents a character together with its attributes).
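
For comparison, the struct-based alternative might look something like the sketch below. AttrChar and attr_chars_from_parallel are hypothetical names invented here, and wchar_t stands in for whatever character type liblouis would actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical type: one character together with its attribute bits. */
typedef struct {
    wchar_t ch;
    unsigned char typeform;
} AttrChar;

/* Build an AttrChar array from parallel text and typeform arrays.
   Returns NULL if allocation fails; the caller frees the result. */
AttrChar *attr_chars_from_parallel(const wchar_t *text,
                                   const unsigned char *typeform,
                                   size_t len)
{
    assert(len == 0 || (text != NULL && typeform != NULL));
    /* Allocate at least one element so len == 0 still yields a valid pointer. */
    AttrChar *out = malloc((len ? len : 1) * sizeof *out);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        out[i].ch = text[i];
        out[i].typeform = typeform[i];
    }
    return out;
}
```

The information is the same as with the two parallel arrays; the difference is that a character and its attributes cannot drift out of step, at the cost of no longer being able to pass the text on as a plain string.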

In either case the application needs to create these constructs for liblouis to handle text attributes.

I am not sure that UTD is needed for text attributes, or that it offers much there.

It may once we start introducing indexing, which is yet another thing to associate with the text characters.

Michael Whapples
On 20/11/2014 11:42, Larry Skutchan wrote:
Isn't part of the issue that one cannot represent some of the target indicators 
in ASCII? How, for example, does one represent bolding, italics, or underlining 
in a text file?
My understanding is that UTD is also a means of passing much more rich 
information to the translator.



-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx 
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Michael Whapples
Sent: Thursday, November 20, 2014 6:33 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: Issue with emphasis

My only question about that is: if the case for doing it in liblouis is so 
strong, why have so many people done their own hack workarounds outside liblouis 
rather than fixing liblouis and submitting a patch? What is it that everyone is 
missing in the "fix liblouis" solution?

There must be something in the external solution idea which has led people 
that way.

I am suggesting that the added complexity of the external solution cannot be 
that great, since people seem to prefer it to working with the liblouis code.

Additionally, I would suggest that having to work in C increases the difficulty, 
as one loses some of the features of higher-level languages (e.g. string 
manipulation libraries, object orientation, etc.).

Michael Whapples
On 20/11/2014 10:38, Bert Frees wrote:
A key point is that all the behaviour including text-level formatting
should be programmable in a single file format (i.e. the liblouis
table). I don't see how an external solution could be better than
including the logic in liblouis itself. It can only make things more complex.

Re: the possible backward compatibility issue, I'm hoping the typeform
argument can be extended to cover all the UEB requirements and more.
If not, we could introduce an additional function that accepts a
typeform with a bigger width (currently it's a byte).
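
To make the wider-typeform idea concrete, here is a minimal sketch under stated assumptions: wide_typeform, WIDE_ATTR_EXTRA and widen_typeform are hypothetical names, and 16 bits is an arbitrary choice of width. It only shows that existing byte-sized values can be carried over unchanged while leaving room for flags that do not fit in a byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wider typeform element for this sketch. */
typedef uint16_t wide_typeform;

/* Hypothetical flag that does not fit in the current byte-sized typeform. */
enum { WIDE_ATTR_EXTRA = 1 << 8 };

/* Copy byte-sized typeform values into a wider array.
   The low 8 bits keep their existing meaning; the high bits start at 0. */
void widen_typeform(const unsigned char *narrow, wide_typeform *wide,
                    size_t len)
{
    assert(len == 0 || (narrow != NULL && wide != NULL));
    for (size_t i = 0; i < len; i++)
        wide[i] = narrow[i];
}
```

With something along these lines, the existing byte-based entry point could widen its argument and forward to the new function, so current callers would not need to change.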

Bert


James Teh writes:

Hi.

On 20/11/2014 9:34 AM, Michael Whapples wrote:
Whilst fixing the emphasis stuff, it probably should be modified to
allow for other text attribute handling, such as the additional
emphasis requirements of UEB and also at APH we are thinking that
transcriber notes could be handled in the same way as emphasis.
Where I am going with this is that it might potentially lead to a
change in the API which might not be backwards compatible,
Unless I'm missing something, I imagine transcriber notes could just
be another typeform flag applied to the text of the note.
liblouisutdml can then use this typeform flag when processing the
markup. At least for liblouis, I can't see this causing backwards compatibility 
issues.

How does solving it in liblouis compare to external/addon solutions?
An external solution:
* Increases complexity. Increased complexity means it is more error
prone. For example, you have to deal with multiple levels of indexing.
There's already quite enough of that confusion in the liblouis code.
:)
* Means additional dependencies. We already have two libraries:
liblouis and liblouisutdml. An external solution means a third.
* Complicates testing. Testing this stuff in a single project is hard
enough. Testing it across several projects is somewhat more
difficult, not least from a build perspective.
* Complicates maintenance, since we have to manage yet another library.

it was external then the
restriction of only using C might not be necessary and so might be
simpler to write.
True, but it also introduces a dependency problem. For example, if we
wrote it in Python, that makes integration difficult for Java
programmers and vice versa.

All of that said, I imagine fixing this in liblouis is going to be
quite difficult, so I can certainly understand the desire to avoid doing so.
However, if we keep patching the internal problems by creating
external solutions to fix them, things will get worse in another direction.

Just my AU$0.02 worth.

Jamie
For a description of the software, to download it and links to project
pages go to http://www.abilitiessoft.com