[liblouis-liblouisxml] Re: lou_translate: unicode output?

  • From: Bert Frees <bertfrees@xxxxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Sat, 17 Jan 2015 21:49:59 +0100

Thanks for your thoughts, Michael,

I thought about that too but it didn't worry me that much. I agree we should be
careful with creating two ways of doing one thing though.

If we include the fallback code it would be a one-liner with very limited
functionality and the goal should be to keep it like that. Not really any
maintenance issues there. For me the iconv + fallback idea is appealing because
it sounds like a good compromise between 100% iconv (all or nothing situation if
the iconv dependency is optional) and 100% reinventing the wheel.

I don't have a strong opinion about including the fallback code or not. If it's
acceptable to have no unicode option if you don't want to or can't install
iconv, then let's do it without the fallback code. (By the way I think iconv can
just as well be used under Windows.)

But what we should definitely avoid IMO is to reimplement too much stuff that
somebody else already did in iconv. Who volunteers to write that code and
maintain it? Also we don't want to end up in the situation we have now with our
libhyphen fork.

Bert
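
For reference, the simple fallback encoder discussed in this thread can be
sketched in C roughly as follows. This is only a minimal sketch: it assumes
output is limited to the braille patterns block (U+2800..U+28FF), as Simon
suggests further down, and the function names are hypothetical; they are not
part of liblouis or of either patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a liblouis function): encode one braille
 * pattern as UTF-8. Every code point in U+2800..U+28FF needs exactly
 * three bytes. Returns the number of bytes written, or 0 if cp is not
 * a braille pattern (the "error out" case). */
static size_t braille_to_utf8(uint32_t cp, unsigned char out[3])
{
    if (cp < 0x2800 || cp > 0x28FF)
        return 0;
    out[0] = (unsigned char)(0xE0 | (cp >> 12));         /* 1110xxxx */
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F)); /* 10xxxxxx */
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));        /* 10xxxxxx */
    return 3;
}

/* Hypothetical helper: encode one braille pattern as UTF-16LE. The
 * block lies in the Basic Multilingual Plane, so a single 16-bit code
 * unit is enough and no surrogate pairs are needed. Returns the number
 * of bytes written, or 0 if cp is not a braille pattern. */
static size_t braille_to_utf16le(uint32_t cp, unsigned char out[2])
{
    if (cp < 0x2800 || cp > 0x28FF)
        return 0;
    out[0] = (unsigned char)(cp & 0xFF); /* low byte first */
    out[1] = (unsigned char)(cp >> 8);   /* high byte */
    return 2;
}
```

A caller would write the returned number of bytes to stdout for each output
cell and treat a return value of 0 as an error. Any other encoding, such as
the one given by the user's locale, would still require iconv.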


Michael Whapples writes:

> I do see disadvantages.
>
> If it is optional with a fallback when not taking the option, we have 
> two ways of doing one thing. Maybe one option offers more, but 
> nevertheless some functionality has multiple implementations.
>
> This sounds like more stuff to maintain. Do we have a maintainer for all 
> this? Who will maintain the fallback code? Will everyone need to respect 
> the fallback code, or is it fair game to break that code because you don't 
> use it and so have no responsibility to ensure it works, leaving the 
> maintainer of the code to handle the breakage (if a maintainer even exists)?
>
> My fear is that the fallback could easily become an unmaintained piece 
> and soon become the pile of unmaintained junk like the MSVC build script 
> which is only there to cause one to get one's hopes up and end in 
> dissatisfaction when it does not work.
>
> Just my thoughts, take what you will from them, maybe those things are 
> not a concern to the project as a whole.
>
> Michael Whapples
> On 17/01/2015 11:07, Bert Frees wrote:
>> It would be an optional dependency. So for Windows users nothing will change
>> except they can get utf16 encoded unicode braille on the command line. However
>> by using iconv on Linux we get encoding in the user's LOCALE for free.
>>
>> A lot of programs have optional dependencies to enable optional features, I
>> don't see any disadvantages.
>>
>>
>>
>>
>> John J. Boyer writes:
>>
>>> I don't think that using a dependency that is available only on *nix is
>>> a good idea. Probably liblouis is most used with Windows, to say nothing
>>> of OSX. I would like to see liblouis with no dependencies.
>>>
>>> John
>>>
>>> On Sat, Jan 17, 2015 at 11:00:04AM +0100, Bert Frees wrote:
>>>> Oh. Right. Yes it's only UTF-8. Using iconv and falling back on a very simple
>>>> encoder sounds very reasonable. Encoding unicode braille as UTF-8 or UTF-16 are
>>>> just one-liners. And we'll add some automake checks for iconv.
>>>>
>>>> What branch are you working on?
>>>>
>>>>
>>>> Simon Aittamaa writes:
>>>>
>>>>> Hi Bert,
>>>>>
>>>>> That patch is only for UTF-8 output, unless I missed something? My patch is
>>>>> based on iconv, which I would say is preferable on *nix systems...
>>>>>
>>>>> As I suggested earlier, we could introduce a new function,
>>>>> showStringUnicode() which in turn could use iconv (if available), and
>>>>> fall back on a hand-crafted UTF-8 encoder on *nix and a hand-crafted encoder
>>>>> for UTF-16 on Windows (or simply use the Windows API).
>>>>>
>>>>> Since we are _probably_ only going to output a limited subset of Unicode,
>>>>> i.e. U+2800..U+28FF, we don't have to spend that much time on handling
>>>>> edge cases; we can simply error out on anything that isn't a
>>>>> braille-pattern, or?
>>>>>
>>>>> Best,
>>>>> Simon
>>>>>
>>>>> On 15 January 2015 at 19:28, Bert Frees <bertfrees@xxxxxxxxx> wrote:
>>>>>
>>>>>> Hi Simon,
>>>>>>
>>>>>> I thought I had added a unicode option to lou_translate, but it was lou_trace
>>>>>> and it's on a branch:
>>>>>>
>>>>>> https://github.com/liblouis/liblouis/commit/0b162ae430317713db5594d342f0a598813e79a3.
>>>>>> Your patch is probably similar? Maybe we could combine our stuff, make sure
>>>>>> it's done consistently etc. So if you want you can add changes to my branch,
>>>>>> otherwise I'll assume it's okay and I just add the option to lou_translate.
>>>>>>
>>>>>> Thanks,
>>>>>> Bert
>>>>>>
>>>>>> Simon Aittamaa writes:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Is there any way to output unicode from lou_translate? For example:
>>>>>>>
>>>>>>> $ echo "Α" | lou_translate -uf unicode.dis,en-ueb-g1.ctb
>>>>>>> ⠰⠠⠨⠁
>>>>>>>
>>>>>>> I've currently patched lou_translate for this, but I was wondering if there
>>>>>>> is a "standard" way of doing this?
>>>>>>>
>>>>>>> Best,
>>>>>>> Simon
>>>> For a description of the software, to download it and links to
>>>> project pages go to http://www.abilitiessoft.com
