[haiku-gsoc] Do we ICU or do we not ICU?

  • From: Oliver Tappe <zooey@xxxxxxxxxxxxxxx>
  • To: haiku-gsoc@xxxxxxxxxxxxx
  • Date: Mon, 11 May 2009 22:03:46 +0200

Hi there,

Adrien/PulkoMandy has suggested basing parts of his work on the locale kit 
on ICU (International Components for Unicode, see icu-project.org).

Adrien and I have already talked a bit about the sheer size of the ICU 
source (58 MB unpacked) being a problem. The resulting libraries (which go 
into the Haiku image) would be somewhere around 15-17 MB. The majority of 
that is taken up by the locale data (collation information, character 
properties, charset conversion mappings, ...).

Before importing such a beast into our repo, I thought it would be a good 
idea to discuss where we'd like to go with ICU. As far as I can remember, 
Adrien wants to use ICU for collation stuff and for the number & date 
formatting & parsing (Adrien: please expand on this, if you can). 
I know that we had already implemented some of that functionality in the 
locale kit, but I can't remember how much was there and what is still missing 
(Axel & Ingo: can you shed some light on this?).

I suppose if we decide to import ICU into our repo, it would make sense to use
it for required or existing services/APIs, i.e. to implement the POSIX locale 
stuff by means of ICU, to replace the current use of libiconv with ICU's 
respective charset conversion services, to make use of ICU's regex engine as 
a basic service, ...

There are many more features of ICU that could be used by Haiku in the 
future, for instance the text/char iterator classes that could be used by 
BTextView to do proper wordwise navigation and word wrapping.
There's even a (font-engine agnostic) text layout engine for bringing the 
more complicated scripts on screen.

I guess what I want to do with this mail is to get a discussion started about 
whether using ICU makes sense at all and, if so, which parts are *required* 
for the locale kit and thus should be targeted first.

Adrien, it would be very helpful if you could say something about your own 
ideas and correct/expand the stuff I wrote above.

cheers,
    Oliver
