On Mon, May 11, 2009 at 4:03 PM, Oliver Tappe <zooey@xxxxxxxxxxxxxxx> wrote:
>
> Adrien/PulkoMandy has suggested to base parts of his work on the locale kit
> on ICU (International Components for Unicode, see icu-project.org).

If it helps the discussion, we will also need ICU for WebKit (though it could probably be factored out if we really did not want ICU on Haiku). But in my mind it makes sense to use all that code (for both WebKit and Haiku i18n), which is not trivial to create properly.

> We (Adrien and me) have already talked a bit about the sheer size of the
> ICU-source (58 MB unpacked) to be a problem. The resulting libraries (that go
> into the haiku image) would be something around 15-17 MB. The majority of
> that is taken up by the locale data (collation information, character
> properties, charset conversion mappings, ...).

The source size could be a problem (more about that below), but I don't think 15-17 MB for the libraries is that bad, considering all that it contains. I know we want Haiku to be lean and mean, but with other modern operating systems having MULTI-GIGABYTE installations, I don't think we should get too concerned about 17 MB.

> Before importing such a beast into our repo, I thought it would be a good
> idea to discuss where we'd like to go with ICU.

I am strongly against importing big external libraries into the Haiku source tree. There are plenty of tools now for linking external libraries into an existing source tree. Some options for SVN are svn:externals or Piston (http://piston.rubyforge.org/). We are surely not the only project that needs to use a big library like ICU as a core component but doesn't want to put that code in its own source control.

Let's see what the options are. Once we figure out something nice, I would suggest maybe setting up some other components like this (Mesa for one, probably several other things).
The only things I would not move out are those that are deeply embedded or heavily modified (AGG and libc come to mind).

> I suppose if we decide to import ICU into our repo, it would make sense to use
> it for required or existing services/APIs, i.e. to implement the POSIX locale
> stuff by means of ICU, to replace the current use of libiconv with ICU's
> respective charset conversion services, to make use of ICUs regexx engine as
> a basic service, ...

Like I said, I don't think we need to or should import ICU into the Haiku source tree. But I still think we could use it for all of the above. I certainly would like a built-in regex engine (as long as we can build a nice, friendly Haiku API wrapper for it).

> There are many more features of ICU that could be used by haiku in the
> future, for instance the text/char iterator classes that could be used by
> BTextView to do proper wordwise navigation and word wrapping.
> There's even (font-engine agnostic) textlayout-engine for bringing the more
> complicated scripts on screen.

Yes, all of these could definitely be good things to use ICU for.

> I guess what I want to do with this mail is to get a discussion started about
> if using ICU makes sense at all and, if so, which parts are *required* for
> the locale kit and thus should be targeted first?

If it helps, I was able to compile ICU 2.6 for WebKit back in 2007. I believe others have done so since, so there may not be much "porting" required. But obviously, wrapping all that functionality in nice Haiku kits is the challenge.

By the way, some tricks may be required to cross-compile ICU, though that may have been fixed since I had to compile it. I don't know if Adrien is doing this work from within Haiku or from a cross-compile setup...

Regards,
Ryan