[haiku-development] Terminal, East Asian full-width characters and BUnicodeChar vs ICU::UChar32

  • From: Siarzhuk Zharski <zharik@xxxxxx>
  • To: <haiku-development@xxxxxxxxxxxxx>
  • Date: Sat, 30 Mar 2013 10:43:13 +0100

Dear Colleagues,

To fix some issues in Terminal (#6227, #6717) and to display East Asian characters correctly, better detection of full-width characters is required. The original MuTerm had a homebrew implementation based on lookups in its own properties table, but that support was removed many years ago, in the early phase of Haiku Terminal development. There are also about three similar-looking implementations of wcwidth(wchar_t): in gdb/readline, libroot and coreutils.

We also have the BUnicodeChar class, which looks like a lightweight replica of UChar32 from ICU. Its implementation does not seem to be finished at the moment: it is functional only for codepoints less than 0x9F and would have to be extended to the whole range to be usable for correct character width estimation. By the way, what was the intention behind inventing BUnicodeChar? Are there any problems with using ICU's UChar32 directly? Do you have any advice on how to implement character properties for the whole range of codepoints in BUnicodeChar?

On the other hand, I find that a homebrew version of IsFullWidth() in Terminal is not such a bad idea: at least currently, we only need to know whether a character takes two cells or one. So we need not waste CPU resources on detecting zero-width characters, for example. It is not universal, but as an intermediate step it will fix East Asian support in Terminal and move it forward in this direction.
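
To illustrate the kind of check I have in mind, here is a rough sketch only. The table name kWideRanges, the CodepointRange struct and the function signature are my assumptions, and the ranges are an abbreviated approximation loosely following those used in common wcwidth() implementations; this is not the original MuTerm table and not a complete East Asian Width data set.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: an inclusive range of Unicode codepoints.
struct CodepointRange {
	uint32_t first;
	uint32_t last;
};

// Assumed, abbreviated table of double-width ranges. It must stay sorted
// and non-overlapping, because the lookup below is a binary search.
static const CodepointRange kWideRanges[] = {
	{ 0x1100, 0x115F },		// Hangul Jamo initial consonants
	{ 0x2E80, 0x303E },		// CJK radicals, CJK symbols and punctuation
	{ 0x3040, 0xA4CF },		// Hiragana ... Yi
	{ 0xAC00, 0xD7A3 },		// Hangul syllables
	{ 0xF900, 0xFAFF },		// CJK compatibility ideographs
	{ 0xFE30, 0xFE6F },		// CJK compatibility forms
	{ 0xFF00, 0xFF60 },		// Fullwidth forms
	{ 0xFFE0, 0xFFE6 },		// Fullwidth signs
	{ 0x20000, 0x2FFFD },	// Supplementary ideographic plane
	{ 0x30000, 0x3FFFD },	// Tertiary ideographic plane
};

// Returns true if the codepoint should occupy two terminal cells.
// Zero-width characters are deliberately not handled here.
bool
IsFullWidth(uint32_t codepoint)
{
	// Fast path: everything below the first range is single-width.
	if (codepoint < kWideRanges[0].first)
		return false;

	size_t low = 0;
	size_t high = sizeof(kWideRanges) / sizeof(kWideRanges[0]);
	while (low < high) {
		size_t mid = low + (high - low) / 2;
		if (codepoint > kWideRanges[mid].last)
			low = mid + 1;
		else if (codepoint < kWideRanges[mid].first)
			high = mid;
		else
			return true;
	}
	return false;
}
```

With such a function, Terminal would advance the cursor by two cells when IsFullWidth() returns true and by one cell otherwise. The early return keeps the common ASCII and Latin case down to a single comparison.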

Thank you for your attention.

--
Kind Regards,
   S.Zharski
