#13184: Infinite loop with bash/readline/ICU on non-BMP Unicode characters
-------------------------------+----------------------------
Reporter: jessicah | Owner: pulkomandy
Type: bug | Status: new
Priority: normal | Milestone: Unscheduled
Component: Kits/Locale Kit | Version: R1/Development
Resolution: | Keywords:
Blocked By: | Blocking:
Has a Patch: 1 | Platform: All
-------------------------------+----------------------------
Comment (by jessicah):
Mm, I added some more tracing to my modified version; indeed, using
`ucnv_getNextUChar()` returns the single value 150370 (0x24B62) instead,
and the source length is still 4.
I'm not sure whether using `ucnv_getNextUChar()` is the right fix here,
though. I don't know if it would behave correctly on invalid sequences, as
`mbrtowc` requires.
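
For reference, a minimal sketch of the decoding step in question, in plain
C and without ICU. `decode_utf8()` is a hypothetical helper, not Haiku or
ICU code, and its return convention (0 for incomplete input, -1 for invalid
input) only loosely mirrors the (size_t)-2 and (size_t)-1 results of
`mbrtowc`. It shows a 4-byte sequence producing the single value 150370 and
invalid input being rejected:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not Haiku/ICU code): decode one UTF-8 sequence.
 * Returns the number of bytes consumed, 0 if the input is incomplete,
 * or -1 if the sequence is invalid. */
static int decode_utf8(const unsigned char *s, size_t len, uint32_t *out)
{
    int need, i;
    uint32_t cp, min;

    if (len == 0)
        return 0;
    if (s[0] < 0x80) {
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        need = 1; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        need = 2; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        need = 3; cp = s[0] & 0x07; min = 0x10000;
    } else {
        return -1;                      /* stray continuation or bad lead */
    }
    for (i = 1; i <= need; i++) {
        if ((size_t)i >= len)
            return 0;                   /* sequence is cut short */
        if ((s[i] & 0xC0) != 0x80)
            return -1;                  /* not a continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values past U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;
    *out = cp;
    return need + 1;
}

/* Usage: the 4-byte sequence F0 A4 AD A2 decodes to one value. */
void decode_utf8_example(void)
{
    const unsigned char bytes[] = { 0xF0, 0xA4, 0xAD, 0xA2 };
    uint32_t cp = 0;

    assert(decode_utf8(bytes, sizeof(bytes), &cp) == 4);
    assert(cp == 150370);               /* 0x24B62 */
}
```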
Also, I've noticed that in `WcharToMultibyte()` we a) convert wchar_t from
UTF-32 to UTF-16, and then b) operate on the UTF-16 for the actual
conversion.
So, if I'm understanding `WcharToMultibyte()` correctly, our wchar_t is
indeed UTF-32, not UTF-16, which means our `MultibyteToWchar()` should
handle UTF-16 surrogate pairs correctly. Seeing how `WcharToMultibyte()`
is implemented, I think I may be able to provide a proper patch :-)
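
To illustrate the UTF-32 / UTF-16 relationship discussed above, here is a
rough sketch of the surrogate pair arithmetic in plain C. The function
names are hypothetical and are not taken from the Haiku sources or from
ICU; this is only the standard UTF-16 encoding math, not the proposed
patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not Haiku/ICU code) for UTF-16 surrogates. */
static int is_high_surrogate(uint16_t u) { return (u & 0xFC00) == 0xD800; }
static int is_low_surrogate(uint16_t u)  { return (u & 0xFC00) == 0xDC00; }

/* Combine a valid high/low surrogate pair into one UTF-32 value. */
static uint32_t combine_surrogates(uint16_t high, uint16_t low)
{
    return 0x10000u + (((uint32_t)(high - 0xD800) << 10)
        | (uint32_t)(low - 0xDC00));
}

/* Encode one UTF-32 value (assumed valid) as UTF-16.
 * Returns the number of 16-bit units written: 1 or 2. */
static int encode_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
    return 2;
}

/* Usage: 150370 (0x24B62) round-trips through the pair D852 DF62. */
void surrogate_example(void)
{
    uint16_t units[2] = { 0, 0 };

    assert(encode_utf16(150370, units) == 2);
    assert(is_high_surrogate(units[0]) && is_low_surrogate(units[1]));
    assert(combine_surrogates(units[0], units[1]) == 150370);
}
```

A value outside the BMP always needs two UTF-16 units but only one UTF-32
wchar_t, which is why the pair has to be recombined before it is stored.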
--
Ticket URL: <https://dev.haiku-os.org/ticket/13184#comment:8>
Haiku <https://dev.haiku-os.org>
Haiku - the operating system.