On Sat, Jan 28, 2006, tpgww@xxxxxxxxxxx wrote:
> On Fri, 27 Jan 2006 23:57:51 +0100
> Ronny Steiner <sir.steiner@xxxxxx> wrote:
>
> > Hi!
> >
> > Sorry, I found that the translation with german specific characters is
> > not 100 percent correct.
> >
> > Most of the messages are correct but if I use a german specific
> > character in the headlines (NOT categories) of the configuration pages
> > (i.e. tab completion) the same problem appears. And here also LC_CTYPE
> > changes doesn't work.
> >
> > All the other display of messages or filenames with these characters
> > seems to be correct with the patch!
> >
> > Any ideas?

> No, it's hard to understand why one string would be treated differently from
> any other.
>

Because the stretch code in src/utils/e2_utils.c (the one with the big
FIXME comment ;-) is messed up. After fiddling with it for about half an
hour, I think I have found a solution. It is attached as a patch and should
work (tested only with German multibyte characters, which work well now).

> Both the fr and ja translations sometimes use non [a-z] characters for
> those particular configuration dialog strings, and I've not heard any
> problem reported ...

It's possible that the affected strings (the stretched ones) do not contain
multibyte characters, so this error has not shown up so far. I did not
verify that by checking the .po files, however.

Some notes for the interested:
Multibyte characters are stored in more than one byte. The current code
steps through the string one gchar (a single byte) at a time, so it split
the umlauts, each of which spans two bytes. The result was a broken UTF-8
string. The patch uses a somewhat slow UTF-8 to UCS-4 to UTF-8 conversion.
This is necessary to iterate correctly over the individual characters and
place a space character between them.
Unfortunately, glib does not have a UTF-8 iterator function that worked
well for me (the ones offered are not guaranteed to be bulletproof, and
they failed when I tried to use them), but the minor conversion overhead
should not be a problem.

Regards
Marcus
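
For illustration only, the byte-versus-character problem described in the
notes above can be reproduced in plain C without glib. This is a rough
sketch, not the method of the attached patch (which goes through UCS-4).
The names stretch_bytes, stretch_chars and utf8_seq_len are made up for
this example, and the validity check is deliberately simplified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Length in bytes of the UTF-8 sequence starting with lead byte c, or 0 if
   c cannot start a sequence. Simplified: it does not reject every overlong
   or surrogate encoding. */
static size_t utf8_seq_len (unsigned char c)
{
	if (c < 0x80) return 1;
	if (c >= 0xC2 && c <= 0xDF) return 2;
	if (c >= 0xE0 && c <= 0xEF) return 3;
	if (c >= 0xF0 && c <= 0xF4) return 4;
	return 0;
}

/* Byte-wise stretch, as in the old code: a space before the first byte and
   after every byte. This splits multibyte sequences. */
static char *stretch_bytes (const char *str)
{
	assert (str != NULL);
	size_t len = strlen (str);
	char *out = malloc (len * 2 + 2);
	size_t j = 0;
	if (out == NULL) return NULL;
	out[j++] = ' ';
	for (size_t i = 0; i < len; i++)
	{
		out[j++] = str[i];
		out[j++] = ' ';
	}
	out[j] = '\0';
	return out;
}

/* Character-wise stretch: copy each UTF-8 sequence whole and put one space
   between characters (no leading or trailing space, like the patch).
   Returns an unmodified copy if the input looks malformed. */
static char *stretch_chars (const char *str)
{
	assert (str != NULL);
	size_t len = strlen (str);
	char *out = malloc (len * 2 + 1);	/* worst case: all ASCII */
	size_t i = 0, j = 0;
	if (out == NULL) return NULL;
	while (i < len)
	{
		size_t n = utf8_seq_len ((unsigned char) str[i]);
		int ok = (n != 0 && i + n <= len);
		/* every following byte must be a continuation byte 10xxxxxx */
		for (size_t k = 1; ok && k < n; k++)
			ok = (((unsigned char) str[i + k] & 0xC0) == 0x80);
		if (!ok)
		{
			memcpy (out, str, len + 1);	/* fall back to a plain copy */
			return out;
		}
		if (j > 0)
			out[j++] = ' ';
		memcpy (out + j, str + i, n);
		j += n;
		i += n;
	}
	out[j] = '\0';
	return out;
}
```

With the input "T\xC3\xBCr" (the German word for door), stretch_bytes puts a
space between the bytes 0xC3 and 0xBC, which is no longer valid UTF-8, while
stretch_chars keeps the two bytes of the umlaut together. The caller frees
the returned string with free().
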
diff -ur emelfm2-0.1.5/src/utils/e2_utils.c emelfm2-0.1.5.patched/src/utils/e2_utils.c
--- emelfm2-0.1.5/src/utils/e2_utils.c	Fri Jan  6 23:22:56 2006
+++ emelfm2-0.1.5.patched/src/utils/e2_utils.c	Sat Jan 28 02:32:50 2006
@@ -92,18 +92,28 @@
 //FIXME for utf-8
 gchar *e2_utils_str_stretch (gchar *str)
 {
-	gint len = strlen (str);
-	gchar *streched = g_malloc ((len * 2) + 2);
-	gint j = 0;
-	streched[j++] = ' ';
-	gint i;
-	for (i = 0; i < len; i++)
-	{
-		streched[j++] = str[i];
-		streched[j++] = ' ';
-	}
-	streched[j] = '\0';
-	return streched;
+	glong i = 0;
+	glong j = 0;
+	glong len = 0;
+	gunichar *conv = g_utf8_to_ucs4_fast (str, -1, &len);
+	gunichar *stretch = g_malloc (sizeof (gunichar) * len * 2);
+	glong max = 2 * len - 1;
+
+	if (!stretch || !g_utf8_validate (str, -1, NULL))
+		return g_strdup (str);
+
+	for (i = 0; i < len; i++)
+	{
+		stretch[j] = conv[i];
+		j++;
+		if (j < max)
+			stretch[j++] = 0x00000020; // space 0x20 -> ' '
+	}
+	stretch[len * 2 - 1] = 0x00000000;
+
+	gchar *retval = g_ucs4_to_utf8 (stretch, -1, NULL, NULL, NULL);
+	g_free (stretch);
+	return retval;
 }
 /**
  * @brief convert utf8 string @a string to lower case