Re: German translation

  • From: Marcus von Appen <mva@xxxxxxxxxxxx>
  • To: emelfm2@xxxxxxxxxxxxx
  • Date: Sat, 28 Jan 2006 02:47:44 +0100

On Sat, Jan 28, 2006, tpgww@xxxxxxxxxxx wrote:

> On Fri, 27 Jan 2006 23:57:51 +0100
> Ronny Steiner <sir.steiner@xxxxxx> wrote:
> 
> > Hi!
> > 
> > Sorry, I found that the translation with German-specific characters
> > is not 100 percent correct.
> > 
> > Most of the messages are correct, but if I use a German-specific
> > character in the headlines (NOT categories) of the configuration
> > pages (e.g. tab completion), the same problem appears. Changing
> > LC_CTYPE does not help here either.
> > 
> > All other messages and filenames containing these characters seem
> > to be displayed correctly with the patch!
> > 
> > Any ideas?
> No, it's hard to understand why one string would be treated differently from 
> any other.
> 

Because the stretch code in src/utils/e2_utils.c (the one with the big
FIXME comment ;-) is broken for multibyte strings. After fiddling with
it for about half an hour, I think I found a solution. It is attached
as a patch and should work (tested only with German multibyte
characters, which work well now).

> Both the fr and ja translations sometimes use non [a-z] characters for
> those particular configuration dialog strings, and I've not heard any
> problem reported ...

It's possible that the affected strings (the stretched ones) do not
contain multibyte characters, so the error has not shown up so far. I
did not verify that by checking the .po files, however.

Some notes for the interested: Multibyte characters are stored in more
than one byte. The old code advanced a gchar* pointer one byte at a
time and put a space after every byte, so it split the umlauts, which
span two bytes each. The result was a broken utf-8 string.
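
To make the failure concrete, here is a small standalone sketch in
plain C (no glib). It assumes utf-8 input; the names stretch_bytewise,
is_utf8_continuation and has_split_sequence are made up for this
illustration and are not part of emelFM2. It stretches a string the
way the old loop did and checks whether a multibyte lead byte got
separated from its continuation byte:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if c is a utf-8 continuation byte (10xxxxxx). */
static int is_utf8_continuation (unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* Byte-wise stretch in the style of the old code: a space before the
 * first byte and after every byte.
 * out must hold at least 2 * strlen (str) + 2 bytes. */
static void stretch_bytewise (const char *str, char *out)
{
    size_t len, i, j = 0;

    assert (str != NULL && out != NULL);
    len = strlen (str);
    out[j++] = ' ';
    for (i = 0; i < len; i++)
    {
        out[j++] = str[i];
        out[j++] = ' ';
    }
    out[j] = '\0';
}

/* Returns 1 if some multibyte lead byte (>= 0xC0) is not directly
 * followed by a continuation byte, i.e. a sequence was torn apart. */
static int has_split_sequence (const char *s)
{
    size_t i;

    for (i = 0; s[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) s[i];
        if (c >= 0xC0 && !is_utf8_continuation ((unsigned char) s[i + 1]))
            return 1;
    }
    return 0;
}
```

For the two bytes 0xC3 0xBC (the utf-8 encoding of "ü"),
stretch_bytewise() puts a space between them, and has_split_sequence()
reports the torn sequence on the result but not on the input.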

The patch uses a somewhat slow utf-8 to ucs-4 to utf-8 conversion. This
is necessary to iterate correctly over the individual characters and
place a space character between them. Unfortunately, glib does not seem
to have a utf-8 iterator function that works well (the ones offered are
not guaranteed to be bulletproof and failed when I tried to use them),
but that minor conversion overhead should not be a problem.
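
The round trip can be sketched without glib, as a rough standalone
illustration only. This is not the code from the patch: decode_one,
encode_one and stretch_utf8 are hypothetical names standing in for
what g_utf8_to_ucs4_fast() and g_ucs4_to_utf8() do, and the sketch
assumes the input is already valid utf-8 (it does no validation):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Decode one code point from valid utf-8; returns bytes consumed (1..4). */
static size_t decode_one (const unsigned char *s, uint32_t *cp)
{
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    if ((s[0] & 0xE0) == 0xC0)
    {
        *cp = ((uint32_t) (s[0] & 0x1F) << 6) | (uint32_t) (s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0)
    {
        *cp = ((uint32_t) (s[0] & 0x0F) << 12)
            | ((uint32_t) (s[1] & 0x3F) << 6) | (uint32_t) (s[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t) (s[0] & 0x07) << 18) | ((uint32_t) (s[1] & 0x3F) << 12)
        | ((uint32_t) (s[2] & 0x3F) << 6) | (uint32_t) (s[3] & 0x3F);
    return 4;
}

/* Encode one code point as utf-8; returns bytes written (1..4). */
static size_t encode_one (uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) { out[0] = (unsigned char) cp; return 1; }
    if (cp < 0x800)
    {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000)
    {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char) (0xF0 | (cp >> 18));
    out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}

/* Returns a newly allocated copy of str with one space between code
 * points, or NULL if allocation fails. The caller frees the result. */
static char *stretch_utf8 (const char *str)
{
    const unsigned char *in = (const unsigned char *) str;
    size_t nbytes, ncp = 0, i, j = 0;
    uint32_t *cps;
    unsigned char *out;

    assert (str != NULL);
    nbytes = strlen (str);

    /* step 1: utf-8 -> array of code points (at most one per byte) */
    cps = malloc ((nbytes + 1) * sizeof *cps);
    if (cps == NULL)
        return NULL;
    for (i = 0; i < nbytes; )
        i += decode_one (in + i, &cps[ncp++]);

    /* step 2: re-encode, adding a space after every code point but the
     * last; nbytes + (ncp - 1) spaces + terminator fits in this buffer */
    out = malloc (nbytes + ncp + 1);
    if (out == NULL)
    {
        free (cps);
        return NULL;
    }
    for (i = 0; i < ncp; i++)
    {
        j += encode_one (cps[i], out + j);
        if (i + 1 < ncp)
            out[j++] = ' ';
    }
    out[j] = '\0';
    free (cps);
    return (char *) out;
}
```

Because the spaces are inserted between code points rather than
between bytes, the two bytes of an umlaut stay together.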

Regards
Marcus
diff -ur emelfm2-0.1.5/src/utils/e2_utils.c emelfm2-0.1.5.patched/src/utils/e2_utils.c
--- emelfm2-0.1.5/src/utils/e2_utils.c  Fri Jan  6 23:22:56 2006
+++ emelfm2-0.1.5.patched/src/utils/e2_utils.c  Sat Jan 28 02:32:50 2006
@@ -92,18 +92,30 @@
 //FIXME for utf-8
 gchar *e2_utils_str_stretch (gchar *str)
 {
-       gint len = strlen (str);
-       gchar *streched = g_malloc ((len * 2) + 2);
-       gint j = 0;
-       streched[j++] = ' ';
-       gint i;
-       for (i = 0; i < len; i++)
-       {
-               streched[j++] = str[i];
-               streched[j++] = ' ';
-       }
-       streched[j] = '\0';
-       return streched;
+    glong i = 0;
+    glong j = 0;
+    glong len = 0;
+
+    // g_utf8_to_ucs4_fast() assumes valid utf-8, so check first
+    if (!g_utf8_validate (str, -1, NULL))
+        return g_strdup (str);
+
+    gunichar *conv = g_utf8_to_ucs4_fast (str, -1, &len);
+    // room for len characters, len - 1 spaces and the terminator
+    gunichar *stretch = g_malloc (sizeof (gunichar) * (len * 2 + 1));
+
+    for (i = 0; i < len; i++)
+    {
+        stretch[j++] = conv[i];
+        if (i < len - 1)
+            stretch[j++] = 0x00000020; // space 0x20 -> ' '
+    }
+    stretch[j] = 0x00000000;
+
+    gchar *retval = g_ucs4_to_utf8 (stretch, -1, NULL, NULL, NULL);
+    g_free (conv);
+    g_free (stretch);
+    return retval;
 }
 /**
  * @brief convert utf8 string @a string to lower case
