[dokuwiki] New PR: Add proper Korean romanization

From: "alexdraconian" <wiki@xxxxxxxxxxxx>
To: dokuwiki@xxxxxxxxxxxxx
Date: Thu, 8 Feb 2024 10:54:39 +0100 (CET)

Hi,

alexdraconian opened a new pull request at
https://github.com/dokuwiki/dokuwiki/pull/4194:

This replaces #4182, because that PR was compromised when anti-ransomware
software mass-deleted and re-added files in my environment. Here are some
important points.

-----

Currently, Korean romanization does not work properly, because `romanize()` in
`Clean.php` does not handle full (precomposed) characters. It only works when
each component is typed individually, which is virtually useless.

So I added a function that decomposes Korean characters and romanizes them
accordingly.
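
As an illustration only (a Python sketch, not the PHP code from the PR), the
decomposition step can be done with the arithmetic of the Unicode Hangul
Syllables block. The name `decompose` and the constant names are hypothetical.

```python
# Constants from the layout of the Unicode Hangul Syllables block.
S_BASE = 0xAC00   # code point of the first precomposed syllable
V_COUNT = 21      # number of medial vowels
T_COUNT = 28      # number of finals, counting "no final" as index 0
S_COUNT = 19 * V_COUNT * T_COUNT  # 19 initials, so 11,172 syllables in total


def decompose(ch):
    """Return (initial, medial, final) indices for one character.

    Returns None when ch is not a precomposed Hangul syllable.
    """
    index = ord(ch) - S_BASE
    if not 0 <= index < S_COUNT:
        return None
    initial = index // (V_COUNT * T_COUNT)
    medial = (index % (V_COUNT * T_COUNT)) // T_COUNT
    final = index % T_COUNT  # 0 means the syllable has no final consonant
    return initial, medial, final
```

Each index can then be looked up in a small per-position table, so only
19 + 21 + 28 entries are needed instead of one entry per syllable.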

However, here's the catch.

- This code may have a performance impact, since it uses a loop instead of
`strtr`.
- However, if I added the individual full characters to the table instead,
11,172 characters would have to be added, which is quite a lot.
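
For reference, the figure of 11,172 follows from the structure of the Unicode
Hangul Syllables block (19 initials, 21 medials, 28 finals including "none").
A quick check in Python, with hypothetical variable names:

```python
# Number of jamo choices per position in a precomposed syllable.
INITIAL_COUNT, MEDIAL_COUNT, FINAL_COUNT = 19, 21, 28

syllable_count = INITIAL_COUNT * MEDIAL_COUNT * FINAL_COUNT

# The block runs from U+AC00 to U+D7A3, inclusive.
FIRST, LAST = 0xAC00, 0xD7A3
block_size = LAST - FIRST + 1
```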

I thought this PR needed some discussion or review, so I opened it as a draft.
(The code works, though.)

-----

I implemented a dedicated Korean test, using the most frequently used words
provided by the National Institute of Korean Language.

This implementation does spelling-based romanization, which is not the
official romanization of Korean (the official one is pronunciation-based).
However, pronunciation-based romanization is much more complicated, and I
think this relatively simple implementation works fine for the purpose.
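
To make the difference concrete, here is a minimal Python sketch of
spelling-based romanization. It is an illustration under assumptions, not the
PR's PHP code: the function name `romanize_spelling` is hypothetical, and the
letter tables follow Revised Romanization letter values with finals
transliterated letter by letter, which may differ from the tables in the PR.

```python
# Illustrative tables (assumed, not taken from the PR), in Unicode jamo order.
INITIALS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s", "ss", "",
            "j", "jj", "ch", "k", "t", "p", "h"]
MEDIALS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
           "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"]
FINALS = ["", "g", "kk", "gs", "n", "nj", "nh", "d", "l", "lg", "lm", "lb",
          "ls", "lt", "lp", "lh", "m", "b", "bs", "s", "ss", "ng", "j", "ch",
          "k", "t", "p", "h"]

S_BASE = 0xAC00                                  # first precomposed syllable
S_COUNT = len(INITIALS) * len(MEDIALS) * len(FINALS)


def romanize_spelling(text):
    """Romanize each Hangul syllable by its spelling; pass other characters through."""
    out = []
    for ch in text:
        index = ord(ch) - S_BASE
        if 0 <= index < S_COUNT:
            initial, rest = divmod(index, len(MEDIALS) * len(FINALS))
            medial, final = divmod(rest, len(FINALS))
            out.append(INITIALS[initial] + MEDIALS[medial] + FINALS[final])
        else:
            out.append(ch)  # not a precomposed syllable, keep as-is
    return "".join(out)


print(romanize_spelling("독립"))  # dogrib
```

Each syllable is converted in isolation, so 독립 becomes "dogrib". The
official pronunciation-based form is "dongnip", because it also has to apply
sound-change rules across syllable boundaries.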

As far as I know, another project (OpenProject) also uses this kind of
romanization, but I can't confirm that for sure.

Please help us to review this pull request, so new contributors get feedback in
a timely manner.

12596150-c668-11ee-8db0-111b8dcab177

--
DokuWiki mailing list - more info at
http://www.dokuwiki.org/mailinglist
