[openbeos] Re: StyledEdit and character set encodings

Hi Guy,

G> there is already a fairly good Hebrew extension on bebits, even a
G> Hebrew OpenTracker, but still, the filenames are reversed as
G> usual. Gobe and text editors cope with right-to-left text from the
G> Hebrew plug, but the delimiters (,./{}[];:`') still come out on
G> the wrong side of the line.

This sounds a lot like the input method is only reversing the logical
character order in the character string, a Bad Thing (TM).

There is a distinction between logical order and display order. From a
logical point of view, text is not directional at all - there is a
beginning and an end, and that's it. It is just a question of what
characters are displayed left, right and on top of each other when
displaying it.

Any good rendering engine capable of supporting bidirectional text
needs to be able to produce a glyph string in display order from a
character string in logical order. It does this by converting
characters to glyphs from rules in the font, then reordering these
according to script-specific rules, largely set down in the Unicode
standard (http://www.unicode.org/unicode/standard/standard.html).
Rules for reordering characters are script-dependent; you have to
order Thai characters differently from Hebrew or Arabic ones; these
reordering rules can become quite complex if you want to render
scripts like Devanagari (in which Hindi is written) or mixed Japanese
RTL and Latin LTR script. (RTL stands for "right-to-left".) However,
as soon as the general intelligence of reordering and repositioning
characters is in place in the renderer, script support becomes very
easy to add.
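
To make the logical-versus-display distinction concrete, here is a
minimal sketch in Python. It is NOT the Unicode bidirectional
algorithm, only a toy subset of it: it assumes a left-to-right base
direction, treats every character that is not strongly directional as
a neutral (so digits inside Hebrew text would come out wrong), and
ignores glyph shaping entirely. The function name display_order is my
own invention for illustration; only unicodedata.bidirectional() is a
real library call.

```python
import unicodedata


def display_order(logical: str) -> str:
    """Toy logical-to-display reordering; assumes an LTR base direction."""
    # Step 1: classify each character as strong RTL, strong LTR or neutral.
    dirs = []
    for ch in logical:
        cls = unicodedata.bidirectional(ch)
        if cls in ("R", "AL"):      # Hebrew and Arabic letters
            dirs.append("R")
        elif cls == "L":            # Latin and other strong LTR letters
            dirs.append("L")
        else:                       # spaces, punctuation, digits (simplified)
            dirs.append(None)

    # Step 2: a run of neutrals becomes RTL only if it sits between two
    # RTL characters; otherwise it takes the (assumed LTR) base direction.
    n = len(dirs)
    i = 0
    while i < n:
        if dirs[i] is None:
            j = i
            while j < n and dirs[j] is None:
                j += 1
            before = dirs[i - 1] if i > 0 else "L"
            after = dirs[j] if j < n else "L"
            fill = "R" if before == "R" and after == "R" else "L"
            for k in range(i, j):
                dirs[k] = fill
            i = j
        else:
            i += 1

    # Step 3: emit the runs left to right, reversing each RTL run.
    out = []
    i = 0
    while i < n:
        j = i
        while j < n and dirs[j] == dirs[i]:
            j += 1
        run = logical[i:j]
        out.append(run[::-1] if dirs[i] == "R" else run)
        i = j
    return "".join(out)


# Logical order: "file ", two Hebrew words (shalom, olam), then ".txt".
logical_name = "file \u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd.txt"
visual_name = display_order(logical_name)
```

The point of the sketch is that logical_name itself is never modified:
the stored string stays in logical order, and only the derived string
handed to the screen is reordered. Note that the "." before "txt"
stays on the Latin side because it sits between an RTL and an LTR
character, which is exactly the kind of delimiter placement a plain
reversal gets wrong.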

It is quite obvious that changing the logical order of text by
reversing the logical character string character by character is a bad
idea, partly because it's not very clean, partly because it breaks
other operations that depend on character order. For example,
searching and, especially, sorting become difficult.
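
A small Python sketch of what goes wrong, under the assumption that
the hack simply reverses the whole stored string (hack_reverse is a
hypothetical stand-in, I have not looked at what the actual Hebrew
add-on does):

```python
def hack_reverse(name: str) -> str:
    """Hypothetical stand-in for the hack: store text in reversed order."""
    return name[::-1]


# "shalom.txt" in logical order: shin, lamed, vav, final mem, ".txt"
logical = "\u05e9\u05dc\u05d5\u05dd.txt"
stored = hack_reverse(logical)

# Searching: the user types the first three letters in reading order.
query = "\u05e9\u05dc\u05d5"
found_in_logical = query in logical   # substring search on logical text
found_in_stored = query in stored     # same search on the reversed text

# Even a plain suffix check on the file extension stops working.
ext_ok_logical = logical.endswith(".txt")
ext_ok_stored = stored.endswith(".txt")

# Sorting: alef-bet, alef-gimel, bet-alef, compared by code point.
names = ["\u05d0\u05d1", "\u05d0\u05d2", "\u05d1\u05d0"]
sorted_logical = sorted(names)
# Sort the reversed strings, then undo the reversal to compare orders.
sorted_via_hack = [s[::-1] for s in sorted(hack_reverse(n) for n in names)]
order_preserved = sorted_logical == sorted_via_hack
```

With logical storage the search and the extension check succeed and
the names sort by their first letter; with reversed storage the search
and the extension check fail, and the sort compares the last letters
first, so the resulting order differs.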

BeOS never had a good text rendering engine. IMHO this is an R2 issue,
but it should actually *be* in R2. I've given some thought to
designing one, loosely on the basis of some ideas put forth in the
OpenType intelligent font specification (available, among other
sources, from http://www.microsoft.com/typography/otspec/default.htm)
and some examples from the Pango documentation (http://www.pango.org),
but I'm loaded with other work at present and don't have the time to
work on the design full time. I know a bit about multilingual
typesetting, however, and if anybody else is about to give this some
thought, I could assist.

G> 1. (opentracker) finding a way to reverse only hebrew filenames (or
G> hebrew filenames on windows partitions) (i've spoken to bga about
G> this and he claims it's impossible..)

It is not. Reversing filenames is not necessary, and doing it in
Tracker is actually a bad idea. The operating system's rendering
engine should display mixed-script strings correctly by itself. Then
Tracker just needs to pass the filename to the operating system's font
renderer, and the reordering will be done automatically. Once such a
renderer is in place, filenames are rendered correctly along with all
other text.

G> 2. (obos) writing a good input-method writer's guide (newsletter
G> article? please??), and helping someone write a good input method
G> for hebrew, so working with r5 interface kit but still gaining
G> functionality.

For this, it would be interesting to see how the Hebrew input method
manages to input RTL text in a solely LTR operating system like BeOS.
I've got a suspicion that it's a very, very bad hack.

Cheers -
  Philipp Reichmuth                            
mailto:mailinglistenprozessor@xxxxxxx

-- 
"The one I am greets the one I should be." - Augustinum

