[openbeos] Re: StyledEdit news of various interest

On Tue, 15 Jul 2003 23:01:54 -0700 PDT "Andrew Bachmann"
<shatty@xxxxxxxxxxxxx> wrote:
> Since I saw the bugs go out again for StyledEdit I thought I would
> look at some of them.  In particular I figured it was time to check
> on the status of encodings support.  I'm pleased to announce that the
> OBOS StyledEdit now has support for loading from and saving to any of
> the encodings supported by libtextencodings.so.  (This is a number
> more encodings than R5 StyledEdit supports.)  Some may remember the
> discussion a while back over the way to do this and I chose to
> implement it via convert_to/from_utf8, which seemed to be the general
> consensus.
> 
> As part of my effort to move towards better support for encodings in
> general I tried to create an abstraction to manage encodings.  Two
> tasks that I wanted this abstraction to perform in particular were:
> enumerating the encodings supported, and supplying human readable
> names for them.  In addition to these tasks, I wanted to be able to
> get the "font id" (suitable for BFont::SetEncoding) for an encoding,
> and the "conversion id" (suitable for convert_to/from_utf8) for an
> encoding.  I also wanted to step a bit beyond the arbitrary nature of
> the encodings collection in R5.  To do this I consulted the IANA,
> which manages a list of encodings and various properties of them.
> (See http://www.iana.org/assignments/character-sets )
> 
> The end result is in two files which are currently located in the
> stylededit directory.  These files are CharacterSet.h and
> CharacterSet.cpp.  There are two classes defined here.  One is a
> CharacterSetRoster, which supports enumerating the supported
> character sets, and finding a character set by various search
> criteria.  (Some "find" methods are constant time, others are
> linear.)  The other class, CharacterSet, represents an individual
> character set.  The fields in the CharacterSet class are taken from
> the IANA document listed above.  Some nice things that this added to
> my initial list of requirements: the MIME name for a character set,
> if it exists, the canonical IANA name for a character set, and a set
> of known aliases for a character set.  It also added something called
> a MIB enum that I am honestly not too sure about but perhaps would be
> useful for hard-core publishing type apps or somesuch?  (I put it in
> for completeness.)
> 
> OBOS StyledEdit now stands as an example of how to use this
> functionality.  I'm hoping that something like this could be made a
> standard through beunited.  However I don't really hope that this
> particular interface/implementation makes it. :-)

Well, if the naming and some details were adjusted to be more
BeOS-like, I think it could go into our libbe as a private API for the
time being. :-)

> Although it works, it may be preferable to support a different
> enumeration interface than the method I chose at the time.

The typical roster interface for enumeration would be more like:

  status_t GetNextCharacterSet(BCharacterSet *charSet);
  void RewindCharacterSets();

If the roster really keeps all of them in memory all the time, the 
interface would be similar to yours:

  int32 CountCharacterSets() const;
  const BCharacterSet *CharacterSetAt(int32 index) const;
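
To make the difference between the two styles concrete, here is a minimal sketch in plain C++. It does not use the real OBOS classes: CharacterSet and CharacterSetRoster below are simplified stand-ins, the roster is assumed to hold everything in a std::vector, and a bool stands in for the status_t a real roster would return.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a character set; the real class has more fields.
struct CharacterSet {
    std::string name;
};

// Hypothetical roster offering both enumeration styles over one in-memory list.
class CharacterSetRoster {
public:
    explicit CharacterSetRoster(std::vector<CharacterSet> sets)
        : fSets(std::move(sets)), fNext(0) {}

    // Iterator style: copies the next entry into *charSet.
    // Returns false once the list is exhausted (a BeOS-style API would
    // return an error status_t here instead).
    bool GetNextCharacterSet(CharacterSet* charSet) {
        if (charSet == nullptr || fNext >= fSets.size())
            return false;
        *charSet = fSets[fNext++];
        return true;
    }

    // Restarts the iterator-style enumeration from the first entry.
    void RewindCharacterSets() { fNext = 0; }

    // Index style: only sensible if all entries are kept in memory.
    int CountCharacterSets() const { return static_cast<int>(fSets.size()); }

    const CharacterSet* CharacterSetAt(int index) const {
        if (index < 0 || index >= CountCharacterSets())
            return nullptr;
        return &fSets[static_cast<std::size_t>(index)];
    }

private:
    std::vector<CharacterSet> fSets;
    std::size_t fNext;  // cursor used by GetNextCharacterSet()
};
```

The iterator style lets the roster fetch entries lazily (e.g. from the registrar), while the index style hands out pointers into storage the roster has to keep alive.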

> Also, the set of character sets is hard-coded into CharacterSet.cpp.
> I think it would be superior to have it read from a file.

Sure. BTW, this sounds like another service that could be provided by
the registrar. Then the application wouldn't need to parse the file,
but would get the data from the registrar.

[...]
> Warning: potential R2 issue + opinion ahead. :-)
> 
> In my opinion this functionality probably should be relocated, and
> moved into existing BTranslationUtils functions or new functions,
> possibly accepting BFile instead of BPositionIO.

Or we could extend BPositionIO (or even BDataIO?) the way Dano did, by
adding {Read,Write}MetaData() methods, which, if the underlying object
is indeed a BFile, would be mapped to {Read,Write}Attr().
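
A rough sketch of that idea, in plain C++ rather than against the real headers: the class names, the signatures, and the "example:encoding" attribute name used in the usage note are all made up for illustration (the actual Dano signatures may well differ), and attributes are modelled as an in-memory map instead of real file attributes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for BDataIO: by default a stream has no metadata.
class DataIO {
public:
    virtual ~DataIO() {}

    // Both return the number of bytes transferred, or -1 if unsupported.
    virtual long ReadMetaData(const char*, void*, std::size_t) { return -1; }
    virtual long WriteMetaData(const char*, const void*, std::size_t) { return -1; }
};

// Hypothetical stand-in for BFile: metadata calls are forwarded to the
// attribute calls. Attributes live in a map here, not on disk.
class File : public DataIO {
public:
    long ReadAttr(const char* name, void* buffer, std::size_t size) {
        std::map<std::string, std::string>::const_iterator it = fAttrs.find(name);
        if (it == fAttrs.end())
            return -1;
        std::size_t count = std::min(size, it->second.size());
        std::memcpy(buffer, it->second.data(), count);
        return static_cast<long>(count);
    }

    long WriteAttr(const char* name, const void* buffer, std::size_t size) {
        fAttrs[name] = std::string(static_cast<const char*>(buffer), size);
        return static_cast<long>(size);
    }

    long ReadMetaData(const char* name, void* buffer, std::size_t size) override {
        return ReadAttr(name, buffer, size);
    }

    long WriteMetaData(const char* name, const void* buffer, std::size_t size) override {
        return WriteAttr(name, buffer, size);
    }

private:
    std::map<std::string, std::string> fAttrs;
};
```

Code that only sees a DataIO reference could then call WriteMetaData("example:encoding", ...) and it would work when the object happens to be a File and fail cleanly otherwise, so the translation utilities would not need to take a BFile explicitly.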

> BTextView should be expanded to support SetEncoding/Encoding.

Mmh, is that really necessary? I always found it compelling that BeOS
uses UTF-8 throughout the whole API, providing conversion functions
for other encodings only.

> One last disappointing observation on R5 StyledEdit vs. OBOS
> StyledEdit:  The R5 StyledEdit is able to open a UTF8 file that I
> have created in it, without issue.  However, the OBOS StyledEdit
> cannot open this file.  The failure is not in the StyledEdit code.
> The failure occurs when BTranslationUtils::GetStyledText is called on
> the file.  It returns B_TRANSLATION_ERROR_BASE.  This occurs when
> calling the R5 version of GetStyledText.  My conclusion from this is
> that the R5 StyledEdit implementation does not even use
> BTranslationUtils::GetStyledText to populate the view.  IMHO this is
> quite unfortunate.  Hopefully the OBOS version of GetStyledText will
> be able to do the right thing.  My guess is that the R5 version of
> GetStyledText fails because it uses the STXTTranslator, which decides
> that the input file is not a text file.  (The input file is text, it
> is Chinese.)  R5 StyledEdit doesn't use GetStyledText at all and
> doesn't try to make a determination.
> 
> And a mystery for those interested: the outstanding bug on viewing
> files with DOS newlines (they get doubled) has left me baffled.  The
> R5 StyledEdit handles not only DOS newlines, but apparently any
> combination of them with Unix newlines.  The newline bytes have not
> been removed or replaced, as they can be cut and pasted, and even
> saved.  It's almost as if the display routine understands these
> different newlines.  I thought to override CanEndLine in BTextView,
> but it seems that is only used when wrapping is on, but these lines
> work properly in R5 StyledEdit.  I'm at a loss.

The problem is not BTextView. It displays the CRs properly (er... as
ugly boxes) -- also in OBOS StyledEdit (just copy one from R5
StyledEdit). I suspect the R5 BTranslationUtils::GetStyledText() screws
it up, replacing CRs by LFs, thus duplicating the newlines. The OBOS
counterpart doesn't do this. Running StyledEdit with our
libtranslation.so, e.g. by

  LIBRARY_PATH=distro/x86.R1/beos/system/lib:$LIBRARY_PATH distro/x86.R1/beos/apps/StyledEdit

makes it work properly.
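
To illustrate the suspected effect (this is only a guess at the behaviour, not the R5 code), a small standalone C++ sketch: the first function replaces every CR by LF, which turns each DOS "\r\n" pair into two newlines; the second shows one way a loader could collapse CRLF, lone CR, and LF into a single LF instead. Both function names are made up for this example.

```cpp
#include <cstddef>
#include <string>

// Naive replacement: every CR becomes LF, so "\r\n" ends up as "\n\n".
std::string ReplaceCrWithLf(const std::string& input) {
    std::string output(input);
    for (std::size_t i = 0; i < output.size(); ++i) {
        if (output[i] == '\r')
            output[i] = '\n';
    }
    return output;
}

// Normalization: CRLF, lone CR, and LF each become exactly one LF.
std::string NormalizeNewlines(const std::string& input) {
    std::string output;
    output.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '\r') {
            output += '\n';
            // Skip the LF of a CRLF pair so it is not counted twice.
            if (i + 1 < input.size() && input[i + 1] == '\n')
                ++i;
        } else {
            output += input[i];
        }
    }
    return output;
}
```

Note that normalizing on load changes the bytes in the buffer, so it would not by itself preserve the original line endings on save.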

BTW, read-only files aren't handled correctly. The R5 StyledEdit 
doesn't allow editing them, while OBOS StyledEdit does.

> In the meantime, please try the OBOS StyledEdit encoding features and
> let me know if you have any problems.

BTW, wouldn't it be a good idea to call the thing beta, put a binary
somewhere, and set up a news item? I guess Sikosis would be the one to
bother. :-)

CU, Ingo

