[openbeos] Re: BString

  • From: "Daniel Reinhold" <danielr@xxxxxxxxxxxxx>
  • To: openbeos@xxxxxxxxxxxxx
  • Date: Mon, 12 Nov 2001 07:05:51 CST

>>The internal representation of the characters in a BString should not 
>>make any difference. Keith should pick whatever implementation works 
>>the best. Users of the class should make no assumption about how the 
>>data is stored -- if they do, they deserve whatever trouble they get, 
>>haha!
>
>Right!  I did not do this and that is why I had problems, I assumed
>that BStrings would do what the BeBook said they will do.

What didn't they do?

>>So yes, users should only rely on calling Length() and not on finding a 
>>terminating null. However, I would recommend *not allowing* zero bytes 
>>in BStrings. It would work technically, but from a design standpoint, 
>>it would confuse people. Making it clear that BString is only meant for 
>>textual strings and not binary data is in keeping with BString's API 
>>(for example, the buffer returned by String() is guaranteed to be 
>>null-terminated).
>
>Where in the BeBook does it say BStrings are for text only?
>If it is for text only, how will it handle UNICODE without zeros?

I don't think it outright says BStrings are only meant for text, but 
that's pretty much common sense. BString is clearly another C++ class 
implementation designed to "save" programmers from <string.h>. C 
programmers whined for years about things they didn't like about 
char *s. From practically the moment C++ arrived on the scene, everyone 
and their brother has implemented some kind of String class. It's mostly 
for syntactic sugar -- so that people can write stuff like:

s = "hello " + username;

instead of

strcpy (s, "hello ");
strcat (s, username);

Of course, they also make it easier to allocate and grow the string 
data (i.e. the malloc() calls are made for you). But they certainly 
aren't a necessity -- you can do anything with char *.
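
To make the comparison concrete, here is a rough sketch of both styles. It 
uses std::string as a stand-in for BString (an assumption, so that it 
compiles anywhere), and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// String-class style: allocation and growth happen behind the scenes.
std::string greet_with_class(const std::string& username) {
    return "hello " + username;
}

// char * style: this code sizes, allocates, and fills the buffer itself.
// The caller owns the result and must release it with free().
char* greet_with_cstr(const char* username) {
    assert(username != nullptr);
    const char* prefix = "hello ";
    std::size_t size = std::strlen(prefix) + std::strlen(username) + 1;
    char* s = static_cast<char*>(std::malloc(size));
    if (s == nullptr) {
        return nullptr;  // allocation failed
    }
    std::strcpy(s, prefix);
    std::strcat(s, username);
    return s;
}
```

Both produce the same text; the only difference is who does the 
bookkeeping for the buffer.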

Binary data is a different matter. C programmers wouldn't use strcpy 
and friends on it -- they'd use memcpy and the other mem* functions, and 
they'd have to manage the buffer length themselves. It's more work. It's 
quite possible to write a 
generic binary string class, but I've never seen a basic string class 
myself that wasn't intended for textual data. BString certainly has the 
API expected for textual data, but not what you'd expect for a binary 
data container. Implementing BString as a generic binary container 
would be more confusing and misleading than helpful, IMO.
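
The difference between the two conventions can be sketched like this. The 
helper names are hypothetical, and std::vector is just one possible way to 
carry a length along with the bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Text convention: the length is implied by the first zero byte.
std::size_t text_length(const char* s) {
    assert(s != nullptr);
    return std::strlen(s);
}

// Binary convention: the length travels alongside the bytes, so zero
// bytes are ordinary data and nothing gets cut short.
std::vector<unsigned char> copy_binary(const void* data, std::size_t len) {
    std::vector<unsigned char> out(len);
    if (len > 0) {
        std::memcpy(out.data(), data, len);
    }
    return out;
}
```

Given the five bytes 'a', 'b', 0, 'c', 'd', text_length() reports 2 
because it stops at the zero, while copy_binary() with an explicit length 
of 5 keeps all of them.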

As for handling Unicode, it doesn't.  Unicode uses multi-byte chars and 
the BeBook clearly warns that BStrings can't be used for multi-byte 
chars. If we want to, we can write a BString version that does support 
Unicode. That's a lot more work, but perhaps that is something to 
consider. But even then, Unicode does not really have embedded zeros; 
rather, it has wide characters (e.g. chars that are two bytes long). If 
you examined those chars a byte at a time, you might think that you're 
seeing nulls -- but at the character level they wouldn't be. For 
example, consider setting a four-byte integer equal to 1. Three of 
those bytes are zero, but the whole integer still equals 1. You have to 
look at the data at the granularity of its basic elements.
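
Here is a small sketch of the granularity point. It assumes a two-byte 
encoding stored in char16_t (as in the wide-character example above), and 
both helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Count zeros while looking at the storage one byte at a time.
std::size_t count_zero_bytes(const void* data, std::size_t len) {
    assert(data != nullptr);
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (bytes[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}

// Count zeros while looking at the data one 16-bit character at a time.
std::size_t count_zero_units(const char16_t* units, std::size_t count) {
    assert(units != nullptr);
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (units[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}
```

A 32-bit integer holding 1 has three zero bytes, and the two wide 
characters 'h' and 'i' have one zero byte each, yet neither contains a 
zero at the level of its own elements.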

>>If the need for a binary string class becomes necessary, we could write 
>>BBinaryString (or something to that effect) to handle that -- altho I 
>>think that good ole char * would be simpler.
>
>??? if you use char * then you are back to square one and have to write
>all the functions available in BStrings yourself.  That is what I did
>and it made my code a mess compared to the BString version which worked
>90% of the time but failed 10% because it is undocumented about how
>zero bytes affect BStrings.
>
>          Earl Colby Pottinger
>

Earl, if you are embedding zeros in your string data, it ain't gonna 
work. You're going to have to use char *s to do it. If you're trying to 
put Unicode into BStrings, you're asking for trouble. If, on the other 
hand, you assume BString variables don't contain embedded nulls, and 
you don't try to put any in, then all your troubles will disappear, I 
would wager.


