
[openbeos] Re: A new BMessage implementation (Message4)
- From: Michael Phipps <mphipps1@xxxxxxxxxxxxxxxx>
- To: openbeos@xxxxxxxxxxxxx
- Date: Tue, 01 Nov 2005 18:46:28 -0500
I have a couple of thoughts on this.
I think that BMessages are absolutely critical and it is a good thing that
someone so devoted and diligent is attacking them. :-D
I honestly think that we need to add to the benchmarks. I would like to see
something where you pick 10 or so different system messages (i.e. messages that
the OS creates) and create large numbers of each. Example:
    timer.start();
    for (int i = 0; i < 10000; i++) {
        createQuitMessage();
        createQuitRequestedMessage();
        createWorkspaceChangedMessage();
        ...
    }
    timer.stop();
    timer.print();
Truthfully, I think that the initialization time for a BMessage will mean a
WHOLE lot more than time to add fields because most of the messages out there
don't use a whole lot of fields. But I could be wrong - that's what these tests
are about. :-D
Secondly, I am concerned about having a different Flatten/Unflatten format,
if only because of people (like me) who would like to upgrade their R5 system
to Haiku by copying a couple of directories over. Losing all of my settings
would irritate me. :-) Isn't it possible to check the version of the flattened
data and instantiate it correctly?
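
A minimal sketch of what such a version check could look like, assuming each
flattened format starts with a distinct 32-bit tag. The tag values and the
names DetectFormat, kMagicR5 and kMagicMessage4 are placeholders made up for
illustration, not the real format codes:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical result codes telling the caller which reader to use.
enum MessageFormat {
    kFormatUnknown,
    kFormatR5,
    kFormatMessage4
};

// Placeholder tags; the real values would come from the format definitions.
const uint32_t kMagicR5 = 0x11111111;
const uint32_t kMagicMessage4 = 0x22222222;

// Peeks at the leading 32-bit tag of a flattened message without
// consuming it and reports which format the buffer claims to be.
inline MessageFormat DetectFormat(const void* buffer, size_t size)
{
    if (buffer == nullptr || size < sizeof(uint32_t))
        return kFormatUnknown;

    uint32_t magic;
    std::memcpy(&magic, buffer, sizeof(magic));

    if (magic == kMagicR5)
        return kFormatR5;
    if (magic == kMagicMessage4)
        return kFormatMessage4;
    return kFormatUnknown;
}
```

Unflatten could then hand the buffer to an R5 reader or to the native reader
depending on the result, so old settings files would keep loading.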
Michael
On 2005-10-31 at 12:36:26 [-0500], mmlr@xxxxxxxx wrote:
> Hello Everyone
>
> As you may have seen, there are currently 3 BMessage implementations in the
> Haiku tree. The original one was written by Erik Jaesler and was based
> completely on templates. While this was nice, it did not perform
> exceptionally well and is, by now, pretty much defunct. From that base I
> wrote Message2 and Message3 some three months ago. They mostly work (Flatten
> is broken in Message2, that's the reason for the good benchmarks) and try
> to use the R5 message format. I'd like to introduce another BMessage
> implementation (Message4). The following lists the most important
> advantages/disadvantages of the four implementations.
>
> Message1: (original as of now)
> + Uses std:: containers
> + Supports R5 message format
> - Does not scale well
> - Slow
>
> Message2:
> + Uses hash tables for lookups
> + Supports R5 message format
> + Simple
> - Extensive amount of runtime variables
> - Does not scale that well
> - Slow
>
> Message3:
> + Uses hash tables for lookups
> + Uses one flat buffer
> + Scales very well
> + Supports (direct) unflattening of R5 message format
> + Fast
> - Flattening does not produce 100% R5 message format output
> - Has many runtime variables that need to be recreated when unflattening
> - Complex
>
> Message4:
> + Uses hash tables for lookups
> + Uses three flat buffers for header, field table and data
> + Scales well for most cases
> + Faster, especially for unflattening
> + Fairly straightforward
> - Does not support R5 message format
> - Flattened messages are bigger
>
> The idea of Message4 was to reduce the number of runtime variables that need
> to be stored on flatten or recreated on unflatten operations. An example of
> this would be the hash table. By including the hash table in the flattened
> message, the unflatten time is greatly reduced. The flatten concept is to
> just write the three flat buffers to the output buffer without any further
> processing (except for storing the public what field). For unflatten, it is
> enough to copy the input buffer into the right flat buffers and restore the
> what field. This achieves good flatten/unflatten results but also
> generates bigger flattened messages.
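
The flatten/unflatten idea described above can be sketched roughly as
follows. This is only an illustration: the FlatMessage type and the prefix of
four 32-bit values (what plus three buffer sizes) are assumptions made so the
example is self-contained; the mail does not say how Message4 records the
buffer sizes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative stand-in for a message that consists of three flat buffers.
struct FlatMessage {
    uint32_t what = 0;
    std::vector<uint8_t> header;
    std::vector<uint8_t> fields;
    std::vector<uint8_t> data;
};

// Flatten: store what and the three sizes (assumed prefix), then copy the
// buffers back to back without any further processing.
inline std::vector<uint8_t> Flatten(const FlatMessage& message)
{
    const uint32_t prefix[4] = {
        message.what,
        static_cast<uint32_t>(message.header.size()),
        static_cast<uint32_t>(message.fields.size()),
        static_cast<uint32_t>(message.data.size())
    };
    const uint8_t* raw = reinterpret_cast<const uint8_t*>(prefix);
    std::vector<uint8_t> out(raw, raw + sizeof(prefix));
    out.insert(out.end(), message.header.begin(), message.header.end());
    out.insert(out.end(), message.fields.begin(), message.fields.end());
    out.insert(out.end(), message.data.begin(), message.data.end());
    return out;
}

// Unflatten: validate the sizes, then copy each region into its buffer
// and restore what. Nothing has to be rebuilt afterwards.
inline bool Unflatten(FlatMessage& message, const std::vector<uint8_t>& in)
{
    uint32_t prefix[4];
    if (in.size() < sizeof(prefix))
        return false;
    std::memcpy(prefix, in.data(), sizeof(prefix));

    const uint64_t total = sizeof(prefix) + static_cast<uint64_t>(prefix[1])
        + prefix[2] + prefix[3];
    if (in.size() < total)
        return false;

    const uint8_t* cursor = in.data() + sizeof(prefix);
    message.what = prefix[0];
    message.header.assign(cursor, cursor + prefix[1]);
    cursor += prefix[1];
    message.fields.assign(cursor, cursor + prefix[2]);
    cursor += prefix[2];
    message.data.assign(cursor, cursor + prefix[3]);
    return true;
}
```

The flattened size is the sum of the buffers plus the prefix, which is where
the bigger flattened messages come from: any preallocated space and the hash
table travel along with the payload.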
>
> Message4 Format:
>
> <HeaderBuffer>
>   <MessageHeader>
>     <Format>
>     <Flags>
>     <ReplyInfo>
>     ...
>     <Hashtable>
>   </MessageHeader>
> </HeaderBuffer>
>
> <FieldsBuffer>
>   <FieldHeader>
>     <Type>
>     <Flags>
>     <NameLength>
>     <Size>
>     <Count>
>     ...
>     <Offset>
>   </FieldHeader>
>   <FieldHeader>
>   <FieldHeader>
>   <preallocated space>
> </FieldsBuffer>
>
> <DataBuffer>
>   <FieldData>
>     <FieldName>
>     <DataItem1>
>     <DataItem2>
>     <DataItem3>
>   </FieldData>
>   <FieldData>
>     <FieldName>
>     <DataSize1>
>     <DataItem1>
>     <DataSize2>
>     <DataItem2>
>     <DataSize3>
>     <DataItem3>
>   </FieldData>
>   <preallocated space>
> </DataBuffer>
>
> Message Header
>
> The message header is fixed size. It contains a hash table (currently set to
> 10 entries = 40 bytes). The hash table is special in that it does not contain
> pointers to the fields, but indexes into the FieldHeader array in the
> FieldsBuffer. While this is slower, it has the advantage that the table can
> be flattened and unflattened with the message and still work without being
> rebuilt with new pointers. General state information and reply info are
> saved in the header too. These could be stored as members, but I think
> it's cleaner to just copy them around inside a buffer.
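
To illustrate why an index-based hash table survives a plain copy, here is a
small sketch. The collision handling (a nextField chain stored in each entry),
the hash function and all names are assumptions for the example; the mail only
states that the table holds 10 indexes into the FieldHeader array.

```cpp
#include <cstdint>
#include <string>
#include <vector>

const int32_t kHashTableSize = 10;  // 10 entries of 4 bytes = 40 bytes

struct FieldEntry {
    std::string name;   // in Message4 the name lives in the data buffer
    int32_t nextField;  // assumed: next entry in the same bucket, -1 = end
};

struct FieldIndex {
    int32_t table[kHashTableSize];
    std::vector<FieldEntry> fields;

    FieldIndex()
    {
        for (int32_t i = 0; i < kHashTableSize; i++)
            table[i] = -1;
    }

    // Simple illustrative string hash reduced to a bucket number.
    static uint32_t Hash(const std::string& name)
    {
        uint32_t hash = 0;
        for (unsigned char c : name)
            hash = hash * 31 + c;
        return hash % kHashTableSize;
    }

    // Appends a field and links it at the front of its bucket chain.
    int32_t Add(const std::string& name)
    {
        const uint32_t bucket = Hash(name);
        const int32_t index = static_cast<int32_t>(fields.size());
        fields.push_back({ name, table[bucket] });
        table[bucket] = index;
        return index;
    }

    // Follows indexes, never pointers, so a copied table stays valid.
    int32_t Find(const std::string& name) const
    {
        for (int32_t i = table[Hash(name)]; i >= 0; i = fields[i].nextField) {
            if (fields[i].name == name)
                return i;
        }
        return -1;
    }
};
```

Because the table and the chain links are plain integers, copying the
structure byte for byte (or member for member, as a copy of FieldIndex does)
yields a table that works immediately, at the cost of one extra array lookup
per step compared to following pointers.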
>
> Field Headers
>
> The FieldsBuffer contains an array of FieldHeaders. The FieldHeader itself is
> fixed size, since the variable sized name is located in the data buffer and
> not in the header. The FieldsBuffer is resized lazily and adding fields costs
> only very few cycles. Removing a field that is not at the top is slow though,
> as all stored indexes need to be looked through and decremented if they are
> bigger than the index of the removed header.
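
The index fix-up on removal might look like the sketch below. It assumes the
same hypothetical layout as before (a 10-entry table of indexes plus a
nextField chain link per header); IndexedHeader, FixIndex and RemoveHeader are
illustrative names, not taken from the Message4 source.

```cpp
#include <cstdint>
#include <vector>

const int32_t kBucketCount = 10;

struct IndexedHeader {
    int32_t nextField;  // assumed chain link to another header, -1 = end
};

// Rewrites one stored index after the header at position removed is gone:
// references to the removed header skip to its successor, and every index
// above the removed position moves down by one.
inline int32_t FixIndex(int32_t index, int32_t removed, int32_t successor)
{
    if (index == removed)
        index = successor;
    if (index > removed)
        index--;
    return index;
}

// Erases headers[removed] and then walks every stored index, which is
// the linear pass that makes removal in the middle slow.
inline void RemoveHeader(std::vector<IndexedHeader>& headers,
    int32_t (&table)[kBucketCount], int32_t removed)
{
    const int32_t successor = headers[removed].nextField;
    headers.erase(headers.begin() + removed);

    for (int32_t& slot : table)
        slot = FixIndex(slot, removed, successor);
    for (IndexedHeader& header : headers)
        header.nextField = FixIndex(header.nextField, removed, successor);
}
```

Removing the last header is the cheap case: no stored index can be bigger
than the removed one, so only references to that header itself change.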
>
> Field Data
>
> Each field's data starts with the field name. It then continues with either
> an array of fixed size items (for fixed size data types) or a sequence of {
> ssize_t dataSize; uint8 data[]; } pairs. There is no direct index or offset
> table for variable sized data, so looking through variable sized data blocks
> does not scale very well. The DataBuffer is resized with a growing block size
> (size * 2) up to a maximum of 10 pages (40KB). This should reduce performance
> problems with very big messages (archiving) but still keep the waste of
> memory low for small messages. These limits can of course be adjusted if
> they prove to be inefficient.
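
Two small sketches of the points above, both under stated assumptions. The
first shows one possible reading of the growth policy (the step doubles with
the buffer until it reaches 10 pages of 4 KB, then stays there); the initial
size of 64 bytes is made up. The second walks the size/data pairs of a
variable sized field, using int32_t as a stand-in for ssize_t.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

const size_t kMaxGrowStep = 10 * 4096;  // 10 pages = 40 KB
const size_t kInitialSize = 64;         // assumed starting block size

// Returns a capacity of at least needed bytes: doubling while the buffer
// is small, then growing by at most kMaxGrowStep per step.
inline size_t NextCapacity(size_t current, size_t needed)
{
    size_t capacity = current > 0 ? current : kInitialSize;
    while (capacity < needed) {
        const size_t step = capacity < kMaxGrowStep ? capacity : kMaxGrowStep;
        capacity += step;
    }
    return capacity;
}

// Finds item number index in a run of { int32_t size; uint8_t data[]; }
// pairs. Every earlier item has to be skipped one by one, which is why
// lookups in variable sized fields scale linearly with the item count.
inline const uint8_t* FindItem(const uint8_t* fieldData, size_t fieldSize,
    int32_t index, int32_t* itemSize)
{
    size_t offset = 0;
    for (int32_t i = 0; offset + sizeof(int32_t) <= fieldSize; i++) {
        int32_t size;
        std::memcpy(&size, fieldData + offset, sizeof(size));
        offset += sizeof(size);
        if (size < 0 || static_cast<size_t>(size) > fieldSize - offset)
            return nullptr;  // malformed or truncated data
        if (i == index) {
            *itemSize = size;
            return fieldData + offset;
        }
        offset += size;
    }
    return nullptr;
}
```

Fixed size fields need no such walk, since item n simply sits at n times the
item size after the field name.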
>
> Message4 is implemented as one class (BMessage) by the way. There is no
> BMessageBody and BMessageField, as this produces just too much overhead in
> the end. The two header types are implemented as structs. It is fairly simple.
>
>
> That is about the design of the format. As you can see, it differs in the
> layout compared to R5 where you have a message header in front and then pairs
> of field headers and field data.
> To provide unflatten compatibility with the R5 message format I would suggest
> writing an R5 message reader like the one Axel wrote for Dano messages.
> We should also adopt the flatten prototypes of Dano, where you can choose the
> output format. For this an R5 and maybe even a Dano message writer can be
> implemented. An alternative to this could be to always produce an R5
> compatible output for Flatten, but internally (for messaging) use the native
> format. Since we are going to extend BMessage (by new flags or extra
> information stored in the header for example) the R5 format will not work
> for very long. So going the way Be did when it introduced its new message
> format in Dano would be a sensible choice I think.
>
> The implementation of Message4 is pretty much finished. The only thing
> missing is actual message delivery. This is because of all the BMessage
> "friends" that are involved and are accessing private BMessage data directly.
> They need to be adapted to the new message format, since they can no longer
> access information in BMessage members directly. I'm currently working on
> extending the private data accessor and removing all the direct friends. As
> soon as all private data is accessed through the accessor, messaging should
> work. Expect the checkin of Message4 to follow in the next few days.
>
> Please tell me what you think about this concept. How open is everyone
> involved to just switching the BMessage implementation? What is necessary for
> Message4 to be used as the default (single) BMessage implementation?
>
> Benchmarks can be found here: http://haiku.mlotz.ch/messagespeed.html. Bear
> in mind that Message2 only wins in flatten/unflatten sometimes because it is
> broken and does not store all the fields. The rest of the implementations
> should work ok.