[openbeosstorage] Re: Registrar-Based Notification Mechanism

  • From: Ingo Weinhold <bonefish@xxxxxxxxxxxxxxx>
  • To: openbeosstorage@xxxxxxxxxxxxx
  • Date: Thu, 13 Feb 2003 21:13:52 +0100 (MET)

On Thu, 13 Feb 2003, Axel Dörfler wrote:

> Ingo Weinhold <bonefish@xxxxxxxxxxxxxxx> wrote:
>
> > The general idea is to move as much functionality as possible from
> > the kernel to a userland server -- the registrar. The expected
> > positive effects are:
> >
> > * No (flattened) BMessages in the kernel.
> >
> > * Similar performance of the kernel entity causing the notification
> > independent of the number of listeners. Maybe even slightly better
> > performance for no/one listener.
>
> Just keep in mind that having the node monitor mechanism for the kernel
> itself might also be a very nice idea, and that shouldn't be dropped so
> easily. With a kernel delivering all those nice things, why shouldn't
> it be able to benefit itself from these?

Yes, I briefly thought about that too, but didn't pursue the idea. One of
the problems is how the kernel entities should be notified. For userland
applications this is trivial, since you have messaging, but for the
kernel...?

Distinguishing between active and passive entities -- i.e. those that run
their own thread (e.g. daemons) and those that don't (e.g. file systems)
-- I can imagine several approaches:

1) Event queues: Each subscriber has a queue into which events are
pushed. An event queue may have a counting semaphore that is released
with each event pushed into it, so the (active) entity can wait for
events (see the sketch after this list).

2) Ports: The event notifications are written to a port. This doesn't work
very well with passive entities, I suspect.

3) A combination of 1) and 2): The subscriber can optionally provide a
port to which only a `there is a new event in the queue' message is
written.

4) Callbacks: The subscribers supply a function to be called when events
occur.
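
As a rough sketch of 1) -- every name here is made up, and locking is
left to the service and elided:

#define MAX_QUEUED_EVENTS 64

struct event_queue {
  notification_event  *events[MAX_QUEUED_EVENTS];
  int32               head, tail;
  sem_id              counter;  // counting semaphore, or -1
};

// called by the entity issuing the event
status_t
push_event(event_queue *queue, notification_event *event)
{
  int32 next = (queue->tail + 1) % MAX_QUEUED_EVENTS;
  if (next == queue->head)
    return B_WOULD_BLOCK;  // queue full
  queue->events[queue->tail] = event;
  queue->tail = next;
  if (queue->counter >= 0)  // wake a waiting active subscriber
    release_sem_etc(queue->counter, 1, B_DO_NOT_RESCHEDULE);
  return B_OK;
}

// called by an active subscriber, e.g. a daemon's worker thread
notification_event *
wait_for_event(event_queue *queue)
{
  if (acquire_sem(queue->counter) != B_OK)
    return NULL;  // semaphore deleted: we're shutting down
  notification_event *event = queue->events[queue->head];
  queue->head = (queue->head + 1) % MAX_QUEUED_EVENTS;
  return event;
}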

Certainly 4) is the most flexible of the approaches. The others could
even be implemented on top of it without performance issues. Of course,
it could be misused by subscribers implementing time-consuming callback
functions, which would hurt the performance of the entity issuing the
events -- one would need to be very careful.

BTW, when using callbacks the mechanism for userland notifications could
hook in as just another subscriber (at the cost of one memcpy, I think).

If there is the desire to unify the interfaces of the different watching
services (node monitoring, mounting, disk_scanner stuff,...) a bit, I
could imagine something like this:

struct notification_subscriber {
  bool (*notify)(const notification_subscriber*,
                 const notification_event*);
  void *service_params;
};

Where, e.g. in the case of the node watching service, service_params
could point to:

struct node_watching_parameters {
  ino_t      node;
  uint32     event_mask;
  ...
};
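
The notification_event type is left open here; purely as an assumption,
it might be no more than a tagged, variable-length blob:

struct notification_event {
  uint32  service;  // the service that issued the event
  uint32  opcode;   // e.g. B_ENTRY_CREATED for node monitoring
  size_t  size;     // size of the service-specific data
  char    data[1];  // variable-length payload
};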

The subscription/unsubscription functions would look like:

status_t xyz_subscribe(const notification_subscriber *subscriber);
status_t xyz_unsubscribe(const notification_subscriber *subscriber);

One could even provide:

status_t subscribe(uint32 service,
                   const notification_subscriber *subscriber);
status_t unsubscribe(uint32 service,
                     const notification_subscriber *subscriber);

Anyway, for sake of performance optimization the structure holding the
subscribers for a certain service would need to be managed by the service
itself.
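
E.g. like this (a sketch only -- the list type is invented; the userland
notification mechanism from above would be just another entry, with a
notify() that memcpy()s the event into the ring buffer):

struct subscriber_entry {
  subscriber_entry               *next;
  const notification_subscriber  *subscriber;
};

static void
notify_subscribers(subscriber_entry *list, const notification_event *event)
{
  for (subscriber_entry *entry = list; entry != NULL;
      entry = entry->next) {
    // the callback filters on its service_params (node, event_mask,
    // ...) and must return quickly -- a slow callback stalls the
    // entity issuing the event
    entry->subscriber->notify(entry->subscriber, event);
  }
}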

> > * No more dropping of notification messages, in case the target port
> > is full at the moment of the event.
>
> Depends on the implementation - the shared memory buffer could also get
> used up completely, and what would you like to do then? Just wait until
> the buffer is free again?

Seems like you found the answer further down. :-)

> The registrar thread collecting the messages should probably run at a
> high priority :-)

That's not a bad idea.

> Anyway, how does the Registrar know about the target of the message?
> The kernel knows this, but the registrar (currently) cannot. The only
> way would be double house-keeping, I think.

Partially. All the kernel needs to know is whether there is anyone at all
listening for an event. Only the registrar would know the concrete
targets.

> > Possible negative effects:
> > * Higher `event occurrence -> arrival of notification' latency.
>
> Yes, there are more context switches needed to deliver the message. I
> don't consider this a big problem though, as those notifications aren't
> that time critical, are they?

That's what I think too.

> > * More resource usage: thread(s), semaphores and memory
>
> That might be a bigger problem, but nowadays not sooo important,
> although we should avoid doing too much, so that the kernel can still
> run with a low amount of memory.

Well, yes, with some reasonable policy the mechanism I proposed shouldn't
eat up that much memory: some for the ring buffer (I bet 10 KB would be
sufficient) and some for pending messages. The latter should be rather
harmless once some sanity limits are set.

> > When an event occurs the kernel notifies the registrar, which manages
> > the lists of listeners and sends the actual notification messages.
> > The kernel->registrar communication is done via shared memory; for
> > the other direction syscalls can be used.
>
> If the kernel sent out *all* notifications, that would probably be a
> lot more expensive for the case with no listeners, which should be
> considered the most probable path. Right now, the kernel does nothing
> for those notifications - it makes a hash lookup and that's it.
> In that case, it would have to copy the notification data into a
> buffer, and the registrar thread would have to wake up and check
> whether there are any listeners waiting for that info.
> Perhaps we should do a little more in the kernel (like maintaining a
> list of vnodes that trigger a notification at all, very similar to how
> it is now).

Exactly what I had in mind. :-)
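
I.e. something along these lines (a sketch; the hash table, its key
type and the queueing helper are placeholders):

struct watched_vnode {
  dev_t  device;
  ino_t  node;
};

// the kernel keeps a hash table of watched vnodes, so the no-listener
// case remains a single lookup, just as it is now
status_t
notify_entry_created(dev_t device, ino_t directory, const char *name,
  ino_t node)
{
  watched_vnode key = { device, directory };
  if (hash_lookup(sWatchedVnodes, &key) == NULL)
    return B_OK;  // nobody listening: as cheap as today

  // someone listens: copy the event into the shared ring buffer and
  // wake the registrar
  return queue_notification_event(B_ENTRY_CREATED, device, directory,
    name, node);
}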

> If we always had more than one CPU, that would probably be better,
> though :-)

Well, more CPUs are always good to have. :-)

> > 1) Subscription/unsubscription for notifications: This is a bit
> > ugly, since the functions for that live in libroot and the primary
> > target for the request is the registrar. Either the request is passed
> > through a port directly to the registrar, or it travels into the
> > kernel and goes the same way to the registrar that the notifications
> > take (i.e. through the shared memory). Once there, the listener is
> > added to the appropriate list/hashtable/...
>
> In that case, why should the subscription functions be in libroot.so
> then?

Er, personally I'd rather have them in libbe. And now that I think about
it, actually only open_live_query() (or whatever the exact name was) is
not.

> And not simply in libbe.so (there is no requirement to have only
> libroot.so make syscalls, although it might be the cleaner design to do
> so :)?

Syscalls weren't my concern. Those living in libbe can directly send a
message to the registrar (as currently done for roster and clipboard
watching). But in libroot, we do, of course, want to avoid dealing with
messages.

> > Since it should be avoided that the kernel notifies the registrar on
> > an event no one is interested in, there still needs to be a structure
> > in the kernel holding the information about which events are listened
> > for. The registrar needs to tell the kernel whenever this info has
> > changed. This structure could be in shared memory too, but, since
> > subscription/unsubscription is not time-critical, syscalls may be the
> > better choice. Rare events (mounting, appearing/disappearing of
> > devices and the like) can, of course, always be sent to the
> > registrar.
>
> Ah, you seem to have thought of this already :-)

Hehe. :-)

> If we already have a syscall that does the notification, the kernel
> could maintain its list independently. I kinda don't like the situation
> that the kernel is dependent on a userspace application, even if it is
> the registrar :-)
> In theory, every application should be able to pick up this service
> provided by the kernel :)

Mmh, I wouldn't say that the kernel would depend on the registrar. It's
more that userland notifications depend on it. If there is no registrar,
then there won't be a ring buffer and the kernel doesn't have to do
anything regarding notifications.

[...]
> > In case the ring buffer is full, reserve_notification_entry()
> > allocates memory for the entry on the heap and pushes it into a
> > queue. The next call to it will first try to empty the queue --
> > copying its contents into the ring buffer -- and proceed as usual,
> > i.e. reserve ring buffer space or alloc memory and use the queue
> > respectively.
>
> That is probably not a very good idea, although it's not a big problem
> in the case of BeOS (since the registrar is needed for normal operation
> anyway).
> You make the sanity of the kernel depend on the presence of a
> particular application in userland.

Not really. As said above, if the registrar isn't running (anymore, or not
yet) the whole userland notification interface in the kernel does nothing.
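
BTW, to illustrate the overflow handling quoted above -- every name is
hypothetical and locking is elided; ring_buffer_reserve() stands for
whatever primitive hands out contiguous bytes in the shared buffer, or
NULL if it is currently full:

void *ring_buffer_reserve(size_t size);  // placeholder, see above

typedef struct queued_entry {
  struct queued_entry  *next;
  size_t               size;
  // event data follows
} queued_entry;

static queued_entry *sOverflowHead = NULL;
static queued_entry **sOverflowTail = &sOverflowHead;

void *
reserve_notification_entry(size_t size)
{
  // 1) drain the overflow queue into the ring buffer, oldest first
  while (sOverflowHead != NULL) {
    void *space = ring_buffer_reserve(sOverflowHead->size);
    if (space == NULL)
      break;  // buffer still full
    memcpy(space, sOverflowHead + 1, sOverflowHead->size);
    queued_entry *done = sOverflowHead;
    sOverflowHead = done->next;
    if (sOverflowHead == NULL)
      sOverflowTail = &sOverflowHead;
    free(done);
  }

  // 2) the new entry may go into the buffer only if the queue is now
  //    empty, so that the event order is preserved
  if (sOverflowHead == NULL) {
    void *space = ring_buffer_reserve(size);
    if (space != NULL)
      return space;
  }

  // 3) otherwise allocate on the heap and append to the queue
  queued_entry *entry
    = (queued_entry*)malloc(sizeof(queued_entry) + size);
  if (entry == NULL)
    return NULL;
  entry->size = size;
  entry->next = NULL;
  *sOverflowTail = entry;
  sOverflowTail = &entry->next;
  return entry + 1;  // the caller fills in the event data
}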

> Perhaps we should have a libroot function where a team can register to
> provide that service, and once it goes down (i.e. crashes) that
> position is free in the kernel again, and the kernel knows that it
> doesn't have to care about maintaining those notifications anymore
> (therefore, the buffer won't overflow).

That is pretty much what I intended anyway. As I wrote, the registrar
tells the kernel via a syscall when it is ready. More precisely, it will
pass the IDs of the locking semaphore and the shared memory area to the
kernel. In theory any application could do that. When about to die, the
registrar should invoke another syscall.

Regarding crashes of the registrar, I thought it would be enough that the
locking semaphore is deleted with its death. But you're right, some
special handling is required, since otherwise things can end up really
badly if the registrar crashes just after the kernel has acquired the
lock.
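
Roughly like this, I imagine (the syscall names and kBufferSize are
invented for the sake of the example):

status_t _kern_start_notification_service(area_id buffer, sem_id lock);
status_t _kern_stop_notification_service(void);

// registrar side: create the shared buffer and the lock, then hand
// their IDs to the kernel -- in theory any team could do this
static status_t
start_notification_service()
{
  void *address;
  area_id buffer = create_area("notification buffer", &address,
    B_ANY_ADDRESS, kBufferSize, B_NO_LOCK, B_READ_AREA | B_WRITE_AREA);
  if (buffer < 0)
    return buffer;
  sem_id lock = create_sem(1, "notification buffer lock");
  if (lock < 0) {
    delete_area(buffer);
    return lock;
  }
  return _kern_start_notification_service(buffer, lock);
}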

> Also, with that ring buffer, we could move the registrar thread(s)
> which provide(s) that service into the kernel, too, saving one context
> switch per message delivery (but having the (or a) BMessage in the
> kernel again).

Yep, that's basically the tradeoff: BMessage in the kernel vs.
notification delivery in userland. Letting the registrar deliver the
messages has the additional advantage that this effort can be shared
with that of the other event watching services, which are all
implemented there.

> Or have special message ports that can't (easily) overflow (because
> basically, that's what you provide there).

Yep.

> > BTW, there is quite some potential for optimization, e.g. by
> > flattening a notification message very early (i.e. directly after
> > setting it up) and writing the flattened message to the target ports
> > instead of using the high-level API for sending BMessages, which
> > would cause the message to be flattened each time a send is
> > attempted.
>
> That's something I would do anyway, and not regard as an optimization
> :-))

Well, right. Currently it can't be implemented though, since it needs
some support from BMessage. And considering the unfortunately slow
BMessage progress, I suspect we will be done with implementing the
notification stuff before our BMessage is ready. :-(
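
The delivery loop itself could then be as simple as this sketch (the
port protocol details are glossed over; flatBuffer/flatSize are assumed
to hold the result of a single BMessage::Flatten() call):

// deliver one pre-flattened message to many target ports; a zero
// timeout avoids blocking the registrar on a full port (such targets
// could be retried or dropped)
static void
deliver_flattened_message(const port_id *targets, int32 targetCount,
  int32 code, const void *flatBuffer, size_t flatSize)
{
  for (int32 i = 0; i < targetCount; i++) {
    write_port_etc(targets[i], code, flatBuffer, flatSize,
      B_RELATIVE_TIMEOUT, 0);
  }
}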

> > Mmh, that's it, I think. At least my mind feels a bit emptier now.
> > ;-)
>
> Hehe, sounds very nice overall.
> I would just like to have a way to keep the notification mechanism
> available for kernel modules - now if you have an idea to provide
> this... :-))

I haven't thought much about that yet. The above is spontaneous
brainstorming -- maybe some more concrete ideas will form after thinking
about it a bit more.

CU, Ingo

