[nanomsg] Re: Trying to implement "directory" pattern

  • From: Martin Sustrik <sustrik@xxxxxxxxxx>
  • To: Paul Colomiets <paul@xxxxxxxxxxxxxx>
  • Date: Mon, 25 Feb 2013 06:51:14 +0100

On 24/02/13 12:57, Paul Colomiets wrote:

>     Do I understand it correctly: Is the goal to have multiple memcached
>     instances, each with its own name, and a management console able to
>     send a request to any of those instances based on a name?
>
>
> I'm not sure I understand the question correctly. But a couple of
> remarks follow.
>
> The goal is not to interconnect memcached instances (they should
> probably be connected with the BUS pattern).
>
> I don't think there is a management console in memcached :). But in my
> case, it's a single "cluster" of memcached instances, and each one
> should know the others to keep the set of subscriptions distinct.
>
> The protocol can be used to request data by instance name, but it's more
> than that. For example, to scale memcached smoothly, you need to have
> more "buckets" than instances. For 10 nodes we allocate, say, 1000
> buckets and give each node 100 of them. When adding another node, we
> move about 9 buckets from each node to the new one. This keeps the load
> evenly distributed across all 11 nodes.
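>
> To make this concrete, a minimal sketch of the bucket mapping
> (hypothetical code, just to illustrate the idea):
>
>     #define BUCKETS 1000
>
>     /* Map a key to a fixed bucket; the bucket-to-node table below is
>        what gets rebalanced when nodes are added or removed. */
>     static int bucket_of (const char *key)
>     {
>         unsigned h = 5381;
>         while (*key != 0)
>             h = h * 33 + (unsigned char) *key++;
>         return h % BUCKETS;
>     }
>
>     /* owner [b] holds the node currently serving bucket b; moving a
>        bucket means updating this entry and migrating its data. */
>     static int owner [BUCKETS];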
>
> I have more use cases than a cache service; the memcached example is
> just one that is easier to explain. But if anybody knows a better way
> to implement this in nanomsg, I'm happy to listen :)

Hm. I think I'm lost already :) What's a bucket?

Anyway, that should not prevent you from moving on with the implementation. See answers to your questions below.

        2. When a pipe is added to an answer socket, all subscriptions
        should be resent. Is there a way to put all the subscriptions into
        a pipe without any limits? Otherwise, I need to either build a
        complex state machine to track which subscriptions have already
        been resent, or bundle all subscriptions into a single message
        (inventing some ad-hoc message format). BTW, there is a similar
        problem with just adding a subscription when some output pipes are
        full. What does Crossroads do in this case?


    I think you are going to run into problems here.

    Sending the same data to multiple destinations reliably has the
    effect of a slow/dead/malevolent consumer being able to block the
    whole topology. Subscriptions, being just data in the end,
    experience the same problem.


I think you are too idealistic here. We know for sure that the
subscriptions fit in memory, so we can keep them in a memory buffer for
sending.

Sure. What I am saying is that when sending messages to two destinations in parallel and reliably, if one of the destination applications hangs, it will ultimately cause the other application to hang as well. Thus the failure propagates sideways.

The other question is that, as far as I understand, there is no API for
buffering in nanomsg. Am I right? (more discussion below)

There's buffering code in the inproc transport (inproc doesn't use a network protocol with tx/rx buffers, so it has to buffer messages itself). See src/transports/inproc/msgqueue.h.

I should probably move that class into src/utils, so that anyone can re-use it.
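Conceptually it's just a bounded FIFO of messages. A simplified sketch of the idea (not the actual msgqueue.h API):

    /* Simplified sketch of a bounded in-memory message queue; the real
       code in src/transports/inproc/msgqueue.h is more elaborate. */
    #include <stddef.h>

    struct msg {
        void *data;
        size_t len;
    };

    struct msgqueue {
        struct msg *items;   /* circular buffer of queued messages */
        size_t cap;          /* maximum number of queued messages */
        size_t head;         /* index of the oldest message */
        size_t count;        /* number of messages currently queued */
    };

    /* Returns 0 on success, -1 when the queue is full and the sender
       has to wait (or fail, depending on the desired semantics). */
    static int msgqueue_send (struct msgqueue *self, struct msg m)
    {
        if (self->count == self->cap)
            return -1;
        self->items [(self->head + self->count) % self->cap] = m;
        self->count++;
        return 0;
    }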

    What it means is that one hung ANSWER application could possibly
    block the whole datacenter.

    You'll have to think out of the box here. For example, pub/sub
    solves the problem by allowing just one upstream pub socket per sub
    socket.


Sorry, but I don't understand how it solves the problem.

It prevents the failure from propagating sideways. Thus, every failure is always local to a sub-tree.

Does setsockopt(...NN_SUBSCRIBE...) block until the subscription is
sent? If so, it's not documented and counter-intuitive. If setsockopt
doesn't block, then what stops you from filling all the buffers by
subscribing a few thousand times in a tight loop?
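For example, nothing seems to prevent this (a sketch; header paths and the NN_SUBSCRIBE constant as used in this thread):

    /* Queue a huge number of subscriptions in a tight loop; if
       nn_setsockopt never blocks, all of them just pile up in buffers. */
    #include <stdio.h>
    #include <string.h>
    #include <nanomsg/nn.h>
    #include <nanomsg/pubsub.h>

    int main (void)
    {
        char topic [32];
        int i;
        int s = nn_socket (AF_SP, NN_SUB);
        nn_connect (s, "tcp://127.0.0.1:5555");
        for (i = 0; i != 100000; i++) {
            snprintf (topic, sizeof (topic), "topic-%d", i);
            nn_setsockopt (s, NN_SUB, NN_SUBSCRIBE, topic,
                strlen (topic));
        }
        nn_close (s);
        return 0;
    }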

It doesn't block yet, because subscription forwarding is not yet implemented, but yes, I would say it should block.

        3. I'm going to use the NN_RESEND_IVL and NN_SUBSCRIBE options,
        but they have the same integer value. So I either need to
        reinvent names, or we need to assign unique numbers to the
        built-in socket options. I think the latter is preferable, as it
        is also less error-prone (one can't setsockopt(NN_RESEND_IVL) on
        a SUB socket).


    Ah, there are socket option levels now, same as in POSIX.


Yeah, I know.

    So NN_RESEND_IVL is specific to the REQ/REP pattern (NN_REQREP
    option level) and NN_SUBSCRIBE is specific to pub/sub (NN_PUBSUB
    option level). You should define a unique option level for your
    protocol and define the option constants as you see fit.


Well, the option name doesn't contain the protocol type, which seems
strange. So should I use NN_DIR_RESEND_IVL and NN_DIR_SUBSCRIBE, or
NN_RETRY_IVL and NN_OFFER? Yeah, it's a policy issue, not a technical
one.
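E.g. something like this (hypothetical names and numbers):

    /* Hypothetical constants for a "directory" protocol. A separate
       option level keeps these options from clashing with options of
       other protocols, even if the numeric values repeat. */
    #define NN_DIR 42            /* made-up option level */
    #define NN_DIR_RESEND_IVL 1
    #define NN_DIR_SUBSCRIBE 2

    /* Usage: nn_setsockopt (s, NN_DIR, NN_DIR_RESEND_IVL,
       &ivl, sizeof (ivl)); */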

    A separate option level will guarantee that you won't clash with
    other protocols.


If I call setsockopt(.., NN_SUB, NN_RESEND_IVL, ...) by mistake, bad
things can happen :)

Yes. For consistency's sake, the option names should be prefixed with the socket type name. However, to keep the names simple and consistent with 0MQ, I opted to omit the prefix here, i.e. NN_SUBSCRIBE instead of NN_SUB_SUBSCRIBE.

Should we change that?

Is message splitting done by each protocol on its own? Is there a
length-prefixed or some other generic format for message headers that I
can reuse?

There are no assumptions about the format. Messages handed from the transports to individual SP protocols are pure binary BLOBs. Each protocol implementation has full control of its wire format.
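So if you want, say, a length-prefixed header, you are free to define one yourself. A trivial sketch (a made-up format, nothing nanomsg prescribes):

    /* Made-up framing: a 4-byte big-endian body length in front of the
       payload; the peer speaking the same protocol parses it back out. */
    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    static size_t frame (uint8_t *out, const void *body, uint32_t len)
    {
        uint32_t nlen = htonl (len);
        memcpy (out, &nlen, sizeof (nlen));
        memcpy (out + sizeof (nlen), body, len);
        return sizeof (nlen) + (size_t) len;
    }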

Martin
