[nanomsg] Re: Trying to implement "directory" pattern

From: Martin Sustrik <sustrik@xxxxxxxxxx>
To: Paul Colomiets <paul@xxxxxxxxxxxxxx>
Date: Thu, 28 Feb 2013 08:17:49 +0100

On 26/02/13 20:41, Paul Colomiets wrote:

    The problem with that is auto-reconnect. Subscriber sends a lot of
    subscriptions, more than the producer is able to accept. It is not
    processing them, so TCP pushback happens. Subscriber sees that the
    subscription stream is stuck and disconnects the peer. The producer
    tries to reconnect and immediately gets hit by a subscription storm.
    And so on ad infinitum.


That's why I've written "if connection does no progress". I mean
disconnect if no bytes were sent for 10 seconds, or something like
that. It means that any number of subscriptions can be sent, even if it
would take minutes to upload them. I think in all realistic situations
(up to thousands of subscriptions in up to a few seconds) it will work.
There is an edge case, when you create new subscriptions in a tight
loop and the publisher can't keep up with it. But I don't think it's a
situation that needs to be taken care of.
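
A minimal sketch of the "disconnect only if no progress" rule described above, in Python. The class name, the method names and the injectable clock are hypothetical and not part of nanomsg; the 10-second default is the figure mentioned above.

```python
import time


class ProgressWatchdog:
    """Reports a connection as stalled only when no bytes moved for `timeout` seconds.

    Hypothetical helper for illustration; not a nanomsg API.
    """

    def __init__(self, timeout=10.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_progress = clock()

    def on_bytes_sent(self, n):
        # Any forward progress, however small, resets the timer, so a large
        # subscription set may take minutes to upload without a disconnect.
        if n > 0:
            self.last_progress = self.clock()

    def stalled(self):
        # True once the peer has accepted nothing for the whole timeout.
        return self.clock() - self.last_progress >= self.timeout
```

With this rule the total upload time is unbounded; only the gap between two successful writes is limited.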


It's a single problem IMO. It can be formulated like this:

"Given limited tx buffer (whether in kernel or in user space) whatshould be done when it gets full and user still wants to send newsubscription."

Btw, speaking of realistic situations, I've just spoken to guys who are handling 130,000,000 subscriptions in ZeroMQ :)


Anyway, the problem can be split into 2 parts:

1. How to manage pushback.
2. What to do when it can't be managed any more.

The options for the first are either relying on TCP (problem occurs when tx buffer limit is hit) or building a rate limiting algorithm on top (problem happens when the rate limit is exceeded) -- the latter being basically what you are proposing.

I would say that both are functionally equivalent (i.e. the problem occurs when too much data is sent in too short a time), the only difference being that implementing rate limiting requires more work to be done.
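
For illustration, rate limiting on top of the transport could look roughly like the token bucket below. This is a Python sketch under assumptions: the token bucket is one common choice and not necessarily the algorithm proposed in this thread, and the names `TokenBucket` and `try_send` are hypothetical.

```python
import time


class TokenBucket:
    """Allows `rate` subscriptions per second with bursts of up to `burst`.

    Hypothetical helper for illustration; not a nanomsg API.
    """

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)
        self.stamp = clock()

    def try_send(self, cost=1):
        # Refill according to elapsed time, capped at the burst size.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.stamp) * self.rate)
        self.stamp = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        # Rate limit exceeded: the caller faces the same decision as with
        # a full tx buffer.
        return False
```

The `False` branch is where the two approaches meet: something still has to decide what happens to the subscription that could not be sent.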

The interesting part is what happens when the problem occurs (tx buffer full, rate limit exceeded). The options here are:


1. Drop => results in inconsistent message delivery
2. Pushback => a hung publisher can stop the whole topology
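
The two options can be sketched as policies on a bounded tx buffer. This is a simplified Python model, not nanomsg code; `TxBuffer`, `offer` and `drain` are hypothetical names, and "blocked" stands in for a send call that would not return until space frees up.

```python
from collections import deque


class TxBuffer:
    """Bounded queue of outgoing subscriptions with a full-buffer policy."""

    def __init__(self, capacity, policy):
        if policy not in ("drop", "pushback"):
            raise ValueError("policy must be 'drop' or 'pushback'")
        self.capacity = capacity
        self.policy = policy
        self.queue = deque()
        self.dropped = 0

    def offer(self, subscription):
        if len(self.queue) < self.capacity:
            self.queue.append(subscription)
            return "queued"
        if self.policy == "drop":
            # The peer never learns about this subscription, so message
            # delivery becomes inconsistent.
            self.dropped += 1
            return "dropped"
        # Pushback: the caller must wait until the peer drains the buffer,
        # so a peer that never drains stalls everything behind the caller.
        return "blocked"

    def drain(self, n):
        # The peer accepted up to n queued subscriptions.
        taken = []
        while self.queue and len(taken) < n:
            taken.append(self.queue.popleft())
        return taken
```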

There's also the "reconnect" option, which is just an evil variation on pushback. Instead of waiting to send the remaining few bytes, it disconnects, reconnects and tries to send the whole subscription set anew.
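
A toy model of why reconnecting is worse than waiting, in Python. It assumes, purely for illustration, that the peer accepts a fixed number of subscriptions per round before stalling and that every stall triggers a reconnect; the function name and parameters are hypothetical.

```python
def rounds_to_sync(total, accepted_per_round, policy, max_rounds=1000):
    """Return the number of rounds until all subscriptions are delivered,
    or None if that does not happen within max_rounds."""
    sent = 0
    for rounds in range(1, max_rounds + 1):
        if policy == "reconnect":
            # A new connection starts with the full set outstanding again.
            sent = 0
        sent = min(total, sent + accepted_per_round)
        if sent == total:
            return rounds
    return None
```

Under these assumptions, 100 subscriptions at 30 per round complete in 4 rounds with pushback, while the reconnect policy never completes unless the whole set fits into a single round.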


There seems to be no way out.

If you see any other solution to the problem, please let me know.

Martin
