[nanomsg] Re: Trying to implement "directory" pattern

  • From: Paul Colomiets <paul@xxxxxxxxxxxxxx>
  • To: Martin Sustrik <sustrik@xxxxxxxxxx>
  • Date: Sat, 2 Mar 2013 01:56:39 +0200

Hi Martin,


On Fri, Mar 1, 2013 at 10:10 AM, Martin Sustrik <sustrik@xxxxxxxxxx> wrote:

> On 28/02/13 10:18, Paul Colomiets wrote:
>
>> The difference is the buffer limit. How would you tune TCP buffering
>> to handle 130 million subscriptions? I think if you do, then the
>> machine will be open to a DoS attack very easily.
>>
>
> The question here is whether we should even try to push 130M subscriptions
> to a TCP connection in one go. Maybe there's a viable way to iterate
> through the connection map and send subscriptions as bandwidth becomes
> available.


This idea came to my mind several times. But the problem is more
complex, as subscriptions change on the fly. While I still think it's
technically possible to reduce this to an iterator plus a small buffer
(a list of recent unsubscriptions in already-visited branches; see the
sketch after the list below), I don't think it's worthwhile, for the
following reasons:

1. You promised pluggable filters. This is a very complex task to expect
every plugin to implement.

2. It's a complex task with a dozen edge cases. Simple solutions are
usually more reliable.

3. It is an optimization that can be done later without affecting users.
It can be done when its need is demonstrated, and when there is at least
one big customer that will use it at real scale.
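
To make this concrete, here is a toy model of the iterator-plus-buffer
approach (plain C, invented names, nothing from the nanomsg codebase):
the subscription set is streamed one entry per send opportunity, and an
unsubscription that falls behind the cursor is queued so the peer's
view stays consistent:

#include <stdio.h>
#include <string.h>

#define MAX_SUBS 8
#define MAX_PENDING 8

static const char *subs[MAX_SUBS] = {"a", "ab", "b", "c"}; /* sorted set */
static int nsubs = 4;
static int cursor = 0;                   /* next subscription to send */
static const char *pending[MAX_PENDING]; /* unsubs behind the cursor */
static int npending = 0;

/* Remove a subscription; only entries already behind the cursor need
   buffering, the rest simply never get sent. */
static void unsubscribe (const char *key)
{
    for (int i = 0; i < nsubs; i++) {
        if (strcmp (subs[i], key) == 0) {
            memmove (&subs[i], &subs[i+1],
                (nsubs - i - 1) * sizeof (subs[0]));
            nsubs--;
            if (i < cursor) {
                /* a real implementation would drop the connection if
                   this side-buffer overflowed */
                pending[npending++] = key;
                cursor--;   /* the set shrank before the cursor */
            }
            return;
        }
    }
}

/* One send opportunity on the pipe: pending unsubs take priority. */
static void send_step (void)
{
    if (npending > 0)
        printf ("UNSUB %s\n", pending[--npending]);
    else if (cursor < nsubs)
        printf ("SUB %s\n", subs[cursor++]);
}

int main (void)
{
    send_step ();        /* SUB a   */
    send_step ();        /* SUB ab  */
    unsubscribe ("a");   /* behind the cursor: must be announced */
    send_step ();        /* UNSUB a */
    send_step ();        /* SUB b   */
    send_step ();        /* SUB c   */
    return 0;
}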


>
>>> 2. Pushback => hanged up publisher can stop the whole topology
>>
>> Why would it stop? It will result in "inconsistent message delivery"
>> until the subscriptions are eventually sent. As subscriptions are
>> usually aggregated on intermediaries, I don't think there are use
>> cases where the subscription pipe is in a "pushback state" all the
>> time. So what we need is some sign to sysadmins that pushback happens
>> (and that's a separate topic).
>>
>>
>
> It may happen in the case of a hanged-up application. It doesn't read
> messages, so TCP pushback is applied. The next message cannot be sent.
> It cannot be stored either, as we are out of buffer space. It cannot be
> dropped, as we want the transport to be reliable. So the only thing to
> do is to stop sending new messages. That means that messages aren't
> sent even to the well-behaved peers. That way the failure propagates
> from the hanged-up application "sideways".
>
>
Technically yes. But looking at the problem a bit more broadly, the
picture is the following:

1. I assume that subscriptions take a much smaller amount of memory than
the memory needed to process the messages, so keeping another buffer for
subscriptions is not a problem. A small demonstration: if you have many
publishers, you keep a trie of subscriptions per publisher, and most of
the time there is only one slow/inactive publisher (or your admins are
dumb :) ). So you only duplicate the subscriptions of a single
publisher, which is a fraction of the whole socket memory. Also, a
buffer is usually more compact than a trie (though not always).

2. Let's keep at most 2x the size of the trie worth of subscription
updates in the buffer. If the next subscription would exceed that limit,
kill the connection. On reconnect there is at most half that much to
send, since only the current subscription set is resent. This solves the
situation you described: "Instead of waiting for sending the remaining
few bytes, it disconnects, reconnects and tries to send the whole
subscription set anew". Of course the 2x size can be tuned to something
nicer (a toy sketch of this rule follows below).

This way a slow publisher can never block the others, and the
reconnection problem is worked around in a smooth way.
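
A toy illustration of that rule (assumed numbers and names, not nanomsg
internals): updates for a slow peer are buffered until they exceed twice
the live trie size, at which point the connection is dropped and a
reconnect starts over from the (smaller) full set:

#include <stdbool.h>
#include <stdio.h>

static size_t trie_size = 100000;  /* live subscriptions in the trie */
static size_t buffered = 0;        /* updates queued for one slow peer */

/* Queue one subscription update for the slow peer; returns false when
   the 2x limit is exceeded and the connection should be killed. */
static bool queue_update (void)
{
    if (buffered + 1 > 2 * trie_size) {
        buffered = 0;   /* drop the connection; the reconnect resends
                           the whole trie, at most half this size */
        return false;
    }
    buffered++;
    return true;
}

int main (void)
{
    size_t sent = 0;
    while (queue_update ())   /* peer never reads: fill to the cap */
        sent++;
    printf ("queued %zu updates, then killed the connection\n", sent);
    return 0;
}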


>>> There seems to be no way out.
>>>
>>> If you see any other solution to the problem, please let me know.
>>
>> You are too pessimistic :) Can you ask the guys who have millions of
>> subscriptions in zeromq a few questions:
>>
>> 1. Do you use subscription forwarding?
>> 2. Does zeromq solve the task well, or are there problems with the
>> zeromq implementation?
>> 3. What HWMs and SND/RCVBUFs are set?
>> 4. How much memory is used by subscriptions (if it's possible to
>> estimate)?
>>
>
> OK. Will do.


One final question, if it's not too late: can pluggable filters make the
number of subscriptions much lower in their case? (I imagine some
thousands of filters can be replaced by a single regexp, or some other
kind of rule.)
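
For illustration only (POSIX regex, hypothetical topic names; the
pluggable filter interface itself doesn't exist yet), the kind of
collapse I mean, one pattern standing in for a subscription per sensor
id:

#include <regex.h>
#include <stdio.h>

int main (void)
{
    regex_t re;
    /* One pattern instead of one subscription per sensor id. */
    regcomp (&re, "^sensor\\.[0-9]+\\.temperature$",
        REG_EXTENDED | REG_NOSUB);

    const char *topics[] = {
        "sensor.42.temperature",   /* matches  */
        "sensor.42.humidity",      /* no match */
    };
    for (int i = 0; i < 2; i++)
        printf ("%s -> %s\n", topics[i],
            regexec (&re, topics[i], 0, NULL, 0) == 0 ?
                "deliver" : "drop");

    regfree (&re);
    return 0;
}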


-- 
Paul
