[nanomsg] Re: ZMQ_ROUTER like functionality

  • From: Alex Elsayed <eternaleye@xxxxxxxxx>
  • To: nanomsg@xxxxxxxxxxxxx
  • Date: Thu, 18 Sep 2014 11:30:07 -0700

Paul Colomiets wrote:

> Hi Alex,
> 
> On Thu, Sep 18, 2014 at 8:40 AM, Alex Elsayed
> <eternaleye@xxxxxxxxx> wrote:
>> Looking at your mails for DIRECTORY, it almost seems like adding
>> subscriptions/filters (with sender-side suppression for non-matching) to
>> SURVEY would give pretty much exactly The Right Thing, and sounds like a
>> useful primitive on its own.
> 
> Yes, except I'm not sure it's similar to the SURVEY pattern. I believe
> it's more like the REQ/REP pattern, in the sense that matched requests
> are load-balanced between respective workers, instead of being sent to
> every matching worker (and David's use case seems to match mine).

Mm, I see David's case as distinct - from what I understood, his _data 
access_ is stateless, but _which respondent has the data_ doesn't fit the 
stateless routing of REQ/REP. Since he said he does want to add replication, 
that makes me think SURVEY+subscriptions is the right primitive.
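To make "SURVEY+subscriptions" concrete, here is a rough sketch in Python of 
sender-side suppression with prefix matching. Nothing here is nanomsg API: the 
respondent names, the prefix sets and both helper functions are hypothetical, 
and I'm assuming a subscription is simply a string prefix of the query key.

```python
def matches(subscriptions, topic):
    """True if any subscribed prefix is a prefix of the topic."""
    return any(topic.startswith(prefix) for prefix in subscriptions)


def survey_targets(respondents, topic):
    """Sender-side suppression: pick only respondents subscribed to the topic."""
    return sorted(name for name, subs in respondents.items()
                  if matches(subs, topic))


# Hypothetical layout: keys starting with "b" are replicated on r1 and r2.
respondents = {
    "r1": {"a", "b"},
    "r2": {"b", "c"},
    "r3": {"c"},
}

print(survey_targets(respondents, "banana"))  # ['r1', 'r2']
```

With replication, every replica holding the key gets the survey, and 
respondents that cannot answer never see it.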

That said, I agree that DIRECTORY as described in your mails (with stateful 
data) is a different use case. Still, I would argue that as soon as multiple 
clients access stateful data, you need an internal consistency protocol, 
not just a guarantee of single-access-per-query.

In particular, single-access doesn't do anything to protect against network 
partitions where two clients on either side of the partition mutate the 
data.

And once you have an internal consistency protocol, single-access is no 
longer necessary at all.

> However, sometimes I see subscriptions as a separate building block,
> which can be applied to all pipeline, reqrep, and survey patterns, and
> for all of them it makes sense. What do you think?

I agree to such a degree that I'm currently writing a message for the list 
about a general protocol for upstreaming prefix subscriptions (bidirectional 
transports only) which tries to avoid the problems Martin has brought up in 
the past (sideways propagation of error in particular).

:D

>> As for how the subscriptions would actually go upstream, maybe they start
>> with a default subscription of zero length (matches anything in a trie,
>> and doesn't break back-compat), they can add more subscriptions, and when
>> they're done they (explicitly) unsubscribe from ""?
> 
> I'm not sure why you need a subscription to start with. Let me
> describe my use case better.

The reason for the initial subscription is mainly so that simply joining a 
topology works in the same way as it currently does - I don't want to break 
ABI. Might be worthwhile having a connect()-time flag to disable it.

Then, if you _have_ an initial subscription list, you need to clear it 
_after_ setting up narrower subscriptions, otherwise you have a race 
condition that may put "holes" in the stream. Missing all before time t is 
fine; receiving, then missing, then receiving again violates the principle 
of least surprise.
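
A toy replay of the two orderings shows the hole. This is only a sketch under 
my own assumptions: the event tuples and the delivered() helper are made up 
for illustration, and the default subscription is modelled as the empty 
prefix "".

```python
def delivered(events):
    """Replay ("sub", p), ("unsub", p) and ("msg", topic) events in order
    and return the topics that would have been delivered."""
    subs = {""}  # assumed default: the zero-length prefix matches everything
    out = []
    for kind, value in events:
        if kind == "sub":
            subs.add(value)
        elif kind == "unsub":
            subs.discard(value)
        elif kind == "msg" and any(value.startswith(p) for p in subs):
            out.append(value)
    return out


# Narrow first, then drop "": nothing wanted is lost in between.
narrow_first = [("msg", "k1"), ("sub", "k"), ("msg", "k2"),
                ("unsub", ""), ("msg", "k3"), ("msg", "r1")]

# Drop "" first, then narrow: "k2" arrives while nothing is subscribed.
clear_first = [("msg", "k1"), ("unsub", ""), ("msg", "k2"),
               ("sub", "k"), ("msg", "k3"), ("msg", "r1")]

print(delivered(narrow_first))  # ['k1', 'k2', 'k3']
print(delivered(clear_first))   # ['k1', 'k3']
```

The second ordering receives, misses, then receives again, which is exactly 
the surprising case.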

> Imagine we have three machines (or better, three clusters, but I
> skip that detail) that serve some articles. Machine S1 serves
> articles whose titles start with A to J, machine S2 serves articles
> K-Q and machine S3 serves articles R-Z.
> 
> Now let's imagine we have S2 and S3 down. When S2 comes online, if
> it's first subscribed to the empty string, it will get requests for
> K-Q and for R-Z. For the latter group of articles it has no meaningful
> answer, so I would have to filter out those requests. Note that,
> compared with pub-sub, sending a request to the wrong recipient is
> much more expensive, since it leaves the request waiting for a timeout
> at the client (and the timeout is usually an order of magnitude larger
> than the expected response time).

Note that the timeout is (in a sense) an artifact of the API, not the 
protocol. In particular, I'm tossing around a design for a pure-Rust 
implementation (like mangos was for Go), which creates individual exchanges 
from AF_SP_RAW objects.

i.e. you have a SURVEYOR object, you call .ask(query) on it, and this 
returns a Survey. The Survey has a .iter() method, allowing you to iterate 
over responses (lazy lists!) as they arrive, and signaling end-of-list at 
the timeout.

This makes it trivial for a caller to act on the first response, and not 
wait for the timeout.
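
Roughly the shape I have in mind, sketched in Python for brevity rather than 
Rust. Everything here is hypothetical: the Surveyor and Survey classes, the 
timeout parameter, and the use of plain callables and a queue.Queue as a 
stand-in for the transport. It only illustrates the lazy-iteration idea.

```python
import queue
import time


class Survey:
    """One survey exchange; iterating yields responses until the deadline."""

    def __init__(self, responses, timeout):
        self._responses = responses  # queue fed by the (stand-in) transport
        self._deadline = time.monotonic() + timeout

    def __iter__(self):
        while True:
            remaining = self._deadline - time.monotonic()
            if remaining <= 0:
                return  # deadline passed: end of list
            try:
                item = self._responses.get(timeout=remaining)
            except queue.Empty:
                return  # nothing more arrived before the deadline
            yield item


class Surveyor:
    def __init__(self, respondents):
        # Callables standing in for remote respondents.
        self._respondents = respondents

    def ask(self, query, timeout=0.2):
        responses = queue.Queue()
        # A real transport would deliver these asynchronously.
        for respond in self._respondents:
            responses.put(respond(query))
        return Survey(responses, timeout)


surveyor = Surveyor([lambda q: q.upper(), lambda q: q[::-1]])

# Act on the first response without waiting for the timeout.
first = next(iter(surveyor.ask("ping")))
print(first)  # PING
```

A caller that wants every response just exhausts the iterator, and pays the 
timeout only in that case.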

> Well, in fact if you see it as being like SURVEY, i.e. when a request
> is sent to many workers, then an empty subscription at the start makes
> sense.

I'd argue that because of the bit above about holes in the stream, it makes 
sense for PUB/SUB too.

> IIRC, our discussion with Martin came to the conclusion that it's
> possible to make subscriptions good enough for the pattern without
> additional semantics. I can recall the details if somebody tries to
> implement it.

Well, I'm going to be writing a mail about a generic subscription scheme, so 
your details would be nice to see.


