Paul Colomiets wrote:

> Hi Alex,
>
> On Thu, Sep 18, 2014 at 8:40 AM, Alex Elsayed
> <eternaleye@xxxxxxxxx> wrote:
>> Looking at your mails for DIRECTORY, it almost seems like adding
>> subscriptions/filters (with sender-side suppression for non-matching) to
>> SURVEY would give pretty much exactly The Right Thing, and sounds like a
>> useful primitive on its own.
>
> Yes, except I'm not sure it's similar to the SURVEY pattern. I believe
> it's more like the REQ/REP pattern, in the sense that matched requests are
> load-balanced between the respective workers, instead of being sent to
> every matching worker (and David's use case seems to match mine).

Mm, I see David's case as distinct - from what I understood, his _data
access_ is stateless, but _which respondent has the data_ doesn't fit the
stateless routing of REQ/REP. Since he said he does want to add replication,
that makes me think SURVEY+subscriptions is the right primitive.

I agree, though, that DIRECTORY as described in your mails (with stateful
data) is a different use case. Still, I would argue that as soon as you have
multiple access to stateful data, you need an internal consistency protocol,
not just a guarantee of single-access-per-query. In particular,
single-access does nothing to protect against network partitions where two
clients on either side of the partition mutate the data. And once you have
an internal consistency protocol, single-access is no longer necessary at
all.

> However, sometimes I see subscriptions as a separate building block,
> which can be applied to all pipeline, reqrep, and survey patterns, and
> for all of them it makes sense. What do you think?
I agree to such a degree that I'm currently writing a message for the list
about a general protocol for upstreaming prefix subscriptions (bidirectional
transports only) which tries to avoid the problems Martin has brought up in
the past (sideways propagation of error in particular) :D

>> As for how the subscriptions would actually go upstream, maybe they start
>> with a default subscription of zero length (matches anything in a trie,
>> and doesn't break back-compat), they can add more subscriptions, and when
>> they're done they (explicitly) unsubscribe from ""?
>
> I'm not sure why you need a subscription to start with. Let me
> describe my use case better.

The reason for the initial subscription is mainly so that simply joining a
topology works the same way it currently does - I don't want to break ABI.
It might be worthwhile to have a connect()-time flag to disable it.

Then, if you _have_ an initial subscription list, you need to clear it
_after_ setting up narrower subscriptions; otherwise you have a race
condition that may put "holes" in the stream. Missing everything before
time t is fine; receiving, then missing, then receiving again violates the
principle of least surprise.

> Imagine we have three machines (or better, three clusters, but I
> skip that detail) that serve some articles. The machine S1 serves
> articles whose titles start with A to J, machine S2 serves articles
> K-Q and machine S3 serves articles R-Z.
>
> Now let's imagine we have S2 and S3 down. When S2 comes online, if
> it's first subscribed to the empty string, it will get requests for K-Q
> and for R-Z. For the latter group of articles it has no meaningful
> answer, so I would have to filter out those requests. Note that
> compared with pub-sub, sending a request to the wrong recipient is
> much more expensive, since it leaves the request waiting for a timeout
> at the client (and the timeout is usually an order of magnitude larger
> than the expected response time).
Note that the timeout is (in a sense) an artifact of the API, not the
protocol. In particular, I'm tossing around a design for a pure-Rust
implementation (like mangos was for Go), which creates individual exchanges
from AF_SP_RAW objects: you have a SURVEYOR object, you call .ask(query)
on it, and this returns a Survey. The Survey has a .iter() method, allowing
you to iterate over responses (lazy lists!) as they arrive, and signaling
end-of-list at the timeout. This makes it trivial for a caller to act on the
first response, and not wait for the timeout.

> Well, in fact if you see it's like a SURVEY, i.e. when request is sent
> to many workers, then empty subscription at the start makes sense.

I'd argue that because of the bit above about holes in the stream, it makes
sense for PUB/SUB too.

> IIRC, our discussion with Martin has come to the conclusion that it's
> possible to make subscriptions good enough for the pattern without
> additional semantics. I can recall details if somebody would try to
> implement it.

Well, I'm going to be writing a mail about a generic subscription scheme, so
your details would be nice to see.
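
To make the .ask()/.iter() idea above a bit more concrete, here is a rough
sketch using only the Rust standard library. Everything in it is an
assumption made for illustration: the Surveyor, Survey and FakeRespondent
types, their fields, and the demo() helper are hypothetical names, and the
respondents are simulated with threads and a channel rather than real
AF_SP_RAW sockets. It only shows the shape of the API: responses are yielded
lazily as they arrive, and the deadline is what signals end-of-list.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;
use std::time::{Duration, Instant};

/// Stand-in for a remote respondent: a canned reply plus a simulated delay.
/// (Hypothetical; a real implementation would talk to sockets instead.)
#[derive(Clone)]
pub struct FakeRespondent {
    pub reply: &'static str,
    pub delay: Duration,
}

/// Hypothetical SURVEYOR-side object.
pub struct Surveyor {
    pub respondents: Vec<FakeRespondent>,
    pub timeout: Duration,
}

/// One in-flight exchange, as returned by `Surveyor::ask`.
pub struct Survey {
    responses: Receiver<String>,
    deadline: Instant,
    // Keeps the channel open so that only the deadline ends the list.
    _keepalive: Sender<String>,
}

impl Surveyor {
    /// Sends `query` to every simulated respondent and returns immediately.
    pub fn ask(&self, query: &str) -> Survey {
        let (tx, rx) = mpsc::channel();
        for r in self.respondents.clone() {
            let tx = tx.clone();
            let query = query.to_string();
            thread::spawn(move || {
                thread::sleep(r.delay);
                // Fails harmlessly if the Survey has already been dropped.
                let _ = tx.send(format!("{} -> {}", query, r.reply));
            });
        }
        Survey {
            responses: rx,
            deadline: Instant::now() + self.timeout,
            _keepalive: tx,
        }
    }
}

impl Survey {
    /// Lazily yields responses as they arrive; returns `None` (end-of-list)
    /// once the survey deadline has passed.
    pub fn iter(&self) -> impl Iterator<Item = String> + '_ {
        std::iter::from_fn(move || {
            let remaining = self.deadline.checked_duration_since(Instant::now())?;
            self.responses.recv_timeout(remaining).ok()
        })
    }
}

/// Runs two surveys: one takes only the first reply, one counts all replies
/// that arrive before the deadline.
pub fn demo() -> (Option<String>, usize) {
    let surveyor = Surveyor {
        respondents: vec![
            FakeRespondent { reply: "S1", delay: Duration::from_millis(10) },
            FakeRespondent { reply: "S2", delay: Duration::from_millis(100) },
            // Deliberately slower than the timeout, so it is never seen.
            FakeRespondent { reply: "S3", delay: Duration::from_secs(5) },
        ],
        timeout: Duration::from_millis(500),
    };
    // Acting on the first response does not wait for the timeout.
    let first = surveyor.ask("who-has:K").iter().next();
    // Draining the iterator blocks until the deadline, then stops.
    let count = surveyor.ask("who-has:K").iter().count();
    (first, count)
}
```

In this sketch a caller that only wants the first answer calls
.iter().next() and moves on, while a caller that wants everything simply
drains the iterator and pays the full timeout. The timeout lives in the
Survey object rather than in the wire protocol, which is the point I was
trying to make above.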