On Saturday, September 07, 2013 07:51:11 AM Martin Sustrik wrote:
> Hi Nico,
>
> > For some use cases I'd think an IGMP-like protocol would be best.
> > Specifically for pub/sub, where a URI might denote/resolve to one or
> > more group membership management services, where publishers, routers,
> > and subscribers join as such and where the topology is worked out
> > dynamically (this is easy enough for publishers and subscribers, but a
> > bit harder for routers as real topology information would be nice to
> > have but possibly difficult to extract from the network).
>
> As an analogy, I would say pub/sub subscriptions basically work as IGMP.
>
> As for the name service (btw, we should think of a better name for the
> thing) the analogy is more like a network admin running around with
> cables, plugging in switches etc.
>
> The core difference being that the former is fully automatic, while the
> latter is fully human-driven.

As far as names go, "topology rendezvous service" or "topology parameter distribution service" are more meaningful - but that is really messy, because it describes something trying to solve too many things at once in my view.

The thing is, I really do think we will want to separate even the admin part into multiple layers. There are really a couple of distinct classes of data we want to let the admin control (phrased from the POV of a topology participant):

* "The other side of this interaction has property X"
* "I have property Y"
* "The relationship between myself and the other side has property Z"

In general, X is better off being queried (DNS is the canonical example).

In general, Y is better off being propagated to the participants ahead of time (DHCP handing out addresses).

In general, Z is a pain and a half because it requires global knowledge, not only of the *desired* state but of the *current* state as well, which can change dynamically and have emergent properties.
So far, the discussion has lumped all these together and tried to solve them all in one go. Even if we do provide a single API to the *coder*, it may be best to have different protocols for these things rather than one overarching protocol.

The actual tasks being discussed so far basically amount to:

* Mapping from an ID to a locator (Very much X)
* Bind vs Connect (Z)
* Parameters (A mix of X, Y, and Z depending on the parameter)

DNS SRV + getting a query-side wildcarding capability may be the nicest way to solve the ID to locator bit. It requires a change to SRV via the IETF, but since it's in the interactions rather than the record format it might be easier to swing.

I don't think I explained this idea adequately in my other mail, so I'll correct that now. Currently, a DNS SRV record takes a form like:

    _service._transport.host TTL IN SRV <prio> <weight> <port> <canonical host>

and a query takes the form:

    _service._transport.host

Now, for most usages this is great, since it's common for a service to only use one transport. However, since SP can (potentially) operate over arbitrary transports, it does not suit us particularly well because in order to look up a locator, we'd need to enumerate every transport and query them all because the queries are literal.

My suggestion is to see if in the *query* we can wildcard the transport, as:

    _service.*.host

and have that result in the DNS server returning all results for _service, regardless of transport. That solves the ID to locator issue handily.

Bind vs. connect is a thorny problem, because it's not just a parameter that can be set unilaterally. It's a role to play in another, lower-layer distributed algorithm of bind/listen/connect/accept.
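To make the wildcarded SRV query from earlier in this mail a bit more concrete, here is a rough Python sketch of the matching I have in mind. It is only an in-memory model: no DNS server or resolver library is involved, and nothing here reflects current SRV behavior. The names `SrvRecord`, `matches`, `lookup`, and `ZONE`, as well as the example service, hosts, and ports, are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    """Hypothetical in-memory stand-in for one SRV record."""
    owner: str      # "_service._transport.host"
    priority: int
    weight: int
    port: int
    target: str     # canonical host


def matches(query: str, owner: str) -> bool:
    """Compare label by label; a '*' label in the QUERY matches any one label.

    This is the proposed (not existing) query-side wildcard.
    """
    q_labels = query.lower().rstrip(".").split(".")
    o_labels = owner.lower().rstrip(".").split(".")
    if len(q_labels) != len(o_labels):
        return False
    return all(q == "*" or q == o for q, o in zip(q_labels, o_labels))


def lookup(zone, query):
    """Return every matching record, lowest priority value first.

    Weight-based selection within a priority is deliberately left out.
    """
    hits = [r for r in zone if matches(query, r.owner)]
    return sorted(hits, key=lambda r: r.priority)


# Made-up zone contents, purely for illustration.
ZONE = [
    SrvRecord("_myservice._tcp.example.org", 10, 50, 5555, "a.example.org"),
    SrvRecord("_myservice._sctp.example.org", 20, 50, 5556, "a.example.org"),
    SrvRecord("_other._tcp.example.org", 10, 50, 80, "b.example.org"),
]

if __name__ == "__main__":
    # One wildcarded query instead of one literal query per transport.
    for rec in lookup(ZONE, "_myservice.*.example.org"):
        print(rec.owner, rec.port, rec.target)
```

A literal query such as `_myservice._tcp.example.org` still goes through the same function and matches only that one transport, so the wildcard is purely additive on the query side and the record format is untouched.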
Transitioning from bind to connect (or vice versa) while running is honestly something I don't think any admin will do without having some serious wibblies about it, at least in part because while the transition is running you suddenly have two incompatible topologies sharing a name - two hosts set to connect() cannot communicate directly, nor can two set to bind().

I strongly suspect that best practices, even if bind/connect is available to the admin, would quickly converge on "always deploy a device as bind() and have the ends connect(). Rolling out another device for failover/load balancing later is just adding a new DNS record; switching bind() to connect() on the endpoints is more pain than it's worth."

This is especially true because of the behavior of devices - it makes very little sense for a device to do anything other than bind(), so as soon as they roll out the first device every endpoint in the topology will use connect().

As far as parameters go, I'm not sure what the best solution is (primarily due to how some are X, some are Y, and some are Z).

Thoughts?