[nanomsg] Re: The name service for nanomsg

  • From: Paul Colomiets <paul@xxxxxxxxxxxxxx>
  • To: nanomsg@xxxxxxxxxxxxx
  • Date: Sat, 7 Sep 2013 12:44:10 +0300

Hi Alex,

On Sat, Sep 7, 2013 at 12:24 PM, Alex Elsayed <eternaleye@xxxxxxxxx> wrote:
> Currently, a DNS SRV record takes a form like:
> _service._transport.host TTL IN SRV <prio> <weight> <port> <canonical host>
>
> and a query takes the form:
> _service._transport.host
>
> Now, for most usages this is great, since it's common for a service to only
> use one transport. However, since SP can (potentially) operate over arbitrary
> transports, it does not suit us particularly well because in order to look up
> a locator, we'd need to enumerate every transport and query them all because
> the queries are literal.
>
> My suggestion is to see if in the *query* we can wildcard the transport, as:
> _service.*.host
> and have that result in the DNS server returning all results for _service,
> regardless of transport.
>

I don't think that's going to fly. It's much easier to create a
reference implementation of our own name service than to require all
DNS servers to update their implementations.

But there is actually a way to query DNS for a list of addresses today.
You just need to query for PTR records:

$ dig PTR topology1.example.org +short
_nanomsg._tcp.s1.example.org
_nanomsg._tcp.s1.example.org

But then you must resolve each of the returned names with a DNS SRV
query, and combine the results somehow by their priorities and weights.
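
As a rough illustration of that combining step, here is a minimal
sketch that orders already-resolved SRV records: lowest priority
value first, and a weighted random order within each priority, in the
spirit of RFC 2782. The Srv tuple, the order_srv name, and the s2/s3
hosts and port are assumptions made for the example; none of this is
an existing nanomsg API.

```python
import random
from collections import namedtuple

# Hypothetical container for one resolved SRV record.
Srv = namedtuple("Srv", "priority weight port target")

def order_srv(records, rng=random):
    """Order SRV records: ascending priority, weighted-random within a priority."""
    by_priority = {}
    for rec in records:
        by_priority.setdefault(rec.priority, []).append(rec)

    ordered = []
    for priority in sorted(by_priority):
        # Put zero-weight records first so they are rarely picked early.
        group = sorted(by_priority[priority], key=lambda r: r.weight)
        while group:
            total = sum(r.weight for r in group)
            pick = rng.randint(0, total)
            running = 0
            for i, rec in enumerate(group):
                running += rec.weight
                if running >= pick:
                    ordered.append(group.pop(i))
                    break
    return ordered

# Example data; s2, s3 and the port number are made up for illustration.
records = [
    Srv(20, 0, 5555, "s3.example.org"),
    Srv(10, 60, 5555, "s1.example.org"),
    Srv(10, 40, 5555, "s2.example.org"),
]
for rec in order_srv(records):
    print(rec.priority, rec.weight, rec.target)
```

The client would then try the targets in the returned order. Records
coming from different PTR targets are simply pooled here; whether that
is the right way to merge them is an open question.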

> That solves the ID to locator issue handily.
>
> Bind vs. connect is a thorny problem, because it's not just a parameter that
> can be set unilaterally. It's a role to play in another, lower-layer
> distributed algorithm of bind/listen/connect/accept.
>
> Transitioning from bind to connect (or vice versa) while running is honestly
> something I don't think any admin will do without having some serious wibblies
> about it, at least in part because while the transition is running you
> suddenly have two incompatible topologies sharing a name - two hosts set to
> connect() cannot communicate directly, nor can two set to bind().
>

There is a scenario we practice every day. In a development setup we
bind a single worker to a known port. In production it connects to the
device. So it's definitely an admin-level transition.
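
To make the scenario concrete, here is a minimal sketch in which the
worker code stays the same and only a configured endpoint string
decides between bind and connect. The "bind:"/"connect:" prefix
format, the attach helper, and the RecordingSocket stand-in are all
assumptions for illustration; a real worker would pass in an actual
socket object that has bind() and connect() methods.

```python
def attach(sock, endpoint):
    """Bind or connect `sock` according to the mode prefix of `endpoint`."""
    mode, sep, addr = endpoint.partition(":")
    if not sep or mode not in ("bind", "connect"):
        raise ValueError(
            "expected 'bind:<addr>' or 'connect:<addr>', got %r" % endpoint)
    # Dispatch to sock.bind(addr) or sock.connect(addr).
    getattr(sock, mode)(addr)
    return mode, addr

class RecordingSocket:
    """Stand-in that only records calls, so the sketch runs without nanomsg."""
    def __init__(self):
        self.calls = []

    def bind(self, addr):
        self.calls.append(("bind", addr))

    def connect(self, addr):
        self.calls.append(("connect", addr))

# Development: the single worker binds to a known port.
dev = RecordingSocket()
attach(dev, "bind:tcp://127.0.0.1:5555")

# Production: the same worker connects to the device (hypothetical host).
prod = RecordingSocket()
attach(prod, "connect:tcp://device.example.org:5555")
```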

> I strongly suspect that best practices, even if bind/connect is available to
> the admin, would quickly converge on "always deploy a device as bind() and
> have the ends connect(). Rolling out another device for failover/load
> balancing later is just adding a new DNS record; switching bind() to connect()
> on the endpoints is more pain than it's worth."
>

What about two devices in a chain?

-- 
Paul
