[nanomsg] Re: Poll: async dns on connect?

  • From: Martin Sustrik <sustrik@xxxxxxxxxx>
  • To: nanomsg@xxxxxxxxxxxxx
  • Date: Wed, 18 May 2016 11:09:49 +0200

Btw, I've got good results with dns.c in libmill:

http://25thandclement.com/~william/projects/dns.c.html

It's a single C file with no dependencies. It does async DNS resolution and works on Linux, OS X and the BSD variants. It doesn't do its own threading; instead it provides the user with a file descriptor to wait on. Compatible licensing.

Documentation is somewhat lacking, but here's a working function that does DNS name resolution; feel free to copy/paste it:

https://github.com/sustrik/libmill/blob/master/ip.c#L251

Martin

On 2016-05-18 10:50, Roman Puls wrote:

Hi Garrett,

IMO blocking resolvers are an absolute no-go, as they let the
application stall for no good reason. If you want to be sure that a
given DNS name exists, the application can do a blocking resolution
first, if really needed.

To my knowledge, every serious networking implementation uses async
DNS resolution: a pure async resolver implementation if one is
available, with threaded getaddrinfo or getaddrinfo_a (which is the
same, just hiding away the threads) as a fallback. I highly recommend
looking at libcurl or libevent to see how things are being done. It's
generally a good idea to use c-ares, if available, since this is solid
and widely tested code, and to just take the naive approach via threads
if such a library is not available.

Also, it should be mandatory to resolve the hostnames again each time,
and not to cache the DNS results, as caching will surely lead to
unexpected / faulty behaviour.

Thanks and regards,
  Roman



On 17.05.2016 at 05:10, Garrett D'Amore wrote:
This is a poll.

Today libnanomsg on *Linux* (and only Linux) performs an asynchronous DNS lookup using getaddrinfo_a().

This lookup is done after control has returned to your program.

There are advantages and disadvantages to this.

On the pro side, your application will not “stall” waiting for DNS (only true for Linux at the moment, btw), even if you need to open another pipe or do another DNS lookup.

It's also the case that the above is “responsive”, in that if the DNS name for the remote server changes, and the client encounters a disconnection, it will do the lookup anew, picking up the changed name. This could have some resiliency benefits in some applications. (Note that libnanomsg does this, but mangos does not.)

On the con side, the above only works for Linux (for non-Linux nodes the resolution can still stall the library, but the self-healing part is still true). You also might not *want* the application to get a different IP address in the future. And the biggest drawback of all is that the application has no idea, when it tries to connect, that the name given does not resolve to a valid IP.

None of this behavior is documented.

I’d like to consider changing this, so that it behaves more like mangos. In this case the DNS lookup would be performed *synchronously*, before returning a response to the client. If the DNS resolution fails, the application would get an error code back.

I think for most apps, the synchronous failure mode is more useful. Note that it is always possible for an application to do its own DNS lookups and provide IP-based URLs.
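As a rough illustration of that last point, here is a minimal sketch of an application doing its own blocking lookup and building an IP-based URL before handing it to the library. The helper name resolve_to_tcp_url() is hypothetical and not part of nanomsg; the sketch assumes IPv4 only and the plain "tcp://address:port" URL form, and it simply takes the first address getaddrinfo() returns.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper, not part of nanomsg: resolve `host` synchronously
 * and write "tcp://<ipv4>:<port>" into `out`.
 * Returns 0 on success, otherwise a getaddrinfo()-style EAI_* error code. */
int resolve_to_tcp_url(const char *host, const char *port,
                       char *out, size_t outlen)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    char ip[INET_ADDRSTRLEN];
    int rc, n;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* assumption: IPv4 only */
    hints.ai_socktype = SOCK_STREAM;

    /* Blocks until the resolver answers or gives up. */
    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0)
        return rc;                   /* the caller learns about a bad name here */

    /* Use the first returned address. */
    const struct sockaddr_in *sin = (const struct sockaddr_in *)res->ai_addr;
    const char *p = inet_ntop(AF_INET, &sin->sin_addr, ip, sizeof ip);
    freeaddrinfo(res);
    if (p == NULL)
        return EAI_SYSTEM;

    n = snprintf(out, outlen, "tcp://%s:%s", ip, port);
    if (n < 0 || (size_t)n >= outlen)
        return EAI_OVERFLOW;         /* output buffer too small */
    return 0;
}
```

The resulting string could then be passed to nn_connect() in place of the name-based URL, and a non-zero return can be turned into a message with gai_strerror(). The trade-off is the one described above: the address is fixed at the moment of the lookup.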

If folks prefer the asynchronous approach, I can fix it by running getaddrinfo() in separate threads, so that we get the benefit of parallel DNS lookups. But of course the asynchronous mode means that applications that give a completely bogus name will never find out about it.
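For reference, the thread-based variant might look roughly like the sketch below. This is not libnanomsg code: struct dns_req, dns_start() and dns_wait() are hypothetical names, and the pipe used for completion notification is just one possible design, chosen so that the caller gets a file descriptor it can poll alongside its sockets. It assumes POSIX threads (build with -pthread).

```c
#include <netdb.h>
#include <poll.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request record; none of these names come from nanomsg. */
struct dns_req {
    const char *host;        /* name to resolve; must outlive the request */
    struct addrinfo *result; /* set by the worker on success */
    int rc;                  /* getaddrinfo() return code */
    int pipefd[2];           /* worker writes one byte to pipefd[1] when done */
    pthread_t thread;
};

static void *dns_worker(void *arg)
{
    struct dns_req *req = arg;
    struct addrinfo hints;
    char done = 1;

    memset(&hints, 0, sizeof hints);
    hints.ai_socktype = SOCK_STREAM;
    /* Only this worker thread blocks on the resolver. */
    req->rc = getaddrinfo(req->host, NULL, &hints, &req->result);
    if (write(req->pipefd[1], &done, 1) != 1) {
        /* Nothing useful to do in this sketch; the waiter will time out. */
    }
    return NULL;
}

/* Start a lookup in the background. Returns 0 on success, -1 on failure.
 * On success, req->pipefd[0] becomes readable once the lookup has finished. */
int dns_start(struct dns_req *req, const char *host)
{
    req->host = host;
    req->result = NULL;
    req->rc = 0;
    if (pipe(req->pipefd) != 0)
        return -1;
    if (pthread_create(&req->thread, NULL, dns_worker, req) != 0) {
        close(req->pipefd[0]);
        close(req->pipefd[1]);
        return -1;
    }
    return 0;
}

/* Wait up to timeout_ms for completion.
 * Returns 1 if finished (req->rc and req->result are then valid),
 * 0 on timeout, -1 on poll() error. */
int dns_wait(struct dns_req *req, int timeout_ms)
{
    struct pollfd pfd = { .fd = req->pipefd[0], .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);

    if (n <= 0)
        return n;
    pthread_join(req->thread, NULL);  /* worker has already written its byte */
    close(req->pipefd[0]);
    close(req->pipefd[1]);
    return 1;
}
```

Several requests can be started back to back and their read ends polled together, which is where the parallelism comes from. The caller is expected to freeaddrinfo() the result once it is done with it.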

One *could* imagine a *completely* synchronous nn_connect() as well, which would attempt to establish the connection. It would return only once connected or if a failure happens. I’d still retain the reconnect on disconnect behavior though. But this way you get synchronous notification if the remote host is clearly inaccessible.

What do folks think here?  Do any of you rely on the current behavior?

 - Garrett



