[nanomsg] Re: nn_bind issues

  • From: Matt Howlett <matt.howlett@xxxxxxxxx>
  • To: nanomsg <nanomsg@xxxxxxxxxxxxx>
  • Date: Mon, 18 Nov 2013 15:08:04 +0700

> What I was saying is that reporting EADDRINUSE to the user is
> unreliable. It would be done immediately in some cases, but sometimes
> it would happen later, asynchronously, when the user is already in the
> middle of doing a different job.
>
> Only the former case can be handled by returning the error from nn_bind().
>
> The only way to handle the latter IMO is to wait for a while and try
> to bind once again.
>
> This double standard for handling the error, however, makes reporting
> of the error basically worthless. The semantics would be "If you bind
> and the port is in use you'll get EADDRINUSE. Maybe." There's no way
> to base any business logic on semantics like that.
>
> All in all, I would say the correct way of handling EADDRINUSE is
> re-trying until it succeeds. Of course, once we have a monitoring
> subsystem implemented, the problem should be reported via the
> administrative interface, so that an admin can fix it -- kill one of
> the offending applications or such.
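
The retry-until-it-succeeds approach described in the quoted text might
look roughly like the sketch below. It is only an illustration: it uses
plain TCP sockets from the Python standard library rather than nanomsg,
and bind_with_retry, its parameters, and the backoff values are
hypothetical names and numbers chosen for the example, not anything
nanomsg provides.

```python
import errno
import socket
import time


def bind_with_retry(host, port, delay=0.05, max_delay=1.0, attempts=None):
    """Keep trying to bind host:port until the address is free.

    Hypothetical helper for illustration only. With attempts=None it
    retries forever; otherwise it gives up after that many failed tries.
    """
    tried = 0
    while True:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind((host, port))
            return s  # bound; the caller owns the socket
        except OSError as e:
            s.close()
            if e.errno != errno.EADDRINUSE:
                raise  # only "address in use" is worth retrying
            tried += 1
            if attempts is not None and tried >= attempts:
                raise
            time.sleep(delay)
            delay = min(delay * 2, max_delay)  # capped exponential backoff
```

A caller that wants the "Maybe." removed from the semantics would use
attempts=None and rely on monitoring to notice a bind that never
succeeds.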

You're certainly way more on top of all the issues than me ... I guess
my only comment is it seems overly catastrophic to have the process
abort when I try to bind to a port that is already in use.

> Matt, as for your use case, why not use a device in the middle to
> create an M-to-N-style topology?

Quite possibly some combination of premature optimization and
ignorance. Setting things up so the mappers know exactly which reducer
to send what results to directly wasn't too difficult, and seemed most
natural and efficient, so that's what I went with.

Btw, the bind issue is not hard to work around, of course: I'm just
finding a free port first without using nn_bind and staggering process
start-up, so in practice there will never be a race condition.
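
One common way to find a free port without nn_bind is to bind an
ordinary TCP socket to port 0 and read back the port the OS picked. The
sketch below assumes that approach (the post does not say how the port
is actually found), and find_free_port is a hypothetical helper name.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port on host.

    Hypothetical helper for illustration only. Binding to port 0 lets
    the OS choose a free port; the socket is closed again on return.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]
```

The port is only known to be free at the moment of the check. Another
process could take it before the later bind, which is why staggering
process start-up matters here.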
