[nanomsg] Re: Abort on recoverable error?

From: "Roman Puls" <puls@xxxxxxxxxxxx>
To: nanomsg@xxxxxxxxxxxxx
Date: Tue, 02 Sep 2014 07:04:09 +0000

Same here. I guess it would be clever to use the same error reportingmechanisms as with the posix sockets, where a bind also returns a -1 anderrno set to EADDRINUSE (or EADDRNOTAVAIL, ...).

The same applies to many places, and the same 'asserting' misbehaviorprevented me to use zeromq.



Generally regarding asserts/aborts:
- fine in debug builds, not in release builds

- library shall report error conditions when -DNDEBUG defined instead ofasserting- assertions without proper error handling inside the library indicatesinsufficient error recovery- also: always give the developer (API user) the chance to handleerrors!


So my suggestion would be

#define ASSERT_RETURN(cond, _errno) \
do { \
  if (!(expr)) { assert(expr); errno = _errno; return -1; } \
} while (0)

and then in library code:

int dosomething(void *ctx, int family) {
  ASSERT_RETURN(ctx, EINVAL);
  ASSERT_RETURN(family == AF_INET | family == AF_INET6, EAFNOSUPPORT)

  // or taken from nn_btcp_create
 self = nn_alloc (sizeof (struct nn_btcp), "btcp");
  // NO:  alloc_assert (self);

ASSERT_RETURN(self, ENOMEM); // new, however previously allocatedresources need to be released

  ...
}

This is a safe(r) pattern. With -DNDEBUG defined during the build, itwill safely return a -1 plus errno is set. With NDBUG undefined, it mayabort().However, special care needs to be taken that no resources are leakingwhen returning instead of aborting.


Cheers,
  Roman

PS: One of my favorite tests in terms of protocol correctness andability to handle errors was:


cat /dev/urandom | nc localhost 5555

If the program aborts (which was the case at least in zeromq 2.x), thenbetter don't use it for productive code.

Same.

 - Bruce

Sent from my iPhone
On Sep 2, 2014, at 1:06 PM, Matt Howlett <matt.howlett@xxxxxxxxx>wrote:
I agree completely... the way abort is used in nanomsg prevents mefrom using it in scenarios I otherwise would like to.
On Tue, Sep 2, 2014 at 3:56 PM, Garrett D'Amore <garrett@xxxxxxxxxx>wrote:
I'd argue that's a bug. And yet, I see lots of these things in theunderlying code now that I look. :-(
Libraries should only ever abort() in the event that the librarydeveloper created a bug -- and *every* case where the libraryabort()'s or fails an assertion should be treated as a high prioritybug.
On Mon, Sep 1, 2014 at 10:49 PM, Jihyun Yu <yjh0502@xxxxxxxxx> wrote:
Hi,

It seems that nanomsg aborts on recoverable error. For example
following code fails with abort();

==
#include <nanomsg/nn.h>
#include <nanomsg/pubsub.h>

int main(void) {
    int socket = nn_socket(AF_SP, NN_PUB);
    if(socket < 0) return socket;
    int tp1 = nn_bind(socket, "tcp://*:11234");
    if(tp1 < 0) return tp1;
    int tp2 = nn_bind(socket, "tcp://*:11234");
    if(tp2 < 0) return tp2;

    return 0;
}
==

Here's an output

==
Address already in use [98] (src/transports/tcp/btcp.c:378)
/bin/bash: line 1: 22624 Aborted                 ./a.out
==

I'm writing Erlang binding of nanomsg[1], and calling abort() inside
library functions is not acceptable in Erlang. Is the design
intentional?


[1] https://github.com/yjh0502/nanomsg

References:
- [nanomsg] Re: Abort on recoverable error?
  - From: Bruce Mitchener

[nanomsg] Re: Abort on recoverable error?

Other related posts: