[nanomsg] Re: [PATCH] Terminate astream if initializing usock failed

From: Martin Sustrik <sustrik@xxxxxxxxxx>
To: <nanomsg@xxxxxxxxxxxxx>
Date: Tue, 12 Mar 2013 12:09:53 +0100

On 2013-03-12 11:49, Nir Soffer wrote:

On Tue, Mar 12, 2013 at 12:09 PM, Martin Sustrik <sustrik@xxxxxxxxxx>wrote:
On 2013-03-12 10:59, Nir Soffer wrote:

 On OS X the scenario is this:
1. client connects and close the socket
2. server accept a new socket
3. server tune the socket
4. setsockopt fail with EINVAL (looks like a bug in OS X, as thevalue is
valid (e.g. 128K receive buffer)
Yuck. That's an ugly behaviour. Anyway, what if we simply ignored the
EINVAL error from all those setsockopts, when on OSX?
What will happen if we fail to make the socket non-blocking, and thenwe
try to initialize the stream with a blocking socket?
I think that failing fast is better in most cases. What I don't likeaboutthis patch is that we cannot detect a bug in the library. For example,ifwe change the code to set a (real) invalid value, we will simply failto
handle all connections, instead of asserting.

Yeah. I hate that we can't distinguish real "errorneous value" problem(nanomsg should assert) from "peer disconnected" problem (nanomsg shouldclose the socket in decent manner).

Anyway, given that we are not able to distinguish between the two, Ithink that only option is to assume that there's no bug in the relatedlibrary code and consider the error to mean "peer disconnected".

What if we check EINVAL, then fail with -EINVAL on OS X (and otherplatform
that show this error), and assert on Linux?

AFAICS it's the same as ignoring the error on OSX. The assumption isthat the peer have disconnected. If we ignore the error next send/recvcall on socket fails and socket will be shut down. If we return -EINVAL,the socket will be shut down. Same thing.

I've investigated a little bit further and the problem is instream.c,lines 127-130. There's a send operation launched. If sending fails,thecallback deallocates the stream object. The execution then continuestoline 130 where recv() is done on invalid (recently deallocated)socket.
This explain the impossible crash :-)
Synchronous failures through callbacks are really horrible to debug -I wasbitten by such issues many times. It would be much easier if allerrors
that invoke a callback are asynchrounous. For example, if the send
operation fails, we set a flag and wake up the event loop, so thecallback
on the next iteration of the event loop.

Yes. Same experience with callbacks here. (ZeroMQ is a callback hellinternally.) I would like to do it in a different way for nanomsg,however, it's not a trivial patch, so I've been delaying dealing withthis issue till I have some uninterrupted time to hack on it.

Few thoughts on the mechanism: When AIO subsystem wants to send anotification it should check whether it is currently in a worker threador in user's thread (there's a function for determining that already).In the former case it should just invoke the callback. In the lattercase it should send an nn_event (already implemented) to the workerthread. The worker thread, on reception of the event should call thecallback.


How does that sound?

Martin

Follow-Ups:
- [nanomsg] Re: [PATCH] Terminate astream if initializing usock failed
  - From: Nir Soffer
- [nanomsg] Re: [PATCH] Terminate astream if initializing usock failed
  - From: Nir Soffer

References:
- [nanomsg] [PATCH] Terminate astream if initializing usock failed
  - From: Nir Soffer
- [nanomsg] Re: [PATCH] Terminate astream if initializing usock failed
  - From: Martin Sustrik
- [nanomsg] Re: [PATCH] Terminate astream if initializing usock failed
  - From: Nir Soffer
- [nanomsg] Re: [PATCH] Terminate astream if initializing usock failed
  - From: Martin Sustrik
- [nanomsg] Re: [PATCH] Terminate astream if initializing usock failed
  - From: Nir Soffer

[nanomsg] Re: [PATCH] Terminate astream if initializing usock failed

Other related posts: