[nanomsg] Re: Inproc latency and programming challenge!

  • From: Martin Sustrik <sustrik@xxxxxxxxxx>
  • To: Tino Breddin <tb@xxxxxxxxxxxxxx>
  • Date: Fri, 25 Jan 2013 19:25:07 +0100

Hi Tino,

(a) Is there documentation regarding thread management and message flow?

Unfortunately, not yet. In short, each socket is protected by a critical section. Each socket also has a worker thread and a condition variable that blocking functions such as nn_recv() wait on. When a message is sent from socket A to socket B, the following sequence of steps happens:

1. nn_recv() on B is called.
2. B's critical section is entered
3. There are no messages waiting in B
4. Wait for condition variable in B

5. nn_send() is called on A
6. A's critical section is entered
7. message is written into shared pipe
8. event is sent to B's worker thread (via eventfd)
9. A's critical section is exited

10. B's worker thread gets the event
11. It signals B's condition variable

12. Application thread blocked in nn_recv() is unblocked
13. It receives the message from the shared pipe
14. It leaves B's critical section
15. nn_recv() on B exits
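
The steps above can be sketched roughly as follows. This is only an illustration, not nanomsg's actual code: struct sock, sock_send(), sock_recv() and run_demo() are hypothetical names, the "shared pipe" is reduced to a single atomic flag plus an int, and the worker handles exactly one event. It assumes Linux (eventfd) and POSIX threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for a socket; not nanomsg's actual structure. */
struct sock {
    pthread_mutex_t lock;  /* the socket's critical section */
    pthread_cond_t cond;   /* blocking calls such as recv wait here */
    int efd;               /* eventfd read by this socket's worker */
    atomic_int pending;    /* stand-in for the shared pipe: 1 = message */
    int msg;
};

static void sock_init(struct sock *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->efd = eventfd(0, 0);
    assert(s->efd >= 0);
    atomic_init(&s->pending, 0);
    s->msg = 0;
}

/* Steps 10-11: the worker gets the event and signals the condvar. */
static void *worker(void *arg)
{
    struct sock *s = arg;
    uint64_t n;
    if (read(s->efd, &n, sizeof n) == (ssize_t)sizeof n) {
        pthread_mutex_lock(&s->lock);
        pthread_cond_signal(&s->cond);
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Steps 5-9: the sender only ever takes its own socket's lock. */
static void sock_send(struct sock *a, struct sock *b, int value)
{
    uint64_t one = 1;
    pthread_mutex_lock(&a->lock);
    b->msg = value;                 /* write into the "pipe" */
    atomic_store(&b->pending, 1);
    ssize_t rc = write(b->efd, &one, sizeof one);  /* post the event */
    (void)rc;
    pthread_mutex_unlock(&a->lock);
}

/* Steps 1-4 and 12-15: block until the "pipe" holds a message. */
static int sock_recv(struct sock *b)
{
    pthread_mutex_lock(&b->lock);
    while (!atomic_load(&b->pending))
        pthread_cond_wait(&b->cond, &b->lock);
    int value = b->msg;
    atomic_store(&b->pending, 0);
    pthread_mutex_unlock(&b->lock);
    return value;
}

struct send_args { struct sock *a, *b; int value; };

static void *send_thread(void *arg)
{
    struct send_args *sa = arg;
    sock_send(sa->a, sa->b, sa->value);
    return NULL;
}

/* Runs one send from A to B and returns what B received. */
int run_demo(int value)
{
    struct sock a, b;
    pthread_t w, s;
    sock_init(&a);
    sock_init(&b);
    struct send_args sa = { &a, &b, value };
    pthread_create(&w, NULL, worker, &b);
    pthread_create(&s, NULL, send_thread, &sa);
    int got = sock_recv(&b);
    pthread_join(s, NULL);
    pthread_join(w, NULL);
    close(a.efd);
    close(b.efd);
    return got;
}
```

Calling run_demo(42) from main() should hand back the value that was sent. Note that the worker takes B's lock before signalling, so the wake-up cannot slip in between the receiver's check of the pipe and its wait on the condition variable.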

I would say all the above is pretty straightforward except for the worker thread thing. Specifically, why doesn't the sender thread signal the receiver's condition variable directly, instead of going through the worker thread?

The reason is that signalling directly would create a race condition when both sides of the inproc connection are sending at the same time:

1. nn_send(A)
2. lock mutex A

3. nn_send(B)
4. lock mutex B
5. message is written to a shared pipe
6. lock mutex A <-------- deadlock happens here
7. signal A's condition variable
etc.

(b) Have you implemented some sort of busy wait already (for the workers at least)?

No, not yet. I guess a busy wait could reduce the latency of the "post" step from 7us to 0.5us.

If so, and if the "event" step can somehow be eliminated altogether, we can expect inproc latency below 5us. That would be really nice.
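
For illustration only, a busy wait on the worker side might look something like the sketch below: poll an atomic flag a bounded number of times and report whether the caller should fall back to a blocking wait. The name spin_for_event() and the max_spins parameter are hypothetical and nothing like this exists in nanomsg yet. The sketch makes no claim about the latency it would actually achieve.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helper: poll `ready` up to max_spins times.
   Returns 1 if the flag was seen set, 0 if the caller should
   give up spinning and block instead. */
int spin_for_event(atomic_int *ready, int max_spins)
{
    assert(max_spins >= 0);
    for (int i = 0; i < max_spins; i++) {
        /* acquire pairs with a release store made by the poster */
        if (atomic_load_explicit(ready, memory_order_acquire))
            return 1;
    }
    return 0;
}
```

The bound matters: spinning without a limit burns a CPU core whenever the socket is idle, so a real implementation would presumably spin briefly and then block as before.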

Martin
