[nanomsg] Re: Scaproust 0.2.0 released: first performance report
- From: Matthew Hall <mhall@xxxxxxxxxxxxxxx>
- To: nanomsg@xxxxxxxxxxxxx
- Date: Wed, 14 Dec 2016 23:42:24 -0800
On Wed, Dec 14, 2016 at 04:47:07PM +0000, Benoit Labaere wrote:
> There is a per-message overhead for several reasons:
> - allocation on each send, because the socket takes ownership of the
> message when sending.
For a long time I have been wishing for a way to free messages via a
callback handler, because I usually prefer to create the messages in special
pooled memory, send them, then mark them free in the pool upon successful
TX. That is basically zero allocation cost, but it is not available in
nanomsg or zeromq yet.
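To make the pooled-memory idea concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration: msg_pool, fake_send, tx_done_fn and on_tx_done are hypothetical names, not part of nanomsg or zeromq, and fake_send completes immediately where a real transport would call back later from its I/O path.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SLOTS 4      /* illustrative sizes, not tuned */
#define SLOT_BYTES 256

/* Hypothetical fixed-size message pool: all memory is reserved up front. */
struct msg_pool {
    unsigned char slots[POOL_SLOTS][SLOT_BYTES];
    int free_stack[POOL_SLOTS];   /* indices of currently free slots */
    int free_count;
};

void pool_init(struct msg_pool *p) {
    for (int i = 0; i < POOL_SLOTS; i++) p->free_stack[i] = i;
    p->free_count = POOL_SLOTS;
}

/* Take a slot from the pool; returns NULL when exhausted. No malloc. */
void *pool_acquire(struct msg_pool *p) {
    if (p->free_count == 0) return NULL;
    return p->slots[p->free_stack[--p->free_count]];
}

/* Mark a slot free again (no double-release detection in this sketch). */
void pool_release(struct msg_pool *p, void *buf) {
    size_t idx = (size_t)((unsigned char *)buf - &p->slots[0][0]) / SLOT_BYTES;
    assert(idx < POOL_SLOTS && p->free_count < POOL_SLOTS);
    p->free_stack[p->free_count++] = (int)idx;
}

/* Hypothetical completion callback, invoked once TX has succeeded. */
typedef void (*tx_done_fn)(void *buf, void *ctx);

/* Stand-in for a send that borrows the buffer instead of owning it. */
int fake_send(void *buf, size_t len, tx_done_fn done, void *ctx) {
    (void)len;            /* a real transport would write len bytes here */
    done(buf, ctx);       /* TX "completed": hand the buffer back */
    return 0;
}

/* The callback simply returns the buffer to its pool. */
void on_tx_done(void *buf, void *ctx) {
    pool_release((struct msg_pool *)ctx, buf);
}
```

A caller would pool_init once, pool_acquire a slot per message, fill it, and pass on_tx_done plus the pool pointer to the send call; the slot becomes reusable as soon as the callback fires.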
> - there is a channel between the user thread and the I/O thread, it is
> crossed once by the sent message and once by the 'return code'.
Many people think non-blocking is best. But in my performance studies I
usually find that a very large number (like 4-6 per physical core) of simple
blocking user threads is faster, as it avoids this problem right here: the
queueing overhead of the channel keeps you from ever reaching full saturation.
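A rough sketch of the blocking-threads layout, in C with POSIX threads (build with -pthread). The names here are hypothetical: blocking_send is a stub standing in for whatever blocking write the application uses, and THREADS_PER_CORE is set to 4 only because it is the low end of the range mentioned above. Each thread sends directly, so no message or return code crosses a channel to a separate I/O thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define THREADS_PER_CORE 4   /* assumption: low end of the 4-6 range */
#define MAX_WORKERS 256

struct worker {
    pthread_t tid;
    int id;
    long to_send;   /* messages this thread should send */
    long sent;      /* messages this thread actually sent */
};

/* Hypothetical stand-in for a blocking send (e.g. a blocking socket write). */
int blocking_send(int worker_id, long seq) {
    (void)worker_id;
    (void)seq;
    return 0;
}

/* Each worker calls the blocking send itself: no queue, no I/O thread. */
void *worker_main(void *arg) {
    struct worker *w = arg;
    for (long i = 0; i < w->to_send; i++)
        if (blocking_send(w->id, i) == 0)
            w->sent++;
    return NULL;
}

/* Run physical_cores * THREADS_PER_CORE workers; return total sent. */
long run_workers(int physical_cores, long msgs_per_thread) {
    struct worker ws[MAX_WORKERS];
    assert(physical_cores > 0);
    int n = physical_cores * THREADS_PER_CORE;
    if (n > MAX_WORKERS) n = MAX_WORKERS;

    int started = 0;
    for (int i = 0; i < n; i++) {
        ws[i] = (struct worker){ .id = i, .to_send = msgs_per_thread, .sent = 0 };
        if (pthread_create(&ws[i].tid, NULL, worker_main, &ws[i]) != 0)
            break;
        started++;
    }

    long total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(ws[i].tid, NULL);
        total += ws[i].sent;
    }
    return total;
}
```

The physical core count is passed in by the caller because portable C has no standard way to distinguish physical cores from hardware threads.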
> - there is no prefetch of incoming messages.
Yes, you have to try to fit messages into cache-line multiples with proper
alignment, then prefetch the first 1-3 lines after RX completes, before
handing the message back to user code.
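As an illustration of the alignment and prefetch step, a small C sketch follows. It assumes 64-byte cache lines (typical on x86-64, but check your target), C11 aligned_alloc, and the GCC/Clang builtin __builtin_prefetch. The helper names (round_up_to_line, alloc_msg, prefetch_head) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64   /* assumption: 64-byte cache lines */

/* Round a payload length up to a whole number of cache lines. */
size_t round_up_to_line(size_t len) {
    return (len + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
}

/* True if the pointer starts on a cache-line boundary. */
int is_line_aligned(const void *p) {
    return ((uintptr_t)p % CACHE_LINE) == 0;
}

/* Allocate a line-aligned buffer sized in cache-line multiples (len > 0). */
void *alloc_msg(size_t len) {
    assert(len > 0);
    return aligned_alloc(CACHE_LINE, round_up_to_line(len));
}

/*
 * After RX completes, prefetch up to max_lines leading cache lines of the
 * message for reading. Returns the number of lines prefetched.
 */
size_t prefetch_head(const void *msg, size_t len, size_t max_lines) {
    size_t lines = round_up_to_line(len) / CACHE_LINE;
    if (lines > max_lines) lines = max_lines;
    for (size_t i = 0; i < lines; i++)
        __builtin_prefetch((const char *)msg + i * CACHE_LINE, 0, 3);
    return lines;
}
```

A receive path would call prefetch_head(msg, len, 3) just before returning the message to the application, so the header and the first part of the payload are likely to be in cache by the time user code touches them.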
Good luck,
Matthew.