Hi all, I've just created a tool to analyse the latency of nanomsg in detail. So far it covers only the inproc transport (other transports to be added later).
The output of the analyser is attached. The whole operation of sending and receiving a message is split into several steps and the duration of each step is measured. The test runs in a loop 1000 times. Results for each step are plotted as percentile graphs. The average time of each step (in CPU ticks) is also shown in the legend.
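For anyone who wants to look at the raw samples without R, here is a minimal sketch in C of how the per-step numbers could be summarised. It assumes the tick counts for one step are already collected in an array. The function names (sort_samples, percentile_sorted, mean_ticks) are made up for this example and are not part of nanomsg, and the nearest-rank percentile definition is my assumption; perf/inproc_lat.R may compute things differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for qsort over uint64_t tick counts. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Sort the samples of one step in ascending order (hypothetical helper). */
void sort_samples(uint64_t *samples, size_t n)
{
    qsort(samples, n, sizeof samples[0], cmp_u64);
}

/* Nearest-rank percentile over already sorted samples.
   pct is an integer percentage in the range 1..100. */
uint64_t percentile_sorted(const uint64_t *sorted, size_t n, unsigned pct)
{
    size_t rank;

    assert(n > 0);
    assert(pct >= 1 && pct <= 100);
    rank = (pct * n + 99) / 100;   /* ceil(pct * n / 100) */
    if (rank < 1)
        rank = 1;
    if (rank > n)
        rank = n;
    return sorted[rank - 1];
}

/* Average duration of one step in ticks (integer division, rounds down). */
uint64_t mean_ticks(const uint64_t *samples, size_t n)
{
    uint64_t sum = 0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += samples[i];
    return sum / n;
}
```

Call sort_samples once per step, then query percentile_sorted for each percentile you want to plot.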
To summarise the result, almost all steps take a negligible amount of time (0.3-1.0us), with the exception of two:
1. Sending an event (via eventfd on Linux) to wake up the worker thread when a message becomes available ("event" = ~10us).
2. Waking up the application thread that is waiting for the message in pthread_cond_wait ("post" = ~7us).
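To make the two wake-ups concrete, here is a rough, Linux-only sketch of the same pattern: one thread writes to an eventfd to wake a worker, and the worker then signals a condition variable that the first thread is waiting on. This is not nanomsg's code. The struct wake_demo, worker_main and wake_demo_run names are invented for the example, and the real implementation has more going on between the two steps.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical shared state for this sketch; not nanomsg's structures. */
struct wake_demo {
    int efd;               /* eventfd used to wake the worker thread */
    pthread_mutex_t mtx;
    pthread_cond_t cond;   /* the application thread waits on this */
    int status;            /* 0 = pending, 1 = delivered, -1 = error */
};

/* Worker: blocks on the eventfd (wake-up 1), then signals the
   waiting application thread (wake-up 2). */
static void *worker_main(void *arg)
{
    struct wake_demo *d = arg;
    uint64_t count = 0;
    int ok = read(d->efd, &count, sizeof count) == (ssize_t)sizeof count;

    if (ok)
        assert(count == 1);   /* exactly one event was written */
    pthread_mutex_lock(&d->mtx);
    d->status = ok ? 1 : -1;
    pthread_cond_signal(&d->cond);
    pthread_mutex_unlock(&d->mtx);
    return NULL;
}

/* Runs one hand-off through both wake-ups.
   Returns 0 on success, -1 on failure. */
int wake_demo_run(void)
{
    struct wake_demo d;
    pthread_t worker;
    uint64_t one = 1;
    int result;

    d.efd = eventfd(0, 0);
    if (d.efd < 0)
        return -1;
    pthread_mutex_init(&d.mtx, NULL);
    pthread_cond_init(&d.cond, NULL);
    d.status = 0;

    if (pthread_create(&worker, NULL, worker_main, &d) != 0) {
        close(d.efd);
        return -1;
    }

    /* Wake-up 1: "a message is available", sent via the eventfd. */
    if (write(d.efd, &one, sizeof one) != (ssize_t)sizeof one) {
        pthread_cancel(worker);   /* read() is a cancellation point */
        pthread_join(worker, NULL);
        close(d.efd);
        return -1;
    }

    /* Wake-up 2: wait until the worker signals the condition. */
    pthread_mutex_lock(&d.mtx);
    while (d.status == 0)
        pthread_cond_wait(&d.cond, &d.mtx);
    result = (d.status == 1) ? 0 : -1;
    pthread_mutex_unlock(&d.mtx);

    pthread_join(worker, NULL);
    pthread_cond_destroy(&d.cond);
    pthread_mutex_destroy(&d.mtx);
    close(d.efd);
    return result;
}
```

Each hand-off here involves two cross-thread wake-ups. Build with -pthread.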
The challenge: Can anyone make this better, either by optimising the wake-ups or even by removing one of them altogether?
To generate the latency analysis yourself, follow these steps:
1. Install the R language for statistical analysis.
2. Download the nanomsg git repo.
3. In nanomsg/CMakeLists.txt, uncomment the "# add_definitions (-DNN_LATENCY_MONITOR=1000)" line to switch the latency analysis on.
4. mkdir build
5. cd build
6. cmake ..
7. make
8. Run the inproc latency test: ./inproc_lat 1 1000
9. Generate the graph: R -f ../perf/inproc_lat.R
10. Check the graph in ./inproc_lat.png

Good luck!
Martin