[nanomsg] Re: interesting performance figures

From: Garrett D'Amore <garrett@xxxxxxxxxx>
To: Peter Kümmel <syntheticpp@xxxxxxx>, nanomsg@xxxxxxxxxxxxx
Date: Thu, 27 Mar 2014 10:54:32 -0700

On March 27, 2014 at 10:02:03 AM, Peter Kümmel (syntheticpp@xxxxxxx) wrote:
On 27.03.2014 16:17, Garrett D'Amore wrote:  
>> Looks like inproc performs best in comparison to nanomsg, and gives  
>> the only test where Go beats nanomsg (4k messages, C:4MB/s Go:5MB/s).  

4k as the argument size in a function call is not the use case for which Go is
optimized.
Could you run your tests with a payload of only a couple of bytes?
Sure, this is not nanomsg's use case, but it will show nanomsg's overhead  
and what would be possible.  

My latency tests were done with 111 byte messages.  They don’t measure
throughput, though; throughput measurements are pointless with tiny messages.
What is more interesting there is the latency, and the comparison is not too
bad.  I’m not as fast as native C (the context switching hurts there, I think,
and I have longer code paths, with several abstraction layers to transit), but
even so, I’m getting within spitting distance of the C code.  To be honest, the
difference between 6 usec and 8 usec per RPC call is so tiny that it almost
doesn’t hurt at all.
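
For anyone who wants to reproduce this kind of number, here is a minimal sketch of a round-trip latency measurement in Go. It is not the benchmark that produced the figures above: it uses only the standard library, the names echo, MeasureRoundTrip and PipeLatency are hypothetical, and the in-memory net.Pipe merely stands in for a real transport. The 111-byte message size is taken from the text; the round count is an arbitrary assumption.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// echo reads fixed-size messages from conn and writes each one straight back.
func echo(conn net.Conn, size int) {
	buf := make([]byte, size)
	for {
		if _, err := io.ReadFull(conn, buf); err != nil {
			return // peer closed the connection
		}
		if _, err := conn.Write(buf); err != nil {
			return
		}
	}
}

// MeasureRoundTrip sends `rounds` messages of `size` bytes over conn, waits
// for each echo, and returns the mean time per round trip.
func MeasureRoundTrip(conn net.Conn, size, rounds int) (time.Duration, error) {
	if size <= 0 || rounds <= 0 {
		return 0, fmt.Errorf("size and rounds must be positive")
	}
	msg := make([]byte, size)
	reply := make([]byte, size)
	start := time.Now()
	for i := 0; i < rounds; i++ {
		if _, err := conn.Write(msg); err != nil {
			return 0, err
		}
		if _, err := io.ReadFull(conn, reply); err != nil {
			return 0, err
		}
	}
	return time.Since(start) / time.Duration(rounds), nil
}

// PipeLatency runs the measurement over an in-memory pipe with an echo
// goroutine on the far end.
func PipeLatency(size, rounds int) (time.Duration, error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()
	go echo(server, size)
	return MeasureRoundTrip(client, size, rounds)
}

func main() {
	avg, err := PipeLatency(111, 10000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("mean round trip: %v\n", avg)
}
```

Because MeasureRoundTrip accepts any net.Conn, the same loop could be pointed at a TCP connection to compare transports. Absolute numbers will vary with the machine and the Go version.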

The bigger concern is that the TCP stack flow seems to hurt.  In Go, it’s about
50 usec to perform the exchange, as opposed to half that for native C.  The fact
that the latency went up suggests that the problem there resides deeper in the
Go TCP stack.  My guess is that there may be some suboptimal data copying.

That said, I know I’m thrashing the GC.  I’ve filed an RFE against my 
implementation to offer a version that doesn’t suffer this way.
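
GC thrashing of this kind usually comes from allocating a fresh buffer for every message. One common mitigation in Go is to recycle buffers through sync.Pool. The sketch below illustrates that general technique only; it is not what the RFE mentioned above proposes, and scratchSize, scratchPool and Checksum are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// scratchSize is a hypothetical buffer capacity, chosen only for illustration.
const scratchSize = 4096

// scratchPool hands out reusable byte slices. Storing *[]byte rather than
// []byte avoids an extra allocation each time a slice is put back.
var scratchPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, scratchSize)
		return &b
	},
}

// Checksum copies payload into a pooled scratch buffer (standing in for
// whatever per-message work a transport does) and returns the byte sum.
func Checksum(payload []byte) uint32 {
	bp := scratchPool.Get().(*[]byte)
	buf := append((*bp)[:0], payload...)
	var sum uint32
	for _, b := range buf {
		sum += uint32(b)
	}
	*bp = buf[:0] // keep any grown capacity for the next caller
	scratchPool.Put(bp)
	return sum
}

func main() {
	msg := make([]byte, 111)
	for i := range msg {
		msg[i] = 1
	}
	fmt.Println(Checksum(msg)) // prints 111
}
```

In steady state the copy reuses an existing buffer instead of allocating one per message, which is the property that takes pressure off the collector. Whether it helps in a given code path is something to confirm with a profiler.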

> 
> I was actually surprised that there were *any* such cases. The nanomsg code
> is pretty mean and lean. That said, I might see better scalability with
> thousands of clients, but I’ve not written test cases for that. It’s not a
> pressing concern for me at the moment. :-)

Yes, especially when thread management would become complex, using Go would be 
a benefit. 

Yes.  Go offers many other benefits apart from performance.  The language is
still young, and I think performance work is something folks are working on
actively.  That said, if performance is your absolute most important concern,
then working in C is probably a better choice at this point.  I’m not using
nanomsg for extreme workloads; I’m using it for a clean API and a robust
transport layer.  I’m using Go because I like the language, I like that it’s
“dependency free” at runtime, and I think I can get more done quickly, with
fewer surprises, than with the alternatives.  I still write a lot of C; I’m a
kernel / driver developer by day. :-)

> 
> Btw, with inproc now working, it should definitely be possible to try this
> out on play.golang.org. I haven’t done so yet, but I will probably do so if
> I find some free time later today.

From the About site: 
"There are also limits on execution time and on CPU and memory usage" 

So, I assume benchmarking would be a problem. 

Oh right… wouldn’t want to benchmark except as a toy.  But using
play.golang.org to “try” it out, and poke around, offers some quick learning
chances, and is a good way for me to post up some examples that folks can
noodle around with to try out my stuff.  I think this is an accelerator to
adoption.  I know many times I had a question about something in Go (this was
my first Go project, btw, but I’m thoroughly convinced I’ll be using it a *lot*
more), and the ability to just try something out easily in a browser was much
easier than trolling mailing lists, google, or documentation.

>> I wonder if you have an idea how good other inproc frameworks perform, 
>> especially Qt's queued connections. 
> 

I've tried to implement a server with Qt, and it performs very badly.
Now I'm thinking about combining Qt with nanomsg, or whether it makes sense to
replace Qt's message handling with nanomsg's.

I don’t know.  I suspect that if the perf is terrible, replacing it might be a
good thing.  nanomsg is quite performant.  You could also look at ØMQ, which
might be better for integration into something like Qt itself.  Plus, it’s
written in C++.  Normally I don’t consider that an advantage, but since you’re
already paying the cost for that choice with Qt itself, ØMQ won’t make it any
worse. :-)

        - Garrett

