[nanomsg] Re: ReqRep high performance

  • From: Garrett D'Amore <garrett@xxxxxxxxxx>
  • To: "nanomsg@xxxxxxxxxxxxx" <nanomsg@xxxxxxxxxxxxx>
  • Date: Mon, 19 Jan 2015 09:01:30 -0800

Look at the device framework.  You don't need parallel links, just parallel 
processing.  I'm not sure that other examples exist.  
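
A rough, transport-independent sketch of that idea follows. It is not nanomsg 
code: two in-process queues stand in for the single link, an integer stands in 
for the raw-mode message header, and every name in it (serve, request_all, 
handler) is made up for illustration. What it shows is one link feeding a pool 
of workers, with the header copied untouched from each request to its reply so 
that replies can be matched up even when they finish out of order.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def serve(requests, replies, handler, workers=4):
    """Read (header, body) pairs from one link and process them in parallel.

    The header is copied unchanged from each request to its reply, so the
    requester can match them whatever order the workers finish in.
    """
    def work(header, body):
        replies.put((header, handler(body)))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            item = requests.get()
            if item is None:              # sentinel: no more requests
                break
            pool.submit(work, *item)
    # Leaving the with-block waits for all submitted work to finish.


def request_all(bodies, handler, workers=4):
    """Send every request without waiting, then match replies by header."""
    requests, replies = queue.Queue(), queue.Queue()
    server = threading.Thread(target=serve,
                              args=(requests, replies, handler, workers))
    server.start()

    pending = set()
    for request_id, body in enumerate(bodies):
        pending.add(request_id)
        requests.put((request_id, body))  # many requests outstanding at once
    requests.put(None)

    results = {}
    while pending:
        header, reply = replies.get()
        pending.remove(header)            # match the reply to its request
        results[header] = reply
    server.join()
    return [results[i] for i in range(len(bodies))]


if __name__ == "__main__":
    print(request_all(["a", "b", "c"], str.upper))
```

The sketch assumes the handler never raises; a real server would also have to 
send an error reply or time out, otherwise the requester waits forever.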

> On Jan 18, 2015, at 11:55 PM, Pierre Salmon <pierre.salmon@xxxxxxxxxxxxx> 
> wrote:
> 
> Hi Garrett, thanks for your answers. I will parallelize my code to open 
> multiple links. Where can I find an example of raw REQ/REP?
> 
> Pierre
> 
> On 01/16/2015 06:11 PM, Garrett D'Amore wrote:
>>> On Jan 16, 2015, at 8:00 AM, Pierre Salmon <pierre.salmon@xxxxxxxxxxxxx> 
>>> wrote:
>>> 
>>> Hi,
>>> 
>>> I have a quick question: what is the best architecture for a 
>>> high-performance request/response system (300,000 msg/s)?
>>> Right now I use the REQ/REP socket pattern, but with a simple example I get 
>>> only ~30,000 msg/s (1 thread with a REQ socket and 1 thread with a REP 
>>> socket). If I add new threads (REP+REQ) to the app, the result does not 
>>> improve (always 30,000).
>>> What am I doing wrong?
>>> 
>>> Pierre
>> This raises many, many questions.
>> 
>> The code can probably achieve > 1M messages per second, but *not* if you’re 
>> running a vanilla req/rep socket.  Those sockets are strictly serialized, 
>> and you wind up losing performance because you can only have a single 
>> message *outstanding*.  Networking latency thus becomes the limiter in that 
>> situation.
>> 
>> The solution to that problem is to make sure you’re using raw mode: in 
>> nanomsg, pass AF_SP_RAW instead of AF_SP as the domain when you create the 
>> REQ/REP sockets.  (In mangos you get this by setting the socket option to 
>> Raw mode, but nanomsg instead makes you select it during socket 
>> initialization.)
>> 
>> Be aware that running in raw mode means that you have to take care to match 
>> replies to requests, by looking at the header, and copying the header from 
>> the request to the reply.
>> 
>> There may be other factors limiting you too.   For example, do you have 
>> enough resources; do you have other serialization points in your application 
>> code; does your threading code properly engage multiple cores; do you have 
>> enough bandwidth to serve the traffic; etc. etc.  But at *first* guess, it’s 
>> probably the raw vs. cooked mode that is limiting you.  If you’re already in 
>> raw mode, you will need to do further analysis.
>> 
>> If you have to run serialized, you won’t be able to get such high message 
>> rates.  To get 300K messages per second you’d need to have a round trip 
>> latency of only about 3.3 usec.  I’m not aware of any commodity transport 
>> that can do that, or even get close.  TCP transports over Ethernet are 
>> probably on the order of 10x that latency.   (Note that raw Ethernet, 
>> assuming minimum-size 64-byte frames, can do about 14.9M packets per second 
>> over 10GbE, or about 1.5M for 1GbE.   That’s running at wire rate with zero 
>> added interpacket latency.   At 1GbE even 1 usec of extra latency per packet 
>> cuts that rate by more than half, so you *have* to get parallelization to 
>> achieve high rates.)
>> 
>>    - Garrett
> 
> 
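
The latency argument in the thread can be checked with a few lines of 
arithmetic. The sketch below assumes the only limit is round-trip time, so 
that rate = requests in flight / round trip; the function names are made up 
for illustration. It also assumes the ~30,000 msg/s reported in the original 
question was purely latency-bound, which would put the round trip at roughly 
33 usec.

```python
import math


def max_rate(round_trip_s, outstanding=1):
    """Upper bound on messages/s with `outstanding` requests in flight."""
    return outstanding / round_trip_s


def required_round_trip(target_rate, outstanding=1):
    """Round trip (seconds) needed to sustain target_rate messages/s."""
    return outstanding / target_rate


def in_flight_needed(target_rate, round_trip_s):
    """Requests that must be outstanding to reach target_rate."""
    return target_rate * round_trip_s


if __name__ == "__main__":
    # Strictly serialized REQ/REP: one request outstanding.
    rtt = required_round_trip(300_000)
    print(f"round trip needed for 300K msg/s: {rtt * 1e6:.2f} usec")

    # Round trip implied by the observed rate, if latency is the only limit.
    observed_rtt = required_round_trip(30_000)
    print(f"round trip implied by 30K msg/s: {observed_rtt * 1e6:.1f} usec")

    # Requests that would have to be in flight at that round trip.
    needed = in_flight_needed(300_000, observed_rtt)
    print(f"in-flight requests for 300K msg/s: {math.ceil(round(needed, 6))}")
```

Under those assumptions, a link with a 33 usec round trip needs on the order 
of ten requests in flight to reach 300K msg/s, which a strictly serialized 
socket cannot provide.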
