[nanomsg] Re: Load balancing / redundancy among connections

From: Drew Crawford <drew@xxxxxxxxxxxxxxxxxx>
To: nanomsg@xxxxxxxxxxxxx
Date: Wed, 6 Aug 2014 23:30:24 -0500

For whatever it’s worth, I benchmarked nanomsg req/rep in my “very bad network 
lab” and it did very poorly in a packet loss scenario.  I think when packet 
loss rose above 25% or so it was impossible to transmit a single message.

The problem wasn’t critical enough at that time to merit any further 
investigation from me, but if there’s interest from somebody else in submitting 
some patches I’d be happy to benchmark them on what is a pretty robust test 
environment.

Drew


On Aug 6, 2014, at 3:09 PM, Alex Elsayed <eternaleye@xxxxxxxxx> wrote:

> Name Withheld wrote:
> 
>> I have two Linux machines (X and Y), with 2-20 extremely unreliable IP
>> connections between them. TCP is more reliable even without its
>> reliability control than UDP in this setting, because the various ISPs
>> along the way apparently drop UDP packets when they are congested, but not
>> TCP. The connections use different media (frame relay, GPRS, 3G, 4G, WiFi,
>> mesh networks, you name it), and are mostly there, although each goes
>> away for a few minutes to a few hours every few days. A jungle, no doubt.
>> 
>> I want to use all the connections available at a given moment to
>> increase the bandwidth, and since I can modify the applications running
>> on both machines - I wondered if I could use nanomsg for that? I can
>> deal with reordering and duplicate messages, but not with missing
>> messages ("at least once" delivery is needed)
>> 
>> From reading the documentation, it sounds like two pipeline connections
>> (X push -> Y pull, X pull <- Y push) would give me the load balancing,
>> as long as I can use different IP addresses to guarantee the connections
>> are going out through the different connections (which I can! each
>> machine has 20 IP addresses). However, I can't figure out from the docs
>> whether there is a retransmit if a connection dies while a message is "in
>> flight". Also, I can't figure out from the docs how a broken transport
>> is detected - will it have to wait until the TCP connection dies
>> (>1 minute), or is there an inner timeout I can control?
>> 
>> So, the question is:
>> 
>> Is my understanding correct, and pipeline is the way to go? Or is there
>> a better solution? (Or, is nanomsg totally not the right tool for me?)
>> 
>> I've considered using the Linux bonding interface, and doing a TCP
>> connection above that; however, this would introduce crazy latencies and
>> retransmits because TCP tries to keep packet order, which I can do
>> without.
>> 
>> Thanks in advance.
> 
> One thing I'd suggest looking into is MPTCP[1] (Multipath TCP) - it's 
> designed for basically this exact use case.
> 
> If you can't (or don't want to) build kernel modules, then the MPTCP 
> proxy[2] (which runs in userspace using netfilter/iptables) may be of 
> interest.
> 
> Either option would be essentially transparent underneath nanomsg, due to 
> the design of MPTCP.
> 
> Another option might be SCTP (since it supports multihoming), although that 
> would likely require adding an SCTP transport to nanomsg.
> 
> [1] http://www.multipath-tcp.org/
> [2] http://www.ietf.org/mail-archive/web/multipathtcp/current/msg01934.html
> 
>
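
To make the "at least once" requirement from the quoted question concrete,
here is a minimal simulation sketch in plain Python. It does not use nanomsg
and says nothing about how nanomsg behaves; the function name, the per-link
loss probabilities and the retry budget are all hypothetical. It rotates
messages across several lossy links and requeues any message whose
acknowledgement does not come back, so the receiver may see duplicates and
reordering but should not miss a message as long as the retry budget holds.

```python
import random
from collections import deque


def send_at_least_once(messages, link_loss, max_rounds=1000, seed=0):
    """Simulate at-least-once delivery over several lossy links.

    messages   -- payloads to deliver
    link_loss  -- per-link loss probabilities in [0, 1] (hypothetical values)
    max_rounds -- retry budget per message, so the loop always terminates

    Returns (received, transmissions, undelivered), where `received` is a
    list of (sequence number, payload) pairs that may contain duplicates
    and may be out of order.
    """
    rng = random.Random(seed)              # seeded for repeatable runs
    pending = deque(enumerate(messages))   # messages not yet acknowledged
    received = []
    transmissions = 0
    link = 0

    for _ in range(max_rounds * max(1, len(messages))):
        if not pending:
            break
        seq, payload = pending.popleft()
        loss = link_loss[link % len(link_loss)]   # round-robin over links
        link += 1
        transmissions += 1

        data_arrived = rng.random() >= loss
        if data_arrived:
            received.append((seq, payload))

        # The acknowledgement travels back over the same lossy link.
        ack_arrived = data_arrived and rng.random() >= loss
        if not ack_arrived:
            # No ack: requeue, so the retry goes out on a later link.
            pending.append((seq, payload))

    return received, transmissions, len(pending)


def deduplicate(received):
    """Receiver side: keep one payload per sequence number, in order."""
    unique = {}
    for seq, payload in received:
        unique.setdefault(seq, payload)
    return [unique[seq] for seq in sorted(unique)]
```

Calling send_at_least_once(msgs, [0.3, 0.5, 0.1]) and passing the first
return value to deduplicate() gives back the original messages in order
whenever the third return value (the undelivered count) is 0. A lost
acknowledgement is what produces duplicates, which is why the receiver has
to tolerate them.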

Follow-Ups:
- [nanomsg] Re: Load balancing / redundancy among connections
  - From: Garrett D'Amore

References:
- [nanomsg] Load balancing / redundancy among connections
  - From: Name Withheld
- [nanomsg] Re: Load balancing / redundancy among connections
  - From: Alex Elsayed
