[nanomsg] Re: initial code repo for Go version of SP protocols

  • From: Alex Elsayed <eternaleye@xxxxxxxxx>
  • To: nanomsg@xxxxxxxxxxxxx
  • Date: Mon, 24 Mar 2014 15:37:52 -0700

I have a number of concerns with this; replies inline.

Martin Sustrik wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Garrett,
> 
> Agreed. Relying on IP fragmentation and going forward with a
> simplistic UDP protocol (as outlined in the RFC) seems to be the best
> idea.

Doing this would go over like a lead balloon in the IETF, for the simple 
reason that it offers zero capability for congestion control.

Doing it over DCCP is better, but the information needed for congestion 
control (ACK/NAK) is also sufficient for retransmission, which makes the 
distinction something of a moot point.

> Still, keep in mind that IPv6 doesn't support fragmentation.
> 
> Martin
> 
> On 24/03/14 06:51, Garrett D'Amore wrote:
>> 
>> On Mar 23, 2014, at 10:08 PM, Martin Sustrik
>> <sustrik@xxxxxxxxxx> wrote:
>> 
>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>> 
>>> On 24/03/14 04:57, Garrett D'Amore wrote:
>>>> It would be pretty straightforward to add UDP support.  The
>>>> issue will be defining what the wire format for UDP looks like.
>>>> I have some thoughts here and if I had my druthers each SP
>>>> message would translate to a single UDP datagram.  Others might
>>>> have different ideas.
>>> 
>>> Yes. I would say that UDP-mapping RFC should contain a single
>>> sentence: "SP message maps directly to a UDP datagram."
>>> 
>>> That obviously entails that all messages exceeding path MTU will
>>> be dropped silently.
>> 
>> Mostly agreed.   However….
>> 
>> *IP* can perform UDP fragment reassembly.  I think it supports up
>> to 64k packets, even over link layers that are substantially
>> smaller (ethernet MTUs are 1500 bytes usually).

This also enormously magnifies the effects of packet loss - losing one 
fragment loses the entire packet, resulting in the application layer 
possibly resending the entire thing (REQ), causing a massive bandwidth 
amplification.

Combined with the total lack of congestion control, this leads to rapid 
congestion collapse.

>> 64k would probably be an ultimate size limit in any case (owing to
>> IP limitations).  Minus IP/UDP header overheads of course! :-)

This might result in quite a bit of unhappiness.

>> What is interesting is that Path MTU discovery tends to set a
>> flag (Don't Fragment) telling routers not to fragment the
>> packet.  This is intentional so that IP will reduce the MTU to
>> something that works across all intermediate links without
>> fragmentation or fragment reassembly.

This relies on a specific type of ICMP packet being able to find its way 
back to the sender. This is broken with depressing frequency by overzealous 
firewall configurations.

>> It's not clear to me that imposing such a restriction on SP is a
>> good idea.  While I think it would be a pretty bad idea to build a
>> protocol that *relied* on fragment reassembly (these are slow path,
>> edge cases in IP stacks after all), there might be real reasons to
>> support an occasional large frame.  That said… why wouldn’t you use TCP
>> in such circumstances? :-)

Using TCP is suboptimal here: with XREP/XREQ you have independent SP 
interactions sharing one connection, so they all suffer head-of-line 
blocking from each other.

In order for it to be even somewhat acceptable in the IETF, what would need 
to happen is as follows:
* SP PDUs are split into datagrams, such that they fit under the MTU. There 
are RFCs on what to do if PMTUD fails, so there are fallbacks.
* Congestion control MUST occur

Once you do those, you might as well do this (congestion control already 
gives you all the information about what needs to be resent):
* Datagrams are resent if they are not ACKed or are NAKed

And then as a free optimization, you really want this:
* Datagrams within SP PDUs are ordered, but there are no ordering 
constraints _between_ SP PDUs

Because without that "unordered between PDUs" bit, you gain nothing over 
TCP.

>>> 
>>> As for a possibility of making a more complex transport on top of
>>> UDP, the primary motive, IMO, would be to allow messages to
>>> exceed PMTU. That in turn means adding identification of peers
>>> (either by using connections or by using globally unique
>>> identifiers), sequence numbering the packets, PMTU discovery,
>>> identifying message boundaries within packets etc. If anyone is
>>> interested in this, I'll be happy to elaborate.
>> 
>> IP already does all this…. it’s called fragment reassembly.
>> nanomsg should *NOT* reinvent this particular wheel IMO.  If
>> someone really cares to transport large frames, they have two
>> options:
>> 
>> 1. Use TCP
>> 2. Figure out an application layer solution
>> 
>> Making SP/nanomsg offer an alternative solution just seems….
>> wasteful… to me.
>> 
>> - Garrett
>> 
>> 
>> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
> 
> iQEcBAEBAgAGBQJTL+jsAAoJENTpVjxCNN9Y2zwH/38NdVA6a33bY1b82BaMLPkM
> T9CXM3lM9AJrY4I0bOToGghk9YpGiRs0yFQXzoHx6flaFsBtTH4NEzjVAodRW2eB
> Tkpzh0W999F0J1DuKPso8+xgK4QWGKqy4ChuNl5c4RAtLk2ennnvTlQsRR0XB+A7
> byio/BGaDnvKdfFggUVpTMqlbBoSpmMQ4gusOvRwTmqVK9D+DB4s4J+zV5RLLxSq
> F3P4JeYtQlR1VGEZoNzWlMclSaCtOEhSf3iJIYYIR0Bkcl0VjCl32dOJIUqLkHwL
> wOJ7Zs2pLPnLWk1AImWF0/n7Voq8oDLI9jmI+YJDSe7GvwDXR7/3Gtd2K6YBgBg=
> =ePt3
> -----END PGP SIGNATURE-----


