[nanomsg] Re: draft surveyor RFC

  • From: Drew Crawford <drew@xxxxxxxxxxxxxxxxxx>
  • To: nanomsg@xxxxxxxxxxxxx
  • Date: Sat, 7 Mar 2015 19:04:19 -0600

> It seems like you aren't really planning on being part of any bigger nanomsg 
> ecosystem though.  That's somewhat disappointing since I think you are a 
> smart guy with useful ideas.  Oh well. 

I have very much tried to be a part of the nanomsg ecosystem for the past 
several years. From patches, to offers of patches, to very serious change 
proposals. Unfortunately, it has not exactly worked out on the whole.

The charitable interpretation is that I have different problems that require 
different solutions.  The uncharitable interpretation is that the process has 
failed to incorporate diverse viewpoints, and will likely continue to do so for 
the foreseeable future.

Regardless, like you, I am motivated to solve my own problems.  That I have 
to break with nanomsg to accomplish that is unfortunate, but I certainly do 
not see what arguing about RFCs is going to solve in the next 2 years that it 
has not solved in the last 2.  A change of strategy is required.

I certainly do not begrudge you inventing a solution that solves your 
problems, as that is very much in the spirit of what I am doing, and what I 
think needs to happen much more often for the project as a whole to be healthy. 
If you intend to contribute an implementation to libnanomsg I think 
standardization would be appropriate.

Drew


> On Mar 7, 2015, at 6:35 PM, Garrett D'Amore <garrett@xxxxxxxxxx> wrote:
> 
> I would actually be submitting the changes for both mangos and libnanomsg.  
> 
> I suspect you are right and very few people use this protocol.  But I now 
> have a need for it, and I need it to not be broken.  So I'm motivated. 
> 
> I use mangos and libnanomsg as part of my project - some agents are written 
> in C and some in Go.  So it's important to me that the two be able to 
> communicate. 
> 
> I'm curious to see what you produce but you seem to have deviated in major 
> ways that we don't know about yet. Interoperability doesn't seem to be your 
> particular concern.  I guess that's fair enough.  It seems like you aren't 
> really planning on being part of any bigger nanomsg ecosystem though.  That's 
> somewhat disappointing since I think you are a smart guy with useful ideas.  
> Oh well. 
> 
> Sent from my iPhone
> 
> On Mar 7, 2015, at 4:25 PM, Drew Crawford <drew@xxxxxxxxxxxxxxxxxx> wrote:
> 
>>> I think the RFCs need to be prescriptive at least in terms of 
>>> interoperability. 
>> 
>> Interoperability with what?
>> 
>> For fun, I searched github for NN_SURVEYOR.  I found nanomsg forks, nanomsg 
>> bindings, and 6-line code samples.  I did not see any serious users of 
>> NN_SURVEYOR.  (Admittedly I did not look beyond the first few pages.)  And 
>> in any case nobody seems to be responding to your questions about the 
>> protocol’s future.
>> 
>> I doubt very much that there is enough nanomsg deployed at present that any 
>> breaking change in any protocol would deserve more than a handful of 
>> scheduled maintenance windows.  I probably run one of the larger nanomsg 
>> deployments in terms of user count, and yet I am the one making breaking 
>> changes.  That’s a clue.
>> 
>>> Otherwise we should all just go our separate ways. That would be 
>>> unfortunate IMO. 
>> 
>> Well, let’s not kid ourselves.  This is already the present reality.
>> 
>> • libnanomsg is, in my estimation, unmaintained at the present time
>> • mangos is driving this SURVEYOR change, which few outside it seem to care 
>>   about strongly
>> • I have for more than a year now been semi-quietly shipping a fork which 
>>   breaks RFCs at minor points
>> • I am currently crossing the threshold between minor and major RFC breakage 
>>   and am making plans to move my breaking work into the open
>> 
>> The ship to produce a single universal standard has sailed.  It’s shuffled 
>> off the mortal coil.  It is an ex-parrot.
>> 
>> I don’t know if I would describe that as “unfortunate”.  It simply means 
>> people have different problems, and that there is not one solution to rule 
>> them all.  I know that even as I am breaking RFCs, I am still glad I have 
>> RFCs to break.  They contain useful ideas.  There are just some details that 
>> aren’t a good fit for my circumstances.
>> 
>> Returning to the RFC in question, I can tell you that I won’t implement it.  
>> Again, that is not to say that the solution is bad, simply that I do not 
>> suffer from the motivating problem.  As such I don’t think I’m in a fair 
>> position to evaluate the solution in enough detail to have an informed 
>> opinion.
>> 
>> As far as building consensus on getting the RFC merged in, I think the 
>> operative question is what the support is from the implementations. Clearly 
>> I am out; but if there is a clear plan to bring it to both libnanomsg and 
>> mangos then I think there is reason enough to have it standardized.
>> 
>> Drew
>> 
>> 
>> 
>> 
>> 
>>> On Mar 7, 2015, at 4:31 PM, Garrett D'Amore <garrett@xxxxxxxxxx> wrote:
>>> 
>>> I think the RFCs need to be prescriptive at least in terms of 
>>> interoperability.  Otherwise we should all just go our separate ways. That 
>>> would be unfortunate IMO.  The current implementation in libnanomsg cannot 
>>> support devices properly with survey nor can it support multiple surveyors 
>>> connected to a responder.  This is a bug that must be fixed.  
>>> 
>>> One could argue that the approach of using backtraces could be an option, 
>>> or maybe a new protocol altogether.  I imagine you'd prefer to have the 
>>> inclusion be optional so that trivial 1:1 topologies or those without 
>>> replies could skip it. But that seems like a substantial departure from 
>>> existing practice. 
>>> 
>>> The question therefore I think is whether the old protocol needs to be 
>>> preserved in parallel to the new one and whether this approach to the fix 
>>> is generally agreeable. 
>>> 
>>> I'd vote to drop the legacy broken protocol from libnanomsg but I can see 
>>> that some may want to preserve it for compatibility.  
>>> 
>>> Sent from my iPhone
>>> 
>>> On Mar 7, 2015, at 12:11 PM, Drew Crawford <drew@xxxxxxxxxxxxxxxxxx> wrote:
>>> 
>>>> My opinion can be summarized as follows.
>>>> 
>>>> 1.  I don’t suffer from the problem motivating this change.
>>>> 2.  As a corollary to 1, I won’t implement it.  
>>>> 3. That is not to say that it is a bad solution to the motivating problem, 
>>>> just that I don’t need a solution to the motivating problem.
>>>> 4. I do in principle object to “unnecessarily narrow” problems being 
>>>> solved inside new protocol specifications, of which I think this is an 
>>>> example
>>>> 5. But it is not the only example, and I do not think it is an especially 
>>>> egregious departure from existing nanomsg practice, so the real villain is 
>>>> another castle
>>>> 6. At some level we have to decide what is the purpose of the RFC 
>>>> directory.
>>>> 6a.  If it is descriptive, then this document adequately describes what 
>>>> you are doing, and I don’t object to that.  Nor do I understand how an 
>>>> objection would be possible to any RFC that adequately describes what 
>>>> somebody is doing.
>>>> 6b.  If it is prescriptive, then I simply won’t follow it.  But I’m not 
>>>> invested enough in the motivating problem to present an argument that 
>>>> those who follow it are wrong.
>>>> 
>>>> Drew
>>>> 
>>>> 
>>>>> On Mar 7, 2015, at 1:00 PM, Garrett D'Amore <garrett@xxxxxxxxxx> wrote:
>>>>> 
>>>>> Bueller?  Bueller?
>>>>> 
>>>>> Would really really like a solution to this.  Any other opinions (for my 
>>>>> approach, or against it)?  Or should I just go ahead and submit a pull 
>>>>> request at this point?
>>>>> 
>>>>>   - Garrett
>>>>> 
>>>>>> On Feb 25, 2015, at 9:57 AM, Garrett D'Amore <garrett@xxxxxxxxxx> wrote:
>>>>>> 
>>>>>> So I didn’t see a reply to this.  I’d really like to move forward with 
>>>>>> this — I have a need for “fixed” surveyor methods in my application.  
>>>>>> I’m writing the code that does REQ/REP style processing for now - I 
>>>>>> think this is more than sufficient for all current needs.  I’d hate to 
>>>>>> defer fixing this pending the requirements for the creation of an as yet 
>>>>>> nonexistent UDP transport.
>>>>>> 
>>>>>> I’ve certainly convinced myself that even UDP can live with the 32-bit 
>>>>>> “pipe IDs” that are currently being embedded in the headers.  Doing so 
>>>>>> will require some modest amount of state on the peers, but frankly 
>>>>>> that’s not unreasonable, and I think it’s far better than carrying all 
>>>>>> that state in the headers themselves.  (I see grave concerns with 
>>>>>> carrying identifying information like intermediate IP addresses.)
>>>>>> 
>>>>>>  - Garrett
>>>>>> 
>>>>>>> On Feb 20, 2015, at 11:16 AM, Garrett D'Amore <garrett@xxxxxxxxxx> wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> On Feb 20, 2015, at 12:49 AM, Martin Sustrik <sustrik@xxxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>> On 2015-02-19 22:08, Garrett D'Amore wrote:
>>>>>>>> 
>>>>>>>>> Thinking about it further, I think this is a *bad* idea.  The problem
>>>>>>>>> is that we then don’t have a way to infer stack depth easily, which
>>>>>>>>> makes it impossible to count hops and problematic therefore for loop
>>>>>>>>> prevention.
>>>>>>>>> Additionally, there may be value in keeping more state (even for UDP)
>>>>>>>>> with a pipe than the peer.  Therefore, I’m going to propose that a UDP
>>>>>>>>> transport implementation could create pseudo-pipes, with a cache and
>>>>>>>>> timeout associated with them, as well as some upper bound.
>>>>>>>>> For example, time out any pipe without traffic seen in the last 60
>>>>>>>>> seconds.  Then when a new message is received from a different peer,
>>>>>>>>> create a pipe ID for it, storing the IP address & port of the peer.
>>>>>>>>> When traffic comes in from the same peer, or goes out to it, bump the
>>>>>>>>> timer on it.
>>>>>>>>> Figure a maximum of “n” UDP pipes to be opened.  For example, 10,000
>>>>>>>>> ports.  In the worst case, you’d need to store something like 64 bits
>>>>>>>>> for the IP address and port (more for IPv6), plus room for a sweep
>>>>>>>>> hand timer (for mark and sweep based timeout, which would be
>>>>>>>>> simplest), so data buckets are 8 bytes, and figure another 32 bytes
>>>>>>>>> for tracking linked list linkage (linking buckets in hash table),
>>>>>>>>> plus guess maybe another 8 bytes of overhead, so round up to 64 bytes
>>>>>>>>> per UDP port.  The sum total of this is 64K per 1000 ports, which
>>>>>>>>> comes in at less than a MB for an entire 10,000 ports.  If you want to
>>>>>>>>> support up to 1M active unique peers, it gets a little more expensive,
>>>>>>>>> but it’s still under 100MB, which is not that big a deal for modern
>>>>>>>>> computers.
>>>>>>>>> I doubt many single servers have to deal with 1M unique visitors per
>>>>>>>>> minute, and those that do are pretty darned beefy. :-)  (Actually,
>>>>>>>>> looking at, say, Google, which had the highest web visitor count for
>>>>>>>>> the month back in May of 2012: they had 173 M unique visitors per
>>>>>>>>> month, which is actually only 4004 unique visitors per *minute*.  So
>>>>>>>>> having a limit of 1000, or even 10000 max open pipes for one service
>>>>>>>>> instance doesn’t seem limiting.)
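
The pseudo-pipe scheme described above can be sketched roughly as follows. This is only an illustration, not code from mangos or libnanomsg: the class name PseudoPipeTable and its methods (touch, peer_for, sweep) are hypothetical, a plain dict stands in for the hash table with linked buckets, and a full scan stands in for the sweep-hand timer. The 60-second idle timeout and the 10,000-pipe bound are the figures from the message.

```python
import time


class PseudoPipeTable:
    """Hypothetical table mapping UDP peers (ip, port) to pipe IDs."""

    def __init__(self, max_pipes=10_000, idle_timeout=60.0, clock=time.monotonic):
        self.max_pipes = max_pipes
        self.idle_timeout = idle_timeout
        self.clock = clock
        self._by_peer = {}  # (ip, port) -> pipe ID
        self._by_id = {}    # pipe ID -> [peer, last_seen]
        self._next_id = 1

    def _alloc_id(self):
        # Cycle through 1..0x7FFFFFFF so the upper bit is never set.
        while True:
            candidate = self._next_id
            self._next_id = (self._next_id % 0x7FFFFFFF) + 1
            if candidate not in self._by_id:
                return candidate

    def _drop(self, pipe_id):
        peer, _ = self._by_id.pop(pipe_id)
        del self._by_peer[peer]

    def sweep(self):
        """Remove every pipe that has been idle longer than the timeout."""
        now = self.clock()
        expired = [pid for pid, (_, seen) in self._by_id.items()
                   if now - seen > self.idle_timeout]
        for pid in expired:
            self._drop(pid)
        return len(expired)

    def touch(self, peer):
        """Inbound traffic: return the peer's pipe ID, creating one if needed.

        Returns None when the table is full even after a sweep.
        """
        now = self.clock()
        pipe_id = self._by_peer.get(peer)
        if pipe_id is not None:
            self._by_id[pipe_id][1] = now  # bump the timer
            return pipe_id
        if len(self._by_id) >= self.max_pipes:
            self.sweep()
        if len(self._by_id) >= self.max_pipes:
            return None
        pipe_id = self._alloc_id()
        self._by_peer[peer] = pipe_id
        self._by_id[pipe_id] = [peer, now]
        return pipe_id

    def peer_for(self, pipe_id):
        """Outbound traffic: find where a reply goes, or None if expired."""
        entry = self._by_id.get(pipe_id)
        if entry is None:
            return None
        now = self.clock()
        if now - entry[1] > self.idle_timeout:
            self._drop(pipe_id)
            return None
        entry[1] = now  # outgoing traffic also bumps the timer
        return entry[0]
```

A transport would call touch() for each datagram received and peer_for() when routing a reply; a reply whose pipe has already timed out simply has nowhere to go and is dropped.
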
>>>>>>>> 
>>>>>>>> First: Why have pseudo-connections at all? (Ignoring the issue of 
>>>>>>>> variable-length backtrace records.)
>>>>>>> 
>>>>>>> Again, it’s tracking whatever state might be necessary to process the 
>>>>>>> packet *and* return the reply.  To get through your topology, state is 
>>>>>>> required.  The question is whether all the state lives in the packet, 
>>>>>>> or you are willing to let devices along the path participate in state 
>>>>>>> keeping.  Since the state is really only required for routing 
>>>>>>> replies, not every protocol needs it.  For example pub/sub 
>>>>>>> only really needs a hop count which can travel with the frame. (And 
>>>>>>> that’s missing, but another problem to fix later for loop prevention.)
>>>>>>> 
>>>>>>> There’s another point here too… the middle components may have state 
>>>>>>> that doesn’t fit well in 32 bits, and could even be pretty large.  
>>>>>>> Forcing that to travel with the frame is onerous.
>>>>>>> 
>>>>>>> And, then there is a privacy problem.  If all the state needed is kept 
>>>>>>> with the frame, then it is exposed on the wire.   This may expose 
>>>>>>> things about my internal network (IP addresses and so forth) that I 
>>>>>>> consider private to me.  That has two potential side effects.  One is 
>>>>>>> security oriented (my internal network gets exposed via this protocol), 
>>>>>>> the other is architectural (people can start attempting to *use* that 
>>>>>>> knowledge in their applications, violating all the nice clean layering 
>>>>>>> that we’ve built; having parseable headers is I think ultimately a road 
>>>>>>> to hell.)
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> Second: My conceptual image of a UDP socket is a universal radio 
>>>>>>>> transmitter/receiver. It can get data from anyone and send data to 
>>>>>>>> anyone. No restrictions aside from the limited packet length. If we are 
>>>>>>>> going to have udp:// transport I would like to preserve that 
>>>>>>>> conceptual image. If, on the other hand, we are going to build a more 
>>>>>>>> connection-like transport on top of UDP, let's call it something 
>>>>>>>> different. In short, transport functionality should correspond to the 
>>>>>>>> transport name.
>>>>>>> 
>>>>>>> I don’t see how that is at odds with what I’ve described, for the 
>>>>>>> protocols where that makes sense (e.g. BUS).  Now that said, I’m only 
>>>>>>> thinking about unicast UDP.  If you’re wanting to figure out ways to 
>>>>>>> use broadcast or multicast UDP, *that* feels like a bigger departure — 
>>>>>>> I think some of the protocols (such as req/rep) fall down in the face 
>>>>>>> of this.
>>>>>>> 
>>>>>>>> 
>>>>>>>> Third: Here's another use case for variable-length items, just off the 
>>>>>>>> top of my head: Imagine a REQ/REP or SURVEYOR topology spanning from 
>>>>>>>> inside of a company to the outside world. The company may not want to 
>>>>>>>> expose details of its network to the world (via the traceback records) 
>>>>>>>> and thus may choose to place a device at the edge of their network that 
>>>>>>>> takes the current stack of the request and encrypts it, creating a 
>>>>>>>> single mangled record. When the replies arrive at the edge, they are 
>>>>>>>> decrypted and the message is routed forward into the corporate network.
>>>>>>> 
>>>>>>> That level of privacy is *easier* to achieve by just ripping off the 
>>>>>>> header entirely and writing a new one - in fact, if you have some state 
>>>>>>> here you can save the backtrace in your state.  You could of course 
>>>>>>> implement that mangling bit you just described today instead.  But in 
>>>>>>> that case it’s still going to appear to have a set number of hops.  If 
>>>>>>> the mangled header has a different size, that will cause confusion.  It 
>>>>>>> would be bad to store a very much longer header than what the message 
>>>>>>> had on ingress, because that would appear to be adding hops to a naive 
>>>>>>> examiner.
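
The "rip off the header and keep the backtrace in local state" alternative can be sketched as below. This is a hypothetical illustration only: EdgeDevice, outbound and inbound are invented names, the backtrace is treated as an opaque byte string, and the replacement header is assumed to be a single 32-bit key in network byte order. Note that it changes the header size, which is exactly the hop-counting concern raised in the reply above.

```python
import struct


class EdgeDevice:
    """Hypothetical edge device that hides internal backtraces."""

    def __init__(self):
        self._saved = {}    # opaque key -> internal backtrace bytes
        self._next_key = 1

    def outbound(self, backtrace: bytes) -> bytes:
        """Swap the internal backtrace for a single opaque 32-bit key."""
        key = self._next_key
        # Keep keys within 31 bits so the upper bit stays clear.
        self._next_key = (self._next_key % 0x7FFFFFFF) + 1
        self._saved[key] = backtrace
        return struct.pack(">I", key)

    def inbound(self, header: bytes) -> bytes:
        """Restore the internal backtrace for a reply arriving at the edge."""
        (key,) = struct.unpack(">I", header)
        # Not removed on lookup: a survey may produce several replies.
        return self._saved[key]


edge = EdgeDevice()
internal = struct.pack(">III", 7, 42, 99)   # three internal pipe IDs
public = edge.outbound(internal)            # what the outside world sees
restored = edge.inbound(public)             # what the reply is routed with
```

A real device would also need to expire saved entries, for example with the same idle timeout used for pipes.
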
>>>>>>> 
>>>>>>> You know, it occurs to me that we could probably dispense with a lot of 
>>>>>>> these if we just changed the final request ID part of the header from a 
>>>>>>> 32-bit word (1 + 31 bits) to have a different format; for example, 
>>>>>>> 1+7bits+24 bits.  The 24-bits would be a pipe ID, and the 7-bits could 
>>>>>>> carry a hop count.  That would leave room for up to 16 million pipes, 
>>>>>>> and really who can handle more than that simultaneously?  And you’d be 
>>>>>>> able to count up to 127 hops — and frankly nobody wants messages 
>>>>>>> bouncing around their network for more hops than that! :-)
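
As a rough sketch of the proposed 1+7+24 layout for the final header word: the bit positions below (flag in bit 31, hop count in bits 24 to 30, ID in bits 0 to 23) are an assumption, since the message gives only the field widths, and the function names are hypothetical.

```python
FINAL_FLAG = 0x80000000  # upper bit marks the final word of the header
MAX_HOPS = 0x7F          # 7 bits: up to 127 hops
MAX_ID = 0xFFFFFF        # 24 bits: roughly 16 million IDs


def pack_final_word(hops: int, ident: int) -> int:
    """Combine a hop count and a 24-bit ID into one 32-bit word."""
    if not 0 <= hops <= MAX_HOPS:
        raise ValueError("hop count does not fit in 7 bits")
    if not 0 <= ident <= MAX_ID:
        raise ValueError("ID does not fit in 24 bits")
    return FINAL_FLAG | (hops << 24) | ident


def unpack_final_word(word: int):
    """Return (hops, ident) from a final header word."""
    if not word & FINAL_FLAG:
        raise ValueError("upper bit is clear; not a final header word")
    return (word >> 24) & MAX_HOPS, word & MAX_ID


def bump_hops(word: int):
    """Increment the hop count, or return None if the limit is reached."""
    hops, ident = unpack_final_word(word)
    if hops >= MAX_HOPS:
        return None  # a device would drop the message here
    return pack_final_word(hops + 1, ident)
```

With this layout a device can enforce a hop limit by looking at a single word, without walking the rest of the header.
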
>>>>>>> 
>>>>>>> If we made *that* change, then we could dispense with most of the 
>>>>>>> header payload rules, except to require the following:
>>>>>>> 
>>>>>>> a) devices always strip off the same size header that they attach.
>>>>>>> b) headers are always grown in increments of 32-bits.
>>>>>>> c) each intermediate 32-bit word of a header must have the upper bit 
>>>>>>> cleared.
>>>>>>> 
>>>>>>> What transports or protocols do beyond that then becomes a 
>>>>>>> transport/protocol decision.
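
Rules (a) through (c) can be illustrated with a small header walker. This is a sketch under stated assumptions: words are taken to be in network byte order, the final word is taken to be the one with its upper bit set, and the function names are hypothetical rather than part of any existing API.

```python
import struct

UPPER_BIT = 0x80000000


def push_pipe_id(message: bytes, pipe_id: int) -> bytes:
    """Device on the request path: prepend one 32-bit word (rule b)."""
    if pipe_id & UPPER_BIT or not 0 <= pipe_id <= 0xFFFFFFFF:
        raise ValueError("intermediate words must have the upper bit clear")
    return struct.pack(">I", pipe_id) + message


def pop_pipe_id(message: bytes):
    """Device on the reply path: strip exactly the word it added (rule a)."""
    if len(message) < 4:
        raise ValueError("message too short to carry a header word")
    (pipe_id,) = struct.unpack_from(">I", message)
    return pipe_id, message[4:]


def split_header(message: bytes):
    """Return (intermediate_words, final_word, payload).

    Walks 32-bit words until one has its upper bit set (rule c), so the
    number of intermediate words gives the stack depth, i.e. the hop count.
    """
    words = []
    offset = 0
    while offset + 4 <= len(message):
        (word,) = struct.unpack_from(">I", message, offset)
        offset += 4
        if word & UPPER_BIT:
            return words, word, message[offset:]
        words.append(word)
    raise ValueError("no final header word found")
```

Because every device adds and removes a fixed-size word, a receiver can recover the stack depth from the header alone without understanding what each word means to the device that wrote it.
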
>>>>>>> 
>>>>>>> Now it turns out that in my implementation of mangos, the protocol is 
>>>>>>> responsible for adding / removing “pipe IDs” to the header, because the 
>>>>>>> protocol doesn’t know transport details.  Internally all transports 
>>>>>>> just have a 32-bit ID assigned by the system, for each pipe they 
>>>>>>> present.  Breaking that abstraction would cause serious internal 
>>>>>>> redesign to be done, and that’s not something I’d like to do.  But I 
>>>>>>> also keep “connection” state details to offer up to APIs as well.  For 
>>>>>>> example, for TLS connections I can present the TLS peer certificate 
>>>>>>> that was presented (if any), for websocket I give access to the actual 
>>>>>>> enclosing HTTP headers, and for TCP and things on top of it, I give 
>>>>>>> access to the peer’s TCP endpoint address.  (In the future I hope to 
>>>>>>> offer access to peer credentials for IPC, and on systems that offer it, 
>>>>>>> on local TCP connections too.  There is some ahem — work — to do to 
>>>>>>> make that happen for systems because Go doesn’t expose the necessary 
>>>>>>> system calls — yet.  I’m probably going to send patches upstream to Go 
>>>>>>> to fix that for illumos/Solaris at least.)
>>>>>>> 
>>>>>>>         - Garrett
>>>>>> 
>>>>> 
>>>> 
>> 
