[nanomsg] Re: towards a more robust socket model

From: Garrett D'Amore <garrett@xxxxxxxxxx>
To: "nanomsg@xxxxxxxxxxxxx" <nanomsg@xxxxxxxxxxxxx>
Date: Sun, 16 Nov 2014 08:41:14 -0800
In theory I like your approach.  It is similar to System V STREAMS.  However, I 
suspect that raw vs. cooked mode won't be very elegant to solve this way.  
From my own experience with mangos, I don't think you can quite get away with 
this.  The protocols are rather complex and have behaviors that I don't think 
express well in layered form.  In particular, the protocols differ considerably 
from one another in the semantics that the application sees. 

That said I'd love to be proven wrong. 

I don't think the websocket example is well suited to this. It really is just 
a transport, by the way. At least it seems very clean to me in that respect.

But your security challenges, as well as other things you have discussed in 
the past, do seem like a natural fit for a filtering-type approach like the 
one you describe here. 

Sent from my iPhone

> On Nov 15, 2014, at 11:51 PM, Drew Crawford <drew@xxxxxxxxxxxxxxxxxx> wrote:
> 
> In my time on this project I have observed some disagreements that seem 
> intractable:
> 
> - There has been a lot of discussion about implementing 
>   security/encryption/cryptography in nanomsg, going back years.
> - There have been months of discussion and a failed patch regarding #324, 
>   which is a design error that creates undesirable coupling between certain 
>   protocols and transports.  I think every 3rd-party protocol/transport 
>   contributor that I know of has worked around it in some fashion, in 
>   various incompatible ways.
> - There is now discussion about whether transports are “content aware” or 
>   “content dumb”, or “protocol aware” or “protocol dumb”, in the context of 
>   opcodes for WS.  The issue arises because WS as specified by W3C has 
>   elements of both a nanomsg protocol and a nanomsg transport, but so far 
>   has been implemented as a transport and not as a protocol.
> 
> I believe these problems (and maybe others) can be solved by introducing a 
> more robust software architecture for sockets.  However, as far as I’m aware, 
> nobody has proffered a particular alternative architecture or explained how 
> it would solve any problem. I would like to proffer such an architecture and explain 
> how it solves, or substantially improves, all three of these problems.
> 
> I am most familiar with the security/encryption problem because I have solved 
> it.  I have worked on this for over a year and my solution is currently 
> deployed to around 10k users as part of a commercial project.  For reasons 
> that will become clear, my solution so far only works for REQ/REP, and next 
> year I have a requirement to get it working for at least one other protocol 
> family.  So keep in mind that at the end of the tunnel for this architecture 
> problem is potentially getting commercially-developed security contributed 
> back to core.
> 
> The problem with doing cryptography really reduces to the following 
> situation.  You probably want to enforce message integrity across a complex 
> and multi-hopped network.  This implies injecting some code to sign and 
> verify messages *near the application layer, above any scalability 
> protocols*.  Meanwhile you want to encrypt and decrypt traffic on a 
> hop-by-hop basis, to stop an NSA-level adversary from seeing (ciphertext, but 
> identical) packets moving from hop to hop and thereby deducing who is talking 
> to whom.  This implies injecting some code *right above the transport layer*, 
> to handle the hop-by-hop situation.  It turns out it is also useful to inject 
> code in more places, to handle sessions, and some other crypto-related 
> problems.
> 
> However, nanomsg sockets have only 2 customization slots: a protocol and a 
> transport.  So if your integrity code lives in the protocol slot, and your 
> hop-by-hop encryption lives in the transport slot, then you have no slots to 
> specify a scalability protocol nor a transport!
> 
> The way I solve this at the moment is a terrible tower of hacks.  Some of 
> those hacks have wandered upstream, for example my work on modular devices is 
> at its core a way to inject code into more places as a packet traverses a 
> network than the 2 customization slots that nanomsg recognizes.
> 
> What I’d like to propose is that, instead of having 2 fixed slots, a nanomsg 
> socket be composed of a stack of components of arbitrary length.  Each 
> component has its output piped to the input of the next component, like 
> processes in a Unix shell.  We call the bottom-most component, which talks to 
> the network, a transport, and we call every other component, each of which 
> talks to the next component in the dataflow, a protocol.
> 
> [protocol]->[protocol]->[protocol]->[transport]    (socket 1)
>                     <———————>
> [transport]->[protocol]->[protocol]->[protocol]    (socket 2)
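
To make the stacking idea concrete, here is a minimal C sketch of one socket's send path. It is illustrative only and makes several assumptions: `struct msg`, `struct component`, `stack_send`, and the four toy layers are hypothetical names, not nanomsg API, and each layer merely prepends a one-byte tag where a real protocol would add its own header or transform the payload.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MSG_CAP 256  /* arbitrary capacity chosen for this sketch */

/* Hypothetical message buffer; not a nanomsg type. */
struct msg {
    unsigned char data[MSG_CAP];
    size_t len;
};

/* Hypothetical component: one stage in a socket's stack.
   send() transforms the message in place; returns 0, or -1 on failure. */
struct component {
    const char *name;
    int (*send)(struct msg *m);
};

/* Prepend a one-byte tag, standing in for a real protocol header. */
int prepend_tag(struct msg *m, unsigned char tag)
{
    assert(m != NULL);
    if (m->len >= MSG_CAP)
        return -1;
    memmove(m->data + 1, m->data, m->len);
    m->data[0] = tag;
    m->len += 1;
    return 0;
}

/* Toy layers: placeholders for integrity, a scalability protocol,
   hop-by-hop encryption, and the transport at the bottom. */
int integrity_send(struct msg *m) { return prepend_tag(m, 'I'); }
int reqrep_send(struct msg *m)    { return prepend_tag(m, 'R'); }
int hopcrypt_send(struct msg *m)  { return prepend_tag(m, 'E'); }
int transport_send(struct msg *m) { (void)m; return 0; /* would write to the network */ }

/* Pipe the message through every component, top of the stack first. */
int stack_send(const struct component *stack, size_t n, struct msg *m)
{
    assert(stack != NULL && m != NULL);
    for (size_t i = 0; i < n; i++) {
        if (stack[i].send(m) != 0)
            return -1;
    }
    return 0;
}
```

With the stack integrity, reqrep, hopcrypt, transport, the layer closest to the transport ends up with the outermost tag, which is the nesting one would expect on the wire. An existing socket would simply be a stack of length 2.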
> 
> 
> In this way what I am proposing is not a radical departure from the existing 
> architecture, but rather a generalization of the existing architecture to 
> sockets with arbitrary numbers of protocols; a way to extend the architecture 
> to more kinds of sockets than can be realized today.  In particular, it’s 
> backwards compatible; all existing sockets can be represented very naturally 
> in this architecture, as sockets of length 2 with 1 protocol and 1 transport 
> (or, in another way, that I propose later in this email).
> 
> This architecture solves the security problem in the following way.  It 
> allows me to inject security code at arbitrary places in the network stack 
> such as
> 
> [Integrity]  ->  [Scalability Protocol]  ->  [Hop-by-hop encryption]  ->  [Transport]
> 
> where the first and third component would be protocols new to nanomsg, while 
> the second and fourth components are existing components.
> 
> This architecture solves several implementation problems for WebSockets in 
> the following way.  It would be possible to build for example
> 
> [WS Opcodes Protocol]  ->  [Scalability Protocol]  ->  [WS Protocol]  ->  [TCP or TLS]
> 
> This splits the WS implementation into 2 protocols, one that can handle opcode 
> switching above the scalability layer and another that handles the bulk of WS 
> below the scalability layer.  These protocols could be used together, in the case 
> that opcode support is desired, or with just the main WS protocol, if opcode 
> support is not desired.  Finally, the actual transport (TCP or TLS, which 
> IIRC isn’t supported in the current WebSocket “transport”) is moved out into 
> a proper transport, where it is pluggable and interchangeable, both for WS 
> and also for any ordinary non-WS socket to use.  Under this scheme the 
> *actual transport* really is just a dumb pipe, which has been one important 
> philosophical objection to WS opcodes.  Nor is WS coupled to the *actual 
> transport*, another philosophical objection that has been raised to coupling 
> between protocols and transports.
> 
> This architecture also provides a clear path to more robust header handling, 
> which I think at present is non-intuitive and leads to unexpected situations 
> (like #324).  With this architecture, one would simply walk down a socket’s stack 
> and ask each protocol/transport how many bytes of overhead for headers it 
> would like to reserve [0].  Then nanomsg can do one up-front allocation of 
> the correct size for the message.  We could even standardize on a struct 
> describing the header format being declared by each protocol, [1]  so that 
> for a well-known stack the header can be easily parsed, for debugging or any 
> other purpose.
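
The header walk described above might look roughly like the following C sketch. Everything here is an assumption made for illustration: `struct layer`, `stack_header_bytes`, `message_alloc`, and `message_free` are hypothetical names rather than nanomsg functions, and the per-layer byte counts a caller supplies would come from each protocol or transport.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-layer descriptor; not a nanomsg type. */
struct layer {
    const char *name;
    size_t header_bytes;  /* overhead this layer asks to reserve */
};

/* Walk the stack and total the header space every layer requested.
   Returns SIZE_MAX if the total would overflow. */
size_t stack_header_bytes(const struct layer *stack, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (stack[i].header_bytes > SIZE_MAX - total)
            return SIZE_MAX;
        total += stack[i].header_bytes;
    }
    return total;
}

/* Hypothetical message: one allocation, headers in front of the body. */
struct message {
    unsigned char *base;  /* start of the single allocation */
    unsigned char *body;  /* base + reserved header space */
    size_t header_cap;
    size_t body_len;
};

/* One up-front allocation sized for all headers plus the body. */
int message_alloc(struct message *m, const struct layer *stack, size_t n,
                  size_t body_len)
{
    assert(m != NULL);
    size_t hdr = stack_header_bytes(stack, n);
    if (hdr == SIZE_MAX || body_len >= SIZE_MAX - hdr)
        return -1;
    m->base = malloc(hdr + body_len + 1);  /* +1 avoids a zero-size request */
    if (m->base == NULL)
        return -1;
    m->body = m->base + hdr;
    m->header_cap = hdr;
    m->body_len = body_len;
    return 0;
}

void message_free(struct message *m)
{
    assert(m != NULL);
    free(m->base);
    m->base = NULL;
    m->body = NULL;
}
```

Each layer would then write its header into the reserved region in front of `body`, so no layer needs to reallocate or copy the message on the way down.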
> 
> Finally, in addition to fixing those 3 problems, this proposal simplifies the 
> implementation of at least one existing practice, that of SP vs RAW sockets.  
> Currently SP vs RAW is implemented in a pseudo-OO fashion, where SP sockets 
> are essentially a subclass of their RAW counterparts, selectively overriding 
> certain methods and calling into the superclass as appropriate by the use of 
> method tables.  Whereas under the proposed architecture, SP sockets have a 
> natural dataflow representation:
> 
> [SP socket] -> [RAW socket] -> [Transport]
> 
> In conclusion I think this architecture makes significant progress on, if not 
> completely solves, many problems that have been previously intractable, and 
> also improves things not currently contemplated as problems.  It also manages 
> to unify many concerns into a common design that previously we have been 
> studying separately.  Finally, as a generalization of the existing system, it 
> is backwards compatible and not a radical departure from the previous design. 
> Really this proposal is a very old proposal, as old as the OSI layer model 
> itself.  We long ago decided that layering was fundamental to networking, and 
> I think it is time we brought this philosophy into nanomsg itself.
> 
> Some problems remain.  One problem is how to create these many-protocoled 
> sockets from an API perspective.  Another problem is how to implement 
> ispeer().  I’m sure there are other issues on both API and implementation 
> levels.  However, I think that if we can reach a broad consensus on 
> architecture in the abstract, the API and implementation issues will prove 
> much more tractable than the problems we have been arguing over for so long.
> 
> If on the other hand we don’t reach some consensus on improving the 
> architecture, I think we should be concerned.  I’m already shipping in 
> production an extensive cryptographic fork that cannot make it back into 
> upstream because it depends on finding solutions to the architecture problems 
> outlined here.  We are now facing a second decision point in that a second 
> commercial user has appeared who cannot fit well into the existing 
> architecture, and it will be the second time that someone works around the 
> architecture and invests resources into yet another fork which cannot for 
> technical reasons swim upstream.  If we fail to act now then I see no reason 
> why this trend will not continue, and rob nanomsg of free contributions that 
> would otherwise raise the tide for all boats.  I also see no reason why, once 
> enough affected users have accumulated, they wouldn’t consolidate themselves 
> into a unified fork that addresses their architecture concerns, and devote 
> their resources to that.
> 
> I don’t mean to get apocalyptic, and I think we have some time to look at 
> this problem and reach a carefully-weighed conclusion that balances a lot of 
> competing concerns.  I think for the first time we have, if nothing else, a 
> *specific* proposal that purports to address particular problems in a 
> particular way, rather than general criticism that the existing architecture 
> fails to contemplate some use case.  Historically I have been responsible for 
> the latter; I hope now to be more responsible for the former.
> 
> I look forward to seeing if we can reach a consensus on this architecture or 
> something like it that can unify many of our pet problems into a common 
> purpose.
> 
> Drew
> 
> [0] As an optimization, each component in the stack could specify whether it 
> is happy to work with headers in a separate buffer, or require headers to be 
> included in the message body.  At send-time, the protocols would get as 
> send() arguments a headers* and body* buffer that, if the entire stack 
> agreed, happens to be 2 different buffers, or if one or more protocols 
> disagreed, happens to be a contiguous buffer.  I think this resolves the 
> inproc/ipc discrepancy, since inproc can declare its willingness to store 
> headers in a noncontiguous buffer, while ipc can insist on a contiguous 
> buffer, and under the C specification, code written for non-contiguous 
> buffers always works in the contiguous case.  And as a notable improvement 
> against any competing proposal on the headers issue, security protocols could 
> also force contiguous buffers, so they could insist on encrypting headers and 
> bodies together, which is an important design issue for them.  The result is 
> a fast, zero-copy implementation where that makes sense, and a single-copy 
> implementation in all other cases.
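
The negotiation in footnote [0] can be sketched in C as follows. This is a minimal illustration under stated assumptions: `struct layer_pref`, `stack_needs_contiguous`, and `sum_bytes` are hypothetical names, and `sum_bytes` stands in for any layer code that receives separate header and body pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-layer preference; not a nanomsg type. */
struct layer_pref {
    const char *name;
    int needs_contiguous;  /* nonzero: headers and body must share one buffer */
};

/* The stack uses separate buffers only if every layer agrees;
   a single layer insisting on contiguity forces one buffer. */
int stack_needs_contiguous(const struct layer_pref *stack, size_t n)
{
    assert(stack != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (stack[i].needs_contiguous)
            return 1;
    }
    return 0;
}

/* Example of layer code written against two pointers. It never assumes
   body == hdr + hdr_len, so it behaves the same for either layout. */
unsigned long sum_bytes(const unsigned char *hdr, size_t hdr_len,
                        const unsigned char *body, size_t body_len)
{
    assert(hdr != NULL || hdr_len == 0);
    assert(body != NULL || body_len == 0);
    unsigned long sum = 0;
    for (size_t i = 0; i < hdr_len; i++)
        sum += hdr[i];
    for (size_t i = 0; i < body_len; i++)
        sum += body[i];
    return sum;
}
```

In the contiguous case the caller passes `body = hdr + hdr_len`; in the split case it passes two unrelated buffers. A layer that does need contiguity, such as one encrypting headers and body together, would set `needs_contiguous` and could then treat the pair as a single span.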
> 
> [1]  Although this specific item would imply the use of #pragma pack, which 
> is not strictly portable. I think most modern compilers support it, but it’s 
> not a standard.
> 
References:
- [nanomsg] towards a more robust socket model
  - From: Drew Crawford