[nanomsg] Re: websocket mapping

  • From: Garrett D'Amore <garrett@xxxxxxxxxx>
  • To: nanomsg@xxxxxxxxxxxxx
  • Date: Mon, 2 Feb 2015 23:04:17 -0800

So I had a DNS problem… that seems to be fixed now…

More thoughts.

I’ve abandoned (actually had done so earlier today) the approach of a separate 
exchange.  I’ve taken a simpler approach for websocket, which is to create a 
subprotocol value of “<x>.sp.nanomsg.org” where <x> represents the string 
“req”, or “rep” or somesuch.  But *and this is important* the value is the 
protocol offered by the server, i.e. a req client will actually send 
“rep.sp.nanomsg.org” here.  

It turns out that this means that in my implementation the protocol layer needs 
to know both its value, and the value of its supposed peer.
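
A minimal sketch of this naming scheme, for illustration only.  The peer 
table, the lowercase names other than “req” and “rep”, and both function 
names are assumptions; none of them are taken from mangos or from an RFC.

```python
# Hypothetical peer table. Only "req" and "rep" appear in the message above;
# the other lowercase names are assumed spellings.
PEER_OF = {
    "req": "rep",
    "rep": "req",
    "pub": "sub",
    "sub": "pub",
    "push": "pull",
    "pull": "push",
    "surveyor": "respondent",
    "respondent": "surveyor",
    "pair": "pair",
    "bus": "bus",
}

SUFFIX = ".sp.nanomsg.org"


def client_subprotocol(local: str) -> str:
    """Subprotocol a dialing socket would request.

    The client names the protocol the *server* offers, so it needs to
    know its peer's name, not just its own.
    """
    return PEER_OF[local] + SUFFIX


def server_subprotocol(local: str) -> str:
    """Subprotocol a listening socket offers: simply its own name."""
    return local + SUFFIX


if __name__ == "__main__":
    print(client_subprotocol("req"))
    print(server_subprotocol("rep"))
```

With this table a req dialer and a rep listener arrive at the same string, 
which is why each side has to know its peer’s name as well as its own.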

And to answer the concerns about why I think having this exchange is very 
important — whether in the TCP stream or at the individual exchange layer, the 
reason is that we have potentially different header formats — the wire format 
for REQ is different from BUS, etc.  Having some check to prevent people from 
connecting incompatible protocols is very important.
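
As a rough illustration of such a check under the subprotocol scheme 
described above: a listener could compare its own name against the tokens 
in the client’s Sec-WebSocket-Protocol request header (a comma-separated 
list per RFC 6455).  The function name is hypothetical, and treating 
“no match” as “refuse the connection” is an assumption of this sketch.

```python
from typing import Optional

SUFFIX = ".sp.nanomsg.org"


def select_subprotocol(server_protocol: str, header_value: str) -> Optional[str]:
    """Return the subprotocol to echo back, or None if nothing matches.

    server_protocol is the listener's own name (e.g. "rep").
    header_value is the raw Sec-WebSocket-Protocol header from the client.
    """
    wanted = server_protocol + SUFFIX
    # Split the comma-separated token list and drop surrounding whitespace.
    offered = [token.strip() for token in header_value.split(",")]
    return wanted if wanted in offered else None


if __name__ == "__main__":
    # A req client asks for the server's protocol, so a rep listener accepts.
    print(select_subprotocol("rep", "rep.sp.nanomsg.org"))
    # A client expecting a bus peer does not match a rep listener.
    print(select_subprotocol("rep", "bus.sp.nanomsg.org"))
```

A caller would fail the handshake when the result is None, so that a REQ 
socket never ends up exchanging frames with, say, a BUS socket.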

Note that SP is *not* as I see it oriented to making low latency *connections*, 
but rather towards operating at lower latency while already connected.  (For a 
UDP protocol we’d have to have these values as part of the UDP payload header, 
I suppose.)  Having an extra exchange up front for TCP establishment doesn’t 
seem especially tragic, but I found that it was going to be awkward for the 
likely JavaScript clients.  So using the subprotocol option seems like the best 
compromise.

(We could have done other things - like using websocket extensions, or creating 
new HTTP headers, but generally such approaches will not be accessible to 
JavaScript clients.  From what I can tell JavaScript offers only a very limited 
view of the underlying websocket connection.  TBH, as I spent time reading the 
RFC 6455 for WebSocket, I mostly wanted to vomit.)

Anyway, I’ve implemented the above changes, but haven’t pushed them yet.  I’m 
waiting on finishing up the debug for wss (websocket over TLS), which is quite 
nearly done.

        - Garrett

> On Jan 30, 2015, at 12:21 PM, Drew Crawford <drew@xxxxxxxxxxxxxxxxxx> wrote:
> 
>> > "I don’t think SP handshakes in general are useful"
>> What's the alternative? Encode this information per-message, rather than 
>> per-connection?
> 
> Per-message is certainly preferable, but more broadly I don’t really believe 
> that there is a problem to be solved here.
> 
> Allegedly negotiation exists to provide the user an error message if the 
> protocols are incompatible.  But fundamentally nanomsg is always used with a 
> higher-level “application protocol” that logically encapsulates the version 
> of nanomsg (and the version of its protocols) that the application uses.  It 
> is of no comfort, when App 1.0 communicates with App 2.0, that they both use 
> the same version of REQ.  You can’t solve the problem at nanomsg layer.
> 
> More broadly I think writing “if (message [0] != MAGIC) return error;” 
> increases the chance that invalid messages are not robustly handled, and that 
> errors in handling them will not be detected under ordinary circumstances.  
> Writing that code is like writing code that checks for the Evil Bit 
> <https://en.wikipedia.org/wiki/Evil_bit>.  It is at best useless and at worst 
> dangerous.
> 
> I think there would be some value if protocols were versioned and there was a 
> desire for one socket to support several old versions depending on what 
> connecting peers purported to support.  But I am not aware that anyone is 
> actually doing that, and the people who are doing SP extensions are doing it 
> in ways that specifically avoid that, for example by shipping entirely new 
> protocols.
> 
> In spite of all that I am not really opposed to it if the cost is low.  A 
> byte here or there is nothing to quibble about.  Where I start to have more 
> serious problems is if we introduce additional roundtrips.  Now we are not 
> just doing something a bit silly, we are also being slow.
> 
>> Per-message CMSG headers exist; are these sufficient (and desirable)?
> 
> I think the relevant question from my perspective is how it looks on the 
> wire.  cmsg is an implementation detail of mainline; my implementation will 
> not share the same structs in its internal representation.
> 
> Drew
> 
> 
>> On Jan 30, 2015, at 12:09 PM, Jack Dunaway <jack@xxxxxxxxxxxxxxxx> wrote:
>> 
>> > "third nanomsg re-implementation"
>> Neat! Can we start a new thread on this? I like how Garrett has chosen to 
>> keep Mangos discussion on the nanomsg list to get it off the ground; it 
>> seems both have benefitted from staying in the same communication channel 
>> during development.
>> 
>> > "If a protocol encounters a frame it does not understand, it should raise 
>> > an error"
>> I agree with this, and have suggested that nanomsg adopt the same strategy 
>> as WebSockets by immediately sending a Close Handshake then failing the 
>> connection to any peer that sends a malformed/unexpected frame. This remains 
>> a robustness/security concern where nanomsg could improve. There are other 
>> actions/remedies that could be taken other than the 
>> close-handshake-then-close-connection recipe, but this is incredibly simple 
>> and at the same time should provide enough breadcrumbs to troubleshoot 
>> failure modes.
>> 
>> > "I don’t think SP handshakes in general are useful"
>> What's the alternative? Encode this information per-message, rather than 
>> per-connection?
>> 
>> > "We just need to specify how to encode the multiple protocols and the 
>> > ordering"
>> Per-message CMSG headers exist; are these sufficient (and desirable)?
>> 
>> Best regards,
>> Jack R. Dunaway | Wirebird Labs LLC
>> 
>> On Fri, Jan 30, 2015 at 10:58 AM, Drew Crawford <drew@xxxxxxxxxxxxxxxxxx> wrote:
>> In the case of TCP, the RFC gives the following statement about the 
>> handshake mechanism:
>> 
>> >  The goal of this design is to keep connection establishment as fast
>> >    as possible by avoiding any additional protocol handshakes, i.e.
>> >    network round-trips.  Specifically, the protocol headers can be
>> >    bundled directly with to the last packets of TCP handshake and thus
>> >    have virtually zero performance impact.
>> 
>> 
>> I think we mostly agree that this is a good plan for TCP.
>> 
>> The problem as it applies to WS is that we have mutually exclusive requirements:
>> 
>> 1.  We want to have a low-overhead handshake
>> 2.  WS doesn’t support full-duplex negotiation (because HTTP doesn’t).
>> 
>> In this case we must choose, either to have a high-overhead handshake, or to 
>> forget full-duplex negotiation.
>> 
>> This is probably as good of a time as any to announce that I’ve started work 
>> seriously on a third nanomsg re-implementation, that will sit beside nanomsg 
>> and mangos as a third implementation of the protocols.  A goal of my 
>> re-implementation is to leverage the new socket architecture I posted to 
>> this list a few months ago, that is difficult to achieve in core for various 
>> reasons. A second goal is to fix a bunch of longstanding minor gripes that 
>> are hard to do in the existing codebase.
>> 
>> My implementation strictly prefers low-overhead solutions to 
>> full-negotiation.  So to the extent that we settle on a full-duplex, 
>> additional-round-trip solution I will break compatibility on that point.  
>> For my set of problems the overhead is critical.
>> 
>> More broadly I don’t think SP handshakes in general are useful (but so long 
>> as the cost is low the objection is academic).  If a protocol encounters a 
>> frame it does not understand, it should raise an error.  Moreover the 
>> presence of a handshake purporting to be e.g. REQ is not a guarantee that it is 
>> understandable by a REQ socket; packets can be spoofed and so on.  At best 
>> this is a debug convenience; protocols should be written (and tested) in 
>> such a way that they are not reliant on handshakes as anything other than a 
>> convenience.
>> 
>> Here, the full-duplex handshake is /in/convenient, because it is slow.  I 
>> think in this case we need to look seriously at relaxing the handshake 
>> requirement.
>> 
>> Another objection I have to this specific proposal is that the assumption of 
>> 1 SP protocol contradicts the design of the new socket architecture I posted 
>> for discussion a few months earlier, because I support multiple layered SP 
>> protocols, and this proposal only considers one.
>> 
>> I actually have this problem with TCP as well.  In that case there is not 
>> really an option except breaking compatibility with sp-tcp-mapping in the 
>> layered situation.  Ideally it would be fixed in the RFC, but I think the 
>> TCP RFC may be too widely deployed for it to be revised at this point (even 
>> though, I’ve noticed the “expires” date in the RFC has passed…)
>> 
>> With WS however I think we have a window to design something that would not 
>> require layered implementations to break the RFC.  We just need to specify 
>> how to encode the multiple protocols and the ordering.
>> 
>> Drew
>> 
>> 
>> > On Jan 30, 2015, at 6:36 AM, Garrett D'Amore <garrett@xxxxxxxxxx> wrote:
>> >
>> > I’ve started (and mostly completed) a websocket mapping of mangos.  I 
>> > started hoping to make it compatible with nanomsg’s mapping as specified 
>> > in the websocket nanomsg RFC, but as I got into details, I discovered 
>> > some things about this RFC that I think are a big mistake.
>> >
>> > The biggest problem is that we are relying on the Protocol field to 
>> > express which SP topology is used.  This results in at least these 
>> > problems:
>> >
>> >       a) The transport needs to know the name of all protocols, which does 
>> > not facilitate expansion and badly breaks the layering abstraction.
>> >
>> >       b) Some implementations don’t make it easy to pick up on the values 
>> > used by the server and client protocols.  For example, the Go websocket 
>> > implementation lets the client specify the protocols it wants to use, 
>> > but it cannot see what the advertised protocol on the server side is.
>> >
>> >       c) It’s likely that some implementations are going to have trouble 
>> > when x-nanomsg-req != x-nanomsg-rep — i.e. we cannot have mismatched 
>> > protocols.
>> >
>> > I think problem “a” could be easily fixed by using numerics encoded as 
>> > strings, but problems b & c are more severe — I think they come about from 
>> > a basic design philosophy difference — in websocket I think the intention 
>> > is that the ultimate “negotiation” is done on the server side, based on 
>> > what the client claims to desire.  It’s not really a two-way negotiation 
>> > like SP is.
>> >
>> > Thinking about this quite a bit, I’ve decided that the best thing would be 
>> > to change the mapping as follows:
>> >
>> > 1. The protocol field should just be the value “x-nanomsg”  — without any 
>> > reference to the subprotocol.
>> >
>> > 2. A message exchange very much like what is done for TCP should be 
>> > done when the connection is established.  That is, each side should 
>> > send a “header”, and then should receive and validate the peer’s header.
>> >
>> > 3. Given that some websocket clients really need to deal with text values 
>> > (e.g. older web browsers can only send strings), the content of the header 
>> > should be as follows:
>> >
>> >       “SP0:<protocol>:0”   — these are ASCII characters forming a string.  
>> > The SP0: indicates SP version 0, the <protocol> is a numeric protocol (for 
>> > example PUB would be 32, and SUB would be 33) represented in decimal ASCII, 
>> > and the :0 is reserved for future protocol enhancements (like the reserved 
>> > fields in the binary header exchange).   So for example, a REQ client 
>> > would send SP0:48:0 and a REP server would send SP0:49:0
>> >
>> > This should facilitate working with more websocket implementations, 
>> > although to be clear the header exchange would prevent using this library 
>> > to talk to arbitrary websocket services — peers would only match and peer 
>> > each other if they are running the proper SP peer protocol.  I see almost 
>> > no value in attempting to make nanomsg a general purpose websocket library.
>> >
>> > Thoughts?
>> >
>> >       - Garrett
>> >
>> >
>> 
>> 
>> 
> 
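
For reference, the text header proposed in the quoted Jan 30 message 
(“SP0:<protocol>:0”) could be formatted and validated roughly as follows. 
This is only a sketch: the function names are hypothetical, and rejecting 
any reserved field other than “0” is an assumption.  The protocol numbers 
48 (REQ) and 49 (REP) are the ones given in that message.

```python
REQ = 48  # value quoted in the Jan 30 message
REP = 49  # value quoted in the Jan 30 message


def format_header(protocol: int) -> str:
    """Build the ASCII header for SP version 0 with reserved field 0."""
    return f"SP0:{protocol}:0"


def parse_header(text: str) -> int:
    """Return the protocol number from a header, or raise ValueError."""
    parts = text.split(":")
    if len(parts) != 3 or parts[0] != "SP0":
        raise ValueError(f"not an SP version 0 header: {text!r}")
    number, reserved = parts[1], parts[2]
    # Accept only plain ASCII decimal digits for the protocol number.
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"protocol is not decimal ASCII: {number!r}")
    if reserved != "0":
        raise ValueError(f"unexpected reserved field: {reserved!r}")
    return int(number)


def validate_peer(expected_peer: int, received: str) -> None:
    """Raise ValueError unless the peer announced the protocol we expect."""
    got = parse_header(received)
    if got != expected_peer:
        raise ValueError(f"peer protocol {got}, expected {expected_peer}")


if __name__ == "__main__":
    # A REQ client sends its own header and checks the REP server's.
    print(format_header(REQ))
    validate_peer(REP, format_header(REP))
```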
