[nanomsg] Re: Introduction and questions

From: Gonzalo Diethelm <gonzalo.diethelm@xxxxxxxxx>
To: nanomsg@xxxxxxxxxxxxx
Date: Mon, 24 Jun 2013 10:23:50 -0400

If we don't have a version indicator in the protocol, I can guarantee we
will miss it in the future. It might even be a good idea to have a generic
version indicator for the whole nanomsg protocol, and one for each specific
protocol (PAIR, REQ/REP, PUB/SUB, etc.). Just think of this: how many times
have you wished the ZeroMQ protocol had had this from the beginning?
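
A minimal sketch of what the two version indicators could look like, in Python. The layout is an assumption for illustration, not anything nanomsg defines: one byte for the generic wire version and one byte for the specific protocol's version. The names `Versions`, `pack_versions`, `unpack_versions` and `compatible` are hypothetical.

```python
import struct
from collections import namedtuple

# Hypothetical layout (an assumption, not the nanomsg wire format):
# byte 0 = generic nanomsg wire version,
# byte 1 = version of the specific protocol (PAIR, REQ/REP, PUB/SUB, ...).
Versions = namedtuple("Versions", ["wire", "protocol"])
_FMT = "!BB"

def pack_versions(v):
    """Encode both version indicators as two unsigned bytes."""
    return struct.pack(_FMT, v.wire, v.protocol)

def unpack_versions(data):
    """Decode the first two bytes of data into a Versions tuple."""
    return Versions(*struct.unpack(_FMT, data[:struct.calcsize(_FMT)]))

def compatible(local, remote):
    """Accept a peer only if both version indicators match exactly."""
    return local == remote

mine = Versions(wire=1, protocol=2)
peer = unpack_versions(pack_versions(Versions(wire=1, protocol=3)))
print(compatible(mine, peer))  # False
```

An exact-match rule is the strictest possible policy; a real design might instead accept any peer whose versions fall within a supported range.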

On Mon, Jun 24, 2013 at 1:34 AM, Martin Sustrik <sustrik@xxxxxxxxxx> wrote:

> Hi Jimmy,
>
>
>> I do not like the way multipart messages are handled in zmq.
>>
>> I'm sure there are very good reasons to glop it onto the socket. I
>> found the threads where it was introduced in ZMQ but I did not find
>> the reasons for its inclusion compelling enough to justify the added
>> complexity.
>>
>> I'd rather have it be a separate system/microprotocol entirely that
>> you choose to layer on top of a socket or not that needn't even be
>> part of the core package.
>>
>
> Exactly. That's currently the case with nanomsg. There's no support for
> multi-part messages. Later on we can build it on top (a different lib
> even?) as a light-weight alternative to more complex presentation layer
> protocols (JSON/XML).
>
>
>>> 3. We can simply create a new protocol. Sounds like the best option to me,
>>> but it would mean more work with writing the protocol down etc. On the
>>> other hand, writing it down would allow us to do that in RFC format, so
>>> that it can be easily passed to IETF when the time comes.
>>>
>>>
>>
>> New protocol(s) would probably be simplest.
>>
>> I don't know if jumping straight into encoding it into RFCese would be
>> the easiest path, but the project (any project, really) could only
>> benefit from clear documentation, regardless of format. If RFC is
>> easiest for you, that's fine.
>>
>> I think it should be modular.
>>
>> One nanomsg protocol, then a protocol for each class of sockets
>> (PAIR/REQ/REP, PUB/SUB etc). The same way there isn't one RFC for
>> TCP/IP/HTTP/FTP/TELNET/ET AL.
>>
>
> Yes. That's what I was thinking myself.
>
>
>> Even if the REQ/REP protocol is just "you don't have to do anything
>> special, just send and/or receive data," and the PUB/SUB protocol is
>> just that the first bit of the message is a null terminated topic. I'm
>> sure there'd be more to them than that, but you get the idea.
>>
>
> Yes. There will definitely be more. The protocols look superficially
> simple, but they are not. There are a lot of issues to consider and the
> design choices should be clearly worded in the RFCs.
>
>
>> I don't see it anywhere in the code (I may have missed it), but I'd
>> like to see a handshake when you connect two sockets. Nothing fancy.
>> Just, "hi I'm nanomsg wire format x.xx and this is a type Y socket,
>> version x.xx"† and a simple OK or NOK response (and anything that's
>> not one of those two is an implicit NOK) before getting down to
>> business.
>>
>
> Check src/transports/utils/protohdr.h&.c. That's the state machine that
> exchanges 8 bytes when a connection is established. So far, there's no
> actual data filled in, but that's easy to add.
>
> Each party, IMO, should advertise at least the following properties:
>
> 1. Some constant tag so that SP communication can be distinguished from
> other TCP connections. Currently it's 4 bytes like this: \0\0SP
>
> 2. Protocol ID (i.e. PUB/SUB, REQ/REP etc.) This also includes version. If
> there's a new version of PUB/SUB it can just get a new ID. No need for
> explicit version field.
>
> 3. Role of the endpoint in the protocol (e.g. PUB vs. SUB).
>
> 4. Topology ID. So, for example, if you have two pub/sub topologies on
> your network (e.g. stock quotes vs. stock trades) you want to assign them
> different IDs so that a node from one topology cannot be accidentally
> connected to the other topology. This property needs some more thinking
> about though.
>
>
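
A rough sketch of how the four properties listed above might fit in a fixed 8-byte header, in Python. Only the 4-byte \0\0SP tag comes from the description; the split of the remaining four bytes (16-bit protocol ID, 8-bit role, 8-bit topology ID), the numeric values and the names `pack_header` and `parse_header` are assumptions for illustration.

```python
import struct

TAG = b"\x00\x00SP"  # constant tag quoted in the list above
# Assumed split of the 8 bytes: tag, protocol ID, role, topology ID.
HEADER = struct.Struct("!4sHBB")

def pack_header(protocol_id, role, topology_id):
    """Build the 8-byte header each party would send on connect."""
    return HEADER.pack(TAG, protocol_id, role, topology_id)

def parse_header(data):
    """Return (protocol_id, role, topology_id) or raise ValueError."""
    if len(data) != HEADER.size:
        raise ValueError("header must be exactly %d bytes" % HEADER.size)
    tag, protocol_id, role, topology_id = HEADER.unpack(data)
    if tag != TAG:
        raise ValueError("not an SP connection")
    return protocol_id, role, topology_id

# Hypothetical IDs: protocol 2 = PUB/SUB, role 0 = PUB, topology 7 = stock quotes.
hdr = pack_header(2, 0, 7)
print(len(hdr), parse_header(hdr))  # 8 (2, 0, 7)
```

With only one byte for the topology ID the space is tiny, which is one reason that property would need more thought before being fixed in a header.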
>> † it would probably look more like NM:1;REQ:2 which is easy enough for
>> humans to read, machines to parse, and lets scalability protocols be
>> added willy nilly without having to come up with magic numbers to
>> identify them that everyone has to agree on in advance.
>>
>
> I personally prefer binary encoding (e.g. fixed 8 byte header) as it makes
> it easier for hardware to deal with it, even in high-volume scenarios
> (backbone routers etc.)
>
> Also, when there are new connection-less transports added, the header will
> be included in each packet. Thus, making it as short as possible to avoid
> excessive bandwidth overhead seems like a good idea.
>
> Of course, the UDP header could be binary while the TCP header is
> text-based. However, it feels cleaner to strive for a similar header style
> across different transports.
>
>
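
For comparison, a sketch of parsing the text form proposed in the footnote, in Python. The grammar (ASCII, semicolon-separated NAME:VERSION pairs with decimal versions) is one reading of the NM:1;REQ:2 example, and `parse_text_handshake` is a hypothetical name.

```python
def parse_text_handshake(line):
    """Parse b'NM:1;REQ:2' into [('NM', 1), ('REQ', 2)]."""
    pairs = []
    for field in line.decode("ascii").split(";"):
        name, sep, version = field.partition(":")
        # Reject fields with a missing name, colon, or numeric version.
        if not name or not sep or not version.isdigit():
            raise ValueError("malformed field: %r" % field)
        pairs.append((name, int(version)))
    return pairs

text = b"NM:1;REQ:2"
print(parse_text_handshake(text), len(text))  # [('NM', 1), ('REQ', 2)] 10
```

This particular example is already 10 bytes and grows with longer names or version numbers, while a fixed binary header stays at 8 bytes.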
>> Every message after that, as far as I can see, can just be the message
>> length header. The protocol for the sockets can prepend anything it
>> needs before framing the message. Or even, if need be, have a second
>> protocol-specific handshake to negotiate any special considerations.
>> Having each SP in charge of whatever extra it needs to add on top of
>> the framing but letting the lower level take care of negotiations
>> should also help keep the code modular and easy to expand, too.
>>
>
> Yes. That's the idea.
>
>
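
A sketch of the length-header framing described above, in Python. The 64-bit big-endian length prefix is an assumption (the text only says "message length header"), and `frame` and `read_frame` are hypothetical helper names.

```python
import io
import struct

LEN = struct.Struct("!Q")  # assumed: 64-bit big-endian payload length

def frame(payload):
    """Prefix the payload with its length."""
    return LEN.pack(len(payload)) + payload

def read_frame(stream):
    """Read one framed message from a file-like object; None at clean EOF."""
    head = stream.read(LEN.size)
    if not head:
        return None
    if len(head) < LEN.size:
        raise EOFError("truncated length header")
    (size,) = LEN.unpack(head)
    body = stream.read(size)
    if len(body) < size:
        raise EOFError("truncated message")
    return body

wire = io.BytesIO(frame(b"hello") + frame(b""))
print(read_frame(wire), read_frame(wire), read_frame(wire))  # b'hello' b'' None
```

Anything a specific scalability protocol needs (a topic, a request ID) would live inside the payload, so the framing layer never has to know about it.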
>>> One thing to keep in mind is that creating a new implementation has the
>>> drawback of not automatically getting new features from nanomsg. If
>>> someone, say, implements InfiniBand transport for nanomsg, the Go
>>> implementation wouldn't get it for free. Same applies to possible new
>>> messaging patterns.
>>>
>>
>> That is a concern. Of course, it cuts both ways. If it turns out much
>> easier to get a new messaging pattern up and running in the Go port,
>> it could very well turn into a playground for new patterns even if new
>> transports come somewhat more slowly. There are downsides, but I think
>> it would be a net win.
>>
>> There's also no reason not to have a nanomsg port to Go and a Go
>> binding, and you choose which best suits your needs.
>>
>
> Yes.
>
> In the long run I would like to have SPs implemented directly in the
> kernel so that every language has access to the same functionality without
> need for additional native libraries (we've already done a PoC for that)
> but looking at the DBus-in-Linux-kernel saga it doesn't seem likely to
> happen any time soon.
>
> In the meantime, both bindings and ports sound like reasonable ideas.
>
> Martin
>
>
>


-- 
Gonzalo Diethelm
gonzalo.diethelm@xxxxxxxxx
