[nanomsg] Re: accessing control IDs

From: Drew Crawford <drew@xxxxxxxxxxxxxxxxxx>
To: nanomsg@xxxxxxxxxxxxx
Date: Tue, 6 May 2014 18:20:58 -0500
I don’t think putting it in the body is the right solution, or at least not 
right for every case.

For one thing it requires allocating storage for this in the messages.  It 
is an interesting question how much storage is required but the naive 
implementation would be a 128-bit GUID.  For a 4-byte message this bloats the 
network traffic significantly, particularly when there is a simple solution 
with zero overhead: pull the pipe key from nanomsg.
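
To put a rough number on that overhead, here is a minimal sketch of the arithmetic. It assumes the identifier is a 128-bit GUID prefixed to every message body and ignores any framing added by the transport; the names `GUID_BYTES`, `tagged_size` and `bloat_factor` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the identifier assumed in the text: a 128-bit GUID. */
enum { GUID_BYTES = 128 / 8 };

/* Bytes sent when every message body is prefixed with the GUID. */
static size_t tagged_size(size_t payload_bytes)
{
    return GUID_BYTES + payload_bytes;
}

/* How many times larger the tagged message is than the bare payload. */
static double bloat_factor(size_t payload_bytes)
{
    assert(payload_bytes > 0);
    return (double)tagged_size(payload_bytes) / (double)payload_bytes;
}
```

For a 4-byte payload, `tagged_size(4)` is 20 bytes, so the body is five times the size of the untagged message.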

For another thing, ZeroMQ clearly and unambiguously supports this model (via 
ROUTER).  So anyone porting ZeroMQ code is in for a rough time implementing 
their own scheme atop nanomsg on all the codebases to get the same behavior 
they had before.  There are of course legitimate reasons to be incompatible 
with ZeroMQ (such as achieving compatibility instead with BSD sockets) but I 
think this case is much more harmful than helpful and does not isolate itself 
to one particular system or codebase.

> let’s think about cases where the client sending requests loses connection, 
> then reconnects with a different address, or connects from multiple endpoints 
> (mobile device, desktop computer, …), if remote endpoint information is 
> provided by nanomsg all these endpoints will appear as different entities, 
> whereas in your application logic they should be considered the same.


You should probably not rely on transport-layer guarantees to authenticate a 
user.  However *given* transport-layer information, it *becomes possible to 
implement* many authentication schemes.

For example (and this is the problem that motivated this discussion) suppose I 
have some decision oracle which can determine with complete certainty whether a 
particular user intended to send a packet.  Then

> if (!oracle_user_sent_message(user,message)) {
>   end_session();
> }

Now if we know (or can guess) that John sent the message via transport-layer 
information, the solution is straightforward.  But trying all the possible 
users is impractical for a slow oracle.  So the transport-layer information can 
comprise part of an application-level authentication scheme to identify which 
user the oracle should be asked about.  Even for sockets that do not have 1:1 
fanout to users, the fanout may divide the users into enough buckets that 
asking the oracle about each member in the bucket becomes practical.
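
The bucket idea can be sketched roughly as follows. This is an illustration only: it assumes some transport-level key (called `pipe_key` here) is available per message, which is exactly what nanomsg does not currently expose. `struct bucket`, `identify_sender` and `demo_oracle` are hypothetical names, and `demo_oracle` is a toy stand-in for the real oracle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BUCKET 8

/* Hypothetical: the users known to send over one transport-level channel. */
struct bucket {
    uint32_t pipe_key;       /* transport-level identifier (assumed available) */
    int users[MAX_BUCKET];   /* application-level user IDs */
    size_t count;            /* number of valid entries in users[] */
};

/* Hypothetical oracle: nonzero if `user` intended to send `msg`. */
typedef int (*oracle_fn)(int user, const void *msg, size_t len);

/* Toy oracle for demonstration: the first byte of the message names the user. */
static int demo_oracle(int user, const void *msg, size_t len)
{
    return len > 0 && ((const unsigned char *)msg)[0] == (unsigned char)user;
}

/* Ask the oracle only about users in the bucket matching pipe_key.
   Returns the confirmed user ID, or -1 if no user in that bucket
   (or no such bucket) is confirmed. Pipe keys are assumed unique. */
static int identify_sender(const struct bucket *buckets, size_t nbuckets,
                           uint32_t pipe_key, const void *msg, size_t len,
                           oracle_fn oracle)
{
    assert(oracle != NULL);
    for (size_t i = 0; i < nbuckets; i++) {
        if (buckets[i].pipe_key != pipe_key)
            continue;
        for (size_t j = 0; j < buckets[i].count; j++) {
            if (oracle(buckets[i].users[j], msg, len))
                return buckets[i].users[j];
        }
        return -1;
    }
    return -1;
}
```

The number of oracle calls is bounded by the size of one bucket rather than by the total number of users, which is the whole point when the oracle is slow.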

Now of course we could prepend some session ID to the message rather than rely 
on transport-level data, and bloat the message accordingly.  However anybody 
who gets ahold of the session ID could spoof messages with that session ID from 
anywhere on the network.  These would be rejected by our perfect oracle, 
but not before ending the user’s session, which amounts to a denial-of-service 
attack against the legitimate user.  By contrast, relying on the TCP 
information significantly increases the difficulty of the attack: it requires a 
TCP MITM technique or some other advanced persistent threat capability to 
execute.  This is a major security and reliability advantage to relying on TCP 
data in my situation.
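
A minimal sketch of that difference, assuming each session remembers the transport-level key of the channel it was established on. The names `struct session`, `check_message` and the `verdict` values are hypothetical, and the oracle's answer is passed in as a plain flag to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical session record bound to one transport-level channel. */
struct session {
    uint32_t pipe_key;   /* channel the session was established on (assumed) */
    int active;          /* nonzero while the session is alive */
};

enum verdict { ACCEPT, DROP, END_SESSION };

/* A message arriving on a different channel is dropped and the session
   is left intact. Only a message on the right channel that the oracle
   rejects ends the session. */
static enum verdict check_message(struct session *s, uint32_t arrival_pipe,
                                  int oracle_says_genuine)
{
    assert(s != NULL);
    if (!s->active || arrival_pipe != s->pipe_key)
        return DROP;
    if (!oracle_says_genuine) {
        s->active = 0;
        return END_SESSION;
    }
    return ACCEPT;
}
```

With a session ID carried only in the body there is no `arrival_pipe` to compare against, so every spoofed message reaches the oracle and ends the session.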

Drew




On May 6, 2014, at 12:13 PM, Achille Roussel <achille.roussel@xxxxxxxxx> wrote:

> Could you put the state information in the body of your message instead of 
> attempting to get it from nanomsg. HTTP is also stateless but websites 
> maintain more or less state using cookies, session ids or access tokens… 
> maybe you can implement this at the application logic level.
> 
> I think it’s a sane design to have your transport protocol separated from 
> your application logic, let’s think about cases where the client sending 
> requests loses connection, then reconnects with a different address, or 
> connects from multiple endpoints (mobile device, desktop computer, …), if 
> remote endpoint information is provided by nanomsg all these endpoints will 
> appear as different entities, whereas in your application logic they should 
> be considered the same.
> 
> On May 6, 2014, at 1:23 AM, Drew Crawford <drew@xxxxxxxxxxxxxxxxxx> wrote:
> 
>>> There are many cases that require stateful networking
>>> 
>> I’m in such a case.  The open question at this point is how to achieve it.
>> 
>> 
>> On May 5, 2014, at 10:52 PM, Apostolis Xekoukoulotakis <xekoukou@xxxxxxxxx> 
>> wrote:
>> 
>>> Req rep were designed by default to be stateless, that is why finding the 
>>> address of the message has been hidden on purpose.
>>> 
>>> There are many cases that require stateful networking but stateful is 
>>> more difficult because it requires that you implement an update mechanism 
>>> on the routing information.
>>> 
>>> On May 6, 2014 5:02 AM, "Drew Crawford" <drew@xxxxxxxxxxxxxxxxxx> wrote:
>>> I have dug a little deeper into this.  It appears that in global.c [1] 
>>> msg_controllen is never set.  I’m not sure if that’s expected.
>>> 
>>> The attached patch sets controllen based on the size of the chunk.  Whether 
>>> right or wrong, this seems to produce the behavior expected by zerotacg and 
>>> Achille, e.g., control bytes are emitted in the RAW case.  The 8 bytes are
>>> 
>>>> d0,4e,c0,00,c1,7f,00,00,
>>> 
>>> 
>>> Three of which (bytes[3], bytes[4], and bytes[5]) seem to change from 
>>> run to run.  This is mildly surprising, because the RFC documents the 
>>> control ID as being 32 bits, so one would expect four bytes to change from 
>>> one execution to the next.  I’m also unable to account for the presence of 
>>> the remaining bytes.  Something may be wrong with my patch, or with my 
>>> understanding of the codebase or RFC.
>>> 
>>> This is an interesting line of inquiry, but since a solution along this 
>>> line has the limitation of requiring me to implement my own end-to-end 
>>> behaviors on top of a raw socket, I’m wondering if it would be desirable to 
>>> introduce an API for this purpose
>>> 
>>>> /* Returns an integer that uniquely identifies the immediate sender of the 
>>>> most-recently-received message.  Returns an error if no messages have ever 
>>>> been received on the socket */
>>> 
>>>> int nn_sender(int socket);
>>> 
>>> Such API could work equally well for raw sockets as full sockets, could be 
>>> implemented for different socket topologies, and does not introduce an 
>>> application-layer dependency on parsing the control header format.
>>> 
>>> 
>>> [1] https://github.com/nanomsg/nanomsg/blob/master/src/core/global.c#L817
>>> 
>>> 
>>> 
>>> 
>>> On May 5, 2014, at 4:38 PM, Drew Crawford <drew@xxxxxxxxxxxxxxxxxx> wrote:
>>> 
>>>> I thought about that, however, msg_controllen still returns -1 when using 
>>>> raw sockets, suggesting there is no control information available, as the 
>>>> sample below illustrates.  Maybe something is wrong with the code sample?
>>>> 
>>>> Another problem is that use of raw sockets would require me to roll my own 
>>>> end-to-end behavior which may be undesirable.
>>>> 
>>>> 
>>>>>     int client = nn_socket(AF_SP,NN_REQ);
>>>>>     int server = nn_socket(AF_SP_RAW,NN_REP);
>>>>>     nn_connect(client,"inproc://test");
>>>>>     nn_bind(server,"inproc://test");
>>>>>     nn_send(client,"A",1,0);
>>>>>     
>>>>>     int rc;
>>>>>     void *body;
>>>>>     void *control;
>>>>>     struct nn_iovec iov;
>>>>>     struct nn_msghdr hdr;
>>>>> 
>>>>>     iov.iov_base = &body;
>>>>>     iov.iov_len = NN_MSG;
>>>>>     memset (&hdr, 0, sizeof (hdr));
>>>>>     hdr.msg_iov = &iov;
>>>>>     hdr.msg_iovlen = 1;
>>>>>     hdr.msg_control = &control;
>>>>>     hdr.msg_controllen = NN_MSG;
>>>>>     rc = nn_recvmsg (server, &hdr, 0);
>>>>>     print_array(body,rc,"body"); //contains only A
>>>>> 
>>>>>     printf("msg_iovlen %d\n",hdr.msg_iovlen); // 1
>>>>>     printf("msg_controllen %d\n",hdr.msg_controllen); // -1
>>>> 
>>>> 
>>>> On May 5, 2014, at 4:32 PM, Achille Roussel <achille.roussel@xxxxxxxxx> 
>>>> wrote:
>>>> 
>>>>> You have to use AF_SP_RAW sockets to get access to this info in the 
>>>>> control header when receiving a message with nn_recvmsg. 
>>>>> 
>>>>> On May 5, 2014, at 2:27 PM, Drew Crawford <drew@xxxxxxxxxxxxxxxxxx> wrote:
>>>>> 
>>>>>> I have a REP socket.  I’m trying to identify the channel (sender or 
>>>>>> forwarder) on which some message has arrived to the socket.  A 
>>>>>> transport-layer understanding of the sender is not required; any 
>>>>>> identifying value, such as an integer, is sufficient.  Consulting the 
>>>>>> REQREP spec  suggests that the topmost “channel ID”, one of the records 
>>>>>> in the “backtrace”, is the identifier I’m looking for.
>>>>>> 
>>>>>> Clearly this identifier is not exposed over the nn_recv interface.  I 
>>>>>> had some hopes that it would be accessible in the nn_recvmsg interface, 
>>>>>> possibly as control information, but it seems not to be the case:
>>>>>> 
>>>>>>>     int client = nn_socket(AF_SP,NN_REQ);
>>>>>>>     int server = nn_socket(AF_SP,NN_REP);
>>>>>>>     nn_connect(client,"inproc://test");
>>>>>>>     nn_bind(server,"inproc://test");
>>>>>>>     nn_send(client,"A",1,0);
>>>>>>>     
>>>>>>>     int rc;
>>>>>>>     void *body;
>>>>>>>     void *control;
>>>>>>>     struct nn_iovec iov;
>>>>>>>     struct nn_msghdr hdr;
>>>>>>> 
>>>>>>>     iov.iov_base = &body;
>>>>>>>     iov.iov_len = NN_MSG;
>>>>>>>     memset (&hdr, 0, sizeof (hdr));
>>>>>>>     hdr.msg_iov = &iov;
>>>>>>>     hdr.msg_iovlen = 1;
>>>>>>>     hdr.msg_control = &control;
>>>>>>>     hdr.msg_controllen = NN_MSG;
>>>>>>>     rc = nn_recvmsg (server, &hdr, 0);
>>>>>>>     print_array(body,rc,"body"); //contains only A
>>>>>>> 
>>>>>>>     printf("msg_iovlen %d\n",hdr.msg_iovlen); //1
>>>>>>>     printf("msg_controllen %d\n",hdr.msg_controllen); //-1
>>>>>> 
>>>>>> 
>>>>>> I have consulted a previous mailing thread on this topic which suggests 
>>>>>> channel IDs are manipulated in rep.c.  Indeed, the information I’m 
>>>>>> looking for seems to be moved around between nn_sockbase, nn_msg, 
>>>>>> nn_rep, and similar structures.  However I cannot work out a sane way to 
>>>>>> get those structures from application code.  
>>>>>> 
>>>>>> Any suggestions on identifying the sender of a remote message?
>>>>>> 
>>>>>> Drew
>>>>> 
>>>> 
>>> 
>>> 
>> 
>
Follow-Ups:
- [nanomsg] Re: accessing control IDs
  - From: Garrett D'Amore
- [nanomsg] Re: accessing control IDs
  - From: Martin Sustrik
References:
- [nanomsg] accessing control IDs
  - From: Drew Crawford
- [nanomsg] Re: accessing control IDs
  - From: Achille Roussel
- [nanomsg] Re: accessing control IDs
  - From: Drew Crawford
- [nanomsg] Re: accessing control IDs
  - From: Drew Crawford
- [nanomsg] Re: accessing control IDs
  - From: Apostolis Xekoukoulotakis
- [nanomsg] Re: accessing control IDs
  - From: Drew Crawford
- [nanomsg] Re: accessing control IDs
  - From: Achille Roussel