[nanomsg] Re: nanomsg status and encryption

  • From: Alex Elsayed <eternaleye@xxxxxxxxx>
  • To: nanomsg@xxxxxxxxxxxxx
  • Date: Mon, 16 Dec 2013 23:13:07 -0800

Drew Crawford wrote:

> Thanks.  Let me summarize the conversation so far:
> 
> * I agree with you that transport-level is the way to go
> * I understand what you mean by “state” and it seems reasonable
> * I’m concerned that your risk tolerance may be stricter than mine.
> 
>> I'd suggest that for your use case, implementing something at the
>> transport layer (_underneath_ any SP framing) using TLS itself is likely
>> the best option - especially since you're doing single-hop in the first
>> place. That lets us backstop on something well-studied, and avoid rolling
>> our own.
> 
> 
> Let me explain in a lot more detail why I can’t use TLS.
> 
> The major problem is that I’m operating on a very bad network.  In
> particular, latency is very high.  A typical RTT might be 300ms, a bad RTT
> might be in seconds.

Ouch, yeah. Okay, I see where you're coming from. At that point though, you 
may even be better off with something other than _TCP_ - protocols like 
MinimaLT, which provide TCP semantics but roll the crypto even further down 
into the transport layer. Such protocols can do the entire setup in 1RTT, 
and then are 0-RTT for subsequent connections between the same hosts until 
the underlying association expires. (I don't think MinimaLT is a 
particularly good option here because of implementation concerns and how new 
it is, but it was what first came to mind w.r.t. pushing things farther down 
in the search for lower latency).
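To make the round-trip cost concrete, here's a toy calculation. The handshake RTT counts are the rough figures from this thread (TCP = 1 RTT, TLS adds 2 more, a CurveCP-style scheme sets up in 1 RTT total, a resumed MinimaLT-style association is 0-RTT), not measurements:

```python
# Back-of-envelope time to first response byte at a given RTT, counting
# only round trips (ignores bandwidth, processing, and packet loss).
def time_to_first_byte(rtt_ms, handshake_rtts):
    # +1 for the request/response round trip itself
    return rtt_ms * (handshake_rtts + 1)

# Handshake RTT counts are illustrative assumptions from the discussion.
for name, rtts in [("TCP+TLS", 3), ("CurveCP-style", 1), ("0-RTT resume", 0)]:
    print("%-14s %4d ms" % (name, time_to_first_byte(300, rtts)))
```

At the 300ms RTT mentioned above, that's the difference between 1200ms and 300ms before the first byte of a response arrives.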

> It’s worth saying here that, by solving the problems of slow networks, one
> also improves performance of the fast ones.  So the problems I have are
> broadly applicable in any situation where the performance requirements are
> *perceived* to be tight with respect to the underlying network.  I am
> concerned about processing user-initiated requests on slow networks, but
> the problem can be equivalently thought of as processing machine-initiated
> requests on very fast ones; it is not a problem that only applies to
> networks with latencies of 300ms.

I'm certainly with you there. Bandwidth improves, but the speed of light 
isn't going to change in the foreseeable future.

> TLS requires 2 additional RTT (beyond TCP’s handshake) and so this
> overhead is of significant concern.  The overhead grows in the
> chain-of-trust case.
> 
> There’s been a lot of work in the “resume” case for TLS, but not much work
> on the initial slow start problem.  The “cold start” situation is
> important because a high number of connections will have a small number of
> requests in my use case.  There has been some discussion about tackling
> cold start at the IETF, but I don’t think a revised standard is
> forthcoming in any soonish timeframe.

Yeah, there's stuff like False Start and Snap Start, but those have very 
real security concerns in addition to reliability issues.

> Meanwhile there are today schemes considered safe that have single-RTT
> handshakes, and so adopting such a scheme in my case is a good calculated
> risk.  One such scheme is CurveCP (http://curvecp.org/packets.html), which
> is with some modifications the scheme that is being most strongly
> considered by zeromq, and there is even a fairly well-maintained fork:
> http://curvezmq.org

CurveCP is definitely an option, although this bit on the CurveZMQ site does 
worry me somewhat:

"While CurveCP uses strictly incrementing short nonces, CurveZMQ has no such 
requirement since commands are guaranteed to arrive in order over the stream 
transport."

Mainly, I worry because I can find no documentation about how CurveZMQ 
_does_ generate nonces - and if it can cause nonce reuse, that would be a 
SIGNIFICANT security problem[1]. It's _probably_ done right... but without 
documentation of the algorithm, I can't be sure, and that gives me the 
willies.
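To illustrate why nonce reuse is so serious, here's a toy sketch. The "keystream" below is a hash-based stand-in, NOT a real cipher, but the failure mode is the same for any stream construction (including the Salsa20 that CurveCP uses):

```python
import hashlib

# Toy stream "cipher": keystream = SHA-256(key || nonce), truncated.
# Purely illustrative -- do not use this as actual cryptography.
def keystream(key, nonce, n):
    return hashlib.sha256(key + nonce).digest()[:n]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

key, nonce = b"k" * 32, b"same-nonce"
p1, p2 = b"attack at dawn", b"retreat at 9am"
c1 = xor(p1, keystream(key, nonce, len(p1)))
c2 = xor(p2, keystream(key, nonce, len(p2)))

# With the nonce reused, the identical keystreams cancel out:
# c1 XOR c2 == p1 XOR p2, leaking plaintext structure with no key needed.
assert xor(c1, c2) == xor(p1, p2)
```

An attacker who captures both ciphertexts recovers the XOR of the plaintexts directly, which is footnote [1]'s point.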

CurveCP has some ways in which it is not ideal[2], but it's held up well to 
security analysis. Making ad-hoc changes to a cryptographic protocol or 
implementation is generally a good way to weaken it accidentally[3].

> Another problem is allowing the server to identify/authenticate connecting
> clients.   This is certainly solvable in the application layer, by having
> a first client “Hello I’m Jim” type packet.  However, this introduces an
> additional RTT before real client requests can be made.  And this also
> introduces problems, as you point out, with the statelessness of REQ/REP
> and so it should be solved in the transport layer.  The particular method
> to solve this within the transport layer is probably something like
> client-side certificates, but this continues to invoke new TLS mechanisms
> that bloat the prelude.  Again, modern systems generally build this into
> the handshake at significant savings.
> 
> In general, when you compare TLS against non-standard alternatives,
> today's alternatives are roughly 2x better or even more, whether you are
> looking at latency overhead, data transfer overhead, or some other
> benchmark.  This is a good risk to take if you’re on a bad network, or if
> high performance is of significant concern on a fast one.

That's definitely a fair concern.

> Another problem is that AES (really the only “good” cipher allowed in TLS)
> represents a significant burden on the (ARM) chips using it.  AES tops out
> at about 35 cycles/byte on ARM, and meanwhile there are “considered very
> strong” ciphers like Salsa20 that clock in at only 4 cycles/byte.  This is
> a very significant savings for my use case, and something that is a very
> good calculated risk for any non-x86 architecture (and for x86, it’s not a
> bad bet either).

That's not strictly true - GnuTLS supports Salsa20 and poly1305 (and judging 
by the discussion on the IETF TLS list will likely add ChaCha soon), the 
1.0.2-aead branch of OpenSSL has ChaCha and poly1305 courtesy of Adam 
Langley, and the IETF is in the process of standardizing ChaCha+poly1305 as 
an official, blessed AEAD mode.
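Those cycles/byte figures translate into throughput roughly as follows. This assumes a hypothetical 1 GHz ARM core doing nothing but encryption, so it's an upper bound, not a benchmark:

```python
# Rough single-core throughput from the cycles/byte figures quoted above.
CLOCK_HZ = 1_000_000_000  # assumed 1 GHz ARM core

def mb_per_sec(cycles_per_byte):
    return CLOCK_HZ / cycles_per_byte / 1_000_000

print("AES     ~%.0f MB/s" % mb_per_sec(35))
print("Salsa20 ~%.0f MB/s" % mb_per_sec(4))
```

Roughly 29 MB/s versus 250 MB/s, which is why the choice of cipher matters so much on chips without AES hardware acceleration.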

> For all of these reasons, TLS is a bad choice for me.  My performance
> requirements are too tight.  Really the only thing that TLS can offer me
> is its additional years of cryptographic scrutiny, but it is those very
> same years that have bloated it into the poorly-performing protocol it is
> today, so those are mutually exclusive goals. It makes much more sense for
> me to pick something that’s been vetted for a few years and much faster
> than something vetted for decades and much slower.

Yeah, the RTT cost of TLS is the most convincing thing there (and it's quite 
convincing). One thing this has solidified for me is that security at the 
transport layer in nanomsg should be implemented as an additional state 
machine interface between the underlying transport and the SP layer. That 
would allow bolting either TLS (for those who need or want it) or CurveCP or 
something else on at runtime via the URI (tcp+curve:// vs tcp+tls:// vs ...).
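A minimal sketch of what that URI dispatch could look like. The names here are hypothetical illustrations, not nanomsg API:

```python
# Sketch: the part after "+" in the transport specifier selects the
# security state machine slotted between the raw transport and SP layer.
# SECURITY_LAYERS and parse_transport are hypothetical, not nanomsg API.
SECURITY_LAYERS = {None: "plaintext", "tls": "TLS", "curve": "CurveCP-style"}

def parse_transport(uri):
    scheme, _, rest = uri.partition("://")
    transport, _, security = scheme.partition("+")
    return transport, security or None, rest

transport, security, addr = parse_transport("tcp+curve://broker.example:5555")
print("transport:", transport, "security:", SECURITY_LAYERS[security])
```

The nice property is that plain tcp:// keeps working unchanged, and the security layer is an explicit, visible part of the address rather than hidden configuration.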

>> The thing there is that once people are using it (even if it's alpha)
>> there will be a good bit of pressure to keep compatibility. For something
>> with the associated costs of failure that crypto does, that's a problem
> 
> 
> I think it’s important to include in this discussion the opportunity cost
> of doing nothing (or of doing something that isn’t good enough to be used
> broadly).
> 
> My philosophy generally is that crypto is simply good hygiene.  Due to
> recent world events, I wouldn’t deploy, even within my own datacenter,
> unsecured messaging systems in production.  So as a person doing a
> drive-by on the nanomsg website, the project is immediately dismissible.

I feel much the same way, with a caveat: _good_ crypto is good hygiene. If 
crypto is designed or implemented poorly, and doesn't provide the guarantees 
expected of it, then all it provides is a false sense of security - 
something I feel is _actively_ dangerous.

> TLS solves some problems for some people.  But for those applications that
> have tight performance requirements, it’s insufficient.  And so even if
> TLS was implemented, the people who have (or who think they have) tight
> performance requirements would continue to do the drive-by and land on
> CurveZMQ or similar.  And some significant fraction of users will choose
> to use nanomsg without encryption against their better judgment.

This, I will agree with - hence my increased conviction in having it as 
(another) layer that can be swapped out (and by putting it into the 
transport specifier, it makes clear that they are different protocols and 
that the same approaches that a nanomsg user would take when changing from 
ipc:// to tcp:// come into play).

> Meanwhile I don’t think there’s a significant fraction of people who would
> land on nanomsg’s website, see a big fat *WARNING alpha stage* in bold
> text, with releases labeled “0.2 alpha”, and conclude from this
> information that there is a robust, well-vetted crypto implementation with
> strong backwards compatibility guarantees.

Oh, they won't conclude that. They'll start using nanomsg 0.2 alpha, they'll 
use the crypto because it's just good hygiene, someone else will start 
relying on the service they wrote for testing purposes, and then they'll face 
the nightmare of having to update every client in order to switch to the 
proper, secure protocol.

This happens with internal webservers, internal fileservers, pre-release 
protocols, _anything_ which is useful. If it's useful, people use it. If 
people use it, people will yell loudly if it breaks for any reason, even (or 
especially) a good reason. It's one of the things that is exceedingly 
frustrating about sysadmin/netadmin work.

It's the big reason the Linux kernel is so ironclad about not breaking the 
ABI, and therefore so reluctant to add new APIs.

> I mean, I can see that problem at 1.0.  But hopefully, by 1.0, you
> actually _do_ have a robust, broadly-applicable crypto solution, because
> anything less (including no crypto, including TLS-speed crypto) really
> isn’t production-ready software that is prepared to be deployed in a wide
> variety of messaging environments.  Sure, shipping bad crypto is a scary
> failure mode, but shipping no crypto or shipping crypto that works for
> slow applications are equally-important failure modes that have
> equally-scary consequences (“my data was stolen because I can’t use your
> crypto options”).

* No crypto: "I knew it was unencrypted."
* Slow crypto: "I knew I could encrypt it, but chose not to."
* Bad crypto: "I knew I needed security, and I chose to use it, but I was 
misled about its capabilities."

I'd say the last one is a categorically different failure mode, because it 
takes the choice away from the user entirely and becomes a breach of trust.

>> then we face the risk of people
>> either a.) not updating at all or b.) using retry-fallbacks the way
>> browsers do with TLS versions, and all the security nightmares thereof.
> 
> In my view, especially at this stage in nanomsg’s lifecycle, and arguably
> even at “mature” stages like zeromq, it is a reasonable assumption that
> somebody (the same somebody) controls both ends of the pipe.  So for
> example, in my client/server architecture, I control the client and
> the server, and I can upgrade them at near the same time.

Maybe yes, maybe no - I know I've been tossing ideas around in my head for a 
decentralized name service based on (S/)Kademlia and HashCash-style proof-
of-work; Kademlia's semantics map beautifully to the REQ/REP pattern which 
could make the coding far easier, but it'd be a massive bunch of nodes owned 
by an equally massive number of _different_ entities.

> It is also true in my case, and I suspect it is true in many cases, that
> there is already some plan to deal with breaking application-layer
> compatibility.  For example, I might change the format for some request,
> and there is a plan in place to deal with this change.  That same plan
> could be trivially extended to deal with breaking changes in the transport
> layer.

That is true, and calls back to why I think transport-oriented security 
should be an additional state machine that appears in the transport part of 
the URI - that way we can re-use known, practiced migration mechanisms if it 
becomes necessary.

[1] "[A] nonce should only be used once for each key. One interesting 
question here is what happens if a nonce is reused by accident:

    * The stream cipher will leak the xor of the plaintexts. In many cases 
this allows the reconstruction of both plaintexts.
    * The MAC is broken as well, since universal hashing relies on unique 
nonces." (from [2])

[2] https://codesinchaos.wordpress.com/2012/09/09/curvecp-1/

[3] See also: The Debian weak keys incident, WEP's reuse of IVs, the many 
difficulties in implementing DSA securely with Weierstrass curves (which is 
why Curve25519 is a specific form of Edwards curve), the CBC padding attacks 
on TLS itself, etc. Another good resource: 
http://cseweb.ucsd.edu/~mihir/cse107/yoshi.pdf

