+1.

I have in fact spent time in the past ripping zero copy *out* of NIC
drivers to yield both simpler code and faster operation. The cases for
zero copy today involve either huge messages (NICs can realize a
benefit with TCP large send offload/large receive offload, where the
data to be copied can be quite big), or weird systems with zillions of
cores but limited cache per core (early-generation Niagara processors,
the SPARC-T1 series).

On sane systems, data copying is simpler and faster.

 - Garrett

> On Oct 27, 2014, at 9:23 PM, Martin Sustrik <sustrik@xxxxxxxxxx> wrote:
>
> Hi Steve,
>
>> OK, thanks. But:
>>
>> 1. Is the copy needed because of asynchronous send behavior, where
>> nanomsg returns to the caller before sending the data and takes
>> responsibility to finish the send later? If so, shouldn't the copy
>> occur at that point instead of being unconditionally performed up
>> front?
>
> It could, but it does not at the moment.
>
>> 2. I thought perhaps the reason was to avoid partial sends due to
>> writev emulation, or maybe just because the underlying nanomsg
>> functions for sending take single blocks of data rather than I/O
>> vectors?
>
> No, the only reason is that the code is simpler this way.
>
> As a side note: After years of dealing with messaging I've realised
> that people are irrationally scared of copying data. For example,
> given the following program:
>
>     memcpy (dst, src, 8);
>     counter++;
>
> they are willing to go to great lengths to optimise out the first line
> while not caring at all about the second one. Yet both lines do the
> same amount of data manipulation. In fact, measurements show that
> zero-copy provides a real perf improvement only when messages are
> above ~512kB long.
>
> This point is important because zero-copy techniques break a very
> basic principle of good software design, namely encapsulation: The
> highest layer still has to be aware of the memory layout imposed by
> the lowest layer, because it can't reshuffle the bytes - that would
> not be zero-copy!
>
> The result of the above is a tendency to sacrifice much of the code's
> readability and maintainability to gain a performance improvement in
> a minority of use cases (very large messages).
>
>> 3. Doesn't this penalize all applications, some of them
>> needlessly?
>>
>> The reason I'm interested is that in the Erlang nanomsg driver,
>> Erlang naturally sends I/O vecs to the enm:send() function (i.e.,
>> the I/O vector is essentially a built-in type in Erlang), and so
>> passing them along directly to nanomsg should mean very high
>> efficiency and copy avoidance. So, it would be very handy if the
>> copying was either removed or if the caller could pass a flag to
>> say "I guarantee these data blocks will not be freed, please don't
>> copy them."
>
> Feel free to implement the optimisation (no additions to API are
> needed though), but be aware of the above: You are probably going to
> make the code much more ugly for a rather questionable benefit.
>
> Martin