I'm not surprised by the number being high, I'm surprised by it being
_consistent_ - usually, you _don't_ see that kind of similarity across
embedded ARM boards, normal desktops and laptops, and beefy servers. I
would have expected, for instance, that the turnover point might be lower
on embedded because of generally poorer memory bandwidth. Or that it might
be higher on multi-CPU servers because NUMA overhead for nonlocal access
might dominate in reading from a memfd.

Garrett D'Amore wrote:

> I'm not surprised. As I said. Copying is fast.
>
> Sent from my iPhone
>
>> On Sep 7, 2014, at 11:48 AM, Alex Elsayed
>> <eternaleye@xxxxxxxxx> wrote:
>>
>> On future versions of Linux there's also the option of using memfds,
>> which went in for 3.17 - memfds allow you to allocate an in-memory
>> anonymous file, write to it, and 'seal' it to lock it against changes.
>> The intent is explicitly that these be used for IPC, via the FD passing
>> mechanism.
>>
>> In doing testing to find the performance turnover point relative to
>> passing byte arrays through kdbus (which does exactly two context
>> switches on a one-way message), the developers found that the value
>> where memfds outperformed straight copy in 1-to-1 communication was
>> 512KB, and surprisingly it was the same across platforms - x86, ARM,
>> x86_64, etc.
>>
>> See https://dvdhrm.wordpress.com/tag/memfd/ for more info.
>>
>> Garrett D'Amore wrote:
>>
>>> Without some kind of collaboration layer so that parties understand
>>> which addresses are in use and how, it seems kind of useless to me.
>>> All the hard work still lives in the application. This compares
>>> poorly to other nanomsg transports.
>>>
>>> If you want to map up a bunch of data and use nanomsg to send control
>>> data only, and use shmem for data, then just do that. You don't have
>>> to do anything more than you would if you were going to try to have
>>> nanomsg handle this for you.
>>>
>>> As always, KISS.
>>>
>>> A big part of nanomsg's attraction is that it is simple and easy to
>>> build fault-tolerant distributed architectures. Using shared memory
>>> flies in the face of that.
>>>
>>> Sent from my iPhone
>>>
>>>> On Sep 6, 2014, at 5:05 AM, Martin Sustrik
>>>> <sustrik@xxxxxxxxxx> wrote:
>>>>
>>>> Hi Garrett,
>>>>
>>>>> In my experience, people vastly underestimate the performance of
>>>>> bcopy.
>>>>>
>>>>> Unless you are passing around vast amounts of data (why are you
>>>>> using something like nanomsg in that case, btw?) it simply doesn't
>>>>> pay off. You lose the intended performance gains in the extra
>>>>> complexity and locking.
>>>>>
>>>>> To make this work well (and this would not be portable outside of
>>>>> the platform barring unusual measures like RDMA), you'd need a
>>>>> collaboration layer, a very large shared memory region, and some
>>>>> kind of ring, or consume and produce indices in the buffer.
>>>>> Probably better to have two separate buffers, one for each
>>>>> direction, with different MMU settings (cache coherency).
>>>>>
>>>>> This also becomes really fragile. A bug in one program can now
>>>>> take out the other, unless you're very careful to treat the shared
>>>>> memory region with the same kind of care that you do packet data
>>>>> (i.e. don't pass around program state, or pointers, etc.). Don't
>>>>> assume that the other side won't trash the memory.
>>>>>
>>>>> There may be some extreme cases where this complexity is
>>>>> worthwhile; you *could* use nanomsg to do that. But again, why?
>>>>> I'd just map the data up and use POSIX signaling & mutexes to
>>>>> coordinate access. My guess is that this will be simpler than
>>>>> trying to coordinate across a simulated network.
>>>>>
>>>>> I have a hard time imagining that I'd want to forward data received
>>>>> from this over some kind of device to other parties in the nanomsg
>>>>> infrastructure, which is why I don't see much call to make this
>>>>> work with nanomsg.
>>>>
>>>> Agreed with all the above.
>>>>
>>>> However, given that there is a use case where one process allocates
>>>> a large chunk of memory (say 1GB), does some work on it, then passes
>>>> it to another process, etc., I can see no reason why nanomsg should
>>>> not be able to support that.
>>>>
>>>> As already mentioned, most of the infrastructure is already in
>>>> place: nn_allocmsg() is already allocator-agnostic and so can be
>>>> used to allocate a message in shmem (say, for IPC message sizes
>>>> above 1MB):
>>>>
>>>> void *p = nn_allocmsg (2000000, NN_IPC);
>>>>
>>>> Also, as you may recall, there is a "type" field in the IPC protocol
>>>> which can be used to let the other party know that the message is
>>>> coming out-of-band, namely in shmem.
>>>>
>>>> All that being said, note that I am not proposing to do ring buffers
>>>> etc. Just allocate very large messages as chunks of shmem and you
>>>> are done. (Ring buffers are also doable, but hardly worth it IMO.)
>>>>
>>>> Ron, if you are interested in this stuff, feel free to implement it.
>>>> What you have to look at is how nn_allocmsg/nn_freemsg works, add
>>>> allocation of messages in shmem there, then modify the IPC transport
>>>> in such a way that it can transport shmem descriptors in addition to
>>>> the standard IPC bytestream.
>>>>
>>>> Martin