[nanomsg] Re: What has changed since 0.2 in socket handling?

  • From: Boszormenyi Zoltan <zboszor@xxxxx>
  • To: "nanomsg@xxxxxxxxxxxxx" <nanomsg@xxxxxxxxxxxxx>
  • Date: Sat, 29 Nov 2014 08:27:12 +0100

Hi,

Sorry for not replying to your answer directly; I only re-subscribed
recently and did not receive the answer through the mailing list.

I sent the test program that integrates networking into a GLIB
mainloop in private. The real code we use allows switching
between ZeroMQ 3 (3.2.4, to be exact) and nanomsg at
configure time, and uses static inline wrappers and #defines
for this purpose. We only use the REQ/REP pattern at the moment.

The currently attached test programs (obvious ones, really)
do exhibit the same problem I described in the first mail on
Fedora 20 and 21. Messaging stops after a few (2 to 8) thousand
messages.

Similar code (or the wrapper API with GLIB mainloop integration)
that uses ZeroMQ did not stop: I ran one test overnight, and after
about 72 million packets the program was still running stably
and without any leaks. Again, this was on ZeroMQ 3.2.4.

Regarding the closed sockets in TIME_WAIT state, I noticed that
they slow down ZeroMQ, too, but don't make it lock up. Setting
these sysctl variables helps eliminate the slowdown by instructing
the kernel to reuse those sockets more aggressively:

net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1

Unfortunately, this didn't help nanomsg.

Best regards,
Zoltán Böszörményi

On 2014-11-21 21:46, Boszormenyi Zoltan wrote:
> Hi,
>
> I use nanomsg with a wrapper library that integrates the networking
> request-response pattern into the GLIB mainloop via
> nn_getsockopt(NN_SOL_SOCKET, NN_RCVFD).
>
> IIRC, it worked well and without any leaks back then with nanomsg 0.2-ish.
>
> Now, I have upgraded to 0.5 and e.g. on Fedora 20 and 21, my example
> programs lock up after some time. netstat shows there are many sockets
> in TIME_WAIT state even after both the client and server programs have quit.
>
> Also, this memory leak was observed on both Fedora 20 and 21:
>
> ==18504== 43,776 (21,888 direct, 21,888 indirect) bytes in 342 blocks are definitely lost in loss record 3,232 of 3,232
> ==18504==    at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==18504==    by 0x3E902DA99C: gaih_inet (in /usr/lib64/libc-2.18.so)
> ==18504==    by 0x3E902DE38C: getaddrinfo (in /usr/lib64/libc-2.18.so)
> ==18504==    by 0x5085FEF: handle_requests (in /usr/lib64/libanl-2.18.so)
> ==18504==    by 0x3E90E07EE4: start_thread (in /usr/lib64/libpthread-2.18.so)
> ==18504==    by 0x3E902F4B8C: clone (in /usr/lib64/libc-2.18.so)
>
> My understanding with nanomsg 0.2 was that I need these with REQ/REP:
>
> server:
> initialization: nn_socket, nn_bind
> in the handler loop: nn_recv[msg] + nn_freemsg on the incoming message, then nn_send[msg] to the client
> when quitting: nn_close
>
> client (per REQ/REP message exchange):
> nn_socket, nn_connect, nn_send[msg], nn_recv[msg], nn_close
>
> Do I need to nn_close() the socket on the server side or anything else
> after the reply was sent?
>
> Thanks in advance,
> Zoltán Böszörményi
>

/* Server: echoes each request back on a single REP socket */
#include <stdio.h>
#include <string.h>
#include <nanomsg/nn.h>
#include <nanomsg/reqrep.h>

int main(void) {
        char progress[4] = "|/-\\";
        int progress_idx = 0;
        int packets = 0;
        int socket;

        socket = nn_socket(AF_SP, NN_REP);
        nn_bind(socket, "tcp://*:6799");

        for (;;) {
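                int msglen;     /* nn_recv returns int; negative means error */
                void *msg, *msg_reply;

                msglen = nn_recv(socket, &msg, NN_MSG, 0);
                if (msglen < 0) {
                        fprintf(stderr, "nn_recv: %s\n", nn_strerror(nn_errno()));
                        break;
                }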

                msg_reply = nn_allocmsg(msglen, 0);
                memcpy(msg_reply, msg, msglen);
                nn_send(socket, &msg_reply, NN_MSG, NN_DONTWAIT);

                nn_freemsg(msg);

                printf("%c processed: %d                \r",
                       progress[progress_idx], ++packets);
                fflush(stdout);
                progress_idx = (progress_idx + 1) % 4;
        }

        nn_close(socket);

        return 0;
}

/* Client: opens a new REQ socket for every request/reply exchange */
#include <string.h>
#include <stdio.h>
#include <nanomsg/nn.h>
#include <nanomsg/reqrep.h>

int main(int argc, char **argv) {
        char progress[4] = "|/-\\";
        int progress_idx = 0;
        int packets_sent = 0, packets_received = 0;

        for (;;) {
                int socket;
                char *msg, *msg_reply;
                int msglen;     /* nn_recv returns int; negative means error */

                socket = nn_socket(AF_SP, NN_REQ);
                nn_connect(socket, "tcp://localhost:6799");
                msg = nn_allocmsg(64, 0);
                memset(msg, 0, 64);
                msg[0] = 1;
                nn_send(socket, &msg, NN_MSG, NN_DONTWAIT);
                packets_sent++;

                /* Wait for reply */
                msglen = nn_recv(socket, &msg_reply, NN_MSG, 0);
                if (msglen < 0) {
                        /* No buffer was allocated, so there is nothing to free */
                        fprintf(stderr, "nn_recv: %s\n", nn_strerror(nn_errno()));
                        nn_close(socket);
                        continue;
                }
                if (msglen == 64 && msg_reply[0] == 1)
                        packets_received++;
                else {
                        int i;
                        printf("bad packet: len %d ", msglen);
                        for (i = 0; i < msglen; i++)
                                printf("0x%02x ", (unsigned char)msg_reply[i]);
                        printf("\n");
                }
                nn_freemsg(msg_reply);

                nn_close(socket);

                printf("%c sent: %d received: %d                \r",
                       progress[progress_idx], packets_sent, packets_received);
                fflush(stdout);
                progress_idx = (progress_idx + 1) % 4;
        }

        return 0;
}
