Hi André,

please post a gist that lets us reproduce the hang.

Best,
Jason

> On Dec 16, 2014, at 8:51 AM, André Jonsson <andre.jonsson@xxxxxxxxxxxxx>
> wrote:
>
> Hi all,
>
> I'm trying to replace my home-grown message bus with nanomsg in an existing
> application.
> Everything was surprisingly pain-free, until I noticed that nn_close() hangs.
>
> For most sockets it works fine, but in a specific scenario - during shutdown
> of a subsystem - it hangs (it's a SUB socket).
>
> I broke into the process and checked the stack of all threads, and two
> of them were inside nn_glock_lock(), a worker:
>
> #0 0x9fffffffbcc46b90:0 in __ksleep+0x30 () from /usr/lib/hpux64/libc.so.1
> #1 0x9fffffffbd181920:0 in __mxn_sleep+0x1190 ()
>    from /usr/lib/hpux64/libpthread.so.1
> #2 0x9fffffffbd0fbce0:0 in __pthread_mutex_lock_wait_ng+0x260 ()
>    from /usr/lib/hpux64/libpthread.so.1
> #3 0x9fffffffbd0f9c90:0 in __pthread_mutex_lock_ng+0x250 ()
>    from /usr/lib/hpux64/libpthread.so.1
> #4 0x9fffffffbd0f9a20:0 in pthread_mutex_lock+0x20 ()
>    from /usr/lib/hpux64/libpthread.so.1
> #5 0x4000000000381590:0 in nn_glock_lock () at src/utils/glock.c:63
> #6 0x40000000003736a0:0 in nn_global_submit_statistics ()
>    at src/core/global.c:1125
> #7 0x400000000036e2a0:0 in nn_global_handler () at src/core/global.c:1286
> #8 0x400000000037b250:0 in nn_fsm_feed () at src/aio/fsm.c:72
> #9 0x400000000037b1a0:0 in nn_fsm_event_process () at src/aio/fsm.c:66
> #10 0x400000000037ac80:0 in nn_ctx_leave () at src/aio/ctx.c:63
> #11 0x400000000037dc40:0 in nn_worker_routine ()
>    at src/aio/worker_posix.inc:189
> #12 0x4000000000384260:0 in nn_thread_main_routine ()
>    at src/utils/thread_posix.inc:35
>
> ...
> and my thread, trying to close its SUB socket:
>
> #0 0x9fffffffbcc46b90:0 in __ksleep+0x30 () from /usr/lib/hpux64/libc.so.1
> (gdb) bt
> #0 0x9fffffffbcc46b90:0 in __ksleep+0x30 () from /usr/lib/hpux64/libc.so.1
> #1 0x9fffffffbd181920:0 in __mxn_sleep+0x1190 ()
>    from /usr/lib/hpux64/libpthread.so.1
> #2 0x9fffffffbd0fbce0:0 in __pthread_mutex_lock_wait_ng+0x260 ()
>    from /usr/lib/hpux64/libpthread.so.1
> #3 0x9fffffffbd0f9c90:0 in __pthread_mutex_lock_ng+0x250 ()
>    from /usr/lib/hpux64/libpthread.so.1
> #4 0x9fffffffbd0f9a20:0 in pthread_mutex_lock+0x20 ()
>    from /usr/lib/hpux64/libpthread.so.1
> #5 0x4000000000381590:0 in nn_glock_lock () at src/utils/glock.c:63
> #6 0x400000000036ff30:0 in nn_close () at src/core/global.c:571
> #7 0x40000000001eeae0:0 in msg::Bus::Sink::~Sink (this=0x60000000005770f0,
>    No.Identifier_87=2) at message_bus.cxx:194
> (+ more of my stuff)
>
>
> As these two threads are seemingly waiting to lock the same mutex, there must
> be another thread that already has the lock.
>
> Is this the correct assumption? And, how do I find which thread it is?
>
>
> /André
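
On the question of which thread holds the lock: the backtraces above show
who is waiting on the mutex, not who owns it. One possible way to find the
owner, sketched here under assumptions, is to wrap the mutex in a debug
build so that it records the holding thread and the call site. The names
dbg_mutex, dbg_mutex_init, dbg_mutex_lock and dbg_mutex_unlock are
hypothetical; they are not part of nanomsg, and this is not how glock.c is
implemented.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical debug wrapper (not a nanomsg API): a mutex that
   remembers which thread currently holds it and where it was taken. */
struct dbg_mutex {
    pthread_mutex_t mutex;
    pthread_t owner;     /* meaningful only while held != 0 */
    int held;            /* 1 while some thread holds the mutex */
    const char *where;   /* call-site label of the current holder */
};

static void dbg_mutex_init(struct dbg_mutex *m)
{
    int rc = pthread_mutex_init(&m->mutex, NULL);
    assert(rc == 0);
    (void) rc;
    m->held = 0;
    m->where = NULL;
}

static void dbg_mutex_lock(struct dbg_mutex *m, const char *where)
{
    int rc = pthread_mutex_lock(&m->mutex);
    assert(rc == 0);
    (void) rc;
    /* These fields are written only while the mutex is held,
       so they always describe the current holder. */
    m->owner = pthread_self();
    m->where = where;
    m->held = 1;
}

static void dbg_mutex_unlock(struct dbg_mutex *m)
{
    /* Catch an unlock from a thread that does not hold the mutex. */
    assert(m->held && pthread_equal(m->owner, pthread_self()));
    m->held = 0;
    m->where = NULL;
    int rc = pthread_mutex_unlock(&m->mutex);
    assert(rc == 0);
    (void) rc;
}
```

With something like this in place of the plain mutex, printing the struct
from the debugger while the process hangs would show the owner and the
call-site label, which can then be matched against the thread list. If no
other thread turns out to hold the lock, it may be worth checking whether
one of the waiting threads took it earlier in its own call chain.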