[nanomsg] Re: nn_close() hangs

  • From: André Jonsson <andre.jonsson@xxxxxxxxxxxxx>
  • To: nanomsg@xxxxxxxxxxxxx
  • Date: Tue, 16 Dec 2014 18:06:03 +0100 (CET)

I should probably state which platform I'm on:

HP-UX 11.31 (Itanium)

/André

----- Original Message -----
> From: "André Jonsson" <andre.jonsson@xxxxxxxxxxxxx>
> To: nanomsg@xxxxxxxxxxxxx
> Sent: Tuesday, 16 December, 2014 17:51:08
> Subject: nn_close() hangs

> Hi all,
> 
> I'm trying to replace my home-grown message bus with nanomsg in an existing
> application.
> Everything was surprisingly pain-free, until I noticed that nn_close() hangs.
> 
> For most sockets it works fine, but in a specific scenario - during shutdown
> of a subsystem - it hangs (it's a SUB socket).
> 
> I interrupted the process in the debugger and checked the stacks of all
> threads. Two of them were inside nn_glock_lock(). The first is a worker:
> 
> #0  0x9fffffffbcc46b90:0 in __ksleep+0x30 () from /usr/lib/hpux64/libc.so.1
> #1  0x9fffffffbd181920:0 in __mxn_sleep+0x1190 ()
>   from /usr/lib/hpux64/libpthread.so.1
> #2  0x9fffffffbd0fbce0:0 in __pthread_mutex_lock_wait_ng+0x260 ()
>   from /usr/lib/hpux64/libpthread.so.1
> #3  0x9fffffffbd0f9c90:0 in __pthread_mutex_lock_ng+0x250 ()
>   from /usr/lib/hpux64/libpthread.so.1
> #4  0x9fffffffbd0f9a20:0 in pthread_mutex_lock+0x20 ()
>   from /usr/lib/hpux64/libpthread.so.1
> #5  0x4000000000381590:0 in nn_glock_lock () at src/utils/glock.c:63
> #6  0x40000000003736a0:0 in nn_global_submit_statistics ()
>    at src/core/global.c:1125
> #7  0x400000000036e2a0:0 in nn_global_handler () at src/core/global.c:1286
> #8  0x400000000037b250:0 in nn_fsm_feed () at src/aio/fsm.c:72
> #9  0x400000000037b1a0:0 in nn_fsm_event_process () at src/aio/fsm.c:66
> #10 0x400000000037ac80:0 in nn_ctx_leave () at src/aio/ctx.c:63
> #11 0x400000000037dc40:0 in nn_worker_routine ()
>    at src/aio/worker_posix.inc:189
> #12 0x4000000000384260:0 in nn_thread_main_routine ()
>    at src/utils/thread_posix.inc:35
> 
> ... and my thread, trying to close its SUB socket:
> 
> (gdb) bt
> #0  0x9fffffffbcc46b90:0 in __ksleep+0x30 () from /usr/lib/hpux64/libc.so.1
> #1  0x9fffffffbd181920:0 in __mxn_sleep+0x1190 ()
>   from /usr/lib/hpux64/libpthread.so.1
> #2  0x9fffffffbd0fbce0:0 in __pthread_mutex_lock_wait_ng+0x260 ()
>   from /usr/lib/hpux64/libpthread.so.1
> #3  0x9fffffffbd0f9c90:0 in __pthread_mutex_lock_ng+0x250 ()
>   from /usr/lib/hpux64/libpthread.so.1
> #4  0x9fffffffbd0f9a20:0 in pthread_mutex_lock+0x20 ()
>   from /usr/lib/hpux64/libpthread.so.1
> #5  0x4000000000381590:0 in nn_glock_lock () at src/utils/glock.c:63
> #6  0x400000000036ff30:0 in nn_close () at src/core/global.c:571
> #7  0x40000000001eeae0:0 in msg::Bus::Sink::~Sink (this=0x60000000005770f0,
>    No.Identifier_87=2) at message_bus.cxx:194
> (+ more of my stuff)
> 
> 
> As these two threads are seemingly waiting to lock the same mutex, there must
> be another thread that already has the lock.
> 
> Is this the correct assumption? And, how do I find which thread it is?
> 
> 
> /André
