Stack unwinding problem on Solaris x64

  • From: Dmitri Shubin <sbn@xxxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Thu, 02 Aug 2012 15:51:00 +0400

Hello!

I've run into a strange problem with LuaJIT stack unwinding on Solaris x64.
If I build everything with GCC and use the unwinder from libgcc_s, everything works fine. But when I link a GCC-built libluajit.a into an executable built with Sun Studio 12.2, which uses the standard Solaris unwinder from libc, I get a crash in the err_unwind() function due to an invalid lua_State pointer.

As it turned out, this is caused by an invalid value returned by _Unwind_GetCFA() (called from lj_err_unwind_dwarf()).

E.g. in the core file I can see the following stack:

  [1] err_unwind(0x71255e31, 0x6e0bd80, 0x0, 0x6e0bd80, 0x6e0b5d0, 0x7124bbb5), at 0x7124b875
  [2] lj_err_unwind_dwarf(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x7124bc25
  [3] _Unwind_RaiseException_Body(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffaae0bfc
  [4] _SUNW_Unwind_RaiseException(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffaae0de9
  [5] err_raise_ext(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x7124be2c
  [6] lj_err_throw(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x7124be89
  [7] lj_trace_err_info(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x7129b8d8
  [8] lj_record_ins(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x7127f0ec
  [9] trace_state(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x7129d7cd
  [10] lj_vm_cpcall(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x712473ba
  [11] lj_trace_ins(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x7129db8d
  [12] lj_dispatch_ins(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x71255e31
  [13] lj_vm_inshook(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x71248a41
  [14] lua_pcall(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x7125a584
...

lj_err_unwind_dwarf() is the personality routine for lj_vm_cpcall() (frame 10), so it's called to process that stack frame.

The x86_64 psABI says that _Unwind_GetCFA() "returns the 64-bit Canonical Frame Address which is defined as the value of %rsp at the call site in the previous frame." (section 6.2.5, Context Management)

So if I understand this correctly, *(CFA - 8) should contain an address somewhere inside the caller of lj_vm_cpcall(), i.e. in lj_trace_ins() (frame 11).
With the standard Solaris unwinder this is in fact true, but it crashes.
With the GCC unwinder the return address is somewhere inside lj_vm_cpcall() itself, i.e. AFAIU the CFA for trace_state() (frame 9) is returned instead, but it works.
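
To make the *(CFA - 8) reasoning concrete, here is a minimal toy model in C of a downward-growing stack. It is only a sketch, not unwinder or LuaJIT code: ToyStack, toy_call(), toy_ret_addr() and toy_replay() are hypothetical names, the PCs of frames 11 and 10 from the backtrace are treated as return addresses, and the nine extra frame slots are an assumption chosen so that the two CFAs end up 80 bytes apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a downward-growing x86_64 stack. Illustration only. */
enum { TOY_WORDS = 64 };            /* 64 slots of 8 bytes each */

typedef struct {
    uint64_t mem[TOY_WORDS];
    size_t sp;                      /* slot index standing in for %rsp / 8 */
} ToyStack;

static void toy_init(ToyStack *s) {
    for (size_t i = 0; i < TOY_WORDS; i++) s->mem[i] = 0;
    s->sp = TOY_WORDS;              /* empty stack: sp is one past the top slot */
}

/* Simulate a `call`: remember %rsp at the call site (the callee's CFA),
 * push the return address, then reserve the callee's own frame slots.
 * Returns the callee's CFA as a slot index. */
static size_t toy_call(ToyStack *s, uint64_t ret_addr, size_t frame_words) {
    assert(s->sp > frame_words);    /* room for return address plus frame */
    size_t cfa = s->sp;
    s->mem[--s->sp] = ret_addr;
    s->sp -= frame_words;
    return cfa;
}

/* *(CFA - 8): the 8-byte slot just below the CFA holds the return address. */
static uint64_t toy_ret_addr(const ToyStack *s, size_t cfa) {
    assert(cfa >= 1 && cfa <= TOY_WORDS);
    return s->mem[cfa - 1];
}

/* Replay frames 11 -> 10 -> 9 of the backtrace in the toy model. */
static void toy_replay(ToyStack *s, size_t *cfa_cpcall, size_t *cfa_trace_state) {
    toy_init(s);
    /* lj_trace_ins() calls lj_vm_cpcall(); 9 frame slots are an assumption. */
    *cfa_cpcall = toy_call(s, 0x7129db8dULL, 9);
    /* lj_vm_cpcall() calls trace_state(). */
    *cfa_trace_state = toy_call(s, 0x712473baULL, 0);
}
```

In this model, reading the slot below the CFA of lj_vm_cpcall() yields the address inside lj_trace_ins(), while reading the slot below the CFA of trace_state() yields the address inside lj_vm_cpcall(), which mirrors the two behaviours described above.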

I can make the standard unwinder work by subtracting 80 bytes from the returned CFA so that it points to the next frame, but I'd like to understand why this happens.
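
The adjustment described above could look roughly like the following C sketch. adjust_cfa(), CFA_ADJUST_BYTES and the is_solaris_libc_unwinder flag are hypothetical and not part of LuaJIT; the value 80 is simply the offset observed in this case, and how the unwinder in use would be detected is left open.

```c
#include <stdint.h>

/* Hypothetical constant: the 80-byte offset observed in this report. */
#define CFA_ADJUST_BYTES 80u

/* Hypothetical helper, not part of LuaJIT: move the CFA down by 80 bytes
 * when it came from the standard Solaris unwinder, and leave it unchanged
 * otherwise (e.g. for the unwinder from libgcc_s). */
static uintptr_t adjust_cfa(uintptr_t cfa, int is_solaris_libc_unwinder) {
    return is_solaris_libc_unwinder ? cfa - CFA_ADJUST_BYTES : cfa;
}

/* Possible use inside a personality routine (assumption, shown as a comment):
 *   void *cf = (void *)adjust_cfa((uintptr_t)_Unwind_GetCFA(ctx), 1);
 */
```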

Am I missing something, and is the GCC unwinder right?

TIA
