[haiku-bugs] Re: [Haiku] #5956: PPC boot fails on double lock of mutex

  • From: "bonefish" <trac@xxxxxxxxxxxx>
  • Date: Fri, 14 May 2010 13:39:13 -0000

#5956: PPC boot fails on double lock of mutex
---------------------------+------------------------------------------------
 Reporter:  stevenh        |       Owner:  axeld         
     Type:  bug            |      Status:  new           
 Priority:  normal         |   Milestone:  R1            
Component:  System/Kernel  |     Version:  R1/Development
 Keywords:                 |   Blockedby:                
 Platform:  PowerPC        |    Blocking:  1048          
---------------------------+------------------------------------------------

Comment(by bonefish):

 To give a bit of background on what this is all about: In the kernel
 debugger we have the problem that many commands have to examine/use/print
 kernel data structures, and those structures can be corrupt, or e.g. a
 pointer specified by the user may be incorrect. So it's not unlikely that
 invalid memory is accessed, triggering a page fault. The page fault
 handler normally checks whether the accessed address is OK and, if so,
 maps the underlying page. If the access is invalid and happened in
 userland, the respective team is crashed (or gets a SIGSEGV, if a handler
 is installed for the signal). If the access happened in the kernel, the
 kernel panics.

 Since the kernel sometimes has to access potentially unsafe user
 addresses, there are special functions that do that (user_memcpy(),
 user_strlcpy()). Before accessing the memory, they set a fault handler
 program address for the current thread (thread::fault_handler), which the
 page fault handler returns to when the memory access was illegal. This way
 the functions can simply return an error code in such a case. This only
 really works because the functions are written entirely in assembly and
 thus have full control over all registers (cf. e.g.
 [http://dev.haiku-os.org/browser/haiku/trunk/src/system/kernel/arch/x86/arch_x86.S?rev=35907#L197 arch_cpu_user_memcpy()]
 for x86; for PPC it is written in C and probably doesn't work correctly).
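
 The decision order described in the two paragraphs above can be sketched
 as a small, self-contained C function. This is only an illustration: the
 names fault_outcome, fault_info and classify_page_fault() are
 hypothetical and do not exist in Haiku, and the real handler works on
 architecture-specific exception frames rather than a struct like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Possible outcomes of a page fault, following the description above. */
enum fault_outcome {
	FAULT_MAP_PAGE,         /* valid access: map the page and resume */
	FAULT_JUMP_TO_HANDLER,  /* invalid, but a fault handler is installed */
	FAULT_DELIVER_SIGSEGV,  /* invalid userland access, signal handler set */
	FAULT_CRASH_TEAM,       /* invalid userland access, no signal handler */
	FAULT_KERNEL_PANIC      /* invalid kernel access, nothing to fall back on */
};

/* Hypothetical description of a single fault; not a Haiku structure. */
struct fault_info {
	bool address_valid;
	bool from_userland;
	bool has_sigsegv_handler;
	uintptr_t fault_handler;  /* 0 means no fault handler is installed */
};

enum fault_outcome classify_page_fault(const struct fault_info *info)
{
	assert(info != NULL);

	if (info->address_valid)
		return FAULT_MAP_PAGE;

	/* user_memcpy() and friends install a handler before touching memory. */
	if (info->fault_handler != 0)
		return FAULT_JUMP_TO_HANDLER;

	if (info->from_userland) {
		return info->has_sigsegv_handler
			? FAULT_DELIVER_SIGSEGV : FAULT_CRASH_TEAM;
	}

	return FAULT_KERNEL_PANIC;
}
```

 The point of the fault handler case is that it turns what would otherwise
 be a kernel panic into an ordinary error return.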

 In the kernel debugger things are a bit more complicated. Using something
 like user_memcpy() for every single memory access would be very tedious.
 The alternative used is
 [http://dev.haiku-os.org/browser/haiku/trunk/src/system/kernel/debug/debug.cpp?rev=36748#L1855 debug_call_with_fault_handler()],
 which calls the given function after installing a fault handler similar to
 arch_cpu_user_memcpy() (just using the per-cpu instead of the thread
 structure). Unlike arch_cpu_user_memcpy() it does not have full control
 over the registers though, since the page fault can happen anywhere in the
 called function. To work around that problem
 [http://www.opengroup.org/onlinepubs/9699919799/functions/setjmp.html setjmp()]
 is used to store the register context. If the fault handler is called, it
 is restored via
 [http://www.opengroup.org/onlinepubs/9699919799/functions/longjmp.html longjmp()].

 So more concretely, this is how things work:
  - debug_call_with_fault_handler() (architecture independent):
    - Calls setjmp() to store the register state.
    - Then calls arch_debug_call_with_fault_handler().
  - arch_debug_call_with_fault_handler():
    - Stores the jump buffer pointer (which is possibly needed later) on
 the stack.
    - Sets the CPU's fault handler.
    - Also sets the CPU's fault_handler_stack_pointer. On x86 this is what
 the
 [http://dev.haiku-os.org/browser/haiku/trunk/src/system/kernel/arch/x86/arch_int.cpp?rev=36624#L918 page fault handler]
 sets register ebp to when the fault handler is called.
    - Calls the given function.
    - If the function caused a page fault, the control flow continues at
 the fault handler code. The only reliable register at this point is ebp.
 The function then retrieves the jump buffer pointer from the stack and
 invokes longjmp() to restore all registers and return to
 debug_call_with_fault_handler().
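
 The control flow listed above can be tried out in userland with a small C
 simulation. It is a sketch under loud assumptions: sim_cpu,
 sim_page_fault(), sim_call_with_fault_handler() and
 sim_read_int_command() are made-up names, the "page fault" is an ordinary
 function call instead of a CPU exception, and the jump buffer pointer
 lives in a global stand-in for the per-CPU structure instead of being
 recovered from the stack via ebp.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU structure; not Haiku's cpu_ent. */
struct sim_cpu {
	jmp_buf *fault_jump_buffer;  /* non-NULL while a handler is installed */
};

static struct sim_cpu gSimCpu = { NULL };
static int gLastValueRead = 0;

/* Stand-in for the page fault path: jump back to the saved context. */
void sim_page_fault(void)
{
	/* Without an installed handler a real kernel would panic here. */
	assert(gSimCpu.fault_jump_buffer != NULL);
	longjmp(*gSimCpu.fault_jump_buffer, 1);
}

/* Returns 0 if function() returned normally, 1 if it "faulted". */
int sim_call_with_fault_handler(void (*function)(void *), void *parameter)
{
	jmp_buf jumpBuffer;
	jmp_buf *previous = gSimCpu.fault_jump_buffer;

	if (setjmp(jumpBuffer) != 0) {
		/* We get here via longjmp(): uninstall the handler, report it. */
		gSimCpu.fault_jump_buffer = previous;
		return 1;
	}

	gSimCpu.fault_jump_buffer = &jumpBuffer;
	function(parameter);
	gSimCpu.fault_jump_buffer = previous;
	return 0;
}

/* Example debugger command: reads an int, "faulting" on a NULL address. */
void sim_read_int_command(void *parameter)
{
	const int *address = parameter;

	if (address == NULL)
		sim_page_fault();
	gLastValueRead = *address;
}
```

 Calling sim_call_with_fault_handler(sim_read_int_command, NULL) returns 1
 instead of crashing, while passing the address of a valid int returns 0
 and stores the value read. The real arch_debug_call_with_fault_handler()
 cannot rely on a C-level call like sim_page_fault(), which is why it has
 to be written with the register and stack handling described above.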

 The PPC version of arch_debug_call_with_fault_handler() has to do the
 same, just in a PPC specific manner, obviously. The stack pointer is !r1,
 so that one has to be saved in cpu_ent::fault_handler_stack_pointer.
 Function parameters are passed in !r3-!r10; function return values use
 !r3 (and !r4 for a 64-bit value).

 You'll also have to adjust the
 [http://dev.haiku-os.org/browser/haiku/trunk/src/system/kernel/arch/ppc/arch_int.cpp?rev=36290#L121 page fault part]
 of ppc_exception_entry(), since it doesn't support the
 cpu_ent::fault_handler[_stack_pointer] for the kernel debugger yet. Cf.
 the x86 page fault handler.

 Your function snippet still looks quite incomplete. I also don't see
 where you've got the argc/argv part from -- neither the called function
 nor longjmp() takes parameters like this.

 Also note that CPU_ENT_fault_handler and
 CPU_ENT_fault_handler_stack_pointer (the relative offsets of the
 respective cpu_ent members) are not available for PPC yet, since there's
 no asm_offsets.cpp file for PPC. Have a look at the
 [http://dev.haiku-os.org/browser/haiku/trunk/src/system/kernel/arch/x86/asm_offsets.cpp?rev=34311 x86 counterpart]
 and see how
 [http://dev.haiku-os.org/browser/haiku/trunk/src/system/kernel/arch/x86/Jamfile?rev=36221#L51 it is handled]
 in the respective Jamfile.
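
 To illustrate the general idea of such offset constants (not Haiku's
 actual mechanism, which is in the linked asm_offsets.cpp and Jamfile and
 may work quite differently), here is a minimal C sketch that derives
 "#define" lines from offsetof(). The structure sim_cpu_ent, its layout,
 and the helper names are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout; the real cpu_ent has different and more members. */
struct sim_cpu_ent {
	int cpu_num;
	uintptr_t fault_handler;
	uintptr_t fault_handler_stack_pointer;
};

/* Writes "#define <name> <offset>" into buffer; returns snprintf()'s result. */
int format_offset_define(char *buffer, size_t size, const char *name,
	size_t offset)
{
	assert(buffer != NULL && size > 0);
	return snprintf(buffer, size, "#define %s %zu\n", name, offset);
}

/* Builds the macro name as <prefix>_<member> and computes the offset. */
#define FORMAT_OFFSET_DEFINE(buffer, size, prefix, type, member) \
	format_offset_define(buffer, size, prefix "_" #member, \
		offsetof(type, member))
```

 Formatting the fault_handler member with the prefix "CPU_ENT" produces a
 line defining CPU_ENT_fault_handler, which assembly code could then use
 as a displacement from the cpu_ent pointer. Since the compiler computes
 the offsets, they stay correct when the structure changes.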

 BTW, according to our coding style there should be a space between the
 `//` and the comment text.

 Good luck!

-- 
Ticket URL: <http://dev.haiku-os.org/ticket/5956#comment:5>
Haiku <http://dev.haiku-os.org>
Haiku - the operating system.
