[ktap] Re: "BUG: unable to handle kernel paging request" after ktap script to trace SyS_write

  • From: Jovi Zhangwei <jovi.zhangwei@xxxxxxxxx>
  • To: ktap@xxxxxxxxxxxxx
  • Date: Mon, 21 Oct 2013 11:55:55 +0800

On Mon, Oct 21, 2013 at 12:01 AM, Azat Khuzhin <a3at.mail@xxxxxxxxx> wrote:
> On Sun, Oct 20, 2013 at 7:27 PM, Jovi Zhangwei <jovi.zhangwei@xxxxxxxxx> 
> wrote:
>> On Sun, Oct 20, 2013 at 10:22 PM, Azat Khuzhin <a3at.mail@xxxxxxxxx> wrote:
>>> On Sun, Oct 20, 2013 at 12:50 PM, Jovi Zhangwei <jovi.zhangwei@xxxxxxxxx> 
>>> wrote:
>>>> On Sat, Oct 19, 2013 at 5:06 PM, Azat Khuzhin <a3at.mail@xxxxxxxxx> wrote:
>>>>> Hi all,
>>>>>
>>>>> I used next script, and after a few seconds of working, I have all ssh
>>>>> sessions killed.
>>>>> #! /usr/bin/env ktap
>>>>> trace probe:SyS_write {
>>>>>         print(backtrace(0, 100))
>>>>> }
>>>>>
>>>>> However, I can login after this, and have next messages in dmesg:
>>>>>
>>>> Hi Azat,
>>>
>>> Hi Jovi,
>>>
>>>>
>>>> Thanks for your reporting, could you help to check if below two
>>>> scripts works correctly?
>>>>
>>>> #! /usr/bin/env ktap
>>>> trace probe:SyS_write {
>>>>         print_backtrace(0, 100)
>>>> }
>>>
>>> [  107.416689] BUG: unable to handle kernel paging request at 
>>> ffffffff621ee2b5
>>> [  107.417774] IP: [<ffffffff621ee2b5>] 0xffffffff621ee2b4
>>> [  107.418569] PGD 160c067 PUD 0
>>> [  107.419409] Oops: 0010 [#3] PREEMPT SMP
>>> [  107.420038] Modules linked in: ktapvm(O) nfsd auth_rpcgss
>>> oid_registry nfs_acl nfs lockd fscache sunrpc loop fuse joydev
>>> hid_generic usbhid hid snd_pcm microcode snd_page_alloc snd_timer snd
>>> soundcore ohci_pci ohci_hcd ehci_hcd psmouse usbcore i2c_piix4
>>> processor ac ther
>>> mal_sys serio_raw pcspkr parport_pc usb_common evdev parport button
>>> i2c_core ext4 crc16 jbd2 mbcache sd_mod sg crc_t10dif sr_mod cdrom
>>> crct10dif_common ata_generic ahci libahci ata_piix libata e1000
>>> scsi_mod
>>> [  107.420038] CPU: 1 PID: 2654 Comm: less Tainted: G      D    O
>>> 3.12.0-rc5+ #40
>>> [  107.420038] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
>>> VirtualBox 12/01/2006
>>> [  107.420038] task: ffff88007a911790 ti: ffff88007a05c000 task.ti:
>>> ffff88007a05c000
>>> [  107.420038] RIP: 0010:[<ffffffff621ee2b5>]  [<ffffffff621ee2b5>]
>>> 0xffffffff621ee2b4
>>> [  107.420038] RSP: 0018:ffff88007a05df78  EFLAGS: 00010293
>>> [  107.420038] RAX: 0000000000000001 RBX: 00000000000003ff RCX: 
>>> 0000000000000001
>>> [  107.420038] RDX: 00000000000003ff RSI: 0000000000628940 RDI: 
>>> 0000000000000001
>>> [  107.420038] RBP: 000000000000005f R08: 0000000001e49c20 R09: 
>>> 0000000000000000
>>> [  107.420038] R10: 0000000000000001 R11: 0000000000000246 R12: 
>>> 0000000000000000
>>> [  107.420038] R13: 0000000000000043 R14: 0000000000000000 R15: 
>>> 0000000000000000
>>> [  107.420038] FS:  00007ff4a1bda700(0000) GS:ffff88007fd00000(0000)
>>> knlGS:0000000000000000
>>> [  107.420038] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [  107.420038] CR2: ffffffff621ee2b5 CR3: 000000007b91a000 CR4: 
>>> 00000000000006e0
>>> [  107.420038] Stack:
>>> [  107.420038]  000000000000005f ffffffff813b2b22 0000000000000246
>>> 0000000000000001
>>> [  107.420038]  0000000000000000 0000000001e49c20 0000000000000001
>>> 0000000000000023
>>> [  107.420038]  00000000000003ff 0000000000628940 0000000000000001
>>> 0000000000000001
>>> [  107.420038] Call Trace:
>>> [  107.420038]  [<ffffffff813b2b22>] ? system_call_fastpath+0x16/0x1b
>>> [  107.420038] Code:  Bad RIP value.
>>> [  107.420038] RIP  [<ffffffff621ee2b5>] 0xffffffff621ee2b4
>>> [  107.420038]  RSP <ffff88007a05df78>
>>> [  107.420038] CR2: ffffffff621ee2b5
>>> [  107.420038] ---[ end trace fe1cc133ae0119a3 ]---
>>>
>>> And others.
>>>
>>>>
>>>>
>>>> #! /usr/bin/env ktap
>>>> trace probe:schedule {
>>>>         print(backtrace(0, 100))
>>>> }
>>>
>>> This one printed message about 10 tines
>>> and after this system hangs.
>>>
>>> In debugger attached to kernel:
>>> Program received signal SIGSEGV, Segmentation fault.
>>>
>>>
>> Thanks. It seems the kernel crash doesn't caused by backtrace(), but
>> ktapvm core or kernel side.
>
> Yes, I already test another script (without backtrace), that also have
> some problems.
> Do you have this issue?
>
>>
>> Would you please check below scripts again?
>>
>> -------------------------------------------------
>> #!/usr/bin/env ktap
>>
>> trace probe:SyS_write {
>>     print("testing")
>> }
>
> page fault.
>
>> --------------------------------------------------
>> #!/usr/bin/env ktap
>>
>> trace probe:SyS_write {
>> }
>
> page fault.
>
>> --------------------------------------------------
>> #!/usr/bin/env ktap
>>
>> trace probe:schedule {
>> }
>
> Just after this command, I ran dmesg, and received page fault.
> And system hanged.
>
>> -------------------------------------------------
>> #!/usr/bin/env ktap
>>
>> trace syscalls:sys_write_enter {
>> }
>
> "s/sys_write_enter/sys_enter_write/"
> Seems works fine.
>
>> -------------------------------------------------
>> ./ktap test/arith.kp
>
> No output. No errors in kern.log
>
>> -------------------------------------------------
>>
>> And your hardware is x86-64 or ARM-64?
>
> x86-64
>
Unfortunately I cannot reproduce the crash in my local x86-64
box(whatever any kernel version, include 3.12-rc6).

Things to be clear now, kernel will crash only using ktap kprobe,
tracepoint and other ktap basic operation is fine.
It have a little strange that actually ktapvm doesn't know tracepoint
or kprobe in kernel side, ktapvm just treat them as a id,
so it's weird that tracepoint is fine but kprobe is not working.

I guess you ran vanilla kernel (3.12-rc6) with no additional patch,
could you post your kernel config file?
then I can try to reproduce it in my box.

Thank you.

Jovi

Other related posts: