[Bugme-new] [Bug 14148] New: kernel panic: do_wp_page assert_pte_locked failed when DEBUG_VM

Kumar Gala galak at kernel.crashing.org
Sat Sep 12 07:37:19 EST 2009


On Sep 11, 2009, at 3:09 PM, Andrew Morton wrote:

>
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Wed, 9 Sep 2009 15:09:15 GMT
> bugzilla-daemon at bugzilla.kernel.org wrote:
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=14148
>>
>>           Summary: kernel panic: do_wp_page assert_pte_locked failed when
>>                    DEBUG_VM
>>           Product: Platform Specific/Hardware
>>           Version: 2.5
>>    Kernel Version: 2.6.31-rc3, 2.6.31-rc9-git2
>>          Platform: All
>>        OS/Version: Linux
>>              Tree: Mainline
>>            Status: NEW
>>          Severity: normal
>>          Priority: P1
>>         Component: PPC-32
>>        AssignedTo: platform_ppc-32 at kernel-bugs.osdl.org
>>        ReportedBy: wangbj at lzu.edu.cn
>>        Regression: Yes
>>
>>
>> Created an attachment (id=23049)
>> --> (http://bugzilla.kernel.org/attachment.cgi?id=23049)
>> problematic config file for mpc8548cds
>>
>> powerpc mpc8548cds (the only board I have on hand) panics if DEBUG_VM
>> (kernel hacking) is enabled, due to a failed assertion in do_wp_page().
>> I think it is highly likely that other ppc boards such as 44x have the
>> same problem, but I don't have such a board.
>>
>> Here is the full log from power-up (after u-boot); the attachment is the
>> related .config. NOTE: the kernel boots successfully if CONFIG_DEBUG_VM
>> is not enabled.
>>
>> The host system is gentoo; the gcc (powerpc-unknown-linux-gnu-gcc) was
>> built by gentoo crossdev, version 4.4.1; (cross) glibc is 2.9, (cross)
>> binutils is 2.19.1, and the (cross) kernel headers are 2.6.30. The target
>> (mpc8548cds) root filesystem is also gentoo (200907xx, extracted from a
>> stage3 tarball).
>>
>> I have run a similar test on x86 using qemu (0.10.6, +kvm), and the
>> result seems OK; in particular, x86 passes the whole lock API test suite.
>
> First question:
>
>> [10611.192802] ------------[ cut here ]------------
>> [10611.197409] Kernel BUG at c0014d70 [verbose debug info unavailable]
>
> Why did we not get the file-n-line?  That's irritating.
>
> Oh, CONFIG_DEBUG_BUGVERBOSE=n.  Don't do that.  We should make that thing
> harder to get at, to stop people shooting our feet off.
>
>> [10611.203660] Oops: Exception in kernel mode, sig: 5 [#1]
>> [10611.208866] PREEMPT MPC85xx CDS
>> [10611.211997] Modules linked in:
>> [10611.215040] NIP: c0014d70 LR: c0014eb4 CTR: 00000002
>> [10611.219988] REGS: cf82db40 TRAP: 0700   Not tainted  (2.6.31-rc3)
>> [10611.226061] MSR: 00029000 <EE,ME,CE>  CR: 88448044  XER: 20000000
>> [10611.232162] TASK = cf828000[1] 'init' THREAD: cf82c000
>> [10611.237108] GPR00: 00000001 cf82dbf0 cf828000 cf9781c0 bf8031d8 cf9f400c 0057902f 00000001
>> [10611.245471] GPR08: cf978200 cf9f4000 00000002 00000000 28448042 1001b0b0 00000001 cf88ee00
>> [10611.253833] GPR16: c05c0000 bf8031d8 00000002 10000000 48000000 00000001 00000008 c05ecf20
>> [10611.262196] GPR24: 0057902b 0057902f cf82c000 00000000 cf9f400c 00000001 bf8031d8 cf98b000
>> [10611.270749] NIP [c0014d70] assert_pte_locked+0x3c/0x44
>> [10611.275872] LR [c0014eb4] ptep_set_access_flags+0xa8/0xf4
>> [10611.281252] Call Trace:
>> [10611.283687] [cf82dbf0] [bf8031d8] 0xbf8031d8 (unreliable)
>> [10611.289079] [cf82dc10] [c008e87c] do_wp_page+0xf8/0x82c
>> [10611.294292] [cf82dc60] [c0014770] do_page_fault+0x2c0/0x480
>> [10611.299851] [cf82dd10] [c0011078] handle_page_fault+0xc/0x80
>> [10611.305504] [cf82ddd0] [c00f2b4c] load_elf_binary+0x8a8/0x121c
>> [10611.311325] [cf82de50] [c00af418] search_binary_handler+0x144/0x37c
>> [10611.317578] [cf82dea0] [c00b0bc8] do_execve+0x270/0x2c8
>> [10611.322794] [cf82dee0] [c0008754] sys_execve+0x68/0xa4
>> [10611.327919] [cf82df00] [c0010c38] ret_from_syscall+0x0/0x3c
>> [10611.333482] [cf82dfc0] [c00b9350] sys_dup+0x38/0x78
>> [10611.338349] [cf82dfd0] [c0002030] init_post+0x94/0x108
>> [10611.343478] [cf82dfe0] [c054c234] kernel_init+0x114/0x130
>> [10611.348865] [cf82dff0] [c00109b8] kernel_thread+0x4c/0x68
>> [10611.354249] Instruction dump:
>> [10611.357206] 4d9e0020 38000000 0f000000 0f000000 81230024 5480653a 7c09002e 54090027
>> [10611.364959] 7c000026 54001ffe 0f000000 38000001 <0f000000> 4e800020 7c0802a6 9421fff0
>> [10611.372887] ---[ end trace 0cda2392272f221a ]---
>
> So do_wp_page() called ptep_set_access_flags().  If CONFIG_DEBUG_VM=y,
> powerpc's ptep_set_access_flags() will call
> arch/powerpc/mm/pgtable.c:assert_pte_locked().  Because of the lack of
> file-n-line info it is unclear which of those many assertions
> triggered.  It looks like BUG_ON(!pmd_present(*pmd)).  Perhaps.
>
>
> Please set CONFIG_DEBUG_BUGVERBOSE=y in your .config and then tell us
> (via emailed reply-to-all) which line in arch/powerpc/mm/pgtable.c
> triggered the BUG.  Please actually quote that line, or tell us exactly
> which kernel version you're using so we can see which line it was in
> the source code.
>
> Thanks.

I think I fixed this:

commit 797a747a82e23530ee45d2927bf84f3571c1acb2
Author: Kumar Gala <galak at kernel.crashing.org>
Date:   Tue Aug 18 15:21:40 2009 +0000

     powerpc/mm: Fix assert_pte_locked to work properly on uniprocessor

     Since the pte_lockptr is a spinlock it gets optimized away on
     uniprocessor builds so using spin_is_locked is not correct.  We can use
     assert_spin_locked instead and get the proper behavior between UP and
     SMP builds.

     Signed-off-by: Kumar Gala <galak at kernel.crashing.org>
     Signed-off-by: Benjamin Herrenschmidt <benh at kernel.crashing.org>

But the patch was queued up for .32, not .31.

- k


More information about the Linuxppc-dev mailing list