crash in kmem_cache_init
nish.aravamudan at gmail.com
Wed Jan 23 09:12:06 EST 2008
On 1/22/08, Olaf Hering <olaf at aepfle.de> wrote:
> On Tue, Jan 22, Mel Gorman wrote:
> > http://www.csn.ul.ie/~mel/postings/slab-20080122/partial-revert-slab-changes.patch
> > .. Can you please check on your machine if it fixes your problem?
> It does not fix or change the nature of the crash.
> > Olaf, please confirm whether you need the patch below as well as the
> > revert to make your machine boot.
> It crashes now in a different way if the patch below is applied:
Was this with the revert Mel mentioned applied as well? I get the
feeling both patches are needed to fix up the memoryless SLAB issue.
> Linux version 2.6.24-rc8-ppc64 (olaf at lingonberry) (gcc version 4.1.2 20070115 (prerelease) (SUSE Linux)) #43 SMP Tue Jan 22 22:39:05 CET 2008
> early_node_map active PFN ranges
> 1: 0 -> 892928
> Unable to handle kernel paging request for data at address 0x00000058
> Faulting instruction address: 0xc0000000000fe018
> cpu 0x0: Vector: 300 (Data Access) at [c00000000075bac0]
> pc: c0000000000fe018: .setup_cpu_cache+0x184/0x1f4
> lr: c0000000000fdfa8: .setup_cpu_cache+0x114/0x1f4
> sp: c00000000075bd40
> msr: 8000000000009032
> dar: 58
> dsisr: 42000000
> current = 0xc000000000665a50
> paca = 0xc000000000666380
> pid = 0, comm = swapper
> enter ? for help
> [c00000000075bd40] c0000000000fb368 .kmem_cache_create+0x3c0/0x478 (unreliable)
> [c00000000075be20] c0000000005e6780 .kmem_cache_init+0x284/0x4f4
> [c00000000075bee0] c0000000005bf8ec .start_kernel+0x2f8/0x3fc
> [c00000000075bf90] c000000000008590 .start_here_common+0x60/0xd0
> 0xc0000000000fe018 is in setup_cpu_cache (/home/olaf/kernel/git/linux-2.6-numa/mm/slab.c:2111).
> 2106 BUG_ON(!cachep->nodelists[node]);
> 2107 kmem_list3_init(cachep->nodelists[node]);
I might be barking up the wrong tree, but this block above is supposed
to set up the cachep->nodeslists[*] that are used immediately below.
But if the loop wasn't changed from N_NORMAL_MEMORY to N_ONLINE or
whatever, you might get a bad access right below for node 0 that has
no memory, if that's the node we're running on...
> 2108 }
> 2109 }
> 2110 }
> 2111 cachep->nodelists[numa_node_id()]->next_reap =
> 2112 jiffies + REAPTIMEOUT_LIST3 +
> 2113 ((unsigned long)cachep) % REAPTIMEOUT_LIST3;
> 2115 cpu_cache_get(cachep)->avail = 0;
More information about the Linuxppc-dev