ppc64 oops..

Benjamin Herrenschmidt benh at kernel.crashing.org
Tue Nov 15 18:50:22 EST 2005


On Mon, 2005-11-14 at 22:40 -0800, Linus Torvalds wrote:
> 
> On Mon, 14 Nov 2005, Linus Torvalds wrote:
> > 
> > I'm just about to boot something that added some printk's to mm/bootmem.c 
> > that should be equivalent. But then I'm really turning in.
> 
> Ok, looks like a ppc64 bug:
> 
> 	Top of RAM: 0x180000000, Total RAM: 0x100000000
> 	Memory hole size: 2048MB
> 	Freeing bootmem 0,6442450944
> 	Reserving bootmem 0, 7589888
> 	Reserving bootmem 30408704, 1433600
> 	Reserving bootmem 34988032, 262144
> 	Reserving bootmem 268427264, 8192
> 	Reserving bootmem 2130702336, 16781312
> 	Reserving bootmem 6375018496, 196608
> 	Reserving bootmem 6375216128, 7048
> 	Reserving bootmem 6375223296, 4096
> 	Reserving bootmem 6375229016, 67221928
> 
> That's the trace from mm/bootmem.c. 
> 
> So the ppc64 boot code adds one 6GB region, not two 2GB ones.
> 

Mine, which has 3.5GB, does:

free_bootmem_core(0, 80000000)
free_bootmem_core(100000000, 60000000)
reserve_bootmem_core(0, 702000)
reserve_bootmem_core(1d06000, 40000)
reserve_bootmem_core(ffee000, 12000)
reserve_bootmem_core(7efff000, 1001000)
reserve_bootmem_core(15bfb7000, 2c000)
reserve_bootmem_core(15bfe3700, 888)
reserve_bootmem_core(15bfe4000, 1000)
reserve_bootmem_core(15bfe55c0, 401aa40)

Which looks correct. 
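
A quick sanity check of that trace, as a standalone userspace sketch rather than kernel code (the struct and function names are made up for illustration): the two freed ranges should add up to 3.5GB and leave a 2GB gap between them.

```c
#include <stdint.h>

/* Hypothetical type for illustration; mirrors the (base, size) pairs in the trace. */
struct range {
	uint64_t base;
	uint64_t size;
};

/* The two free_bootmem_core() calls from the trace above. */
const struct range freed[] = {
	{ 0x0ULL,         0x80000000ULL },	/* 2GB starting at 0 */
	{ 0x100000000ULL, 0x60000000ULL },	/* 1.5GB starting at 4GB */
};

/* Sum of all range sizes, in bytes. */
uint64_t total_bytes(const struct range *r, int n)
{
	uint64_t sum = 0;

	for (int i = 0; i < n; i++)
		sum += r[i].size;
	return sum;
}

/* Gap between the end of range a and the start of range b, in bytes. */
uint64_t gap_bytes(const struct range *a, const struct range *b)
{
	return b->base - (a->base + a->size);
}
```

total_bytes(freed, 2) comes to 0xe0000000 (3584MB) and gap_bytes(&freed[0], &freed[1]) to 0x80000000, so the hole is kept out of the freed ranges.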

However, I just noticed there is some big bogosity in CONFIG_NUMA,
arch/powerpc/mm/numa.c:

static void __init setup_nonnuma(void)
{
	unsigned long top_of_ram = lmb_end_of_DRAM();
	unsigned long total_ram = lmb_phys_mem_size();

	printk(KERN_INFO "Top of RAM: 0x%lx, Total RAM: 0x%lx\n",
	       top_of_ram, total_ram);
	printk(KERN_INFO "Memory hole size: %ldMB\n",
	       (top_of_ram - total_ram) >> 20);

	map_cpu_to_node(boot_cpuid, 0);
	add_region(0, 0, lmb_end_of_DRAM() >> PAGE_SHIFT);
	node_set_online(0);
}

That is absolute junk. It adds a single region from 0 to the end of DRAM,
so it totally ignores the IO hole and will trigger exactly what you
mentioned.
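
To put numbers on it, here is a small userspace sketch (not kernel code) using the "Top of RAM" and "Total RAM" values from the quoted trace. It assumes 4K pages; the helper names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4K pages */

/* Pages spanned by one region running from 0 to top_of_ram. */
uint64_t pages_spanned(uint64_t top_of_ram)
{
	return top_of_ram >> SKETCH_PAGE_SHIFT;
}

/* Pages actually backed by RAM. */
uint64_t pages_present(uint64_t total_ram)
{
	return total_ram >> SKETCH_PAGE_SHIFT;
}

/* Pages inside the single region that really belong to the hole. */
uint64_t pages_in_hole(uint64_t top_of_ram, uint64_t total_ram)
{
	return pages_spanned(top_of_ram) - pages_present(total_ram);
}
```

With top_of_ram = 0x180000000 and total_ram = 0x100000000, pages_in_hole() gives 0x80000 pages, i.e. the 2048MB hole ends up inside the region handed to node 0.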

However, for that code to be reached, you need to have both:

1) CONFIG_NUMA
2) numa=off on the command line

Is this the case?

I'll try to catch the NUMA folks to fix that crap... I'm not exactly
sure what the best way to proceed is. Probably, when NUMA is disabled,
we should still go through all the nodes, but add all the regions to
the same kernel-side node.
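
One possible shape for that, as a userspace model only and not a tested patch: walk each memory region and add it separately, always with node 0. The types, the table and the _sketch functions below are hypothetical stand-ins. The argument order (node, start pfn, page count) is assumed from how add_region() is called in setup_nonnuma() above, and 4K pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT  12	/* assumption: 4K pages */
#define SKETCH_MAX_REGIONS 8

/* Hypothetical stand-in for a physical memory range (base and size in bytes). */
struct mem_region {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical record of what was handed to each node. */
struct node_range {
	int nid;
	uint64_t start_pfn;
	uint64_t pages;
};

struct node_range node_ranges[SKETCH_MAX_REGIONS];
int nr_node_ranges;

/* Userspace model of add_region(nid, start_pfn, pages): just records the call. */
void add_region_sketch(int nid, uint64_t start_pfn, uint64_t pages)
{
	assert(nr_node_ranges < SKETCH_MAX_REGIONS);
	node_ranges[nr_node_ranges].nid = nid;
	node_ranges[nr_node_ranges].start_pfn = start_pfn;
	node_ranges[nr_node_ranges].pages = pages;
	nr_node_ranges++;
}

/*
 * NUMA disabled: still walk every region, but put them all on node 0,
 * so holes between regions are never added. Returns total pages added.
 */
uint64_t setup_nonnuma_sketch(const struct mem_region *mem, int n)
{
	uint64_t total = 0;

	nr_node_ranges = 0;
	for (int i = 0; i < n; i++) {
		uint64_t start_pfn = mem[i].base >> SKETCH_PAGE_SHIFT;
		uint64_t pages = mem[i].size >> SKETCH_PAGE_SHIFT;

		add_region_sketch(0, start_pfn, pages);
		total += pages;
	}
	return total;
}
```

Fed the two ranges from my trace (0/80000000 and 100000000/60000000), this records two entries on node 0 and 0xe0000 pages in total, with nothing covering the hole.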

Ben.




