Nodes with no memory
    Dave Hansen
    dave at linux.vnet.ibm.com
    Sat Nov 22 10:50:41 EST 2008

I was handed off a bug report about a blade not booting with a, um
"newer" kernel.  After turning on some debugging messages, I got this
ominous message:
        node 1
        NODE_DATA() = c000000000000000
Which obviously comes from here:
arch/powerpc/mm/numa.c
        for_each_online_node(nid) {
                unsigned long start_pfn, end_pfn;
                unsigned long bootmem_paddr;
                unsigned long bootmap_pages;
                get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
                /* Allocate the node structure node local if possible */
                NODE_DATA(nid) = careful_allocation(nid,
                                        sizeof(struct pglist_data),
                                        SMP_CACHE_BYTES, end_pfn);
                NODE_DATA(nid) = __va(NODE_DATA(nid));
                memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
		...
careful_allocation() returns a NULL physical address, but we go ahead
and run __va() on it, stick it in NODE_DATA(), and memset it.  Yay!
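The address in that debug line fits this reading: on 64-bit powerpc the kernel linear mapping starts at 0xc000000000000000, so running __va() on a zero physical address gives exactly that value, and the result no longer looks like NULL. Here is a minimal user-space sketch of the arithmetic, assuming that offset. SKETCH_PAGE_OFFSET, sketch_va() and failure_visible_after_va() are hypothetical stand-ins for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed start of the ppc64 kernel linear mapping (stand-in for PAGE_OFFSET). */
#define SKETCH_PAGE_OFFSET UINT64_C(0xc000000000000000)

/* Hypothetical stand-in for __va(): physical address -> kernel virtual address. */
static uint64_t sketch_va(uint64_t paddr)
{
	/* Physical addresses in this sketch must sit below the offset. */
	assert(paddr < SKETCH_PAGE_OFFSET);
	return paddr + SKETCH_PAGE_OFFSET;
}

/*
 * Returns 1 if a failed (zero) physical allocation would still compare
 * equal to NULL after translation, 0 if the translation hides the failure.
 */
static int failure_visible_after_va(uint64_t paddr)
{
	uint64_t vaddr = sketch_va(paddr);

	return vaddr == 0;
}
```

With these definitions, sketch_va(0) is 0xc000000000000000 and failure_visible_after_va(0) is 0, so any NULL test has to happen on the physical address, before the translation.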
I seem to recall that we fixed some issues with memoryless nodes a few
years ago, like around the memory hotplug days, but I don't see the
patches anywhere.
I'm thinking that we need to at least fix careful_allocation() to oops
and not return NULL, or check to make sure all its callers check its
return value.  Plus, we probably also need to ensure that all ppc code
doing for_each_online_node() does not assume a valid NODE_DATA() for all
those nodes.
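To make the two ideas concrete, here is a rough user-space model, not the patch mentioned below and not kernel code: setup reports failure for a node with an empty pfn range and leaves its slot NULL, and a loop over nodes skips any slot that is NULL. Every name in it (sketch_node, sketch_setup_node(), sketch_total_pages(), SKETCH_MAX_NODES) is hypothetical, and treating "start_pfn >= end_pfn" as "memoryless" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 4

/* Hypothetical, much-reduced stand-in for struct pglist_data. */
struct sketch_node {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

static struct sketch_node sketch_storage[SKETCH_MAX_NODES];
/* Stand-in for NODE_DATA(): NULL means the node has no data. */
static struct sketch_node *sketch_node_data[SKETCH_MAX_NODES];

/* Set up one node. Returns 0 on success, -1 for a memoryless node. */
static int sketch_setup_node(int nid, unsigned long start_pfn,
			     unsigned long end_pfn)
{
	assert(nid >= 0 && nid < SKETCH_MAX_NODES);

	if (start_pfn >= end_pfn) {
		/* Nothing to allocate from: report it instead of using a bogus pointer. */
		sketch_node_data[nid] = NULL;
		return -1;
	}

	sketch_node_data[nid] = &sketch_storage[nid];
	sketch_node_data[nid]->start_pfn = start_pfn;
	sketch_node_data[nid]->end_pfn = end_pfn;
	return 0;
}

/* Sum the pages of all nodes, skipping nodes that have no data. */
static unsigned long sketch_total_pages(void)
{
	unsigned long total = 0;

	for (int nid = 0; nid < SKETCH_MAX_NODES; nid++) {
		struct sketch_node *node = sketch_node_data[nid];

		if (!node)
			continue;
		total += node->end_pfn - node->start_pfn;
	}
	return total;
}
```

Whether the real code should skip such a node, fall back to another node's memory, or oops is the open question here; the sketch only shows the shape of the checks.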
Any other thoughts?
I'll have a patch for the above issue sometime soon.
-- Dave