Nodes with no memory

Dave Hansen dave at linux.vnet.ibm.com
Sat Nov 22 10:50:41 EST 2008


I was handed off a bug report about a blade not booting with a, um
"newer" kernel.  After turning on some debugging messages, I got this
ominous message:

        node 1
        NODE_DATA() = c000000000000000

Which obviously comes from here:

arch/powerpc/mm/numa.c

        for_each_online_node(nid) {
                unsigned long start_pfn, end_pfn;
                unsigned long bootmem_paddr;
                unsigned long bootmap_pages;

                get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);

                /* Allocate the node structure node local if possible */
                NODE_DATA(nid) = careful_allocation(nid,
                                        sizeof(struct pglist_data),
                                        SMP_CACHE_BYTES, end_pfn);
                NODE_DATA(nid) = __va(NODE_DATA(nid));
                memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
                ...

careful_allocation() returns a NULL physical address, but we go ahead
and run __va() on it, stick it in NODE_DATA(), and memset it.  Yay!
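
To spell out why the debug output shows c000000000000000 and not 0,
here is a userspace sketch, not the kernel code.  Everything named
sketch_* is made up for illustration, and the SKETCH_PAGE_OFFSET value
is an assumption picked to match the debug message above.  The point is
only that a NULL physical address stops looking like NULL once it has
been through a __va()-style translation, so any check has to happen
before the translation:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed linear-map base, chosen to match the debug output above. */
#define SKETCH_PAGE_OFFSET 0xc000000000000000ULL

/* Userspace stand-in for __va(): physical address -> virtual address. */
static uint64_t sketch_va(uint64_t paddr)
{
    return paddr + SKETCH_PAGE_OFFSET;
}

/*
 * Check the physical address BEFORE translating.  Returns 0 when the
 * allocation failed, so the failure stays visible to the caller.
 */
static uint64_t sketch_va_checked(uint64_t paddr)
{
    if (paddr == 0)
        return 0;
    return sketch_va(paddr);
}

/* Reproduces the symptom: a failed allocation turns into PAGE_OFFSET. */
static void sketch_demo(void)
{
    uint64_t node_data = sketch_va(0);

    /* Non-zero, so a later "if (!NODE_DATA(nid))" test would not fire. */
    assert(node_data == SKETCH_PAGE_OFFSET);
    printf("NODE_DATA() = %016llx\n", (unsigned long long)node_data);

    /* The checked variant keeps the failure as a plain 0. */
    assert(sketch_va_checked(0) == 0);
}
```

Calling sketch_demo() prints the same kind of line as the debug message
above; sketch_va_checked() is what a caller would want to use instead.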

I seem to recall that we fixed some issues with memoryless nodes a few
years ago, like around the memory hotplug days, but I don't see the
patches anywhere.

I'm thinking that we need to at least fix careful_allocation() to oops
instead of returning NULL, or make sure all of its callers check its
return value.  Plus, we probably also need to ensure that all ppc code
doing for_each_online_node() does not assume a valid NODE_DATA() for all
those nodes.
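
Roughly what I mean by the second and third options, as a userspace
sketch only.  None of these names exist in the kernel: struct
sketch_pgdat, sketch_node_data[], sketch_setup_node() and
sketch_total_pages() are hypothetical stand-ins for struct pglist_data,
NODE_DATA(), the setup loop and any for_each_online_node() user.  The
setup path checks the allocation result before touching it, and the
walker skips nodes that ended up with no node structure:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_NODES 4

/* Hypothetical, cut-down stand-in for struct pglist_data. */
struct sketch_pgdat {
    unsigned long spanned_pages;
};

/* Hypothetical stand-in for the NODE_DATA() table. */
static struct sketch_pgdat *sketch_node_data[SKETCH_MAX_NODES];

/*
 * Set up one node.  "mem" plays the role of the allocator's result and
 * may be NULL for a node with no memory.  Returns 0 on success, -1 if
 * the node was left without a node structure.
 */
static int sketch_setup_node(int nid, struct sketch_pgdat *mem,
                             unsigned long spanned_pages)
{
    assert(nid >= 0 && nid < SKETCH_MAX_NODES);

    if (mem == NULL) {
        /* Check first; do not translate or memset a failed allocation. */
        sketch_node_data[nid] = NULL;
        return -1;
    }

    memset(mem, 0, sizeof(*mem));
    mem->spanned_pages = spanned_pages;
    sketch_node_data[nid] = mem;
    return 0;
}

/* A node walker that does not assume every node has a node structure. */
static unsigned long sketch_total_pages(void)
{
    unsigned long total = 0;
    int nid;

    for (nid = 0; nid < SKETCH_MAX_NODES; nid++) {
        struct sketch_pgdat *pgdat = sketch_node_data[nid];

        if (pgdat == NULL)
            continue;   /* memoryless node: nothing to count */
        total += pgdat->spanned_pages;
    }
    return total;
}
```

Whether the real fix should skip such nodes or oops early is exactly
the open question; the sketch only shows the "check and skip" variant.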

Any other thoughts?

I'll have a patch for the above issue sometime soon. 

-- Dave



