[PATCH] numa placement for dynamically added memory

Nathan Lynch ntl at pobox.com
Sat Dec 3 06:20:54 EST 2005


Mike Kravetz wrote:
> On Thu, Dec 01, 2005 at 10:02:30PM -0500, Nathan Lynch wrote:
> > > +		/* Domains not present at boot default to 0 */
> > > +		if (!node_online(numa_domain))
> > > +			numa_domain = 0;
> > 
> > Nope, 0 is not always a valid node on pSeries lpar.  I suggest using
> > any_online_node(), or revisiting the idea of logical<->physical
> > mapping of node/domain ids.  I tried the latter a few months ago but
> > I've been working on other stuff lately and haven't been able to
> > revisit it.
> 
> Yeah, I can do that.  As a side note, it looks like 0 will always be a
> valid node in the current code.  If we successfully execute
> parse_numa_properties(), then this code will be run.
> 
>         for (i = 0; i <= max_domain; i++)
>                 node_set_online(i);

Yes, the code erroneously assumes that we can just mark nodes 0
through max_domain online.  Explained below.


> If we execute setup_nonnuma() instead, then the following is executed:
> 
> 	node_set_online(0);
> 
> I've previously wondered about the above code in parse_numa_properties().
> You seem to confirm that is not the desired behavior.  Should this be
> changed?

I think so.

The fundamental issue is that the numa code does not distinguish
between logical node numbers and the identifiers given by the platform
in the ibm,associativity properties to denote "affinity domains".
This is ok for cases such as larger Power4 machines running without a
hypervisor and LPARs on smaller Power5 machines (e.g. just 2 nodes).
But with larger Power5 systems, we're getting into trouble over this.
We need to be able to handle situations where the domain numbering as
given by the platform doesn't necessarily begin at zero and isn't
necessarily contiguous -- for example a partition with domains
numbered 2, 7, and 9.
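
To make that concrete, here is a rough userspace sketch (not the
kernel code) of what marking everything from 0 through the highest
domain id online does for such a partition.  The names
sketch_node_online[], SKETCH_MAX_NODES and mark_range_online() are
made up for illustration; they only stand in for the online map and
the loop quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_NODES 16	/* assumed limit, illustration only */

/* Hypothetical stand-in for the kernel's map of online nodes. */
static bool sketch_node_online[SKETCH_MAX_NODES];

/*
 * Mimic the quoted loop: find the highest domain id the platform
 * reported, then mark every id from 0 up to and including it online.
 * Returns how many nodes ended up marked online.
 */
static int mark_range_online(const int *domains, int n)
{
	int max_domain = 0;
	int count = 0;
	int i;

	for (i = 0; i < n; i++)
		if (domains[i] > max_domain)
			max_domain = domains[i];

	assert(max_domain < SKETCH_MAX_NODES);

	for (i = 0; i <= max_domain; i++) {
		sketch_node_online[i] = true;
		count++;
	}
	return count;
}
```

Given domains 2, 7, and 9, this marks ten nodes online even though
only three of them correspond to anything the platform reported.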

So I think a logical to "physical" mapping makes sense, similar to
what we do for cpus.
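
Something along these lines is what I have in mind.  This is only a
userspace sketch under assumptions of my own: the table sizes, the
SKETCH_NO_NODE sentinel, and the names map_init(), map_domain(),
domain_to_node[] and node_to_domain[] are all hypothetical, not
existing kernel interfaces.  Each platform domain id gets the next
free logical node number the first time it is seen, so logical nodes
always start at 0 and stay dense.

```c
#include <assert.h>

#define SKETCH_NR_NODES		16	/* assumed max logical nodes */
#define SKETCH_NR_DOMAINS	256	/* assumed max platform domain id + 1 */
#define SKETCH_NO_NODE		(-1)

static int domain_to_node[SKETCH_NR_DOMAINS];	/* platform id -> logical node */
static int node_to_domain[SKETCH_NR_NODES];	/* logical node -> platform id */
static int nr_logical_nodes;

/* Reset both tables so that no domain is mapped. */
static void map_init(void)
{
	int i;

	for (i = 0; i < SKETCH_NR_DOMAINS; i++)
		domain_to_node[i] = SKETCH_NO_NODE;
	for (i = 0; i < SKETCH_NR_NODES; i++)
		node_to_domain[i] = SKETCH_NO_NODE;
	nr_logical_nodes = 0;
}

/*
 * Return the logical node for a platform domain id, assigning the
 * next free logical number if the domain has not been seen before.
 * Returns SKETCH_NO_NODE if the id is out of range or the table is full.
 */
static int map_domain(int domain)
{
	if (domain < 0 || domain >= SKETCH_NR_DOMAINS)
		return SKETCH_NO_NODE;
	if (domain_to_node[domain] != SKETCH_NO_NODE)
		return domain_to_node[domain];
	if (nr_logical_nodes >= SKETCH_NR_NODES)
		return SKETCH_NO_NODE;

	domain_to_node[domain] = nr_logical_nodes;
	node_to_domain[nr_logical_nodes] = domain;
	return nr_logical_nodes++;
}
```

With domains 2, 7, and 9 mapped in that order, the logical nodes
would be 0, 1, and 2, and only those would need to be marked online.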


Nathan


