[RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node
Joonsoo Kim
iamjoonsoo.kim at lge.com
Fri Feb 7 16:48:19 EST 2014
On Thu, Feb 06, 2014 at 12:52:11PM -0800, David Rientjes wrote:
> On Thu, 6 Feb 2014, Joonsoo Kim wrote:
>
> > From bf691e7eb07f966e3aed251eaeb18f229ee32d1f Mon Sep 17 00:00:00 2001
> > From: Joonsoo Kim <iamjoonsoo.kim at lge.com>
> > Date: Thu, 6 Feb 2014 17:07:05 +0900
> > Subject: [RFC PATCH 2/3 v2] topology: support node_numa_mem() for
> >  determining the fallback node
> >
> > We need to determine the fallback node in the SLUB allocator if the
> > allocation target node is a memoryless node. Without it, SLUB wrongly
> > selects a node which has no memory and cannot use a partial slab because
> > of the node mismatch. The newly introduced function, node_numa_mem(X),
> > returns the node Y with memory that is nearest to X. If X is a memoryless
> > node, it returns the nearest node with memory; if X is a normal node, it
> > returns X itself.
> >
> > We will use this function in the following patch to determine the
> > fallback node.
> >
>
> I like the approach and it may fix the problem today, but it may not be
> sufficient in the future: nodes may not only be memoryless but they may
> also be cpuless. It's possible that a node can only have I/O, networking,
> or storage devices and we can define affinity for them that is remote from
> every cpu and/or memory by the ACPI specification.
>
> It seems like a better approach would be to do this when a node is brought
> online and determine the fallback node based not on the zonelists as you
> do here but rather on locality (such as through a SLIT if provided, see
> node_distance()).
Hmm...
I guess that the zonelist is based on locality. The zonelist is generated
using node_distance(), so I think that it reflects locality. But I'm not an
expert on NUMA, so please let me know what I am missing here :)
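
To make the idea concrete, here is a minimal userspace sketch (not the patch
itself) of picking the nearest node with memory from a distance table. The
table values, the has_memory[] flags, NR_NODES and the name
nearest_mem_node() are all assumptions made up for illustration; in the
kernel the distances would come from node_distance().

```c
#include <limits.h>
#include <stdbool.h>

#define NR_NODES 4	/* hypothetical node count for this sketch */

/* Hypothetical SLIT-like distance table; 10 means "local". */
static const int distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 40, 40 },
	{ 20, 10, 40, 40 },
	{ 40, 40, 10, 20 },
	{ 40, 40, 20, 10 },
};

/* Hypothetical layout: nodes 0 and 3 are memoryless. */
static const bool has_memory[NR_NODES] = { false, true, true, false };

/*
 * Return the node itself if it has memory, otherwise the node with
 * memory at the smallest distance. Returns -1 if no node has memory.
 */
static int nearest_mem_node(int node)
{
	int best = -1;
	int best_dist = INT_MAX;
	int n;

	if (has_memory[node])
		return node;

	for (n = 0; n < NR_NODES; n++) {
		if (!has_memory[n])
			continue;
		if (distance[node][n] < best_dist) {
			best_dist = distance[node][n];
			best = n;
		}
	}
	return best;
}
```

With this made-up table, memoryless node 0 falls back to node 1 and
memoryless node 3 falls back to node 2, while nodes 1 and 2 map to
themselves.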
> Also, the names aren't very descriptive: {get,set}_numa_mem() doesn't make
> a lot of sense in generic code. I'd suggest something like
> node_to_mem_node().
It's much better!
If this patch turns out to be needed, I will update it.
Thanks.