[PATCH 1/3] powerpc/pseries: Simplify cpu readd to use drc_index

Nathan Lynch nathanl at linux.ibm.com
Wed Jun 5 03:21:28 AEST 2019


Tyrel Datwyler <tyreld at linux.vnet.ibm.com> writes:
> On 05/20/2019 08:01 AM, Nathan Lynch wrote:
>> Kernel implementation details aside, how do you change the cpu-node
>> relationship at runtime without breaking NUMA-aware applications? Is
>> this not a fundamental issue to address before adding code like this?
>> 
>
> If that is the concern then hotplug in general already breaks
> them. Take for example the removal of a faulty processor and then
> adding a new processor back.  It is quite possible that the new
> processor is in a different NUMA node. Keep in mind that in this
> scenario the new processor and threads get the same logical cpu ids
> as the faulty processor we just removed.

Yes, the problem is re-use of a logical CPU id with a node id that
differs from the one it was initially assigned, and there are several
ways to get into that situation on this platform. We probably need to be
more careful in how we allocate a spot in the CPU maps for a newly-added
processor. I believe the algorithm is simple first-fit right now, and it
doesn't take into account prior NUMA relationships.
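
To make the allocation concern concrete, here is a minimal userspace
sketch in plain C. It is not kernel code and does not reflect the
actual pseries implementation. Every name in it (cpu_slot, NR_SLOTS,
alloc_first_fit, alloc_node_aware, and so on) is hypothetical. It
contrasts a first-fit allocator with one that remembers which node a
logical id was last associated with. Refusing to reuse an id across
nodes is an assumed policy for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS 8      /* hypothetical size of the logical CPU map */
#define NO_NODE  (-1)   /* id has never been associated with a node */

struct cpu_slot {
    bool in_use;     /* a processor currently occupies this logical id */
    int  last_node;  /* node this id was last tied to, or NO_NODE */
};

static void init_map(struct cpu_slot *map)
{
    for (int i = 0; i < NR_SLOTS; i++) {
        map[i].in_use = false;
        map[i].last_node = NO_NODE;
    }
}

/* Release an id but keep its node history for later allocations. */
static void release_slot(struct cpu_slot *map, int id)
{
    assert(id >= 0 && id < NR_SLOTS);
    map[id].in_use = false;
}

/* First-fit: take the lowest free id, ignoring any prior node. */
static int alloc_first_fit(struct cpu_slot *map, int node)
{
    for (int i = 0; i < NR_SLOTS; i++) {
        if (!map[i].in_use) {
            map[i].in_use = true;
            map[i].last_node = node;
            return i;
        }
    }
    return -1;  /* map is full */
}

/*
 * Node-aware: prefer a free id previously tied to the same node,
 * then the lowest never-used id. An id last tied to a different
 * node is not reused (assumed policy), so this can return -1 even
 * when free ids remain.
 */
static int alloc_node_aware(struct cpu_slot *map, int node)
{
    int fresh = -1;

    for (int i = 0; i < NR_SLOTS; i++) {
        if (map[i].in_use)
            continue;
        if (map[i].last_node == node) {
            map[i].in_use = true;
            return i;
        }
        if (map[i].last_node == NO_NODE && fresh < 0)
            fresh = i;
    }
    if (fresh >= 0) {
        map[fresh].in_use = true;
        map[fresh].last_node = node;
    }
    return fresh;
}
```

With first-fit, releasing id 0 (node 0) and then adding a processor
on node 1 hands out id 0 again, now on a different node. The
node-aware variant skips id 0 in that case and takes a never-used id,
at the cost of possibly running out of ids sooner.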


> Now we have to ask the question of who is right and who is wrong. In
> this case the kernel data structures reflect the correct NUMA
> topology. However, did the NUMA-aware application or libnuma make an
> assumption that specific sets of logical cpu ids are always in the
> same NUMA node?

Yes, and that assumption is widespread because people tend to develop on
an architecture where this kind of stuff doesn't happen (at least not
yet).

And I don't really agree that the current behavior reflects what is
actually going on. When Linux running in a PowerVM LPAR receives a
notification to change the NUMA properties of a processor at runtime,
it's because the platform has changed the physical characteristics of
the partition. I.e. you're now using a different physical processor,
with different relationships to the other resources in the system. Even
if it didn't destabilize the kernel (by changing the result of
cpu_to_node() when various subsystems assume it will be static),
continuing to use the same logical CPU ids on the new processor obscures
what has actually happened. And we have developers who have told us that
this behavior, changing the logical cpu<->node relationship at runtime,
is something their existing NUMA-aware applications cannot handle.
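
The failure mode for such an application can be sketched as a small
self-contained simulation in C. Nothing here is a real kernel or
libnuma interface: platform_node_of, query_node, app_topology and
the other names are hypothetical stand-ins. The assumed pattern is an
application that records the cpu->node mapping once at startup and
never looks again.

```c
#include <assert.h>

#define NR_CPUS 4   /* hypothetical partition size */

/* Stand-in for the platform's current view; not a real API. */
static int platform_node_of[NR_CPUS] = { 0, 0, 1, 1 };

/* What the application would see if it asked right now. */
static int query_node(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return platform_node_of[cpu];
}

/* The application's private copy, taken once at startup. */
struct app_topology {
    int node_of[NR_CPUS];
};

static void app_snapshot(struct app_topology *t)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        t->node_of[cpu] = query_node(cpu);
}

/* Simulate the platform moving a logical id to another node. */
static void platform_move_cpu(int cpu, int new_node)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    platform_node_of[cpu] = new_node;
}

/* Count logical ids whose cached node no longer matches. */
static int app_count_stale(const struct app_topology *t)
{
    int stale = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (t->node_of[cpu] != query_node(cpu))
            stale++;
    return stale;
}
```

After app_snapshot(), a call such as platform_move_cpu(1, 1) leaves
the application believing cpu 1 is still on node 0, so any placement
decision based on the cached table is silently wrong.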


