[PATCH 0/2] Skip offline cores when enabling SMT on PowerPC

Nysal Jan K.A. nysal at linux.ibm.com
Fri Jun 14 13:52:15 AEST 2024


On Thu, Jun 13, 2024 at 09:34:10PM GMT, Michael Ellerman wrote:
> "Nysal Jan K.A." <nysal at linux.ibm.com> writes:
> > From: "Nysal Jan K.A" <nysal at linux.ibm.com>
> >
> > After the addition of HOTPLUG_SMT support for PowerPC [1] there was a
> > regression reported [2] when enabling SMT.
> 
> This implies it was a kernel regression. But it can't be a kernel
> regression because previously there was no support at all for the sysfs
> interface on powerpc.
> 
> IIUC the regression was in the ppc64_cpu userspace tool, which switched
> to using the new kernel interface without taking into account the way it
> behaves.
> 
> Or are you saying the kernel behaviour changed on x86 after the powerpc
> HOTPLUG_SMT was added?
> 

The regression is in ppc64_cpu. To restore the older behaviour, though, we
would need this or an equivalent change in the kernel, since fixing it
efficiently in userspace might be difficult.

> > On a system with at least
> > one offline core, when enabling SMT, the expectation is that no CPUs
> > of offline cores are made online.
> >
> > On a POWER9 system with 4 cores in SMT4 mode:
> > $ ppc64_cpu --info
> > Core   0:    0*    1*    2*    3*
> > Core   1:    4*    5*    6*    7*
> > Core   2:    8*    9*   10*   11*
> > Core   3:   12*   13*   14*   15*
> >
> > Turn only one core on:
> > $ ppc64_cpu --cores-on=1
> > $ ppc64_cpu --info
> > Core   0:    0*    1*    2*    3*
> > Core   1:    4     5     6     7
> > Core   2:    8     9    10    11
> > Core   3:   12    13    14    15
> >
> > Change the SMT level to 2:
> > $ ppc64_cpu --smt=2
> > $ ppc64_cpu --info
> > Core   0:    0*    1*    2     3
> > Core   1:    4     5     6     7
> > Core   2:    8     9    10    11
> > Core   3:   12    13    14    15
> >
> > As expected, only two CPUs of core 0 are online.
> >
> > Change the SMT level to 4:
> > $ ppc64_cpu --smt=4
> > $ ppc64_cpu --info
> > Core   0:    0*    1*    2*    3*
> > Core   1:    4*    5*    6*    7*
> > Core   2:    8*    9*   10*   11*
> > Core   3:   12*   13*   14*   15*
> >
> > The CPUs of offline cores are made online. If a core is offline then
> > enabling SMT should not online CPUs of this core.
> 
> That's the way the ppc64_cpu tool behaves, but it's not necessarily what
> other arches want.
> 

True, but from a user perspective it seems logical. I think one can make a
case for either behaviour.
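
To make the two behaviours concrete, here is a rough user-space model of the
4-core SMT4 example above. It is only an illustrative sketch, not kernel code:
all function names are made up for this mail, and it assumes that lowering the
SMT level only takes CPUs offline while raising it brings threads online.

```python
THREADS_PER_CORE = 4  # SMT4, as in the POWER9 example
NUM_CORES = 4


def core_threads(core):
    """CPU numbers belonging to one core."""
    start = core * THREADS_PER_CORE
    return list(range(start, start + THREADS_PER_CORE))


def cores_on(n):
    """Model `ppc64_cpu --cores-on=n`: every thread of the first n cores."""
    return {cpu for core in range(n) for cpu in core_threads(core)}


def lower_smt(online, level):
    """Assumption: lowering the SMT level only takes CPUs offline."""
    return {cpu for cpu in online if cpu % THREADS_PER_CORE < level}


def raise_smt(online, level, skip_offline_cores):
    """Raising the SMT level brings threads below `level` online."""
    result = set(online)
    for core in range(NUM_CORES):
        threads = core_threads(core)
        core_is_online = any(cpu in online for cpu in threads)
        if skip_offline_cores and not core_is_online:
            continue  # proposed behaviour: leave offline cores alone
        result.update(threads[:level])
    return result


smt2 = lower_smt(cores_on(1), 2)
print(sorted(smt2))
print(sorted(raise_smt(smt2, 4, skip_offline_cores=False)))
print(sorted(raise_smt(smt2, 4, skip_offline_cores=True)))
```

With skip_offline_cores=False the model onlines all 16 CPUs, as in the
ppc64_cpu --info output after --smt=4. With skip_offline_cores=True only
CPUs 0-3 of core 0 end up online, which is the behaviour the series aims for.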

> > An arch specific
> > function topology_is_core_online() is proposed to address this.
> > Another approach is to check the topology_sibling_cpumask() for any
> > online siblings. This avoids the need for an arch specific function
> > but is less efficient and more importantly this introduces a change
> > in existing behaviour on other architectures.
> 
> It's only x86 and powerpc right?
> 
> Having different behaviour on the only two arches that support the
> interface does not seem like a good result.
> 

Agreed. I was originally thinking of sending a patch that changes this for
both architectures, but was unsure whether there are users who now depend on
this behaviour on x86.

> > What is the expected behaviour on x86 when enabling SMT and certain cores
> > are offline? 
> 
> AFAIK no one really touches SMT on x86 other than to turn it off for
> security reasons.
> 
> cheers
> 

Thanks for your comments. It would be good to hear whether changing this
behaviour for both x86 and PowerPC might be an acceptable path forward.

Regards
--Nysal

> > [1] https://lore.kernel.org/lkml/20230705145143.40545-1-ldufour@linux.ibm.com/
> > [2] https://groups.google.com/g/powerpc-utils-devel/c/wrwVzAAnRlI/m/5KJSoqP4BAAJ
> >
> > Nysal Jan K.A (2):
> >   cpu/SMT: Enable SMT only if a core is online
> >   powerpc/topology: Check if a core is online
> >
> >  arch/powerpc/include/asm/topology.h | 13 +++++++++++++
> >  kernel/cpu.c                        | 12 +++++++++++-
> >  2 files changed, 24 insertions(+), 1 deletion(-)
> >
> >
> > base-commit: c760b3725e52403dc1b28644fb09c47a83cacea6
> > -- 
> > 2.35.3

