[PATCH] powerpc/pseries/cpuhp: respect current SMT when adding new CPU

Laurent Dufour ldufour at linux.ibm.com
Sat Apr 1 02:11:19 AEDT 2023

On 30/03/2023 18:19:38, Michal Suchánek wrote:
> On Thu, Mar 30, 2023 at 05:51:57PM +0200, Laurent Dufour wrote:
>> On 13/02/2023 16:40:50, Nathan Lynch wrote:
>>> Michal Suchánek <msuchanek at suse.de> writes:
>>>> On Mon, Feb 13, 2023 at 08:46:50AM -0600, Nathan Lynch wrote:
>>>>> Laurent Dufour <ldufour at linux.ibm.com> writes:
>>>>>> When a new CPU is added, the kernel is activating all its threads. This
>>>>>> leads to weird, but functional, result when adding CPU on a SMT 4 system
>>>>>> for instance.
>>>>>> Here the newly added CPU 1 has 8 threads while the other one has 4 threads
>>>>>> active (system has been booted with the 'smt-enabled=4' kernel option):
>>>>>> ltcden3-lp12:~ # ppc64_cpu --info
>>>>>> Core   0:    0*    1*    2*    3*    4     5     6     7
>>>>>> Core   1:    8*    9*   10*   11*   12*   13*   14*   15*
>>>>>> There is no SMT value in the kernel. It is possible to run unbalanced LPAR
>>>>>> with 2 threads for a CPU, 4 for another one, and 5 on the latest.
>> Indeed, that's not so easy. There are multiple ways for the SMT level to be
>> impacted:
>>  - smt-enabled kernel option
>>  - smtstate systemctl service (if activated), saving SMT level at shutdown
>> time to restore it a boot time
>>  - pseries-energyd daemon (if activated) could turn off threads
>>  - ppc64_cpu --smt=x user command
>>  - sysfs direct writing to turn off/on specific threads.
>> There is no SMT level saved, on "disk" or in the kernel, and any of these
>> options can interact in parallel. So from the user space point of view, the
>> best we could do is looking for the SMT current values, there could be
>> multiple values in the case of a mixed SMT state, peek one value and apply it.
>> Extending the drmgr's hook is still valid, and I sent a patch series on the
>> powerpc-utils mailing list to achieve that. However, changing the SMT level
>> in that hook means that newly added CPU will be first turn on and there is
>> a window where this threads could be seen active. Not a big deal but not
>> turning on these extra threads looks better to me.
> Which means
> 1) add an option to not onlince hotplugged CPUs by default

After discussing this with Srikar, it happens that exposing the smt-enabled
value set a boot time (or not) in SYS FS and to update it when SMT level is
changed using ppc64_cpu will be better. This will aslo allow the energy
daemon to take this value in account.

> 2) when a tool that wants to manage CPU onlining is active it can set
> the option so that no threads are onlined automatically, and online the
> desired threads

When new CPU are added, the kernel will automatically online the right
number of threads based on that recorded SMT level.

> 3) when no such tool is active the default should be to online all
> threeads to preserve compatibility with existing behavior

I don't think we should keep the existing behavior, customers are confused
and some user space tools like lparstart have difficulties to handled mixed
SMT levels.

I'll submit a new series exposing the SMT level and propose a patch for
ppc64_cpu too.

>> That's being said, I can't see any benefit of a user space implementation
>> compared to the option I'm proposing in that patch.
> The userspace implementation can implement arbitrily complex policy,
> that's not something that belongs into the kernel.
> Thanks
> Michal

More information about the Linuxppc-dev mailing list