[PATCH] powerpc/pseries/cpuhp: respect current SMT when adding new CPU

Laurent Dufour ldufour at linux.ibm.com
Fri Mar 31 02:51:57 AEDT 2023


On 13/02/2023 16:40:50, Nathan Lynch wrote:
> Michal Suchánek <msuchanek at suse.de> writes:
>> On Mon, Feb 13, 2023 at 08:46:50AM -0600, Nathan Lynch wrote:
>>> Laurent Dufour <ldufour at linux.ibm.com> writes:
>>>> When a new CPU is added, the kernel is activating all its threads. This
>>>> leads to weird, but functional, result when adding CPU on a SMT 4 system
>>>> for instance.
>>>>
>>>> Here the newly added CPU 1 has 8 threads while the other one has 4 threads
>>>> active (system has been booted with the 'smt-enabled=4' kernel option):
>>>>
>>>> ltcden3-lp12:~ # ppc64_cpu --info
>>>> Core   0:    0*    1*    2*    3*    4     5     6     7
>>>> Core   1:    8*    9*   10*   11*   12*   13*   14*   15*
>>>>
>>>> There is no SMT value in the kernel. It is possible to run unbalanced LPAR
>>>> with 2 threads for a CPU, 4 for another one, and 5 on the latest.
>>>>
>>>> To work around this possibility, and assuming that the LPAR run with the
>>>> same number of threads for each CPU, which is the common case,
>>>
>>> I am skeptical at best of baking that assumption into this code. Mixed
>>> SMT modes within a partition doesn't strike me as an unreasonable
>>> possibility for some use cases. And if that's wrong, then we should just
>>> add a global smt value instead of using heuristics.
>>>
>>>> the number
>>>> of active threads of the CPU doing the hot-plug operation is computed. Only
>>>> that number of threads will be activated for the newly added CPU.
>>>>
>>>> This way on a LPAR running in SMT=4, newly added CPU will be running 4
>>>> threads, which is what a end user would expect.
>>>
>>> I could see why most users would prefer this new behavior. But surely
>>> some users have come to expect the existing behavior, which has been in
>>> place for years, and developed workarounds that might be broken by this
>>> change?
>>>
>>> I would suggest that to handle this well, we need to give user space
>>> more ability to tell the kernel what actions to take on added cores, on
>>> an opt-in basis.
>>>
>>> This could take the form of extending the DLPAR sysfs command set:
>>>
>>> Option 1 - Add a flag that tells the kernel not to online any threads at
>>> all; user space will online the desired threads later.
>>>
>>> Option 2 - Add an option that tells the kernel which SMT mode to apply.
>>
>> powerpc-utils grew some drmgr hooks recently so maybe the policy can be
>> moved to userspace?
> 
> I'm not sure whether the hook mechanism would come into play, but yes, I
> am suggesting that user space be given the option of overriding the
> kernel's current behavior.

Indeed, that's not so easy. There are multiple ways for the SMT level to be
impacted:
 - smt-enabled kernel option
 - smtstate systemd service (if activated), saving the SMT level at shutdown
   time to restore it at boot time
 - pseries-energyd daemon (if activated) could turn off threads
 - ppc64_cpu --smt=x user command
 - sysfs direct writing to turn off/on specific threads.

There is no SMT level saved, on "disk" or in the kernel, and any of these
options can interact in parallel. So from the user space point of view, the
best we could do is look at the current SMT values (there could be several
in the case of a mixed SMT state), pick one value and apply it.
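
To illustrate what "look at the current values and pick one" could mean in
user space, here is a minimal sketch. It is not what drmgr or ppc64_cpu do.
The helper names are hypothetical, the threads-per-core value of 8 is taken
from the ppc64_cpu output quoted above, and "pick the most common value,
lowest on a tie" is only an assumed policy.

```python
from collections import Counter


def parse_cpu_list(text):
    """Expand a sysfs-style CPU list such as '0-3,8-15' into sorted ints."""
    cpus = []
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        first, _, last = chunk.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return sorted(cpus)


def smt_per_core(online, threads_per_core):
    """Map core index -> number of online threads (hypothetical helper).

    Assumes thread ids are contiguous per core, as in the quoted output.
    Cores with no online thread at all do not appear in the result.
    """
    counts = Counter(cpu // threads_per_core for cpu in parse_cpu_list(online))
    return dict(sorted(counts.items()))


def pick_smt(per_core):
    """Assumed policy: most common SMT value, lowest value on a tie."""
    freq = Counter(per_core.values())
    return min(freq, key=lambda value: (-freq[value], value))


# State from the example above: core 0 in SMT 4, core 1 with all 8 threads.
levels = smt_per_core("0-3,8-15", threads_per_core=8)
chosen = pick_smt(levels)
```

On a live system the list would be read from /sys/devices/system/cpu/online.
With a mixed state like the one above, any choice is arbitrary, which is the
weakness of doing this from user space.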

Extending drmgr's hooks is still valid, and I sent a patch series to the
powerpc-utils mailing list to achieve that. However, changing the SMT level
in that hook means that the newly added CPU will first be turned on with all
its threads, so there is a window where these threads could be seen active.
Not a big deal, but not turning on these extra threads in the first place
looks better to me.
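
For reference, this is roughly the clean-up such a hook would have to do
after the fact. The function name is hypothetical and the sketch only
computes the per-thread sysfs "online" files to write 0 to; it assumes the
new core's thread ids are contiguous and that the first threads are the ones
kept, as in the ppc64_cpu output above.

```python
def threads_to_offline(first_thread, threads_per_core, smt):
    """Return sysfs 'online' files of the threads beyond the wanted SMT level.

    Hypothetical helper: nothing is written here, the caller would write
    '0' to each returned path.
    """
    if not 1 <= smt <= threads_per_core:
        raise ValueError("smt must be between 1 and threads_per_core")
    extra = range(first_thread + smt, first_thread + threads_per_core)
    return [f"/sys/devices/system/cpu/cpu{n}/online" for n in extra]


# Newly added core 1 (threads 8..15) brought down to SMT 4.
paths = threads_to_offline(first_thread=8, threads_per_core=8, smt=4)
```

Between the hot-plug and these writes, threads 12 to 15 are online, which is
the window mentioned above.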

That being said, I can't see any benefit of a user space implementation
compared to the option I'm proposing in this patch.

Does anyone have a better idea?

Cheers,
Laurent.

