[PATCH v2 4/4] powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die
Michael Ellerman
mpe at ellerman.id.au
Mon Oct 11 23:34:28 AEDT 2021
Nathan Lynch <nathanl at linux.ibm.com> writes:
> Michael Ellerman <mpe at ellerman.id.au> writes:
>> Daniel Henrique Barboza <danielhb413 at gmail.com> writes:
>>> This is enough to say that we can't easily see the history behind this comment.
>>> I also believe that we're better of without it since it doesn't make sense
>>> with the current codebase.
>>
>> It was added by the original CPU hotplug commit for ppc64::
>>
>> https://github.com/mpe/linux-fullhistory/commit/0e9fd9441cd2113b67b14e739267c9e69761489b
>>
>>
>> The code was fairly similar:
>>
>> void __cpu_die(unsigned int cpu)
>> {
>> int tries;
>> int cpu_status;
>> unsigned int pcpu = get_hard_smp_processor_id(cpu);
>>
>> for (tries = 0; tries < 5; tries++) {
>> cpu_status = query_cpu_stopped(pcpu);
>>
>> if (cpu_status == 0)
>> break;
>> set_current_state(TASK_UNINTERRUPTIBLE);
>> schedule_timeout(HZ);
>> }
>> if (cpu_status != 0) {
>> printk("Querying DEAD? cpu %i (%i) shows %i\n",
>> cpu, pcpu, cpu_status);
>> }
>>
>> /* Isolation and deallocation are definatly done by
>> * drslot_chrp_cpu. If they were not they would be
>> * done here. Change isolate state to Isolate and
>> * change allocation-state to Unusable.
>> */
>> paca[cpu].xProcStart = 0;
>>
>> /* So we can recognize if it fails to come up next time. */
>> cpu_callin_map[cpu] = 0;
>> }
>>
>>
>> drslot_chrp_cpu() still exists in drmgr:
>>
>> https://github.com/ibm-power-utilities/powerpc-utils/blob/e798c4a09fbf0fa0f421e624cfa366a6c405c9fe/src/drmgr/drslot_chrp_cpu.c#L406
>>
>>
>> I agree the comment is no longer meaningful and can be removed.
>
> Thanks for providing this background.
>
>> It might be good to then add a comment explaining why we need to set
>> cpu_start = 0.
>
> Sure, I can take that as a follow-up. Or perhaps it should be moved to
> the online path.
Yeah possibly.
>> It's not immediately clear why we need to. When we bring a CPU back
>> online in smp_pSeries_kick_cpu() we ask RTAS to start it and then
>> immediately set cpu_start = 1, ie. there isn't a separate step that sets
>> cpu_start = 1 for hotplugged CPUs.
>
> Hmm I'm not following the distinction you seem to be drawing between
> bringing a CPU back online and a hotplugged CPU. kick_cpu is used in all
> cases AFAIK.
Yeah that wasn't very clear, reading it back I have half confused myself.
At boot we need the paca->cpu_start flag because some CPUs can be
spinning in generic_secondary_common_init, in head_64.S.
ie. they're not in RTAS, they're spinning in kernel code, and the only
thing that stops them coming "online" in the Linux sense is
paca->cpu_start.
You can see that in pseries/smp.c:
static inline int smp_startup_cpu(unsigned int lcpu)
{
...
if (cpumask_test_cpu(lcpu, of_spin_mask))
/* Already started by OF and sitting in spin loop */
return 1;
We also hit that case when kexec'ing, where all the secondaries come in
that way.
On the other hand when we offline a CPU, we set paca->cpu_start = 0, in
pseries_cpu_die(), and then we return the CPU to RTAS.
The only way it *should* come back online is via smp_pSeries_kick_cpu(),
which calls smp_startup_cpu() to bring the CPU out of RTAS, and then
smp_pSeries_kick_cpu() immediately sets cpu_start = 1.
So the sequence is:
CPU goes offline from Linux POV
paca->cpu_start = 0;
CPU offline in RTAS
...
CPU brought out of RTAS
paca->cpu_start = 1;
CPU comes back online from Linux POV
But I guess I kind of answered my own question above, where I said
*should*. Clearing paca->cpu_start when we offline the CPU gives us a
little bit of backup if the CPU comes out of RTAS unexpectedly. ie. it
would then spin on paca->cpu_start, rather than spontaneously coming
back into Linux when we weren't expecting it.
cheers
More information about the Linuxppc-dev
mailing list