[PATCH] powerpc/smp: poll cpu_callin_map more aggressively in __cpu_up()

Nathan Lynch nathanl at linux.ibm.com
Thu Jun 30 03:51:13 AEST 2022


Michael Ellerman <mpe at ellerman.id.au> writes:

> Nathan Lynch <nathanl at linux.ibm.com> writes:
>> Replace the outdated iteration and timeout calculations here with an
>> indefinite spin_until_cond()-wrapped poll of cpu_callin_map. __cpu_up()
>> already does this when waiting for the cpu to set its online bit before
>> returning, so this change is not really making the function more brittle.
>
> I'm not sure I agree that this doesn't make the code more brittle.
>
> The existing indefinite wait you mention is later in the function, and
> happens after the CPU has successfully come into the kernel.
>
> I think it's more common that a stuck/borked CPU doesn't come into the
> kernel at all, rather than comes in and then fails to online.
>
> So I think the bail out when the CPU fails to call in is useful. I would
> guess I see that "Processor x is stuck" message multiple times a year
> while debugging various things.

Yeah, I can see how my claim is too strong here.

>> Removing the msleep(1) in the hotplug path here reduces the time it takes
>> to online a CPU on a P9 PowerVM LPAR from about 30ms to 1ms when exercised
>> via thaw_secondary_cpus().
>
> That is a nice improvement.
>
> Can we do something that returns quickly in the happy case and still has
> a timeout when things go wrong? Seems like a busy loop with a
> time_after() check would do the trick.

Yes, I'll rework it like that. Thanks.
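
For illustration only, the shape suggested above (spin with no sleep, but
give up after a deadline) might look roughly like the userspace sketch
below. This is not the reworked patch. The names wait_for_callin(),
now_ns() and ns_after() are hypothetical, a monotonic nanosecond clock
stands in for jiffies, and an atomic flag stands in for the CPU's entry in
cpu_callin_map. In the kernel, the comparison would be time_after() and the
loop body would normally include cpu_relax().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds; a stand-in for the kernel's jiffies. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Wraparound-safe "a is later than b", the same signed-difference trick
 * the kernel's time_after() uses (hypothetical helper for this sketch).
 */
static bool ns_after(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Busy-wait until *callin becomes nonzero or timeout_ns has elapsed.
 * Returns true if the flag was seen set, false on timeout.
 * No sleeping, so the happy case returns as soon as the flag is visible.
 */
static bool wait_for_callin(atomic_int *callin, uint64_t timeout_ns)
{
	const uint64_t deadline = now_ns() + timeout_ns;

	while (!atomic_load_explicit(callin, memory_order_acquire)) {
		if (ns_after(now_ns(), deadline))
			return false;	/* caller would report the CPU as stuck */
	}
	return true;
}
```

A caller would treat a false return the way __cpu_up() treats a CPU that
never calls in today: report it and bail out instead of waiting forever.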

