[PATCH] powerpc/pseries/hotplug-cpu: increase wait time for vCPU death

Michael Ellerman mpe at ellerman.id.au
Tue Aug 11 21:56:50 AEST 2020


Michael Roth <mdroth at linux.vnet.ibm.com> writes:
> Quoting Nathan Lynch (2020-08-07 02:05:09)
...
>> wait_for_cpu_stopped() should be able to accommodate a time-based
>> warning if necessary, but speaking as a likely recipient of any bug
>> reports that would arise here, I'm not convinced of the need and I
>> don't know what a good value would be. It's relatively easy to sample
>> the stack of a task that's apparently failing to make progress, plus I
>> probably would use 'perf probe' or similar to report the inputs and
>> outputs for the RTAS call.
>
> I think if we make the timeout sufficiently high like 2 minutes or so
> it wouldn't hurt and if we did seem them it would probably point to an
> actual bug. But I don't have a strong feeling either way.

I think we should print a warning after 2 minutes.

It's true that there are fairly easy mechanisms to work out where the
thread is stuck, but customers are unlikely to use them. They're just
going to report that it's stuck with no further info, and probably
reboot the machine before we get a chance to get any further info.

Whereas if the kernel prints a warning with a stack trace we at least
have that to go on in an initial bug report.

>> I'm happy to make this a proper submission after I can clean it up and
>> retest it, or Michael R. is welcome to appropriate it, assuming it's
>> acceptable.
>> 
>
> I've given it a shot with this patch and it seems to be holding up in
> testing. If we don't think the ~2 minutes warning message is needed I
> can clean it up to post:
>
> https://github.com/mdroth/linux/commit/354b8c97bf0dc1146e36aa72273f5b33fe90d09e
>
> I'd likely break the refactoring patches out to a separate patch under
> Nathan's name since it fixes a separate bug potentially.

While I like Nathan's refactoring, we probably want to do the minimal
fix first to ease backporting.

Then do the refactoring on top of that.

cheers


More information about the Linuxppc-dev mailing list