rcu_sched self-detected stall on CPU

Miguel Ojeda miguel.ojeda.sandonis at gmail.com
Fri Apr 8 23:52:50 AEST 2022


On Fri, Apr 8, 2022 at 9:23 AM Michael Ellerman <mpe at ellerman.id.au> wrote:
>
> I haven't seen it in my testing. But using Miguel's config I can
> reproduce it seemingly on every boot.

Hmm... I noticed this for some kernel builds: in some builds/commits,
it triggered the very first time, while in others I had to re-try
quite a few times. It could be a "fluke", but since it happened to you
too (and Zhouyi seemed to need 12 tries), it may be that particular
kernel builds make the bug much more likely.

> For me it bisects to:
>
>   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>
> Which seems plausible.
>
> Reverting that on mainline makes the bug go away.

That is great, thanks for that -- I can revert that one in our CI meanwhile.

> I'll try and work out what it is about Miguel's config that exposes
> this vs our defconfig, that might give us a clue.

Yeah, it is one based on the "debug" one you sent for Rust PPC.
Assuming you based that one on the others we had for other archs, then
I guess we are bound to find some things like this at times, as with
randconfig, since I made them to be fairly minimal and "custom"... :)

Cheers,
Miguel

