rcu_sched self-detected stall on CPU

Zhouyi Zhou zhouzhouyi at gmail.com
Fri Apr 8 09:14:20 AEST 2022


Dear Paul and Miguel

On Fri, Apr 8, 2022 at 1:55 AM Paul E. McKenney <paulmck at kernel.org> wrote:
>
> On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote:
> > On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney <paulmck at kernel.org> wrote:
> > >
> > > Ah.  So you would instead look for boot to have completed within 10
> > > seconds?  Either way, reliable automation might well be more important
> > > than reduction in time.
> >
> > No (although I guess that could be an option), I was only pointing out
> > that when no stall is produced, the run should be much quicker than 30
> > seconds (at least it was in my setup), which would be the majority of the runs.
>
> Ah, thank you for the clarification!

Thank you both for the information. In my setup (PPC cloud VM), the
majority of the runs take at least 50 seconds to complete. From last
evening to this morning (Beijing time), I ran the following experiments:
1) torture mainline: the test finished quickly, hitting "rcu_sched
self-detected stall" after 12 runs
2) torture v5.17: the test lasted 10 hours and 14 minutes, completing
702 runs without triggering the bug

Conclusion:
As Paul has pointed out, some commit between v5.17 and mainline must
have introduced the bug.
I am going to bisect, and I expect to locate the offending commit within
a week (at most).
This is a good learning experience, thanks for the guidance ;-)
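
For the bisect planned above, here is a rough sketch (not something that
was actually run) of how many clean runs each bisect step might need
before a commit is called good. It assumes every run hits the stall
independently with a fixed probability, and it takes 1/12 as a crude
estimate from the single mainline observation above. The function names
are hypothetical; only the stall string comes from this thread.

```python
import math

# Message quoted in this thread; everything else here is an assumption.
STALL_MARKER = "rcu_sched self-detected stall"


def log_has_stall(console_text: str) -> bool:
    """Return True if the captured console output contains the stall message."""
    return STALL_MARKER in console_text


def clean_runs_needed(p_fail_per_run: float, confidence: float = 0.99) -> int:
    """Consecutive clean runs needed before calling a commit good.

    Assumes independent runs that each fail with probability p_fail_per_run.
    Solves (1 - p)^n <= 1 - confidence for the smallest integer n.
    """
    if not 0.0 < p_fail_per_run < 1.0:
        raise ValueError("p_fail_per_run must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail_per_run))


if __name__ == "__main__":
    # Assumed rate: one stall in 12 runs, from a single observation on mainline.
    p = 1.0 / 12.0
    print("clean runs per bisect step:", clean_runs_needed(p))
```

Under those assumptions, 53 clean runs give about 99% confidence, which
at roughly 50 seconds per run is around 45 minutes per bisect step. The
real failure rate may well differ, so this is only a planning aid.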

Kind Regards
Zhouyi
>
>                                                         Thanx, Paul

