rcu_sched self-detected stall on CPU

Paul E. McKenney paulmck at kernel.org
Fri Apr 8 11:43:01 AEST 2022


On Fri, Apr 08, 2022 at 07:14:20AM +0800, Zhouyi Zhou wrote:
> Dear Paul and Miguel
> 
> On Fri, Apr 8, 2022 at 1:55 AM Paul E. McKenney <paulmck at kernel.org> wrote:
> >
> > On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote:
> > > On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney <paulmck at kernel.org> wrote:
> > > >
> > > > Ah.  So you would instead look for boot to have completed within 10
> > > > seconds?  Either way, reliable automation might well be more important
> > > > than reduction in time.
> > >
> > > No (although I guess that could be an option), I was only pointing out
> > > that when no stall is produced, the run should be much quicker than 30
> > > seconds (at least it was in my setup), which would be the majority of the runs.
> >
> > Ah, thank you for the clarification!
> Thank you both for the information. In my setup (PPC cloud VM), the
> majority of the runs take at least 50 seconds to complete. From last
> evening to this morning (Beijing Time), the following experiments have
> been done:
> 1) torture mainline: the test finished quickly by hitting "rcu_sched
> self-detected stall" after 12 runs
> 2) torture v5.17: the test lasted 10 hours plus 14 minutes; 702 runs
> were completed without triggering the bug
> 
> Conclusion:
> There must be a commit that causes the bug, as Paul has pointed out.
> I am going to do the bisect, and I expect to locate the bug within a
> week (at most).
> This is a good learning experience, thanks for the guidance ;-)

Very good, and looking forward to seeing what you find.

							Thanx, Paul
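
For readers who want to automate a bisect like the one described above, here
is a minimal sketch of a helper that could be handed to "git bisect run".
It only illustrates the idea from the thread (repeat the run, look for the
stall message, stop at the first hit); it is not the script anyone in the
thread actually used. MAX_RUNS, RUN_TIMEOUT_S, and the command passed on the
command line are assumptions made for illustration. The exit codes follow the
documented "git bisect run" convention: 0 means good, 1 means bad, 125 means
skip.

```python
import subprocess
import sys

# Message quoted in the thread; its presence in the console log marks a bad run.
STALL_MARKER = "rcu_sched self-detected stall"
# Assumption: far more than the 12 runs mainline needed, far fewer than the 702 on v5.17.
MAX_RUNS = 100
# Assumption: generous compared with the roughly 50 seconds per run reported above.
RUN_TIMEOUT_S = 300


def classify(log_text):
    """Return 'bad' if the stall message appears in one run's console log."""
    return "bad" if STALL_MARKER in log_text else "good"


def bisect_exit_code(logs):
    """Map per-run logs to a 'git bisect run' exit code (0 good, 1 bad).

    `logs` may be a lazy iterable, so no further runs are started once
    one of them has stalled.
    """
    for log_text in logs:
        if classify(log_text) == "bad":
            return 1
    return 0


def run_once(cmd):
    """Run one iteration of a caller-supplied (hypothetical) test command."""
    proc = subprocess.run(cmd, capture_output=True, text=True,
                          timeout=RUN_TIMEOUT_S)
    return proc.stdout + proc.stderr


def main(argv):
    """argv is the test command, e.g. a wrapper script that boots the kernel."""
    if not argv:
        return 125  # nothing to run: ask git bisect to skip this commit
    try:
        return bisect_exit_code(run_once(argv) for _ in range(MAX_RUNS))
    except subprocess.TimeoutExpired:
        return 125  # a hung run is inconclusive here, so skip


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Since a clean result on a good commit takes MAX_RUNS full runs, the run count
trades bisection time against the risk of marking a bad commit as good; the
value above is a guess and would need tuning against how often the stall
actually reproduces.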

