[PATCH linux-next][RFC] powerpc: avoid lockdep when we are offline

Nicholas Piggin npiggin at gmail.com
Wed Sep 28 12:51:14 AEST 2022


On Wed Sep 28, 2022 at 11:48 AM AEST, Zhouyi Zhou wrote:
> Thanks, Nick, for reviewing my patch.
>
> On Tue, Sep 27, 2022 at 12:25 PM Nicholas Piggin <npiggin at gmail.com> wrote:
> >
> > On Tue Sep 27, 2022 at 11:48 AM AEST, Zhouyi Zhou wrote:
> > > This is the second version of my fix for PPC's "WARNING: suspicious RCU usage".
> > > I improved the fix under Paul E. McKenney's guidance:
> > > Link: https://lore.kernel.org/lkml/20220914021528.15946-1-zhouzhouyi@gmail.com/T/
> > >
> > > During CPU offlining, the sub-functions of xive_teardown_cpu
> > > call __lock_acquire when CONFIG_LOCKDEP=y. The latter function
> > > traverses an RCU-protected list, so "WARNING: suspicious RCU usage" is
> > > triggered.
> > >
> > > Avoid lockdep when we are offline.
> >
> > I don't see how this is safe. If RCU is no longer watching the CPU then
> > the memory it is accessing here could be concurrently freed. I think the
> > warning is valid.
> Agree
> >
> > powerpc's problem is that cpuhp_report_idle_dead() is called before
> > arch_cpu_idle_dead(), so it must not rely on any RCU protection there.
> > I would say xive cleanup just needs to be done earlier. I wonder why it
> > is not done in __cpu_disable or thereabouts, that's where the interrupt
> > controller is supposed to be stopped.
> Yes, I learned the following sequence of events from kgdb debugging:
> __cpu_disable -> pseries_cpu_disable -> set_cpu_online(cpu, false),
> which leads to do_idle: if (cpu_is_offline(cpu)) -> arch_cpu_idle_dead,
> so xive cleanup should be done in pseries_cpu_disable.

It's a good catch and a reasonable approach to the problem.

> But as a beginner, I am afraid that I am not competent to do the above
> sophisticated work without error, although I would very much like to.
> Could any expert do this for us?

This will be difficult for anybody; it's tricky code. I'm not an
expert at it.

It looks like the interrupt controller disable split has been there
since long before xive. I would just try moving them together and then
see if that works.

Documentation/core-api/cpu_hotplug.rst says that __cpu_disable should
shut down the interrupt handler. So if there is a complication it
would probably be from powerpc-specific CPU hotplug or interrupt
code.

Thanks,
Nick


