[PATCH] powerpc/powernv/cpuidle: Pass correct drv->cpumask for registration

Vaidyanathan Srinivasan svaidy at linux.vnet.ibm.com
Mon Mar 20 15:31:12 AEDT 2017


* Michael Ellerman <mpe at ellerman.id.au> [2017-03-20 14:05:39]:

> Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com> writes:
> 
> > * Michael Neuling <mikey at neuling.org> [2017-03-18 16:28:02]:
> >
> >> Vaidy,
> >> 
> >> Thanks for fixing this.
> >> 
> >> > drv->cpumask defaults to cpu_possible_mask in __cpuidle_driver_init().
> >> > This breaks cpuidle on powernv where sysfs files are not created for
> >> > cpus in cpu_possible_mask that cannot be hot-added.
> >> 
> >> I think I prefer the longer description below than this.
> >
> > [PATCH] powerpc/powernv/cpuidle: Pass correct drv->cpumask for
> >
> > drv->cpumask defaults to cpu_possible_mask in __cpuidle_driver_init().
> > This breaks cpuidle on powernv where sysfs files are not created for
> > cpus in cpu_possible_mask that cannot be hot-added.
> 
> I'm confused.

It depends on CONFIG_HOTPLUG_CPU now, but we have the following
combinations to handle:
(a) CONFIG_HOTPLUG_CPU=y/n
(b) pseries vs powernv
 
> > On powernv platform cpu_present could be less than cpu_possible
> > in cases where firmware detects the cpu, but it is not available
> > for OS.
> 
> It's entirely normal for present < possible, on my laptop for example,
> so I don't see how that causes the bug.

Yes, present < possible in itself not a problem.  It is whether
cpu_device exist for that cpu or not.

> > Such cpus are not hotplugable at runtime on powernv and
> > hence we skip creating sysfs files for these cpus.
> 
> Why are they not hotpluggable? Looking at topology_init() they should be
> hotpluggable as long as ppc_md.cpu_die is populated, which it is on
> PowerNV AFAICS.

Currently it depends on CONFIG_HOTPLUG_CPU=y.
 
> Is it really "creating sysfs files" that's important, or is that we
> don't call register_cpu() for those CPUs?

Currently if CONFIG_HOTPLUG_CPU=n, then we skip calling register_cpu()
and that causes the problem.

The fix in cpuidle-powernv.c could be based on CONFIG_HOTPLUG_CPU, but
since we can choose to set cpu->hotpluggable based on other factors
even when CONFIG_HOTPLUG_CPU=y, I choose to use cpu_possible.  We will
need to change  cpuidle-powernv to register for additional cpus when
we really support hot-add of cpus that did not exist at boot.

> > Trying cpuidle_register_device() on cpu without sysfs node will
> > cause crash like:
> >
> > cpu 0xf: Vector: 380 (Data SLB Access) at [c000000ff1503490]
> >     pc: c00000000022c8bc: string+0x34/0x60
> >     lr: c00000000022ed78: vsnprintf+0x284/0x42c
> >     sp: c000000ff1503710
> >    msr: 9000000000009033
> >    dar: 6000000060000000
> >   current = 0xc000000ff1480000
> >   paca    = 0xc00000000fe82d00   softe: 0        irq_happened: 0x01
> >     pid   = 1, comm = swapper/8
> > Linux version 4.11.0-rc2 (sv at sagarika) (gcc version 4.9.4 (Buildroot 2017.02-00004-gc28573e) ) #15 SMP Fri Mar 17 19:32:02 IST 2017
> > enter ? for help
> > [link register   ] c00000000022ed78 vsnprintf+0x284/0x42c
> > [c000000ff1503710] c00000000022ebb8 vsnprintf+0xc4/0x42c (unreliable)
> > [c000000ff1503800] c00000000022ef40 vscnprintf+0x20/0x44
> > [c000000ff1503830] c0000000000ab61c vprintk_emit+0x94/0x2cc
> > [c000000ff15038a0] c0000000000acc9c vprintk_func+0x60/0x74
> > [c000000ff15038c0] c000000000619694 printk+0x38/0x4c
> > [c000000ff15038e0] c000000000224950 kobject_get+0x40/0x60
> > [c000000ff1503950] c00000000022507c kobject_add_internal+0x60/0x2c4
> > [c000000ff15039e0] c000000000225350 kobject_init_and_add+0x70/0x78
> > [c000000ff1503a60] c00000000053c288 cpuidle_add_sysfs+0x9c/0xe0
> > [c000000ff1503ae0] c00000000053aeac cpuidle_register_device+0xd4/0x12c
> > [c000000ff1503b30] c00000000053b108 cpuidle_register+0x98/0xcc
> > [c000000ff1503bc0] c00000000085eaf0 powernv_processor_idle_init+0x140/0x1e0
> > [c000000ff1503c60] c00000000000cd60 do_one_initcall+0xc0/0x15c
> > [c000000ff1503d20] c000000000833e84 kernel_init_freeable+0x1a0/0x25c
> > [c000000ff1503dc0] c00000000000d478 kernel_init+0x24/0x12c
> > [c000000ff1503e30] c00000000000b564 ret_from_kernel_thread+0x5c/0x78
> 
> I really don't understand how a CPU not being present leads to a crash
> in printf()? Something in that call chain should have checked that the
> CPU was registered before crashing in printf() - surely?

Yes, we should have just failed to register the cpuidle driver.  I have
the fix here:

[PATCH] cpuidle: Validate cpu_dev in cpuidle_add_sysfs
http://patchwork.ozlabs.org/patch/740634/

--Vaidy



More information about the Linuxppc-dev mailing list