[PATCH] powerpc/eeh: Validate arch in eeh_add_device_early()

Michael Ellerman mpe at ellerman.id.au
Wed Jan 13 21:38:07 AEDT 2016


On Sun, 2016-01-10 at 01:08 -0200, Guilherme G. Piccoli wrote:

> Commit 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell")
> added a check to eeh_add_device_early(): because eeh_ops is NULL on Cell, that
> code used to crash there. The commit's approach was to validate that EEH was
> available by checking the result of eeh_enabled().
> 
> Since eeh_add_device_early() is used to perform EEH initialization for
> devices added to the system later, as in hotplug/DLPAR scenarios, we might
> reach a case in which no PCI devices are present at boot and so EEH is not
> initialized. Then, if a device is added via DLPAR, for example,
> eeh_add_device_early() fails because eeh_enabled() is false.
> 
> We can hit a kernel oops on pSeries if eeh_add_device_early() fails:
> if there are no PCI devices on the machine at boot time and we then add a PCI
> device via a DLPAR operation, query_ddw() triggers an oops on a NULL pointer
> dereference at the line "cfg_addr = edev->config_addr;". It happens because
> config_addr in edev is NULL, since eeh_add_device_early() did not complete
> successfully.
> 
> This patch just changes how the arch check is done in
> eeh_add_device_early(): we no longer use eeh_enabled(), but instead check the
> running architecture with the macro machine_is(). If we are running on
> pSeries or PowerNV, the EEH mechanism can be enabled; otherwise, we bail out
> of the function. This way, we don't enable EEH on Cell and we don't hit the
> oops on DLPAR either.
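
The difference between the two guards described above can be modelled in
user space. This is only a sketch under stated assumptions, not the kernel
code: enum platform, struct model_state, struct model_dev and both
add_device_early_*() helpers are hypothetical stand-ins for machine_is(),
eeh_enabled() and the real per-device EEH state.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are kernel definitions. */
enum platform { PLATFORM_CELL, PLATFORM_PSERIES, PLATFORM_POWERNV };

struct model_state {
	enum platform platform;	/* what machine_is() would report */
	bool eeh_enabled;	/* set only if boot-time EEH init ran */
};

struct model_dev {
	bool eeh_initialised;	/* true once early EEH setup completed */
};

/* Guard as described for commit 89a51df5ab1d: bail out unless EEH is enabled. */
static bool add_device_early_old(const struct model_state *s,
				 struct model_dev *d)
{
	if (!s->eeh_enabled)
		return false;
	d->eeh_initialised = true;
	return true;
}

/* Guard as described for this patch: decide on the platform only. */
static bool add_device_early_new(const struct model_state *s,
				 struct model_dev *d)
{
	if (s->platform != PLATFORM_PSERIES &&
	    s->platform != PLATFORM_POWERNV)
		return false;
	d->eeh_initialised = true;
	return true;
}
```

In this model, a pSeries state with eeh_enabled == false (no PCI devices at
boot) makes the old guard return early and leave the device uninitialised,
while the new guard initialises it. Note that neither helper touches
eeh_enabled, so in the model it stays false after the new guard succeeds.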

But eeh_enabled() is still false? That seems like it's liable to cause breakage
elsewhere.

Shouldn't the PCI hotplug code instead be taught to initialise EEH correctly
when the first device is added?
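
Roughly what I have in mind, as a user-space sketch rather than a patch.
Everything here is hypothetical: struct eeh_model, model_eeh_init() and
model_hotplug_add() are made-up names, and the real initialisation path would
have to do far more than flip a flag.

```c
#include <stdbool.h>

/* Hypothetical model of global EEH state; not a kernel structure. */
struct eeh_model {
	bool platform_supports_eeh;	/* stands in for a pSeries/PowerNV check */
	bool enabled;			/* stands in for what eeh_enabled() reports */
	int init_calls;			/* how many times one-time init has run */
};

/* Hypothetical one-time initialisation, normally done at boot. */
static void model_eeh_init(struct eeh_model *m)
{
	m->enabled = true;
	m->init_calls++;
}

/*
 * Hypothetical hotplug entry point: if boot skipped EEH init because no
 * PCI devices were present, run it when the first device is added.
 */
static bool model_hotplug_add(struct eeh_model *m)
{
	if (!m->platform_supports_eeh)
		return false;
	if (!m->enabled)
		model_eeh_init(m);
	return m->enabled;
}
```

The point of the sketch is that the enabled flag ends up true after the first
hotplug add, so later code that consults it sees a consistent state, and the
one-time init does not run again for subsequent devices.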

cheers


