[PATCH 1/2] powerpc/eeh: Check for EEH availability in eeh_add_device_early()

Gavin Shan gwshan at linux.vnet.ibm.com
Wed Feb 3 09:44:38 AEDT 2016


On Tue, Jan 19, 2016 at 06:18:19PM -0200, Guilherme G. Piccoli wrote:
>The function eeh_add_device_early() is used to perform EEH initialization in
>devices added later on the system, like in hotplug/DLPAR scenarios. Since the
>commit 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell")
>a new check was introduced in this function - Cell has no EEH capabilities
>which led to kernel oops if hotplug was performed, so checking for
>eeh_enabled() was introduced to avoid the issue.
>
>However, in architectures that EEH is present like pSeries or PowerNV, we might
>reach a case in which no PCI devices are present on boot and so EEH is not
>initialized. Then, if a device is added via DLPAR for example,
>eeh_add_device_early() fails because eeh_enabled() is false.
>
>Also, we can hit a kernel oops on pSeries arch if eeh_add_device_early() fails:
>if we have no PCI devices on machine at boot time, and then we add a PCI device
>via DLPAR operation, the function query_ddw() triggers the oops on NULL pointer
>dereference in the line "cfg_addr = edev->config_addr;". It happens because
>config_addr in edev is NULL, since the function eeh_add_device_early() was not
>completed successfully.
>
>This patch just changes the way the arch checking is done in function
>eeh_add_device_early(): we don't use eeh_enabled() anymore, but instead we
>introduce the function eeh_available() that checks the running architecture
>by using the macro machine_is(). If we are running on pSeries or PowerNV, the
>EEH mechanism is available (even if not initialized yet). This way, we don't
>try to enable EEH on Cell and we don't hit the oops on DLPAR either.
>
>Fixes: 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell")
>Signed-off-by: Guilherme G. Piccoli <gpiccoli at linux.vnet.ibm.com>
>---
> arch/powerpc/kernel/eeh.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>index 40e4d4a..69031d7 100644
>--- a/arch/powerpc/kernel/eeh.c
>+++ b/arch/powerpc/kernel/eeh.c
>@@ -1056,6 +1056,23 @@ int eeh_init(void)
> core_initcall_sync(eeh_init);
>
> /**
>+ * eeh_available - Checks for the availability of EEH based on running
>+ * architecture.
>+ *
>+ * This routine should be used in case we need to check if EEH is
>+ * available in some situation, regardless if EEH is enabled or not.
>+ * For example, if we hotplug-add a PCI device on a machine with no
>+ * other PCI device, EEH won't be enabled, yet it's available if the
>+ * arch supports it.
>+ */
>+static inline bool eeh_available(void)
>+{
>+	if (machine_is(pseries) || machine_is(powernv))
>+		return true;
>+	return false;
>+}
>+

As I was told by somebody else before, the comments for static function
needn't to be exported.

>+/**
>  * eeh_add_device_early - Enable EEH for the indicated device node
>  * @pdn: PCI device node for which to set up EEH
>  *
>@@ -1072,7 +1089,7 @@ void eeh_add_device_early(struct pci_dn *pdn)
> 	struct pci_controller *phb;
> 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>
>-	if (!edev || !eeh_enabled())
>+	if (!edev || !eeh_available())
> 		return;
>
> 	if (!eeh_has_flag(EEH_PROBE_MODE_DEVTREE))

The change here seems not correct enough. Before the changes, eeh_add_device_early()
does nothing if EEH is disabled on pSeries. With the changes applied, the EEH device
(edev) will be scanned even EEH is disabled on pSeries.

>From the code changes, I didn't figure out the real problem you try to fix. Cell
platform doesn't have flag EEH_PROBE_MODE_DEVTREE. So the function does nothing
on Cell platform except calling into pdn_to_eeh_dev(). I'm not sure if the kernel
crashed in pdn_to_eeh_dev() on Cell platform?

Thanks,
Gavin



More information about the Linuxppc-dev mailing list