[PATCH v2 2/3] powerpc/eeh: Restore config from edev in eeh_pe_reset_and_recover()

Gavin Shan gwshan at linux.vnet.ibm.com
Tue Apr 26 20:21:31 AEST 2016


On Fri, Apr 22, 2016 at 11:28:03PM +1000, Gavin Shan wrote:
>The function eeh_pe_reset_and_recover() is used to recover from EEH
>errors when a passthrough device is transferred to the guest and
>back. The content of the device's config space is lost on the PE
>reset issued in the middle of the recovery, so the function
>saves/restores it before/after the reset. However, config access
>to some adapters, like the Broadcom BCM5719, at this point causes
>a fenced PHB. The config space is always blocked, so we save 0xFF's
>that are restored at a later point. The memory BARs end up totally
>corrupted, causing another EEH error upon access to one of the
>memory BARs.
>
>This restores the config space from the content saved in the
>EEH device when it was populated, which resolves the above issue.
>
>Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
>Cc: stable at vger.kernel.org #v3.18+
>Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>Reviewed-by: Russell Currey <ruscur at russell.cc>
>---
> arch/powerpc/kernel/eeh_driver.c | 39 ++-------------------------------------
> 1 file changed, 2 insertions(+), 37 deletions(-)
>
>diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>index 1c7d703..ec6e889 100644
>--- a/arch/powerpc/kernel/eeh_driver.c
>+++ b/arch/powerpc/kernel/eeh_driver.c
>@@ -163,22 +163,6 @@ static bool eeh_dev_removed(struct eeh_dev *edev)
> 	return false;
> }
>
>-static void *eeh_dev_save_state(void *data, void *userdata)
>-{
>-	struct eeh_dev *edev = data;
>-	struct pci_dev *pdev;
>-
>-	if (!edev)
>-		return NULL;
>-
>-	pdev = eeh_dev_to_pci_dev(edev);
>-	if (!pdev)
>-		return NULL;
>-
>-	pci_save_state(pdev);
>-	return NULL;
>-}
>-
> /**
>  * eeh_report_error - Report pci error to each device driver
>  * @data: eeh device
>@@ -304,22 +288,6 @@ static void *eeh_report_reset(void *data, void *userdata)
> 	return NULL;
> }
>
>-static void *eeh_dev_restore_state(void *data, void *userdata)
>-{
>-	struct eeh_dev *edev = data;
>-	struct pci_dev *pdev;
>-
>-	if (!edev)
>-		return NULL;
>-
>-	pdev = eeh_dev_to_pci_dev(edev);
>-	if (!pdev)
>-		return NULL;
>-
>-	pci_restore_state(pdev);
>-	return NULL;
>-}
>-
> /**
>  * eeh_report_resume - Tell device to resume normal operations
>  * @data: eeh device
>@@ -561,9 +529,6 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
> 	/* Put the PE into recovery mode */
> 	eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
>
>-	/* Save states */
>-	eeh_pe_dev_traverse(pe, eeh_dev_save_state, NULL);
>-
> 	/* Issue reset */
> 	ret = eeh_reset_pe(pe);
> 	if (ret) {
>@@ -578,8 +543,8 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
> 		return ret;
> 	}
>
>-	/* Restore device state */
>-	eeh_pe_dev_traverse(pe, eeh_dev_restore_state, NULL);
>+	/* Restore device's config space */
>+	eeh_pe_restore_bars(pe);

As discussed offline with David Gibson, it's not good enough to restore
the initial config space if the PE doesn't have the blocked config
property. In that case, we still need to rely on
pci_{save,restore}_state() to restore the config space as it was just
before the reset. Thanks to David for checking the code carefully. I
will respin to fix it in the next revision.
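
To make the two cases concrete, here is a rough user-space model of the
selection logic described above. It is only a sketch, not kernel code:
struct model_dev, PE_CFG_BLOCKED and the model_*() helpers are made-up
names for illustration, and the real respin may be structured
differently. The idea is that a PE with blocked config access skips the
save and falls back to the snapshot taken when the device was populated,
while any other PE gets the state saved just before the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_WORDS	16	/* model only the first 64 bytes of config space */
#define PE_CFG_BLOCKED	0x1u	/* hypothetical flag: config access is unsafe */

/* Hypothetical stand-in for a device; not the kernel's struct eeh_dev. */
struct model_dev {
	uint32_t hw_cfg[CFG_WORDS];	 /* "live" config space */
	uint32_t initial_cfg[CFG_WORDS]; /* snapshot taken at population time */
	uint32_t last_cfg[CFG_WORDS];	 /* snapshot taken right before reset */
	bool	 last_valid;		 /* true if last_cfg holds real data */
};

/* Save the live state, unless config access is blocked for this PE. */
void model_save(struct model_dev *d, unsigned int pe_flags)
{
	assert(d != NULL);

	if (pe_flags & PE_CFG_BLOCKED) {
		/* Reading here would only yield 0xFF's, so don't save. */
		d->last_valid = false;
		return;
	}

	memcpy(d->last_cfg, d->hw_cfg, sizeof(d->last_cfg));
	d->last_valid = true;
}

/* A PE reset loses whatever was in the config space. */
void model_reset(struct model_dev *d)
{
	assert(d != NULL);
	memset(d->hw_cfg, 0, sizeof(d->hw_cfg));
}

/* Restore from the latest snapshot if we have one, else the initial one. */
void model_restore(struct model_dev *d, unsigned int pe_flags)
{
	const uint32_t *src;

	assert(d != NULL);

	if (!(pe_flags & PE_CFG_BLOCKED) && d->last_valid)
		src = d->last_cfg;
	else
		src = d->initial_cfg;

	memcpy(d->hw_cfg, src, sizeof(d->hw_cfg));
}
```

In this model, a BAR reprogrammed after population survives the reset
only on the unblocked path; the blocked path goes back to the value
captured at population time.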

>
> 	/* Clear recovery mode */
> 	eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>-- 
>2.1.0
>


