[PATCH 2/11] ppc64: EEH: Add clarifying messages.
Brian King
brking at linux.vnet.ibm.com
Wed Mar 21 05:26:44 EST 2007
Linas Vepstas wrote:
> There are multiple code paths that result in a "permanent
> failure"; when examining rare events, it can be hard to see
> which was taken. This patch adds printk's to assist.
Should these printk's be logging the location of the failing device/slot?
Brian
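
To make the question concrete, here is a minimal userspace sketch of a
failure message that also carries a location string. Everything in it is an
assumption for illustration: the helper name eeh_format_failure, the
"location=" wording, and the "unknown" fallback are hypothetical and are not
part of the patch. In the kernel the string would go to printk, and the
location would have to come from whatever the EEH code knows about the
frozen slot, which is not shown here.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: build an EEH failure
 * message that names the affected device/slot as well as the reason.
 * Returns the snprintf() result, i.e. the length the full message needs.
 */
static int eeh_format_failure(char *buf, size_t len,
                              const char *reason, const char *location)
{
	/* Fall back to a placeholder when no location string is available. */
	if (!location || !*location)
		location = "unknown";

	return snprintf(buf, len, "EEH: %s, location=%s", reason, location);
}
```

Each of the five new messages could pass its own reason text through such a
helper, so a log reader sees both which path was taken and which slot it
concerned.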
>
> Signed-off-by: Linas Vepstas <linas at austin.ibm.com>
>
> ----
> arch/powerpc/platforms/pseries/eeh_driver.c | 20 +++++++++++++++-----
> 1 file changed, 15 insertions(+), 5 deletions(-)
>
> Index: linux-2.6.21-rc4-git4/arch/powerpc/platforms/pseries/eeh_driver.c
> ===================================================================
> --- linux-2.6.21-rc4-git4.orig/arch/powerpc/platforms/pseries/eeh_driver.c 2007-03-19 12:51:09.000000000 -0500
> +++ linux-2.6.21-rc4-git4/arch/powerpc/platforms/pseries/eeh_driver.c 2007-03-19 13:19:28.000000000 -0500
> @@ -367,8 +367,10 @@ struct pci_dn * handle_eeh_events (struc
> */
> if ((event->state == pci_channel_io_perm_failure) &&
> ((event->time_unavail <= 0) ||
> - (event->time_unavail > MAX_WAIT_FOR_RECOVERY*1000)))
> + (event->time_unavail > MAX_WAIT_FOR_RECOVERY*1000))) {
> + printk(KERN_WARNING "EEH: Permanent failure\n");
> goto hard_fail;
> + }
>
> eeh_slot_error_detail(frozen_pdn, 1 /* Temporary Error */);
> printk(KERN_WARNING
> @@ -390,8 +392,10 @@ struct pci_dn * handle_eeh_events (struc
> */
> if (result == PCI_ERS_RESULT_NONE) {
> rc = eeh_reset_device(frozen_pdn, frozen_bus);
> - if (rc)
> + if (rc) {
> + printk(KERN_WARNING "EEH: Unable to reset, rc=%d\n", rc);
> goto hard_fail;
> + }
> }
>
> /* If all devices reported they can proceed, then re-enable MMIO */
> @@ -417,21 +421,27 @@ struct pci_dn * handle_eeh_events (struc
> }
>
> /* If any device has a hard failure, then shut off everything. */
> - if (result == PCI_ERS_RESULT_DISCONNECT)
> + if (result == PCI_ERS_RESULT_DISCONNECT) {
> + printk(KERN_WARNING "EEH: Device driver gave up\n");
> goto hard_fail;
> + }
>
> /* If any device called out for a reset, then reset the slot */
> if (result == PCI_ERS_RESULT_NEED_RESET) {
> rc = eeh_reset_device(frozen_pdn, NULL);
> - if (rc)
> + if (rc) {
> + printk(KERN_WARNING "EEH: Cannot reset, rc=%d\n", rc);
> goto hard_fail;
> + }
> result = PCI_ERS_RESULT_NONE;
> pci_walk_bus(frozen_bus, eeh_report_reset, &result);
> }
>
> /* All devices should claim they have recovered by now. */
> - if (result != PCI_ERS_RESULT_RECOVERED)
> + if (result != PCI_ERS_RESULT_RECOVERED) {
> + printk(KERN_WARNING "EEH: Not recovered\n");
> goto hard_fail;
> + }
>
> /* Tell all device drivers that they can resume operations */
> pci_walk_bus(frozen_bus, eeh_report_resume, NULL);
--
Brian King
eServer Storage I/O
IBM Linux Technology Center