[PATCH] [ACENIC] Add check for EEH slot frozen in watchdog handler

olof at austin.ibm.com
Tue Aug 26 06:18:44 EST 2003


The patch below adds a check for IBM pSeries machines, where a PCI
slot can be frozen when the hardware detects a problem with the board (or
a card-initiated read/write to a protected/nonexistent memory area).

EEH handling on ppc64 works by hooking into the read{b,w,l} macros and
friends. The watchdog handler never goes down that path, so there were
two choices available:

1. Add a dummy read of a register on the board, so that the hooked read
   macro performs the check.
2. Add an explicit check (this is somewhat expensive, computation-wise).

(2) was chosen since this is not in a common path, and it would seem
pretty random to just read an arbitrary register, besides the risk that
someone would ask "Why's this here?" and rip it out later. :-)
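
To make the trade-off between the two options concrete, here is a small
userspace model. It is only a sketch: every name in it (fake_slot,
fake_readl, fake_check_failure and the two watchdog_* helpers) is
hypothetical and is not kernel API. It also assumes that a read from a
frozen slot returns all ones and that the hooked read macro only runs the
expensive check when it sees that value.

```c
#include <stdbool.h>
#include <stdint.h>

#define ALL_ONES 0xFFFFFFFFu

/* Hypothetical stand-in for a PCI slot; not a kernel structure. */
struct fake_slot {
	bool frozen;         /* true once the bridge has isolated the slot */
	uint32_t reg;        /* a single device register */
	unsigned int checks; /* how many times the expensive check ran */
};

/* Models the explicit (expensive) query of the slot state. */
static bool fake_check_failure(struct fake_slot *slot)
{
	slot->checks++;
	return slot->frozen;
}

/*
 * Models a hooked read macro: the expensive check only runs when the
 * value read back is all ones (assumed behaviour of a frozen slot).
 */
static uint32_t fake_readl(struct fake_slot *slot, bool *failed)
{
	uint32_t val = slot->frozen ? ALL_ONES : slot->reg;

	*failed = (val == ALL_ONES) && fake_check_failure(slot);
	return val;
}

/* Option 1: do a dummy register read and let the hook do the work. */
static bool watchdog_dummy_read(struct fake_slot *slot)
{
	bool failed;

	(void)fake_readl(slot, &failed); /* value is deliberately ignored */
	return failed;
}

/* Option 2: call the check directly, as the patch does. */
static bool watchdog_explicit_check(struct fake_slot *slot)
{
	return fake_check_failure(slot);
}
```

In this model, option 1 skips the expensive check on a healthy slot but
leaves behind a read whose result is thrown away, while option 2 always
pays for the check and makes the intent obvious to the next reader.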

The patch is against file revision 1.27 in BK; it should apply to 2.4.22
with an offset of a couple of lines.



Thanks,

Olof

Olof Johansson                                        Office: 4E002/905
pSeries Linux Development                             IBM Systems Group
Email: olof at austin.ibm.com                          Phone: 512-838-9858
All opinions are my own and not those of IBM




--- linux-2.4/drivers/net/acenic.c.orig	2003-08-25 15:04:37.000000000 -0500
+++ linux-2.4/drivers/net/acenic.c	2003-08-25 15:05:25.000000000 -0500
@@ -67,6 +67,10 @@
 #include <linux/highmem.h>
 #include <linux/sockios.h>

+#ifdef CONFIG_PPC_PSERIES
+#include <asm/eeh.h>
+#endif
+
 #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
 #include <linux/if_vlan.h>
 #endif
@@ -1867,6 +1871,15 @@ static void ace_watchdog(struct net_devi
 		       dev->name);
 		netif_wake_queue(dev);
 	}
+
+#ifdef CONFIG_PPC_PSERIES
+	/* IBM pSeries (ppc64) has a feature called EEH, in which a slot is
+	 * frozen if the bridge detects a parity error or DMA access violation.
+	 * It's possible that the watchdog triggers because the slot got frozen;
+	 * verify that this is not the case.
+	 */
+	eeh_check_failure(regs, 0);
+#endif
 }


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/



