[PATCH 5/7] ppc64: EEH handle empty PCI slot failure

linas linas at austin.ibm.com
Fri Sep 30 10:58:27 EST 2005



05-eeh-empty-slot-error.patch

PCI config-space reads from empty PCI slots can cause the firmware to report
a "permanent failure". Ignore permanent failures on empty slots, which are
recognized by their device node having no children, and skip the stack dump
for permanent failures.
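
For reviewers, the resulting decision order can be sketched as a standalone
function. This is only an illustration, not the kernel code: the names
classify_slot_state(), wants_stack_trace() and the eeh_verdict enum are
hypothetical, and reading slot state 5 as "permanent failure" is an assumption
taken from the description above. The patch itself uses the bare numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; the patch uses bare numbers. */
enum eeh_verdict { EEH_IGNORE = 0, EEH_REPORT = 1 };

/*
 * fw_rc:        return code of the firmware call (0 on success)
 * rets:         rets[0] = slot state, rets[1] = 1 if EEH is supported
 * has_children: nonzero if the device node has child nodes
 */
static enum eeh_verdict classify_slot_state(int fw_rc, const int rets[2],
					     int has_children)
{
	assert(rets != NULL);

	/* The call to firmware failed: treat as a false positive. */
	if (fw_rc != 0)
		return EEH_IGNORE;

	/* EEH is not supported on this device. */
	if (rets[1] != 1)
		return EEH_IGNORE;

	/* Not one of the slot states the handler knows about. */
	if (rets[0] != 2 && rets[0] != 4 && rets[0] != 5)
		return EEH_IGNORE;

	/* State 5 (assumed: permanent failure) on a slot with no children
	 * is taken to be an empty slot and is ignored. */
	if (rets[0] == 5 && !has_children)
		return EEH_IGNORE;

	return EEH_REPORT;
}

/* The stack trace is skipped for state 5, as in the second hunk. */
static int wants_stack_trace(const int rets[2])
{
	return rets[0] != 5;
}
```

With rets = {5, 1}, the sketch ignores the event when the node has no children
and reports it, without a stack trace, when it has at least one.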

Signed-off-by: Linas Vepstas <linas at linas.org>

Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c	2005-09-29 16:06:25.583986100 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c	2005-09-29 16:06:33.567867154 -0500
@@ -617,7 +617,32 @@
 	 * In any case they must share a common PHB.
 	 */
 	ret = read_slot_reset_state(pdn, rets);
-	if (!(ret == 0 && rets[1] == 1 && (rets[0] == 2 || rets[0] == 4))) {
+
+	/* If the call to firmware failed, punt */
+	if (ret != 0) {
+		printk(KERN_WARNING "EEH: read_slot_reset_state() failed; rc=%d dn=%s\n",
+		       ret, dn->full_name);
+		__get_cpu_var(false_positives)++;
+		return 0;
+	}
+
+	/* If EEH is not supported on this device, punt. */
+	if (rets[1] != 1) {
+		printk(KERN_WARNING "EEH: event on unsupported device, rc=%d dn=%s\n",
+		       ret, dn->full_name);
+		__get_cpu_var(false_positives)++;
+		return 0;
+	}
+
+	/* If not the kind of error we know about, punt. */
+	if (rets[0] != 2 && rets[0] != 4 && rets[0] != 5) {
+		__get_cpu_var(false_positives)++;
+		return 0;
+	}
+
+	/* Note that config-io to empty slots may fail;
+	 * we recognize empty because they don't have children. */
+	if ((rets[0] == 5) && (dn->child == NULL)) {
 		__get_cpu_var(false_positives)++;
 		return 0;
 	}
@@ -650,7 +675,7 @@
 	/* Most EEH events are due to device driver bugs.  Having
 	 * a stack trace will help the device-driver authors figure
 	 * out what happened.  So print that out. */
-	dump_stack();
+	if (rets[0] != 5) dump_stack();
 	schedule_work(&eeh_event_wq);
 
 	return 0;


