[PATCH] pSeries: EEH improperly enabled for some Power4 systems

Linas Vepstas linas at austin.ibm.com
Sat Jan 27 09:53:11 EST 2007


On Sat, Jan 27, 2007 at 09:30:57AM +1100, Paul Mackerras wrote:
> Linas Vepstas writes:
> 
> > It appears that EEH is improperly enabled for some Power4 systems.
> > On these systems, the ibm,set-eeh-option returns a value of success
> > even when EEH is not supported on the given node. Thus, an explicit
> > check for support is required.
> 
> What happens on the power4 systems when EEH is improperly enabled?
> 
> What systems has the patch been tested on?

Sorry, I should have said more from the get-go.

During boot, on power4, without this patch, one sees messages 
similar to:

EEH: event on unsupported device, rc=0 dn=/pci@400000000110/IBM,sp@1
EEH: event on unsupported device, rc=0 dn=/pci@400000000110/pci@2
EEH: event on unsupported device, rc=0 dn=/pci@400000000110/pci@2,2
etc.

The patch makes these go away.

Without this patch, EEH recovery does seem to work correctly for 
at least some devices (I tested ethernet e1000), but fails to 
recover others (the Emulex LightPulse LPFC, most notably). 
Off the top of my head, I don't remember why some devices are 
affected, but not others.

The PAPR indicates that the correct way to test for EEH is as 
done in this patch; it's not clear to me whether this was in the 
PAPR all along or was added recently. If it was there all along, 
it's not clear to me why this wasn't fixed long ago. I suspect only
certain firmware levels are affected.
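
To illustrate the shape of the explicit check (a rough sketch only, not
the patch itself): the stubs below stand in for firmware calls, and every
name in it (fake_set_eeh_option, fake_query_slot, struct fw_reply, the
node numbers) is hypothetical and not part of RTAS or the kernel. The
assumption is that a second query reports whether the slot really
supports EEH, so a "success" from the enable call alone is not trusted.

```c
#include <stdbool.h>

/* Hypothetical reply from a slot-state query; not a real RTAS layout. */
struct fw_reply {
	int status;        /* 0 = the query itself succeeded */
	int eeh_supported; /* 1 = the slot reports EEH support */
};

/*
 * Stub modelling the affected firmware: the "enable EEH" call reports
 * success (0) for every node, whether or not EEH is supported there.
 */
static int fake_set_eeh_option(int node)
{
	(void)node;
	return 0;
}

/*
 * Stub modelling a separate support query. For this sketch, only
 * node 1 is assumed to support EEH.
 */
static struct fw_reply fake_query_slot(int node)
{
	struct fw_reply r = { 0, node == 1 ? 1 : 0 };
	return r;
}

/* Treat EEH as usable only if the enable call AND the explicit check agree. */
static bool eeh_usable(int node)
{
	struct fw_reply r;

	if (fake_set_eeh_option(node) != 0)
		return false;

	r = fake_query_slot(node);
	return r.status == 0 && r.eeh_supported == 1;
}
```

With these stubs, eeh_usable(1) is true and eeh_usable(2) is false, even
though the enable call reports success for both nodes.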

I've tested on one power4 and one power5; both have "old" 
firmware (firmware dating back to not long after product 
announce). It sure would be nice to test on more machines, huh? 
I don't know how to quickly test on a broad spectrum of machines.

If this makes you nervous, I suppose this patch can wait for
the 2.6.21 series.

--linas



