[PATCH] PPC64: EEH Recovery

Linas Vepstas linas at austin.ibm.com
Tue Nov 23 07:21:18 EST 2004


On Sun, Nov 21, 2004 at 10:36:13AM +1100, Benjamin Herrenschmidt was heard to remark:
> On Sat, 2004-11-20 at 17:11 -0600, Milton D. Miller II wrote:
> 
> > 2) I object to grabing pci devices so they don't disappear and reappear.
> >    I worry about duplicate devices across register/unregister and sysfs
> >    kobject lifetimes getting confused and duplicate names.
> > 
> >   I'd prefer we just kept the pci config stuff we are going to restore
> >   off the of device node.
> 
> Agreed... though it could even be a driver responsibility to restore the
> stuff ... The basic stuff like BARs don't need to be saved/restored I
> suppose too, just get the kernel to re-assign addresses after the old
> ones have been freed ...

I don't know how to get the kernel to assign BARs.  I looked for this,
didn't find anything, and thus concluded that I had to save & restore
the BARs myself.  The power-management suspend-resume code comes close,
but not quite.  There were no firmware calls for this.

The firmware will set up the BARs when power is toggled (initial system
power-up, or PCI hotplug).  However, it will *not* automatically
reconfigure after a reset.  There is an explicit call to ask it to
reconfigure a bridge after reset, but not a device.  So I used the
configure-bridge RTAS call, but did the devices myself, manually.
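
As a rough illustration of what "doing the devices manually" amounts to
(this is not the code from the patch), the sketch below saves the six
BAR slots of a type-0 config header before a reset and writes them back
afterwards.  The config space is simulated with a plain array; the
names fake_pci_dev, cfg_read32, cfg_write32, save_bars, restore_bars,
fake_reset and demo_reset_and_restore are all hypothetical and exist
only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PCI_BASE_ADDRESS_0 0x10  /* offset of the first BAR in a type-0 header */
#define NUM_BARS           6     /* a type-0 header has six 32-bit BAR slots */

/* Hypothetical stand-in for one device's 256-byte config space. */
struct fake_pci_dev {
    uint32_t cfg[64];
};

/* Hypothetical accessors; real code would go through the platform's
 * config-space read/write routines instead of touching an array. */
uint32_t cfg_read32(const struct fake_pci_dev *dev, int off)
{
    return dev->cfg[off / 4];
}

void cfg_write32(struct fake_pci_dev *dev, int off, uint32_t val)
{
    dev->cfg[off / 4] = val;
}

/* Copy the current BAR values out of config space. */
void save_bars(const struct fake_pci_dev *dev, uint32_t saved[NUM_BARS])
{
    for (int i = 0; i < NUM_BARS; i++)
        saved[i] = cfg_read32(dev, PCI_BASE_ADDRESS_0 + 4 * i);
}

/* Write the previously saved BAR values back after a reset. */
void restore_bars(struct fake_pci_dev *dev, const uint32_t saved[NUM_BARS])
{
    for (int i = 0; i < NUM_BARS; i++)
        cfg_write32(dev, PCI_BASE_ADDRESS_0 + 4 * i, saved[i]);
}

/* Simulate the effect of a reset: the BARs lose their assignments. */
void fake_reset(struct fake_pci_dev *dev)
{
    for (int i = 0; i < NUM_BARS; i++)
        cfg_write32(dev, PCI_BASE_ADDRESS_0 + 4 * i, 0);
}

/* Returns the number of BARs that differ after save/reset/restore. */
int demo_reset_and_restore(void)
{
    struct fake_pci_dev dev;
    uint32_t before[NUM_BARS], saved[NUM_BARS];
    int mismatches = 0;

    memset(&dev, 0, sizeof(dev));
    cfg_write32(&dev, PCI_BASE_ADDRESS_0, 0xc0000000u);     /* made-up memory BAR */
    cfg_write32(&dev, PCI_BASE_ADDRESS_0 + 4, 0x0000f001u); /* made-up I/O BAR */

    for (int i = 0; i < NUM_BARS; i++)
        before[i] = cfg_read32(&dev, PCI_BASE_ADDRESS_0 + 4 * i);

    save_bars(&dev, saved);
    fake_reset(&dev);
    assert(cfg_read32(&dev, PCI_BASE_ADDRESS_0) == 0); /* reset wiped BAR0 */
    restore_bars(&dev, saved);

    for (int i = 0; i < NUM_BARS; i++)
        if (cfg_read32(&dev, PCI_BASE_ADDRESS_0 + 4 * i) != before[i])
            mismatches++;
    return mismatches;
}
```

A real implementation would likely need to restore more than the BARs
(the command register, for instance), and would have to deal with
bridges separately, as described above.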


> > How about a list of (dn *, pci config words to write)?
> > or an array of dn
> 
> I don't understand why we need to do that... it's totally redundant with
> just unplugging/re-plugging the device, the kernel will then re-assign
> addresses to it.

You mean "the firmware".  Yes, I thought about doing that, but the
problem seemed to be that the rpa_php_hotplug tools were available
only as RPMs from the IBM website, and were specific to the PPC64
architecture.  So I figured that asking for the generic, architecture-
independent udev and hotplug scripts to be modified to invoke a
PPC64-specific closed-source binary was not going to work.  So it
seemed easier to do the reset in the kernel; it's not a whole lotta
lines of code.  The only gotcha was to save and restore the various
BARs so that the devices would come up properly.

--linas


