Symbios PCI error recovery [Was: Re: [PATCH/RFC] ppc64: EEH + SCSI recovery (IPR only)]

Linas Vepstas linas at austin.ibm.com
Fri Apr 1 06:14:09 EST 2005


On Tue, Mar 22, 2005 at 11:38:36AM -0600, Brian King was heard to remark:
> Linas Vepstas wrote:
> > 
> > My current hardware will halt all i/o to/from the symbios controller
> > upon detection of a PCI error.  The recovery procedure that I am
> > currently using is to call system firmware (aka 'bios') to raise
> > and then lower the #RST PCI signal line for 1/4 second, then wait 2
> > seconds for the PCI bus to settle, then restore the PCI config space
> > registers (BARs, interrupt line, etc) to what they used to be. Then,
> > I call sym_start_up() in an attempt to get the symbios card working
> > again.  And that's where I get stuck ... 
> > 
> > My assumption is that after the #RST, the symbios card will sit
> > there, dumb and stupid, with no scripts running.  But sometimes I find
> > that the card has done something to make the PCI error hardware trip
> > again.  Typically, this means that the card attempted to DMA to some
> > address that it's not allowed to touch, or raised #SERR or possibly
> > #PERR (I can't tell which).
> 
> What config registers are you restoring? 

BARs, grant, latency, interrupt, cacheline size.
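
For concreteness, that save/restore step can be sketched roughly as
below.  This is a userland mock under stated assumptions, not the
actual recovery code: struct mock_dev, save_cfg(), restore_cfg() and
mock_reset() are names made up for illustration, "grant" is read as
the Min_Gnt byte, and the offsets are those of the standard PCI type 0
configuration header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE            256
/* Offsets from the standard PCI type 0 configuration header. */
#define CFG_CACHE_LINE_SIZE 0x0c   /* 8 bit */
#define CFG_LATENCY_TIMER   0x0d   /* 8 bit */
#define CFG_BAR0            0x10   /* six 32-bit BARs, 0x10..0x27 */
#define CFG_NUM_BARS        6
#define CFG_INTERRUPT_LINE  0x3c   /* 8 bit */
#define CFG_MIN_GNT         0x3e   /* 8 bit, assumed meaning of "grant" */

/* Hypothetical stand-in for one device's config space. */
struct mock_dev {
    uint8_t cfg[CFG_SIZE];
};

/* The registers listed above, captured before the reset. */
struct saved_cfg {
    uint32_t bar[CFG_NUM_BARS];
    uint8_t cache_line_size;
    uint8_t latency_timer;
    uint8_t interrupt_line;
    uint8_t min_gnt;
};

uint32_t cfg_read32(const struct mock_dev *d, unsigned off)
{
    uint32_t v;

    assert(off + 4 <= CFG_SIZE);
    memcpy(&v, &d->cfg[off], 4);
    return v;
}

void cfg_write32(struct mock_dev *d, unsigned off, uint32_t v)
{
    assert(off + 4 <= CFG_SIZE);
    memcpy(&d->cfg[off], &v, 4);
}

void save_cfg(const struct mock_dev *d, struct saved_cfg *s)
{
    int i;

    for (i = 0; i < CFG_NUM_BARS; i++)
        s->bar[i] = cfg_read32(d, CFG_BAR0 + 4 * i);
    s->cache_line_size = d->cfg[CFG_CACHE_LINE_SIZE];
    s->latency_timer = d->cfg[CFG_LATENCY_TIMER];
    s->interrupt_line = d->cfg[CFG_INTERRUPT_LINE];
    s->min_gnt = d->cfg[CFG_MIN_GNT];
}

void restore_cfg(struct mock_dev *d, const struct saved_cfg *s)
{
    int i;

    for (i = 0; i < CFG_NUM_BARS; i++)
        cfg_write32(d, CFG_BAR0 + 4 * i, s->bar[i]);
    d->cfg[CFG_CACHE_LINE_SIZE] = s->cache_line_size;
    d->cfg[CFG_LATENCY_TIMER] = s->latency_timer;
    d->cfg[CFG_INTERRUPT_LINE] = s->interrupt_line;
    d->cfg[CFG_MIN_GNT] = s->min_gnt;
}

/* Mock of what #RST does to the writable registers above:
 * they fall back to zero.  Min_Gnt is left alone in this mock. */
void mock_reset(struct mock_dev *d)
{
    memset(&d->cfg[CFG_BAR0], 0, 4 * CFG_NUM_BARS);
    d->cfg[CFG_CACHE_LINE_SIZE] = 0;
    d->cfg[CFG_LATENCY_TIMER] = 0;
    d->cfg[CFG_INTERRUPT_LINE] = 0;
}
```

The mock only shows the ordering (save, reset, restore); on real
hardware the reads and writes would go through the platform's config
space accessors instead of a byte array.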

> Is it possible symbios does not
> like something in your config restore?

possibly...

> Another possibility is that asserting PCI reset is not cleanly resetting
> the card. Does PCI reset force BIST to be run on these cards? You could
> try to manually run BIST on the card after the PCI reset to see if that

I didn't see BIST in the code, but I wasn't looking for it either.  I
could try that.
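
A manual BIST run would look something like the sketch below.  It
follows the standard PCI BIST register layout (config offset 0x0f:
bit 7 = BIST capable, bit 6 = start, bits 3:0 = completion code, zero
meaning pass).  Everything else is an assumption for illustration:
struct cfg_ops, run_bist() and the mock device are made-up names, and
whether the symbios parts implement BIST at all is not something this
sketch knows.

```c
#include <stdint.h>

#define CFG_BIST        0x0f
#define BIST_CAPABLE    0x80   /* device implements BIST */
#define BIST_START      0x40   /* write 1 to start; device clears when done */
#define BIST_CODE_MASK  0x0f   /* completion code, 0 == passed */

/* Hypothetical config space accessors supplied by the caller. */
struct cfg_ops {
    uint8_t (*read8)(void *ctx, unsigned off);
    void (*write8)(void *ctx, unsigned off, uint8_t v);
    void (*delay_ms)(void *ctx, unsigned ms);
};

/* Returns the completion code (0 == pass), -1 if the device is not
 * BIST capable, -2 if the start bit is still set after timeout_ms. */
int run_bist(const struct cfg_ops *ops, void *ctx, unsigned timeout_ms)
{
    uint8_t v = ops->read8(ctx, CFG_BIST);
    unsigned waited;

    if (!(v & BIST_CAPABLE))
        return -1;

    ops->write8(ctx, CFG_BIST, v | BIST_START);

    /* Poll every 10 ms until the device clears the start bit. */
    for (waited = 0; waited <= timeout_ms; waited += 10) {
        v = ops->read8(ctx, CFG_BIST);
        if (!(v & BIST_START))
            return v & BIST_CODE_MASK;
        ops->delay_ms(ctx, 10);
    }
    return -2;
}

/* Mock device: finishes BIST after polls_left reads of the register. */
struct mock_bist_dev {
    uint8_t bist;         /* current BIST register value */
    unsigned polls_left;  /* reads remaining before completion */
    uint8_t result;       /* completion code to report */
};

uint8_t mock_read8(void *ctx, unsigned off)
{
    struct mock_bist_dev *d = ctx;

    if (off != CFG_BIST)
        return 0;
    if (d->bist & BIST_START) {
        if (d->polls_left == 0)
            d->bist = BIST_CAPABLE | (d->result & BIST_CODE_MASK);
        else
            d->polls_left--;
    }
    return d->bist;
}

void mock_write8(void *ctx, unsigned off, uint8_t v)
{
    struct mock_bist_dev *d = ctx;

    /* Only the start bit is writable, and only on capable devices. */
    if (off == CFG_BIST && (d->bist & BIST_CAPABLE) && (v & BIST_START))
        d->bist |= BIST_START;
}

void mock_delay_ms(void *ctx, unsigned ms)
{
    (void)ctx;
    (void)ms;   /* no real delay needed in the mock */
}
```

The PCI spec lets software give up on a device whose BIST has not
completed after 2 seconds, so a timeout_ms of 2000 is the natural
value to pass.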

> helps, or you could try power cycling the slot instead of using PCI reset.

yes I could :(  I'll try that next.  Problem is, not all slots are
power-cyclable, only the hotplug slots are.  I've discovered that,
for example, the ethernet chips are soldered to the motherboard and
can't be power-cycled (but fortunately, those don't give me trouble).


--linas


