Symbios PCI error recovery [Was: Re: [PATCH/RFC] ppc64: EEH + SCSI recovery (IPR only)]

Linas Vepstas linas at
Fri Apr 1 06:06:22 EST 2005


Got distracted by other issues, so I'm answering a week late...

On Tue, Mar 22, 2005 at 10:57:28AM -0700, Grant Grundler was heard to remark:
> On Mon, Mar 21, 2005 at 05:10:28PM -0600, Linas Vepstas wrote:
> > My current hardware will halt all i/o to/from the symbios controller
> > upon detection of a PCI error.  The recovery procedure that I am
> > currently using is to call system firmware (aka 'bios') to raise
> > and then lower the #RST pci signal line for 1/4 second, then wait 2
> > seconds for the PCI bus to settle, then restore the PCI config space
> > registers (BARs, interrupt line, etc) to what they used to be. Then,
> > I call sym_start_up() in an attempt to get the symbios card working
> > again.  And that's where I get stuck ... 
> Does this process cause a SCSI bus reset?

I don't get a chance to get that far.  The PCI interface has to be
brought back up first, before any SCSI command can be issued.

> BTW, when did sym2 get a chance to clean up "pending" requests?

Yes, the sym2 driver has mechanisms for that.

> You want everything moved back to the "queued" state or failed
> (flush pending IO so upper layers can retry if they want).

The upper layer is the Linux block device; my understanding is that it does
not retry, nor do the filesystems above that.  Passing errors upwards
seems to be pretty darned fatal.  My goal is to limit retries to the

> > Sometimes, I get the PCI error while the card is sitting there idly
> > after the #RST, but more often, I get the error in sym_chip_reset(),
> > immediately after the   OUTB (nc_istat, SRST);
> Oh? Is this the driver trying to issue SCSI Reset?

No, I am trying to reinitialize the SCSI card after the PCI bus has been
reset.  This has nothing to do with SCSI bus resets, as far as I know.
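
To make the ordering of the recovery steps concrete, here is a rough
sketch in plain C.  It is a user-space simulation, not driver code:
struct sim_dev and all of the sim_* helpers are names made up for
illustration, sim_start_up() only stands in for sym_start_up(), and the
firmware call that toggles #RST is replaced by a stub.  The delays are
the 1/4 second and 2 seconds mentioned in the quoted description.

```c
#include <string.h>

/* Hypothetical simulated device; NOT the real sym2 or EEH structures. */
enum { CFG_WORDS = 16 };

struct sim_dev {
    unsigned int cfg[CFG_WORDS];    /* live config space (BARs, irq line, ...) */
    unsigned int saved[CFG_WORDS];  /* copy taken while the device was healthy */
    int rst_asserted;               /* 1 while #RST is held */
    unsigned int waited_ms;         /* total simulated delay */
    int started;                    /* 1 once start-up succeeded */
};

/* Remember the config space so it can be restored after a reset. */
static void sim_save_config(struct sim_dev *d)
{
    memcpy(d->saved, d->cfg, sizeof d->cfg);
}

/* Stub for a real delay; only accumulates the requested time. */
static void sim_delay_ms(struct sim_dev *d, unsigned int ms)
{
    d->waited_ms += ms;
}

/* Stub for the firmware call; asserting #RST wipes the config space. */
static void sim_set_rst(struct sim_dev *d, int asserted)
{
    d->rst_asserted = asserted;
    if (asserted)
        memset(d->cfg, 0, sizeof d->cfg);
}

/* Stand-in for sym_start_up(): fails unless #RST is released and the
 * config space matches the saved copy. */
static int sim_start_up(struct sim_dev *d)
{
    if (d->rst_asserted || memcmp(d->cfg, d->saved, sizeof d->cfg) != 0)
        return -1;
    d->started = 1;
    return 0;
}

/* The sequence: hold #RST for 250 ms, release, settle for 2000 ms,
 * restore config space, then restart the controller. */
static int sim_recover(struct sim_dev *d)
{
    sim_set_rst(d, 1);
    sim_delay_ms(d, 250);
    sim_set_rst(d, 0);
    sim_delay_ms(d, 2000);
    memcpy(d->cfg, d->saved, sizeof d->cfg);
    return sim_start_up(d);
}
```

Calling sim_save_config() while the simulated device is healthy and
then sim_recover() after an error restores the config words and returns
0.  On the real hardware, the last step is where things go wrong.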


More information about the Linuxppc64-dev mailing list