[PATCH 7/13]: PCI Err: Symbios SCSI driver recovery

Benjamin Herrenschmidt benh at kernel.crashing.org
Wed Jun 29 11:51:07 EST 2005


On Tue, 2005-06-28 at 18:59 -0500, Linas Vepstas wrote:
> pci-err-7-symbios.patch
> 
> Adds PCI Error recoervy callbacks to the Symbios Sym53c8xx driver.
> Tested, seems to work well under i/o stress to one disk. Not
> stress tested under heavy i/o to multiple scsi devices.
> 
> Note the check of the pci error state flag inside an infinite
> loop inside the interrupt handler. Without this check, the 
> device can spin forever, locking up hard, long before the 
> asynchronous error event (and callbacks) are ever called. 

I don't understand the logic of that check. In general, I don't think
checking the error state is reliable at all. You may be in an interrupt
on the only CPU in the system, thus the error management code may have
no chance to update that error state field while you are looping... It
may work for us since we call the eeh stuff from the IO accessors but
will not in the generic case.

Normally, you should check for non-responding hardware by testing things
like reading all ff's or having a timeout in the loop. The bug is that
the driver has a potential infinite loop in the first place.

The only type of "synchronous" error checking that can be done is what
is proposed by Hidetoshi Seto. You could use his stuff here.

Ben.





More information about the Linuxppc64-dev mailing list