[patch 8/8] PCI Error Recovery: PPC64 core recovery routines
Linas Vepstas
linas at austin.ibm.com
Tue Aug 30 02:09:15 EST 2005
On Mon, Aug 29, 2005 at 04:40:20PM +1000, Paul Mackerras was heard to remark:
> Linas Vepstas writes:
>
> > Actually, no. There are three issues:
> > 1) hotplug routines are called from within kernel. GregKH has stated on
> > multiple occasions that doing this is wrong/bad/evil. This includes
> > calling hot-unplug.
> >
> > 2) As a result, the code to call hot-unplug is a bit messy. In
> > particular, there's a bit of hoop-jumping when hotplug is built as
> > a module (and said hoops were wrecked recently when I moved the
> > code around, out of the rpaphp directory).
>
> One way to clean this up would be to make rpaphp the driver for the
> EADS bridges (from the pci code's point of view).

I guess I don't understand what that means. Are you suggesting moving
pSeries_pci.c into the rpaphp code directory?

> Then it would
> automatically get included in the error recovery process and could do
> whatever it should.

John Rose, the current maintainer of the rpaphp code, is pretty militant
about removing things from, not adding things to, the rpaphp code.
Which is a good idea, as chunks of that code are spaghetti, and do need
simplification and cleanup.

> > 3) Hot-unplug causes scripts to run in user-space. There is no way to
> > know when these scripts are done, so it's not clear if we've waited
> > long enough before calling hot-add (or if waiting is even necessary).
>
> OK, so let's just add a new hotplug event called KOBJ_ERROR or
> something, which tells userspace that an error has occurred which has
> made the device inaccessible. Greg, would that be OK?

Why do we need such an event?

I would prefer to deprecate the hot-plug based recovery scheme. This
is for many reasons, including the fact that some devices that can get
pci errors are soldered onto the planar, and are not hot-pluggable.

--linas