PCI errors [was Re: "sparse" warnings..]
linas at austin.ibm.com
Thu May 6 04:06:13 EST 2004
On Tue, May 04, 2004 at 06:08:39PM -0700, Linus Torvalds wrote:
>
> On Tue, 4 May 2004 linas at austin.ibm.com wrote:
> >
> > Except that is not how the hardware works. Once you get the error,
> > that's it, the device is blown up out of the water, it's history.
> > It's impossible to ignore this error.
>
> So?
> Return garbage, and continue.
>
> There's nothing else you _can_ do. Go on with life. If the driver doesn't
> have error recovery, what else would you suggest?
Well, I guess there are two discussion threads here; the short answer is
'yes, that's right'.
-- At the low level, in the 'what should the pio/mmio inb macros do'
discussion, the answer is that the checks are there because
the pSeries system architects have declared that the kernel should
panic as soon as possible if the device driver doesn't know what
to do with the EEH error. I'll see what I can do to review this
decision, but it may take months. Some words of wisdom with your
name attached to them may sway the outcome. (paulus & benh, if this
comes up in whatever system-level architecture discussions you are
privy to, let me know & sway the authorities as needed).
The current philosophy is that it is better to panic than to risk
unknown data corruption. Of course, why one would even have a
non-EEH aware adapter in a system that is so dad-burned critical
is a bit of a mystery to me.
-- At the high level, as you point out, many device drivers already
know how to deal with the all-ff's return value. I'm now mostly
trying to understand the paths that a hotplug event may take, and
make sure that things like resetting the slot state happen in all
the right places. I'm mostly looking for the minimal solution:
to install any needed hooks in existing frameworks. For network
type things, I'm looking at the hotplug framework. For scsi,
I'm looking at the scsi reset sequence. Time will tell if this
was the right thing to do.
** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/