hotplug remove vs. device driver close

linas at austin.ibm.com
Fri Jun 4 04:50:44 EST 2004


On Thu, Jun 03, 2004 at 09:20:20AM -0700, Greg KH wrote:
> On Thu, Jun 03, 2004 at 11:40:04AM +1000, Anton Blanchard wrote:
> >
> > > > We are hitting a situation where we are hot-plug removing a pci card
> > > > before closing the device driver.  This seems to lead to kernel
> > > > memory leaks if not outright crashes. I'm trying to understand what
> > > > the correct solution to this is supposed to be.
> > >
> > > To paraphrase from the PCI Hotplug spec, "DO NOT DO THAT!"
> >
> > How do you currently guarantee this on cardbus?
>
> We make no such guarantee.  As I stated, the Cardbus/PCMCIA handle this
> quite easily, so it is pretty simple to fix up a PCI driver to also
> handle this.
>
> But the main answer is that the PCI Hotplug spec states that the OS does
> NOT have to protect for this happening to regular PCI devices.

So if I understand what you are saying: if the OS crashes because of
a sysadmin error or a script error during pci hotplug remove, that's
considered OK?

I understand why the PCI spec would say that: they have no desire
to over-burden already struggling OS developers, and the PCI spec
committee probably thinks in terms of "provide function, not policy".
That's normal and as it should be.

But in the five-9's world of high availability, automatic failover,
etc. etc. this sure sounds like a great way of putting executives
on a warpath.  I humbly suggest that the Linux kernel policy should
be that we do better than the PCI spec, and attempt to minimize damage
due to operator error.  If not all drivers or tools or subsystems
adhere to this policy, so be it, but robustness should be a goal.


--linas


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/




