hotplug remove vs. device driver close

Greg KH greg at kroah.com
Fri Jun 4 05:02:06 EST 2004


On Thu, Jun 03, 2004 at 01:50:44PM -0500, linas at austin.ibm.com wrote:
> On Thu, Jun 03, 2004 at 09:20:20AM -0700, Greg KH wrote:
> > On Thu, Jun 03, 2004 at 11:40:04AM +1000, Anton Blanchard wrote:
> > >
> > > > > We are hitting a situation where we are hot-plug removing a pci card
> > > > > before closing the device driver.  This seems to lead to kernel
> > > > > memory leaks if not outright crashes. I'm trying to understand what
> > > > > the correct solution to this is supposed to be.
> > > >
> > > > To paraphrase from the PCI Hotplug spec, "DO NOT DO THAT!"
> > >
> > > How do you currently guarantee this on cardbus?
> >
> > We make no such guarantee.  As I stated, the Cardbus/PCMCIA handle this
> > quite easily, so it is pretty simple to fix up a PCI driver to also
> > handle this.
> >
> > But the main answer is that the PCI Hotplug spec states that the OS does
> > NOT have to protect for this happening to regular PCI devices.
>
> So if I understand what you are saying: if the OS crashes because of
> a sysadmin error or a script error during pci hotplug remove, that's
> considered OK?

As sysadmin I can delete your whole root fs, and reboot the box into
oblivion.  Are you considering changing this ability too?  :)

If you are really worried about this, then look into a different
permission model for Linux like SELinux.

Or you can simply fix up your PCI driver to properly handle reads that
return all FF once the device has been removed.  That seems to be what
you need to do to solve this for your small subset of drivers on your
platform, correct?
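
As a rough illustration of the "all FF" check, here is a userspace
sketch of a polling loop that gives up when a register read comes back
as all ones.  Everything named here is hypothetical (the accessor type,
REG_STATUS, STATUS_READY, wait_for_ready, fake_read); a real driver
would read through its mapped BAR, and the sketch assumes all ones is
never a valid status value for the device.

```c
#include <stdint.h>

/* Hypothetical register accessor.  In a real driver this would be a
 * read of a mapped BAR; a function pointer keeps the sketch runnable
 * in userspace. */
typedef uint32_t (*reg_read_fn)(void *ctx, unsigned int offset);

#define REG_STATUS   0x04u        /* hypothetical status register offset */
#define STATUS_READY 0x1u         /* hypothetical "ready" bit */
#define ALL_ONES     0xFFFFFFFFu  /* what a read of a removed device returns */

enum wait_result { WAIT_READY, WAIT_TIMEOUT, WAIT_DEVICE_GONE };

/* Poll the status register up to max_polls times.  The all-ones test
 * comes first, because an all-ones value also has the ready bit set. */
enum wait_result wait_for_ready(reg_read_fn rd, void *ctx,
                                unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        uint32_t status = rd(ctx, REG_STATUS);

        if (status == ALL_ONES)
            return WAIT_DEVICE_GONE;   /* card was pulled: stop touching it */
        if (status & STATUS_READY)
            return WAIT_READY;
    }
    return WAIT_TIMEOUT;               /* bounded loop, never spins forever */
}

/* Simulated device for trying the sketch out: ctx points at the value
 * every register read should return. */
uint32_t fake_read(void *ctx, unsigned int offset)
{
    (void)offset;
    return *(const uint32_t *)ctx;
}
```

The two points the sketch tries to make are that every loop on device
state is bounded, and that the all-ones case is treated as "device gone"
rather than as valid data.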

> I understand why the PCI spec would say that: they have no desire
> to over-burden already struggling OS developers: the PCI spec
> committee probably thinks in terms of "provide function not policy".
> That's normal and as it should be.

That's also what the kernel provides, function not policy.  Put your
policy in userspace and force your admin to use a tool that ensures that
anything bound to the device has been properly shut down before it tells
the kernel to remove the device from the system.
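
A minimal sketch of one check such a userspace tool might make before
asking for the removal.  It assumes a sysfs layout in which a device
with a driver bound has a "driver" symlink in its device directory; the
function names and the example path are hypothetical.  A bound driver is
not the same thing as an open device, so a real tool would need more
checks than this one.

```c
#define _POSIX_C_SOURCE 200809L

#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns true if <dev_dir>/driver exists, i.e. a driver still appears
 * to be bound.  dev_dir is assumed to be a sysfs device directory such
 * as /sys/bus/pci/devices/0000:01:00.0 (example address only). */
bool driver_is_bound(const char *dev_dir)
{
    char path[PATH_MAX];
    struct stat st;
    int n = snprintf(path, sizeof(path), "%s/driver", dev_dir);

    if (n < 0 || (size_t)n >= sizeof(path))
        return true;    /* cannot build the path: fail safe, refuse removal */

    /* lstat() so that a dangling symlink still counts as present. */
    return lstat(path, &st) == 0;
}

/* Policy check: returns 0 if removal may proceed, -1 if it is refused. */
int check_safe_to_remove(const char *dev_dir)
{
    if (driver_is_bound(dev_dir)) {
        fprintf(stderr, "%s: driver still bound, refusing removal\n",
                dev_dir);
        return -1;
    }
    return 0;
}
```

The tool would only go on to tell the kernel to remove the device when
check_safe_to_remove() returns 0.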

thanks,

greg k-h

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/




