kvm PCI assignment & VFIO ramblings

Roedel, Joerg Joerg.Roedel at amd.com
Tue Aug 23 23:14:41 EST 2011


On Mon, Aug 22, 2011 at 03:17:00PM -0400, Alex Williamson wrote:
> On Mon, 2011-08-22 at 19:25 +0200, Joerg Roedel wrote:

> > I am in favour of /dev/vfio/$GROUP. If multiple devices should be
> > assigned to a guest, there can also be an ioctl to bind a group to an
> > address-space of another group (this certainly needs some care so that
> > the two groups cannot belong to different processes).
> 
> That's an interesting idea.  Maybe an interface similar to the current
> uiommu interface, where you open() the 2nd group fd and pass the fd via
> ioctl to the primary group.  IOMMUs that don't support this would fail
> the attach device callback, which would fail the ioctl to bind them.  It
> will need to be designed so any group can be removed from the super-set
> and the remaining group(s) still works.  This feels like something that
> can be added after we get an initial implementation.

Handling it through fds is a good idea. This makes sure that everything
belongs to one process. I am not sure yet whether we should just bind
plain groups together or create meta-groups. The meta-group approach
seems somewhat cleaner, though.
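
As a rough userspace model of the meta-group variant (not kernel code; every
name below, such as struct meta_group and meta_group_add(), is hypothetical
and invented for illustration), the two requirements discussed here could
look like this: all member groups must belong to the same process, and any
group can be removed while the remaining ones stay bound.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GROUPS 8

struct meta_group;                 /* forward declaration */

/* Hypothetical model of one device group. */
struct group {
    int id;
    int owner_pid;                 /* process that holds the group fd */
    struct meta_group *meta;       /* NULL while not bound to a meta-group */
};

/* Hypothetical container sharing one address space among its groups. */
struct meta_group {
    int owner_pid;
    size_t ngroups;
    struct group *groups[MAX_GROUPS];
};

/*
 * Bind g into m. Returns 0 on success, -1 if g is already bound,
 * belongs to a different process, or m is full.
 */
int meta_group_add(struct meta_group *m, struct group *g)
{
    assert(m->ngroups <= MAX_GROUPS);
    if (g->meta != NULL || g->owner_pid != m->owner_pid ||
        m->ngroups == MAX_GROUPS)
        return -1;
    m->groups[m->ngroups++] = g;
    g->meta = m;
    return 0;
}

/*
 * Unbind g from m. The other groups stay bound and usable.
 * Returns 0 on success, -1 if g is not a member of m.
 */
int meta_group_remove(struct meta_group *m, struct group *g)
{
    for (size_t i = 0; i < m->ngroups; i++) {
        if (m->groups[i] == g) {
            /* Fill the hole with the last entry; order is irrelevant. */
            m->groups[i] = m->groups[--m->ngroups];
            m->groups[m->ngroups] = NULL;
            g->meta = NULL;
            return 0;
        }
    }
    return -1;
}
```

With plain group-to-group binding there would be a "primary" group whose
removal needs special handling; in this model no member is special, which
may be one reason the meta-group form feels cleaner.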

> > Btw, a problem we haven't fully talked about yet is
> > driver-deassignment. User space can decide to de-assign the device from
> > vfio while a fd is open on it. With PCI there is no way to let this fail
> > (the .release function returns void last time I checked). Is this a
> > problem, and if yes, how do we handle that?
> 
> The current vfio has the same problem, we can't unbind a device from
> vfio while it's attached to a guest.  I think we'd use the same solution
> too; send out a netlink packet for a device removal and have the .remove
> call sleep on a wait_event(, refcnt == 0).  We could also set a timeout
> and SIGBUS the PIDs holding the device if they don't return it
> willingly.  Thanks,

Putting the process to sleep (which would be uninterruptible) seems bad.
The process would sleep until the guest releases the device-group, which
can take days or months.
The best thing (and the most intrusive :-) ) is to change the PCI core to
allow unbinding to fail, I think. But this probably further complicates
getting VFIO upstream...
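
To make the difference concrete, here is a minimal userspace sketch of an
unbind that fails instead of sleeping. It is only a model: struct
assigned_dev, try_unbind() and dev_release_fd() are hypothetical names, and
the PCI core has no such failing path in the situation described above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a device bound to a vfio-like driver. */
struct assigned_dev {
    int open_fds;   /* user-space references still held on the device */
    int bound;      /* 1 while bound to the driver, 0 after unbind */
};

/*
 * Try to unbind the device. Instead of sleeping until the reference
 * count drops to zero, report -EBUSY to the caller right away.
 */
int try_unbind(struct assigned_dev *d)
{
    assert(d->open_fds >= 0);
    if (!d->bound)
        return -ENODEV;
    if (d->open_fds > 0)
        return -EBUSY;
    d->bound = 0;
    return 0;
}

/* User space closes one fd on the device. */
void dev_release_fd(struct assigned_dev *d)
{
    if (d->open_fds > 0)
        d->open_fds--;
}
```

The caller gets an immediate error and can retry later, so nothing blocks
for as long as the guest keeps the device.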

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


