kvm PCI assignment & VFIO ramblings
Joerg Roedel
joerg.roedel at amd.com
Tue Aug 23 03:25:08 EST 2011
On Sat, Aug 20, 2011 at 12:51:39PM -0400, Alex Williamson wrote:
> We had an extremely productive VFIO BoF on Monday. Here's my attempt to
> capture the plan that I think we agreed to:
>
> We need to address both the description and enforcement of device
> groups. Groups are formed any time the iommu does not have resolution
> between a set of devices. On x86, this typically happens when a
> PCI-to-PCI bridge exists between the set of devices and the iommu. For
> Power, partitionable endpoints define a group. Grouping information
> needs to be exposed for both userspace and kernel internal usage. This
> will be a sysfs attribute set up by the iommu drivers. Perhaps:
>
> # cat /sys/devices/pci0000:00/0000:00:19.0/iommu_group
> 42
Right, that is mainly for libvirt to provide that information to the
user in a meaningful way, so that userspace is aware that other devices
might stop working when it assigns one to a guest.
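As a rough illustration of how a management tool could consume such an
attribute, here is a minimal Python sketch. It assumes the proposed
layout, where each device directory holds an iommu_group file containing
a plain integer; the helper names are made up for this example and are
not an existing libvirt or kernel interface.

```python
import os
from collections import defaultdict


def read_iommu_group(device_dir):
    """Return the group number stored in <device_dir>/iommu_group.

    Assumes the attribute is a text file holding one integer, as in the
    proposal quoted above. Returns None if it is missing or unreadable.
    """
    path = os.path.join(device_dir, "iommu_group")
    try:
        with open(path) as attr:
            return int(attr.read().strip())
    except (OSError, ValueError):
        return None


def devices_by_group(device_dirs):
    """Map each group number to the device directories that share it."""
    groups = defaultdict(list)
    for device_dir in device_dirs:
        group = read_iommu_group(device_dir)
        if group is not None:
            groups[group].append(device_dir)
    return dict(groups)


def affected_devices(device_dir, device_dirs):
    """List the other devices in the same group as device_dir."""
    group = read_iommu_group(device_dir)
    if group is None:
        return []
    members = devices_by_group(device_dirs).get(group, [])
    return [d for d in members if d != device_dir]
```

A tool could call affected_devices() before assigning a device and warn
the user about every other device it returns.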
>
> (I use a PCI example here, but attribute should not be PCI specific)
>
> From there we have a few options. In the BoF we discussed a model where
> binding a device to vfio creates a /dev/vfio$GROUP character device
> file. This "group" fd provides dma mapping ioctls as well as
> ioctls to enumerate and return a "device" fd for each attached member of
> the group (similar to KVM_CREATE_VCPU). We enforce grouping by
> returning an error on open() of the group fd if there are members of the
> group not bound to the vfio driver. Each device fd would then support a
> similar set of ioctls and mapping (mmio/pio/config) interface as current
> vfio, except for the obvious domain and dma ioctls superseded by the
> group fd.
>
> Another valid model might be that /dev/vfio/$GROUP is created for all
> groups when the vfio module is loaded. The group fd would allow open()
> and some set of iommu querying and device enumeration ioctls, but would
> error on dma mapping and retrieving device fds until all of the group
> devices are bound to the vfio driver.
I am in favour of /dev/vfio/$GROUP. If multiple devices are to be
assigned to a guest, there can also be an ioctl to bind a group to the
address space of another group (this certainly needs some care so that
the two groups cannot belong to different processes).
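To make the intended semantics concrete, here is a toy Python model of
the /dev/vfio/$GROUP variant. It is only a sketch of the rules discussed
above, not kernel code: the class and method names are hypothetical, and
treating the first process to open a group as its owner is an assumption
of this example.

```python
class GroupNotViable(Exception):
    """Raised while some group members are not yet bound to vfio."""


class AddressSpace:
    """Stand-in for an iommu domain: a table of iova -> size mappings."""

    def __init__(self):
        self.mappings = {}


class Group:
    def __init__(self, group_id, members):
        self.group_id = group_id
        self.members = set(members)
        self.bound = set()           # members currently bound to vfio
        self.owner = None            # pid of the process that opened us
        self.space = AddressSpace()  # each group starts with its own

    def bind(self, device):
        if device not in self.members:
            raise KeyError(device)
        self.bound.add(device)

    def viable(self):
        # Usable only once every member is bound to the vfio driver.
        return self.bound == self.members

    def open(self, pid):
        # open() works even on a non-viable group, but only for one owner.
        if self.owner not in (None, pid):
            raise PermissionError("group is owned by another process")
        self.owner = pid

    def map_dma(self, iova, size):
        if not self.viable():
            raise GroupNotViable(self.group_id)
        self.space.mappings[iova] = size

    def attach_to(self, other):
        """Share the address space of another group (same owner only)."""
        if not (self.viable() and other.viable()):
            raise GroupNotViable(self.group_id)
        if self.owner is None or self.owner != other.owner:
            raise PermissionError("groups belong to different processes")
        self.space = other.space
```

In this model a guest with devices from two groups would open both
groups from the same process and call attach_to() once, so that a single
set of dma mappings covers all assigned devices.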
Btw, a problem we haven't fully talked about yet is
driver de-assignment. User space can decide to de-assign the device from
vfio while an fd is open on it. With PCI there is no way to let this fail
(the .release function returns void, last time I checked). Is this a
problem, and if yes, how do we handle that?
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632