kvm PCI assignment & VFIO ramblings

Alexander Graf agraf at suse.de
Wed Aug 24 13:36:59 EST 2011


On 23.08.2011, at 18:41, Benjamin Herrenschmidt wrote:

> On Tue, 2011-08-23 at 10:23 -0600, Alex Williamson wrote:
>> 
>> Yeah.  Joerg's idea of binding groups internally (pass the fd of one
>> group to another via ioctl) is one option.  The tricky part will be
>> implementing it to support hot unplug of any group from the
>> supergroup.
>> I believe Ben had a suggestion that supergroups could be created in
>> sysfs, but I don't know what the mechanism to do that looks like.  It
>> would also be an extra management step to dynamically bind and unbind
>> groups to the supergroup around hotplug.  Thanks, 
> 
> I don't really care that much what the method for creating them is, to
> be honest, I just prefer this concept of "meta groups" or "super groups"
> or "synthetic groups" (whatever you want to name them) to having a
> separate uiommu file descriptor.
> 
> The one reason I have a slight preference for creating them "statically"
> using some kind of separate interface (again, I don't care whether it's
> sysfs, netlink, etc...) is that it means things like qemu don't have to
> care about them.
> 
> In general, apps that want to use vfio can just get passed the path to
> such a group or the /dev/ path or the group number (whatever we choose as
> the way to identify a group), and don't need to know anything about
> "super groups", how to manipulate them, create them, possible
> constraints etc...
> 
> Now, libvirt might want to know about that other API in order to provide
> control on the creation of these things, but that's a different issue.
> 
> By "static" I mean they persist, they aren't tied to the lifetime of an
> fd.
> 
> Now that's purely a preference on my side because I believe it will make
> life easier for actual programs wanting to use vfio to not have to care
> about those super-groups, but as I said earlier, I don't actually care
> that much :-)

Oh, I think it's one of the building blocks we need for a sane user space device exposure API. If I want to pass user X a few devices that are all behind a single IOMMU, I can just chown the group's device node to user X and be done with it.
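
As a rough illustration of that chown step, here is a minimal sketch in Python. It assumes the group is exposed as a single device node; the function name and the example path in the comment are hypothetical, since the thread has not settled on how groups will be named or where they will live.

```python
import os


def grant_node_to_uid(node_path, uid):
    """Hand ownership of a device node to the given uid.

    node_path is a hypothetical group device node; the real naming
    scheme is not decided in this thread.
    """
    # A gid of -1 tells chown to leave the owning group unchanged.
    os.chown(node_path, uid, -1)
    # Return the resulting owner so the caller can verify the change.
    return os.stat(node_path).st_uid


# Hypothetical usage by an administrator (requires sufficient privileges):
#   grant_node_to_uid("/dev/vfio/42", 1000)
```

The point of the sketch is that this is ordinary system configuration: whoever runs it needs privileges over the node, while the program that later opens the node needs none beyond its own uid.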

The user space tool actually using the VFIO interface wouldn't be in the configuration business then, and it really shouldn't be. That's what system configuration is there for :).

But I'm fairly sure we managed to persuade Alex at the BoF that this is the right path :)


Alex



More information about the Linuxppc-dev mailing list