kvm PCI assignment & VFIO ramblings

Chris Wright chrisw at sous-sol.org
Sat Aug 27 07:06:19 EST 2011


* Aaron Fabbri (aafabbri at cisco.com) wrote:
> On 8/26/11 12:35 PM, "Chris Wright" <chrisw at sous-sol.org> wrote:
> > * Aaron Fabbri (aafabbri at cisco.com) wrote:
> >> Each process will open vfio devices on the fly, and they need to be able to
> >> share IOMMU resources.
> > 
> > How do you share IOMMU resources w/ multiple processes, are the processes
> > sharing memory?
> 
> Sorry, bad wording.  I share IOMMU domains *within* each process.

Ah, got it.  Thanks.

> E.g. If one process has 3 devices and another has 10, I can get by with two
> iommu domains (and can share buffers among devices within each process).
> 
> If I ever need to share devices across processes, the shared memory case
> might be interesting.
> 
> > 
> >> So I need the ability to dynamically bring up devices and assign them to a
> >> group.  The number of actual devices and how they map to iommu domains is
> >> not known ahead of time.  We have a single piece of silicon that can expose
> >> hundreds of pci devices.
> > 
> > This does not seem fundamentally different from the KVM use case.
> > 
> > We have 2 kinds of groupings.
> > 
> > 1) low-level system or topology grouping
> > 
> >    Some may have multiple devices in a single group
> > 
> >    * the PCIe-PCI bridge example
> >    * the POWER partitionable endpoint
> > 
> >    Many will not
> > 
> >    * singleton group, e.g. typical x86 PCIe function (majority of
> >      assigned devices)
> > 
> >    Not sure it makes sense to have these administratively defined as
> >    opposed to system defined.
> > 
> > 2) logical grouping
> > 
> >    * multiple low-level groups (singleton or otherwise) attached to same
> >      process, allowing things like single set of io page tables where
> >      applicable.
> > 
> >    These are nominally administratively defined.  In the KVM case, there
> >    is likely a privileged task (i.e. libvirtd) involved w/ making the
> >    device available to the guest and can do things like group merging.
> >    In your userspace case, perhaps it should be directly exposed.
> 
> Yes.  In essence, I'd rather not have to run any other admin processes.
> Doing things programmatically, on the fly, from each process, is the
> cleanest model right now.
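
For concreteness, the two kinds of grouping quoted above could be modelled
roughly as below.  This is only a toy sketch in Python, not kernel or VFIO
code: SystemGroup, Domain, singleton(), build_domain(), the pids and the PCI
addresses are all made-up names for illustration.  It reuses the "3 devices
in one process, 10 in another" example, which ends up with two domains.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SystemGroup:
    """Low-level group (hypothetical model): membership is fixed by topology."""
    name: str
    devices: frozenset


@dataclass
class Domain:
    """Logical grouping: system groups in one process sharing IO page tables."""
    owner_pid: int
    groups: list = field(default_factory=list)

    def attach(self, group):
        # Attaching a whole group never changes the group's own membership.
        self.groups.append(group)

    def devices(self):
        return sorted(d for g in self.groups for d in g.devices)


def singleton(addr):
    """Typical x86 PCIe function: a group containing exactly one device."""
    return SystemGroup(name=addr, devices=frozenset([addr]))


def build_domain(pid, addrs):
    """Put every singleton group used by one process into a single domain."""
    dom = Domain(owner_pid=pid)
    for addr in addrs:
        dom.attach(singleton(addr))
    return dom


# One process with 3 devices, another with 10: two domains in total.
proc_a = build_domain(100, [f"0000:01:{i:02x}.0" for i in range(3)])
proc_b = build_domain(200, [f"0000:02:{i:02x}.0" for i in range(10)])
domains = [proc_a, proc_b]
```

The frozen dataclass stands in for "system defined": a process can choose
which groups go into its domain, but it has no way to edit what is inside a
group.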

I don't see an issue w/ this.  As long as it cannot add devices to the
system defined groups, it's not a privileged operation.  So we still
need the iommu domain concept exposed in some form to logically put
groups into a single iommu domain (if desired).  In fact, I believe Alex
covered this in his most recent recap:

  ...The group fd will provide interfaces for enumerating the devices
  in the group, returning a file descriptor for each device in the group
  (the "device fd"), binding groups together, and returning a file
  descriptor for iommu operations (the "iommu fd").
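
To make the shape of that interface a bit more tangible, here is a rough
userspace mock in Python.  It is not the VFIO API: Group, IommuContext and
every method name are hypothetical, and the integer handles merely stand in
for file descriptors.  It only mirrors the four operations listed in the
recap (enumerate devices, get a device handle, bind groups together, get the
iommu handle), and it deliberately offers no way to add a device to a group.

```python
import itertools

# Stand-in for kernel-allocated file descriptors (assumption: plain integers).
_next_handle = itertools.count(3)


class IommuContext:
    """Hypothetical 'iommu fd': one set of mappings shared by bound groups."""

    def __init__(self):
        self.handle = next(_next_handle)
        self.groups = []
        self.mappings = {}  # iova -> (vaddr, size)

    def map(self, iova, vaddr, size):
        self.mappings[iova] = (vaddr, size)


class Group:
    """Hypothetical 'group fd': device membership is fixed at creation."""

    def __init__(self, devices):
        self._devices = tuple(devices)
        self._device_handles = {}
        self._iommu = IommuContext()
        self._iommu.groups.append(self)

    def enumerate_devices(self):
        return list(self._devices)

    def get_device_handle(self, dev):
        if dev not in self._devices:
            raise KeyError(dev)
        if dev not in self._device_handles:
            self._device_handles[dev] = next(_next_handle)
        return self._device_handles[dev]

    def bind(self, other):
        """Move 'other' (and anything already bound to it) into our context."""
        target, source = self._iommu, other._iommu
        if source is target:
            return
        # Toy simplification: mappings held by the absorbed context are dropped.
        for g in source.groups:
            g._iommu = target
            target.groups.append(g)

    def get_iommu(self):
        return self._iommu


nic = Group(["0000:01:00.0"])
bridge = Group(["0000:03:00.0", "0000:03:01.0"])
nic.bind(bridge)                            # two groups, one iommu context
nic.get_iommu().map(0x1000, 0x7F0000, 4096)
```

After the bind, a buffer mapped through either group's iommu handle is
visible to the devices of both, while each group still reports only its own
devices.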

thanks,
-chris


More information about the Linuxppc-dev mailing list