kvm PCI assignment & VFIO ramblings

David Gibson dwg at au1.ibm.com
Mon Aug 1 12:48:46 EST 2011


On Sat, Jul 30, 2011 at 09:58:53AM +1000, Benjamin Herrenschmidt wrote:
[snip]
> That current hack won't work well if two devices share an iommu. Note
> that we have an additional constraint here due to our paravirt
> interfaces (specificed in PAPR) which is that PE domains must have a
> common parent. Basically, pHyp makes them look like a PCIe host bridge
> per domain in the guest. I think that's a pretty good idea and qemu
> might want to do the same.
> 
> - We hack out the currently unconditional mapping of the entire guest
> space in the iommu. Something will have to be done to "decide" whether
> to do that or not ... qemu argument -> ioctl ?

Not quite.  We already require the not-yet-upstream patches which add
guest-side (emulated) IOMMU support to qemu.  The approach we're using
for the passthrough (or at least will when I fix up my patches again)
is that we only map all guest ram into the vfio iommu if and only if
there is no guest visible iommu advertised in the qdev.

This kind of makes sense - if there is no iommu from the guest
perspective, the guest will expect to see all its physical memory 1:1
in DMA.

The hacky bit is that when there *is* a guest visible iommu, it's
assumed that whatever interface the guest iommu uses is somehow wired
up to vfio map/unmap calls.  For us at the moment, this means
passthrough devices for us must be assigned to a special (guest) pci
domain which sets up a suitable wires up the paravirt iommu to the vfio iommu.

In theory under some circumstances, with full emu, you could wire up
an emulated guest iommu interface to a different host iommu
implementation via this mechanism.  However that wouldn't work if the
guest and host iommus capabilities are too different, and in any case
would require considerable extra abstraction work on the qemu guest
iommu code.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


More information about the Linuxppc-dev mailing list