To extend the feature of vfio-mdev

Bob Liu liubo95 at
Mon Oct 23 12:52:19 AEDT 2017

On 2017/10/21 0:36, Alex Williamson wrote:
> On Fri, 20 Oct 2017 13:04:43 +0800
> Kenneth Lee <liguozhu at> wrote:
>> On Thu, Oct 19, 2017 at 12:56:04PM -0600, Alex Williamson wrote:
>>> Date: Thu, 19 Oct 2017 12:56:04 -0600
>>> From: Alex Williamson <alex.williamson at>
>>> To: Kenneth Lee <liguozhu at>
>>> CC: Jon Masters <jcm at>, Jon Masters <jcm at>,
>>>  Jonathan Cameron <jonathan.cameron at>, liubo95 at,
>>>  xuzaibo at
>>> Subject: Re: To extend the feature of vfio-mdev
>>> Message-ID: <20171019125604.26577eda at t450s.home>
>>> Hi Kenneth,
>>> On Thu, 19 Oct 2017 12:13:46 +0800
>>> Kenneth Lee <liguozhu at> wrote:
>>>> Dear Alex,
>>>> I hope this mail finding you well. This is to discuss the possibility to
>>>> extend the vfio-mdev feature to form a general accelerator framework for
>>>> Linux. I name the framework as "WrapDrive".
>>>> I made a presentation on Linaro Connect SFO17 (ref:
>>>> ), and discussed it
>>>> with Jon Masters. He said he can connect us for further cooperation.
>>>> The idea of WrapDrive is to create an mdev for every user application so
>>>> they can share the same PF or VF facility. This is important for
>>>> accelerators, because in most cases we cannot create a VF for every
>>>> process.
>>>> WrapDrive needs to add the following features on top of vfio and vfio-mdev:
>>>> 1. Define a unified ABI in sysfs so that the same type of
>>>>    accelerator/algorithm can be managed from user space  
>>> We already have a defined, standard mdev interface where vendor drivers
>>> can add additional attributes.  If WrapDrive is a wrapper around
>>> vfio-mdev, can't it define standard attributes w/o vfio changes?  
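[For context, the vendor-attribute path Alex refers to might look roughly like the following. This is only a sketch against the ~4.10-era mdev API in linux/mdev.h; the attribute name "algorithm" and its value are hypothetical, not an existing ABI.]

```c
/* Sketch: a vendor-defined mdev type attribute added on top of the
 * standard mdev sysfs interface (linux/mdev.h, ~4.10-era API).
 * "algorithm" and the value "rsa" are hypothetical examples. */
static ssize_t algorithm_show(struct kobject *kobj, struct device *dev,
			      char *buf)
{
	return sprintf(buf, "rsa\n");
}
static MDEV_TYPE_ATTR_RO(algorithm);

static struct attribute *wd_type_attrs[] = {
	&mdev_type_attr_algorithm.attr,
	NULL,
};
```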
>> Yes. We just define the necessary attributes so that applications with the
>> same requirements can treat them as a whole.
>>>> 2. Let the mdev use the parent dev's iommu facility  
>>> What prevents you from doing this now?  The mdev vendor driver is
>>> entirely responsible for managing the DMA of each mdev device.  Mdev
>>> vGPUs use the GTT of the parent device to do this today, vfio only
>>> tracks user mappings and provides pinned pages to the vendor driver on
>>> request.  IOW, this sounds like something within the scope of the
>>> vendor driver, not the vfio-mdev core.  
>> I'm sorry, I don't know much about how i915 works. But according to the
>> implementation of vfio_iommu_type1_attach_group(), the mdev's iommu_group
>> is added to the external_domain list, while vfio_iommu_map() calls
>> iommu_map() only for the domains on the domain_list. Therefore, an
>> ioctl(VFIO_IOMMU_MAP_DMA) against the mdev's iommu_group won't do anything.
>> What is the mdev vendor driver expected to do? Should it register with the
>> notification chain or adopt another interface? Is this intended by the mdev
>> design? I think it may be necessary to provide some standard way by default.
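[For reference, the user-space call under discussion is sketched below. For a directly assigned device, type1 pins the pages and programs the IOMMU; for an mdev on the external_domain list it only records the mapping so the vendor driver can request pinning later. The function name wd_map_dma is made up for illustration.]

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Sketch: issue VFIO_IOMMU_MAP_DMA on an open type1 container fd,
 * asking vfio to make [buf, buf+sz) reachable by the device at iova. */
static int wd_map_dma(int container_fd, void *buf, size_t sz, uint64_t iova)
{
	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = (uintptr_t)buf,  /* process virtual address */
		.iova  = iova,            /* device address */
		.size  = sz,
	};

	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```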
> This is the "mediation" of a mediated driver: it needs to be aware of
> any DMA that the device might perform within the user address space and
> request pinning of those pages through the mdev interface.
> Additionally, when an IOMMU is active on the host, it's the mdev vendor
> driver's responsibility to setup any necessary IOMMU mappings for the
> mdev.  The mdev device works within the IOMMU context of the parent
> device.  There is no magic "map everything" option with mdev as there is
> for IOMMU isolated devices.  Part of the idea of mdev is that isolation
> can be provided by device specific means, such as GTTs for vGPUs.  We
> currently have only an invalidation notifier such that vendor drivers
> can invalidate pinned mappings when unmapped by the user, the mapping
> path presumes device mediation to explicitly request page pinning based
> on device configuration.
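[The mediation flow Alex describes, with the vendor driver requesting pins and invalidating them on user unmap, uses the ~4.10-era vfio helpers roughly as follows. This is only a sketch: the function names with a wd_ prefix are placeholders, the callback would be registered through vfio_register_notifier(), and only one page is pinned for brevity.]

```c
#include <linux/vfio.h>
#include <linux/notifier.h>
#include <linux/iommu.h>

/* Sketch: invalidation callback for VFIO_IOMMU_NOTIFY_DMA_UNMAP.  When
 * the user unmaps a range, the vendor driver must unpin any pages it
 * pinned inside [iova, iova + size). */
static int wd_dma_unmap_cb(struct notifier_block *nb,
			   unsigned long action, void *data)
{
	if (action == VFIO_IOMMU_NOTIFY_DMA_UNMAP) {
		struct vfio_iommu_type1_dma_unmap *unmap = data;
		/* vfio_unpin_pages() the affected range here */
		(void)unmap;
	}
	return NOTIFY_OK;
}

/* Sketch: when the mdev is programmed with an IOVA, ask vfio to pin the
 * backing user page; the vendor driver then installs phys_pfn in the
 * parent device's own translation (e.g. a GTT for vGPUs). */
static int wd_pin_iova(struct device *mdev_dev, unsigned long iova_pfn)
{
	unsigned long phys_pfn;

	return vfio_pin_pages(mdev_dev, &iova_pfn, 1,
			      IOMMU_READ | IOMMU_WRITE, &phys_pfn);
}
```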
>>>> 3. Let iommu driver accept more than one iommu_domain for the same
>>>>    device. The substream id or pasid should be support for that  
>>> You're really extending the definition of an iommu_domain to include
>>> PASID to do this, I don't think it makes sense in the general case.  So
>>> perhaps you're talking about a PASID management layer sitting on top of
>>> an iommu_domain.  AIUI for PCIe, a device has a requester ID which is
>>> used to find the context entry for that device.  The IOMMU may support
>>> PASID, which would cause a first level lookup via those set of page
>>> tables, or it might only support second level translation.  The
>>> iommu_domain is a reflection of that initial, single requester ID.  
>> Maybe I misunderstand this. But IOMMU hardware, such as the ARM SMMU,
>> supports multiple page tables, each referenced by something like an ASID.
>> If we are to support this in Linux, iommu_domain seems the best choice (no
>> matter whether you call it a cookie, an id, or something else). Otherwise,
>> where would you get an object to refer to it?
> For PASID, a PASID is unique only within the requester ID.  I don't
> know of anything equivalent to your ASID within PCIe.
>>>> 4. Support SVM in vfio and iommu  
>>> There are numerous discussions about this ongoing.  
>> Yes. I just said we needed the support.
> It seems like this is the crux of your design if you're looking for
> IOMMU based isolation based on PASID with dynamic mapping of the
> process address space.  There was quite a lot of discussion about this
> at the PCI/IOMMU/VFIO uconf at LPC this year and the details of the
> IOMMU API interfaces are currently being developed.  This has
> implications for both directly assigned vfio devices as well as the
> potential to further enhance vfio-mdev such that DMA isolation and
> mapping might be managed in a common way while the vendor driver
> manages only the partitioning of the device.
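[As a concrete point of reference for the ongoing SVM discussion Alex mentions: at the time of this thread the only in-tree interface for binding a process address space to a PASID is the Intel-specific one in linux/intel-svm.h (present since v4.4). The sketch below shows the shape of that call; the wd_bind_process wrapper is hypothetical, and a vendor-neutral equivalent is exactly what the IOMMU API work aims to produce.]

```c
#include <linux/intel-svm.h>

/* Sketch: bind the current process's address space to a PASID using the
 * Intel-specific SVM API.  On success, *pasid holds the PASID the device
 * must attach to its DMA transactions for first-level translation. */
static int wd_bind_process(struct device *parent_dev, int *pasid)
{
	return intel_svm_bind_mm(parent_dev, pasid, 0, NULL);
}
```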

Yes, actually when vfio-mdev was first introduced the PASID feature was not
yet widespread, so we had to use a very complex vendor driver to support
multiple virtual devices on top of a single physical device.

But now things have changed and PASID is becoming common.
With PASID, you can think of the device (identified by a PASID number) as a
normal physical device that can make full use of the IOMMU.

In theory, such a device could use vfio-pci or vfio-platform.
But vfio-mdev provides a dynamic creation interface which is very useful and
reusable.

Enhancing vfio-mdev such that DMA isolation and mapping can be managed in a
common way would be very important.


More information about the Linux-accelerators mailing list