To extend the feature of vfio-mdev
Bob Liu
liubo95 at huawei.com
Mon Oct 23 12:52:19 AEDT 2017
On 2017/10/21 0:36, Alex Williamson wrote:
> On Fri, 20 Oct 2017 13:04:43 +0800
> Kenneth Lee <liguozhu at hisilicon.com> wrote:
>
>> On Thu, Oct 19, 2017 at 12:56:04PM -0600, Alex Williamson wrote:
>>> Date: Thu, 19 Oct 2017 12:56:04 -0600
>>> From: Alex Williamson <alex.williamson at redhat.com>
>>> To: Kenneth Lee <liguozhu at hisilicon.com>
>>> CC: Jon Masters <jcm at jonmasters.org>, Jon Masters <jcm at redhat.com>,
>>> Jonathan Cameron <jonathan.cameron at huawei.com>, liubo95 at huawei.com,
>>> xuzaibo at huawei.com
>>> Subject: Re: To extend the feature of vfio-mdev
>>> Message-ID: <20171019125604.26577eda at t450s.home>
>>>
>>>
>>> Hi Kenneth,
>>>
>>> On Thu, 19 Oct 2017 12:13:46 +0800
>>> Kenneth Lee <liguozhu at hisilicon.com> wrote:
>>>
>>>> Dear Alex,
>>>>
>>>> I hope this mail finds you well. This is to discuss the possibility of
>>>> extending the vfio-mdev feature to form a general accelerator framework
>>>> for Linux. I call the framework "WrapDrive".
>>>>
>>>> I gave a presentation at Linaro Connect SFO17 (ref:
>>>> http://connect.linaro.org/resource/sfo17/sfo17-317/), and discussed it
>>>> with Jon Masters. He said he could connect us for further cooperation.
>>>>
>>>> The idea of WrapDrive is to create a mdev for every user application so
>>>> they can share the same PF or VF facility. This is important to
>>>> accelerators, because we cannot create a VF for every process in most
>>>> cases.
>>>>
>>>> WrapDrive needs to add the following features on top of vfio and
>>>> vfio-mdev:
>>>>
>>>> 1. Set a unified ABI in sysfs so the same type of
>>>> accelerator/algorithm can be managed from user space
>>>
>>> We already have a defined, standard mdev interface where vendor drivers
>>> can add additional attributes. If warpdrive is a wrapper around
>>> vfio-mdev, can't it define standard attributes w/o vfio changes?
>>
>> Yes. We just define necessary attributes so the application with same
>> requirements can take it as a whole.
>>
>>>
>>>> 2. Let the mdev use the parent dev's iommu facility
>>>
>>> What prevents you from doing this now? The mdev vendor driver is
>>> entirely responsible for managing the DMA of each mdev device. Mdev
>>> vGPUs use the GTT of the parent device to do this today, vfio only
>>> tracks user mappings and provides pinned pages to the vendor driver on
>>> request. IOW, this sounds like something within the scope of the
>>> vendor driver, not the vfio-mdev core.
>>
>> I'm sorry, I don't know much about how i915 works. But according to the
>> implementation of vfio_iommu_type1_attach_group, the mdev's iommu_group is
>> added to the external_domain list, while vfio_iommu_map() calls iommu_map()
>> only for the domain list.
>>
>> Therefore, if ioctl(VFIO_IOMMU_MAP_DMA) is issued on the mdev's iommu_group,
>> it won't do anything. What is the mdev vendor driver expected to do? Should
>> it register on the notification chain or adopt another interface to do so?
>> Is this intended by the mdev driver? I think it may be necessary to provide
>> some standard way by default.
>
> This is the "mediation" of a mediated driver, it needs to be aware of
> any DMA that the device might perform within the user address space and
> request pinning of those pages through the mdev interface.
> Additionally, when an IOMMU is active on the host, it's the mdev vendor
> driver's responsibility to setup any necessary IOMMU mappings for the
> mdev. The mdev device works within the IOMMU context of the parent
> device. There is no magic "map everything" option with mdev as there is
> for IOMMU isolated devices. Part of the idea of mdev is that isolation
> can be provided by device specific means, such as GTTs for vGPUs. We
> currently have only an invalidation notifier such that vendor drivers
> can invalidate pinned mappings when unmapped by the user, the mapping
> path presumes device mediation to explicitly request page pinning based
> on device configuration.
>
>>>> 3. Let the iommu driver accept more than one iommu_domain for the same
>>>> device. The substream ID or PASID should be supported for that
>>>
>>> You're really extending the definition of an iommu_domain to include
>>> PASID to do this, I don't think it makes sense in the general case. So
>>> perhaps you're talking about a PASID management layer sitting on top of
>>> an iommu_domain. AIUI for PCIe, a device has a requester ID which is
>>> used to find the context entry for that device. The IOMMU may support
>>> PASID, which would cause a first level lookup via those set of page
>>> tables, or it might only support second level translation. The
>>> iommu_domain is a reflection of that initial, single requester ID.
>>
>> Maybe I misunderstand this. But IOMMU hardware, such as the SMMU on ARM,
>> supports multiple page tables, each referred to by something like an ASID.
>> If we are to support it in Linux, iommu_domain should be the best choice (no
>> matter whether you call it a cookie, an id, or something else). Otherwise,
>> where can you get an object referring to it?
>
> For PASID, a PASID is unique only within the requester ID. I don't
> know of anything equivalent to your ASID within PCIe.
>
>>>> 4. Support SVM in vfio and iommu
>>>
>>> There are numerous discussions about this ongoing.
>>
>> Yes. I just said we needed the support.
>
> It seems like this is the crux of your design if you're looking for
> IOMMU based isolation based on PASID with dynamic mapping of the
> process address space. There was quite a lot of discussion about this
> at the PCI/IOMMU/VFIO uconf at LPC this year and the details of the
> IOMMU API interfaces are currently being developed. This has
> implications for both directly assigned vfio devices as well as the
> potential to further enhance vfio-mdev such that DMA isolation and
> mapping might be managed in a common way while the vendor driver
> manages only the partitioning of the device.
>
Yes. When vfio-mdev was first introduced, PASID support was not common,
so we had to rely on a very complex vendor driver to support multiple virtual
devices on top of a single physical device.
Things have changed now: PASID is becoming common.
With PASID, a device (identified by a PASID number) can be treated as a normal
physical device that has full use of the IOMMU.
In theory, we could use vfio-pci or vfio-platform for such a device.
But vfio-mdev provides a dynamic create interface which is very useful and
reusable. Enhancing vfio-mdev so that DMA isolation and mapping can be managed
in a common way would be very important.
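
To make the "dynamic create interface" concrete, here is a minimal sketch of
how user space creates an mdev today: it writes a UUID to the "create"
attribute of a supported type under the parent device in sysfs. The helper
names (mdev_create_path, create_mdev) and the "root" parameter are my own,
added only for illustration; the parent device and type names in the usage
note are placeholders, not real WrapDrive types.

```python
import os
import uuid

# Standard location where mdev-capable parent devices are listed.
MDEV_BUS = "/sys/class/mdev_bus"


def mdev_create_path(parent, type_id, root=MDEV_BUS):
    """Build the sysfs path of the 'create' attribute for one mdev type.

    'root' is a hypothetical parameter so the sketch can be pointed at a
    scratch directory instead of the real sysfs tree.
    """
    return os.path.join(root, parent, "mdev_supported_types", type_id, "create")


def create_mdev(parent, type_id, dev_uuid=None, root=MDEV_BUS):
    """Ask the parent's vendor driver to create an mdev by writing a UUID.

    Returns the UUID string that was written. On a real system this needs
    root privileges and a parent that registered with the mdev core.
    """
    dev_uuid = str(dev_uuid or uuid.uuid4())
    with open(mdev_create_path(parent, type_id, root), "w") as f:
        f.write(dev_uuid)
    return dev_uuid
```

For example, create_mdev("0000:00:02.0", "some-vendor-type") would create one
mdev on that (placeholder) parent; a per-process accelerator context as
discussed in this thread would map to one such call per application.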
--
Thanks,
Liubo