To extend the feature of vfio-mdev
liguozhu at hisilicon.com
Tue Oct 24 20:42:59 AEDT 2017
On Mon, Oct 23, 2017 at 01:03:37PM +0100, Jean-Philippe Brucker wrote:
> On 23/10/17 11:37, Kenneth Lee wrote:
> > On Mon, Oct 23, 2017 at 09:31:25AM +0100, Jean-Philippe Brucker wrote:
> >> On 23/10/17 06:18, Kenneth Lee wrote:
> >>>>>>> 3. Let the iommu driver accept more than one iommu_domain for the same
> >>>>>>> device. The substream id or pasid should be supported for that
> >>>>>> You're really extending the definition of an iommu_domain to include
> >>>>>> PASID to do this, I don't think it makes sense in the general case. So
> >>>>>> perhaps you're talking about a PASID management layer sitting on top of
> >>>>>> an iommu_domain. AIUI for PCIe, a device has a requester ID which is
> >>>>>> used to find the context entry for that device. The IOMMU may support
> >>>>>> PASID, which would cause a first level lookup via that set of page
> >>>>>> tables, or it might only support second level translation. The
> >>>>>> iommu_domain is a reflection of that initial, single requester ID.
> >>>>> Maybe I misunderstand this. But IOMMU hardware, such as the SMMU for ARM,
> >>>>> supports multiple page tables, each referred to by something like an ASID. If
> >>>>> we are to support that in Linux, iommu_domain should be the best choice (no
> >>>>> matter whether you call it a cookie, an id or something else). Otherwise,
> >>>>> where can you get an object referring to it?
> >>>> For PASID, a PASID is unique only within the requester ID. I don't
> >>>> know of anything equivalent to your ASID within PCIe.
> >>> As I understand the ARM spec:
> >>> PCIE[PASID]=ARM_SMMU[ASID]=ARM_SMMU[SubstreamID],
> >>> PCIE[RequestID]=ARM_SMMU[StreamID]
> >> In the SMMUv3 spec, the PASID is dissociated from the ASID. We set the
> >> ASID in the context descriptor indexed by PASID, which provides more
> >> flexibility when the CPU's ASID capacity and endpoint's PASID capacity differ.
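The relationship described above (the StreamID selects a device, the
SubstreamID/PASID selects a context descriptor, and the ASID is just a field
inside that descriptor) can be sketched in C. This is only a minimal
illustration, not SMMUv3 driver code: every struct, function and table size
below is hypothetical, and real stream tables and context descriptors have a
different in-memory layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_STREAMS    4   /* illustrative sizes, not hardware limits */
#define NUM_SUBSTREAMS 8

/* Hypothetical context descriptor: one per address space. The ASID is a
 * field of the descriptor, so it need not equal the PASID that indexes it. */
struct ctx_desc {
	int      valid;
	uint16_t asid;   /* TLB tag, chosen independently of the PASID */
	uint64_t ttb;    /* base address of the page table */
};

/* Hypothetical stream table entry: one per StreamID (PCIe requester ID). */
struct stream_entry {
	struct ctx_desc *cd_table;   /* indexed by SubstreamID (PASID) */
	size_t           num_cds;
};

static struct stream_entry stream_table[NUM_STREAMS];

/* Install a context descriptor table for one stream. */
static void attach_cd_table(uint32_t sid, struct ctx_desc *cds, size_t n)
{
	assert(sid < NUM_STREAMS && cds != NULL);
	stream_table[sid].cd_table = cds;
	stream_table[sid].num_cds = n;
}

/* Two-step lookup: StreamID -> CD table, SubstreamID -> descriptor.
 * Returns NULL if either index is out of range or the entry is invalid. */
static const struct ctx_desc *lookup_cd(uint32_t sid, uint32_t ssid)
{
	if (sid >= NUM_STREAMS)
		return NULL;

	const struct stream_entry *ste = &stream_table[sid];
	if (ste->cd_table == NULL || ssid >= ste->num_cds)
		return NULL;

	const struct ctx_desc *cd = &ste->cd_table[ssid];
	return cd->valid ? cd : NULL;
}
```

In this model a descriptor at PASID 1 can carry ASID 42 and one at PASID 2 can
carry ASID 7; the two number spaces are sized and allocated separately.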
> > Yes. Sorry, I tried to simplify the concept, but that made it slightly wrong.
> That's all right :) I'm still not sure if we should implement ASID=PASID
> for arm. I think it would be overly complicated to implement, because of
> the capacity difference and because it prevents users from allocating and
> controlling PASID contexts on their own with map/unmap, without attaching
> to a Linux process (some users want this).
I read your RFCv2 and saw that you had wrapped it into iommu_process. I think it
is a good bet. The only risk is that the device PASID need not refer to a CPU
> >>> For the ARM SMMU, the Stream ID is used to index a Context Descriptor
> >>> Table, while the Sub-stream ID is used to index a Descriptor which refers
> >>> to a general page table in the same format as the MMU's.
> >>> So RequestID and PASID together uniquely identify an address space. Then
> >>> every iommu-enabled device can serve more than one user process at the
> >>> same time.
> >>> So the IOMMU should support more than one page table, which in turn
> >>> should be added somewhere in Linux. If the iommu_domain refers to one
> >>> address space, the iommu driver should accept more than one
> >>> iommu_domain.
> >> The current design choice is to have multiple address spaces per
> >> iommu_domain (and one PASID table per domain). It fits better with
> >> existing IOMMU and VFIO APIs.
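A rough C model of "multiple address spaces per domain, one PASID table per
domain" might look like the following. All names here (struct domain,
domain_bind and so on) are invented for illustration and are not the API of
the patchset or of the Linux IOMMU core; reserving PASID 0 for traffic
without a PASID is likewise an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PASID 16   /* illustrative table size */

/* Stand-in for one address space (for example, a process page table). */
struct addr_space {
	uint64_t pgd;
};

/* Hypothetical domain: a single PASID table holding several address spaces. */
struct domain {
	struct addr_space *pasid_table[MAX_PASID];   /* NULL means free */
};

/* Bind an address space to the domain and return the allocated PASID,
 * or -1 if the table is full. PASID 0 is kept free (sketch assumption). */
static int domain_bind(struct domain *d, struct addr_space *as)
{
	assert(d != NULL && as != NULL);
	for (int pasid = 1; pasid < MAX_PASID; pasid++) {
		if (d->pasid_table[pasid] == NULL) {
			d->pasid_table[pasid] = as;
			return pasid;
		}
	}
	return -1;
}

/* Release a PASID. Returns 0 on success, -1 if it was not bound. */
static int domain_unbind(struct domain *d, int pasid)
{
	if (pasid <= 0 || pasid >= MAX_PASID || d->pasid_table[pasid] == NULL)
		return -1;
	d->pasid_table[pasid] = NULL;
	return 0;
}

/* Resolve a PASID to its address space, or NULL if unbound. */
static struct addr_space *domain_lookup(const struct domain *d, int pasid)
{
	if (pasid <= 0 || pasid >= MAX_PASID)
		return NULL;
	return d->pasid_table[pasid];
}
```

With this shape the domain stays tied to the device (the requester ID), and
the PASID returned by the bind step is what the caller would then have to
program into the device.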
> > I am still reading the patchset. But if the PASID is used to index the address
> > space table in the iommu_domain, it will become a general concept in the whole
> > system. If we instead bound the address space to the iommu_domain, the
> > iommu_domain would become a handle for any process to control the device DMA.
> > It would be easier for one process with two or more domains to control the
> > device for different purposes. It would also be easier for a process to hand
> > over a domain to another without changing anything. And the process would not
> > need to know anything about the PASID.
> Unfortunately this cannot work with vfio-pci, where one userspace process
> owns the whole function. Userspace has to obtain the PASID from the kernel
> allocator and somehow program it into the device, with an
> implementation-defined method.
> > A domain is a handle to control the device separately. I think this makes
> > things simpler.
> Similarly, how different parts of a function can be controlled and
> assigned separate PASIDs is implementation-defined in PCI. The
> userspace driver is given a single device fd in a VFIO container, and
> has to do the partitioning itself.
OK. This design is also good for WrapDrive, which just needs to share the same
device, with a different channel for each application. I will try to change our
design slightly to match your new design and test it on our SoC. And maybe when
the hardware is ready for release (expected early next year), we can send
hardware (D06) to you so that you can test it on real silicon.