[RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

Jean-Philippe Brucker jean-philippe.brucker at arm.com
Tue Nov 20 23:19:07 AEDT 2018

On 20/11/2018 09:16, Jonathan Cameron wrote:
> +CC Jean-Phillipe and iommu list.

Thanks for the Cc, sorry I don't have enough bandwidth to follow this
thread at the moment.

>>>>> In WarpDrive/uacce, we make this simple. If you support IOMMU and it support
>>>>> SVM/SVA. Everything will be fine just like ODP implicit mode. And you don't need
>>>>> to write any code for that. Because it has been done by IOMMU framework. If it  
>>>> Looks like the IOMMU code uses mmu_notifier, so it is identical to
>>>> IB's ODP. The only difference is that IB tends to have the IOMMU page
>>>> table in the device, not in the CPU.
>>>> The only case I know if that is different is the new-fangled CAPI
>>>> stuff where the IOMMU can directly use the CPU's page table and the
>>>> IOMMU page table (in device or CPU) is eliminated.  
>>> Yes. We are not focusing on the current implementation. As mentioned in the
>>> cover letter. We are expecting Jean Philips' SVA patch:
>>> git://linux-arm.org/linux-jpb.  
>> This SVA stuff does not look comparable to CAPI as it still requires
>> maintaining seperate IOMMU page tables.

With SVA, we use the same page tables in the IOMMU and CPU. It's the
same pgd pointer, there is no mirroring of mappings. We bind the process
page tables with the device using a PASID (Process Address Space ID).

After fork(), the child's mm is different from the parent's one, and is
not automatically bound to the device. The device driver will have to
issue a new bind() request, and the child mm will be bound with a
different PASID.

There could be a problem if the child inherits the parent's device
handle. Then depending on the device, the child could be able to program
DMA and possibly access the parent's address space. The parent needs to
be aware of that when using the bind() API, and close the device fd in
the child after fork().

We use MMU notifiers for some address space changes:

* The IOTLB needs to be invalidated after any unmap() to the process
address space. On Arm systems the SMMU IOTLBs can be invalidated by the
CPU TLBI instructions, but we still need to invalidate TLBs private to
devices that are arch-agnostic (Address Translation Cache in PCI ATS).

* When the process mm exits, we need to remove the associated PASID
configuration in the IOMMU and invalidate the TLBs.


More information about the Linux-accelerators mailing list