[PATCH v6 2/3] uacce: add uacce driver

zhangfei.gao at foxmail.com zhangfei.gao at foxmail.com
Fri Oct 25 13:02:36 AEDT 2019



On 2019/10/24 上午12:58, Jerome Glisse wrote:
> On Wed, Oct 16, 2019 at 07:28:02PM +0200, Jean-Philippe Brucker wrote:
> [...]
>
>>> +static struct uacce_qfile_region *
>>> +uacce_create_region(struct uacce_queue *q, struct vm_area_struct *vma,
>>> +		    enum uacce_qfrt type, unsigned int flags)
>>> +{
>>> +	struct uacce_qfile_region *qfr;
>>> +	struct uacce_device *uacce = q->uacce;
>>> +	unsigned long vm_pgoff;
>>> +	int ret = -ENOMEM;
>>> +
>>> +	qfr = kzalloc(sizeof(*qfr), GFP_ATOMIC);
>>> +	if (!qfr)
>>> +		return ERR_PTR(-ENOMEM);
>>> +
>>> +	qfr->type = type;
>>> +	qfr->flags = flags;
>>> +	qfr->iova = vma->vm_start;
>>> +	qfr->nr_pages = vma_pages(vma);
>>> +
>>> +	if (vma->vm_flags & VM_READ)
>>> +		qfr->prot |= IOMMU_READ;
>>> +
>>> +	if (vma->vm_flags & VM_WRITE)
>>> +		qfr->prot |= IOMMU_WRITE;
>>> +
>>> +	if (flags & UACCE_QFRF_SELFMT) {
>>> +		if (!uacce->ops->mmap) {
>>> +			ret = -EINVAL;
>>> +			goto err_with_qfr;
>>> +		}
>>> +
>>> +		ret = uacce->ops->mmap(q, vma, qfr);
>>> +		if (ret)
>>> +			goto err_with_qfr;
>>> +		return qfr;
>>> +	}
>> I wish the SVA and !SVA paths were less interleaved. Both models are
>> fundamentally different:
>>
>> * Without SVA you cannot share the device between multiple processes. All
>>    DMA mappings are in the "main", non-PASID address space of the device.
>>
>>    Note that process isolation without SVA could be achieved with the
>>    auxiliary domains IOMMU API (introduced primarily for vfio-mdev) but
>>    this is not the model chosen here.
>>
>> * With SVA you can share the device between multiple processes. But if the
>>    process can somehow program its portion of the device to access the main
>>    address space, you loose isolation. Only the kernel must be able to
>>    program and access the main address space.
>>
>> When interleaving both code paths it's easy to make a mistake and loose
>> this isolation. Although I think this code is correct, it took me some
>> time to understand that we never end up calling dma_alloc or iommu_map
>> when using SVA. Might be worth at least adding a check that if
>> UACCE_DEV_SVA, then we never end up in the bottom part of this function.
> I would go even further, just remove the DMA path as it is not use.
> But yes at bare minimum it needs to be completely separate to avoid
> confusion.
Thanks, will remove dma path first as the first patch.
>
>
> [...]
>
>
>>> +static int uacce_fops_open(struct inode *inode, struct file *filep)
>>> +{
>>> +	struct uacce_queue *q;
>>> +	struct iommu_sva *handle = NULL;
>>> +	struct uacce_device *uacce;
>>> +	int ret;
>>> +	int pasid = 0;
>>> +
>>> +	uacce = idr_find(&uacce_idr, iminor(inode));
>>> +	if (!uacce)
>>> +		return -ENODEV;
>>> +
>>> +	if (!try_module_get(uacce->pdev->driver->owner))
>>> +		return -ENODEV;
>>> +
>>> +	ret = uacce_dev_open_check(uacce);
>>> +	if (ret)
>>> +		goto out_with_module;
>>> +
>>> +	if (uacce->flags & UACCE_DEV_SVA) {
>>> +		handle = iommu_sva_bind_device(uacce->pdev, current->mm, NULL);
>>> +		if (IS_ERR(handle))
>>> +			goto out_with_module;
>>> +		pasid = iommu_sva_get_pasid(handle);
>> We need to register an mm_exit callback. Once we return, userspace will
>> start running jobs on the accelerator. If the process is killed while the
>> accelerator is running, the mm_exit callback tells the device driver to
>> stop using this PASID (stop_queue()), so that it can be reallocated for
>> another process.
>>
>> Implementing this with the right locking and ordering can be tricky. I'll
>> try to implement the callback and test it on the device this week.
> It already exist it is call mmu notifier, you can register an mmu notifier
> and get callback once the mm exit.
Currently we register mm_exit for sva path, as suggested by Jean.
static struct iommu_sva_ops uacce_sva_ops = {
         .mm_exit = uacce_sva_exit,
};
iommu_sva_set_ops(handle, &uacce_sva_ops);

Still not certain do we have to register mm_exit for both case,
sva and !sva, since it is a common situation.

Thanks




More information about the Linux-accelerators mailing list