[PATCH kernel v5 5/6] vfio/spapr: Reference mm in tce_container

Alexey Kardashevskiy aik at ozlabs.ru
Tue Nov 22 14:49:53 AEDT 2016


On 22/11/16 13:38, David Gibson wrote:
> On Thu, Nov 17, 2016 at 06:39:41PM +1100, Alexey Kardashevskiy wrote:
>> On 11/11/16 23:32, Alexey Kardashevskiy wrote:
>>> In some situations the userspace memory context may live longer than
>>> the userspace process itself so if we need to do proper memory context
>>> cleanup, we better have tce_container take a reference to mm_struct and
>>> use it later when the process is gone (@current or @current->mm is NULL).
>>>
>>> This references mm and stores the pointer in the container; this is done
>>> in a new helper - tce_iommu_mm_set() - when one of the following happens:
>>> - a container is enabled (IOMMU v1);
>>> - a first attempt to pre-register memory is made (IOMMU v2);
>>> - a DMA window is created (IOMMU v2).
>>> The @mm stays referenced till the container is destroyed.
>>>
>>> This replaces current->mm with container->mm everywhere except debug
>>> prints.
>>>
>>> This adds a check that current->mm is the same as the one stored in
>>> the container to prevent userspace from making changes to a memory
>>> context of other processes.
>>>
>>> DMA map/unmap ioctls() do not check for @mm as they already check
>>> for @enabled which is set after tce_iommu_mm_set() is called.
>>>
>>> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
>>> ---
>>> Changes:
>>> v5:
>>> * postpone referencing of mm
>>>
>>> v4:
>>> * added check for container->mm!=current->mm in tce_iommu_ioctl()
>>> for all ioctls and removed other redundand checks
>>> ---
>>>  drivers/vfio/vfio_iommu_spapr_tce.c | 159 ++++++++++++++++++++++--------------
>>>  1 file changed, 99 insertions(+), 60 deletions(-)
>>>
>>> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
>>> index 1c02498..9a81a7e 100644
>>> --- a/drivers/vfio/vfio_iommu_spapr_tce.c
>>> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
>>> @@ -31,49 +31,49 @@
>>>  static void tce_iommu_detach_group(void *iommu_data,
>>>  		struct iommu_group *iommu_group);
>>>  
>>> -static long try_increment_locked_vm(long npages)
>>> +static long try_increment_locked_vm(struct mm_struct *mm, long npages)
>>>  {
>>>  	long ret = 0, locked, lock_limit;
>>>  
>>> -	if (!current || !current->mm)
>>> -		return -ESRCH; /* process exited */
>>> +	if (!mm)
>>> +		return -EPERM;
>>>  
>>>  	if (!npages)
>>>  		return 0;
>>>  
>>> -	down_write(&current->mm->mmap_sem);
>>> -	locked = current->mm->locked_vm + npages;
>>> +	down_write(&mm->mmap_sem);
>>> +	locked = mm->locked_vm + npages;
>>>  	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>>>  	if (locked > lock_limit && !capable(CAP_IPC_LOCK))
>>
>>
>>
>> Oh boy. Now it seems I have to reference a task, not just mm (which I may
>> not have to reference at all after all as the task reference should keep mm
>> alive) as I missed the fact capable() and rlimit() are working with
>> @current.
> 
> Sorry, what?  I'm not seeing how a task reference comes into this.

I reference @mm to make sure that just one mm uses a container. If mm
changes, I return an error, sanity check.

The code also increments locked_vm in mm. But it looks at the current task
if there is room for increments and for CAP_IPC_LOCK.

So, the options are:
1. I do not reference the current task, and if mm changes, then the mm
sanity check won't let me proceed to the code which tries using current OR
2. reference a task when I reference mm and do that sanity check not just
for mm but also for current task.

Makes sense?


> 
>>
>>
>> Alex,
>>
>> Is there anything else I should fix before posting v6? Thanks.
>>
>>
>>
> 


-- 
Alexey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 839 bytes
Desc: OpenPGP digital signature
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20161122/8c789538/attachment-0001.sig>


More information about the Linuxppc-dev mailing list