[PATCH 0/2] iommu: Remove iommu_sva_ops::mm_exit()

Jacob Pan jacob.jun.pan at linux.intel.com
Fri Apr 10 00:14:24 AEST 2020


On Thu, 9 Apr 2020 08:39:05 +0200
Jean-Philippe Brucker <jean-philippe at linaro.org> wrote:

> On Wed, Apr 08, 2020 at 04:48:02PM -0700, Jacob Pan wrote:
> > On Wed, 8 Apr 2020 19:32:18 -0300
> > Jason Gunthorpe <jgg at ziepe.ca> wrote:
> >   
> > > On Wed, Apr 08, 2020 at 02:35:52PM -0700, Jacob Pan wrote:  
> > > > > On Wed, Apr 08, 2020 at 11:35:52AM -0700, Jacob Pan wrote:    
> > > > > > Hi Jean,
> > > > > > 
> > > > > > On Wed,  8 Apr 2020 16:04:25 +0200
> > > > > > Jean-Philippe Brucker <jean-philippe at linaro.org> wrote:
> > > > > >       
> > > > > > > The IOMMU SVA API currently requires device drivers to
> > > > > > > implement an mm_exit() callback, which stops device jobs
> > > > > > > that do DMA. This function is called in the release() MMU
> > > > > > > notifier, when an address space that is shared with a
> > > > > > > device exits.
> > > > > > > 
> > > > > > > It has been noted several time during discussions about
> > > > > > > SVA that cancelling DMA jobs can be slow and complex, and
> > > > > > > doing it in the release() notifier might cause
> > > > > > > synchronization issues (patch 2 has more background).
> > > > > > > Device drivers must in any case call unbind() to remove
> > > > > > > their bond, after stopping DMA from a more favorable
> > > > > > > context (release of a file descriptor).
> > > > > > > 
> > > > > > > So after mm exits, rather than notifying device drivers,
> > > > > > > we can hold on to the PASID until unbind(), ask IOMMU
> > > > > > > drivers to silently abort DMA and Page Requests in the
> > > > > > > meantime. This change should relieve the mmput()
> > > > > > > path.      
> > > > > >
> > > > > > I assume mm is destroyed after all the FDs are closed      
> > > > > 
> > > > > FDs do not hold a mmget(), but they may hold a mmgrab(), ie
> > > > > anything using mmu_notifiers has to hold a grab until the
> > > > > notifier is destroyed, which is often triggered by FD close.
> > > > >     
> > > > Sorry, I don't get this. Are you saying we have to hold a
> > > > mmgrab() between svm_bind/mmu_notifier_register and
> > > > svm_unbind/mmu_notifier_unregister?    
> > > 
> > > Yes. This is done automatically for the caller inside the
> > > mmu_notifier implementation. We now even store the mm_struct
> > > pointer inside the notifier.
> > > 
> > > Once a notifier is registered the mm_struct remains valid memory
> > > until the notifier is unregistered.
> > >   
> > > > Isn't the idea of mmu_notifier is to avoid holding the mm
> > > > reference and rely on the notifier to tell us when mm is going
> > > > away?    
> > > 
> > > The notifier only holds a mmgrab(), not a mmget() - this allows
> > > exit_mmap to proceed, but the mm_struct memory remains.
> > > 
> > > This is also probably why it is a bad idea to tie the lifetime of
> > > something like a pasid to the mmdrop as a evil user could cause a
> > > large number of mm structs to be released but not freed, probably
> > > defeating cgroup limits and so forth (not sure)
> > >   
> > > > It seems both Intel and AMD iommu drivers don't hold mmgrab
> > > > after mmu_notifier_register.    
> > > 
> > > It is done internally to the implementation.
> > >   
> > > > > So the exit_mmap() -> release() may happen before the FDs are
> > > > > destroyed, but the final mmdrop() will be during some FD
> > > > > release when the final mmdrop() happens.    
> > > > 
> > > > Do you mean mmdrop() is after FD release?     
> > > 
> > > Yes, it will be done by the mmu_notifier_unregister(), which
> > > should be called during FD release if the iommu lifetime is
> > > linked to some FD. 
> > > > If so, unbind is called in FD release should take care of
> > > > everything, i.e. stops DMA, clear PASID context on IOMMU, flush
> > > > PRS queue etc.    
> > > 
> > > Yes, this is the proper way, when the DMA is stopped and no use
> > > of the PASID remains then you can drop the mmu notifier and
> > > release the PASID entirely. If that is linked to the lifetime of
> > > the FD then forget completely about the mm_struct lifetime, it
> > > doesn't matter.. 
> > Got everything above, thanks a lot.
> > 
> > If everything is in order with the FD close. Why do we need to 
> > "ask IOMMU drivers to silently abort DMA and Page Requests in the
> > meantime." in mm_exit notifier? This will be done orderly in unbind
> > anyway.  
> 
> When the process is killed, mm release can happen before fds are
> released. If you look at do_exit() in kernel/exit.c:
> 
> 	exit_mm()
> 	  mmput()
> 	   -> mmu release notifier  
> 	...
> 	exit_files()
> 	  close_files()
> 	    fput()
> 	exit_task_work()
> 	  __fput()
> 	   -> unbind()  
> 
So unbind is coming anyway, the difference in handling in mmu release
notifier is whether we silently drop DMA fault vs. reporting fault?
If a process crash during unbind, something already went seriously
wrong, DMA fault is expected.
I think having some error indication is useful, compared to "silently
drop"

Thanks,

Jacob

> Thanks,
> Jean
> 
> >   
> > > > Enforcing unbind upon FD close might be a precarious path,
> > > > perhaps that is why we have to deal with out of order
> > > > situation?    
> > > 
> > > How so? You just put it in the FD release function :)
> > >   
> > I was thinking some driver may choose to defer unbind in some
> > workqueue etc.
> >   
> > > > > > In VT-d, because of enqcmd and lazy PASID free we plan to
> > > > > > hold on to the PASID until mmdrop.
> > > > > > https://lore.kernel.org/patchwork/patch/1217762/      
> > > > > 
> > > > > Why? The bind already gets a mmu_notifier which has refcounts
> > > > > and the right lifetime for PASID.. This code could already be
> > > > > simplified by using the mmu_notifier_get()/put() stuff.
> > > > >     
> > > > Yes, I guess mmu_notifier_get()/put() is new :)
> > > > +Fenghua    
> > > 
> > > I was going to convert the intel code when I did many other
> > > drivers, but it was a bit too complex..
> > > 
> > > But the approach is straightforward. Get rid of the mm search
> > > list and use mmu_notifier_get(). This returns a singlton notifier
> > > for the mm_struct and handles refcounting/etc
> > > 
> > > Use mmu_notifier_put() during a unbind, it will callback to
> > > free_notifier() to do the final frees (ie this is where the pasid
> > > should go away)
> > > 
> > > For the SVM_FLAG_PRIVATE_PASID continue to use
> > > mmu_notifier_register, however this can now be mixed with
> > > mmu_notifier_put() so the cleanup is the same. A separate ops
> > > static struct is needed to create a unique key though
> > > 
> > > Jason  
> > 
> > [Jacob Pan]  

[Jacob Pan]


More information about the Linux-accelerators mailing list