[PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

Suren Baghdasaryan surenb at google.com
Tue Jan 24 06:57:30 AEDT 2023


On Mon, Jan 23, 2023 at 11:31 AM Matthew Wilcox <willy at infradead.org> wrote:
>
> On Mon, Jan 23, 2023 at 08:18:37PM +0100, Michal Hocko wrote:
> > On Mon 23-01-23 18:23:08, Matthew Wilcox wrote:
> > > On Mon, Jan 23, 2023 at 09:46:20AM -0800, Suren Baghdasaryan wrote:
> > [...]
> > > > Yes, batching the vmas into a list and draining it in remove_mt() and
> > > > exit_mmap() as you suggested makes sense to me and is quite simple.
> > > > Let's do that if nobody has objections.
> > >
> > > I object.  We *know* nobody has a reference to any of the VMAs because
> > > you have to have a refcount on the mm before you can get a reference
> > > to a VMA.  If Michal is saying that somebody could do:
> > >
> > >     mmget(mm);
> > >     vma = find_vma(mm);
> > >     lock_vma(vma);
> > >     mmput(mm);
> > >     vma->a = b;
> > >     unlock_vma(mm, vma);
> > >
> > > then that's something we'd catch in review -- you obviously can't use
> > > the mm after you've dropped your reference to it.
> >
> > I am not claiming this is possible now. I do not think we want to have
> > something like that in the future either but that is really hard to
> > envision. I am claiming that it is subtle and potentially error prone to
> > have two different ways of mass vma freeing wrt. locking. Also, don't we
> > have a very similar situation during last munmaps?
>
> We shouldn't have two ways of mass VMA freeing.  Nobody's suggesting that.
> There are two cases; there's munmap(), which typically frees a single
> VMA (yes, theoretically, you can free hundreds of VMAs with a single
> call which spans multiple VMAs, but in practice that doesn't happen),
> and there's exit_mmap() which happens on exec() and exit().
>
> For the munmap() case, just RCU-free each one individually.  For the
> exit_mmap() case, there's no need to use RCU because nobody should still
> have a VMA pointer after calling mmdrop() [1]
>
> [1] Sorry, the above example should have been mmgrab()/mmdrop(), not
> mmget()/mmput(); you're not allowed to look at the VMA list with an
> mmget(), you need to have grabbed.

I think it's clear that this would work with the current code and that
the concern is about any future possible misuse. So, it would be
preferable to proactively prevent such misuse.

vma_write_lock() and vma_write_unlock_mm() both have
mmap_assert_write_locked(), so they always happen under mmap_lock
protection and therefore do not pose any danger. The only issue we
need to be careful with is calling
vma_read_trylock()/vma_read_unlock() outside of mmap_lock protection
while mm is unstable. I don't think doing mmget/mmput inside these
functions is called for but maybe some assertions would prevent future
misuse?


More information about the Linuxppc-dev mailing list