[PATCH 41/41] mm: replace rw_semaphore with atomic_t in vma_lock

Suren Baghdasaryan surenb at google.com
Tue Jan 17 16:58:35 AEDT 2023


On Mon, Jan 16, 2023 at 9:46 PM Matthew Wilcox <willy at infradead.org> wrote:
>
> On Mon, Jan 16, 2023 at 08:34:36PM -0800, Suren Baghdasaryan wrote:
> > On Mon, Jan 16, 2023 at 8:14 PM Matthew Wilcox <willy at infradead.org> wrote:
> > >
> > > On Mon, Jan 16, 2023 at 11:14:38AM +0000, Hyeonggon Yoo wrote:
> > > > > @@ -643,20 +647,28 @@ static inline void vma_write_lock(struct vm_area_struct *vma)
> > > > >  static inline bool vma_read_trylock(struct vm_area_struct *vma)
> > > > >  {
> > > > >     /* Check before locking. A race might cause false locked result. */
> > > > > -   if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
> > > > > +   if (vma->vm_lock->lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
> > > > >             return false;
> > > > >
> > > > > -   if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
> > > > > +   if (unlikely(!atomic_inc_unless_negative(&vma->vm_lock->count)))
> > > > >             return false;
> > > > >
> > > > > +   /* If atomic_t overflows, restore and fail to lock. */
> > > > > +   if (unlikely(atomic_read(&vma->vm_lock->count) < 0)) {
> > > > > +           if (atomic_dec_and_test(&vma->vm_lock->count))
> > > > > +                   wake_up(&vma->vm_mm->vma_writer_wait);
> > > > > +           return false;
> > > > > +   }
> > > > > +
> > > > >     /*
> > > > >      * Overflow might produce false locked result.
> > > > >      * False unlocked result is impossible because we modify and check
> > > > >      * vma->vm_lock_seq under vma->vm_lock protection and mm->mm_lock_seq
> > > > >      * modification invalidates all existing locks.
> > > > >      */
> > > > > -   if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
> > > > > -           up_read(&vma->vm_lock->lock);
> > > > > +   if (unlikely(vma->vm_lock->lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
> > > > > +           if (atomic_dec_and_test(&vma->vm_lock->count))
> > > > > +                   wake_up(&vma->vm_mm->vma_writer_wait);
> > > > >             return false;
> > > > >     }
> > > >
> > > > With this change readers can cause writers to starve.
> > > > What about checking waitqueue_active() before or after increasing
> > > > vma->vm_lock->count?
> > >
> > > I don't understand how readers can starve a writer.  Readers do
> > > atomic_inc_unless_negative() so a writer can always force readers
> > > to fail.
> >
> > I think the point here was that if page faults keep occurring and
> > they prevent vm_lock->count from ever reaching 0, then a writer will
> > be blocked indefinitely: there is no reader throttling mechanism (no
> > maximum time that the writer will be waiting).
>
> Perhaps I misunderstood your description; I thought that a _waiting_
> writer would make the count negative, not a successfully acquiring
> writer.

A waiting writer does not modify the counter; instead, it is placed on
the wait queue, and the last reader to drop the count to 0 while
releasing its read lock wakes it up. Once woken, the writer tries to
flip the count negative; if it succeeds it owns the lock, otherwise it
goes back to sleep.

