[PATCH v4 0/7] introduce vm_flags modifier functions

Sat Mar 18 10:04:05 AEDT 2023

On Fri, Mar 17, 2023 at 3:41 PM Alex Williamson
<alex.williamson at redhat.com> wrote:
>
> On Fri, 17 Mar 2023 12:08:32 -0700
> Suren Baghdasaryan <surenb at google.com> wrote:
>
> > On Tue, Mar 14, 2023 at 1:11 PM Alex Williamson
> > <alex.williamson at redhat.com> wrote:
> > >
> > > On Thu, 26 Jan 2023 11:37:45 -0800
> > > Suren Baghdasaryan <surenb at google.com> wrote:
> > >
> > > > This patchset was originally published as a part of per-VMA locking [1] and
> > > > was split after suggestion that it's viable on its own and to facilitate
> > > > the review process. It is now a preprequisite for the next version of per-VMA
> > > > lock patchset, which reuses vm_flags modifier functions to lock the VMA when
> > > > vm_flags are being updated.
> > > >
> > > > VMA vm_flags modifications are usually done under exclusive mmap_lock
> > > > protection because this attrubute affects other decisions like VMA merging
> > > > or splitting and races should be prevented. Introduce vm_flags modifier
> > > > functions to enforce correct locking.
> > > >
> > > > The patchset applies cleanly over mm-unstable branch of mm tree.
> > >
> > > With this series, vfio-pci developed a bunch of warnings around not
> > > holding the mmap_lock write semaphore while calling
> > > io_remap_pfn_range() from our fault handler, vfio_pci_mmap_fault().
> > >
> > > I suspect vdpa has the same issue for their use of remap_pfn_range()
> > > from their fault handler, JasonW, MST, FYI.
> > >
> > > It also looks like gru_fault() would have the same issue, Dimitri.
> > >
> > > In all cases, we're preemptively setting vm_flags to what
> > > remap_pfn_range_notrack() uses, so I thought we were safe here as I
> > > specifically remember trying to avoid changing vm_flags from the
> > > fault handler.  But apparently that doesn't take into account
> > > track_pfn_remap() where VM_PAT comes into play.
> > >
> > > The reason for using remap_pfn_range() on fault in vfio-pci is that
> > > we're mapping device MMIO to userspace, where that MMIO can be disabled
> > > and we'd rather zap the mapping when that occurs so that we can sigbus
> > > the user rather than allow the user to trigger potentially fatal bus
> > > errors on the host.
> > >
> > > Peter Xu has suggested offline that a non-lazy approach to reinsert the
> > > mappings might be more inline with mm expectations relative to touching
> > > vm_flags during fault.  What's the right solution here?  Can the fault
> > > handling be salvaged, is proactive remapping the right approach, or is
> > > there something better?  Thanks,
> >
> > Hi Alex,
> > If in your case it's safe to change vm_flags without holding exclusive
> > mmap_lock, maybe you can use __vm_flags_mod() the way I used it in
> > https://lore.kernel.org/all/20230126193752.297968-7-surenb@google.com,
> > while explaining why this should be safe?
>
> Hi Suren,
>
> Thanks for the reply, but I'm not sure I'm following.  Are you
> suggesting a bool arg added to io_remap_pfn_range(), or some new
> variant of that function to conditionally use __vm_flags_mod() in place
> of vm_flags_set() across the call chain?  Thanks,

I think either way could work but after taking a closer look, both
ways would be quite ugly. If we could somehow identify that we are
handling a page fault and use __vm_flags_mod() without additional
parameters it would be more palatable IMHO...
Peter's suggestion to avoid touching vm_flags during fault would be
much cleaner but I'm not sure how easily that can be done.

>
> Alex
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe at android.com.
>