[PATCH v5 7/9] mm/mremap: Move TLB flush outside page table lock

Linus Torvalds torvalds at linux-foundation.org
Fri May 21 12:40:41 AEST 2021

On Thu, May 20, 2021 at 6:57 AM Aneesh Kumar K.V
<aneesh.kumar at linux.ibm.com> wrote:
> Wondering whether this is correct considering we are holding mmap_sem in
> write mode in mremap.

Right. So *normally* the rule is to EITHER

 - hold the mmap_sem for writing


 - hold the page table lock

and that the TLB flush needs to happen before you release that lock.

But as that commit message of commit eb66ae030829 ("mremap: properly
flush TLB before releasing the page") says, "mremap()" is a bit
special. It's special because mremap() didn't take ownership of the
page - it only moved it somewhere else. So now the page-out logic -
that relies on the page table lock - can free the page immediately
after we've released the page table lock.

So basically, in order to delay the TLB flush after releasing the page
table lock, it's not really sufficient to _just_ hold the mmap_sem for
writing. You also need to guarantee that the lifetime of the page
itself is held until after the TLB flush.

For normal operations like "munmap()", this happens naturally, because
we remove the page from the page table, and add it to the list of
pages to be freed after the TLB flush.

But mremap never did that "remove the page and add it to a list to be
free'd later". Instead, it just moved the page somewhere else. And
thus there is no guarantee that the page that got moved will continue
to exist until a TLB flush is done.

So mremap does need to flush the TLB before releasing the page table
lock, because that's the lifetime boundary for the page that got


More information about the Linuxppc-dev mailing list