[PATCH v4 04/23] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages

John Hubbard jhubbard at nvidia.com
Thu Nov 14 09:46:50 AEDT 2019


On 11/13/19 2:00 PM, Dan Williams wrote:
...
>> Ugh, when did all this HMM specific manipulation sneak into the
>> generic ZONE_DEVICE path? It used to be gated by pgmap type with its
>> own put_zone_device_private_page(). For example it's certainly
>> unnecessary and might be broken (would need to check) to call
>> mem_cgroup_uncharge() on a DAX page. ZONE_DEVICE users are not a
>> monolith and the HMM use case leaks pages into code paths that DAX
>> explicitly avoids.
> 
> It's been this way for a while and I did not react previously;
> apologies for that. I think __ClearPageActive, __ClearPageWaiters, and
> mem_cgroup_uncharge belong behind a device-private conditional. The
> history here is:
> 
> Move some, but not all HMM specifics to hmm_devmem_free():
>      2fa147bdbf67 mm, dev_pagemap: Do not clear ->mapping on final put
> 
> Remove the clearing of mapping since no upstream consumers needed it:
>      b7a523109fb5 mm: don't clear ->mapping in hmm_devmem_free
> 
> Add it back in once an upstream consumer arrived:
>      7ab0ad0e74f8 mm/hmm: fix ZONE_DEVICE anon page mapping reuse
> 
> We're now almost entirely free of ->page_free callbacks except for
> that weird nouveau case. Can that FIXME in nouveau_dmem_page_free()
> also result in killing the ->page_free() callback altogether? In the
> meantime I'm proposing a cleanup like this:


OK, assuming this is acceptable (no obvious problems jump out at me,
and we can also test it with HMM), how would you like to proceed as
far as patches go: add such a patch to this series, or send it as a
stand-alone patch either before or after this series? Or something else?
And did you plan on sending it out as such?

Also, the diffs didn't quite make it through intact to my "git apply", so
I'm re-posting the diff in hopes that this time it survives:

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f9f76f6ba07b..21db1ce8c0ae 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -338,13 +338,7 @@ static void pmem_release_disk(void *__pmem)
 	put_disk(pmem->disk);
 }
 
-static void pmem_pagemap_page_free(struct page *page)
-{
-	wake_up_var(&page->_refcount);
-}
-
 static const struct dev_pagemap_ops fsdax_pagemap_ops = {
-	.page_free		= pmem_pagemap_page_free,
 	.kill			= pmem_pagemap_kill,
 	.cleanup		= pmem_pagemap_cleanup,
 };
diff --git a/mm/memremap.c b/mm/memremap.c
index 03ccbdfeb697..157edb8f7cf8 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -419,12 +419,6 @@ void __put_devmap_managed_page(struct page *page)
 	 * holds a reference on the page.
 	 */
 	if (count == 1) {
-		/* Clear Active bit in case of parallel mark_page_accessed */
-		__ClearPageActive(page);
-		__ClearPageWaiters(page);
-
-		mem_cgroup_uncharge(page);
-
 		/*
 		 * When a device_private page is freed, the page->mapping field
 		 * may still contain a (stale) mapping value. For example, the
@@ -446,10 +440,17 @@ void __put_devmap_managed_page(struct page *page)
 		 * handled differently or not done at all, so there is no need
 		 * to clear page->mapping.
 		 */
-		if (is_device_private_page(page))
-			page->mapping = NULL;
+		if (is_device_private_page(page)) {
+			/* Clear Active bit in case of parallel mark_page_accessed */
+			__ClearPageActive(page);
+			__ClearPageWaiters(page);
 
-		page->pgmap->ops->page_free(page);
+			mem_cgroup_uncharge(page);
+
+			page->mapping = NULL;
+			page->pgmap->ops->page_free(page);
+		} else
+			wake_up_var(&page->_refcount);
 	} else if (!count)
 		__put_page(page);
 }
-- 
2.24.0
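
For anyone reading the diff without the surrounding file, here is a rough
userspace model of what the count == 1 branch looks like once the patch is
applied. It is only a sketch under stated assumptions: struct model_page, its
fields, and model_put_devmap_managed_page() are made-up stand-ins, not kernel
types or APIs, and the boolean fields merely mimic the effect of
__ClearPageActive(), __ClearPageWaiters(), mem_cgroup_uncharge(), and the
->page_free() call.

```c
/* Userspace model only: none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_pgmap_type { MODEL_DEVICE_PRIVATE, MODEL_FS_DAX };

struct model_page {
	enum model_pgmap_type type;
	int refcount;		/* 1-based: a count of 1 means the page is idle */
	bool active;		/* stand-in for the Active page flag */
	bool waiters;		/* stand-in for the Waiters page flag */
	bool charged;		/* stand-in for a memcg charge */
	void *mapping;		/* stand-in for page->mapping */
	int page_free_calls;	/* how often the ->page_free stand-in ran */
	int wakeups;		/* how often the wake_up_var stand-in ran */
};

/* Drop one reference and apply the post-patch logic for the 2 -> 1 case. */
void model_put_devmap_managed_page(struct model_page *page)
{
	int count;

	assert(page->refcount > 0);
	count = --page->refcount;

	if (count == 1) {
		if (page->type == MODEL_DEVICE_PRIVATE) {
			/* HMM-specific cleanup now lives only on this path. */
			page->active = false;
			page->waiters = false;
			page->charged = false;
			page->mapping = NULL;
			page->page_free_calls++;
		} else {
			/* Other ZONE_DEVICE users just wake refcount waiters. */
			page->wakeups++;
		}
	}
	/* count == 0 is handed to __put_page() in the real code; not modeled. */
}
```

Dropping a device-private model page from 2 to 1 runs the full cleanup and
bumps the callback counter. Doing the same on an fsdax model page only bumps
the wakeup counter and leaves the flags, charge, and mapping untouched.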


thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
> index ad8e4df1282b..4eae441f86c9 100644
> --- a/drivers/nvdimm/pmem.c
> +++ b/drivers/nvdimm/pmem.c
> @@ -337,13 +337,7 @@ static void pmem_release_disk(void *__pmem)
>          put_disk(pmem->disk);
>   }
> 
> -static void pmem_pagemap_page_free(struct page *page)
> -{
> -       wake_up_var(&page->_refcount);
> -}
> -
>   static const struct dev_pagemap_ops fsdax_pagemap_ops = {
> -       .page_free              = pmem_pagemap_page_free,
>          .kill                   = pmem_pagemap_kill,
>          .cleanup                = pmem_pagemap_cleanup,
>   };
> diff --git a/mm/memremap.c b/mm/memremap.c
> index 03ccbdfeb697..157edb8f7cf8 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -419,12 +419,6 @@ void __put_devmap_managed_page(struct page *page)
>           * holds a reference on the page.
>           */
>          if (count == 1) {
> -               /* Clear Active bit in case of parallel mark_page_accessed */
> -               __ClearPageActive(page);
> -               __ClearPageWaiters(page);
> -
> -               mem_cgroup_uncharge(page);
> -
>                  /*
>                   * When a device_private page is freed, the page->mapping field
>                   * may still contain a (stale) mapping value. For example, the
> @@ -446,10 +440,17 @@ void __put_devmap_managed_page(struct page *page)
>                   * handled differently or not done at all, so there is no need
>                   * to clear page->mapping.
>                   */
> -               if (is_device_private_page(page))
> -                       page->mapping = NULL;
> +               if (is_device_private_page(page)) {
> +                       /* Clear Active bit in case of parallel
> mark_page_accessed */
> +                       __ClearPageActive(page);
> +                       __ClearPageWaiters(page);
> 
> -               page->pgmap->ops->page_free(page);
> +                       mem_cgroup_uncharge(page);
> +
> +                       page->mapping = NULL;
> +                       page->pgmap->ops->page_free(page);
> +               } else
> +                       wake_up_var(&page->_refcount);
>          } else if (!count)
>                  __put_page(page);
>   }
> 

