[PATCH] powerpc/THP: Wait for all hash_page calls to finish before invalidating HPTE entries

Aneesh Kumar K.V aneesh.kumar at linux.vnet.ibm.com
Wed Jun 19 16:44:54 EST 2013


From: "Aneesh Kumar K.V" <aneesh.kumar at linux.vnet.ibm.com>

When we collapse normal pages into a hugepage, we first clear the pmd and then
invalidate all the PTE entries. The assumption is that any low-level page fault
will see the pmd as none and take the slow path, which waits on mmap_sem. But a
CPU could already be inside hash_page with a local ptep pointer value. Such a
hash_page call can add new HPTE entries for the normal subpages (small pages).
That means the page contents could be modified while we copy them to the huge
page. Fix this by waiting for hash_page to finish after marking the pmd none
and before invalidating the HPTE entries. We use the heavyweight
kick_all_cpus_sync() for this. That should be OK because it runs in the
background khugepaged thread and not in application context, but page fault
handling is blocked for that duration. Also, if collapse turns out to be slow,
we can increase the scan rate.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
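Note, not part of the commit: the ordering problem described in the commit
message can be sketched as a toy, single-threaded state model. This is only an
illustration under stated assumptions, not kernel code. Every name in it
(struct model, finish_fault, new_fault, collapse_leaves_stale_hpte) is
hypothetical, and the wait_for_faults flag merely stands in for the effect of
kick_all_cpus_sync(), namely that any in-flight hash_page has completed before
the HPTE invalidation runs.

```c
#include <stdbool.h>

/* Toy model only: all names are hypothetical, nothing here is kernel API. */
struct model {
	bool pmd_present;     /* false once the collapse path cleared the pmd */
	bool hpte_present;    /* a hash table entry maps the small page */
	bool fault_in_flight; /* a hash_page-like handler already holds a ptep */
};

/*
 * The in-flight handler completes. It works from the ptep it cached
 * earlier, so it inserts an HPTE without looking at the pmd again.
 */
static void finish_fault(struct model *m)
{
	if (m->fault_in_flight) {
		m->hpte_present = true;
		m->fault_in_flight = false;
	}
}

/*
 * A fault that starts after the pmd was cleared sees pmd none and
 * takes the slow path, so it inserts nothing.
 */
static void new_fault(struct model *m)
{
	if (m->pmd_present)
		m->hpte_present = true;
}

/* Returns true if an HPTE for the small page survives the collapse. */
static bool collapse_leaves_stale_hpte(bool wait_for_faults)
{
	struct model m = {
		.pmd_present = true,
		.hpte_present = true,
		.fault_in_flight = true,
	};

	m.pmd_present = false;   /* models pmd_clear() */
	if (wait_for_faults)
		finish_fault(&m); /* models the effect of kick_all_cpus_sync() */
	m.hpte_present = false;  /* models invalidating the HPTEs in range */
	finish_fault(&m);        /* without the wait, the handler lands here */
	new_fault(&m);           /* later faults see pmd none */

	return m.hpte_present;
}
```

In this model, collapse_leaves_stale_hpte(false) reports a surviving HPTE,
while collapse_leaves_stale_hpte(true) does not, which is the window the
patch is meant to close.
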
 arch/powerpc/mm/pgtable_64.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index bbecac4..4bb44c3 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -543,6 +543,14 @@ pmd_t pmdp_clear_flush(struct vm_area_struct *vma, unsigned long address,
 		pmd = *pmdp;
 		pmd_clear(pmdp);
 		/*
+		 * Wait for all pending hash_page calls to finish.
+		 * We can do this by waiting for a context switch to happen on
+		 * the cpus. Any new hash_page after this will see pmd none and
+		 * fall back to code that takes mmap_sem and hence will block
+		 * for collapse to finish.
+		 */
+		kick_all_cpus_sync();
+		/*
 		 * Now invalidate the hpte entries in the range
 		 * covered by pmd. This make sure we take a
 		 * fault and will find the pmd as none, which will
-- 
1.8.1.2


