[PATCH 1/2] powerpc/mm: Enable HugeTLB page migration

Anshuman Khandual khandual at linux.vnet.ibm.com
Wed Feb 24 23:20:46 AEDT 2016


On 02/02/2016 11:49 AM, Anshuman Khandual wrote:
> On 01/29/2016 02:45 PM, Anshuman Khandual wrote:
>> On 01/28/2016 08:14 PM, Aneesh Kumar K.V wrote:
>>> Anshuman Khandual <khandual at linux.vnet.ibm.com> writes:
>>> 
>>>> This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
>>>> HugeTLB page at the PMD level. It enables the kernel configuration option
>>>> CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION by default which turns on the function
>>>> hugepage_migration_supported() during migration. After the recent changes
>>>> to the PTE format, HugeTLB page migration happens successfully.
>>>>
>>>> Signed-off-by: Anshuman Khandual <khandual at linux.vnet.ibm.com>
>>>> ---
>>>>  arch/powerpc/Kconfig | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>>> index e4824fd..65d52a0 100644
>>>> --- a/arch/powerpc/Kconfig
>>>> +++ b/arch/powerpc/Kconfig
>>>> @@ -82,6 +82,10 @@ config GENERIC_HWEIGHT
>>>>  config ARCH_HAS_DMA_SET_COHERENT_MASK
>>>>          bool
>>>>
>>>> +config ARCH_ENABLE_HUGEPAGE_MIGRATION
>>>> +	def_bool y
>>>> +	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
>>>> +
>>>>  config PPC
>>>>  	bool
>>>>  	default y
>>> 
>>> 
>>> Are you sure this is all that is needed? We will get a FOLL_GET with hugetlb
>>> migration and our follow_huge_addr will BUG_ON on that. Look at
>>> e66f17ff71772b209eed39de35aaa99ba819c93d ("mm/hugetlb: take page table
>>> lock in follow_huge_pmd()").
>> HugeTLB page migration was successful without any error, and the data
>> integrity check passed as well. But yes, there might be some corner cases
>> which trigger the race condition that we have not hit yet. Will try to
>> understand the situation there and get back.
> Aneesh,
> 
> The current test which passed successfully in moving HugeTLB pages is
> driven from the soft offline sysfs interface, taking a PFN (obtained from
> the pagemap interface) as the argument. That path more or less calls
> migrate_pages() directly, handing it the list of page structs. But the
> sample test case from commit e66f17ff71772b ("mm/hugetlb: take page table
> lock in follow_huge_pmd()") is able to crash the kernel at the BUG_ON you
> had mentioned before.
> 
> CPU: 2 PID: 6386 Comm: movepages Not tainted 4.5.0-rc2-00002-gc3ac0a3 #3
> task: c0000003e8792400 ti: c0000003f65cc000 task.ti: c0000003f65cc000
> NIP: c0000000001f485c LR: c0000000001f4844 CTR: 0000000000000000
> REGS: c0000003f65cfa20 TRAP: 0700   Not tainted  (4.5.0-rc2-00002-gc3ac0a3)
> MSR: 800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 28044488  XER: 00000000
> CFAR: c000000000050310 SOFTE: 1 
> GPR00: c0000000001f4844 c0000003f65cfca0 c000000000d82e00 f000000000ec0000 
> GPR04: 00003efff6000000 c0000003f65cfc74 c0000003f65cfc70 0000000000000001 
> GPR08: f000000000000000 0000000000000000 fffffffffffff000 0000000000000000 
> GPR12: 0000000000004400 c00000000ea70800 fffffffffffffff2 c0000003d3330000 
> GPR16: 0000000000000a00 c0000003f65cfd30 0000000000000000 c0000003d3330000 
> GPR20: c000000000239f50 0000000000000100 fffffffffffff000 0000000001320122 
> GPR24: c0000003ed267ee8 c0000003f65cfd70 00003efff6000000 c0000003f65cfd80 
> GPR28: 000000000000008c c0000003ed267e80 c0000003ed2299d0 0000000000000004 
> NIP [c0000000001f485c] follow_page_mask+0x7c/0x440
> LR [c0000000001f4844] follow_page_mask+0x64/0x440
> Call Trace:
> [c0000003f65cfca0] [c0000000001f4844] follow_page_mask+0x64/0x440 (unreliable)
> [c0000003f65cfd10] [c00000000023d2d8] SyS_move_pages+0x518/0x820
> [c0000003f65cfe30] [c000000000009260] system_call+0x38/0xd0
> Instruction dump:
> 7cdb3378 7c9a2378 eba30040 91260000 7fa3eb78 4be5b9f9 60000000 3940f000 
> 7fa35040 41dd0044 57ff077a 7bff0020 <0b1f0000> 38210070 e8010010 eae1ffb8
> 
> In follow_page_mask() we have this code block right at the front, and this
> is where it fails.
> 
> 	page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
> 	if (!IS_ERR(page)) {
> 		BUG_ON(flags & FOLL_GET);
> 		return page;
> 	}
> 
> As you mentioned, we currently don't implement CONFIG_ARCH_WANT_GENERAL_HUGETLB.
> So we define a follow_huge_addr() function which returns a valid page struct
> for any given address. But then I don't understand why we BUG_ON for FOLL_GET.
> move_pages() had called follow_page() with the FOLL_GET flag in the first
> place, so this condition is going to be true every time. I am still digging
> into this, but meanwhile if you can throw some light on why we have the
> BUG_ON for the FOLL_GET flag it would really be helpful. I did not get many
> clues from the previous commit which added this BUG_ON.

I understand that the draft patch below is just a hack, but it does point at
the problem we should address. With it applied, the test case from commit
e66f17ff71772b2 ("mm/hugetlb: take page table lock in follow_huge_pmd()") is
no longer able to trigger the crash shown above.

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 744e24b..3d600162 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -668,8 +668,18 @@ struct page *
 follow_huge_pmd(struct mm_struct *mm, unsigned long address,
                pmd_t *pmd, int write)
 {
-       BUG();
-       return NULL;
+       struct page *page;
+
+       page = follow_huge_addr(mm, address, write & FOLL_WRITE);
+       if (!IS_ERR(page)) {
+               if (page && (write & FOLL_GET)) {
+                       if (PageHead(page))
+                               get_page(page);
+                       else
+                               page = NULL;
+               }
+       }
+       return page;
 }
 
 struct page *
diff --git a/mm/gup.c b/mm/gup.c
index 7bf19ff..04a80ee0 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -225,7 +225,12 @@ struct page *follow_page_mask(struct vm_area_struct *vma,
 
        page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
        if (!IS_ERR(page)) {
-               BUG_ON(flags & FOLL_GET);
+               if (page && (flags & FOLL_GET)) {
+                       if (PageHead(page))
+                               get_page(page);
+                       else
+                               page = NULL;
+               }
                return page;
        }

