[RFC v2 12/21] mm: thp: handle split failure in device migration
Nico Pache
npache at redhat.com
Tue Mar 10 02:09:26 AEDT 2026
On Thu, Mar 5, 2026 at 9:55 AM Usama Arif <usama.arif at linux.dev> wrote:
>
>
>
> On 02/03/2026 21:20, Nico Pache wrote:
> > On Thu, Feb 26, 2026 at 4:34 AM Usama Arif <usama.arif at linux.dev> wrote:
> >>
> >> Device memory migration has two call sites that split huge PMDs:
> >>
> >> migrate_vma_split_unmapped_folio():
> >> Called from migrate_vma_pages() when migrating a PMD-mapped THP to a
> >> destination that doesn't support compound pages. It splits the PMD
> >> then splits the folio via folio_split_unmapped().
> >>
> >> If the PMD split fails, folio_split_unmapped() would operate on an
> >> unsplit folio with inconsistent page table state. Propagate -ENOMEM
> >> to skip this page's migration. This is safe as folio_split_unmapped
> >> failure would be propagated in a similar way.
> >>
> >> migrate_vma_insert_page():
> >> Called from migrate_vma_pages() when inserting a page into a VMA
> >> during migration back from device memory. If a huge zero PMD exists
> >> at the target address, it must be split before PTE insertion.
> >>
> >> If the split fails, the subsequent pte_alloc() and set_pte_at() would
> >> operate on a PMD slot still occupied by the huge zero entry. Use
> >> goto abort, consistent with other allocation failures in this function.
> >>
> >> Signed-off-by: Usama Arif <usama.arif at linux.dev>
> >> ---
> >> mm/migrate_device.c | 16 ++++++++++++++--
> >> 1 file changed, 14 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> >> index 78c7acf024615..bc53e06fd9735 100644
> >> --- a/mm/migrate_device.c
> >> +++ b/mm/migrate_device.c
> >> @@ -909,7 +909,13 @@ static int migrate_vma_split_unmapped_folio(struct migrate_vma *migrate,
> >> int ret = 0;
> >>
> >> folio_get(folio);
> >
> > Should we be concerned about this folio_get? Are we incrementing a
> > reference that was already held if we back out of the split?
> >
> > -- Nico
>
>
>
> Hi Nico,
>
> Thanks for pointing this out. It spun out to an entire investigation for me [1].
Hey Usama,
I'm sorry my question lead you down a rabbit hole but I'm glad you did
the proper investigation and found the correct answer :) Thanks for
looking into it and for clearing that up via a comment!
Cheers,
-- Nico
>
> Similar to [1], I inserted trace prints [2] and created a new __split_huge_pmd2
> that always returns -ENOMEM. Without folio_put on error [3], we get a refcount of 2.
>
> hmm-tests-129 [000] .l... 1.485514: __migrate_device_finalize: FINALIZE[0]: src=ffb48827440e8000 dst=ffb48827440e8000 src==dst=1 refcount_src=2 mapcount_src=0 order_src=9 migrate=0 BEFORE remove_migration_ptes
> hmm-tests-129 [000] .l... 1.485517: __migrate_device_finalize: FINALIZE[0]: src=ffb48827440e8000 refcount=3 mapcount=1 AFTER remove_migration_ptes
> hmm-tests-129 [000] .l... 1.485518: __migrate_device_finalize: FINALIZE[0]: src=ffb48827440e8000 refcount=2 AFTER folio_put(src)
>
>
> With folio_put on error [4], we get a refcount of 1.
>
> hmm-tests-129 [001] ..... 1.492216: __migrate_device_finalize: FINALIZE[0]: src=fff7b8be840f0000 dst=fff7b8be840f0000 src==dst=1 refcount_src=1 mapcount_src=0 order_src=9 migrate=0 BEFORE remove_migration_ptes
> hmm-tests-129 [001] ..... 1.492219: __migrate_device_finalize: FINALIZE[0]: src=fff7b8be840f0000 refcount=2 mapcount=1 AFTER remove_migration_ptes
> hmm-tests-129 [001] ..... 1.492220: __migrate_device_finalize: FINALIZE[0]: src=fff7b8be840f0000 refcount=1 AFTER folio_put(src)
>
>
> So we need folio_put for split_huge_pmd_address failure, but NOT for
> folio_split_unmapped.
>
>
> [1] https://lore.kernel.org/all/332c9e16-46c3-4e1c-898e-2cb0a87ba1fc@linux.dev/
> [2] https://gist.github.com/uarif1/6abe4bedb85814e9be8d48a4fe742b41
> [3] https://gist.github.com/uarif1/f718af2113bc1a33484674b61b9dafcc
> [4] https://gist.github.com/uarif1/03c42f2549eaf2bc555e8b03e07a63c8
>
More information about the Linuxppc-dev
mailing list