[PATCH next] powerpc/mm: fix _PAGE_PTE breaking swapoff

Hugh Dickins hughd at google.com
Mon Jan 11 16:45:45 AEDT 2016


On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> Hugh Dickins <hughd at google.com> writes:
> 
> > Swapoff after swapping hangs on the G5.  That's because the _PAGE_PTE
> > bit, added by set_pte_at(), is not expected by swapoff: so swap ptes
> > cannot be recognized.
> >
> > I'm not sure whether a swap pte should or should not have _PAGE_PTE set:
> > this patch assumes not, and fixes set_pte_at() to set _PAGE_PTE only on
> > present entries.
> 
> One of the reasons we added _PAGE_PTE is to enable HUGETLB migration. So
> we want migration ptes to have _PAGE_PTE set.

Okay, I won't pretend to understand the role of _PAGE_PTE in that;
but if it helps you to have _PAGE_PTE set in (swap and) migration entries,
that's very easily done with the alternative I suggested for pgtable.h:

-#define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val((pte)) })
-#define __swp_entry_to_pte(x)		__pte((x).val)
+#define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
+#define __swp_entry_to_pte(x)	__pte((x).val | _PAGE_PTE)

I did test that variant (with set_pte_at() restored to how you have it);
but not understanding _PAGE_PTE, I thought it odd to have in a swap entry.
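
To make the intended round trip concrete, here is a minimal userspace
sketch of that variant. It is not kernel code: the model_* names and the
MODEL_PAGE_PTE bit value are made up for illustration, and it assumes the
swap entry value never uses that bit itself.

```c
#include <assert.h>

/* Assumed bit value, for illustration only; the real _PAGE_PTE is
 * defined by the powerpc headers. */
#define MODEL_PAGE_PTE 0x2UL

/* Hypothetical stand-ins for pte_t and swp_entry_t. */
typedef struct { unsigned long pte; } model_pte_t;
typedef struct { unsigned long val; } model_swp_entry_t;

static unsigned long model_pte_val(model_pte_t pte)
{
	return pte.pte;
}

/* Mirrors the proposed __pte_to_swp_entry(): strip the marker bit. */
static model_swp_entry_t model_pte_to_swp_entry(model_pte_t pte)
{
	model_swp_entry_t entry = { model_pte_val(pte) & ~MODEL_PAGE_PTE };
	return entry;
}

/* Mirrors the proposed __swp_entry_to_pte(): add the marker bit. */
static model_pte_t model_swp_entry_to_pte(model_swp_entry_t entry)
{
	model_pte_t pte = { entry.val | MODEL_PAGE_PTE };
	return pte;
}

/* Returns 1 if the pte form carries the marker bit and converting it
 * back yields the original entry value. */
static int model_round_trip_ok(unsigned long val)
{
	model_swp_entry_t in = { val & ~MODEL_PAGE_PTE };
	model_pte_t pte = model_swp_entry_to_pte(in);
	model_swp_entry_t out = model_pte_to_swp_entry(pte);

	return (model_pte_val(pte) & MODEL_PAGE_PTE) != 0 && out.val == in.val;
}
```

With the bit added and stripped symmetrically like this, the generic swap
code never sees it, while the value stored in the page table always
carries it.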

> 
> >
> > But if that's wrong, a reasonable alternative would be to
> > #define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) & ~_PAGE_PTE })
> > #define __swp_entry_to_pte(x)	__pte((x).val | _PAGE_PTE)
> >
> 
> We do clear the _PAGE_PTE bit when converting swp_entry_t to type and
> offset. Can you share the stack trace for the hang? It will help me
> understand this more.

The stack trace can be anywhere below try_to_unuse() in mm/swapfile.c,
since swapoff is circling around and around that function, reading from
each used swap block into a page, then trying to find where that page
belongs, looking at every non-file pte of every mm that ever swapped.

The code to look at is unuse_pte_range(), which at the top does
	pte_t swp_pte = swp_entry_to_pte(entry);
to get the form it hopes to find in the page table; then scans doing
		if (unlikely(maybe_same_pte(*pte, swp_pte))) {
on each pte slot.  Ignoring the MEM_SOFT_DIRTY complication (which
had its own independent bug) maybe_same_pte() just does pte_same().
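
The mismatch can be sketched in userspace roughly as follows. This is
only a model under stated assumptions, not the kernel code: the sim_*
names and SIM_PAGE_PTE value are hypothetical, sim_set_pte_at() stands in
for a set_pte_at() that adds the bit unconditionally, and sim_pte_same()
is a plain equality test.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit value, for illustration only. */
#define SIM_PAGE_PTE 0x2UL

/* Hypothetical stand-in for pte_t. */
typedef struct { unsigned long pte; } sim_pte_t;

/* Plain value comparison, as pte_same() is described above. */
static int sim_pte_same(sim_pte_t a, sim_pte_t b)
{
	return a.pte == b.pte;
}

/* Models a set_pte_at() that adds the marker bit to every entry it
 * stores, including swap entries. */
static void sim_set_pte_at(sim_pte_t *slot, sim_pte_t pte)
{
	slot->pte = pte.pte | SIM_PAGE_PTE;
}

/* Models the scan in unuse_pte_range(): returns the index of the first
 * slot equal to swp_pte, or -1 if none matches. */
static long sim_find_swap_pte(const sim_pte_t *table, size_t n,
			      sim_pte_t swp_pte)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (sim_pte_same(table[i], swp_pte))
			return (long)i;
	}
	return -1;
}
```

If the sought pte is built without the bit while the stored one has it,
sim_find_swap_pte() returns -1 for every slot, which in the model
corresponds to swapoff never finding an owner for the page and circling
forever.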

Hugh

