[PATCH next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff

Hugh Dickins hughd at google.com
Mon Jan 11 18:33:01 AEDT 2016


On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> Hugh Dickins <hughd at google.com> writes:
> > On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote:
> >> Hugh Dickins <hughd at google.com> writes:
> >> 
> >> > Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
> >> > but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
> >> > _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
> >> > discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
> >> > recognized.
> >> >
> >> > (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
> >> > CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
> >> > attempt to solve this problem.)
> >> >
> >> > It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and
> >> > and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
> >> > should be made more robust; but nevertheless, fix up the powerpc ifdefs
> >> > as x86_64 and s390 (which met the same problem) have them, defining the
> >> > bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.
> >> 
> >> Do we need this patch, if we make the maybe_same_pte() more robust. The
> >> #ifdef with pte bits is always a confusing one and IMHO, we should avoid
> >> that if we can ?
> >
> > If maybe_same_pte() were more robust (as in the pte_same_as_swp() patch),
> > this patch here becomes an optimization rather than a correctness patch:
> > without this patch here, pte_same_as_swp() will perform an unnecessary 
> > transformation (masking out _PAGE_SWP_SOFT_DIRTY) from every one of the
> > millions of ptes it has to examine, on configs where it couldn't be set.
> > Or perhaps the processor gets that all nicely lined up without any actual
> > delay, I don't know.
> 
> But we have
> #ifndef CONFIG_HAVE_ARCH_SOFT_DIRTY
> static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
> {
> 	return pte;
> }
> #endif 
> 
> If we fix the CONFIG_HAVE_ARCH_SOFT_DIRTY correctly, we can do the same
> optmization without the #ifdef of pte bits right ?

I'm not sure that I understand you (I'll have to look at your patch),
but suspect you're not optimizing the CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_MEM_SOFT_DIRTY not set case.

Which would not be the end of the world, but...
 
> >
> > I've already agreed that the way SOFT_DIRTY is currently config'ed is
> > too confusing; but until that's improved, I strongly recommend that you
> > follow the same way of handling this as x86_64 and s390 are doing - going
> > off and doing it differently is liable to lead to error, as we have seen.

... as before, I don't think that doing it differently is a good idea.

Hugh


More information about the Linuxppc-dev mailing list