[PATCH] powerpc/mm/hash: Fix the reference bit update when handling hash fault
benh at kernel.crashing.org
Tue May 31 08:27:42 AEST 2016
On Mon, 2016-05-30 at 09:39 -0700, Hugh Dickins wrote:
> I don't mean to be churlish, and subtract from your triumph in tracking
> this down (assuming you have), but that commit log... okay, it's intended
> for powerpc mmu experts, not me, but if it hasn't already gone into git,
> then a rewrite could be very helpful.
Something along these lines:
The powerpc hash table has R (Referenced) and C (Changed) bits that
somewhat correspond to Linux _PAGE_ACCESSED and _PAGE_DIRTY
respectively. However we don't currently use them.
Moreover, we also require them to never be updated by HW. This is due
to an optimization we have in the hash eviction code, which would be
racy vs. a hardware update as the HW updates are done non-atomically.
Thus it's critical that valid hash PTEs always have R set and writeable
ones have C set. We do this by hashing a non-dirty Linux PTE as
read-only and always setting _PAGE_ACCESSED (and thus R) when hashing
anything in. Any attempt by Linux at clearing those bits also removes
the corresponding hash entry.
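To make that invariant concrete, here is a rough sketch of the mapping. It is not the kernel code: the bit values and the names LINUX_PAGE_*, HASH_PTE_*, hash_flags_from_linux_pte() and hash_flags_valid() are all made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only (not the kernel's). */
#define LINUX_PAGE_RW       0x1u
#define LINUX_PAGE_DIRTY    0x2u
#define LINUX_PAGE_ACCESSED 0x4u

#define HASH_PTE_WRITE 0x1u
#define HASH_PTE_R     0x2u
#define HASH_PTE_C     0x4u

/*
 * Build hash PTE flags from a Linux PTE. Hashing anything in marks the
 * Linux PTE accessed, so R is always set. A writeable mapping is only
 * hashed as writeable once the Linux PTE is dirty, so C can be set up
 * front and HW never needs to update it.
 */
static uint32_t hash_flags_from_linux_pte(uint32_t *linux_pte)
{
    uint32_t hflags = HASH_PTE_R;

    *linux_pte |= LINUX_PAGE_ACCESSED;

    if ((*linux_pte & LINUX_PAGE_RW) && (*linux_pte & LINUX_PAGE_DIRTY))
        hflags |= HASH_PTE_WRITE | HASH_PTE_C;

    return hflags;
}

/* The invariant: R always set, and C set whenever the entry is writeable. */
static bool hash_flags_valid(uint32_t hflags)
{
    if (!(hflags & HASH_PTE_R))
        return false;
    if ((hflags & HASH_PTE_WRITE) && !(hflags & HASH_PTE_C))
        return false;
    return true;
}
```

With this shape, a clean read-write Linux PTE is hashed read-only, so the first store faults and Linux gets to set _PAGE_DIRTY before a writeable hash entry (with C already set) is installed.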
The old commit <.....> fixed an issue where we would miss setting C in
the specific case where a Linux PTE was upgraded from read only to
read-write (and appropriately made dirty). The hash code would realize
the hash PTE is already present and would use a different path than the
normal insertion path for updating a hash entry in-place. That path
unfortunately didn't update "C".
That commit however got a bit overzealous and also forced C on any
entry, including those that aren't writeable. That was unnecessary.
In commit 89ff725051d1, when converting that code to C (the language),
we mangled that up:
  - We kept the useless part of <....>, setting C always instead of
    only when _PAGE_DIRTY is set
  - We never set R, thus letting the HW do the racy updates.
This fixes it.
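As a rough before/after sketch of just the R/C computation (again with made-up names and bit values, not the actual patch):

```c
#include <stdint.h>

/* Hypothetical bit values, for illustration only (not the kernel's). */
#define LINUX_PAGE_DIRTY 0x2u
#define HASH_PTE_R       0x2u
#define HASH_PTE_C       0x4u

/* Behaviour described above as broken: C forced on every entry, R never set. */
static uint32_t rc_bits_before(uint32_t linux_pte)
{
    (void)linux_pte;    /* the Linux PTE state was not consulted */
    return HASH_PTE_C;
}

/* Intended behaviour: R always set, C only when the Linux PTE is dirty. */
static uint32_t rc_bits_after(uint32_t linux_pte)
{
    uint32_t bits = HASH_PTE_R;

    if (linux_pte & LINUX_PAGE_DIRTY)
        bits |= HASH_PTE_C;

    return bits;
}
```

The point of the second form is that software sets both bits before the entry becomes valid, so HW has nothing left to update.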