[PATCH v2] POWERPC: Allow 32-bit pgtable code to support 36-bit physical

Benjamin Herrenschmidt benh at kernel.crashing.org
Fri Aug 29 08:42:01 EST 2008

> I understand what you're saying, I've been here before :)  However, I  
> was never able to convince myself that it's safe without the lwarx/ 
> stwcx.  There's hashing code that wanks around with the HASHPTE bit  
> doing a RMW without holding any lock (other than lwarx/stwcx-ing the  
> PTE itself).  And there's definitely code that's fairly far removed  
> from the last time you checked that an entry was valid.  I've CC'd Ben  
> on this mail - perhaps he can shed some light.  If we don't need the  
> atomics, I'm happy to yank them.
> Now, it *does* seem like set_pte_at could be optimized for the non-SMP  
> case....  I'll have to chew on that one a bit.

I haven't read the whole discussion nor reviewed the patches yet, so I'm
just answering the above paragraph before I head off for the week-end
(and yes, Becky, I _DO_ have reviewing your stuff high on my todo list,
but I've been really swamped these last 2 weeks).

So all generic code always accesses PTEs with the PTE lock held (that
lock can be in various places ... typically for us it's one per PTE

44x and FSL BookE no longer write to the PTEs without that lock held
and thus don't require atomic access in set_pte and friends.

Hash based platforms still do because of -one- thing : the hashing code
proper which writes back using lwarx/stwcx. to update _PAGE_ACCESSED,
_PAGE_HASHPTE and _PAGE_DIRTY. The hash code does take a global lock to
avoid double-hashing of the same page but this isn't something we should
use elsewhere.
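
To make the pattern concrete, here is a rough user-space sketch of that
kind of lockless flag update, written with C11 atomics rather than the
real lwarx/stwcx. assembly. The EX_PAGE_* values and the ex_pte_update()
name are made up for illustration; they are not the kernel's definitions.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only; the real _PAGE_*
 * definitions live in the kernel headers and differ per platform. */
#define EX_PAGE_ACCESSED 0x001u
#define EX_PAGE_DIRTY    0x002u
#define EX_PAGE_HASHPTE  0x004u

/* Atomically clear 'clr' and set 'set' in *ptep and return the old value.
 * The compare-exchange loop stands in for lwarx/stwcx.: if someone else
 * changes the word between our load and our store, the store fails and
 * we retry against the fresh value, so no concurrent update is lost. */
uint32_t ex_pte_update(_Atomic uint32_t *ptep, uint32_t clr, uint32_t set)
{
    uint32_t old = atomic_load_explicit(ptep, memory_order_relaxed);
    uint32_t val;

    do {
        val = (old & ~clr) | set;
        /* On failure, 'old' is reloaded with the current contents. */
    } while (!atomic_compare_exchange_weak_explicit(ptep, &old, val,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
    return old;
}
```

Anything that modifies the same word without holding the hash lock would
have to go through a helper of this shape for the update to be safe.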

So on hash based platforms, updates of the PTEs are expected to be done
atomically. Now if you extend the size of the PTE, those atomic updates
are still necessary, but hopefully only for the -low- part of the PTE
that contains those bits.
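
As a sketch of what that could look like (the struct layout and names
below are assumptions for illustration, not what the patch does), only
the word carrying the flag bits needs an atomic read-modify-write:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit PTE split into two 32-bit words. The assumption
 * is that every flag the hash code touches sits in the low word. */
struct ex_pte64 {
    uint32_t hi;          /* upper physical address bits, written only
                           * with the PTE lock held */
    _Atomic uint32_t lo;  /* flags and low address bits, also updated
                           * by the lockless hash path */
};

/* Atomically OR 'set' into the low word and return its previous value.
 * The high word is never read or written here. */
uint32_t ex_pte64_set_flags(struct ex_pte64 *pte, uint32_t set)
{
    return atomic_fetch_or_explicit(&pte->lo, set, memory_order_relaxed);
}
```

The high word can then be written with a plain store by whoever holds
the PTE lock, as long as nothing on the lockless path ever touches it.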

For the non-SMP case, I think it should be possible to optimize it. The
only thing that can happen at interrupt time is hashing of kernel or
vmalloc/ioremap pages, which shouldn't compete with set_pte on those
pages, so there would be no access races there, but I may be missing
something as it's the morning and I just about woke up :-)
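
A rough sketch of the kind of non-SMP shortcut meant here, assuming the
reasoning above holds. EX_CONFIG_SMP and ex_set_pte_bits() are made-up
names for illustration; the real code would key off the kernel's own
config option and helpers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Clear 'clr' and set 'set' in *ptep, returning the old value.
 * With EX_CONFIG_SMP defined (hypothetical switch) the update is a
 * compare-exchange retry loop. Without it, a separate load and store
 * are used, which is only safe if nothing that runs at interrupt time
 * can modify the same PTE in between. */
uint32_t ex_set_pte_bits(_Atomic uint32_t *ptep, uint32_t clr, uint32_t set)
{
    uint32_t old = atomic_load_explicit(ptep, memory_order_relaxed);

#ifdef EX_CONFIG_SMP
    /* 'old' is refreshed on each failed attempt, so the new value is
     * recomputed from the current contents before every retry. */
    while (!atomic_compare_exchange_weak_explicit(ptep, &old,
                                                  (old & ~clr) | set,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ;
#else
    /* Uniprocessor path: ordinary load + store, no reservation. */
    atomic_store_explicit(ptep, (old & ~clr) | set, memory_order_relaxed);
#endif
    return old;
}
```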

