[2.4] [PATCH] hash_page rework, take 2

Benjamin Herrenschmidt benh at kernel.crashing.org
Thu Feb 5 10:07:08 EST 2004

Hi Julie !

> OK, I managed to convince myself that using an eieio is ok here. I was
> concerned that other processors might not have seen any of the stores
> that preceded the eieio instruction, since eieio is normally only used
> when dealing with device memory. lwsync ensures other processors have
> seen any stores to system memory at the point the lock is released. But
> the only stores that matter here are the hpte (and it is sync'd) and the
> pte and it has the lock bit. So when another processor sees the pte
> contents without the lock bit set it will, by default, be seeing the
> updated value as well.

eieio enforces store ordering on cacheable accesses too, which is all
we should need at this point.
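
To illustrate why the lock bit living in the PTE word itself makes this
safe, here is a rough userspace sketch using C11 atomics. It is not the
kernel code: PTE_BUSY, pte_trylock and pte_update_and_unlock are
hypothetical names, and memory_order_release is only a portable stand-in
(it asks for more than the store-store ordering eieio gives). The point
is that the new PTE contents and the cleared lock bit go out in a single
store, so a CPU that sees the bit clear also sees the update.

```c
#include <stdatomic.h>
#include <stdint.h>

#define PTE_BUSY 0x1ULL  /* hypothetical lock bit, standing in for PAGE_BUSY */

/* Try to take the lock bit; returns 1 on success, 0 if it is already held. */
static int pte_trylock(_Atomic uint64_t *pte)
{
    uint64_t old = atomic_load_explicit(pte, memory_order_relaxed);

    if (old & PTE_BUSY)
        return 0;
    return atomic_compare_exchange_strong_explicit(
        pte, &old, old | PTE_BUSY,
        memory_order_acquire, memory_order_relaxed);
}

/*
 * Set new_flags in the PTE and drop the lock bit with one store.
 * Earlier stores (for example to the hpte) are ordered before this one.
 */
static void pte_update_and_unlock(_Atomic uint64_t *pte, uint64_t new_flags)
{
    uint64_t val = atomic_load_explicit(pte, memory_order_relaxed);

    val = (val | new_flags) & ~PTE_BUSY;
    atomic_store_explicit(pte, val, memory_order_release);
}
```

A caller would take the bit with pte_trylock(), do its hpte work, then
call pte_update_and_unlock() to publish the result and release in one go.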

> So it is true that an interrupt handler can cause a page fault? Can you
> provide me with an example?

Not really a "page fault" in the linux sense, but rather a hash miss,
yes. Typically, a driver accessing ioremap'ed IO space or a module
running vmalloc'ed memory can trigger a hash miss.

With my 2.6 implementation, there shouldn't be a problem as only
hash_page will set PAGE_BUSY and this is done with interrupts off,
so it can't be re-entered on the same CPU.

> Let me see if I understand this. When someone wants to free a page
> pointed to by an entry in a 3rd level page table, they clear out the pte
> in the page table using pte_clear(). Then they call pte_free with the
> address of the page they are freeing up (not really a page table entry
> but the actual page address). This page address is added to the batch
> list. Later, the idle loop or process termination code calls
> do_check_pgt_cache which will free all the pages in the batch list.

Yes. What we need is to make sure no CPU was walking the page tables
when the pte_clear occurred, that is, that no CPU is still using the
PTEs in the page we are about to get rid of. That basically means we
must make sure that no CPU that was in hash_page at the time of the
pte_clear is still in that function.
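
As a rough model of that rule, here is a single-threaded userspace sketch
of a batch list that is only flushed once no CPU is marked as being in
hash_page. Everything in it is an assumption made for illustration: the
in_hash_page[] flags, NCPUS, BATCH_MAX, pte_free_batched and
check_pgt_batch are hypothetical, malloc/free stand in for page
allocation, and a real SMP version would need atomic flags and barriers.
Waiting until nobody at all is in hash_page is stricter than required,
but it is the simplest condition that is certainly sufficient.

```c
#include <stdlib.h>

#define NCPUS     4   /* hypothetical CPU count */
#define BATCH_MAX 16  /* hypothetical batch capacity */

static int in_hash_page[NCPUS];  /* 1 while that CPU is inside hash_page */
static void *batch[BATCH_MAX];   /* PTE pages waiting to be freed */
static int batch_len;

/* Queue a PTE page for deferred freeing; returns 0, or -1 if the batch is full. */
static int pte_free_batched(void *page)
{
    if (batch_len == BATCH_MAX)
        return -1;
    batch[batch_len++] = page;
    return 0;
}

/*
 * Free the whole batch, but only if no CPU is currently in hash_page.
 * Returns the number of pages freed (0 if the flush had to be postponed).
 */
static int check_pgt_batch(void)
{
    int freed;

    for (int cpu = 0; cpu < NCPUS; cpu++)
        if (in_hash_page[cpu])
            return 0;

    freed = batch_len;
    for (int i = 0; i < batch_len; i++)
        free(batch[i]);
    batch_len = 0;
    return freed;
}
```

In this model the pte_clear happens before pte_free_batched() is called,
so any walker that could still hold a pointer into the page shows up in
in_hash_page[] and simply postpones the flush.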


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/
