[PATCH] [2.4] [RHEL] Backport of benh's PTE mgmt changes
Benjamin Herrenschmidt
benh at kernel.crashing.org
Wed Jan 7 16:08:09 EST 2004
On Wed, 2004-01-07 at 11:01, olof at austin.ibm.com wrote:
> Below is a 2.4 backport of parts of benh's 2.6 pte_free rewrite. It's
> different in a few ways:
>
> 1. 2.4 has no RCU. Instead I just send a synchronous IPI to all processors.
> Since the IPI won't be delivered until a processor is out of hash_page, it
> can be used as a barrier between new and old traversals.
But it is also quite expensive....
> 2. There's no batching of TLB shootdowns, like in 2.6. So I had to hijack
> do_check_pgt_cache(). This is ugly, and I'm not too happy about it, but
> I think RedHat would be more likely to accept this than a change in
> generic code (at this point in the product cycle). Julie, feel free to
> prove me wrong. :-)
>
> 3. Because of the above reason, I had to add an extra per-cpu lock for the
> pte_freelist_batch structures.
>
> 4. The __hash_page locking is rougher than in 2.6. I left the hash locks
> there, since I believe they are still needed.
>
> 5. I recycled _PAGE_HASHNOIX, since it's never used. There were no other
> free bits available...
I moved bits around in 2.6. Basically, _PAGE_FILE can be moved to make
room, since it's only used when !_PAGE_PRESENT.
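
To illustrate why a flag that only matters when !_PAGE_PRESENT is cheap to relocate, here is a minimal sketch in which such a flag overlays a bit that is otherwise only interpreted on present PTEs. The bit values and the names PAGE_OTHER, pte_file() and pte_other() are hypothetical, chosen for the example; they are not taken from the 2.6 headers or from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration only, not the real ppc64 layout. */
#define PAGE_PRESENT UINT64_C(0x001)
#define PAGE_OTHER   UINT64_C(0x002) /* interpreted only when PAGE_PRESENT is set   */
#define PAGE_FILE    UINT64_C(0x002) /* interpreted only when PAGE_PRESENT is clear */

static_assert(PAGE_FILE != PAGE_PRESENT, "the overlaid bit must not be the present bit");

static bool pte_present(uint64_t pte)
{
    return (pte & PAGE_PRESENT) != 0;
}

/* The same bit is read as "file offset PTE" only on non-present entries... */
static bool pte_file(uint64_t pte)
{
    return !pte_present(pte) && (pte & PAGE_FILE);
}

/* ...and as the other flag only on present entries. */
static bool pte_other(uint64_t pte)
{
    return pte_present(pte) && (pte & PAGE_OTHER);
}
```

Because every reader tests the present bit first, the two meanings never collide, and the bit that the file flag used to occupy becomes available for something else.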
>
> (6. RedHat disabled the fast PTE/PMD/PGD allocator, so the patch won't
> apply cleanly to an ameslab or marcelo 2.4 tree, but the differences are
> pretty obvious.)
>
>
>
> I think that's it. Please provide feedback. We're working on a deadline
> with RedHat, so sooner is better than later. I'll be beating on this with
> the specweb benchmark over the next couple of days as well. :-)
Comments in the patch.
> + /*
> + * Check the user's access rights to the page. If access should be
> + * prevented then send the problem up to do_page_fault.
> + */
> +
> access |= _PAGE_PRESENT;
> - if (unlikely(access & ~(pte_val(*ptep)))) {
> +
> + /* We'll do access checking and _PAGE_BUSY setting in assembly, since
> + * it needs to be atomic.
> + */
> +
> + __asm__ __volatile__ ("\n
> + 1: ldarx %0,0,%3\n
> + # Check access rights (access & ~(pte_val(*ptep)))\n
> + andc. %1,%2,%0\n
> + bne- 2f\n
> + # Check if PTE is busy\n
> + andi. %1,%0,%4\n
> + bne- 1b\n
> + ori %0,%0,%4\n
> + # Write the linux PTE atomically (setting busy)\n
> + stdcx. %0,0,%3\n
> + bne- 1b\n
> + li %1,1\n
> + b 3f\n
> + 2: stdcx. %0,0,%3 # to clear the reservation\n
> + li %1,0\n
> + 3:"
> + : "=r" (old_pte), "=r" (access_ok)
> + : "r" (access), "r" (ptep), "i" (_PAGE_BUSY)
> + : "cc", "memory");
.../...
Heh, so you kept the C version and stuffed the atomic asm into
it :) Why not... well, it's definitely less invasive than what
I did in 2.6, but also slower, since I optimized the
branches to the ppc_md. hooks. That's probably ok for 2.4 though.
> + /* Clear _PAGE_BUSY flag atomically. */
> + __asm__ __volatile__ ("
> + 1: ldarx %0,0,%2\n
> + andc. %0,%0,%1\n
> + stdcx. %0,0,%2\n
> + bne- 1b\n"
> + : "=r" (new_pte)
> + : "r" (tmp), "r" (ptep)
> + : "cc", "memory");
I'm not sure we need to clear _PAGE_BUSY atomically.... I definitely
don't in 2.6... But we need to make sure this clear happens after
anything that was done previously.
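
A rough sketch of that idea, again in userspace C11 rather than kernel code: if the holder of the busy bit is the only writer of the PTE while it is held (an assumption here, not something the patch guarantees), a plain load followed by a release store is enough, and the release ordering keeps the clear after everything done earlier. The name pte_unlock() and the bit value are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PAGE_BUSY UINT64_C(0x800) /* hypothetical bit value */

/*
 * Clear PAGE_BUSY without a read-modify-write loop.
 * Assumes the caller holds PAGE_BUSY and nobody else writes the PTE meanwhile.
 */
static void pte_unlock(_Atomic uint64_t *ptep)
{
    uint64_t pte = atomic_load_explicit(ptep, memory_order_relaxed);

    assert(pte & PAGE_BUSY); /* caller must hold the busy bit */

    /* Release: earlier loads and stores are ordered before this store. */
    atomic_store_explicit(ptep, pte & ~PAGE_BUSY, memory_order_release);
}
```

If other paths can modify the PTE while the busy bit is held, this is not safe and the atomic loop in the patch is the right choice.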
The rest is a bit scary but it's 2.4 so... :) I suppose it should
work, though I would have to spend more time looking at the code path
in detail.
Ben.
** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/