fsl booke MM vs. SMP questions
Kumar Gala
galak at kernel.crashing.org
Tue May 22 13:03:41 EST 2007
On May 21, 2007, at 2:06 AM, Benjamin Herrenschmidt wrote:
> Hi Folks !
>
> I see that the fsl booke code has some #ifdef CONFIG_SMP bits here or
> there, thus I suppose there are some SMP implementations of these
> right ?
There will be. The SMP code that exists was just some stuff I put in
w/o going through each case. The TLB mgmt code does need some fixup
for SMP.
- k
>
> I'm having some serious issues trying to figure out how the TLB
> management is made SMP safe however.
>
> There are at least two main issues I've spotted at this point (there's
> at least one more if there is HW threading, that is, the TLB is shared
> between logical processors, but I'll ignore that for now since I don't
> think there is such a thing ... yet).
>
> - How do you guys shield PTE flushing vs. TLB misses on another CPU ?
> That is, how do you prevent (if you do) the following scenario:
>
>       cpu 0                           cpu 1
>       tlb miss                        pte_clear (or similar)
>         load PTE value
>                                       write 0 to PTE (or replace)
>                                       tlbivax (tlbie)
>         tlbwe
>
> That scenario, as you can see, will leave you with stale entries in the
> TLB, which will ultimately lead to all sorts of unpleasant/random
> behaviours.
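
The interleaving in the diagram above can be replayed deterministically in
plain C. This is only a toy sketch: toy_mmu, its fields and the helper names
are hypothetical, the "TLB" is a single software slot standing in for cpu 0's
TLB, and the invalidate issued by cpu 1 is assumed to reach that TLB. It is
not the fsl booke handler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model: one PTE in memory plus one cached TLB slot. */
typedef struct {
    uint32_t pte;        /* Linux PTE in memory; 0 means "cleared" */
    uint32_t tlb_entry;  /* value cached in cpu 0's TLB */
    bool     tlb_valid;  /* whether that TLB slot is valid */
} toy_mmu;

/* cpu 0, first half of the miss handler: load the PTE into a register. */
static uint32_t miss_load_pte(const toy_mmu *m)
{
    return m->pte;
}

/* cpu 0, second half: install the loaded value (stands in for tlbwe). */
static void miss_write_tlb(toy_mmu *m, uint32_t loaded)
{
    m->tlb_entry = loaded;
    m->tlb_valid = true;
}

/* cpu 1: clear the PTE, then invalidate the TLB (stands in for tlbivax). */
static void clear_and_invalidate(toy_mmu *m)
{
    m->pte = 0;
    m->tlb_valid = false;
}

/* Replays the ordering from the diagram; true means the TLB is stale. */
static bool replay_race(uint32_t initial_pte)
{
    toy_mmu m = { .pte = initial_pte, .tlb_entry = 0, .tlb_valid = false };

    uint32_t loaded = miss_load_pte(&m);  /* cpu 0: load PTE value */
    clear_and_invalidate(&m);             /* cpu 1: write 0, invalidate */
    miss_write_tlb(&m, loaded);           /* cpu 0: tlbwe with the old value */

    return m.tlb_valid && m.tlb_entry != m.pte;
}
```

The invalidate lands between the load and the TLB write, so the entry
installed afterwards is never invalidated again.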
>
> If the answer is "oops ... we don't", then let's try to find out ways
> out of that since I may have a similar issue in a not too distant
> future :-) And I'm trying to find out a -fast- way to deal with that
> without bloating the fast path. My main problem is that I want to avoid
> taking a spin lock or equivalent atomic operation in the fast TLB reload
> path (which would solve the problem) since lwarx/stwcx. are generally
> real slow (hundreds of cycles on some processors).
>
> - I see that your TLB miss handler is using a non-atomic store to write
> the _PAGE_ACCESSED bit back to the PTE. Don't you have a similar race
> where something would do:
>
>       cpu 0                           cpu 1
>       tlb miss                        pte_clear (or similar)
>         load PTE value
>                                       write 0 to PTE (or replace)
>         write back PTE with _PAGE_ACCESSED
>         tlbwe
>
> This is an extension of the previous race but it's a different problem
> so I listed it separately. In that case, the problem is worse, since not
> only do you have a stale TLB entry, but you -also- have corrupted the
> Linux PTE by writing back the old value in it.
>
> At this point, I'm afraid you may have no choice but to go atomic, which
> means paying the cost of lwarx/stwcx. on TLB misses, though if you have
> a solution for the first problem, then you can avoid the atomic
> operation in the second problem if _PAGE_ACCESSED is already set.
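
A minimal sketch of that idea, written with C11 atomics in user space rather
than as the real assembly handler. The bit values and the name
pte_mark_accessed are hypothetical. On PowerPC a compare-and-swap like this is
built from a lwarx/stwcx. loop, so the point of the fast path is to skip it
whenever _PAGE_ACCESSED is already set. It only addresses the PTE corruption
in the second race, not the stale TLB entry from the first one.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, not the real fsl booke PTE format. */
#define PAGE_PRESENT  0x001u
#define PAGE_ACCESSED 0x100u

/*
 * Returns true and stores the PTE value to load into the TLB in *out,
 * or false if the PTE is not present and the miss needs the slow path.
 */
static bool pte_mark_accessed(_Atomic uint32_t *ptep, uint32_t *out)
{
    uint32_t old = atomic_load_explicit(ptep, memory_order_relaxed);

    for (;;) {
        if (!(old & PAGE_PRESENT))
            return false;

        if (old & PAGE_ACCESSED) {
            /* Fast path: nothing to write back, so no atomic update. */
            *out = old;
            return true;
        }

        /*
         * The store succeeds only if the PTE still holds the value read
         * above, so a concurrent pte_clear is never overwritten with the
         * old contents. On failure `old` is refreshed and we re-check.
         */
        if (atomic_compare_exchange_weak_explicit(ptep, &old,
                                                  old | PAGE_ACCESSED,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
            *out = old | PAGE_ACCESSED;
            return true;
        }
    }
}
```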
>
> If not, you might have to use a _PAGE_BUSY bit similar to what 64-bit
> uses as a per-PTE lock, or use mmu_hash_lock... Unless you come up with
> a great idea or some HW black magic that makes the problem go away...
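
For reference, a per-PTE lock bit of the kind mentioned above could look
roughly like the following, again in C11 atomics. PAGE_BUSY's value and the
names pte_lock and pte_unlock are hypothetical, and this is only loosely
modelled on the idea, not copied from the 64-bit code. Both the miss handler
and pte_clear would have to take the lock, which puts an atomic operation on
every TLB miss, i.e. exactly the cost the mail is trying to avoid.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock bit; a real PTE layout would reserve its own bit. */
#define PAGE_BUSY 0x800u

/* Spin until the BUSY bit is ours; returns the PTE value seen, BUSY clear. */
static uint32_t pte_lock(_Atomic uint32_t *ptep)
{
    for (;;) {
        uint32_t old = atomic_fetch_or_explicit(ptep, PAGE_BUSY,
                                                memory_order_acquire);
        if (!(old & PAGE_BUSY))
            return old;  /* the bit was clear before our OR: lock taken */

        /* Someone else holds the lock; wait for the bit to drop. */
        while (atomic_load_explicit(ptep, memory_order_relaxed) & PAGE_BUSY)
            ;
    }
}

/* Publish the new PTE value and drop the lock in a single release store. */
static void pte_unlock(_Atomic uint32_t *ptep, uint32_t new_val)
{
    atomic_store_explicit(ptep, new_val & ~PAGE_BUSY, memory_order_release);
}
```

With this, pte_clear becomes pte_lock(), TLB invalidate, pte_unlock(ptep, 0),
and the miss handler holds the lock from its PTE load until after tlbwe.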
>
> In any case, I'm curious about how you have or intend to solve that
> since as I said above, I might be in a similar situation soon and am
> trying to keep the TLB miss handler as fast as humanly possible.
>
> Cheers,
> Ben.
>