hash table
Benjamin Herrenschmidt
benh at kernel.crashing.org
Wed Dec 10 11:02:52 EST 2003
> I can see a race between find_linux_pte() and the use of ptep in
> __hash_page. Another CPU can come in during that window and deallocate
> the PTE, can't it? One solution for this is to set _PAGE_BUSY in
> find_linux_pte() atomically during lookup. There are even more subtle
> races in the sense that the tree is walked while someone might update it
> underneath the lookup, but maybe those can be ignored?
Yup, this race is on my list already ;)
I want to move find_linux_pte down into __hash_page anyway, but that's
not how to fix this race.
AFAIK, the only race is (very unlikely but definitely there) if we free
a PTE page on one CPU while we are in hash_page() on another CPU.
Paulus proposed a fix for this, which consists of delaying the actual
freeing of PTE pages: we gather them into a list and free them either
after a given threshold is reached or, after a while, at idle time.
When we actually go to free them, we use an IPI to sync with the other
CPUs, making sure none of them is in hash_page(). At that point we'll
have already cleared the pmd entries, so we know no CPU will walk down
to those PTE pages on any further hash_page().
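Roughly, that scheme could look like the sketch below. All of the names
here (pte_freelist_batch, pte_free_deferred, pte_free_flush,
pte_free_sync) and the batch size are made up for illustration -- this is
not the actual patch -- and smp_call_function() is used in its 4-argument
form:

	#include <linux/mm.h>
	#include <linux/smp.h>
	#include <linux/percpu.h>

	#define PTE_FREELIST_SIZE	64

	struct pte_freelist_batch {
		unsigned int	index;
		struct page	*pages[PTE_FREELIST_SIZE];
	};

	static DEFINE_PER_CPU(struct pte_freelist_batch, pte_freelist);

	/* Empty IPI handler: the only point of running it on a CPU is to
	 * prove that CPU is not (or no longer) inside hash_page(). */
	static void pte_free_sync(void *unused)
	{
	}

	/* Flush the batch: IPI all other CPUs so we know none of them is
	 * still walking the old PTE pages, then really free them.  An
	 * idle-time caller of this (the "after a while" part) is omitted. */
	static void pte_free_flush(struct pte_freelist_batch *batch)
	{
		unsigned int i;

		smp_call_function(pte_free_sync, NULL, 0, 1);
		for (i = 0; i < batch->index; i++)
			__free_page(batch->pages[i]);
		batch->index = 0;
	}

	/* Called instead of freeing a PTE page directly.  The pmd entry
	 * has already been cleared, so no new hash_page() can reach the
	 * page; we only have to wait out CPUs already past the pmd. */
	void pte_free_deferred(struct page *ptepage)
	{
		struct pte_freelist_batch *batch = &get_cpu_var(pte_freelist);

		batch->pages[batch->index++] = ptepage;
		if (batch->index == PTE_FREELIST_SIZE)
			pte_free_flush(batch);
		put_cpu_var(pte_freelist);
	}

The key property is the IPI round trip: once every other CPU has taken
the (empty) IPI, none of them can still be using a ptep it found through
the now-cleared pmd, so the pages are safe to give back.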
>Also two minor comments:
>
> * in pte_update, use _PAGE_BUSY instead of hardcoded 0x0800? Would
> increase readability a little.
Yeah, maybe. I didn't feel like adding another argument to the asm
statement (I hate that syntax), but you are probably right ;)
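Concretely, the change would be something like the reconstruction below
(the function body is from memory and only meant to show the shape):
_PAGE_BUSY goes in as an extra immediate operand (%6) instead of being
hardcoded in the andi.:

	static inline unsigned long pte_update(pte_t *ptep, unsigned long clr)
	{
		unsigned long old, tmp;

		__asm__ __volatile__(
		"1:	ldarx	%0,0,%3\n"	/* load the PTE with a reservation */
		"	andi.	%1,%0,%6\n"	/* busy?  then spin until it clears */
		"	bne-	1b\n"
		"	andc	%1,%0,%4\n"	/* clear the requested bits */
		"	stdcx.	%1,0,%3\n"	/* conditional store, retry on failure */
		"	bne-	1b"
		: "=&r" (old), "=&r" (tmp), "=m" (*ptep)
		: "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY)
		: "cc");

		return old;
	}

The extra operand costs nothing at runtime; it just means the busy bit
is spelled once, where _PAGE_BUSY is defined.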
> * in __hash_page / htab_wrong_access: There's no check for failed stdcx.
That's intentional: the only point of this stdcx. is to not leave a
dangling reservation. I don't care whether it succeeds, as the value I'm
writing back is the original value, intact.
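In other words, the failure path just needs something along these lines
(illustrative helper, not the actual kernel code) so the CPU doesn't
leave hash_page() still holding the reservation from the earlier ldarx:

	/* Store the unmodified value back once to drop the reservation.
	 * No bne- retry and no success check: if the stdcx. fails,
	 * nothing is lost, since the value written is the value read. */
	static inline void cancel_reservation(unsigned long *ptep, unsigned long old)
	{
		__asm__ __volatile__(
			"stdcx.	%1,0,%2"
			: "=m" (*ptep)
			: "r" (old), "r" (ptep)
			: "cc");
	}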
Thanks for your comments,
Ben.