CPU-local TLB flushing
sjenning at linux.vnet.ibm.com
Tue Jun 19 06:55:33 EST 2012
This is a continuation of a thread from a few months ago:
zsmalloc is now in the staging tree and there are patches
on lkml to convert the x86-only tlb flushing code to arch
Quick backstory: zsmalloc does some pte/tlb manipulation
to quickly map a pair of pages into a single VM area. It
does this with interrupts disabled in a preallocated per-cpu
VM area, which means the mapping only exists in the TLB of
the cpu that does the mapping and, therefore, only needs to
be flushed on that same cpu during the unmapping process.
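
To make the reasoning above concrete, here is a minimal user-space
sketch. It is a toy model, not kernel code: every name in it
(toy_tlb, toy_tlb_fill(), toy_local_flush_one() and so on) is
hypothetical, and the "TLB" is just a small per-cpu array. It only
shows that a cpu-local flush is enough when the translation can only
have been cached on the cpu that created the mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TOY_CPUS 4	/* hypothetical cpu count for the model */
#define TLB_SLOTS   8	/* hypothetical per-cpu TLB capacity */

/* One toy TLB per cpu: a small set of cached virtual addresses. */
struct toy_tlb {
	unsigned long vaddr[TLB_SLOTS];
	bool valid[TLB_SLOTS];
};

struct toy_tlb tlbs[NR_TOY_CPUS];

/* Model `cpu` touching `vaddr`, so that cpu now caches the translation. */
void toy_tlb_fill(int cpu, unsigned long vaddr)
{
	size_t i;

	assert(cpu >= 0 && cpu < NR_TOY_CPUS);
	for (i = 0; i < TLB_SLOTS; i++)
		if (tlbs[cpu].valid[i] && tlbs[cpu].vaddr[i] == vaddr)
			return;		/* already cached */
	for (i = 0; i < TLB_SLOTS; i++) {
		if (!tlbs[cpu].valid[i]) {
			tlbs[cpu].vaddr[i] = vaddr;
			tlbs[cpu].valid[i] = true;
			return;
		}
	}
	/* TLB full: overwrite slot 0 (arbitrary eviction policy). */
	tlbs[cpu].vaddr[0] = vaddr;
}

/* cpu-local single-entry flush: drops `vaddr` from one cpu's TLB only. */
void toy_local_flush_one(int cpu, unsigned long vaddr)
{
	size_t i;

	assert(cpu >= 0 && cpu < NR_TOY_CPUS);
	for (i = 0; i < TLB_SLOTS; i++)
		if (tlbs[cpu].valid[i] && tlbs[cpu].vaddr[i] == vaddr)
			tlbs[cpu].valid[i] = false;
}

/* True if any cpu still caches a translation for `vaddr`. */
bool toy_stale_anywhere(unsigned long vaddr)
{
	int cpu;
	size_t i;

	for (cpu = 0; cpu < NR_TOY_CPUS; cpu++)
		for (i = 0; i < TLB_SLOTS; i++)
			if (tlbs[cpu].valid[i] && tlbs[cpu].vaddr[i] == vaddr)
				return true;
	return false;
}

/*
 * Map, use and unmap on a single cpu. In the model, "interrupts
 * disabled in a per-cpu area" simply means no other cpu ever calls
 * toy_tlb_fill() for this address.
 */
void toy_map_use_unmap(int cpu, unsigned long vaddr)
{
	toy_tlb_fill(cpu, vaddr);		/* access through the mapping */
	toy_local_flush_one(cpu, vaddr);	/* local flush on unmap */
}
```

After toy_map_use_unmap(1, 0x1000), toy_stale_anywhere(0x1000) is
false. If another cpu had touched the address first, a local flush on
cpu 1 alone would leave that other cpu's entry stale, which is why the
per-cpu, interrupts-off constraint matters.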
Right now, zsmalloc uses __flush_tlb_one() on x86 to do a
cpu-local single entry tlb flush. Afaict, there is no such
call on ppc64.
The patch replaces that x86 call with a call to a new function,
local_unmap_kernel_range(), which is exactly the same as
unmap_kernel_range() in mm/vmalloc.c except that it calls
local_flush_tlb_kernel_range() instead of
flush_tlb_kernel_range().
A few archs support local_flush_tlb_kernel_range() already, and
another patch in the patchset above introduces this function
for x86, basically as a wrapper for __flush_tlb_single().
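
For reference, the shape of the two functions described above can be
sketched in user space as follows. This is not the actual patch: the
mock_* helpers are hypothetical stand-ins for flush_cache_vunmap(),
vunmap_page_range() and __flush_tlb_single(), MOCK_PAGE_SIZE is an
assumed page size, and the per-page loop in the x86 wrapper is my
assumption about what "basically a wrapper" amounts to.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_PAGE_SIZE 4096UL	/* assumed page size for the mock */
#define MAX_FLUSHES    16	/* capacity of the flush log below */

/* Log of addresses passed to the mock single-entry flush. */
unsigned long flushed[MAX_FLUSHES];
size_t nr_flushed;

/* Stand-in for x86 __flush_tlb_single(): just records the address. */
void mock_flush_tlb_single(unsigned long addr)
{
	assert(nr_flushed < MAX_FLUSHES);
	flushed[nr_flushed++] = addr;
}

/* Stand-in for flush_cache_vunmap(): nothing to do in the mock. */
void mock_flush_cache_vunmap(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
}

/* Stand-in for vunmap_page_range(): the pte clearing would go here. */
void mock_vunmap_page_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
}

/* Assumed x86 shape: one cpu-local flush per page in [start, end). */
void local_flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	unsigned long addr;

	for (addr = start; addr < end; addr += MOCK_PAGE_SIZE)
		mock_flush_tlb_single(addr);
}

/*
 * Same sequence as unmap_kernel_range(), except that the final step
 * is the cpu-local flush rather than flush_tlb_kernel_range().
 */
void local_unmap_kernel_range(unsigned long addr, unsigned long size)
{
	unsigned long end = addr + size;

	mock_flush_cache_vunmap(addr, end);
	mock_vunmap_page_range(addr, end);
	local_flush_tlb_kernel_range(addr, end);
}
```

Unmapping a two-page range, as zsmalloc would for its pair of pages,
e.g. local_unmap_kernel_range(0x10000, 2 * MOCK_PAGE_SIZE), records
exactly two single-entry flushes, one per page.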
For PPC_STD_MMU_64, it looked like all the tlb flushing
functions were just stubs, so I just added a stub for
local_flush_tlb_kernel_range(). This was stable when running
a single-threaded application bound to one cpu, but it crashes
with even two threads.
With local_flush_tlb_kernel_range() being a stub,
the new function local_unmap_kernel_range() is exactly
the same as unmap_kernel_range() since
local_flush_tlb_kernel_range() and flush_tlb_kernel_range()
are both stubs on ppc64.
My knowledge of the ppc64 hashing tlb design is close to
nothing, but it seems like this should work, albeit slowly,
since it would be a global flush rather than a cpu-local one.
I was wondering if anyone could tell me why this doesn't work,
and what needs to be done to make it work.
Thanks in advance for any help! Let me know if you need
clarification on something.
More information about the Linuxppc-dev