[PATCH 0/2] Faster MMU lookups for Book3s v3
avi at redhat.com
Thu Jul 1 22:43:54 EST 2010
On 07/01/2010 03:28 PM, Alexander Graf wrote:
>>> Wouldn't it speed up dirty bitmap flushing
>>> a lot if we'd just have a simple linked list of all sPTEs belonging to
>>> that memslot?
>> The complexity is O(pages_in_slot) + O(sptes_for_slot).
>> Usually, every page is mapped at least once, so sptes_for_slot
>> dominates. Even when it isn't so, iterating the rmap base pointers is
>> very fast since they are linear in memory, while sptes are scattered
>> around, causing cache misses.
> Why would pages be mapped often?
It's not a question of how often they are mapped (shadow: very often;
tdp: very rarely) but what percentage of pages are mapped. It's usually
> Don't you use lazy spte updates?
We do, but given enough time, the guest will touch its entire memory.
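
To make the O(pages_in_slot) + O(sptes_for_slot) cost quoted above concrete, here is a minimal sketch of a write-protect pass that walks a slot's rmap heads in order. The struct layout, the names (memslot, rmap_entry, slot_write_protect) and the SPTE_WRITABLE bit position are assumptions made for illustration, not KVM's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPTE_WRITABLE (1ull << 1)   /* hypothetical bit position for this sketch */

/* One link in a page's reverse-map chain: points at an spte mapping that page. */
struct rmap_entry {
    uint64_t *sptep;
    struct rmap_entry *next;
};

/* Hypothetical memslot: one rmap head per guest page, contiguous in memory. */
struct memslot {
    size_t npages;
    struct rmap_entry **rmap;       /* rmap[i] is NULL if page i is unmapped */
};

/* Clear the writable bit on every spte in the slot; returns how many changed. */
static size_t slot_write_protect(struct memslot *slot)
{
    size_t changed = 0;

    assert(slot != NULL && slot->rmap != NULL);

    /* Outer loop: linear scan of the head array, O(pages_in_slot). */
    for (size_t i = 0; i < slot->npages; i++) {
        /* Inner loop: follow the chain for page i, O(sptes_for_slot) in total. */
        for (struct rmap_entry *e = slot->rmap[i]; e != NULL; e = e->next) {
            if (*e->sptep & SPTE_WRITABLE) {
                *e->sptep &= ~SPTE_WRITABLE;
                changed++;
            }
        }
    }
    return changed;
}
```

The outer scan touches memory sequentially, which is the cache-friendly part; only the inner dereferences land on scattered sptes.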
>> Another consideration is that on x86, an spte occupies just 64 bits
>> (for the hardware pte); if there are multiple sptes per page (rare on
>> modern hardware), there is also extra memory for rmap chains;
>> sometimes we also allocate 64 bits for the gfn. Having an extra
>> linked list would require more memory to be allocated and maintained.
> Hrm. I was thinking of not having an rmap but only using the chain. The
> only slots that would require such a chain would be the ones with dirty
> bitmapping enabled, so no penalty for normal RAM (unless you use kemari
> or live migration of course).
You could also chain only the writable ptes.
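
As a rough sketch of that idea, assuming a per-slot singly linked list that holds only sptes which are currently writable: an spte joins the chain when it is made writable, and a dirty-log flush drains the chain. All names here (chained_spte, slot_chain, chain_set_writable, chain_write_protect_all) and the SPTE_WRITABLE bit are hypothetical. Note that the embedded next pointer is exactly the extra per-spte memory discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPTE_WRITABLE (1ull << 1)   /* hypothetical bit position for this sketch */

/* An spte plus the link used while it sits on the slot's writable chain. */
struct chained_spte {
    uint64_t spte;
    struct chained_spte *next;      /* only meaningful while the spte is writable */
};

/* Hypothetical per-slot chain head; only slots with dirty logging would need one. */
struct slot_chain {
    struct chained_spte *head;
};

/* Make an spte writable and put it on the chain, unless it is already there. */
static void chain_set_writable(struct slot_chain *c, struct chained_spte *s)
{
    assert(c != NULL && s != NULL);
    if (s->spte & SPTE_WRITABLE)
        return;                     /* already writable, hence already chained */
    s->spte |= SPTE_WRITABLE;
    s->next = c->head;
    c->head = s;
}

/* Write-protect every chained spte and empty the chain; returns the count. */
static size_t chain_write_protect_all(struct slot_chain *c)
{
    size_t n = 0;

    assert(c != NULL);
    while (c->head != NULL) {
        struct chained_spte *s = c->head;

        c->head = s->next;
        s->next = NULL;
        s->spte &= ~SPTE_WRITABLE;
        n++;
    }
    return n;
}
```

With this layout the flush cost is proportional to the number of writable sptes rather than to the slot size, at the price of one pointer per spte and list maintenance on every write fault.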
> But then again I probably do need an rmap for the mmu_notifier magic,
> right? But I'd rather prefer to have that code path be slow and the
> dirty bitmap invalidation fast than the other way around. Swapping is
> slow either way.
It's not just swapping, it's also page ageing. That needs to be fast.
Does ppc have a hardware-set referenced bit? If so, you need a fast
rmap for mmu notifiers.
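
For illustration, page ageing through the rmap might look roughly like the sketch below: given one page's rmap chain, test and clear the referenced bit in every spte that maps it. The names (rmap_entry, page_test_and_clear_young) and the SPTE_ACCESSED bit position are assumptions for this sketch; real code would also need to clear the bit atomically and flush TLBs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SPTE_ACCESSED (1ull << 5)   /* hypothetical bit position for this sketch */

/* One link in a page's reverse-map chain: points at an spte mapping that page. */
struct rmap_entry {
    uint64_t *sptep;
    struct rmap_entry *next;
};

/* Returns true if any spte mapping the page had its referenced bit set. */
static bool page_test_and_clear_young(struct rmap_entry *head)
{
    bool young = false;

    for (struct rmap_entry *e = head; e != NULL; e = e->next) {
        assert(e->sptep != NULL);
        if (*e->sptep & SPTE_ACCESSED) {
            *e->sptep &= ~SPTE_ACCESSED;   /* non-atomic: sketch only */
            young = true;
        }
    }
    return young;
}
```

The lookup starts from the page, not from the slot, which is why a per-slot chain alone would not serve this path.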
error compiling committee.c: too many arguments to function