[PATCH 0/4] powernv: kvm: numa fault improvement

Aneesh Kumar K.V aneesh.kumar at linux.vnet.ibm.com
Tue Jan 21 02:45:26 EST 2014


Liu ping fan <kernelfans at gmail.com> writes:

> On Thu, Jan 9, 2014 at 8:08 PM, Alexander Graf <agraf at suse.de> wrote:
>>
>> On 11.12.2013, at 09:47, Liu Ping Fan <kernelfans at gmail.com> wrote:
>>
>>> This series is based on Aneesh's series "[PATCH -V2 0/5] powerpc: mm: Numa faults support for ppc64".
>>>
>>> In this series I apply the same idea as in the previous thread "[PATCH 0/3] optimize for powerpc _PAGE_NUMA"
>>> (for which I am still trying to get a machine to produce numbers).
>>>
>>> For this series, though, I think I have a good justification: the well-known heavy cost of switching context
>>> between guest and host.
>>
>> This cover letter isn't really telling me anything. Please put a proper description of what you're trying to achieve, why you're trying to achieve what you're trying and convince your readers that it's a good idea to do it the way you do it.
>>
> Sorry for the unclear message. After the introduction of _PAGE_NUMA,
> kvmppc_do_h_enter() cannot fill in the hpte for the guest. Instead, it
> has to rely on the host's kvmppc_book3s_hv_page_fault() to call
> do_numa_page() to do the numa fault check. This incurs the overhead
> of exiting from rmode to vmode. My idea is that in
> kvmppc_do_h_enter() we do a quick check: if the page is correctly placed,
> there is no need to exit to vmode (i.e. saving htab, SLB switching).
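
The quick check proposed in the quoted paragraph can be pictured with a
small user-space toy model. This is only a sketch of the control flow as
described there, not kernel code: every name below (toy_pte, toy_h_enter,
preferred_node, the TOY_* results) is hypothetical, and the placement
test is reduced to a plain node comparison.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are kernel types. */
enum toy_result {
    TOY_HANDLED_IN_RMODE, /* hpte can be filled in without leaving real mode */
    TOY_PUNT_TO_VMODE     /* fall back to the host's virtual-mode fault path */
};

struct toy_pte {
    bool numa_hint; /* models _PAGE_NUMA being set on the host pte */
    int page_node;  /* NUMA node that currently holds the page */
};

/* Assumption: "correctly placed" means the page is on the preferred node. */
static bool toy_page_well_placed(const struct toy_pte *pte, int preferred_node)
{
    return pte->page_node == preferred_node;
}

/* Models the decision made in the real-mode H_ENTER path. */
static enum toy_result toy_h_enter(struct toy_pte *pte, int preferred_node)
{
    /* No pending NUMA hint: nothing special to do, stay in real mode. */
    if (!pte->numa_hint)
        return TOY_HANDLED_IN_RMODE;

    /* Hint set but page already well placed: clear the hint and go on. */
    if (toy_page_well_placed(pte, preferred_node)) {
        pte->numa_hint = false;
        return TOY_HANDLED_IN_RMODE;
    }

    /* Page is misplaced: the expensive exit to virtual mode is needed. */
    return TOY_PUNT_TO_VMODE;
}
```

In this model only the misplaced-page case pays for the rmode-to-vmode
exit; whether such a check is safe to do in real mode is exactly what is
being asked about below.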

Can you explain more? Are we looking at hcalls from the guest, with the
hypervisor handling them in real mode? If so, why would the guest issue an
hcall on a pte entry that has _PAGE_NUMA set? Or is this about the
hypervisor handling a missing hpte because the host swapped this page
out? In that case, how do we end up in h_enter? IIUC, for that case we
should get to kvmppc_hpte_hv_fault.


>
>>> If my assumption is correct, I will CC kvm at vger.kernel.org starting with the next version.
>>
>> This translates to me as "This is an RFC"?
>>
> Yes, I am not quite sure about it. I have no bare-metal machine to verify it on.
> So I hope it is at least correct in theory.
>

-aneesh


