[PATCH] KVM: PPC: BOOK3S: HV: Don't try to allocate from kernel page allocator for hash page table.

Aneesh Kumar K.V aneesh.kumar at linux.vnet.ibm.com
Wed May 7 00:20:48 EST 2014


Alexander Graf <agraf at suse.de> writes:

> On 06.05.14 09:19, Benjamin Herrenschmidt wrote:
>> On Tue, 2014-05-06 at 09:05 +0200, Alexander Graf wrote:
>>> On 06.05.14 02:06, Benjamin Herrenschmidt wrote:
>>>> On Mon, 2014-05-05 at 17:16 +0200, Alexander Graf wrote:
>>>>> Isn't this a greater problem? We should start swapping before we hit
>>>>> the point where non movable kernel allocation fails, no?
>>>> Possibly but the fact remains, this can be avoided by making sure that
>>>> if we create a CMA reserve for KVM, then it uses it rather than using
>>>> the rest of main memory for hash tables.
>>> So why were we preferring non-CMA memory before? Considering that Aneesh
>>> introduced that logic in fa61a4e3 I suppose this was just a mistake?
>> I assume so.

....
...

>>
>> Whatever remains is split between CMA and the normal page allocator.
>>
>> Without Aneesh latest patch, when creating guests, KVM starts allocating
>> it's hash tables from the latter instead of CMA (we never allocate from
>> hugetlb pool afaik, only guest pages do that, not hash tables).
>>
>> So we exhaust the page allocator and get linux into OOM conditions
>> while there's plenty of space in CMA. But the kernel cannot use CMA for
>> it's own allocations, only to back user pages, which we don't care about
>> because our guest pages are covered by our hugetlb reserve :-)
>
> Yes. Write that in the patch description and I'm happy ;).
>

How about the below:

Current KVM code first try to allocate hash page table from the normal
page allocator before falling back to the CMA reserve region. One of the
side effects of that is, we could exhaust the page allocator and get
linux into OOM conditions while we still have plenty of space in CMA. 

Fix this by trying the CMA reserve region first and then falling back
to normal page allocator if we fail to get enough memory from CMA
reserve area.

-aneesh



More information about the Linuxppc-dev mailing list