[PATCH] powerpc/mm: Ensure Huge-page memory is free before allocation

Vaibhav Jain vaibhav at linux.ibm.com
Tue Jun 18 23:27:41 AEST 2019


Michael Ellerman <mpe at ellerman.id.au> writes:

> Vaibhav Jain <vaibhav at linux.ibm.com> writes:
>> We recently discovered a bug where physical memory meant for
>> allocation of huge pages was inadvertently allocated by another
>> component during early boot.
>
> Can you give me some more detail on what that was? You're seemingly the
> only person who's ever hit this :)

The specific bug I investigated was in fadump, which was trying to
reserve a large chunk of physically contiguous memory from memblock and
inadvertently stomped on a memory region that was meant for allocation
of 16G hugepages. This happened because the fadump reservation is done
much earlier, in prom_init, than the hugepage reservation.
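
Roughly, the sequence looks like this (an illustrative sketch, not the
actual call-sites; memblock_reserve() quietly merges overlapping ranges
and returns 0 either way):

	/* early boot: fadump reserves its dump area (illustrative) */
	memblock_reserve(fadump_base, fadump_size);

	/* later: 16G hugepage reservation over an overlapping range */
	memblock_reserve(phys_addr, block_size * expected_pages);
	/* also returns 0, so neither caller learns that the
	 * region is double-booked */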

The bug manifested as a panic when running 'cat /proc/pagetypeinfo',
which dumps the buddy allocator stats for each zone/migrate-type.

Incidentally, after reserving this memory fadump would carve out a CMA
region from it, which was then handed to the buddy allocator. This
would cause pagetypeinfo_showfree_print() to crash when it tried to
iterate over the free list of this CMA migrate-type, as the
corresponding memmap for these pages was never initialized.
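
For reference, the walk that blows up is roughly this (simplified from
pagetypeinfo_showfree_print() in mm/vmstat.c):

	area = &zone->free_area[order];
	list_for_each(curr, &area->free_list[mtype])
		freecount++;	/* walks struct pages whose memmap
				 * was never initialized -> panic */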

>
>> The behavior of memblock_reserve(), where it won't indicate whether
>> an existing reserved block overlaps with the requested reservation,
>> only makes such bugs harder to investigate.
>>
>> Hence this patch proposes adding a memblock reservation check in
>> htab_dt_scan_hugepage_blocks() just before the call to
>> memblock_reserve(), to ensure that the physical memory that's being
>> reserved is not already reserved by someone else. In case it is, we
>> panic the kernel to ensure that the user of this huge page doesn't
>> accidentally stomp on memory allocated to someone else.
>
> Do we really need to panic? Can't we just leave the block alone and not
> register it as huge page memory? With a big warning obviously.

Possibly yes, but Aneesh pointed out that this memory is supposed to be
backed only by a 16G page mapping, due to a limitation in phyp on the
hash page table size.
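
For completeness, the warn-and-skip variant Michael suggests would look
roughly like this (untested sketch, placed just before the
memblock_reserve() call):

	if (memblock_is_region_reserved(phys_addr,
					block_size * expected_pages)) {
		pr_warn("Huge page(16GB) memory at 0x%lX already reserved, skipping\n",
			phys_addr);
		return 0;	/* keep scanning other memory nodes */
	}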

>
> cheers
>
>> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
>> index 28ced26f2a00..a05be3adb8c9 100644
>> --- a/arch/powerpc/mm/book3s64/hash_utils.c
>> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
>> @@ -516,6 +516,11 @@ static int __init htab_dt_scan_hugepage_blocks(unsigned long node,
>>  	printk(KERN_INFO "Huge page(16GB) memory: "
>>  			"addr = 0x%lX size = 0x%lX pages = %d\n",
>>  			phys_addr, block_size, expected_pages);
>> +
>> +	/* Ensure no one else has reserved memory for huge pages before */
>> +	BUG_ON(memblock_is_region_reserved(phys_addr,
>> +					   block_size * expected_pages));
>> +
>>  	if (phys_addr + block_size * expected_pages <= memblock_end_of_DRAM()) {
>>  		memblock_reserve(phys_addr, block_size * expected_pages);
>>  		pseries_add_gpage(phys_addr, block_size, expected_pages);
>> -- 
>> 2.21.0
>

-- 
Vaibhav Jain <vaibhav at linux.ibm.com>
Linux Technology Center, IBM India Pvt. Ltd.


