[PATCH] powerpc/mm/hugetlb: Add support for reserving gigantic huge pages via kernel command line

Anshuman Khandual khandual at linux.vnet.ibm.com
Thu May 18 23:04:21 AEST 2017


On 05/17/2017 12:29 PM, Aneesh Kumar K.V wrote:
> 
> 
> On Wednesday 17 May 2017 10:31 AM, Anshuman Khandual wrote:
>> On 05/16/2017 02:54 PM, Aneesh Kumar K.V wrote:
>>> +void __init reserve_hugetlb_gpages(void)
>>> +{
>>> +    char buf[10];
>>> +    phys_addr_t base;
>>> +    unsigned long gpage_size = 1UL << 34;
>>> +    static __initdata char cmdline[COMMAND_LINE_SIZE];
>>> +
>>> +    if (radix_enabled())
>>> +        gpage_size = 1UL << 30;
>>> +
>>> +    strlcpy(cmdline, boot_command_line, COMMAND_LINE_SIZE);
>>> +    parse_args("hugetlb gpages", cmdline, NULL, 0, 0, 0,
>>> +           NULL, &do_gpage_early_setup);
>>> +
>>> +    if (!gpage_npages)
>>> +        return;
>>> +
>>> +    string_get_size(gpage_size, 1, STRING_UNITS_2, buf, sizeof(buf));
>>> +    pr_info("Trying to reserve %ld %s pages\n", gpage_npages, buf);
>>> +
>>> +    /* Allocate one page at a time */
>>> +    while(gpage_npages) {
>>> +        base = memblock_alloc_base(gpage_size, gpage_size,
>>> +                       MEMBLOCK_ALLOC_ANYWHERE);
>>> +        add_gpage(base, gpage_size, 1);
>>
>> For 16GB pages (1UL << 34) on POWER8, we already do these functions
>> inside htab_dt_scan_hugepage_blocks(). IIUC this happens just by
>> scanning DT without even specifying any gpages in kernel command
>> line.
>>
>> memblock_reserve()
>> add_gpage()
>>
>> Then won't attempting to allocate from memblock and adding it again
>> into the gigantic pages list collide?
> 
> That is for pseries, i.e., pSeries will get the hugepages reserved by phyp
> and the details of those pages are passed via device tree. Not sure what
> is the conflict here. If we use the above kernel parameter, we will try
> to allocate another 'x' number of hugepages.
> 
>> Moreover, it's trying to allocate
>> across the RAM not specifically on the gpages mentioned in device
>> tree by the platform. Are we trying to support 16GB pages just from
>> any memory without platform notification through DT ?
>>
> 
> There are two ways to specify gpages: one via the device tree, which is used
> only in the case of pseries, and the other via hugepagesz=size hugepages=no-of-hugepages.

New way (Added with this patch)
-------------------------------
setup_arch()
	reserve_hugetlb_gpages() (Now defined for PPC64 BOOK3S)

reserve_hugetlb_gpages() parses the kernel command line for HugeTLB
gigantic page parameters and allocates 1GB (radix) / 16GB (hash) pages
from memblock during boot with memblock_alloc_base(). It then calls
add_gpage() for each page, which populates gpage_freearray[], an array
that remains local to the powerpc arch code.
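
To make the command line step concrete, here is a minimal userspace
sketch of how a "hugepagesz=<size> hugepages=<n>" pair could be matched
against the gigantic page size. The quoted hunk does not show
do_gpage_early_setup(), so the matching rule below is an assumption, and
parse_size() and requested_gpages() are hypothetical names that do not
exist in the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the kernel's memparse(): parses
 * "16G", "1G", "2M" style sizes. Returns 0 on an unknown suffix.
 */
static unsigned long long parse_size(const char *s)
{
	char *end;
	unsigned long long v = strtoull(s, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		v <<= 10; /* fall through */
	case 'M': case 'm':
		v <<= 10; /* fall through */
	case 'K': case 'k':
		v <<= 10;
		break;
	case '\0':
		break;
	default:
		return 0;
	}
	return v;
}

/*
 * Assumed rule: a "hugepages=" count applies to the gigantic page size
 * only if the most recent "hugepagesz=" equals gpage_size.
 */
static unsigned long requested_gpages(const char *cmdline,
				      unsigned long long gpage_size)
{
	char buf[256];
	unsigned long long cur_size = 0;
	unsigned long npages = 0;
	char *tok;

	/* Work on a copy, as the patch does with boot_command_line. */
	snprintf(buf, sizeof(buf), "%s", cmdline);

	for (tok = strtok(buf, " "); tok; tok = strtok(NULL, " ")) {
		if (strncmp(tok, "hugepagesz=", 11) == 0)
			cur_size = parse_size(tok + 11);
		else if (strncmp(tok, "hugepages=", 10) == 0 &&
			 cur_size == gpage_size)
			npages = strtoul(tok + 10, NULL, 10);
	}
	return npages;
}
```

With this rule, "hugepagesz=16G hugepages=2" yields 2 for a hash
gpage_size of 1UL << 34, while "hugepagesz=2M hugepages=512" yields 0
because the size does not match.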

Existing DT (pseries on PHYP)
-----------------------------
early_setup()
	early_init_devtree()
		mmu_early_init_devtree()
			hash__early_init_devtree()
				htab_scan_page_sizes()
					htab_dt_scan_hugepage_blocks()

htab_dt_scan_hugepage_blocks() scans the device tree and adds each
PHYP-reserved 16GB huge page to gpage_freearray[] through add_gpage().

The same kernel command line parameters then create the hstate structure
for the gigantic pages in generic HugeTLB, which in turn calls
alloc_bootmem_huge_page() to transfer the gpage details stored in
gpage_freearray[] to the generic huge_boot_pages list. I hope my
understanding here is correct; please correct me otherwise.
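
The hand-off described above can be sketched as a small userspace
model. This is a simplification under stated assumptions, not the kernel
code: every model_* name is hypothetical, plain arrays stand in for
gpage_freearray[] and the huge_boot_pages list, and the hstate argument
is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_GPAGES 16

/* Arch-local stash, standing in for gpage_freearray[]. */
static uint64_t model_freearray[MODEL_MAX_GPAGES];
static unsigned int model_nr_gpages;

/* Generic-side list, standing in for huge_boot_pages. */
static uint64_t model_boot_pages[MODEL_MAX_GPAGES];
static unsigned int model_nr_boot_pages;

/* Stash count pages of page_size bytes starting at addr (used by both paths). */
static void model_add_gpage(uint64_t addr, uint64_t page_size,
			    unsigned long count)
{
	if (!addr)
		return;
	while (count-- && model_nr_gpages < MODEL_MAX_GPAGES) {
		model_freearray[model_nr_gpages++] = addr;
		addr += page_size;
	}
}

/* Move one stashed page to the generic list; returns 1 on success, 0 if none left. */
static int model_alloc_bootmem_huge_page(void)
{
	if (model_nr_gpages == 0 || model_nr_boot_pages == MODEL_MAX_GPAGES)
		return 0;
	model_boot_pages[model_nr_boot_pages++] =
		model_freearray[--model_nr_gpages];
	model_freearray[model_nr_gpages] = 0;
	return 1;
}
```

In this model the stash does not record which path added a page, so
pages from the DT scan and from the command line allocation are handed
to the generic side in the same way.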

DT-scanned gpages are first reserved with memblock_reserve(), so they
won't be handed out again by memblock_alloc_base() in the other
method. Hence there is no collision in add_gpage() on a system using
both methods simultaneously. However, I don't see anything preventing
reserve_hugetlb_gpages() from being called on pseries systems, in which
case it may allocate more gigantic pages than required if some are
already available through the DT path. Will look into this further.
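
As a rough illustration of the reasoning above, the sketch below models
RAM as a handful of gigantic-page-sized slots. All model_* names are
hypothetical and the real memblock allocator is far more general. The
last helper shows one possible policy for avoiding over-allocation (the
command line count treated as a total); that is an assumption for
discussion, not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_SLOTS 8	/* RAM modelled as 8 gigantic-page-sized slots */

static bool model_reserved[MODEL_SLOTS];

/* Stand-in for memblock_reserve(): mark a slot as taken (DT path). */
static void model_reserve(int slot)
{
	assert(slot >= 0 && slot < MODEL_SLOTS);
	model_reserved[slot] = true;
}

/*
 * Stand-in for memblock_alloc_base(): take the highest free slot.
 * Returning -1 when nothing is left is model-only behaviour.
 */
static int model_alloc(void)
{
	for (int slot = MODEL_SLOTS - 1; slot >= 0; slot--) {
		if (!model_reserved[slot]) {
			model_reserved[slot] = true;
			return slot;
		}
	}
	return -1;
}

/* Hypothetical guard: allocate only what the DT path has not already supplied. */
static unsigned long model_pages_still_needed(unsigned long requested,
					      unsigned long from_dt)
{
	return requested > from_dt ? requested - from_dt : 0;
}
```

If slots 7 and 6 are reserved by the DT path first, the allocator only
ever returns slots below them, so the same page cannot reach add_gpage()
twice.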


