[PATCH RFC] powerpc/pseries: exploit H_PAGE_SET_UNUSED for partition migration

Nathan Lynch nathanl at linux.ibm.com
Wed Feb 21 03:20:14 AEDT 2024


Michael Ellerman <mpe at ellerman.id.au> writes:
> Nathan Lynch via B4 Relay <devnull+nathanl.linux.ibm.com at kernel.org>
> writes:
>> From: Nathan Lynch <nathanl at linux.ibm.com>
>>
>> Although the H_PAGE_INIT hcall's H_PAGE_SET_UNUSED historically has
>> been tied to the cooperative memory overcommit (CMO) platform feature,
>> the flag also is treated by the PowerVM hypervisor as a hint that the
>> page contents need not be copied to the destination during a live
>> partition migration.
>>
>> Use the "ibm,migratable-partition" root node property to determine
>> whether this partition/guest can be migrated. Mark freed pages unused
>> if so (or if CMO is in use, as before).
>>
>> Signed-off-by: Nathan Lynch <nathanl at linux.ibm.com>
>> ---
>> Several things yet to improve here:
>>
>> * powerpc's arch_free_page()/HAVE_ARCH_FREE_PAGE should be decoupled
>>   from CONFIG_PPC_SMLPAR.
>>
>> * powerpc's arch_free_page() could be made to use a static key if
>>   justified.
>>
>> * I have not yet measured the overhead this introduces, nor have I
>>   measured the benefit to a live migration.
>>
>> To date, I have smoke tested it by doing a live migration and
>> performing a build on a kernel with the change, to ensure it doesn't
>> introduce obvious memory corruption or anything. It hasn't blown up
>> yet :-)
>>
>> This will be a possibly significant behavior change in that we will be
>> flagging pages unused where we typically did not before. Until now,
>> having CMO enabled was the only way to do this, and I don't think that
>> feature is used all that much?
>
> Yeah AFAIK it has to be explicitly configured and enabled via the HMC,
> so doesn't get much testing or usage.
>
>> Posting this as RFC to see if there are any major concerns.
>
> My worry is that this will add overhead for everyone in normal usage, an
> hcall per freed set of pages, whereas the benefit is only seen when a
> migration happens.
>
> But that does depend on how often arch_free_page() gets called in normal
> usage, which I don't know offhand.

Yes, and as I said in my followup yesterday:

>> for this to be safe, powerpc/pseries needs to implement
>> arch_alloc_page() to undo setting the "unused" flag.

So, perhaps more significantly, we'd also incur an hcall per
arch_alloc_page() with the most straightforward implementation that
doesn't eat data (unlike this version!).

Nevertheless I'll plan on doing that for the next iteration to see if I
can measure the overhead and benefit, with the expectation that we'll
ultimately need a more sophisticated design.
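
To make the hazard and the cost concrete, here is a small userspace model
of the free/alloc/migrate sequence discussed above. It is only a sketch
under assumptions: struct partition and the model_*() helpers are
hypothetical stand-ins, not the real plpar hcall wrappers or the kernel's
arch_free_page()/arch_alloc_page(). It also assumes a page skipped during
migration arrives zeroed, and that clearing the hint costs one hcall;
neither is confirmed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NR_PAGES   4
#define PAGE_BYTES 16

/* Hypothetical model of a partition's memory as the hypervisor sees it. */
struct partition {
	unsigned char mem[NR_PAGES][PAGE_BYTES];
	bool unused[NR_PAGES];	/* modeled H_PAGE_SET_UNUSED hint */
	unsigned long hcalls;	/* number of modeled hcalls issued */
};

/* Stand-in for an hcall flagging a page unused (not a real interface). */
static void model_set_unused(struct partition *p, size_t pfn)
{
	p->unused[pfn] = true;
	p->hcalls++;
}

/* Stand-in for whatever hcall would undo the hint (assumed to exist). */
static void model_clear_unused(struct partition *p, size_t pfn)
{
	p->unused[pfn] = false;
	p->hcalls++;
}

/* Models arch_free_page(): hint that the contents need not be copied. */
static void model_free_page(struct partition *p, size_t pfn)
{
	model_set_unused(p, pfn);
}

/* Models arch_alloc_page(); undo_hint == false is the RFC as posted. */
static void model_alloc_page(struct partition *p, size_t pfn, bool undo_hint)
{
	if (undo_hint)
		model_clear_unused(p, pfn);
}

/* Models migration: pages still flagged unused are not copied. */
static void model_migrate(const struct partition *src, struct partition *dst)
{
	memset(dst, 0, sizeof(*dst));
	for (size_t pfn = 0; pfn < NR_PAGES; pfn++) {
		if (src->unused[pfn])
			continue;	/* assumed to arrive zeroed */
		memcpy(dst->mem[pfn], src->mem[pfn], PAGE_BYTES);
	}
}

/*
 * Free a page, reallocate it, write to it, then migrate. Returns true
 * if the written data reaches the destination; *hcalls receives the
 * number of modeled hcalls the sequence cost.
 */
static bool reuse_survives_migration(bool undo_hint, unsigned long *hcalls)
{
	struct partition src, dst;

	memset(&src, 0, sizeof(src));
	model_free_page(&src, 1);
	model_alloc_page(&src, 1, undo_hint);
	memset(src.mem[1], 0xAB, PAGE_BYTES);	/* new owner writes data */
	model_migrate(&src, &dst);

	*hcalls = src.hcalls;
	return memcmp(src.mem[1], dst.mem[1], PAGE_BYTES) == 0;
}
```

In this model, reuse_survives_migration(false, &n) loses the data at a
cost of one hcall, while reuse_survives_migration(true, &n) preserves it
at a cost of two: one per free plus one per alloc, which is the overhead
that would need measuring.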

