[PATCH RFC] powerpc/pseries: exploit H_PAGE_SET_UNUSED for partition migration

Nathan Lynch nathanl at linux.ibm.com
Tue Feb 20 09:25:23 AEDT 2024


>
> Although the H_PAGE_INIT hcall's H_PAGE_SET_UNUSED historically has
> been tied to the cooperative memory overcommit (CMO) platform feature,
> the flag also is treated by the PowerVM hypervisor as a hint that the
> page contents need not be copied to the destination during a live
> partition migration.
>
> Use the "ibm,migratable-partition" root node property to determine
> whether this partition/guest can be migrated. Mark freed pages unused
> if so (or if CMO is in use, as before).
>
> Signed-off-by: Nathan Lynch <nathanl at linux.ibm.com>
> ---
> Several things yet to improve here:
>
> * powerpc's arch_free_page()/HAVE_ARCH_FREE_PAGE should be decoupled
>   from CONFIG_PPC_SMLPAR.
>
> * powerpc's arch_free_page() could be made to use a static key if
>   justified.
>
> * I have not yet measured the overhead this introduces, nor have I
>   measured the benefit to a live migration.
>
> To date, I have smoke tested it by doing a live migration and
> performing a build on a kernel with the change, to ensure it doesn't
> introduce obvious memory corruption or anything.

An update on this. I've been working on quantifying the benefit, and
I've now seen memory corruption after partition migrations, with several
programs getting mysterious SIGSEGV/SIGILL:

mobility: calling ibm,suspend-me on CPU 7
mobility: CPU 7 waking all threads
mobility: waiting for memory transfer to complete...
touch[10988]: segfault (11) at 8 nip 7fff8fc2f9fc lr 7fff8fc26684 code 1 in ld-2.28.so[7fff8fc20000+30000]
touch[10987]: segfault (11) at 8 nip 7fff86f5f9fc lr 7fff86f56684 code 1 in ld-2.28.so[7fff86f50000+30000]
touch[10988]: code: 3d22ffff 81297470 71290020 4082254c e93d00f0 2fa90000 f93f00b0 409e2458 
touch[10988]: code: e93d00f8 e95d0068 eb1d0000 2fa90000 <ea2a0008> 419e00f8 e93d0158 e91d0058 
touch[10987]: code: 3d22ffff 81297470 71290020 4082254c e93d00f0 2fa90000 f93f00b0 409e2458 
touch[10987]: code: e93d00f8 e95d0068 eb1d0000 2fa90000 <ea2a0008> 419e00f8 e93d0158 e91d0058 

Maybe I had some debug options enabled that masked this before, or just
got lucky.

In any case, it seems obvious now that for this to be safe,
powerpc/pseries needs to implement arch_alloc_page() to undo setting the
"unused" flag.


>
> This will be a possibly significant behavior change in that we will be
> flagging pages unused where we typically did not before. Until now,
> having CMO enabled was the only way to do this, and I don't think that
> feature is used all that much?
>
> Posting this as RFC to see if there are any major concerns.
> ---
>  arch/powerpc/platforms/pseries/lpar.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
> index 4561667832ed..b264371d8e12 100644
> --- a/arch/powerpc/platforms/pseries/lpar.c
> +++ b/arch/powerpc/platforms/pseries/lpar.c
> @@ -16,6 +16,7 @@
>  #include <linux/export.h>
>  #include <linux/jump_label.h>
>  #include <linux/delay.h>
> +#include <linux/of.h>
>  #include <linux/stop_machine.h>
>  #include <linux/spinlock.h>
>  #include <linux/cpuhotplug.h>
> @@ -1772,17 +1773,25 @@ static void pSeries_set_page_state(struct page *page, int order,
>  	}
>  }
>  
> +static bool migratable_partition;
> +
>  void arch_free_page(struct page *page, int order)
>  {
> -	if (radix_enabled())
> -		return;
> -	if (!cmo_free_hint_flag || !firmware_has_feature(FW_FEATURE_CMO))
> -		return;
> -
> -	pSeries_set_page_state(page, order, H_PAGE_SET_UNUSED);
> +	if (migratable_partition ||
> +	    (firmware_has_feature(FW_FEATURE_CMO) && cmo_free_hint_flag))
> +		pSeries_set_page_state(page, order, H_PAGE_SET_UNUSED);
>  }
>  EXPORT_SYMBOL(arch_free_page);
>  
> +static int __init check_migratable_partition(void)
> +{
> +	struct device_node *root = of_find_node_by_path("/");
> +	migratable_partition = !!of_find_property(root, "ibm,migratable-partition", NULL);
> +	of_node_put(root);
> +	return 0;
> +}
> +machine_device_initcall(pseries, check_migratable_partition);
> +
>  #endif /* CONFIG_PPC_SMLPAR */
>  #endif /* CONFIG_PPC_BOOK3S_64 */
>  
>


More information about the Linuxppc-dev mailing list