[FIX PATCH v0] powerpc: Fix memory unplug failure on radix guest

Bharata B Rao bharata at linux.vnet.ibm.com
Fri Sep 1 16:53:13 AEST 2017


On Thu, Aug 10, 2017 at 02:53:48PM +0530, Bharata B Rao wrote:
> For a PowerKVM guest, it is possible to specify a DIMM device in
> addition to the system RAM at boot time. When such a cold plugged DIMM
> device is removed from a radix guest, we hit the following warning in the
> guest kernel resulting in the eventual failure of memory unplug:
> 
> remove_pud_table: unaligned range
> WARNING: CPU: 3 PID: 164 at arch/powerpc/mm/pgtable-radix.c:597 remove_pagetable+0x468/0xca0
> Call Trace:
> remove_pagetable+0x464/0xca0 (unreliable)
> radix__remove_section_mapping+0x24/0x40
> remove_section_mapping+0x28/0x60
> arch_remove_memory+0xcc/0x120
> remove_memory+0x1ac/0x270
> dlpar_remove_lmb+0x1ac/0x210
> dlpar_memory+0xbc4/0xeb0
> pseries_hp_work_fn+0x1a4/0x230
> process_one_work+0x1cc/0x660
> worker_thread+0xac/0x6d0
> kthread+0x16c/0x1b0
> ret_from_kernel_thread+0x5c/0x74
> 
> The DIMM memory that is cold plugged gets merged to the same memblock
> region as RAM and hence gets mapped at 1G alignment. However since the
> removal is done for one LMB (lmb size 256MB) at a time, the address
> of the LMB (which is 256MB aligned) would get flagged as unaligned
> in remove_pud_table() resulting in the above failure.
> 
> This problem is not seen for hot plugged memory because for the
> hot plugged memory, the mappings are created separately for each
> LMB and hence they all get aligned at 256MB.
> 
> To fix this problem for the cold plugged memory, let us mark the
> cold plugged memblock region explicitly as HOTPLUGGED so that the
> region doesn't get merged with RAM. All the memory that is discovered
> via ibm,dynamic-reconfiguration-memory is marked so (1). Next, identify
> such regions in radix_init_pgtable() and create separate mappings
> within that region for each LMB so that they don't get aligned
> at 1G like the RAM region (2).
> 
> (1) For PowerKVM guests, all boot time memory is represented via
> memory@XXXX nodes and hot plugged/pluggable memory is represented via
> the ibm,dynamic-reconfiguration-memory node. We are marking all
> hotplugged memory that is in ASSIGNED state during boot as HOTPLUGGED.
> With this, only cold plugged memory gets marked for PowerKVM, but we
> need to check how this will affect PowerVM guests.
> 
> (2) To create separate mappings for every LMB in the hot plugged
> region, we need lmb-size. I am currently using memory_block_size_bytes()
> API to get the lmb-size. Since this is early init time code, the
> machine type isn't probed yet and hence memory_block_size_bytes()
> returns the default size of 16MB. Hence we end up creating
> separate mappings at a much finer granularity than the 256MB LMB size
> that the pseries machine would ideally use.
> 
> Signed-off-by: Bharata B Rao <bharata at linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/prom.c      |  1 +
>  arch/powerpc/mm/pgtable-radix.c | 17 ++++++++++++++---
>  2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index f830562..24ecf53 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -524,6 +524,7 @@ static int __init early_init_dt_scan_drconf_memory(unsigned long node)
>  					size = 0x80000000ul - base;
>  			}
>  			memblock_add(base, size);
> +			memblock_mark_hotplug(base, size);

One of the suggestions was to make the above conditional on radix so
that PowerVM doesn't get affected by this. However, the
early_radix_enabled() check isn't usable at this point, since
MMU_FTR_TYPE_RADIX gets set only a bit later in early_init_devtree().

Regards,
Bharata.


