[v3 2/7] powerpc/mce: Bug fixes for MCE handling in kernel space

Nicholas Piggin npiggin at gmail.com
Sat Jul 6 19:44:16 AEST 2019


Santosh Sivaraj's on July 6, 2019 7:26 am:
> From: Balbir Singh <bsingharora at gmail.com>
> 
> The code currently assumes PAGE_SHIFT as the shift value of
> the pfn,

This comment doesn't really make sense on its own. Linux pfns
are always units of page shift, so if it's not that then it's
not a pfn.

I think you want the information from the last paragraph up the
here because that explains exactly what the problem is.

> this works correctly (mostly) for user space pages,
> but the correct thing to do is
> 
> 1. Extract the shift value returned via the pte-walk API's
> 2. Use the shift value to access the instruction address.
> 
> Note, the final physical address still use PAGE_SHIFT for
> computation. handle_ierror() is not modified and handle_derror()
> is modified just for extracting the correct instruction
> address.
> 
> This is largely due to __find_linux_pte() returning pfn's
> shifted by pdshift. The code is much more generic and can
> handle shift values returned.
> 
> Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors")
> 
> Signed-off-by: Balbir Singh <bsingharora at gmail.com>
> [arbab at linux.ibm.com: Fixup pseries_do_memory_failure()]
> Signed-off-by: Reza Arbab <arbab at linux.ibm.com>
> Signed-off-by: Santosh Sivaraj <santosh at fossix.org>
> ---
>  arch/powerpc/include/asm/mce.h       |  3 ++-
>  arch/powerpc/kernel/mce_power.c      | 26 ++++++++++++++++----------
>  arch/powerpc/platforms/pseries/ras.c |  6 ++++--
>  3 files changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
> index a4c6a74ad2fb..94888a7025b3 100644
> --- a/arch/powerpc/include/asm/mce.h
> +++ b/arch/powerpc/include/asm/mce.h
> @@ -209,7 +209,8 @@ extern void release_mce_event(void);
>  extern void machine_check_queue_event(void);
>  extern void machine_check_print_event_info(struct machine_check_event *evt,
>  					   bool user_mode, bool in_guest);
> -unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
> +unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr,
> +			  unsigned int *shift);
>  #ifdef CONFIG_PPC_BOOK3S_64
>  void flush_and_reload_slb(void);
>  #endif /* CONFIG_PPC_BOOK3S_64 */
> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> index e39536aad30d..04666c0b40a8 100644
> --- a/arch/powerpc/kernel/mce_power.c
> +++ b/arch/powerpc/kernel/mce_power.c
> @@ -23,7 +23,8 @@
>   * Convert an address related to an mm to a PFN. NOTE: we are in real
>   * mode, we could potentially race with page table updates.
>   */
> -unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
> +unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr,
> +			  unsigned int *shift)
>  {
>  	pte_t *ptep;
>  	unsigned long flags;

addr_to_pfn is too generic a name for what it does I think. I would
call it ppc_addr_to_pfn maybe. But it has other weirdness too, it's
assuming the kernel can't fault on user addresses AFAIKS.

> @@ -36,13 +37,15 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
>  
>  	local_irq_save(flags);
>  	if (mm == current->mm)
> -		ptep = find_current_mm_pte(mm->pgd, addr, NULL, NULL);
> +		ptep = find_current_mm_pte(mm->pgd, addr, NULL, shift);
>  	else
> -		ptep = find_init_mm_pte(addr, NULL);
> +		ptep = find_init_mm_pte(addr, shift);
>  	local_irq_restore(flags);
>  	if (!ptep || pte_special(*ptep))
>  		return ULONG_MAX;
> -	return pte_pfn(*ptep);
> +	if (!*shift)
> +		*shift = PAGE_SHIFT;
> +	return (pte_val(*ptep) & PTE_RPN_MASK) >> *shift;
>  }
>  
>  /* flush SLBs and reload */

AFAIKS moving this weirdness into the callers is pointles, fix it up
into a proper Linux pfn in addr_to_pfn.

The duplication and divergence of machine check code between powernv
and pseries is bad, I'm looking through it and from start to end it's
trying to do basically the same thing in slightly different ways, and
we're still got a lot of bugs. I wonder if we shouldn't try to fix it
all and merge them to a single implementation...

Actually your code does not touch the core of it too significantly so
it looks like we don't have to delay your stuff in order to do that.

This is a good fix though, it should go as a fix at the top of the
series.

Thanks,
Nick


More information about the Linuxppc-dev mailing list