[PATCH 1/2] powerpc/strict_rwx: quirks for STRICT_RWX patches

Mon Oct 16 12:01:02 AEDT 2017

On Wed, 04 Oct 2017 20:04:52 +1100
Michael Ellerman <mpe at ellerman.id.au> wrote:

> Hi Balbir,
> 
> Mainly I think we need to add a check in mark_initmem_nx() too don't we?
> Or should we put it somewhere common to both?
> 
> But seeing as I'm replying here are some more comments.
> 
> > Subject: [PATCH 1/2] powerpc/strict_rwx: quirks for STRICT_RWX patches  
> 
> This misses three key things, 1) it's a bug fix 2) it's for Power9 DD1,
> and 3) it only applies to radix.
> 
> So how about:
> 
> powerpc/64s: Fix crashes on Power9 DD1 with radix MMU and STRICT_RWX
> 
> 
> Balbir Singh <bsingharora at gmail.com> writes:
> > This patch disables STRICT_RWX for power9 DD1 machines
> > where due to some limitations with the way we do tlb
> > updates, we clear the TLB entry of the text that's doing
> > the update to kernel text and that does lead to a
> > crash.  
> 
> This mostly describes the patch, but I would prefer more detail, eg:
> 
>   When using the radix MMU on Power9 DD1, to work around a hardware
>   problem, radix__pte_update() is required to do a two stage update of
>   the PTE. First we write a zero value into the PTE, then we flush the
>   TLB, and then we write the new PTE value.
>   
>   In the normal case that works OK, but it does not work if we're
>   updating the PTE that maps the code we're executing, because the
>   mapping is removed by the TLB flush and we can no longer execute from
>   it. Unfortunately the STRICT_RWX code needs to do exactly that.
>   
>   The exact symptoms when we hit this case vary, sometimes we print an
>   oops and then get stuck after that, but I've also seen a machine just
>   get stuck continually page faulting with no oops printed. The variance
>   is presumably due to the exact layout of the text and the page size
>   used for the mappings. In all cases we are unable to boot to a shell.
>   
>   There are possible solutions such as creating a second mapping of the
>   TLB flush code, executing from that, and then jumping back to the
>   original. However we don't want to add that level of complexity for a
>   DD1 work around.
>   
>   So just detect that we're running on Power9 DD1 and refrain from
>   changing the permissions, effectively disabling STRICT_RWX on Power9
>   DD1.
>

I can copy this verbatim

> 
> > diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
> > index 39c252b..c2a2b46 100644
> > --- a/arch/powerpc/mm/pgtable-radix.c
> > +++ b/arch/powerpc/mm/pgtable-radix.c
> > @@ -169,6 +169,18 @@ void radix__mark_rodata_ro(void)
> >  {
> >  	unsigned long start, end;
> >  
> > +	/*
> > +	 * mark_rodata_ro() will mark itself as !writable at some point  
>                                                                        ^
>                                                                        .
> > +	 * due to workaround in radix__pte_update(), we'll end up with  
>            ^     ^
>            D     the DD1
> > +	 * an invalid pte and the system will crash quite severly.  
> 
> > +	 * The alternatives are quite cumbersome and leaving out
> > +	 * the page containing the flush code is not very nice.  
> 
> Just drop the last sentence, it doesn't have enough detail to be useful
> and we chose not to implement those alternatives, so better not to
> confuse the reader by talking about them.
> 

Can and will do

> > +	 */
> > +	if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
> > +		pr_warn("Warning: Unable to mark rodata read only on P9 DD1\n");
> > +		return;
> > +	}  
> 
> Don't we also need to do the check in radix__mark_initmem_nx() ?

No.. not for NX, since we mark NX pages (.data and .init.text..init.end)
from .text. We don't change permissions for anything in .text in that
routine.

> 
> cheers