[PATCH] powerpc/64s: Fix compiler store ordering to SLB shadow area

Nicholas Piggin npiggin at gmail.com
Fri Jun 1 08:52:27 AEST 2018


On Fri, 01 Jun 2018 00:22:21 +1000
Michael Ellerman <mpe at ellerman.id.au> wrote:

> Nicholas Piggin <npiggin at gmail.com> writes:
> 
> > The stores to update the SLB shadow area must be made as they appear
> > in the C code, so that the hypervisor does not see an entry with
> > mismatched vsid and esid. Use WRITE_ONCE for this.
> >
> > GCC has been observed to elide the first store to esid in the update,
> > which means that if the hypervisor interrupts the guest after storing
> > to vsid, it could see an entry with old esid and new vsid, which may
> > result in memory corruption.
> >
> > Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
> > ---
> >  arch/powerpc/mm/slb.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> > index 66577cc66dc9..2f4b33b24b3b 100644
> > --- a/arch/powerpc/mm/slb.c
> > +++ b/arch/powerpc/mm/slb.c
> > @@ -63,14 +63,14 @@ static inline void slb_shadow_update(unsigned long ea, int ssize,
> >  	 * updating it.  No write barriers are needed here, provided
> >  	 * we only update the current CPU's SLB shadow buffer.
> >  	 */
> > -	p->save_area[index].esid = 0;
> > -	p->save_area[index].vsid = cpu_to_be64(mk_vsid_data(ea, ssize, flags));
> > -	p->save_area[index].esid = cpu_to_be64(mk_esid_data(ea, ssize, index));
> > +	WRITE_ONCE(p->save_area[index].esid, 0);
> > +	WRITE_ONCE(p->save_area[index].vsid, cpu_to_be64(mk_vsid_data(ea, ssize, flags)));
> > +	WRITE_ONCE(p->save_area[index].esid, cpu_to_be64(mk_esid_data(ea, ssize, index)));  
> 
> What's the code-gen for that look like? I suspect it's terrible?

Yeah, it's not great.

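The codegen problem, roughly (an illustrative sketch with made-up helper
names, assuming the usual kernel definitions of __be64, cpu_to_be64()
and WRITE_ONCE()): GCC doesn't currently fold the cpu_to_be64() byteswap
into a byte-reversed store when the access is volatile, so the swap is
done in registers and followed by a plain std.

/* Illustrative only -- these helpers don't exist in the kernel. */
static inline void shadow_store_plain(__be64 *slot, u64 val)
{
        *slot = cpu_to_be64(val);               /* swap+store can fuse into stdbrx */
}

static inline void shadow_store_once(__be64 *slot, u64 val)
{
        WRITE_ONCE(*slot, cpu_to_be64(val));    /* swap in registers, then std */
}
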
> 
> Should we just do it in inline-asm I wonder?

There should be no fundamental correctness reason why the compiler
can't perform a volatile store with a byte-reversed store instruction.
The other option would be to add a compiler barrier() between each of
the plain stores. The reason I didn't is that in theory we don't need
to invalidate all of memory here, although in practice the resulting
code generation would probably be better.
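As a sketch of that barrier() alternative (not what the patch does), the
update in slb_shadow_update() would look something like:

/* Plain stores, ordered by compiler barriers.  barrier() clobbers all
 * memory, which is stronger than strictly needed here, but the stores
 * stay non-volatile so the byteswaps can still fuse into stdbrx. */
p->save_area[index].esid = 0;
barrier();
p->save_area[index].vsid = cpu_to_be64(mk_vsid_data(ea, ssize, flags));
barrier();
p->save_area[index].esid = cpu_to_be64(mk_esid_data(ea, ssize, index));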

Thanks,
Nick


