[PATCH] powerpc/64s: exception optimise MSR handling
Nicholas Piggin
npiggin at gmail.com
Tue Sep 20 15:32:53 AEST 2016
On Tue, 20 Sep 2016 14:25:48 +1000
Michael Ellerman <mpe at ellerman.id.au> wrote:
> Nicholas Piggin <npiggin at gmail.com> writes:
>
> > mtmsrd with L=1 only affects the MSR_EE and MSR_RI bits, and we always
> > know the state of those bits, so the kernel MSR does not need to be
> > loaded when modifying them.
> >
> > mtmsrd is often on the critical execution path, so avoiding a dependency
> > on even an L1 load is noticeable. On POWER8 this saves about 3 cycles
> > from the syscall path, and possibly a few from other exception returns
> > (not measured).
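For readers following along, the reason the PACAKMSR load can be dropped
is the semantics of the L=1 form: only EE and RI are sampled from the
source register. A sketch of that in C (bit positions taken from
arch/powerpc/include/asm/reg.h):

```c
#include <stdint.h>

/* MSR bit positions per arch/powerpc/include/asm/reg.h */
#define MSR_EE (1ULL << 15)	/* external interrupt enable */
#define MSR_RI (1ULL << 1)	/* recoverable interrupt */

/*
 * Model of "mtmsrd rS,1" (L=1 form): only the EE and RI fields of the
 * MSR are replaced from rS; every other MSR bit is left untouched.
 */
uint64_t mtmsrd_l1(uint64_t msr, uint64_t rs)
{
	uint64_t mask = MSR_EE | MSR_RI;

	return (msr & ~mask) | (rs & mask);
}
```

Since only EE and RI are taken from the source register, building the
value with li/ori produces exactly the same MSR update as loading
PACAKMSR and or-ing in MSR_EE, given that the kernel MSR always has RI
set and EE clear.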
>
> This looks good in principle.
>
> My worry is that these code paths have lots of assumptions about what's
> left in registers, so we may have a path somewhere which expects rX to
> contain PACAKMSR. Hence the review below ...
>
> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> > index 6b8bc0d..585b9ca 100644
> > --- a/arch/powerpc/kernel/entry_64.S
> > +++ b/arch/powerpc/kernel/entry_64.S
> > @@ -139,7 +139,7 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
> > #ifdef CONFIG_PPC_BOOK3E
> > wrteei 1
> > #else
> > - ld r11,PACAKMSR(r13)
> > + li r11,MSR_RI
> > ori r11,r11,MSR_EE
> > mtmsrd r11,1
> > #endif /* CONFIG_PPC_BOOK3E */
>
> /* We do need to set SOFTE in the stack frame or the return
> * from interrupt will be painful
> */
> li r10,1
> std r10,SOFTE(r1)
>
> CURRENT_THREAD_INFO(r11, r1)
>
> So that one's OK. r11 isn't used again until it's clobbered here.
>
>
> > @@ -195,7 +195,6 @@ system_call: /* label this so stack traces look sane */
>
> #ifdef CONFIG_PPC_BOOK3S
> /* No MSR:RI on BookE */
> andi. r10,r8,MSR_RI
> beq- unrecov_restore
> #endif
>
> So at this point r10 == MSR_RI, otherwise we would have taken the branch.
>
> /*
> * Disable interrupts so current_thread_info()->flags can't change,
> * and so that we don't get interrupted after loading SRR0/1.
> */
> > #ifdef CONFIG_PPC_BOOK3E
> > wrteei 0
> > #else
> > - ld r10,PACAKMSR(r13)
> > /*
> > * For performance reasons we clear RI the same time that we
> > * clear EE. We only need to clear RI just before we restore r13
> > * below, but batching it with EE saves us one expensive mtmsrd call.
> > * We have to be careful to restore RI if we branch anywhere from
> > * here (eg syscall_exit_work).
> > */
> > - li r9,MSR_RI
> > - andc r11,r10,r9
> > + li r11,0
> > mtmsrd r11,1
> > #endif /* CONFIG_PPC_BOOK3E */
>
> ld r9,TI_FLAGS(r12)
> li r11,-MAX_ERRNO
> andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
> bne- syscall_exit_work
>
> Which is:
>
> syscall_exit_work:
> #ifdef CONFIG_PPC_BOOK3S
> mtmsrd r10,1 /* Restore RI */
> #endif
>
> So that appears to still work. But it's super fragile.
Agreed. We'll go with the idea you mentioned offline of just loading r10
again here to avoid the long dependency -- it won't be a measurable cost.
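To spell out the invariant you traced: "andi. r10,r8,MSR_RI" leaves
r8 & MSR_RI in r10 and sets CR0 from the result, so if the beq- is not
taken, r10 must equal MSR_RI (it's a single bit). A C model of that
reasoning, just to make it checkable:

```c
#include <stdint.h>

#define MSR_RI (1ULL << 1)	/* recoverable interrupt */

/*
 * Model of "andi. rA,rS,UI": rA = rS & UI, with the CR0 "eq" bit set
 * iff the result is zero (that's what the beq- branch tests).
 */
uint64_t andi_dot(uint64_t rs, uint64_t ui, int *cr0_eq)
{
	uint64_t ra = rs & ui;

	*cr0_eq = (ra == 0);
	return ra;
}
```

Because MSR_RI is a single bit, any nonzero result of the mask is
MSR_RI itself, which is why syscall_exit_work could reuse r10 for its
"mtmsrd r10,1". Reloading r10 there trades that long register lifetime
for one cheap li.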
> What I'd like to do is drop that optimisation of clearing RI early with
> EE. That would avoid us needing to restore RI in syscall_exit_work and
> before restore_math (and reclearing it after).
>
> But I guess that will hurt performance too much :/
Yes, that's worth revisiting: newer cores and more kernel code using FP
registers may change the trade-off. I'll look into it.
Thanks,
Nick