[PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV
Nicholas Piggin
npiggin at gmail.com
Tue Dec 6 19:31:33 AEDT 2016
On Tue, 6 Dec 2016 17:09:07 +1100
Paul Mackerras <paulus at ozlabs.org> wrote:
> On Thu, Dec 01, 2016 at 06:18:10PM +1100, Nicholas Piggin wrote:
> > Change the calling convention to put the trap number together with
> > CR in two halves of r12, which frees up HSTATE_SCRATCH2 in the HV
> > handler, and leaves r9 free.
>
> Cute idea! Some comments below...
>
> > The 64-bit PR handler entry translates the calling convention back
> > to match the previous call convention (i.e., shared with 32-bit), for
> > simplicity.
> >
> > Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
> > ---
> > arch/powerpc/include/asm/exception-64s.h | 28 +++++++++++++++-------------
> > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 15 +++++++--------
> > arch/powerpc/kvm/book3s_segment.S | 27 ++++++++++++++++++++-------
> > 3 files changed, 42 insertions(+), 28 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> > index 9a3eee6..bc8fc45 100644
> > --- a/arch/powerpc/include/asm/exception-64s.h
> > +++ b/arch/powerpc/include/asm/exception-64s.h
> > @@ -233,7 +233,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
> >
> > #endif
> >
> > -#define __KVM_HANDLER_PROLOG(area, n) \
> > +#define __KVM_HANDLER(area, h, n) \
> > BEGIN_FTR_SECTION_NESTED(947) \
> > ld r10,area+EX_CFAR(r13); \
> > std r10,HSTATE_CFAR(r13); \
> > @@ -243,30 +243,32 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
> > std r10,HSTATE_PPR(r13); \
> > END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948); \
> > ld r10,area+EX_R10(r13); \
> > - stw r9,HSTATE_SCRATCH1(r13); \
> > - ld r9,area+EX_R9(r13); \
> > std r12,HSTATE_SCRATCH0(r13); \
> > -
> > -#define __KVM_HANDLER(area, h, n) \
> > - __KVM_HANDLER_PROLOG(area, n) \
> > - li r12,n; \
> > + li r12,(n); \
> > + sldi r12,r12,32; \
> > + or r12,r12,r9; \
>
> Did you consider doing it the other way around, i.e. with r12
> containing (cr << 32) | trap? That would save 1 instruction in each
> handler:

When I tinkered with it I thought it came out slightly nicer this way, but
your suggested versions seem to prove me wrong. I can change it if you'd
like.

>
> + sldi r12,r9,32; \
> + ori r12,r12,(n); \
>
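
For anyone following along, here is a rough C model of the two r12 layouts
being discussed. It is only a sketch: the function names are made up for
illustration and nothing here is kernel code. It assumes r9 holds the CR
image zero-extended to 64 bits (as mfcr leaves it), and that the trap number
fits the 16-bit unsigned immediate of ori, which is what lets the second
layout get away with sldi + ori instead of li + sldi + or.

```c
#include <assert.h>
#include <stdint.h>

/* Layout in the patch as posted: r12 = (trap << 32) | cr.
 * Models: li r12,(n); sldi r12,r12,32; or r12,r12,r9 */
static inline uint64_t pack_trap_hi(uint32_t cr, uint32_t trap)
{
    return ((uint64_t)trap << 32) | cr;
}

/* Layout suggested in review: r12 = (cr << 32) | trap.
 * Models: sldi r12,r9,32; ori r12,r12,(n) */
static inline uint64_t pack_cr_hi(uint32_t cr, uint32_t trap)
{
    /* ori takes a 16-bit unsigned immediate, so the trap must fit in it. */
    assert(trap <= 0xffffu);
    return ((uint64_t)cr << 32) | trap;
}

/* Recover the two fields from the layout in the patch as posted. */
static inline uint32_t trap_of_trap_hi(uint64_t r12) { return (uint32_t)(r12 >> 32); }
static inline uint32_t cr_of_trap_hi(uint64_t r12)   { return (uint32_t)r12; }

/* Recover the two fields from the layout suggested in review. */
static inline uint32_t trap_of_cr_hi(uint64_t r12)   { return (uint32_t)r12; }
static inline uint32_t cr_of_cr_hi(uint64_t r12)     { return (uint32_t)(r12 >> 32); }
```

Either layout carries the same information in one register; the difference
is only in how many instructions each handler needs to build it and how the
receiving code splits it apart again.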
> > + ld r9,area+EX_R9(r13); \
> > + std r9,HSTATE_SCRATCH1(r13); \
>
> Why not put this std in kvmppc_interrupt[_hv] rather than in each
> handler?

Patch 3/3 uses r9 to load the ctr when CONFIG_RELOCATABLE is turned on, so
doing the std in each handler kept the difference between the relocatable
and non-relocatable cases smaller. I agree it's not ideal when
CONFIG_RELOCATABLE is off.

[snip]
Thanks,
Nick