[PATCH 1/3] powerpc: move the testing of CPU_FTR_COHERENT_ICACHE into __flush_icache_range
Kevin Hao
haokexin at gmail.com
Wed Aug 7 16:09:08 EST 2013
On Tue, Aug 06, 2013 at 08:35:10PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2013-08-06 at 18:23 +0800, Kevin Hao wrote:
> > In function flush_icache_range(), we use cpu_has_feature() to test
> > the feature bit of CPU_FTR_COHERENT_ICACHE. But this seems not optimal
> > for two reasons:
> > a) For ppc32, the function __flush_icache_range() already do this
> > check with the macro END_FTR_SECTION_IFSET.
> > b) Compare with the cpu_has_feature(), the method of using macro
> > END_FTR_SECTION_IFSET will not introduce any runtime overhead.
>
> Nak.
>
> It adds the overhead of calling into a function :-)
>
> What about modifying cpu_has_feature to use jump labels ?
That's a great idea. I would like to gave it a try later.
>It might solve
> the problem of no runtime overhead ... however it might also be hard to
> keep the ability to remove the whole statement at compile time if the
> bit doesn't fit in the POSSIBLE mask... unless you find the right macro
> magic.
>
> In any case, I suspect the function call introduces more overhead than
> the bit test + conditional branch which will generally predict very
> well, so the patch as-is is probably a regression.
I don't think so. For the 64bit CPU which has a non-coherent icache,
there is no any effect introduced by this patch since (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
always yield true at compile time. But for the 64bit CPU which does have
the coherent icache, the following is the asm code before applying this patch:
c000000000023b4c: e9 22 86 68 ld r9,-31128(r2)
c000000000023b50: e9 29 00 00 ld r9,0(r9)
c000000000023b54: e9 29 00 10 ld r9,16(r9)
c000000000023b58: 79 2a 07 e1 clrldi. r10,r9,63
c000000000023b5c: 41 82 00 94 beq c000000000023bf0 <.handle_rt_signal64+0x670>
...
c000000000023bf0: 7f 23 cb 78 mr r3,r25
c000000000023bf4: 7f 64 db 78 mr r4,r27
c000000000023bf8: 48 7a 65 99 bl c0000000007ca190 <.__flush_icache_range>
After applying this patch, the following code is generated:
c000000000023b48: 7f 23 cb 78 mr r3,r25
c000000000023b4c: 7f 64 db 78 mr r4,r27
c000000000023b50: 48 7a 65 81 bl c0000000007ca0d0 <.flush_icache_range>
The run path for these two cases are:
before after
ld r9,-31128(r2) mr r3,r25
ld r9,0(r9) mr r4,r27
ld r9,16(r9) bl flush_icache_range
clrldi. r10,r9,63 blr
beq xxxx
So I don't think this will introduce more overhead than before.
On the contrary, I believe it yield less overhead than before.
Correct me if I am wrong.
>
> Did you measure ?
No. I don't have a access to 64bit CPU which has a coherent icache.
For a non-coherent icache 64bit CPU this patch will not cause any effect.
Thanks,
Kevin
>
> Ben.
>
>
> > Signed-off-by: Kevin Hao <haokexin at gmail.com>
> > ---
> > arch/powerpc/include/asm/cacheflush.h | 3 +--
> > arch/powerpc/kernel/misc_64.S | 4 +++-
> > 2 files changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
> > index b843e35..60b620d 100644
> > --- a/arch/powerpc/include/asm/cacheflush.h
> > +++ b/arch/powerpc/include/asm/cacheflush.h
> > @@ -35,8 +35,7 @@ extern void __flush_disable_L1(void);
> > extern void __flush_icache_range(unsigned long, unsigned long);
> > static inline void flush_icache_range(unsigned long start, unsigned long stop)
> > {
> > - if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
> > - __flush_icache_range(start, stop);
> > + __flush_icache_range(start, stop);
> > }
> >
> > extern void flush_icache_user_range(struct vm_area_struct *vma,
> > diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
> > index 6820e45..74d87f1 100644
> > --- a/arch/powerpc/kernel/misc_64.S
> > +++ b/arch/powerpc/kernel/misc_64.S
> > @@ -68,7 +68,9 @@ PPC64_CACHES:
> > */
> >
> > _KPROBE(__flush_icache_range)
> > -
> > +BEGIN_FTR_SECTION
> > + blr
> > +END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> > /*
> > * Flush the data cache to memory
> > *
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 490 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20130807/662fa02b/attachment.sig>
More information about the Linuxppc-dev
mailing list