[PATCH 1/3] powerpc: move the testing of CPU_FTR_COHERENT_ICACHE into __flush_icache_range

Kevin Hao haokexin at gmail.com
Wed Aug 7 16:09:08 EST 2013


On Tue, Aug 06, 2013 at 08:35:10PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2013-08-06 at 18:23 +0800, Kevin Hao wrote:
> > In function flush_icache_range(), we use cpu_has_feature() to test
> > the feature bit of CPU_FTR_COHERENT_ICACHE. But this seems not optimal
> > for two reasons:
> >  a) For ppc32, the function __flush_icache_range() already do this
> >     check with the macro END_FTR_SECTION_IFSET.
> >  b) Compare with the cpu_has_feature(), the method of using macro
> >     END_FTR_SECTION_IFSET will not introduce any runtime overhead.
> 
> Nak.
> 
> It adds the overhead of calling into a function :-)
> 
> What about modifying cpu_has_feature to use jump labels ? 

That's a great idea. I would like to gave it a try later.

>It might solve
> the problem of no runtime overhead ... however it might also be hard to
> keep the ability to remove the whole statement at compile time if the
> bit doesn't fit in the POSSIBLE mask... unless you find the right macro
> magic.
> 
> In any case, I suspect the function call introduces more overhead than
> the bit test + conditional branch which will generally predict very
> well, so the patch as-is is probably a regression.

I don't think so. For the 64bit CPU which has a non-coherent icache,
there is no any effect introduced by this patch since (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
always yield true at compile time. But for the 64bit CPU which does have
the coherent icache, the following is the asm code before applying this patch:

  c000000000023b4c:	e9 22 86 68 	ld      r9,-31128(r2)
  c000000000023b50:	e9 29 00 00 	ld      r9,0(r9)
  c000000000023b54:	e9 29 00 10 	ld      r9,16(r9)
  c000000000023b58:	79 2a 07 e1 	clrldi. r10,r9,63
  c000000000023b5c:	41 82 00 94 	beq     c000000000023bf0 <.handle_rt_signal64+0x670>
  ... 
  c000000000023bf0:	7f 23 cb 78 	mr      r3,r25
  c000000000023bf4:	7f 64 db 78 	mr      r4,r27
  c000000000023bf8:	48 7a 65 99 	bl      c0000000007ca190 <.__flush_icache_range>

After applying this patch, the following code is generated:
  c000000000023b48:	7f 23 cb 78 	mr      r3,r25
  c000000000023b4c:	7f 64 db 78 	mr      r4,r27
  c000000000023b50:	48 7a 65 81 	bl      c0000000007ca0d0 <.flush_icache_range>


The run path for these two cases are:

        before                              after
  ld r9,-31128(r2)                        mr r3,r25
  ld r9,0(r9)                             mr r4,r27
  ld r9,16(r9)                            bl flush_icache_range
  clrldi. r10,r9,63                       blr
  beq   xxxx

So I don't think this will introduce more overhead than before.
On the contrary, I believe it yield less overhead than before.
Correct me if I am wrong.

> 
> Did you measure ?

No. I don't have a access to 64bit CPU which has a coherent icache.
For a non-coherent icache 64bit CPU this patch will not cause any effect.

Thanks,
Kevin

> 
> Ben.
> 
> 
> > Signed-off-by: Kevin Hao <haokexin at gmail.com>
> > ---
> >  arch/powerpc/include/asm/cacheflush.h | 3 +--
> >  arch/powerpc/kernel/misc_64.S         | 4 +++-
> >  2 files changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
> > index b843e35..60b620d 100644
> > --- a/arch/powerpc/include/asm/cacheflush.h
> > +++ b/arch/powerpc/include/asm/cacheflush.h
> > @@ -35,8 +35,7 @@ extern void __flush_disable_L1(void);
> >  extern void __flush_icache_range(unsigned long, unsigned long);
> >  static inline void flush_icache_range(unsigned long start, unsigned long stop)
> >  {
> > -	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
> > -		__flush_icache_range(start, stop);
> > +	__flush_icache_range(start, stop);
> >  }
> >  
> >  extern void flush_icache_user_range(struct vm_area_struct *vma,
> > diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
> > index 6820e45..74d87f1 100644
> > --- a/arch/powerpc/kernel/misc_64.S
> > +++ b/arch/powerpc/kernel/misc_64.S
> > @@ -68,7 +68,9 @@ PPC64_CACHES:
> >   */
> >  
> >  _KPROBE(__flush_icache_range)
> > -
> > +BEGIN_FTR_SECTION
> > +	blr
> > +END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> >  /*
> >   * Flush the data cache to memory 
> >   * 
> 
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 490 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20130807/662fa02b/attachment.sig>


More information about the Linuxppc-dev mailing list