[RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

Gunnar von Boehn VONBOEHN at de.ibm.com
Fri Jun 27 23:30:58 EST 2008


Hi Paul,


> In my experience, dcbz slows down the hot-cache case because it adds a
> few cycles to the execution time of the inner loop, and on most 64-bit
> PowerPC implementations, it doesn't actually help even in the
> cold-cache case because the store queue does enough write combining

I agree with you that on POWER the dcbz is probably not helping.

On PowerPC my experience is different.
From what I have seen, DCBZ helps enormously on 970, PA-Semi and CELL.


Cheers
Gunnar



                                                                           
From: Paul Mackerras <paulus at samba.org>
Date: 24/06/2008 01:49
To: Gunnar von Boehn/Germany/Contr/IBM at IBMDE
cc: sanjay3000 at yahoo.com, Mark Nelson <markn at au1.ibm.com>,
    linuxppc-dev at ozlabs.org, Michael Ellerman <ellerman at au1.ibm.com>,
    cbe-oss-dev at ozlabs.org, Arnd Bergmann <arnd at arndb.de>
Subject: Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell




Gunnar von Boehn writes:

> Interesting points.
> Can you help me understand where the negative effect of DCBZ comes
> from?

In my experience, dcbz slows down the hot-cache case because it adds a
few cycles to the execution time of the inner loop, and on most 64-bit
PowerPC implementations, it doesn't actually help even in the
cold-cache case because the store queue does enough write combining
that the cache doesn't end up reading the line from memory.  I don't
know whether the Cell PPE can do that, but I could believe that it
can't.
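
For readers following along, the trade-off above can be sketched in C.
This is only an illustration, not the actual __copy_tofrom_user code:
the names copy_lines, zero_line and LINE_BYTES are made up for the
example, and the 128-byte line size is an assumption that does not hold
on every PowerPC part. With use_dcbz set, each destination line is
established in the cache as zeros before it is stored to, which costs an
extra instruction per line in the hot-cache case but avoids reading the
line from memory when the store queue cannot combine the writes itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 128-byte cache lines; other PowerPC parts may differ. */
#define LINE_BYTES 128

/* Hypothetical helper: establish one destination line as zeros. */
static inline void zero_line(void *line)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    /* dcbz zeroes the cache block without fetching it from memory.
     * The target must be cacheable memory. */
    __asm__ volatile("dcbz 0,%0" : : "r"(line) : "memory");
#else
    /* Portable stand-in so the sketch builds elsewhere; it gives the
     * same visible result but none of the cache behaviour. */
    memset(line, 0, LINE_BYTES);
#endif
}

/* Hypothetical helper: copy n_lines whole cache lines.
 * dst must be LINE_BYTES-aligned; the buffers must not overlap. */
static void copy_lines(void *dst, const void *src, size_t n_lines,
                       int use_dcbz)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(((uintptr_t)dst % LINE_BYTES) == 0);

    for (size_t i = 0; i < n_lines; i++) {
        if (use_dcbz)
            zero_line(d);          /* extra work per line when the cache is hot */
        memcpy(d, s, LINE_BYTES);  /* stores that fill the whole line */
        d += LINE_BYTES;
        s += LINE_BYTES;
    }
}
```

Timing copy_lines with use_dcbz set to 0 and to 1, on buffers that fit
in cache and on buffers that do not, is one way to see which of the two
effects dominates on a given CPU.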

Paul.




