Some issues to resolve with XFree 4.0 yet

Kevin B. Hendricks khendricks at ivey.uwo.ca
Tue Mar 28 05:06:35 EST 2000


Hi Ani and Ryuichi,

>are you using the patch I posted last week?  If not, then I suggest you
>do.  I fixed the improper load/stores in r128 and it shows a 200% increase
>in almost all x11perf tests.

Actually, you might want to try Gabriel Paubert's patch which simply
removes the "volatile" from the base_addr parameter.  The incorrectly
specified volatile on the parameter (which really makes no sense if you
think about it ;-)) is what was causing all the problems with inefficiency.
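
To make the point concrete, here is a minimal sketch in portable C (not the
driver code).  The names fake_regs, read_param_volatile and
read_target_volatile are made up for illustration, and a plain array stands
in for the mapped registers.  A qualifier on a by-value parameter applies
only to the callee's private copy of the address, so the compiler has to
treat every use of that copy as an observable access, which typically means
spilling it to the stack and reloading it.  The qualifier that matters for
register access is the one on the pointed-to object:

```c
#include <stdint.h>

/* Hypothetical stand-in for a block of device registers. */
static uint32_t fake_regs[64];

/* volatile on the by-value parameter: this only qualifies the local copy
 * of the address, so each use of base_addr must be a real memory access
 * to that copy.  It says nothing about the register being read. */
static inline uint32_t read_param_volatile(volatile uintptr_t base_addr,
                                           uintptr_t regindex)
{
    return *(volatile uint32_t *)(base_addr + regindex);
}

/* volatile only on the pointed-to object: the address can stay in a CPU
 * register, while the load from the device register is still performed. */
static inline uint32_t read_target_volatile(uintptr_t base_addr,
                                            uintptr_t regindex)
{
    return *(volatile uint32_t *)(base_addr + regindex);
}
```

Both functions return the same value for the same arguments; only the
generated code differs.  Comparing the output of "gcc -O2 -S" for the two
should show whether your compiler emits the extra store and reload.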

Interestingly, with this patch you can actually save one extra instruction
over Ani's patch but either one is a big big improvement.

Kevin


----snip-here-for Gabriel_Paubert's_e-mail_with_patch----

> Hi,
>
> From comparing the performance of the XFree 4.0 r128 drivers across x86 and
> ppc we noticed that the ppc version was much slower.  The following patch
> made a huge change in x11perf results (improvement).  This is on a ppc
> with glibc 2.1.3 and the latest gcc 2.95.2 from Franz Sirl.
>
> Did I write the output constraint version incorrectly?  Is this what you
> expected the generated code to look like?

I have just tested removing the volatile from the base_addr parameter of
the regr/regw/regr16/regw16 inline functions, and the code is even better
(one instruction less than with the memory clobber):

000003d4 <R128Blank>:
     3d4:       81 43 00 f8     lwz     r10,248(r3)
     3d8:       81 6a 00 24     lwz     r11,36(r10)
     3dc:       39 20 00 54     li      r9,84
     3e0:       7c 09 5c 2c     lwbrx   r0,r9,r11
     3e4:       7c 00 06 ac     eieio
     3e8:       60 00 04 00     ori     r0,r0,1024
     3ec:       7c 09 5d 2c     stwbrx  r0,r9,r11
     3f0:       7c 00 06 ac     eieio
     3f4:       4e 80 00 20     blr
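
For readers who do not read PowerPC assembly, the listing above can be
sketched roughly in portable C as follows.  This is only an illustration
under some assumptions: regr_le, regw_le and blank_sketch are made-up
names, the byte-wise accesses stand in for lwbrx/stwbrx (byte-reversed
load and store), and the eieio ordering barriers have no equivalent here.
The offset 84 and the mask 1024 are taken from the "li r9,84" and
"ori r0,r0,1024" instructions.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value at base + index regardless of host
 * byte order, which is the effect lwbrx has on a big-endian PowerPC. */
static inline uint32_t regr_le(const volatile uint8_t *base, uint32_t index)
{
    const volatile uint8_t *p = base + index;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit value in little-endian byte order (the stwbrx analogue). */
static inline void regw_le(volatile uint8_t *base, uint32_t index,
                           uint32_t val)
{
    volatile uint8_t *p = base + index;
    p[0] = (uint8_t)(val & 0xff);
    p[1] = (uint8_t)((val >> 8) & 0xff);
    p[2] = (uint8_t)((val >> 16) & 0xff);
    p[3] = (uint8_t)((val >> 24) & 0xff);
}

/* Read-modify-write of the register at offset 84, setting bit 10 (1024). */
static void blank_sketch(volatile uint8_t *mmio)
{
    uint32_t v = regr_le(mmio, 84);
    regw_le(mmio, 84, v | 1024);
}
```

Calling blank_sketch() on an ordinary byte buffer sets bit 10 of the
little-endian word at offset 84 and leaves the other bits alone.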

the diff is:
--- r128_reg.h~	Sat Feb 26 06:38:43 2000
+++ r128_reg.h	Fri Mar 24 23:47:31 2000
@@ -48,19 +48,19 @@

 #if defined(__powerpc__)

-static inline void regw(volatile unsigned long base_addr, unsigned long regindex, unsigned long regdata)
+static inline void regw(unsigned long base_addr, unsigned long regindex, unsigned long regdata)
 {
  asm volatile ("stwbrx %1,%2,%3; eieio"
           : "=m" (*(volatile unsigned *)(base_addr+regindex))
           : "r" (regdata), "b" (regindex), "r" (base_addr));
 }

-static inline void regw16(volatile unsigned long base_addr, unsigned long regindex, unsigned short regdata)
+static inline void regw16(unsigned long base_addr, unsigned long regindex, unsigned short regdata)
 {
   asm volatile ("sthbrx %0,%1,%2; eieio": : "r"(regdata), "b"(regindex), "r"(base_addr));
 }

-static inline unsigned long regr(volatile unsigned long base_addr, unsigned long regindex)
+static inline unsigned long regr(unsigned long base_addr, unsigned long regindex)
 {
   register unsigned long val;
   asm volatile ("lwbrx %0,%1,%2; eieio"
@@ -70,7 +70,7 @@
   return(val);
 }

-static inline unsigned short regr16(volatile unsigned long base_addr, unsigned long regindex)
+static inline unsigned short regr16(unsigned long base_addr, unsigned long regindex)
 {
   register unsigned short val;
   asm volatile ("lhbrx %0,%1,%2; eieio": "=r"(val):"b"(regindex), "r"(base_addr));


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/




