Fwd: Re: still no accelerated X ($#!$*)
Franz Sirl
Franz.Sirl-kernel at lauterbach.com
Fri Jan 21 05:46:17 EST 2000
At 19:12 20.01.00 , Kevin Hendricks wrote:
>Hi,
>
>Can anyone explain this to me?
>
> > Finally I got it!
>
> > asm("stwbrx %0,%1,%2": : "r"(regdata), "r"(regindex), "r"(base_addr));
>
> > asm("lwbrx %0,%1,%2": "=r"(val):"r"(regindex), "r"(base_addr));
>
> > asm("stwbrx %0,%1,%2": : "r"(regdata), "b"(regindex), "r"(base_addr));
>
> > asm("lwbrx %0,%1,%2": "=r"(val):"b"(regindex), "r"(base_addr));
>
>
> > Don't know if this is correct (no clue about ppc assembly), but it works...
>
>Well I did the following with the attached sample program:
>
>gcc -O0 -S testit.c
>
>then I looked at testit.s (the assembler).
>
>old_regw:
> stwu 1,-32(1)
> stw 31,28(1)
> mr 31,1
> stw 3,8(31)
> mr 0,4
> lis 11,mach64MemReg@ha
> lwz 9,mach64MemReg@l(11)
> lwz 11,8(31)
> stwbrx 0,11,9
>.L2:
> lwz 11,0(1)
> lwz 31,-4(11)
> mr 1,11
> blr
>.Lfe2:
> .size old_regw,.Lfe2-old_regw
> .align 2
> .type old_regr,@function
>old_regr:
> stwu 1,-32(1)
> stw 31,28(1)
> mr 31,1
> stw 3,8(31)
> lis 9,mach64MemReg@ha
> lwz 0,mach64MemReg@l(9)
> lwz 11,8(31)
> lwbrx 9,11,0
> mr 3,9
> b .L3
>.L3:
> lwz 11,0(1)
> lwz 31,-4(11)
> mr 1,11
> blr
>.Lfe3:
> .size old_regr,.Lfe3-old_regr
> .align 2
>regw:
> stwu 1,-32(1)
> stw 31,28(1)
> mr 31,1
> stw 3,8(31)
> mr 0,4
> lis 11,mach64MemReg@ha
> lwz 9,mach64MemReg@l(11)
> lwz 11,8(31)
> stwbrx 0,11,9
>.L4:
> lwz 11,0(1)
> lwz 31,-4(11)
> mr 1,11
> blr
>.Lfe4:
> .size regw,.Lfe4-regw
> .align 2
> .type regr,@function
>regr:
> stwu 1,-32(1)
> stw 31,28(1)
> mr 31,1
> stw 3,8(31)
> lis 9,mach64MemReg@ha
> lwz 0,mach64MemReg@l(9)
> lwz 11,8(31)
> lwbrx 9,11,0
> mr 3,9
> b .L5
>.L5:
> lwz 11,0(1)
> lwz 31,-4(11)
> mr 1,11
> blr
>
>And I simply cannot see any difference in the actual code produced by either
>pair of asm statements, which leads me to believe that something else is
>going on here.
>
>I would love to know exactly what.
>
>Could you please compile the code I attached, look at the assembler output,
>and compare old_regr with regr and old_regw with regw to see whether you find
>any differences?
Kevin,
the fix is correct. You cannot use "r" (which allows r0-r31) as the
constraint for a base register; you have to use "b" (which allows r1-r31).
In the base (RA) slot of an indexed load or store such as lwbrx or stwbrx,
register number 0 is read as the constant 0, not as the contents of r0.
This is not easy to reproduce with a small test program, because it only
fails when the compiler happens to assign r0 to that operand, which depends
on a lot of factors. In practice the outdated egcs-1.1.2 seems less likely
to trigger this bug than the current gcc-2.95.2, probably because the better
optimizers in gcc-2.95.2 produce higher register pressure.
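To make the r0 special case concrete, here is a minimal sketch in plain C that models how an indexed load or store forms its effective address. It is only an illustration: the function name indexed_ea and the gpr array are hypothetical and do not come from the driver or from the compiler output quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model (not from the original mail) of how an indexed
 * PowerPC load/store such as lwbrx or stwbrx forms its effective address:
 *     EA = (RA|0) + (RB)
 * An RA field of 0 means the constant 0, not the contents of r0. */
static uint32_t indexed_ea(const uint32_t gpr[32], unsigned ra, unsigned rb)
{
    assert(ra < 32 && rb < 32);                /* only 32 GPRs exist */
    uint32_t base = (ra == 0) ? 0u : gpr[ra];  /* r0 in the RA slot reads as 0 */
    return base + gpr[rb];
}
```

With the index held in r11, as in the quoted testit.s output, the model yields base plus index. If the compiler had picked r0 for the same operand, the index would be dropped silently and the access would land at offset 0 of the register block, which is why the small test program shows no difference while the real driver can fail.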
Additionally, with inline assembly you should always be as explicit as
possible. For anything that may rely on ordering, add 'volatile', and
for writes also add a "memory" clobber:
asm volatile ("stwbrx %0,%1,%2" : :"r"(regdata), "b"(regindex),
"r"(base_addr) : "memory");
asm volatile ("lwbrx %0,%1,%2" : "=r"(val) : "b"(regindex), "r"(base_addr));
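The two statements above could be wrapped in small helpers along the following lines. This is a sketch under assumptions, not the driver's actual code: the names regw_le32, regr_le32 and USE_PPC_BREV are made up for illustration, the asm path is only enabled when GCC's predefined macros report a big-endian PowerPC target, and every other target gets a byte-wise fallback with the same little-endian memory layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: take the asm path only on big-endian PowerPC, where
 * stwbrx/lwbrx produce and consume little-endian data in memory. */
#if (defined(__powerpc__) || defined(__PPC__)) && defined(__BYTE_ORDER__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define USE_PPC_BREV 1
#else
#define USE_PPC_BREV 0
#endif

/* Hypothetical helper: store regdata little-endian at base_addr + regindex. */
static inline void regw_le32(volatile void *base_addr, unsigned long regindex,
                             uint32_t regdata)
{
    assert(base_addr != 0);
#if USE_PPC_BREV
    /* "b" keeps regindex out of r0; "memory" orders the write. */
    __asm__ volatile ("stwbrx %0,%1,%2"
                      :
                      : "r"(regdata), "b"(regindex), "r"(base_addr)
                      : "memory");
#else
    volatile uint8_t *p = (volatile uint8_t *)base_addr + regindex;
    p[0] = (uint8_t)(regdata);
    p[1] = (uint8_t)(regdata >> 8);
    p[2] = (uint8_t)(regdata >> 16);
    p[3] = (uint8_t)(regdata >> 24);
#endif
}

/* Hypothetical helper: load a little-endian word from base_addr + regindex. */
static inline uint32_t regr_le32(const volatile void *base_addr,
                                 unsigned long regindex)
{
    assert(base_addr != 0);
#if USE_PPC_BREV
    uint32_t val;
    __asm__ volatile ("lwbrx %0,%1,%2"
                      : "=r"(val)
                      : "b"(regindex), "r"(base_addr));
    return val;
#else
    const volatile uint8_t *p = (const volatile uint8_t *)base_addr + regindex;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
#endif
}
```

Keeping the constraints in one place like this means the "b" versus "r" decision is made once instead of at every call site.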
Franz.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/