pte_update and 64-bit PTEs on PPC32?

Kumar Gala kumar.gala at freescale.com
Sat Apr 9 00:08:28 EST 2005


On Apr 8, 2005, at 3:26 AM, Gabriel Paubert wrote:

> On Wed, Apr 06, 2005 at 04:33:14PM -0500, Kumar Gala wrote:
> > Here is a version that works if CONFIG_PTE_64BIT is defined.  If we
> > like this, I can simplify the pte_update so we dont need the
> > (unsigned long)(p+1) - 4) trick anymore.  Let me know.
> >
> > - kumar
> >
> > #ifdef CONFIG_PTE_64BIT
> > static inline unsigned long long pte_update(pte_t *p, unsigned long clr,
> >                                        unsigned long set)
> > {
> >         unsigned long long old;
> >         unsigned long tmp;
> >
> >         __asm__ __volatile__("\
> > 1:      lwarx   %L0,0,%4\n\
> >         lwzx    %0,0,%3\n\
> >         andc    %1,%L0,%5\n\
> >         or      %1,%1,%6\n\
> >         stwcx.  %1,0,%4\n\
> >         bne-    1b"
> >         : "=&r" (old), "=&r" (tmp), "=m" (*p)
> >         : "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set),
> >           "m" (*p)
>
> Are you sure of your pointer arithmetic? I believe that
> you'd rather want to use (unsigned char *)(p)+4. Or even better:

Realize that I'm converting the pointer to an integer, so it's not exactly 
normal pointer math.  I was sticking with the pre-existing style.
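
To make the distinction concrete, here is a small sketch of the two kinds of
arithmetic.  The pte_t typedef and the helper names are stand-ins I made up
for illustration (they are not the kernel definitions), and uintptr_t is used
in place of unsigned long so the sketch also builds on hosts where the two
differ in size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit PTE; not the kernel's pte_t. */
typedef unsigned long long pte_t;

/* Integer arithmetic: the pointer is converted first, so "+ 4" means
 * four bytes, i.e. the second 32-bit word of the entry. */
static uintptr_t low_word_addr_int(pte_t *p)
{
        return (uintptr_t)p + 4;
}

/* The older trick: step one whole entry forward with pointer
 * arithmetic, then step back four bytes as an integer. */
static uintptr_t low_word_addr_trick(pte_t *p)
{
        return (uintptr_t)(p + 1) - 4;
}

/* Pointer arithmetic: "+ 4" is scaled by sizeof(pte_t), so this lands
 * four entries (32 bytes) further on, not four bytes. */
static uintptr_t scaled_addr(pte_t *p)
{
        return (uintptr_t)(p + 4);
}

/* Both integer forms name the same address when the entry is 8 bytes. */
static void check_equivalence(pte_t *p)
{
        assert(sizeof(pte_t) == 8);
        assert(low_word_addr_int(p) == low_word_addr_trick(p));
}
```

With an 8-byte entry both integer forms give the same address, which is why
the (p+1) - 4 trick can be dropped in favour of the plain + 4.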

>
> :"r" (p), "b" (4), "r" (clr), "r" (set)
>
> and change the first line to:  lwarx %L0,%4,%3.
>
> Even more devious, you don't need the %4 parameter:
>
>         li %L0,4
>         lwarx %L0,%L0,%3
>
> since %L0 cannot be r0. This saves one register.

Actually, the compiler effectively does this for me.  If you look at the 
generated asm, the only additional instructions are an 'addi' and some 
'mr's to get things into the correct registers for the return.  
I'm not really sure there is much else to do to optimize this.
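
Independent of how the offset register gets set up, what the loop computes
can be modelled in portable C.  This is only a sketch under assumptions:
struct pte64 and pte_update_model are made-up names, the struct mimics the
big-endian layout (high word at offset 0, low word at offset 4), and the
lwarx/stwcx. pair is approximated with the GCC/Clang __atomic
compare-and-exchange builtin rather than a real reservation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit PTE as laid out on big-endian PPC32:
 * high word first, low word (the word being updated) second. */
struct pte64 {
        uint32_t hi;
        uint32_t lo;
};

/* Clear the 'clr' bits and set the 'set' bits in the low word, and
 * return the old 64-bit value.  Only the low word is updated
 * atomically; the high word is a plain load, as in the asm. */
static uint64_t pte_update_model(struct pte64 *p, uint32_t clr, uint32_t set)
{
        uint32_t old_lo, old_hi, new_lo;

        assert(p != 0);

        old_lo = __atomic_load_n(&p->lo, __ATOMIC_RELAXED);      /* lwarx  */
        do {
                old_hi = __atomic_load_n(&p->hi, __ATOMIC_RELAXED); /* lwzx */
                new_lo = (old_lo & ~clr) | set;                  /* andc, or */
                /* On failure old_lo is refreshed and we retry (bne- 1b). */
        } while (!__atomic_compare_exchange_n(&p->lo, &old_lo, new_lo, 1,
                                              __ATOMIC_RELAXED,
                                              __ATOMIC_RELAXED)); /* stwcx. */

        return ((uint64_t)old_hi << 32) | old_lo;
}
```

For example, starting from hi = 0x12345678 and lo = 0xff, a call with
clr = 0x0f and set = 0x100 returns the old value and leaves lo = 0x1f0 with
hi untouched.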

> >         : "cc" );
>
> On PPC, I always prefer saying cr0 over cc. Maybe it's just
> me, but it's the canonical register name in the architecture.

I was sticking with the style of what already existed, but I agree that 
cr0 is more natural to read than cc.

- kumar



