[RFC] Optimize __arch_swab32 and __arch_swab16

Joakim Tjernlund joakim.tjernlund at transmode.se
Thu Aug 11 19:43:57 EST 2011


"David Laight" <David.Laight at ACULAB.COM> wrote on 2011/08/11 11:29:33:
>
>
> > > Which is a problem because the compiler could schedule
> > > it to be written back to real memory between the instructions.
> >
> > It can? There is no memory here, just registers. Even if it
> > is written to memory, how would that affect the register?
>
> Although the function argument is passed in a register, the
> compiler could generate a store-load sequence before and
> after each __asm__() line.

Ah, I see. It seems strange that the compiler would do that for
the register in use (value). Other regs perhaps.

>
> > Assuming you are right, would rewriting it to
> >   __asm__("rlwimi %0,%0,16,0x00ff0000\n\t"
> >        "rlwinm %0,%0,24,0x0000ffff"
> >        : "+r"(value));
> > help?
>
> Except that now you've stopped the compiler scheduling
> another instruction between the two - probably forcing an
> execution stall.

But this should be better than using 2 more insns and an
extra register.
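
For reference, a rough sketch in plain C of what the two-insn sequence
quoted above computes. This is only a model, not the proposed patch: the
helper names (rotl32, rlwimi_model, rlwinm_model, swab16_model) are made
up for illustration, and it assumes the usual semantics of the mask forms,
i.e. rlwimi inserts the rotated source into the destination under the mask
and rlwinm rotates and then ANDs with the mask. Read that way, the pair
leaves the byte-swapped low halfword in the register.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by sh bits (0 <= sh < 32). */
static uint32_t rotl32(uint32_t x, unsigned sh)
{
	return (x << sh) | (x >> ((32 - sh) & 31));
}

/* Hypothetical model of "rlwimi ra,rs,sh,mask":
 * rotate rs left by sh, then insert it into ra under mask. */
static uint32_t rlwimi_model(uint32_t ra, uint32_t rs, unsigned sh,
			     uint32_t mask)
{
	return (rotl32(rs, sh) & mask) | (ra & ~mask);
}

/* Hypothetical model of "rlwinm ra,rs,sh,mask":
 * rotate rs left by sh, then AND with mask. */
static uint32_t rlwinm_model(uint32_t rs, unsigned sh, uint32_t mask)
{
	return rotl32(rs, sh) & mask;
}

/* Model of the two-instruction sequence with "+r"(value):
 * both instructions read and write the same register. */
static uint16_t swab16_model(uint16_t value)
{
	uint32_t v = value;

	/* Copy the low byte into bits 16..23: 0x0000AABB -> 0x00BBAABB */
	v = rlwimi_model(v, v, 16, 0x00ff0000u);
	/* Rotate right by 8 and keep the low halfword:  -> 0x0000BBAA */
	v = rlwinm_model(v, 24, 0x0000ffffu);
	return (uint16_t)v;
}
```

The second step depends on the result of the first, which is where the
stall concern comes from: nothing useful can be computed on value between
the two.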

 Jocke


