[RFC] Optimize __arch_swab32 and __arch_swab16
Joakim Tjernlund
joakim.tjernlund at transmode.se
Thu Aug 11 19:23:01 EST 2011
"David Laight" <David.Laight at ACULAB.COM> wrote on 2011/08/11 10:56:26:
>
> > Joakim Tjernlund <joakim.tjernlund at transmode.se> writes:
> >
> > > unsigned short my__arch_swab16(unsigned short value)
> > > {
> > > __asm__("rlwimi %0,%0,16,0x00ff0000"
> > > : "+r" (value));
> >
> > You are creating a value that does not fit in a short.
>
> Which is a problem because the compiler could schedule
> it to be written back to real memory between the instructions.
It can? There is no memory here, just registers. Even if it
is written to memory, how would that affect the register?
Assuming you are right, would rewriting it to
__asm__("rlwimi %0,%0,16,0x00ff0000\n\t"
"rlwinm %0,%0,24,0x0000ffff"
: "+r"(value));
help?
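To make the question concrete, here is a rough portable C model of the two-instruction sequence. It is only a sketch: the helper names (rotl32, model_rlwimi, model_rlwinm, swab16_step1, swab16_model) are made up for illustration and are not kernel code, and the model assumes the usual rotate-then-mask semantics of rlwimi/rlwinm. It shows that the register holds a value wider than 16 bits after the first instruction and a valid 16-bit value after the second.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (valid for 0 < n < 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* Model of "rlwimi rA,rS,sh,mask": rotate rS left by sh, then insert the
 * bits selected by mask into rA, leaving the other bits of rA untouched. */
static uint32_t model_rlwimi(uint32_t ra, uint32_t rs, unsigned sh,
                             uint32_t mask)
{
    return (rotl32(rs, sh) & mask) | (ra & ~mask);
}

/* Model of "rlwinm rA,rS,sh,mask": rotate rS left by sh, then AND with mask. */
static uint32_t model_rlwinm(uint32_t rs, unsigned sh, uint32_t mask)
{
    return rotl32(rs, sh) & mask;
}

/* Register contents after the first instruction only. */
static uint32_t swab16_step1(uint16_t value)
{
    uint32_t r = value;
    return model_rlwimi(r, r, 16, 0x00ff0000u);
}

/* Register contents after both instructions, truncated to 16 bits. */
static uint16_t swab16_model(uint16_t value)
{
    uint32_t r = swab16_step1(value);
    r = model_rlwinm(r, 24, 0x0000ffffu);
    return (uint16_t)r;
}

/* Small self-check of the model (hypothetical helper, not part of the patch). */
static void swab16_model_selfcheck(void)
{
    assert(swab16_step1(0xAABB) == 0x00BBAABBu); /* does not fit in a short */
    assert(swab16_model(0xAABB) == 0xBBAA);      /* fits again after rlwinm */
}
```

In this model the intermediate value only exists inside the single asm statement, which is the point of putting both instructions in one __asm__.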
>
> Actually the generated code would be better if the swap16()
> functions operated on 'unsigned int' fields - since it would
> save the compiler from doing a lot of shifts/masks elsewhere.
>
> For instance there is likely to be a mask with 0xffff prior
> to the call to swab16().
>
> Since one use of these is for htons() (etc), when
> the host is the correct endianness these are #defines that
> do nothing - so out of range values aren't masked.
> So it seems to me that defining:
> unsigned int swab(unsigned int);
> would be fine - except it clashes with standard headers :-(
>
> David
>
>
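For reference, an 'unsigned int' variant along the lines David suggests might look like the sketch below. The name swab16_uint is hypothetical (chosen to avoid the clash with swab() in the standard headers), and discarding the upper 16 bits of the argument is my assumption, not something specified in this thread.

```c
#include <assert.h>

/* Hypothetical 'unsigned int' flavour of swab16: swaps the two low bytes
 * and ignores anything above bit 15, so the caller needs no 0xffff mask
 * before the call. */
static unsigned int swab16_uint(unsigned int x)
{
    return ((x & 0x00ffu) << 8) | ((x >> 8) & 0x00ffu);
}

/* Small self-check (hypothetical helper). */
static void swab16_uint_selfcheck(void)
{
    assert(swab16_uint(0x1234u) == 0x3412u);
    assert(swab16_uint(0xdead1234u) == 0x3412u); /* upper bits are dropped */
}
```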