[RFC] Optimize __arch_swab32 and __arch_swab16

David Laight David.Laight at ACULAB.COM
Thu Aug 11 18:56:26 EST 2011


> Joakim Tjernlund <joakim.tjernlund at transmode.se> writes:
> 
> > unsigned short my__arch_swab16(unsigned short value)
> > {
> > 	__asm__("rlwimi %0,%0,16,0x00ff0000"
> > 		: "+r" (value));
> 
> You are creating a value that does not fit in a short.

Which is a problem because the compiler could schedule
it to be written back to real memory between the instructions.

Actually the generated code would be better if the swab16()
functions operated on 'unsigned int' fields - since it would
save the compiler from doing a lot of shifts/masks elsewhere.

For instance there is likely to be a mask with 0xffff prior
to the call to swab16().
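
A rough portable C sketch of the difference (this is not the PowerPC
asm from the thread, and the names swab16_short and swab16_int are
made up for illustration). It assumes callers only care about the
low 16 bits of the argument:

```c
#include <stdint.h>

/* Hypothetical name. 16-bit interface: the argument must already be
 * truncated to 16 bits, so a caller holding a wider value has to mask
 * with 0xffff (or zero-extend) before the call. */
static inline uint16_t swab16_short(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Hypothetical name. Register-wide interface: any 'unsigned int' is
 * accepted, only the low 16 bits are used, and the result is always
 * in the range 0..0xffff, so no separate mask is needed beforehand. */
static inline unsigned int swab16_int(unsigned int v)
{
	return ((v & 0xff) << 8) | ((v >> 8) & 0xff);
}
```

Whether the second form actually produces better code depends on the
compiler and the call site; the sketch only shows where the masking
ends up.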

One use of these is for htons() (etc). When the host already
has the correct endianness these are #defines that do nothing,
so out of range values aren't masked there either.
So it seems to me that defining:
   unsigned int swab(unsigned int);
would be fine - except it clashes with standard headers :-(
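
For reference, the clash is with the POSIX swab() declared in
<unistd.h>, which copies a buffer while swapping adjacent bytes and
has a different signature. A small sketch of the masking point, using
made-up names (HTONS_BE_SKETCH, htons_le_sketch, swab16_uint) rather
than the real kernel or libc macros:

```c
/* Hypothetical helper: 16-bit byte swap on a full 'unsigned int'.
 * Only the low 16 bits of the input contribute to the result. */
static inline unsigned int swab16_uint(unsigned int v)
{
	return ((v & 0xff) << 8) | ((v >> 8) & 0xff);
}

/* Sketch of a big-endian host: the conversion is a no-op #define,
 * so an out of range value passes through unmasked. */
#define HTONS_BE_SKETCH(x) (x)

/* Sketch of a little-endian host: the conversion is a real swap,
 * so bits above the low 16 are dropped as a side effect. */
static inline unsigned int htons_le_sketch(unsigned int x)
{
	return swab16_uint(x);
}
```

With an out of range input such as 0x12345, the no-op version hands
back 0x12345 unchanged while the swapping version gives 0x4523, so
callers already cannot rely on the value being masked.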

	David



