[PATCH 2/9] powerpc: Add macros to access floating point registers in thread_struct.

Kumar Gala galak at kernel.crashing.org
Thu Jun 26 03:07:19 EST 2008


On Jun 25, 2008, at 11:17 AM, Scott Wood wrote:

> Gabriel Paubert wrote:
>> On Wed, Jun 25, 2008 at 10:34:32AM -0500, Scott Wood wrote:
>>> Kumar Gala wrote:
>>>>> +/* Macros to workout the correct index for the FPR in the  
>>>>> thread struct */
>>>>> +#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
>>>>> +#define FPRHALF(i) (((i) - PT_FPR0) % 2)
>>>> Have you looked at what the compiler spits out here to make sure  
>>>> we aren't getting a divide?  Seems like we could use '& 0x1'.
>>> GCC's not *that* dumb.  However, you may get some unnecessary sign- 
>>> twiddling if "i" is signed.
>> Not for modulo 2, it's only an even/odd choice and GCC implements  
>> that efficiently IIRC. For other powers of 2,
>> making the left hand side unsigned helps the compiler.
>
> From this:
>
> int foo(int x)
> {
> 	return x % 2;
> }
>
> I get this with -O3:
>
> foo:
>        mr 0,3
>        srawi 3,3,1
>        addze 3,3
>        slwi 3,3,1
>        subf 3,3,0
>        blr
>        .size   foo, .-foo
>        .ident  "GCC: (GNU) 4.1.2"
>
> Changing it to "x & 1", or to unsigned, gives this:
>
> foo:
>        rlwinm 3,3,0,31,31
>        blr
>        .size   foo, .-foo
>        .ident  "GCC: (GNU) 4.1.2"
>
> Maybe newer GCCs are better?

Nope. gcc-4.3.0 from Fedora 9:

foo:
         mr 0,3
         srawi 3,3,1
         addze 3,3
         slwi 3,3,1
         subf 3,3,0
         blr

bar:
         rlwinm 3,3,0,31,31
         blr

If you make 'x' unsigned, things are better.
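
For anyone wondering why the signed case needs the extra
srawi/addze/slwi/subf sequence: C's % truncates toward zero, so a
negative odd x yields -1 rather than 1, and the compiler has to
preserve that. A rough sketch of the variants follows. The function
names, the EX_ prefix, and the EX_PT_FPR0 value are made up for
illustration and are not taken from the patch.

```c
/* Sketch only: all names below are illustrative, not from the patch. */

/* Signed remainder: C99 truncates toward zero, so -3 % 2 == -1. */
int mod2_signed(int x)
{
	return x % 2;
}

/* Bit mask: yields 0 or 1 on two's complement, so -3 & 1 == 1. */
int mod2_mask(int x)
{
	return x & 1;
}

/* Unsigned remainder: always 0 or 1, equivalent to the mask. */
unsigned int mod2_unsigned(unsigned int x)
{
	return x % 2;
}

/* Hypothetical stand-in for PT_FPR0; the real value comes from the
 * kernel headers. */
#define EX_PT_FPR0 48

/* Mask-based variant of the FPRHALF idea from the quoted patch. */
#define EX_FPRHALF(i) (((i) - EX_PT_FPR0) & 1)
```

For any i >= EX_PT_FPR0 the "% 2" and "& 1" forms give the same
result, so the mask form should select the same half without the
sign fixup.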

- k


