[PATCH 2/9] powerpc: Add macros to access floating point registers in thread_struct.

Michael Neuling mikey at neuling.org
Thu Jun 26 10:09:33 EST 2008


In message <1DD06CDB-428E-4832-93CA-6F0404CA6692 at kernel.crashing.org> you wrote:
> 
> On Jun 25, 2008, at 11:17 AM, Scott Wood wrote:
> 
> > Gabriel Paubert wrote:
> >> On Wed, Jun 25, 2008 at 10:34:32AM -0500, Scott Wood wrote:
> >>> Kumar Gala wrote:
> >>>>> +/* Macros to work out the correct index for the FPR in the thread struct */
> >>>>> +#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
> >>>>> +#define FPRHALF(i) (((i) - PT_FPR0) % 2)
> >>>> Have you looked at what the compiler spits out here to make sure  
> >>>> we aren't getting a divide?  Seems like we could use '& 0x1'.
> >>> GCC's not *that* dumb.  However, you may get some unnecessary
> >>> sign-twiddling if "i" is signed.
> >> Not for modulo 2, it's only an even/odd choice and GCC implements  
> >> that efficiently IIRC. For other powers of 2,
> >> making the left hand side unsigned helps the compiler.
> >
> > From this:
> >
> > int foo(int x)
> > {
> > 	return x % 2;
> > }
> >
> > I get this with -O3:
> >
> > foo:
> >        mr 0,3
> >        srawi 3,3,1
> >        addze 3,3
> >        slwi 3,3,1
> >        subf 3,3,0
> >        blr
> >        .size   foo, .-foo
> >        .ident  "GCC: (GNU) 4.1.2"
> >
> > Changing it to "x & 1", or to unsigned, gives this:
> >
> > foo:
> >        rlwinm 3,3,0,31,31
> >        blr
> >        .size   foo, .-foo
> >        .ident  "GCC: (GNU) 4.1.2"
> >
> > Maybe newer GCCs are better?
> 
> Nope. gcc-4.3.0 from fedora 9:
> 
> foo:
>          mr 0,3
>          srawi 3,3,1
>          addze 3,3
>          slwi 3,3,1
>          subf 3,3,0
>          blr
> 
> bar:
>          rlwinm 3,3,0,31,31
>          blr
> 
> if you make 'x' unsigned things are better.

I've changed it to '& 0x1', which compiles to something better here.
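
For reference, a minimal sketch of why the signed '% 2' and '& 0x1' forms
differ, which is what forces GCC to emit the extra srawi/addze fixup in the
listings quoted above. The PT_FPR0 value is a placeholder for illustration
(the real one comes from the powerpc ptrace header), and half_mod() and
half_and() are hypothetical helpers, not part of the patch:

```c
/* Placeholder value for illustration only; the real definition lives in
 * the powerpc ptrace header. */
#define PT_FPR0 48

/* Index of the FPR within the thread struct, and which 32-bit half. */
#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
#define FPRHALF(i)   (((i) - PT_FPR0) & 0x1)

/* Signed remainder: C99 division truncates toward zero, so the result
 * can be -1, 0 or 1. The compiler must preserve the sign. */
int half_mod(int i)
{
	return (i - PT_FPR0) % 2;
}

/* Bit mask: on a two's-complement machine the result is always 0 or 1,
 * so a single mask instruction is enough. */
int half_and(int i)
{
	return (i - PT_FPR0) & 0x1;
}
```

The two helpers agree for any index at or above PT_FPR0 and only diverge
for negative differences, so assuming callers never pass an index below
PT_FPR0, the mask gives the same answer with less code.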

Mikey


