[PATCH 2/9] powerpc: Add macros to access floating point registers in thread_struct.
Michael Neuling
mikey at neuling.org
Thu Jun 26 10:09:33 EST 2008
In message <1DD06CDB-428E-4832-93CA-6F0404CA6692 at kernel.crashing.org> you wrote:
>
> On Jun 25, 2008, at 11:17 AM, Scott Wood wrote:
>
> > Gabriel Paubert wrote:
> >> On Wed, Jun 25, 2008 at 10:34:32AM -0500, Scott Wood wrote:
> >>> Kumar Gala wrote:
> >>>>> +/* Macros to work out the correct index for the FPR in the
> >>>>> thread struct */
> >>>>> +#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
> >>>>> +#define FPRHALF(i) (((i) - PT_FPR0) % 2)
> >>>> Have you looked at what the compiler spits out here to make sure
> >>>> we aren't getting a divide? Seems like we could use '& 0x1'.
> >>> GCC's not *that* dumb. However, you may get some unnecessary sign-
> >>> twiddling if "i" is signed.
> >> Not for modulo 2, it's only an even/odd choice and GCC implements
> >> that efficiently IIRC. For other powers of 2,
> >> making the left hand side unsigned helps the compiler.
> >
> > From this:
> >
> > int foo(int x)
> > {
> > return x % 2;
> > }
> >
> > I get this with -O3:
> >
> > foo:
> > mr 0,3
> > srawi 3,3,1
> > addze 3,3
> > slwi 3,3,1
> > subf 3,3,0
> > blr
> > .size foo, .-foo
> > .ident "GCC: (GNU) 4.1.2"
> >
> > Changing it to "x & 1", or to unsigned, gives this:
> >
> > foo:
> > rlwinm 3,3,0,31,31
> > blr
> > .size foo, .-foo
> > .ident "GCC: (GNU) 4.1.2"
> >
> > Maybe newer GCCs are better?
>
> Nope. gcc-4.3.0 from fedora 9:
>
> foo:
> mr 0,3
> srawi 3,3,1
> addze 3,3
> slwi 3,3,1
> subf 3,3,0
> blr
>
> bar:
> rlwinm 3,3,0,31,31
> blr
>
> if you make 'x' unsigned things are better.
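For reference, the extra instructions in the signed case appear to follow from C semantics rather than from a missed optimisation: with a signed int, x % 2 has to yield -1 for negative odd x (C99 division truncates toward zero), whereas x & 1 yields 1. A minimal sketch of that difference follows. The function names are made up for illustration, and it assumes a two's complement target:

```c
#include <assert.h>

/* Signed remainder: the result takes the sign of x, so it can be -1, 0 or 1. */
int rem2_signed(int x)
{
	return x % 2;
}

/* Bit mask: the result is always 0 or 1 on a two's complement machine. */
int low_bit(int x)
{
	return x & 1;
}

/* Unsigned remainder: no negative case, so it equals the bit mask. */
unsigned int rem2_unsigned(unsigned int x)
{
	return x % 2;
}

/* Hypothetical self-check showing where the variants agree and differ. */
void rem2_self_check(void)
{
	assert(rem2_signed(5) == 1 && low_bit(5) == 1);   /* agree for x >= 0 */
	assert(rem2_signed(-3) == -1);                     /* sign follows x */
	assert(low_bit(-3) == 1);                          /* mask ignores sign */
	assert(rem2_unsigned(7u) == 1u);
}
```

The two forms only agree when the operand is known to be non-negative, which the compiler cannot assume for a plain int.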
I've changed it to '& 0x1', which compiles to something better here.
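Roughly, the revised macros would look like the sketch below. This is a standalone illustration, not the final patch text: the PT_FPR0 value here is a stand-in so the example compiles outside the kernel, and fpr_index_check() is a hypothetical helper.

```c
#include <assert.h>

/* Stand-in for illustration only; the real value comes from the
 * kernel's ptrace header. */
#ifndef PT_FPR0
#define PT_FPR0 48
#endif

/* Each FPR spans two consecutive ptrace slots: the slot offset
 * shifted right by one gives the register number, and the low bit
 * selects which half of the register the slot refers to. */
#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
#define FPRHALF(i)   (((i) - PT_FPR0) & 0x1)

/* Hypothetical helper that checks the mapping for a few slots. */
void fpr_index_check(void)
{
	assert(FPRNUMBER(PT_FPR0) == 0 && FPRHALF(PT_FPR0) == 0);
	assert(FPRNUMBER(PT_FPR0 + 1) == 0 && FPRHALF(PT_FPR0 + 1) == 1);
	assert(FPRNUMBER(PT_FPR0 + 5) == 2 && FPRHALF(PT_FPR0 + 5) == 1);
}
```

For slots at or above PT_FPR0 the mask gives the same answer as '% 2', without depending on the signedness of "i".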
Mikey