[RFC/PATCH] Make powerpc64 use __thread for per-cpu variables
olof at lixom.net
Wed May 10 15:16:50 EST 2006
On Wed, May 10, 2006 at 02:03:59PM +1000, Paul Mackerras wrote:
> With this patch, 64-bit powerpc uses __thread for per-cpu variables.
Nice! I like the way you hid the slb functions so they can't ever be
called by mistake from C code. :-)
This patch grows a ppc64_defconfig vmlinux a bit (with the other two percpu
patches applied):
olof@quad:~/work/linux/powerpc $ ls -l vmlinux.pre vmlinux
-rwxr-xr-x 1 olof olof 10290928 2006-05-09 23:48 vmlinux.pre
-rwxr-xr-x 1 olof olof 10307499 2006-05-09 23:50 vmlinux
olof@quad:~/work/linux/powerpc $ size vmlinux.pre vmlinux
   text    data     bss     dec     hex filename
5554034 2404256  480472 8438762  80c3ea vmlinux.pre
5578866 2384944  498848 8462658  812142 vmlinux
Looks like a lot of the text growth is from the added mfsprg3 instructions:
$ objdump -d vmlinux.pre | egrep mfsprg.\*,3\$ | wc -l
$ objdump -d vmlinux | egrep mfsprg.\*,3\$ | wc -l
... so, as the PACA gets deprecated, the bloat will go away again.
> The motivation for doing this is that getting the address of a per-cpu
> variable currently requires two loads (one to get our per-cpu offset
> and one to get the address of the variable in the .data.percpu
> section) plus an add. With __thread we can get the address of our
> copy of a per-cpu variable with just an add (r13 plus a constant).
It would be interesting to see benchmarks of how much it improves
things. I guess it doesn't really get interesting until after the paca
gets removed though, due to the added mfsprg's.