RFC: Reducing the number of non-volatile GPRs in the ppc64 kernel
segher at kernel.crashing.org
Wed Aug 12 06:08:29 AEST 2015
On Mon, Aug 10, 2015 at 02:52:28PM +1000, Anton Blanchard wrote:
> Hi Bill, Segher,
> > I agree with Segher. We already know we have opportunities to do a
> > better job with shrink-wrapping (pushing this kind of useless
> > activity down past early exits), so having examples of code to look
> > at to improve this would be useful.
> I'll look out for specific examples. I noticed this one today when
> analysing malloc(8). It is an instruction trace of _int_malloc().
> The overall function is pretty huge, which I assume leads to gcc using
> so many non-volatiles.
That is one part of it; also GCC deals out volatiles too generously.
> Perhaps in this case we should separate out the
> slow path into another function marked noinline.
Or GCC could do that, effectively at least.
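As a rough illustration of the manual split suggested above, here is a minimal sketch in C. It is not glibc's _int_malloc(); the one-slot cache and the names alloc_fast, alloc_slow and release are hypothetical, chosen only to show the shape. The idea is that the slow path lives in its own noinline function, so whatever registers it needs are saved and restored there rather than in the prologue of the hot function.

```c
#include <stdlib.h>

/* Hypothetical one-slot cache standing in for a fast-path free list. */
static void *cached_chunk;
static size_t cached_size;

/* Slow path, kept out of line (GCC attribute) so that its register
 * pressure does not leak into the caller's prologue. */
__attribute__((noinline))
static void *alloc_slow(size_t size)
{
    return malloc(size);
}

/* Fast path: a single check, then either return the cached chunk or
 * call the out-of-line slow path. */
void *alloc_fast(size_t size)
{
    if (__builtin_expect(cached_chunk != NULL && cached_size >= size, 1)) {
        void *p = cached_chunk;
        cached_chunk = NULL;
        return p;
    }
    return alloc_slow(size);
}

/* Return a chunk: park it in the cache if the slot is empty, else free it. */
void release(void *p, size_t size)
{
    if (cached_chunk == NULL) {
        cached_chunk = p;
        cached_size = size;
    } else {
        free(p);
    }
}
```

Whether this actually reduces the number of saved registers depends on the compiler version and options, so the generated assembly would need to be checked.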
> This is just an upstream glibc build, but I'll send the preprocessed
> source off list.
After the prologue there are 46 insns executed before the epilogue.
Many of those are conditional branches (that are not executed); it is
all fall-through until it jumps to the "tail" (the few insns before
the epilogue). GCC knows how to duplicate a tail so that it can do
shrink-wrapping (the original tail needs to be followed by an epilogue,
the duplicated one does not want one); but it can only do it in very
simple cases (one basic block or at least no control flow), and that
is not the case here. We need to handle more generic tails.
This seems related to (if not the same as!) <http://gcc.gnu.org/PR51982>.
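To make the shape concrete, here is a small source-level sketch of tail duplication, done by hand. It is only an analogy for what the compiler would do on its own internal representation, and the functions heavy, shared_tail and duplicated_tail are hypothetical. The tail here contains a branch, so it is more than one basic block, which is the kind of tail described above as not being handled.

```c
/* Hypothetical helper standing in for slow-path work that would need
 * saved registers. */
__attribute__((noinline))
static long heavy(long x)
{
    return x * 3 + 1;
}

/* Original shape: both paths join in one tail that has control flow. */
long shared_tail(long x, int slow)
{
    long r = x;
    if (slow)
        r = heavy(x);
    if (r < 0)          /* tail: contains a branch */
        r = -r;
    return r;
}

/* Tail duplicated by hand: the fast path gets its own copy of the tail
 * and returns without ever reaching the slow-path code. */
long duplicated_tail(long x, int slow)
{
    if (!slow) {
        long r = x;
        if (r < 0)      /* duplicated tail */
            r = -r;
        return r;
    }
    long r = heavy(x);
    if (r < 0)          /* original tail */
        r = -r;
    return r;
}
```

Both functions compute the same result; the second form just gives the fast path an exit of its own, which is what shrink-wrapping needs in order to keep the prologue and epilogue off that path.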