<html><body>
<p><tt><font size="2">Sukadev Bhattiprolu &lt;sukadev@linux.vnet.ibm.com&gt; wrote on 05/09/2014 09:46:38 PM:<br>
<br>
> From: Sukadev Bhattiprolu &lt;sukadev@linux.vnet.ibm.com&gt;</font></tt><br>
<tt><font size="2">> To: Arnaldo Carvalho de Melo &lt;acme@ghostprotocols.net&gt;, </font></tt><br>
<tt><font size="2">> Cc: linux-kernel@vger.kernel.org, Anton Blanchard <br>
> &lt;anton@au1.ibm.com&gt;, Ulrich.Weigand@de.ibm.com, Michael Ellerman <br>
> &lt;michaele@au1.ibm.com&gt;, Maynard Johnson/Rochester/IBM@IBMUS, <br>
> linuxppc-dev@lists.ozlabs.org</font></tt><br>
<tt><font size="2">> Date: 05/09/2014 09:46 PM</font></tt><br>
<tt><font size="2">> Subject: [PATCH 1/1] powerpc/perf: Adjust callchain based on DWARF debug info</font></tt><br>
<tt><font size="2">> <br>
> [PATCH 1/1] powerpc/perf: Adjust callchain based on DWARF debug info</font></tt><br>
<br>
<tt><font size="2">Acked-by: Maynard Johnson &lt;maynardj@us.ibm.com&gt;</font></tt><br>
<br>
<tt><font size="2">Reviewed and tested. Thanks, Suka.</font></tt><br>
<br>
<tt><font size="2">-Maynard</font></tt><br>
<tt><font size="2"><br>
> <br>
> When saving the callchain on Power, the kernel conservatively saves excess<br>
> entries in the callchain. A few of these entries are needed in some cases<br>
> but not others.<br>
> <br>
> Eg: the value in the link register (LR) is needed only when it holds the<br>
> return address of a function. At other times it must be ignored.<br>
> <br>
> If the unnecessary entries are not ignored, we end up with duplicate arcs<br>
> in the call-graphs.<br>
> <br>
> Use DWARF debug information to ignore the unnecessary entries.<br>
> <br>
> Callgraph before the patch:<br>
> <br>
> 14.67% 2234 sprintft libc-2.18.so [.] __random<br>
> |<br>
> --- __random<br>
> |<br>
> |--61.12%-- __random<br>
> | |<br>
> | |--97.15%-- rand<br>
> | | do_my_sprintf<br>
> | | main<br>
> | | generic_start_main.isra.0<br>
> | | __libc_start_main<br>
> | | 0x0<br>
> | |<br>
> | --2.85%-- do_my_sprintf<br>
> | main<br>
> | generic_start_main.isra.0<br>
> | __libc_start_main<br>
> | 0x0<br>
> |<br>
> --38.88%-- rand<br>
> |<br>
> |--94.01%-- rand<br>
> | do_my_sprintf<br>
> | main<br>
> | generic_start_main.isra.0<br>
> | __libc_start_main<br>
> | 0x0<br>
> |<br>
> --5.99%-- do_my_sprintf<br>
> main<br>
> generic_start_main.isra.0<br>
> __libc_start_main<br>
> 0x0<br>
> <br>
> Callgraph after the patch:<br>
> <br>
> 14.67% 2234 sprintft libc-2.18.so [.] __random<br>
> |<br>
> --- __random<br>
> |<br>
> |--95.93%-- rand<br>
> | do_my_sprintf<br>
> | main<br>
> | generic_start_main.isra.0<br>
> | __libc_start_main<br>
> | 0x0<br>
> |<br>
> --4.07%-- do_my_sprintf<br>
> main<br>
> generic_start_main.isra.0<br>
> __libc_start_main<br>
> 0x0<br>
> <br>
> TODO: For split-debug info objects like glibc, we can determine<br>
> the call-frame-address only when both .eh_frame and .debug_info<br>
> sections are available. We should be able to determine the CFA<br>
> even without the .eh_frame section.<br>
> <br>
> Thanks to Ulrich Weigand for help with DWARF debug information.<br>
> <br>
> Fix suggested by Anton Blanchard.<br>
> <br>
> Reported-by: Maynard Johnson &lt;maynard@us.ibm.com&gt;<br>
> Signed-off-by: Sukadev Bhattiprolu &lt;sukadev@linux.vnet.ibm.com&gt;<br>
> ---<br>
> tools/perf/arch/powerpc/Makefile | 1 +<br>
> tools/perf/arch/powerpc/util/adjust-callchain.c | 278 ++++++++++++++++++++++++<br>
> tools/perf/config/Makefile | 5 +<br>
> tools/perf/util/callchain.h | 12 +<br>
> tools/perf/util/machine.c | 16 +-<br>
> 5 files changed, 310 insertions(+), 2 deletions(-)<br>
> create mode 100644 tools/perf/arch/powerpc/util/adjust-callchain.c<br>
> <br>
[snip]</font></tt></body></html>