<html><body>
<p><tt><font size="2">Sukadev Bhattiprolu &lt;sukadev@linux.vnet.ibm.com&gt; wrote on 05/09/2014 09:46:38 PM:<br>
<br>
> From: Sukadev Bhattiprolu &lt;sukadev@linux.vnet.ibm.com&gt;</font></tt><br>
<tt><font size="2">> To: Arnaldo Carvalho de Melo &lt;acme@ghostprotocols.net&gt;, </font></tt><br>
<tt><font size="2">> Cc: linux-kernel@vger.kernel.org, Anton Blanchard <br>
> &lt;anton@au1.ibm.com&gt;, Ulrich.Weigand@de.ibm.com, Michael Ellerman <br>
> &lt;michaele@au1.ibm.com&gt;, Maynard Johnson/Rochester/IBM@IBMUS, <br>
> linuxppc-dev@lists.ozlabs.org</font></tt><br>
<tt><font size="2">> Date: 05/09/2014 09:46 PM</font></tt><br>
<tt><font size="2">> Subject: [PATCH 1/1] powerpc/perf: Adjust callchain based on DWARF debug info</font></tt><br>
<tt><font size="2">> <br>
> [PATCH 1/1] powerpc/perf: Adjust callchain based on DWARF debug info</font></tt><br>
<br>
<tt><font size="2">Acked-by: Maynard Johnson &lt;maynardj@us.ibm.com&gt;</font></tt><br>
<br>
<tt><font size="2">Reviewed and tested.  Thanks, Suka.</font></tt><br>
<br>
<tt><font size="2">-Maynard</font></tt><br>
<tt><font size="2"><br>
> <br>
> When saving the callchain on Power, the kernel conservatively saves excess<br>
> entries in the callchain. A few of these entries are needed in some cases<br>
> but not others.<br>
> <br>
> Eg: the value in the link register (LR) is needed only when it holds the<br>
> return address of a function. At other times it must be ignored.<br>
> <br>
> If the unnecessary entries are not ignored, we end up with duplicate arcs<br>
> in the call-graphs.<br>
> <br>
> Use DWARF debug information to ignore the unnecessary entries.<br>
> <br>
> Callgraph before the patch:<br>
> <br>
>     14.67%          2234  sprintft  libc-2.18.so       [.] __random<br>
>             |<br>
>             --- __random<br>
>                |<br>
>                |--61.12%-- __random<br>
>                |          |<br>
>                |          |--97.15%-- rand<br>
>                |          |          do_my_sprintf<br>
>                |          |          main<br>
>                |          |          generic_start_main.isra.0<br>
>                |          |          __libc_start_main<br>
>                |          |          0x0<br>
>                |          |<br>
>                |           --2.85%-- do_my_sprintf<br>
>                |                     main<br>
>                |                     generic_start_main.isra.0<br>
>                |                     __libc_start_main<br>
>                |                     0x0<br>
>                |<br>
>                 --38.88%-- rand<br>
>                           |<br>
>                           |--94.01%-- rand<br>
>                           |          do_my_sprintf<br>
>                           |          main<br>
>                           |          generic_start_main.isra.0<br>
>                           |          __libc_start_main<br>
>                           |          0x0<br>
>                           |<br>
>                            --5.99%-- do_my_sprintf<br>
>                                      main<br>
>                                      generic_start_main.isra.0<br>
>                                      __libc_start_main<br>
>                                      0x0<br>
> <br>
> Callgraph after the patch:<br>
> <br>
>     14.67%          2234  sprintft  libc-2.18.so       [.] __random<br>
>             |<br>
>             --- __random<br>
>                |<br>
>                |--95.93%-- rand<br>
>                |          do_my_sprintf<br>
>                |          main<br>
>                |          generic_start_main.isra.0<br>
>                |          __libc_start_main<br>
>                |          0x0<br>
>                |<br>
>                 --4.07%-- do_my_sprintf<br>
>                           main<br>
>                           generic_start_main.isra.0<br>
>                           __libc_start_main<br>
>                           0x0<br>
> <br>
> TODO:   For split-debug info objects like glibc, we can determine<br>
>    the call-frame-address (CFA) only when both .eh_frame and .debug_info<br>
>    sections are available. We should be able to determine the CFA<br>
>    even without the .eh_frame section.<br>
> <br>
> Thanks to Ulrich Weigand for help with DWARF debug information.<br>
> <br>
> Fix suggested by Anton Blanchard.<br>
> <br>
> Reported-by: Maynard Johnson &lt;maynard@us.ibm.com&gt;<br>
> Signed-off-by: Sukadev Bhattiprolu &lt;sukadev@linux.vnet.ibm.com&gt;<br>
> ---<br>
>  tools/perf/arch/powerpc/Makefile                |   1 +<br>
>  tools/perf/arch/powerpc/util/adjust-callchain.c | 278 ++++++++++++++++++++++++<br>
>  tools/perf/config/Makefile                      |   5 +<br>
>  tools/perf/util/callchain.h                     |  12 +<br>
>  tools/perf/util/machine.c                       |  16 +-<br>
>  5 files changed, 310 insertions(+), 2 deletions(-)<br>
>  create mode 100644 tools/perf/arch/powerpc/util/adjust-callchain.c<br>
> <br>
[snip]</font></tt></body></html>