[PATCH v6 1/9] ppc64 (le): prepare for -mprofile-kernel

Petr Mladek pmladek at suse.com
Thu Feb 4 22:02:24 AEDT 2016


On Thu 2016-02-04 18:31:40, AKASHI Takahiro wrote:
> Jiri, Torsten
> 
> Thank you for your explanation.
> 
> On 02/03/2016 08:24 PM, Torsten Duwe wrote:
> >On Wed, Feb 03, 2016 at 09:55:11AM +0100, Jiri Kosina wrote:
> >>On Wed, 3 Feb 2016, AKASHI Takahiro wrote:
> >>>those efforts, we are proposing[1] a new *generic* gcc option, -fprolog-add=N.
> >>>This option will insert N nop instructions at the beginning of each function.
> >
> >>The interesting part of the story with ppc64 is that you indeed want to
> >>create the callsite before the *most* of the prologue, but not really :)
> >
> >I was silently assuming that GCC would do this right on ppc64le; add the NOPs
> >right after the TOC load. Or after TOC load and LR save? ...
> 
> On arm/arm64, link register must be saved before any function call. So anyhow
> we will have to add something, 3 instructions at the minimum, like:
>    save lr
>    branch _mcount
>    restore lr
>    <prologue>
>    ...
>    <body>
>    ...

So this is similar to PPC, which has to handle LR as well.


> >>The part of the prologue where TOC pointer is saved needs to happen before
> >>the fentry/profiling call.
> >
> >Yes, any call, to any profiler/tracer/live patcher is potentially global
> >and needs the _new_ TOC value.

The code below was generated for PPC64LE with -mprofile-kernel using:

$> gcc --version
gcc (SUSE Linux) 6.0.0 20160121 (experimental) [trunk revision 232670]
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


0000000000000050 <cmdline_proc_show>:
  50:   00 00 4c 3c     addis   r2,r12,0
                        50: R_PPC64_REL16_HA    .TOC.
  54:   00 00 42 38     addi    r2,r2,0
                        54: R_PPC64_REL16_LO    .TOC.+0x4
  58:   a6 02 08 7c     mflr    r0
  5c:   01 00 00 48     bl      5c <cmdline_proc_show+0xc>
                        5c: R_PPC64_REL24       _mcount
  60:   a6 02 08 7c     mflr    r0
  64:   10 00 01 f8     std     r0,16(r1)
  68:   a1 ff 21 f8     stdu    r1,-96(r1)
  6c:   00 00 22 3d     addis   r9,r2,0
                        6c: R_PPC64_TOC16_HA    .toc
  70:   00 00 82 3c     addis   r4,r2,0
                        70: R_PPC64_TOC16_HA    .rodata.str1.8
  74:   00 00 29 e9     ld      r9,0(r9)
                        74: R_PPC64_TOC16_LO_DS .toc
  78:   00 00 84 38     addi    r4,r4,0
                        78: R_PPC64_TOC16_LO    .rodata.str1.8
  7c:   00 00 a9 e8     ld      r5,0(r9)
  80:   01 00 00 48     bl      80 <cmdline_proc_show+0x30>
                        80: R_PPC64_REL24       seq_printf
  84:   00 00 00 60     nop
  88:   00 00 60 38     li      r3,0
  8c:   60 00 21 38     addi    r1,r1,96
  90:   10 00 01 e8     ld      r0,16(r1)
  94:   a6 03 08 7c     mtlr    r0
  98:   20 00 80 4e     blr


And the same function compiled using:

$> gcc --version
gcc (SUSE Linux) 4.8.5
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


0000000000000050 <cmdline_proc_show>:
  50:   00 00 4c 3c     addis   r2,r12,0
                        50: R_PPC64_REL16_HA    .TOC.
  54:   00 00 42 38     addi    r2,r2,0
                        54: R_PPC64_REL16_LO    .TOC.+0x4
  58:   a6 02 08 7c     mflr    r0
  5c:   10 00 01 f8     std     r0,16(r1)
  60:   01 00 00 48     bl      60 <cmdline_proc_show+0x10>
                        60: R_PPC64_REL24       _mcount
  64:   a6 02 08 7c     mflr    r0
  68:   10 00 01 f8     std     r0,16(r1)
  6c:   a1 ff 21 f8     stdu    r1,-96(r1)
  70:   00 00 42 3d     addis   r10,r2,0
                        70: R_PPC64_TOC16_HA    .toc
  74:   00 00 82 3c     addis   r4,r2,0
                        74: R_PPC64_TOC16_HA    .rodata.str1.8
  78:   00 00 2a e9     ld      r9,0(r10)
                        78: R_PPC64_TOC16_LO_DS .toc
  7c:   00 00 84 38     addi    r4,r4,0
                        7c: R_PPC64_TOC16_LO    .rodata.str1.8
  80:   00 00 a9 e8     ld      r5,0(r9)
  84:   01 00 00 48     bl      84 <cmdline_proc_show+0x34>
                        84: R_PPC64_REL24       seq_printf
  88:   00 00 00 60     nop
  8c:   00 00 60 38     li      r3,0
  90:   60 00 21 38     addi    r1,r1,96
  94:   10 00 01 e8     ld      r0,16(r1)
  98:   a6 03 08 7c     mtlr    r0
  9c:   20 00 80 4e     blr


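Please note that either 3 or 4 instructions precede the _mcount call
site, depending on the compiler version: gcc 6 emits the call at
offset 0xc, while gcc 4.8.5 adds an extra "std r0,16(r1)" and emits
it at offset 0x10.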
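
The call-site offset can be read mechanically from the R_PPC64_REL24
relocation lines. Below is a minimal sketch in Python, assuming the
input is "objdump -dr" style text like the two listings above. The
helper name mcount_offsets and the regular expressions are only
illustrative and are not taken from any existing tool.

```python
import re

# Function header, e.g. "0000000000000050 <cmdline_proc_show>:"
FUNC_RE = re.compile(r"^([0-9a-f]+) <([^>]+)>:")
# Relocation line, e.g. "        5c: R_PPC64_REL24       _mcount"
RELOC_RE = re.compile(r"^\s*([0-9a-f]+):\s+R_PPC64_REL24\s+(\S+)")


def mcount_offsets(disasm, target="_mcount"):
    """Map each function name to the byte offset of its first call to target."""
    offsets = {}
    func, start = None, None
    for line in disasm.splitlines():
        m = FUNC_RE.match(line)
        if m:
            start, func = int(m.group(1), 16), m.group(2)
            continue
        m = RELOC_RE.match(line)
        if m and func is not None and m.group(2) == target:
            # Keep only the first matching call in each function.
            offsets.setdefault(func, int(m.group(1), 16) - start)
    return offsets


# Prologue excerpts copied from the two listings (gcc 6 and gcc 4.8.5).
GCC6 = """\
0000000000000050 <cmdline_proc_show>:
  50:   00 00 4c 3c     addis   r2,r12,0
                        50: R_PPC64_REL16_HA    .TOC.
  54:   00 00 42 38     addi    r2,r2,0
                        54: R_PPC64_REL16_LO    .TOC.+0x4
  58:   a6 02 08 7c     mflr    r0
  5c:   01 00 00 48     bl      5c <cmdline_proc_show+0xc>
                        5c: R_PPC64_REL24       _mcount
"""

GCC48 = """\
0000000000000050 <cmdline_proc_show>:
  50:   00 00 4c 3c     addis   r2,r12,0
                        50: R_PPC64_REL16_HA    .TOC.
  54:   00 00 42 38     addi    r2,r2,0
                        54: R_PPC64_REL16_LO    .TOC.+0x4
  58:   a6 02 08 7c     mflr    r0
  5c:   10 00 01 f8     std     r0,16(r1)
  60:   01 00 00 48     bl      60 <cmdline_proc_show+0x10>
                        60: R_PPC64_REL24       _mcount
"""

if __name__ == "__main__":
    for name, text in (("gcc 6", GCC6), ("gcc 4.8.5", GCC48)):
        off = mcount_offsets(text)["cmdline_proc_show"]
        # Every instruction here is 4 bytes, so off // 4 is the
        # number of instructions in front of the call.
        print(name, hex(off), off // 4)
```

With the excerpts above this reports offset 0xc (3 instructions) for
gcc 6 and offset 0x10 (4 instructions) for gcc 4.8.5.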

Best Regards,
Petr

