[V6 00/11] perf: New conditional branch filter

Stephane Eranian eranian at google.com
Mon Jun 2 22:59:04 EST 2014


On Wed, May 28, 2014 at 10:04 AM, Anshuman Khandual
<khandual at linux.vnet.ibm.com> wrote:
> On 05/27/2014 05:39 PM, Stephane Eranian wrote:
>> I have been looking at those patches and ran some tests.
>> And I found a few issues so far.
>>
>> I am running:
>> $ perf record -j any_ret -e cycles:u test_program
>> $ perf report -D
>>
>> Most entries are okay and match the filter, however some do not make sense:
>>
>> 3642586996762 0x15d0 [0x108]: PERF_RECORD_SAMPLE(IP, 2): 17921/17921:
>> 0x10001170 period: 613678 addr: 0
>> .... branch stack: nr:9
>> .....  0: 00000000100011cc -> 0000000010000e38
>> .....  1: 0000000010001150 -> 00000000100011bc
>> .....  2: 0000000010001208 -> 0000000010000e38
>> .....  3: 0000000010001160 -> 00000000100011f8
>> .....  4: 00000000100011cc -> 0000000010000e38
>> .....  5: 0000000010001150 -> 00000000100011bc
>> .....  6: 0000000010001208 -> 0000000010000e38
>> .....  7: 0000000010001160 -> 00000000100011f8
>> .....  8: 0000000000000000 -> 0000000010001160
>> ^^^^^^
>> Entry 8 does not make sense, unless 0x0 is a valid return branch
>> instruction address.
>> If an address is invalid, the whole entry needs to be eliminated. It
>> is okay to have fewer than the max number of entries supported by HW.
>
> Hey Stephane,
>
> Okay. The same behaviour is also reflected in the test results that I
> shared in the patchset. Here is that section.
>
> (3) perf record -j any_ret -e branch-misses:u ./cprog
>
> # Overhead  Command  Source Shared Object          Source Symbol  Target Shared Object          Target Symbol
> # ........  .......  ....................  .....................  ....................  .....................
> #
>     15.61%    cprog  [unknown]             [.] 00000000           cprog                 [.] sw_3_1
>      6.28%    cprog  cprog                 [.] symbol2            cprog                 [.] hw_1_2
>      6.28%    cprog  cprog                 [.] ctr_addr           cprog                 [.] sw_4_1
>      6.26%    cprog  cprog                 [.] success_3_1_3      cprog                 [.] sw_3_1
>      6.24%    cprog  cprog                 [.] symbol1            cprog                 [.] hw_1_1
>      6.24%    cprog  cprog                 [.] sw_4_2             cprog                 [.] callme
>      6.21%    cprog  [unknown]             [.] 00000000           cprog                 [.] callme
>      6.19%    cprog  cprog                 [.] lr_addr            cprog                 [.] sw_4_2
>      3.16%    cprog  cprog                 [.] hw_1_2             cprog                 [.] callme
>      3.15%    cprog  cprog                 [.] success_3_1_1      cprog                 [.] sw_3_1
>      3.15%    cprog  cprog                 [.] sw_4_1             cprog                 [.] callme
>      3.14%    cprog  cprog                 [.] callme             cprog                 [.] main
>      3.13%    cprog  cprog                 [.] hw_1_1             cprog                 [.] callme
>
> So a lot of samples above have 0x0 as the "from" address. This originates from the code
> section here inside the function "power_pmu_bhrb_read", where we hit two back to back

Could you explain the back-to-back case a bit more here?
Back-to-back returns to me means something like:

int foo()
{
  ...
  return bar();
}

int bar()
{
  return 0;
}

Not counting the leaf optimization here, bar returns to foo, which
immediately returns: two back-to-back returns.
Is that the case you're talking about here?

> target addresses. So we zero out the from address for the first target address and re-read
> the second address over again. So that's how we get zero as the from address. This is how the
> HW captures the samples. I was reluctant to drop these samples but I agree that these kinds of
> samples can be dropped if we need to.
>
I think we need to make it as simple as possible for tools, i.e.,
avoid having to decode the disassembly to figure out what happened.
An entry with a from address of 0 is not usable by a tool.

> if (val & BHRB_TARGET) {
>         /* Shouldn't have two targets in a
>            row.. Reset index and try again */
>         r_index--;
>         addr = 0;
> }
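
To illustrate what dropping the whole entry could look like, here is a
minimal standalone sketch. It is not the power_pmu_bhrb_read code: the
struct and the function name are hypothetical, and it assumes that a from
address of 0 always marks an invalid entry. It compacts the array in place
and returns the reduced count, so the consumer only sees valid from/to pairs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one branch stack entry; this is not the
 * kernel's struct perf_branch_entry. */
struct branch_entry {
    uint64_t from;
    uint64_t to;
};

/*
 * Compact the array in place, keeping only entries whose "from" address
 * is non-zero (assumption: 0 is never a valid branch source here).
 * Returns the new entry count, which may be smaller than the maximum
 * number of entries the hardware supports.
 */
static size_t drop_invalid_entries(struct branch_entry *e, size_t nr)
{
    size_t kept = 0;

    for (size_t i = 0; i < nr; i++) {
        if (e[i].from == 0)
            continue;       /* invalid source: drop the whole entry */
        e[kept++] = e[i];   /* preserve the order of the valid entries */
    }
    return kept;
}
```

With the sample above, nr would go from 9 to 8 and entry 8 would simply
not be reported.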

