[PATCH 3/3] perf, x86, lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNEL

Michael Neuling mikey at neuling.org
Fri May 17 21:32:08 EST 2013


Peter Zijlstra <peterz at infradead.org> wrote:

> On Thu, May 16, 2013 at 05:36:11PM +0200, Stephane Eranian wrote:
> > On Thu, May 16, 2013 at 1:16 PM, Peter Zijlstra <peterz at infradead.org> wrote:
> > > On Thu, May 16, 2013 at 08:15:17PM +1000, Michael Neuling wrote:
> > >> Peter,
> > >>
> > >> BTW PowerPC also has the ability to filter on conditional branches.  Any
> > >> chance we could add something like the following to perf also?
> > >>
> > >
> > > I don't see an immediate problem with that except that we on x86 need to
> > > implement that in the software filter. Stephane do you see any
> > > fundamental issue with that?
> > >
> > On x86, the LBR cannot filter on conditional branches in HW. Thus, as Peter
> > said, it would have to be done in SW. I did not add that because I think
> > those branches are not necessarily useful for tools.
> 
> Wouldn't it be mostly conditional branches that are the primary control flow
> and can get predicted wrong? I mean, I'm sure someone will mispredict an
> unconditional branch but it's not like we care about people with such
> afflictions, do we?

You could mispredict the target address of a computed goto.  You'd know
it was taken but not know the target address until later in the pipeline.

On this, the POWER8 branch history buffer tells us two things about the
prediction status:
  1) if the branch was predicted taken/not taken correctly
  2) if the target address was predicted correctly or not (for computed
     gotos only)
So we'd actually like more prediction bits too :-D
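
To make the two prediction bits concrete, here is a minimal sketch of how a
record carrying both could be interpreted.  Everything in it is hypothetical
and for illustration only: the struct layout, the field names and the
sketch_* identifiers are not the POWER8 BHRB format and not the perf ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical branch record; NOT the real BHRB or perf entry layout. */
struct sketch_bhrb_entry {
	uint64_t from;        /* address of the branch instruction */
	uint64_t to;          /* address the branch went to */
	bool computed;        /* target came from a register (computed goto) */
	bool dir_mispred;     /* 1) taken/not-taken prediction was wrong */
	bool target_mispred;  /* 2) predicted target was wrong (computed only) */
};

enum sketch_pred_status {
	SKETCH_PRED_OK,
	SKETCH_PRED_DIR_WRONG,
	SKETCH_PRED_TARGET_WRONG,
};

/*
 * Fold the two bits into one status.  The target bit is only consulted
 * for computed branches, since it carries no meaning for the others.
 */
static enum sketch_pred_status
sketch_classify(const struct sketch_bhrb_entry *e)
{
	assert(e != NULL);

	if (e->dir_mispred)
		return SKETCH_PRED_DIR_WRONG;
	if (e->computed && e->target_mispred)
		return SKETCH_PRED_TARGET_WRONG;
	return SKETCH_PRED_OK;
}
```

A single mispredict flag cannot tell the last two cases apart, which is why
a second bit would be needed to expose both.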

> Anyway, since PPC people thought it worth baking into hardware,
> presumably they have a compelling use case. Mikey could you see if you
> can retrieve that from someone in the know? It might be interesting.

I don't think we can mispredict a non-conditional, non-computed branch, but
I'll have to check with the HW folks.

Mikey

>
> Also, it looks like it's trivial to add to x86, you seem to have already done
> all the hard work by having X86_BR_JCC.
> 
> The only missing piece would be:
> 
> --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> @@ -337,6 +337,10 @@ static int intel_pmu_setup_sw_lbr_filter
>  
>  	if (br_type & PERF_SAMPLE_BRANCH_IND_CALL)
>  		mask |= X86_BR_IND_CALL;
> +
> +	if (br_type & PERF_SAMPLE_BRANCH_CONDITIONAL)
> +		mask |= X86_BR_JCC;
> +
>  	/*
>  	 * stash actual user request into reg, it may
>  	 * be used by fixup code for some CPU
> 
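
For readers unfamiliar with what "doing it in the SW filter" amounts to, the
following is a rough userspace sketch of the idea: each captured branch is
classified, and entries whose class is not in the requested mask are dropped
by compacting the array in place.  The sketch_* names and bit values are
made up for illustration and are not the kernel's X86_BR_* constants or the
actual perf_event_intel_lbr.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical branch classes; values are illustrative only. */
enum sketch_br_class {
	SKETCH_BR_JCC      = 1u << 0, /* conditional branch */
	SKETCH_BR_JMP      = 1u << 1, /* unconditional direct jump */
	SKETCH_BR_IND_CALL = 1u << 2, /* indirect call */
};

/* Hypothetical captured branch, already classified by the caller. */
struct sketch_branch {
	uint64_t from;
	uint64_t to;
	unsigned int br_class; /* exactly one sketch_br_class bit */
};

/*
 * Keep only the entries whose class is selected in mask.  Survivors are
 * moved to the front in their original order; the new count is returned.
 */
static size_t sketch_sw_filter(struct sketch_branch *entries, size_t nr,
			       unsigned int mask)
{
	size_t kept = 0;

	assert(entries != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (entries[i].br_class & mask)
			entries[kept++] = entries[i];
	}
	return kept;
}
```

With a mask of only SKETCH_BR_JCC, a buffer holding a mix of conditional
branches, jumps and indirect calls is reduced to just the conditional ones,
which is the effect the extra mask bit in the patch above asks for.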

