[Cbe-oss-dev] Refactored cell powerpc oprofile patch

Geoff Levand geoffrey.levand at am.sony.com
Tue Mar 4 09:26:35 EST 2008


Hi Denis,

Barrow, Denis wrote:
> It should be functionally identical to Geoff Levands ps3-oprofile.patch
> just that op_model_ps3's functionality has been moved into op_model_cell.c
> & the architecture dependant parts have been put into pmu.c & ps3-lpm.c


I think there are two major things here that you have mixed together, one is
the abstraction of the platform specifics of the BE's performance monitor,
and the other is hooking oprofile up to that.

Also, as a general comment, you need to breakout your changes into smaller
patches, each one making a specific change.  Just a single monolithic
patch is difficult to understand and review.  Please look into the 'quilt'
utility, which is made to manage such a set of patches.


> If I installed more bugs than were in Geoff's code the special case
> code is a good place to look & do a diff with Geoffs original op_model_ps3.c &
> op_model_cell.c.


Well, I don't consider that existing work something to hold as a standard.


> Caveats
> I could not figure a way to avoid touching the generic oprofile code.


Then I think something is wrong in your method.  The cell MMIO stuff
is already hooked in, and I think all we need to do is to make a
platform abstraction inside it.


> The pmu.c is currently compiled into the cell oprofile code rather
> than being a seperate driver.


My question is then how do other profilers use pmu.c?  This seems to not
be correct.


>  arch/powerpc/oprofile/cell/pmu.c           | 1175 +++++++++++++++++++++++++++++


As I said above, this doesn't look correct, as the pmu interface provides
generic profiling support, used by any profiler, not just oprofile.


>  arch/powerpc/oprofile/cell/spu_profiler.c  |   10 
>  arch/powerpc/oprofile/cell/spu_task_sync.c |    2 
>  arch/powerpc/oprofile/common.c             |   25 
>  arch/powerpc/oprofile/op_model_cell.c      |  834 ++------------------
>  arch/powerpc/platforms/cell/Makefile       |    2 
>  arch/powerpc/platforms/cell/cbe_regs.c     |    1 
>  arch/powerpc/platforms/cell/interrupt.c    |    1 
>  arch/powerpc/platforms/cell/pmu.c          |  423 ----------
>  arch/powerpc/platforms/ps3/Kconfig         |    7 
>  drivers/oprofile/cpu_buffer.c              |    2 
>  drivers/oprofile/oprof.c                   |   97 ++
>  drivers/oprofile/oprof.h                   |   14 
>  drivers/ps3/ps3-lpm.c                      |  371 +++++++--
>  include/asm-powerpc/cell-pmu.h             |  114 ++
>  include/asm-powerpc/oprofile_impl.h        |    1 
>  include/asm-powerpc/ps3.h                  |   35 
>  19 files changed, 1861 insertions(+), 1262 deletions(-)
> 
> --- a/arch/powerpc/kernel/pmc.c
> +++ b/arch/powerpc/kernel/pmc.c
> @@ -40,6 +40,7 @@ static void dummy_perf(struct pt_regs *r
>  static DEFINE_SPINLOCK(pmc_owner_lock);
>  static void *pmc_owner_caller; /* mostly for debugging */
>  perf_irq_t perf_irq = dummy_perf;
> +EXPORT_SYMBOL_GPL(perf_irq);
>  
>  int reserve_pmc_hardware(perf_irq_t new_perf_irq)
>  {
> --- a/arch/powerpc/oprofile/Makefile
> +++ b/arch/powerpc/oprofile/Makefile
> @@ -11,9 +11,11 @@ DRIVER_OBJS := $(addprefix ../../../driv
>  		timer_int.o )
>  
>  oprofile-y := $(DRIVER_OBJS) common.o backtrace.o
> -oprofile-$(CONFIG_OPROFILE_CELL) += op_model_cell.o \
> -		cell/spu_profiler.o cell/vma_map.o \
> -		cell/spu_task_sync.o
> +oprofile-$(CONFIG_OPROFILE_CELL) += op_model_cell.o
> +ifeq ($(CONFIG_PPC_CELL_NATIVE),y)
> +oprofile-$(CONFIG_OPROFILE_CELL) += cell/spu_profiler.o cell/vma_map.o \
> +		cell/spu_task_sync.o cell/pmu.o
> +endif


I think you can arrange this to not have that conditional.
Will this work?

oprofile-$(CONFIG_OPROFILE_CELL) += op_model_cell.o
oprofile-$(CONFIG_PPC_CELL_NATIVE) += cell/spu_profiler.o cell/vma_map.o cell/spu_task_sync.o cell/pmu.o

We don't need any of these for ps3 or celleb?


>  oprofile-$(CONFIG_PPC64) += op_model_rs64.o op_model_power4.o op_model_pa6t.o
>  oprofile-$(CONFIG_FSL_EMB_PERFMON) += op_model_fsl_emb.o
>  oprofile-$(CONFIG_6xx) += op_model_7450.o
> --- /dev/null
> +++ b/arch/powerpc/oprofile/cell/pmu.c
> @@ -0,0 +1,1175 @@


Now this seems to be just a file move, or does it have some changes too???
Generally, a file move or rename is put into a separate stand-alone patch,
then another patch on top of that for any code modifications.


> --- a/arch/powerpc/oprofile/common.c
> +++ b/arch/powerpc/oprofile/common.c
> @@ -180,11 +186,21 @@ int __init oprofile_arch_init(struct opr
>  #ifdef CONFIG_PPC64
>  #ifdef CONFIG_OPROFILE_CELL
>  		case PPC_OPROFILE_CELL:
> -			if (firmware_has_feature(FW_FEATURE_LPAR))
> +			pr_debug("%s:%d: \n", __func__, __LINE__);
> +			if (firmware_has_feature(FW_FEATURE_LPAR) &&
> +				!firmware_has_feature(FW_FEATURE_PS3_LV1)) {
> +				pr_debug("%s:%d: \n", __func__, __LINE__);
> +				return -ENODEV;
> +			}
> +			pr_debug("%s:%d: \n", __func__, __LINE__);
> +			if (!op_powerpc_model)
>  				return -ENODEV;


This check was put here for ps3 and beat (celleb), so just a
check for beat would be needed now:

	if (firmware_has_feature(FW_FEATURE_BEAT)))
		return -ENODEV;


> -			model = &op_model_cell;
> +			model = op_powerpc_model;


I'm just wondering how we get op_model_cell hooked up.


>  			ops->sync_start = model->sync_start;
>  			ops->sync_stop = model->sync_stop;
> +#ifdef CONFIG_PPC_CELL_NATIVE
> +			cbe_init_pm_irq();
> +#endif


This doesn't look like the proper place nor thing to do...


>  			break;
>  #endif
>  		case PPC_OPROFILE_RS64:
> @@ -208,8 +224,10 @@ int __init oprofile_arch_init(struct opr
>  			break;
>  #endif
>  		default:
> +			pr_debug("%s:%d: \n", __func__, __LINE__);
>  			return -ENODEV;
>  	}
> +	pr_debug("%s:%d: \n", __func__, __LINE__);


You should put this added debug output into a separate temporary patch.


>  	model->num_counters = cur_cpu_spec->num_pmcs;
>  
> @@ -229,4 +247,7 @@ int __init oprofile_arch_init(struct opr
>  
>  void oprofile_arch_exit(void)
>  {
> +#ifdef CONFIG_PPC_CELL_NATIVE
> +       cbe_remove_pm_irq();
> +#endif
>  }
> --- a/arch/powerpc/oprofile/op_model_cell.c
> +++ b/arch/powerpc/oprofile/op_model_cell.c
> @@ -2,11 +2,14 @@
>   * Cell Broadband Engine OProfile Support
>   *
>   * (C) Copyright IBM Corporation 2006
> + * Copyright (C) 2007 Sony Computer Entertainment Inc.
> + * Copyright 2007 Sony Corporation.


We (ps3-linux team) don't add this kind of thing when we
are just making some changes.


>   *
>   * Author: David Erb (djerb at us.ibm.com)
>   * Modifications:
>   *	   Carl Love <carll at us.ibm.com>
>   *	   Maynard Johnson <maynardj at us.ibm.com>
> + *         D.J. Barrow <denis.barrow at eu.sony.com>


We (ps3-linux team) don't put this kind of thing into files.
Our modifications are logged via the kernel's change log.


>   * This program is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU General Public License
> @@ -26,45 +29,26 @@
>  #include <linux/timer.h>
>  #include <asm/cell-pmu.h>
>  #include <asm/cputable.h>
> -#include <asm/firmware.h>
> -#include <asm/io.h>
> -#include <asm/oprofile_impl.h>
>  #include <asm/processor.h>
>  #include <asm/prom.h>
> -#include <asm/ptrace.h>
>  #include <asm/reg.h>
> -#include <asm/rtas.h>
>  #include <asm/system.h>
>  #include <asm/cell-regs.h>
> -
>  #include "../platforms/cell/interrupt.h"
>  #include "cell/pr_util.h"
>  
> -static void cell_global_stop_spu(void);
>  
> +struct pmu_ops *pmu_ops;
> +EXPORT_SYMBOL_GPL(pmu_ops);


This seems totaly wrong, exporting the generic pmu
support in oprofile.


> +struct op_powerpc_model *op_powerpc_model;
> +EXPORT_SYMBOL_GPL(op_powerpc_model);


I think all we need is op_model_cell, and then that
calls platform specific routines.  Unfortunatly,
I see op_model_cell has a lot of platform specific
code in it.


>  /*
>   * spu_cycle_reset is the number of cycles between samples.
>   * This variable is used for SPU profiling and should ONLY be set
>   * at the beginning of cell_reg_setup; otherwise, it's read-only.
>   */
> -static unsigned int spu_cycle_reset;
> -
> -#define NUM_SPUS_PER_NODE    8
> -#define SPU_CYCLES_EVENT_NUM 2	/*  event number for SPU_CYCLES */
> -
> -#define PPU_CYCLES_EVENT_NUM 1	/*  event number for CYCLES */
> -#define PPU_CYCLES_GRP_NUM   1	/* special group number for identifying
> -				 * PPU_CYCLES event
> -				 */
> -#define CBE_COUNT_ALL_CYCLES 0x42800000 /* PPU cycle event specifier */
> -
> -#define NUM_THREADS 2         /* number of physical threads in
> -			       * physical processor
> -			       */
> -#define NUM_DEBUG_BUS_WORDS 4
> -#define NUM_INPUT_BUS_WORDS 2
> -
> -#define MAX_SPU_COUNT 0xFFFFFF	/* maximum 24 bit LFSR value */
> +unsigned int spu_cycle_reset;
> +EXPORT_SYMBOL_GPL(spu_cycle_reset);
>  
>  struct pmc_cntrl_data {
>  	unsigned long vcntr;
> @@ -73,31 +57,7 @@ struct pmc_cntrl_data {
>  	unsigned long enabled;
>  };
>  
> -/*
> - * ibm,cbe-perftools rtas parameters
> - */
> -struct pm_signal {
> -	u16 cpu;		/* Processor to modify */
> -	u16 sub_unit;		/* hw subunit this applies to (if applicable)*/
> -	short int signal_group; /* Signal Group to Enable/Disable */
> -	u8 bus_word;		/* Enable/Disable on this Trace/Trigger/Event
> -				 * Bus Word(s) (bitmask)
> -				 */
> -	u8 bit;			/* Trigger/Event bit (if applicable) */
> -};
>  
> -/*
> - * rtas call arguments
> - */
> -enum {
> -	SUBFUNC_RESET = 1,
> -	SUBFUNC_ACTIVATE = 2,
> -	SUBFUNC_DEACTIVATE = 3,
> -
> -	PASSTHRU_IGNORE = 0,
> -	PASSTHRU_ENABLE = 1,
> -	PASSTHRU_DISABLE = 2,
> -};
>  
>  struct pm_cntrl {
>  	u16 enable;
> @@ -121,9 +81,10 @@ static struct {
>  #define GET_COUNT_CYCLES(x) (x & 0x00000001)
>  #define GET_INPUT_CONTROL(x) ((x & 0x00000004) >> 2)
>  
> -static DEFINE_PER_CPU(unsigned long[NR_PHYS_CTRS], pmc_values);
> +DEFINE_PER_CPU(unsigned long[NR_PHYS_CTRS], pmc_values);
> +EXPORT_SYMBOL_GPL(per_cpu_var(pmc_values));
>  
> -static struct pmc_cntrl_data pmc_cntrl[NUM_THREADS][NR_PHYS_CTRS];
> +static struct pmc_cntrl_data pmc_cntrl[PPU_NUM_THREADS][NR_PHYS_CTRS];
>  
>  /*
>   * The CELL profiling code makes rtas calls to setup the debug bus to


Whoa! Bad news...


> @@ -141,129 +102,43 @@ static struct pmc_cntrl_data pmc_cntrl[N
>   */
>  
>  /*
> - * Interpetation of hdw_thread:
> + * Interpetation of oprofile_hdw_thread:
>   * 0 - even virtual cpus 0, 2, 4,...
>   * 1 - odd virtual cpus 1, 3, 5, ...
>   *
>   * FIXME: this is strictly wrong, we need to clean this up in a number
>   * of places. It works for now. -arnd
>   */
> -static u32 hdw_thread;
> +u32 oprofile_hdw_thread;
> +EXPORT_SYMBOL_GPL(oprofile_hdw_thread);
>  
> -static u32 virt_cntr_inter_mask;
> -static struct timer_list timer_virt_cntr;
> +u32 oprofile_virt_cntr_inter_mask;
> +EXPORT_SYMBOL_GPL(oprofile_virt_cntr_inter_mask);
> +struct timer_list oprofile_timer_virt_cntr;
> +EXPORT_SYMBOL_GPL(oprofile_timer_virt_cntr);
>  
>  /*
>   * pm_signal needs to be global since it is initialized in
>   * cell_reg_setup at the time when the necessary information
>   * is available.
>   */
> -static struct pm_signal pm_signal[NR_PHYS_CTRS];
> -static int pm_rtas_token;    /* token for debug bus setup call */
> -static int spu_rtas_token;   /* token for SPU cycle profiling */
> -
> -static u32 reset_value[NR_PHYS_CTRS];
> -static int num_counters;
> -static int oprofile_running;
> -static DEFINE_SPINLOCK(virt_cntr_lock);
> +struct pm_signal pm_signal[NR_PHYS_CTRS];
> +EXPORT_SYMBOL_GPL(pm_signal);
> +u32 oprofile_reset_value[NR_PHYS_CTRS];
> +EXPORT_SYMBOL_GPL(oprofile_reset_value);
> +
> +int oprofile_num_counters;
> +EXPORT_SYMBOL_GPL(oprofile_num_counters);
> +int oprofile_running;
> +EXPORT_SYMBOL_GPL(oprofile_running);
> +DEFINE_SPINLOCK(oprofile_virt_cntr_lock);
> +EXPORT_SYMBOL_GPL(oprofile_virt_cntr_lock);


With the design of a generic op_model_cell, all these things don't
need to be exported, as the platform specifics will be called indirectly
through a poinetr.


> @@ -441,13 +328,13 @@ static void cell_virtual_cntr(unsigned l
>  	 * not both playing with the counters on the same node.
>  	 */
>  
> -	spin_lock_irqsave(&virt_cntr_lock, flags);
> +	spin_lock_irqsave(&oprofile_virt_cntr_lock, flags);
>  
> -	prev_hdw_thread = hdw_thread;
> +	prev_hdw_thread = oprofile_hdw_thread;


This kind of name changing needs to go into a seperate patch,
but I think it is not needed if we get the platform  abstraction
right.


> @@ -469,45 +356,31 @@ static void cell_virtual_cntr(unsigned l
>  	 * we need cpu #, not node #, to pass to the cbe_xxx functions.
>  	 */
>  	for_each_online_cpu(cpu) {
> -		if (cbe_get_hw_thread_id(cpu))
> +		if (pmu_ops->get_hw_thread_id(cpu))
>  			continue;
>  
>  		/*
>  		 * stop counters, save counter values, restore counts
>  		 * for previous thread
>  		 */
> -		cbe_disable_pm(cpu);
> -		cbe_disable_pm_interrupts(cpu);
> -		for (i = 0; i < num_counters; i++) {
> -			per_cpu(pmc_values, cpu + prev_hdw_thread)[i]
> -			    = cbe_read_ctr(cpu, i);
> -
> -			if (per_cpu(pmc_values, cpu + next_hdw_thread)[i]
> -			    == 0xFFFFFFFF)
> -				/* If the cntr value is 0xffffffff, we must
> -				 * reset that to 0xfffffff0 when the current
> -				 * thread is restarted.	 This will generate a
> -				 * new interrupt and make sure that we never
> -				 * restore the counters to the max value.  If
> -				 * the counters were restored to the max value,
> -				 * they do not increment and no interrupts are
> -				 * generated.  Hence no more samples will be
> -				 * collected on that cpu.
> -				 */
> -				cbe_write_ctr(cpu, i, 0xFFFFFFF0);
> -			else
> -				cbe_write_ctr(cpu, i,
> -					      per_cpu(pmc_values,
> -						      cpu +
> -						      next_hdw_thread)[i]);
> -		}
>  
> +		pmu_ops->disable_pm(cpu);
> +		pmu_ops->disable_pm_interrupts(cpu);
> +		pmu_ops->virtual_cntr_part1(oprofile_num_counters,
> +					    prev_hdw_thread,
> +					    next_hdw_thread, cpu);
> +		/*
> +		 * Add sample data at here.
> +		 * Because PS3 hypervisor does not have
> +		 * the performance monitor interrupt feature.
> +		 */
> +		pmu_ops->add_sample(cpu);


It seems like you could just put this into the ps3 virtual_cntr_part1?


-Geoff




More information about the cbe-oss-dev mailing list