[Cbe-oss-dev] Refactored cell powerpc oprofile patch

Carl Love cel at us.ibm.com
Tue Mar 4 04:32:38 EST 2008



I have not tested any of this code on Cell yet, so I don't know whether
it breaks anything.  But I do have a number of questions/concerns.  In
summary, it really seems like adding support for PS3 should not require
such massive changes to the existing code.  After all, it is the same
processor and you are adding the support in the same directory as the
Cell processor.  I am very confused as to why you chose to pull Cell
code from multiple files/directories and replicate the code verbatim for
PS3.  It seems like you should be able to use the existing code without
creating a second identical copy.  I point out specifically where I see
this happening below.

At this point, the only issue I see with PS3 support using the existing
Cell code is the FW calls.  The Cell code makes FW calls to set up the
debug bus and do SPU cycle based profiling.  We started the process
within IBM to get approval to share the FW code with Sony and Toshiba so
that all three platforms can share the same FW interface.  This will
eliminate the need to add any kernel support for accessing the debug bus
registers.  The debug bus setup register address will probably change
with the next version of the HW.  Hence it is necessary to hide that
info from the kernel in the HW specific firmware.  I have not checked to
see where the process stands in terms of actually delivering the code.
We can pursue this issue offline.

I think this patch is not a good idea.  I feel we can accomplish the
same thing with far fewer code changes.  The benefit is that there will
be no need to maintain two identical or nearly identical sets of source
code.

                Carl Love

On Fri, 2008-02-29 at 16:37 +0000, Barrow, Denis wrote:
> Hi all,
> I'm sending this patch for peer review oprofile test script
> This patch passes Geoff Levand's linux/scripts/oprofile-test
> on ps3 but is completely untested on cell or celleb other
> than compiling, please test it.
> 
> It should be functionally identical to Geoff Levand's ps3-oprofile.patch,
> just that op_model_ps3's functionality has been moved into op_model_cell.c
> & the architecture dependent parts have been put into pmu.c & ps3-lpm.c
> If I installed more bugs than were in Geoff's code the special case
> code is a good place to look & do a diff with Geoff's original op_model_ps3.c &
> op_model_cell.c.
> 
> Caveats
> I could not figure a way to avoid touching the generic oprofile code.
> 
> The pmu.c is currently compiled into the cell oprofile code rather
> than being a separate driver.
> 
> There is one fixme in the patch 
> /* Fixme DJB is the irq_dispose_mapping adviseable */
> #if 0
> 				irq_dispose_mapping(irq);
> #endif
> 
> Also oprofile.ko has a module usage count of 2 & only
> goes down to 1 when ps3-lpm.ko is rmmod'ed
> I've no idea why this is.
> 
> Enjoy
> Subject: PS3: Add oprofile support
> 
> This is a WIP.
> 
> Add PS3 oprofile support.
> 
> wip-by: Geoff Levand <geoffrey.levand at am.sony.com>
> code refactoring by: D.J. Barrow <Denis.Barrow at eu.sony.com>
> ---
>  arch/powerpc/kernel/pmc.c                  |    1 
>  arch/powerpc/oprofile/Makefile             |    8 
>  arch/powerpc/oprofile/cell/pmu.c           | 1175 +++++++++++++++++++++++++++++
>  arch/powerpc/oprofile/cell/spu_profiler.c  |   10 
>  arch/powerpc/oprofile/cell/spu_task_sync.c |    2 
>  arch/powerpc/oprofile/common.c             |   25 
>  arch/powerpc/oprofile/op_model_cell.c      |  834 ++------------------
>  arch/powerpc/platforms/cell/Makefile       |    2 
>  arch/powerpc/platforms/cell/cbe_regs.c     |    1 
>  arch/powerpc/platforms/cell/interrupt.c    |    1 
>  arch/powerpc/platforms/cell/pmu.c          |  423 ----------
>  arch/powerpc/platforms/ps3/Kconfig         |    7 
>  drivers/oprofile/cpu_buffer.c              |    2 
>  drivers/oprofile/oprof.c                   |   97 ++
>  drivers/oprofile/oprof.h                   |   14 
>  drivers/ps3/ps3-lpm.c                      |  371 +++++++--
>  include/asm-powerpc/cell-pmu.h             |  114 ++
>  include/asm-powerpc/oprofile_impl.h        |    1 
>  include/asm-powerpc/ps3.h                  |   35 
>  19 files changed, 1861 insertions(+), 1262 deletions(-)
> 
> --- a/arch/powerpc/kernel/pmc.c
> +++ b/arch/powerpc/kernel/pmc.c
> @@ -40,6 +40,7 @@ static void dummy_perf(struct pt_regs *r
>  static DEFINE_SPINLOCK(pmc_owner_lock);
>  static void *pmc_owner_caller; /* mostly for debugging */
>  perf_irq_t perf_irq = dummy_perf;
> +EXPORT_SYMBOL_GPL(perf_irq);
>  
>  int reserve_pmc_hardware(perf_irq_t new_perf_irq)
>  {
> --- a/arch/powerpc/oprofile/Makefile
> +++ b/arch/powerpc/oprofile/Makefile
> @@ -11,9 +11,11 @@ DRIVER_OBJS := $(addprefix ../../../driv
>  		timer_int.o )
>  
>  oprofile-y := $(DRIVER_OBJS) common.o backtrace.o
> -oprofile-$(CONFIG_OPROFILE_CELL) += op_model_cell.o \
> -		cell/spu_profiler.o cell/vma_map.o \
> -		cell/spu_task_sync.o
> +oprofile-$(CONFIG_OPROFILE_CELL) += op_model_cell.o
> +ifeq ($(CONFIG_PPC_CELL_NATIVE),y)
> +oprofile-$(CONFIG_OPROFILE_CELL) += cell/spu_profiler.o cell/vma_map.o \
> +		cell/spu_task_sync.o cell/pmu.o
> +endif
>  oprofile-$(CONFIG_PPC64) += op_model_rs64.o op_model_power4.o op_model_pa6t.o
>  oprofile-$(CONFIG_FSL_EMB_PERFMON) += op_model_fsl_emb.o
>  oprofile-$(CONFIG_6xx) += op_model_7450.o
> --- /dev/null
> +++ b/arch/powerpc/oprofile/cell/pmu.c
> @@ -0,0 +1,1175 @@
> +/*
> + * Cell Broadband Engine Performance Monitor
> + *
> + * (C) Copyright IBM Corporation 2001,2006
> + *
> + * Author:
> + *    David Erb (djerb at us.ibm.com)
> + *    Kevin Corry (kevcorry at us.ibm.com)
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2, or (at your option)
> + * any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> + */
> +
> +#include <linux/interrupt.h>
> +#include <linux/types.h>
> +#include <asm/io.h>
> +#include <asm/irq_regs.h>
> +#include <asm/machdep.h>
> +#include <asm/pmc.h>
> +#include <asm/reg.h>
> +#include <asm/spu.h>
> +#include <asm/cell-regs.h>
> +#include <asm/oprofile_impl.h>
> +#include <asm/cell-pmu.h>
> +#include <asm/firmware.h>
> +#include <asm/io.h>
> +#include <asm/ptrace.h>
> +#include <asm/rtas.h>
> +#include <linux/cpufreq.h>
> +#include "pr_util.h"
> +
> +#include "../../platforms/cell/interrupt.h"

You have pulled all of the shadow register code verbatim from
arch/powerpc/platforms/cell/pmu.c.  Why can't you just call these
existing functions?  It is not clear to me that PS3 support has to be
separate from Cell; again, they are the same processor.
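
For readers not familiar with the shadow register scheme being copied
here, a minimal userspace sketch of the idea follows.  It is not the
kernel code: struct wo_reg, wo_write() and wo_read() are hypothetical
names, and a plain variable stands in for the write-only MMIO register.
It only shows the pattern the quoted macros describe, where the 32-bit
value goes into the upper half of a 64-bit field and a software copy is
kept for later reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one write-only hardware register. */
struct wo_reg {
	uint64_t hw;     /* what the device would receive; never read back */
	uint32_t shadow; /* software copy of the last value written */
};

/* Place the 32-bit value in the upper half and remember it in the shadow. */
static void wo_write(struct wo_reg *r, uint32_t val)
{
	r->hw = (uint64_t)val << 32;
	r->shadow = val;
}

/* Reads are served from the shadow copy, not from the hardware. */
static uint32_t wo_read(const struct wo_reg *r)
{
	return r->shadow;
}

/* Small usage example: a read returns the last value written. */
static void wo_example(void)
{
	struct wo_reg r = { 0, 0 };

	wo_write(&r, 0xCAFE0001u);
	assert(wo_read(&r) == 0xCAFE0001u);
}
```

Since this bookkeeping is the same on either platform, it is the kind
of code that looks shareable rather than something to copy.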


> +
> +/*
> + * When writing to write-only mmio addresses, save a shadow copy. All of the
> + * registers are 32-bit, but stored in the upper-half of a 64-bit field in
> + * pmd_regs.
> + */
> +
> +#define WRITE_WO_MMIO(reg, x)					\
> +	do {							\
> +		u32 _x = (x);					\
> +		struct cbe_pmd_regs __iomem *pmd_regs;		\
> +		struct cbe_pmd_shadow_regs *shadow_regs;	\
> +		pmd_regs = cbe_get_cpu_pmd_regs(cpu);		\
> +		shadow_regs = cbe_get_cpu_pmd_shadow_regs(cpu);	\
> +		out_be64(&(pmd_regs->reg), (((u64)_x) << 32));	\
> +		shadow_regs->reg = _x;				\
> +	} while (0)
> +
> +#define READ_SHADOW_REG(val, reg)				\
> +	do {							\
> +		struct cbe_pmd_shadow_regs *shadow_regs;	\
> +		shadow_regs = cbe_get_cpu_pmd_shadow_regs(cpu);	\
> +		(val) = shadow_regs->reg;			\
> +	} while (0)
> +
> +#define READ_MMIO_UPPER32(val, reg)				\
> +	do {							\
> +		struct cbe_pmd_regs __iomem *pmd_regs;		\
> +		pmd_regs = cbe_get_cpu_pmd_regs(cpu);		\
> +		(val) = (u32)(in_be64(&pmd_regs->reg) >> 32);	\
> +	} while (0)
> +
> +/*
> + * rtas call arguments
> + */
> +enum {
> +	SUBFUNC_RESET = 1,
> +	SUBFUNC_ACTIVATE = 2,
> +	SUBFUNC_DEACTIVATE = 3,
> +
> +	PASSTHRU_IGNORE = 0,
> +	PASSTHRU_ENABLE = 1,
> +	PASSTHRU_DISABLE = 2,
> +};
> +
> +
> +static int pm_rtas_token;    /* token for debug bus setup call */
> +static int spu_rtas_token;   /* token for SPU cycle profiling */
> +
> +
> +/* Routines for reading/writing the PMU registers. */
> +static u32  cbe_read_phys_ctr(u32 cpu, u32 phys_ctr);
> +static void cbe_write_phys_ctr(u32 cpu, u32 phys_ctr, u32 val);
> +static u32  cbe_read_ctr(u32 cpu, u32 ctr);
> +static void cbe_write_ctr(u32 cpu, u32 ctr, u32 val);
> +
> +static u32  cbe_read_pm07_control(u32 cpu, u32 ctr);
> +static void cbe_write_pm07_control(u32 cpu, u32 ctr, u32 val);
> +static u32  cbe_read_pm(u32 cpu, enum pm_reg_name reg);
> +static void cbe_write_pm(u32 cpu, enum pm_reg_name reg, u32 val);
> +
> +static u32  cbe_get_ctr_size(u32 cpu, u32 phys_ctr);
> +static void cbe_set_ctr_size(u32 cpu, u32 phys_ctr, u32 ctr_size);
> +
> +static void cbe_enable_pm(u32 cpu);
> +static void cbe_disable_pm(u32 cpu);
> +
> +static void cbe_read_trace_buffer(u32 cpu, u64 *buf);
> +
> +static void cbe_enable_pm_interrupts(u32 cpu, u32 thread, u32 mask);
> +static void cbe_disable_pm_interrupts(u32 cpu);
> +static u32  cbe_get_and_clear_pm_interrupts(u32 cpu);
> +static void cbe_sync_irq(int node);
> +
> +
> +/*
> + * Physical counter registers.
> + * Each physical counter can act as one 32-bit counter or two 16-bit counters.
> + */
> +
> +static u32 cbe_read_phys_ctr(u32 cpu, u32 phys_ctr)
> +{
> +	u32 val_in_latch, val = 0;
> +
> +	if (phys_ctr < NR_PHYS_CTRS) {
> +		READ_SHADOW_REG(val_in_latch, counter_value_in_latch);
> +
> +		/* Read the latch or the actual counter, whichever is newer. */
> +		if (val_in_latch & (1 << phys_ctr)) {
> +			READ_SHADOW_REG(val, pm_ctr[phys_ctr]);
> +		} else {
> +			READ_MMIO_UPPER32(val, pm_ctr[phys_ctr]);
> +		}
> +	}
> +
> +	return val;
> +}
> +
> +static void cbe_write_phys_ctr(u32 cpu, u32 phys_ctr, u32 val)
> +{
> +	struct cbe_pmd_shadow_regs *shadow_regs;
> +	u32 pm_ctrl;
> +
> +	if (phys_ctr < NR_PHYS_CTRS) {
> +		/* Writing to a counter only writes to a hardware latch.
> +		 * The new value is not propagated to the actual counter
> +		 * until the performance monitor is enabled.
> +		 */
> +		WRITE_WO_MMIO(pm_ctr[phys_ctr], val);
> +
> +		pm_ctrl = cbe_read_pm(cpu, pm_control);
> +		if (pm_ctrl & CBE_PM_ENABLE_PERF_MON) {
> +			/* The counters are already active, so we need to
> +			 * rewrite the pm_control register to "re-enable"
> +			 * the PMU.
> +			 */
> +			cbe_write_pm(cpu, pm_control, pm_ctrl);
> +		} else {
> +			shadow_regs = cbe_get_cpu_pmd_shadow_regs(cpu);
> +			shadow_regs->counter_value_in_latch |= (1 << phys_ctr);
> +		}
> +	}
> +}
> +
> +
> +/*
> + * "Logical" counter registers.
> + * These will read/write 16-bits or 32-bits depending on the
> + * current size of the counter. Counters 4 - 7 are always 16-bit.
> + */
> +
> +static u32 cbe_read_ctr(u32 cpu, u32 ctr)
> +{
> +	u32 val;
> +	u32 phys_ctr = ctr & (NR_PHYS_CTRS - 1);
> +
> +	val = cbe_read_phys_ctr(cpu, phys_ctr);
> +
> +	if (cbe_get_ctr_size(cpu, phys_ctr) == 16)
> +		val = (ctr < NR_PHYS_CTRS) ? (val >> 16) : (val & 0xffff);
> +
> +	return val;
> +}
> +
> +
> +static void cbe_write_ctr(u32 cpu, u32 ctr, u32 val)
> +{
> +	u32 phys_ctr;
> +	u32 phys_val;
> +
> +	phys_ctr = ctr & (NR_PHYS_CTRS - 1);
> +
> +	if (cbe_get_ctr_size(cpu, phys_ctr) == 16) {
> +		phys_val = cbe_read_phys_ctr(cpu, phys_ctr);
> +
> +		if (ctr < NR_PHYS_CTRS)
> +			val = (val << 16) | (phys_val & 0xffff);
> +		else
> +			val = (val & 0xffff) | (phys_val & 0xffff0000);
> +	}
> +
> +	cbe_write_phys_ctr(cpu, phys_ctr, val);
> +}
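
The 16-bit packing in the two functions above can be sketched in
isolation as follows.  This is an illustration only: read_logical()
and merge_logical() are hypothetical helpers, the is16 flag replaces
the cbe_get_ctr_size() lookup, and NR_PHYS_CTRS is assumed to be 4,
which is what "Counters 4 - 7 are always 16-bit" suggests.

```c
#include <assert.h>
#include <stdint.h>

#define NR_PHYS_CTRS 4u /* assumed value, see the note above */

/* Extract a logical counter value from its physical counter. */
static uint32_t read_logical(uint32_t phys_val, uint32_t ctr, int is16)
{
	if (!is16)
		return phys_val;
	/* Logical counters 0-3 use the upper half, 4-7 the lower half. */
	return (ctr < NR_PHYS_CTRS) ? (phys_val >> 16) : (phys_val & 0xffff);
}

/* Compute the new physical value when one logical counter is written. */
static uint32_t merge_logical(uint32_t phys_val, uint32_t ctr,
			      uint32_t val, int is16)
{
	if (!is16)
		return val;
	if (ctr < NR_PHYS_CTRS)
		return (val << 16) | (phys_val & 0xffff);
	return (val & 0xffff) | (phys_val & 0xffff0000u);
}

/* Usage example: writing one half leaves the other half untouched. */
static void packing_example(void)
{
	uint32_t phys = merge_logical(0xAAAABBBBu, 1, 0x1234u, 1);

	assert(read_logical(phys, 1, 1) == 0x1234u);
	assert(read_logical(phys, 5, 1) == 0xBBBBu);
}
```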
> +
> +
> +/*
> + * Counter-control registers.
> + * Each "logical" counter has a corresponding control register.
> + */
> +
> +static u32 cbe_read_pm07_control(u32 cpu, u32 ctr)
> +{
> +	u32 pm07_control = 0;
> +
> +	if (ctr < NR_CTRS)
> +		READ_SHADOW_REG(pm07_control, pm07_control[ctr]);
> +
> +	return pm07_control;
> +}
> +
> +
> +static void cbe_write_pm07_control(u32 cpu, u32 ctr, u32 val)
> +{
> +	if (ctr < NR_CTRS)
> +		WRITE_WO_MMIO(pm07_control[ctr], val);
> +}
> +
> +
> +/*
> + * Other PMU control registers. Most of these are write-only.
> + */
> +
> +static u32 cbe_read_pm(u32 cpu, enum pm_reg_name reg)
> +{
> +	u32 val = 0;
> +
> +	switch (reg) {
> +	case group_control:
> +		READ_SHADOW_REG(val, group_control);
> +		break;
> +
> +	case debug_bus_control:
> +		READ_SHADOW_REG(val, debug_bus_control);
> +		break;
> +
> +	case trace_address:
> +		READ_MMIO_UPPER32(val, trace_address);
> +		break;
> +
> +	case ext_tr_timer:
> +		READ_SHADOW_REG(val, ext_tr_timer);
> +		break;
> +
> +	case pm_status:
> +		READ_MMIO_UPPER32(val, pm_status);
> +		break;
> +
> +	case pm_control:
> +		READ_SHADOW_REG(val, pm_control);
> +		break;
> +
> +	case pm_interval:
> +		READ_MMIO_UPPER32(val, pm_interval);
> +		break;
> +
> +	case pm_start_stop:
> +		READ_SHADOW_REG(val, pm_start_stop);
> +		break;
> +	}
> +
> +	return val;
> +}
> +
> +
> +static void cbe_write_pm(u32 cpu, enum pm_reg_name reg, u32 val)
> +{
> +	switch (reg) {
> +	case group_control:
> +		WRITE_WO_MMIO(group_control, val);
> +		break;
> +
> +	case debug_bus_control:
> +		WRITE_WO_MMIO(debug_bus_control, val);
> +		break;
> +
> +	case trace_address:
> +		WRITE_WO_MMIO(trace_address, val);
> +		break;
> +
> +	case ext_tr_timer:
> +		WRITE_WO_MMIO(ext_tr_timer, val);
> +		break;
> +
> +	case pm_status:
> +		WRITE_WO_MMIO(pm_status, val);
> +		break;
> +
> +	case pm_control:
> +		WRITE_WO_MMIO(pm_control, val);
> +		break;
> +
> +	case pm_interval:
> +		WRITE_WO_MMIO(pm_interval, val);
> +		break;
> +
> +	case pm_start_stop:
> +		WRITE_WO_MMIO(pm_start_stop, val);
> +		break;
> +	}
> +}
> +
> +
> +/*
> + * Get/set the size of a physical counter to either 16 or 32 bits.
> + */
> +
> +static u32 cbe_get_ctr_size(u32 cpu, u32 phys_ctr)
> +{
> +	u32 pm_ctrl, size = 0;
> +
> +	if (phys_ctr < NR_PHYS_CTRS) {
> +		pm_ctrl = cbe_read_pm(cpu, pm_control);
> +		size = (pm_ctrl & CBE_PM_16BIT_CTR(phys_ctr)) ? 16 : 32;
> +	}
> +
> +	return size;
> +}
> +
> +
> +static void cbe_set_ctr_size(u32 cpu, u32 phys_ctr, u32 ctr_size)
> +{
> +	u32 pm_ctrl;
> +
> +	if (phys_ctr < NR_PHYS_CTRS) {
> +		pm_ctrl = cbe_read_pm(cpu, pm_control);
> +		switch (ctr_size) {
> +		case 16:
> +			pm_ctrl |= CBE_PM_16BIT_CTR(phys_ctr);
> +			break;
> +
> +		case 32:
> +			pm_ctrl &= ~CBE_PM_16BIT_CTR(phys_ctr);
> +			break;
> +		}
> +		cbe_write_pm(cpu, pm_control, pm_ctrl);
> +	}
> +}
> +
> +
> +/*
> + * Enable/disable the entire performance monitoring unit.
> + * When we enable the PMU, all pending writes to counters get committed.
> + */
> +
> +static void cbe_enable_pm(u32 cpu)
> +{
> +	struct cbe_pmd_shadow_regs *shadow_regs;
> +	u32 pm_ctrl;
> +
> +	shadow_regs = cbe_get_cpu_pmd_shadow_regs(cpu);
> +	shadow_regs->counter_value_in_latch = 0;
> +
> +	pm_ctrl = cbe_read_pm(cpu, pm_control) | CBE_PM_ENABLE_PERF_MON;
> +	cbe_write_pm(cpu, pm_control, pm_ctrl);
> +}
> +
> +
> +static void cbe_disable_pm(u32 cpu)
> +{
> +	u32 pm_ctrl;
> +	pm_ctrl = cbe_read_pm(cpu, pm_control) & ~CBE_PM_ENABLE_PERF_MON;
> +	cbe_write_pm(cpu, pm_control, pm_ctrl);
> +}
> +
> +
> +/*
> + * Reading from the trace_buffer.
> + * The trace buffer is two 64-bit registers. Reading from
> + * the second half automatically increments the trace_address.
> + */
> +
> +static void cbe_read_trace_buffer(u32 cpu, u64 *buf)
> +{
> +	struct cbe_pmd_regs __iomem *pmd_regs = cbe_get_cpu_pmd_regs(cpu);
> +
> +	*buf++ = in_be64(&pmd_regs->trace_buffer_0_63);
> +	*buf++ = in_be64(&pmd_regs->trace_buffer_64_127);
> +}
> +
> +
> +/*
> + * Enabling/disabling interrupts for the entire performance monitoring unit.
> + */
> +
> +static u32 cbe_get_and_clear_pm_interrupts(u32 cpu)
> +{
> +	/* Reading pm_status clears the interrupt bits. */
> +	return cbe_read_pm(cpu, pm_status);
> +}
> +
> +
> +static void cbe_enable_pm_interrupts(u32 cpu, u32 thread, u32 mask)
> +{
> +	/* Set which node and thread will handle the next interrupt. */
> +	iic_set_interrupt_routing(cpu, thread, 0);
> +
> +	/* Enable the interrupt bits in the pm_status register. */
> +	if (mask)
> +		cbe_write_pm(cpu, pm_status, mask);
> +}
> +
> +
> +static void cbe_disable_pm_interrupts(u32 cpu)
> +{
> +	cbe_get_and_clear_pm_interrupts(cpu);
> +	cbe_write_pm(cpu, pm_status, 0);
> +}
> +
> +
> +static irqreturn_t cbe_pm_irq(int irq, void *dev_id)
> +{
> +	perf_irq(get_irq_regs());
> +	return IRQ_HANDLED;
> +}

Everything above here is pulled from arch/powerpc/platforms/cell/pmu.c.
Isn't there some way to use/share existing code?  Duplicating the code
adds unnecessary bloat to the kernel source.
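
One possible shape for sharing, sketched under assumptions: keep the
common logic in one place and let each platform supply only its
register accessors.  Nothing below is an existing kernel interface;
struct pmu_ops, the fake_* backend and restore_ctr() are made up for
illustration.  The only part taken from the patch is the rule that a
saved value of 0xFFFFFFFF is restored as 0xFFFFFFF0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-platform accessors; not an existing kernel API. */
struct pmu_ops {
	uint32_t (*read_ctr)(uint32_t cpu, uint32_t ctr);
	void (*write_ctr)(uint32_t cpu, uint32_t ctr, uint32_t val);
};

/* Stand-in backend: a plain array instead of real PMU registers. */
static uint32_t fake_ctrs[2][4];

static uint32_t fake_read_ctr(uint32_t cpu, uint32_t ctr)
{
	return fake_ctrs[cpu][ctr];
}

static void fake_write_ctr(uint32_t cpu, uint32_t ctr, uint32_t val)
{
	fake_ctrs[cpu][ctr] = val;
}

static const struct pmu_ops fake_ops = { fake_read_ctr, fake_write_ctr };

/*
 * Shared logic, written once.  A counter restored to the max value
 * would not increment, so it is backed off to 0xFFFFFFF0 first.
 */
static void restore_ctr(const struct pmu_ops *ops, uint32_t cpu,
			uint32_t ctr, uint32_t saved)
{
	if (saved == 0xFFFFFFFFu)
		saved = 0xFFFFFFF0u;
	ops->write_ctr(cpu, ctr, saved);
}

/* Usage example with the stand-in backend. */
static void ops_example(void)
{
	restore_ctr(&fake_ops, 0, 0, 0xFFFFFFFFu);
	assert(fake_ops.read_ctr(0, 0) == 0xFFFFFFF0u);
}
```

A second platform would then add one more struct pmu_ops instance
instead of a second copy of the shared code.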

From here down, you are taking a lot of code from
arch/powerpc/oprofile/op_model_cell.c.

> +
> +static void cell_set_pm_event_part1(struct pm_signal *p)
> +{
> +}
> +
> +static void cell_set_pm_event_part2(struct pm_signal *p, u32  *bus_type)
> +{
> +}
> +
> +static void cell_virtual_cntr_part1(int num_counters, int prev_hdw_thread,
> +				    int next_hdw_thread, u32 cpu)
> +{
> +	int i;
> +
> +	for (i = 0; i < num_counters; i++) {
> +		per_cpu(pmc_values, cpu + prev_hdw_thread)[i]
> +			= cbe_read_ctr(cpu, i);
> +		if (per_cpu(pmc_values, cpu + next_hdw_thread)[i]
> +		    == 0xFFFFFFFF)
> +			/* If the cntr value is 0xffffffff, we must
> +			 * reset that to 0xfffffff0 when the current
> +			 * thread is restarted.	 This will generate a
> +			 * new interrupt and make sure that we never
> +			 * restore the counters to the max value.  If
> +			 * the counters were restored to the max value,
> +			 * they do not increment and no interrupts are
> +			 * generated.  Hence no more samples will be
> +			 * collected on that cpu.
> +			 */
> +			cbe_write_ctr(cpu, i, 0xFFFFFFF0);
> +		else
> +			cbe_write_ctr(cpu, i,
> +					   per_cpu(pmc_values,
> +						   cpu +
> +						   next_hdw_thread)[i]);
> +	}
> +}
> +
> +static void cell_add_sample(u32 cpu)
> +{
> +}
> +
> +static int cell_reg_setup_part1(struct op_counter_config *ctr)
> +{
> +	spu_cycle_reset = 0;
> +
> +	if (ctr[0].event == SPU_CYCLES_EVENT_NUM) {
> +		spu_cycle_reset = ctr[0].count;
> +
> +		/*
> +		 * Each node will need to make the rtas call to start
> +		 * and stop SPU profiling.  Get the token once and store it.
> +		 */
> +		spu_rtas_token = rtas_token("ibm,cbe-spu-perftools");
> +
> +		if (unlikely(spu_rtas_token == RTAS_UNKNOWN_SERVICE)) {
> +			printk(KERN_ERR
> +			       "%s: rtas token ibm,cbe-spu-perftools unknown\n",
> +			       __func__);
> +			return -EIO;
> +		}
> +	}
> +
> +	pm_rtas_token = rtas_token("ibm,cbe-perftools");
> +
> +	/*
> +	 * For all events except PPU CYCLEs, each node will need to make
> +	 * the rtas cbe-perftools call to setup and reset the debug bus.
> +	 * Make the token lookup call once and store it in the global
> +	 * variable pm_rtas_token.
> +	 */
> +	if (unlikely(pm_rtas_token == RTAS_UNKNOWN_SERVICE)) {
> +		printk(KERN_ERR
> +		       "%s: rtas token ibm,cbe-perftools unknown\n",
> +		       __func__);
> +		return -EIO;
> +	}
> +	return 0;
> +}
> +
> +static void cell_reg_setup_part2(int i, struct op_counter_config *ctr)
> +{
> +	/* Using 32bit counters, reset max - count */
> +	oprofile_reset_value[i] = 0xFFFFFFFF - ctr[i].count;
> +}
> +
> +/*
> + * Firmware interface functions
> + */
> +static int
> +rtas_ibm_cbe_perftools(int subfunc, int passthru,
> +		       void *address, unsigned long length)
> +{
> +	u64 paddr = __pa(address);
> +
> +	return rtas_call(pm_rtas_token, 5, 1, NULL, subfunc,
> +			 passthru, paddr >> 32, paddr & 0xffffffff, length);
> +}
> +
> +static void pm_rtas_reset_signals(u32 node)
> +{
> +	int ret;
> +	struct pm_signal pm_signal_local;
> +
> +	/*
> +	 * The debug bus is being set to the passthru disable state.
> +	 * However, the FW still expects at least one legal signal routing
> +	 * entry or it will return an error on the arguments.	If we don't
> +	 * supply a valid entry, we must ignore all return values.  Ignoring
> +	 * all return values means we might miss an error we should be
> +	 * concerned about.
> +	 */
> +
> +	/*  fw expects physical cpu #. */
> +	pm_signal_local.cpu = node;
> +	pm_signal_local.signal_group = 21;
> +	pm_signal_local.bus_word = 1;
> +	pm_signal_local.sub_unit = 0;
> +	pm_signal_local.bit = 0;
> +
> +	ret = rtas_ibm_cbe_perftools(SUBFUNC_RESET, PASSTHRU_DISABLE,
> +				     &pm_signal_local,
> +				     sizeof(struct pm_signal));
> +
> +	if (unlikely(ret))
> +		/*
> +		 * Not a fatal error. For Oprofile stop, the oprofile
> +		 * functions do not support returning an error for
> +		 * failure to stop OProfile.
> +		 */
> +		printk(KERN_WARNING "%s: rtas returned: %d\n",
> +		       __func__, ret);
> +}
> +
> +static int cell_pm_activate_signals(u32 node, u32 count)
> +{
> +	int ret;
> +	int i, j;
> +	struct pm_signal pm_signal_local[NR_PHYS_CTRS];
> +
> +	/*
> +	 * There is no debug setup required for the cycles event.
> +	 * Note that only events in the same group can be used.
> +	 * Otherwise, there will be conflicts in correctly routing
> +	 * the signals on the debug bus.  It is the responsibility
> +	 * of the OProfile user tool to check the events are in
> +	 * the same group.
> +	 */
> +	i = 0;
> +	for (j = 0; j < count; j++) {
> +		if (pm_signal[j].signal_group != PPU_CYCLES_GRP_NUM) {
> +
> +			/* fw expects physical cpu # */
> +			pm_signal_local[i].cpu = node;
> +			pm_signal_local[i].signal_group
> +				= pm_signal[j].signal_group;
> +			pm_signal_local[i].bus_word = pm_signal[j].bus_word;
> +			pm_signal_local[i].sub_unit = pm_signal[j].sub_unit;
> +			pm_signal_local[i].bit = pm_signal[j].bit;
> +			i++;
> +		}
> +	}
> +
> +	if (i != 0) {
> +		ret = rtas_ibm_cbe_perftools(SUBFUNC_ACTIVATE, PASSTHRU_ENABLE,
> +					     pm_signal_local,
> +					     i * sizeof(struct pm_signal));
> +
> +		if (unlikely(ret)) {
> +			printk(KERN_WARNING "%s: rtas returned: %d\n",
> +			       __func__, ret);
> +			return -EIO;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +#define ENTRIES	 303
> +#define MAXLFSR	 0xFFFFFF
> +
> +/* precomputed table of 24 bit LFSR values */
> +static int initial_lfsr[] = {
> + 8221349, 12579195, 5379618, 10097839, 7512963, 7519310, 3955098, 10753424,
> + 15507573, 7458917, 285419, 2641121, 9780088, 3915503, 6668768, 1548716,
> + 4885000, 8774424, 9650099, 2044357, 2304411, 9326253, 10332526, 4421547,
> + 3440748, 10179459, 13332843, 10375561, 1313462, 8375100, 5198480, 6071392,
> + 9341783, 1526887, 3985002, 1439429, 13923762, 7010104, 11969769, 4547026,
> + 2040072, 4025602, 3437678, 7939992, 11444177, 4496094, 9803157, 10745556,
> + 3671780, 4257846, 5662259, 13196905, 3237343, 12077182, 16222879, 7587769,
> + 14706824, 2184640, 12591135, 10420257, 7406075, 3648978, 11042541, 15906893,
> + 11914928, 4732944, 10695697, 12928164, 11980531, 4430912, 11939291, 2917017,
> + 6119256, 4172004, 9373765, 8410071, 14788383, 5047459, 5474428, 1737756,
> + 15967514, 13351758, 6691285, 8034329, 2856544, 14394753, 11310160, 12149558,
> + 7487528, 7542781, 15668898, 12525138, 12790975, 3707933, 9106617, 1965401,
> + 16219109, 12801644, 2443203, 4909502, 8762329, 3120803, 6360315, 9309720,
> + 15164599, 10844842, 4456529, 6667610, 14924259, 884312, 6234963, 3326042,
> + 15973422, 13919464, 5272099, 6414643, 3909029, 2764324, 5237926, 4774955,
> + 10445906, 4955302, 5203726, 10798229, 11443419, 2303395, 333836, 9646934,
> + 3464726, 4159182, 568492, 995747, 10318756, 13299332, 4836017, 8237783,
> + 3878992, 2581665, 11394667, 5672745, 14412947, 3159169, 9094251, 16467278,
> + 8671392, 15230076, 4843545, 7009238, 15504095, 1494895, 9627886, 14485051,
> + 8304291, 252817, 12421642, 16085736, 4774072, 2456177, 4160695, 15409741,
> + 4902868, 5793091, 13162925, 16039714, 782255, 11347835, 14884586, 366972,
> + 16308990, 11913488, 13390465, 2958444, 10340278, 1177858, 1319431, 10426302,
> + 2868597, 126119, 5784857, 5245324, 10903900, 16436004, 3389013, 1742384,
> + 14674502, 10279218, 8536112, 10364279, 6877778, 14051163, 1025130, 6072469,
> + 1988305, 8354440, 8216060, 16342977, 13112639, 3976679, 5913576, 8816697,
> + 6879995, 14043764, 3339515, 9364420, 15808858, 12261651, 2141560, 5636398,
> + 10345425, 10414756, 781725, 6155650, 4746914, 5078683, 7469001, 6799140,
> + 10156444, 9667150, 10116470, 4133858, 2121972, 1124204, 1003577, 1611214,
> + 14304602, 16221850, 13878465, 13577744, 3629235, 8772583, 10881308, 2410386,
> + 7300044, 5378855, 9301235, 12755149, 4977682, 8083074, 10327581, 6395087,
> + 9155434, 15501696, 7514362, 14520507, 15808945, 3244584, 4741962, 9658130,
> + 14336147, 8654727, 7969093, 15759799, 14029445, 5038459, 9894848, 8659300,
> + 13699287, 8834306, 10712885, 14753895, 10410465, 3373251, 309501, 9561475,
> + 5526688, 14647426, 14209836, 5339224, 207299, 14069911, 8722990, 2290950,
> + 3258216, 12505185, 6007317, 9218111, 14661019, 10537428, 11731949, 9027003,
> + 6641507, 9490160, 200241, 9720425, 16277895, 10816638, 1554761, 10431375,
> + 7467528, 6790302, 3429078, 14633753, 14428997, 11463204, 3576212, 2003426,
> + 6123687, 820520, 9992513, 15784513, 5778891, 6428165, 8388607
> +};
> +
> +/*
> + * The hardware uses an LFSR counting sequence to determine when to capture
> + * the SPU PCs.  An LFSR sequence is like a pseudo random number sequence
> + * where each number occurs once in the sequence but the sequence is not in
> + * numerical order. The SPU PC capture is done when the LFSR sequence reaches
> + * the last value in the sequence.  Hence the user specified value N
> + * corresponds to the LFSR number that is N from the end of the sequence.
> + *
> + * To avoid the time to compute the LFSR, a lookup table is used.  The 24 bit
> + * LFSR sequence is broken into four ranges.  The spacing of the precomputed
> + * values is adjusted in each range so the error between the user specified
> + * number (N) of events between samples and the actual number of events based
> + * on the precomputed value will be less than about 6.2%.  Note, if the user
> + * specifies N < 2^16, the LFSR value that is 2^16 from the end will be used.
> + * This is to prevent the loss of samples because the trace buffer is full.
> + *
> + *	   User specified N		     Step between	   Index in
> + *					 precomputed values	 precomputed
> + *								    table
> + * 0		    to	2^16-1			----		      0
> + * 2^16	    to	2^16+2^19-1		2^12		    1 to 128
> + * 2^16+2^19	    to	2^16+2^19+2^22-1	2^15		  129 to 256
> + * 2^16+2^19+2^22  to	2^24-1			2^18		  257 to 302
> + *
> + *
> + * For example, the LFSR values in the second range are computed for 2^16,
> + * 2^16+2^12, ... , 2^19-2^16, 2^19 and stored in the table at indices
> + * 1, 2,..., 127, 128.
> + *
> + * The 24 bit LFSR value for the nth number in the sequence can be
> + * calculated using the following code:
> + *
> + * #define size 24
> + * int calculate_lfsr(int n)
> + * {
> + *	int i;
> + *	unsigned int newlfsr0;
> + *	unsigned int lfsr = 0xFFFFFF;
> + *	unsigned int howmany = n;
> + *
> + *	for (i = 2; i < howmany + 2; i++) {
> + *		newlfsr0 = (((lfsr >> (size - 1 - 0)) & 1) ^
> + *		((lfsr >> (size - 1 - 1)) & 1) ^
> + *		(((lfsr >> (size - 1 - 6)) & 1) ^
> + *		((lfsr >> (size - 1 - 23)) & 1)));
> + *
> + *		lfsr >>= 1;
> + *		lfsr = lfsr | (newlfsr0 << (size - 1));
> + *	}
> + *	return lfsr;
> + * }
> + */
> +
> +#define V2_16  (0x1 << 16)
> +#define V2_19  (0x1 << 19)
> +#define V2_22  (0x1 << 22)
> +
> +static int calculate_lfsr(int n)
> +{
> +	/*
> +	 * The ranges and steps are in powers of 2 so the calculations
> +	 * can be done using shifts rather than divide.
> +	 */
> +	int index;
> +
> +	if ((n >> 16) == 0)
> +		index = 0;
> +	else if (((n - V2_16) >> 19) == 0)
> +		index = ((n - V2_16) >> 12) + 1;
> +	else if (((n - V2_16 - V2_19) >> 22) == 0)
> +		index = ((n - V2_16 - V2_19) >> 15) + 1 + 128;
> +	else if (((n - V2_16 - V2_19 - V2_22) >> 24) == 0)
> +		index = ((n - V2_16 - V2_19 - V2_22) >> 18) + 1 + 256;
> +	else
> +		index = ENTRIES-1;
> +
> +	/* make sure index is valid */
> +	if ((index > ENTRIES) || (index < 0))
> +		index = ENTRIES-1;
> +
> +	return initial_lfsr[index];
> +}
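
The range-to-index mapping used by the table lookup above can be
checked on its own with a sketch like the one below.  It is a
userspace illustration, not the patch code: lfsr_table_index() is a
hypothetical name, it returns the table index rather than the LFSR
value, and it uses unsigned arithmetic.  It also clamps with >= so
that an index equal to the table size is caught, where the quoted
function compares with >.

```c
#include <assert.h>

#define ENTRIES 303u
#define V2_16   (1u << 16)
#define V2_19   (1u << 19)
#define V2_22   (1u << 22)

/* Map a user specified count n to an index into the 303 entry table. */
static unsigned int lfsr_table_index(unsigned int n)
{
	unsigned int index;

	if (n < V2_16)
		index = 0;                               /* 0 .. 2^16-1 */
	else if (n - V2_16 < V2_19)
		index = ((n - V2_16) >> 12) + 1;         /* step 2^12 */
	else if (n - V2_16 - V2_19 < V2_22)
		index = ((n - V2_16 - V2_19) >> 15) + 129; /* step 2^15 */
	else
		index = ((n - V2_16 - V2_19 - V2_22) >> 18) + 257; /* 2^18 */

	/* Never index past the last table entry. */
	if (index >= ENTRIES)
		index = ENTRIES - 1;

	return index;
}

/* Usage example: the first value of each range from the comment table. */
static void lfsr_index_example(void)
{
	assert(lfsr_table_index(0) == 0);
	assert(lfsr_table_index(V2_16) == 1);
	assert(lfsr_table_index(V2_16 + V2_19) == 129);
	assert(lfsr_table_index(V2_16 + V2_19 + V2_22) == 257);
}
```

The range boundaries match the table in the comment: indices 1 to 128,
129 to 256, and 257 to 302 for counts up to 2^24-1.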
> +
> +static int pm_rtas_activate_spu_profiling(u32 node)
> +{
> +	int ret, i;
> +	struct pm_signal pm_signal_local[NR_PHYS_CTRS];
> +
> +	/*
> +	 * Set up the rtas call to configure the debug bus to
> +	 * route the SPU PCs.  Setup the pm_signal for each SPU
> +	 */
> +	for (i = 0; i < NUM_SPUS_PER_NODE; i++) {
> +		pm_signal_local[i].cpu = node;
> +		pm_signal_local[i].signal_group = 41;
> +		/* spu i on word (i/2) */
> +		pm_signal_local[i].bus_word = 1 << i / 2;
> +		/* spu i */
> +		pm_signal_local[i].sub_unit = i;
> +		pm_signal_local[i].bit = 63;
> +	}
> +
> +	ret = rtas_ibm_cbe_perftools(SUBFUNC_ACTIVATE,
> +				     PASSTHRU_ENABLE, pm_signal_local,
> +				     (NUM_SPUS_PER_NODE
> +				      * sizeof(struct pm_signal)));
> +
> +	if (unlikely(ret)) {
> +		printk(KERN_WARNING "%s: rtas returned: %d\n",
> +		       __func__, ret);
> +		return -EIO;
> +	}
> +
> +	return 0;
> +}
> +
> +#ifdef CONFIG_CPU_FREQ
> +static int
> +oprof_cpufreq_notify(struct notifier_block *nb, unsigned long val, void *data)
> +{
> +	int ret = 0;
> +	struct cpufreq_freqs *frq = data;
> +	if ((val == CPUFREQ_PRECHANGE && frq->old < frq->new) ||
> +	    (val == CPUFREQ_POSTCHANGE && frq->old > frq->new) ||
> +	    (val == CPUFREQ_RESUMECHANGE || val == CPUFREQ_SUSPENDCHANGE))
> +		set_spu_profiling_frequency(frq->new, spu_cycle_reset);
> +	return ret;
> +}
> +
> +static struct notifier_block cpu_freq_notifier_block = {
> +	.notifier_call	= oprof_cpufreq_notify
> +};
> +#endif
> +
> +
> +/*
> + * Note the generic OProfile stop calls do not support returning
> + * an error on stop.  Hence, will not return an error if the FW
> + * calls fail on stop.	Failure to reset the debug bus is not an issue.
> + * Failure to disable the SPU profiling is not an issue.  The FW calls
> + * to enable the performance counters and debug bus will work even if
> + * the hardware was not cleanly reset.
> + */
> +static void cell_global_stop_spu(void)
> +{
> +	int subfunc, rtn_value;
> +	unsigned int lfsr_value;
> +	int cpu;
> +
> +	oprofile_running = 0;
> +
> +#ifdef CONFIG_CPU_FREQ
> +	cpufreq_unregister_notifier(&cpu_freq_notifier_block,
> +				    CPUFREQ_TRANSITION_NOTIFIER);
> +#endif
> +
> +	for_each_online_cpu(cpu) {
> +		if (cbe_get_hw_thread_id(cpu))
> +			continue;
> +
> +		subfunc = 3;	/*
> +				 * 2 - activate SPU tracing,
> +				 * 3 - deactivate
> +				 */
> +		lfsr_value = 0x8f100000;
> +
> +		rtn_value = rtas_call(spu_rtas_token, 3, 1, NULL,
> +				      subfunc, cbe_cpu_to_node(cpu),
> +				      lfsr_value);
> +
> +		if (unlikely(rtn_value != 0)) {
> +			printk(KERN_ERR
> +			       "%s: rtas call ibm, "
> +			       "cbe-spu-perftools failed, return = %d\n",
> +			       __func__, rtn_value);
> +		}
> +
> +		/* Deactivate the signals */
> +		pm_rtas_reset_signals(cbe_cpu_to_node(cpu));
> +	}
> +
> +	stop_spu_profiling();
> +}
> +
> +static void cell_global_stop_ppu(void)
> +{
> +	int cpu;
> +
> +	/*
> +	 * This routine will be called once for the system.
> +	 * There is one performance monitor per node, so we
> +	 * only need to perform this function once per node.
> +	 */
> +	del_timer_sync(&oprofile_timer_virt_cntr);
> +	oprofile_running = 0;
> +	/* complete the previous store */
> +	smp_wmb();
> +
> +	for_each_online_cpu(cpu) {
> +		if (cbe_get_hw_thread_id(cpu))
> +			continue;
> +		cbe_sync_irq(cbe_cpu_to_node(cpu));
> +		/* Stop the counters */
> +		cbe_disable_pm(cpu);
> +
> +		/* Deactivate interrupts */
> +		cbe_disable_pm_interrupts(cpu);
> +	}
> +}
> +
> +
> +static int cell_global_start_spu(struct op_counter_config *ctr)
> +{
> +	int subfunc;
> +	unsigned int lfsr_value;
> +	int cpu;
> +	int ret;
> +	int rtas_error;
> +	unsigned int cpu_khzfreq = 0;
> +
> +	/* The SPU profiling uses time-based profiling based on
> +	 * cpu frequency, so if configured with the CPU_FREQ
> +	 * option, we should detect frequency changes and react
> +	 * accordingly.
> +	 */
> +#ifdef CONFIG_CPU_FREQ
> +	ret = cpufreq_register_notifier(&cpu_freq_notifier_block,
> +					CPUFREQ_TRANSITION_NOTIFIER);
> +	if (ret < 0)
> +		/* this is not a fatal error */
> +		printk(KERN_ERR "CPU freq change registration failed: %d\n",
> +		       ret);
> +
> +	else
> +		cpu_khzfreq = cpufreq_quick_get(smp_processor_id());
> +#endif
> +
> +	set_spu_profiling_frequency(cpu_khzfreq, spu_cycle_reset);
> +
> +	for_each_online_cpu(cpu) {
> +		if (cbe_get_hw_thread_id(cpu))
> +			continue;
> +
> +		/*
> +		 * Setup SPU cycle-based profiling.
> +		 * Set perf_mon_control bit 0 to a zero before
> +		 * enabling spu collection hardware.
> +		 */
> +		cbe_write_pm(cpu, pm_control, 0);
> +
> +		if (spu_cycle_reset > MAX_SPU_COUNT)
> +			/* use largest possible value */
> +			lfsr_value = calculate_lfsr(MAX_SPU_COUNT-1);
> +		else
> +			lfsr_value = calculate_lfsr(spu_cycle_reset);
> +
> +		/* must use a non zero value. Zero disables data collection. */
> +		if (lfsr_value == 0)
> +			lfsr_value = calculate_lfsr(1);
> +
> +		lfsr_value = lfsr_value << 8; /* shift lfsr to correct
> +						* register location
> +						*/
> +
> +		/* debug bus setup */
> +		ret = pm_rtas_activate_spu_profiling(cbe_cpu_to_node(cpu));
> +
> +		if (unlikely(ret)) {
> +			rtas_error = ret;
> +			goto out;
> +		}
> +
> +
> +		subfunc = 2;	/* 2 - activate SPU tracing, 3 - deactivate */
> +
> +		/* start profiling */
> +		ret = rtas_call(spu_rtas_token, 3, 1, NULL, subfunc,
> +		  cbe_cpu_to_node(cpu), lfsr_value);
> +
> +		if (unlikely(ret != 0)) {
> +			printk(KERN_ERR
> +			       "%s: rtas call ibm,cbe-spu-perftools "
> +			       "failed, return = %d\n",
> +			       __func__, ret);
> +			rtas_error = -EIO;
> +			goto out;
> +		}
> +	}
> +
> +	rtas_error = start_spu_profiling(spu_cycle_reset);
> +	if (rtas_error)
> +		goto out_stop;
> +
> +	oprofile_running = 1;
> +	return 0;
> +
> +out_stop:
> +	cell_global_stop_spu();		/* clean up the PMU/debug bus */
> +out:
> +	return rtas_error;
> +}
> +
> +
> +
> +
> +
> +
> +struct pmu_ops pmu_ops_cell = {
> +	.read_phys_ctr               = cbe_read_phys_ctr,
> +	.write_phys_ctr              = cbe_write_phys_ctr,
> +	.read_ctr                    = cbe_read_ctr,
> +	.write_ctr	             = cbe_write_ctr,
> +	.read_pm07_control           = cbe_read_pm07_control,
> +	.write_pm07_control          = cbe_write_pm07_control,
> +	.read_pm                     = cbe_read_pm,
> +	.write_pm                    = cbe_write_pm,
> +	.get_ctr_size                = cbe_get_ctr_size,
> +	.set_ctr_size                = cbe_set_ctr_size,
> +	.enable_pm                   = cbe_enable_pm,
> +	.disable_pm                  = cbe_disable_pm,
> +	.read_trace_buffer           = cbe_read_trace_buffer,
> +	.get_and_clear_pm_interrupts = cbe_get_and_clear_pm_interrupts,
> +	.enable_pm_interrupts        = cbe_enable_pm_interrupts,
> +	.disable_pm_interrupts       = cbe_disable_pm_interrupts,
> +	.pmu_cpu_to_node             = cbe_cpu_to_node,
> +	.get_hw_thread_id            = cbe_get_hw_thread_id,
> +	.set_pm_event_part1          = cell_set_pm_event_part1,
> +	.set_pm_event_part2          = cell_set_pm_event_part2,
> +	.set_count_mode_var1         = CBE_COUNT_ALL_MODES,
> +	.set_count_mode_var2         = CBE_COUNT_HYPERVISOR_MODE,
> +	.virtual_cntr_part1          = cell_virtual_cntr_part1,
> +	.add_sample                  = cell_add_sample,
> +	.reg_setup_part1             = cell_reg_setup_part1,
> +	.reg_setup_part2             = cell_reg_setup_part2,
> +	.pm_activate_signals         = cell_pm_activate_signals,
> +	.global_start_spu            = cell_global_start_spu,
> +	.global_stop_spu             = cell_global_stop_spu,
> +	.global_stop_ppu             = cell_global_stop_ppu,
> +};
> +
> +static void cell_handle_interrupt(struct pt_regs *regs,
> +				struct op_counter_config *ctr)
> +{
> +	u32 cpu;
> +	u64 pc;
> +	int is_kernel;
> +	unsigned long flags = 0;
> +	u32 interrupt_mask;
> +	int i;
> +
> +	cpu = smp_processor_id();
> +
> +	/*
> +	 * Need to make sure the interrupt handler and the virt counter
> +	 * routine are not running at the same time. See the
> +	 * cell_virtual_cntr() routine for additional comments.
> +	 */
> +	spin_lock_irqsave(&oprofile_virt_cntr_lock, flags);
> +
> +	/*
> +	 * Need to disable and reenable the performance counters
> +	 * to get the desired behavior from the hardware.  This
> +	 * is hardware specific.
> +	 */
> +
> +	cbe_disable_pm(cpu);
> +
> +	interrupt_mask = cbe_get_and_clear_pm_interrupts(cpu);
> +
> +	/*
> +	 * If the interrupt mask has been cleared, then the virt cntr
> +	 * has cleared the interrupt.  When the thread that generated
> +	 * the interrupt is restored, the data count will be restored to
> +	 * 0xfffffff0 to cause the interrupt to be regenerated.
> +	 */
> +
> +	if ((oprofile_running == 1) && (interrupt_mask != 0)) {
> +		pc = regs->nip;
> +		is_kernel = is_kernel_addr(pc);
> +
> +		for (i = 0; i < oprofile_num_counters; ++i) {
> +			if ((interrupt_mask & CBE_PM_CTR_OVERFLOW_INTR(i))
> +			    && ctr[i].enabled) {
> +				oprofile_add_pc(pc, is_kernel, i);
> +				cbe_write_ctr(cpu, i, oprofile_reset_value[i]);
> +			}
> +		}
> +
> +		/*
> +		 * The counters were frozen by the interrupt.
> +		 * Reenable the interrupt and restart the counters.
> +		 * If there was a race between the interrupt handler and
> +		 * the virtual counter routine, the virtual counter
> +		 * routine may have cleared the interrupts.  Hence we must
> +		 * use oprofile_virt_cntr_inter_mask to re-enable them.
> +		 */
> +		cbe_enable_pm_interrupts(cpu, oprofile_hdw_thread,
> +					 oprofile_virt_cntr_inter_mask);
> +
> +		/*
> +		 * The writes to the various performance counters only write
> +		 * to a latch.  The new values (interrupt setting bits, reset
> +		 * counter value etc.) are not copied to the actual registers
> +		 * until the performance monitor is enabled.  In order to get
> +		 * this to work as desired, the performance monitor needs to
> +		 * be disabled while writing to the latches.  This is a
> +		 * HW design issue.
> +		 */
> +		cbe_enable_pm(cpu);
> +	}
> +	spin_unlock_irqrestore(&oprofile_virt_cntr_lock, flags);
> +}
> +
> +/*
> + * This function is called from the generic OProfile
> + * driver.  When profiling PPUs, we need to do the
> + * generic sync start; otherwise, do spu_sync_start.
> + */
> +static int cell_sync_start(void)
> +{
> +	if (spu_cycle_reset)
> +		return spu_sync_start();
> +	else
> +		return DO_GENERIC_SYNC;
> +}
> +
> +static int cell_sync_stop(void)
> +{
> +	if (spu_cycle_reset)
> +		return spu_sync_stop();
> +	else
> +		return 1;
> +}
> +
> +
> +
> +
> +struct op_powerpc_model op_model_cell = {
> +	.reg_setup = cell_reg_setup,
> +	.cpu_setup = cell_cpu_setup,
> +	.global_start = cell_global_start,
> +	.global_stop = cell_global_stop,
> +	.sync_start = cell_sync_start,
> +	.sync_stop = cell_sync_stop,
> +	.handle_interrupt = cell_handle_interrupt,
> +};
> +
> +
> +
> +
> +int __init cbe_init_pm_irq(void)
> +{
> +	unsigned int irq;
> +	int rc, node;
> +
> +	if (firmware_has_feature(FW_FEATURE_PS3_LV1))
> +		return -ENODEV;
> +	else {
> +		pmu_ops = &pmu_ops_cell;
> +		op_powerpc_model = &op_model_cell;
> +	}
> +	for_each_node(node) {
> +		irq = irq_create_mapping(NULL, IIC_IRQ_IOEX_PMI |
> +					       (node << IIC_IRQ_NODE_SHIFT));
> +		if (irq == NO_IRQ) {
> +			printk(KERN_ERR "ERROR: Unable to allocate "
> +			       "irq for node %d\n",
> +			       node);
> +			return -EINVAL;
> +		}
> +
> +		rc = request_irq(irq, cbe_pm_irq,
> +				 IRQF_DISABLED, "cbe-pmu-0", NULL);
> +		if (rc) {
> +			printk("ERROR: Request for irq on node %d failed\n",
> +			       node);
> +			return rc;
> +		} else if (oprofile_registration_ops)
> +			oprofile_registration_ops->oprofile_lock_reregister();
> +	}
> +
> +	return 0;
> +}
> +
> +void cbe_remove_pm_irq(void)
> +{
> +	unsigned int irq;
> +	int node;
> +
> +	if (pmu_ops == &pmu_ops_cell) {
> +		for_each_node(node) {
> +			irq = irq_find_mapping(NULL, IIC_IRQ_IOEX_PMI |
> +					     (node << IIC_IRQ_NODE_SHIFT));
> +			if (irq != NO_IRQ) {
> +				free_irq(irq, NULL);
> +/* FIXME DJB: is the irq_dispose_mapping advisable? */
> +#if 0
> +				irq_dispose_mapping(irq);
> +#endif
> +			}
> +		}
> +		if (oprofile_registration_ops)
> +			oprofile_registration_ops->oprofile_lock_exit();
> +
> +		pmu_ops = NULL;
> +		op_powerpc_model = NULL;
> +	}
> +}
> +
> +void cbe_sync_irq(int node)
> +{
> +	unsigned int irq;
> +
> +	irq = irq_find_mapping(NULL,
> +			       IIC_IRQ_IOEX_PMI
> +			       | (node << IIC_IRQ_NODE_SHIFT));
> +
> +	if (irq == NO_IRQ) {
> +		printk(KERN_WARNING "ERROR, unable to get existing irq %d " \
> +		"for node %d\n", irq, node);
> +		return;
> +	}
> +
> +	synchronize_irq(irq);
> +}
> --- a/arch/powerpc/oprofile/cell/spu_profiler.c
> +++ b/arch/powerpc/oprofile/cell/spu_profiler.c
> @@ -81,7 +81,7 @@ static void spu_pc_extract(int cpu, int 
>  	 * trace[1] SPU PC contents are: 4 5 6 7
>  	 */
>  
> -	cbe_read_trace_buffer(cpu, trace_buffer);
> +	pmu_ops->read_trace_buffer(cpu, trace_buffer);
>  
>  	for (spu = SPUS_PER_TB_ENTRY-1; spu >= 0; spu--) {
>  		/* spu PC trace entry is upper 16 bits of the
> @@ -106,7 +106,7 @@ static int cell_spu_pc_collection(int cp
>  
>  	entry = 0;
>  
> -	trace_addr = cbe_read_pm(cpu, trace_address);
> +	trace_addr = pmu_ops->read_pm(cpu, trace_address);
>  	while (!(trace_addr & CBE_PM_TRACE_BUF_EMPTY)) {
>  		/* there is data in the trace buffer to process */
>  		spu_pc_extract(cpu, entry);
> @@ -117,7 +117,7 @@ static int cell_spu_pc_collection(int cp
>  			/* spu_samples is full */
>  			break;
>  
> -		trace_addr = cbe_read_pm(cpu, trace_address);
> +		trace_addr = pmu_ops->read_pm(cpu, trace_address);
>  	}
>  
>  	return entry;
> @@ -133,10 +133,10 @@ static enum hrtimer_restart profile_spus
>  		goto stop;
>  
>  	for_each_online_cpu(cpu) {
> -		if (cbe_get_hw_thread_id(cpu))
> +		if (pmu_ops->get_hw_thread_id(cpu))
>  			continue;
>  
> -		node = cbe_cpu_to_node(cpu);
> +		node = pmu_ops->pmu_cpu_to_node(cpu);
>  
>  		/* There should only be one kernel thread at a time processing
>  		 * the samples.	 In the very unlikely case that the processing
> --- a/arch/powerpc/oprofile/cell/spu_task_sync.c
> +++ b/arch/powerpc/oprofile/cell/spu_task_sync.c
> @@ -353,7 +353,7 @@ static int number_of_online_nodes(void)
>          u32 cpu; u32 tmp;
>          int nodes = 0;
>          for_each_online_cpu(cpu) {
> -                tmp = cbe_cpu_to_node(cpu) + 1;
> +		tmp = pmu_ops->pmu_cpu_to_node(cpu) + 1;
>                  if (tmp > nodes)
>                          nodes++;
>          }
> --- a/arch/powerpc/oprofile/common.c
> +++ b/arch/powerpc/oprofile/common.c
> @@ -23,6 +23,8 @@
>  #include <asm/cputable.h>
>  #include <asm/oprofile_impl.h>
>  #include <asm/firmware.h>
> +#include <asm/cell-pmu.h>
> +#include <../drivers/oprofile/oprof.h>
>  
>  static struct op_powerpc_model *model;
>  
> @@ -170,6 +172,10 @@ static int op_powerpc_create_files(struc
>  
>  int __init oprofile_arch_init(struct oprofile_operations *ops)
>  {
> +#ifdef CONFIG_OPROFILE_CELL
> +	if (oprofile_registration_ops)
> +		oprofile_registration_ops->oprofile_lock_reregister();
> +#endif
>  	if (!cur_cpu_spec->oprofile_cpu_type)
>  		return -ENODEV;
>  
> @@ -180,11 +186,21 @@ int __init oprofile_arch_init(struct opr
>  #ifdef CONFIG_PPC64
>  #ifdef CONFIG_OPROFILE_CELL
>  		case PPC_OPROFILE_CELL:
> -			if (firmware_has_feature(FW_FEATURE_LPAR))
> +			pr_debug("%s:%d: \n", __func__, __LINE__);
> +			if (firmware_has_feature(FW_FEATURE_LPAR) &&
> +				!firmware_has_feature(FW_FEATURE_PS3_LV1)) {
> +				pr_debug("%s:%d: \n", __func__, __LINE__);
> +				return -ENODEV;
> +			}
> +			pr_debug("%s:%d: \n", __func__, __LINE__);
> +			if (!op_powerpc_model)
>  				return -ENODEV;
> -			model = &op_model_cell;
> +			model = op_powerpc_model;
>  			ops->sync_start = model->sync_start;
>  			ops->sync_stop = model->sync_stop;
> +#ifdef CONFIG_PPC_CELL_NATIVE
> +			cbe_init_pm_irq();
> +#endif
>  			break;
>  #endif
>  		case PPC_OPROFILE_RS64:
> @@ -208,8 +224,10 @@ int __init oprofile_arch_init(struct opr
>  			break;
>  #endif
>  		default:
> +			pr_debug("%s:%d: \n", __func__, __LINE__);
>  			return -ENODEV;
>  	}
> +	pr_debug("%s:%d: \n", __func__, __LINE__);
>  
>  	model->num_counters = cur_cpu_spec->num_pmcs;
>  
> @@ -229,4 +247,7 @@ int __init oprofile_arch_init(struct opr
>  
>  void oprofile_arch_exit(void)
>  {
> +#ifdef CONFIG_PPC_CELL_NATIVE
> +       cbe_remove_pm_irq();
> +#endif
>  }
> --- a/arch/powerpc/oprofile/op_model_cell.c
> +++ b/arch/powerpc/oprofile/op_model_cell.c
> @@ -2,11 +2,14 @@
>   * Cell Broadband Engine OProfile Support
>   *
>   * (C) Copyright IBM Corporation 2006
> + * Copyright (C) 2007 Sony Computer Entertainment Inc.
> + * Copyright 2007 Sony Corporation.
>   *
>   * Author: David Erb (djerb at us.ibm.com)
>   * Modifications:
>   *	   Carl Love <carll at us.ibm.com>
>   *	   Maynard Johnson <maynardj at us.ibm.com>
> + *         D.J. Barrow <denis.barrow at eu.sony.com>
>   *
>   * This program is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU General Public License
> @@ -26,45 +29,26 @@
>  #include <linux/timer.h>
>  #include <asm/cell-pmu.h>
>  #include <asm/cputable.h>
> -#include <asm/firmware.h>
> -#include <asm/io.h>
> -#include <asm/oprofile_impl.h>
>  #include <asm/processor.h>
>  #include <asm/prom.h>
> -#include <asm/ptrace.h>
>  #include <asm/reg.h>
> -#include <asm/rtas.h>
>  #include <asm/system.h>
>  #include <asm/cell-regs.h>
> -
>  #include "../platforms/cell/interrupt.h"
>  #include "cell/pr_util.h"
>  
> -static void cell_global_stop_spu(void);
>  
> +struct pmu_ops *pmu_ops;
> +EXPORT_SYMBOL_GPL(pmu_ops);
> +struct op_powerpc_model *op_powerpc_model;
> +EXPORT_SYMBOL_GPL(op_powerpc_model);
>  /*
>   * spu_cycle_reset is the number of cycles between samples.
>   * This variable is used for SPU profiling and should ONLY be set
>   * at the beginning of cell_reg_setup; otherwise, it's read-only.
>   */
> -static unsigned int spu_cycle_reset;
> -
> -#define NUM_SPUS_PER_NODE    8
> -#define SPU_CYCLES_EVENT_NUM 2	/*  event number for SPU_CYCLES */
> -
> -#define PPU_CYCLES_EVENT_NUM 1	/*  event number for CYCLES */
> -#define PPU_CYCLES_GRP_NUM   1	/* special group number for identifying
> -				 * PPU_CYCLES event
> -				 */
> -#define CBE_COUNT_ALL_CYCLES 0x42800000 /* PPU cycle event specifier */
> -
> -#define NUM_THREADS 2         /* number of physical threads in
> -			       * physical processor
> -			       */
> -#define NUM_DEBUG_BUS_WORDS 4
> -#define NUM_INPUT_BUS_WORDS 2
> -
> -#define MAX_SPU_COUNT 0xFFFFFF	/* maximum 24 bit LFSR value */
> +unsigned int spu_cycle_reset;
> +EXPORT_SYMBOL_GPL(spu_cycle_reset);
>  
>  struct pmc_cntrl_data {
>  	unsigned long vcntr;
> @@ -73,31 +57,7 @@ struct pmc_cntrl_data {
>  	unsigned long enabled;
>  };
>  
> -/*
> - * ibm,cbe-perftools rtas parameters
> - */
> -struct pm_signal {
> -	u16 cpu;		/* Processor to modify */
> -	u16 sub_unit;		/* hw subunit this applies to (if applicable)*/
> -	short int signal_group; /* Signal Group to Enable/Disable */
> -	u8 bus_word;		/* Enable/Disable on this Trace/Trigger/Event
> -				 * Bus Word(s) (bitmask)
> -				 */
> -	u8 bit;			/* Trigger/Event bit (if applicable) */
> -};
>  
> -/*
> - * rtas call arguments
> - */
> -enum {
> -	SUBFUNC_RESET = 1,
> -	SUBFUNC_ACTIVATE = 2,
> -	SUBFUNC_DEACTIVATE = 3,
> -
> -	PASSTHRU_IGNORE = 0,
> -	PASSTHRU_ENABLE = 1,
> -	PASSTHRU_DISABLE = 2,
> -};
>  
>  struct pm_cntrl {
>  	u16 enable;
> @@ -121,9 +81,10 @@ static struct {
>  #define GET_COUNT_CYCLES(x) (x & 0x00000001)
>  #define GET_INPUT_CONTROL(x) ((x & 0x00000004) >> 2)
>  
> -static DEFINE_PER_CPU(unsigned long[NR_PHYS_CTRS], pmc_values);
> +DEFINE_PER_CPU(unsigned long[NR_PHYS_CTRS], pmc_values);
> +EXPORT_SYMBOL_GPL(per_cpu_var(pmc_values));
>  
> -static struct pmc_cntrl_data pmc_cntrl[NUM_THREADS][NR_PHYS_CTRS];
> +static struct pmc_cntrl_data pmc_cntrl[PPU_NUM_THREADS][NR_PHYS_CTRS];
>  
>  /*
>   * The CELL profiling code makes rtas calls to setup the debug bus to
> @@ -141,129 +102,43 @@ static struct pmc_cntrl_data pmc_cntrl[N
>   */
>  
>  /*
> - * Interpetation of hdw_thread:
> + * Interpretation of oprofile_hdw_thread:
>   * 0 - even virtual cpus 0, 2, 4,...
>   * 1 - odd virtual cpus 1, 3, 5, ...
>   *
>   * FIXME: this is strictly wrong, we need to clean this up in a number
>   * of places. It works for now. -arnd
>   */
> -static u32 hdw_thread;
> +u32 oprofile_hdw_thread;
> +EXPORT_SYMBOL_GPL(oprofile_hdw_thread);
>  
> -static u32 virt_cntr_inter_mask;
> -static struct timer_list timer_virt_cntr;
> +u32 oprofile_virt_cntr_inter_mask;
> +EXPORT_SYMBOL_GPL(oprofile_virt_cntr_inter_mask);
> +struct timer_list oprofile_timer_virt_cntr;
> +EXPORT_SYMBOL_GPL(oprofile_timer_virt_cntr);
>  
>  /*
>   * pm_signal needs to be global since it is initialized in
>   * cell_reg_setup at the time when the necessary information
>   * is available.
>   */
> -static struct pm_signal pm_signal[NR_PHYS_CTRS];
> -static int pm_rtas_token;    /* token for debug bus setup call */
> -static int spu_rtas_token;   /* token for SPU cycle profiling */
> -
> -static u32 reset_value[NR_PHYS_CTRS];
> -static int num_counters;
> -static int oprofile_running;
> -static DEFINE_SPINLOCK(virt_cntr_lock);
> +struct pm_signal pm_signal[NR_PHYS_CTRS];
> +EXPORT_SYMBOL_GPL(pm_signal);
> +u32 oprofile_reset_value[NR_PHYS_CTRS];
> +EXPORT_SYMBOL_GPL(oprofile_reset_value);
> +
> +int oprofile_num_counters;
> +EXPORT_SYMBOL_GPL(oprofile_num_counters);
> +int oprofile_running;
> +EXPORT_SYMBOL_GPL(oprofile_running);
> +DEFINE_SPINLOCK(oprofile_virt_cntr_lock);
> +EXPORT_SYMBOL_GPL(oprofile_virt_cntr_lock);
>  
>  static u32 ctr_enabled;
>  
>  static unsigned char input_bus[NUM_INPUT_BUS_WORDS];
>  
>  /*
> - * Firmware interface functions
> - */
> -static int
> -rtas_ibm_cbe_perftools(int subfunc, int passthru,
> -		       void *address, unsigned long length)
> -{
> -	u64 paddr = __pa(address);
> -
> -	return rtas_call(pm_rtas_token, 5, 1, NULL, subfunc,
> -			 passthru, paddr >> 32, paddr & 0xffffffff, length);
> -}
> -
> -static void pm_rtas_reset_signals(u32 node)
> -{
> -	int ret;
> -	struct pm_signal pm_signal_local;
> -
> -	/*
> -	 * The debug bus is being set to the passthru disable state.
> -	 * However, the FW still expects atleast one legal signal routing
> -	 * entry or it will return an error on the arguments.	If we don't
> -	 * supply a valid entry, we must ignore all return values.  Ignoring
> -	 * all return values means we might miss an error we should be
> -	 * concerned about.
> -	 */
> -
> -	/*  fw expects physical cpu #. */
> -	pm_signal_local.cpu = node;
> -	pm_signal_local.signal_group = 21;
> -	pm_signal_local.bus_word = 1;
> -	pm_signal_local.sub_unit = 0;
> -	pm_signal_local.bit = 0;
> -
> -	ret = rtas_ibm_cbe_perftools(SUBFUNC_RESET, PASSTHRU_DISABLE,
> -				     &pm_signal_local,
> -				     sizeof(struct pm_signal));
> -
> -	if (unlikely(ret))
> -		/*
> -		 * Not a fatal error. For Oprofile stop, the oprofile
> -		 * functions do not support returning an error for
> -		 * failure to stop OProfile.
> -		 */
> -		printk(KERN_WARNING "%s: rtas returned: %d\n",
> -		       __FUNCTION__, ret);
> -}
> -
> -static int pm_rtas_activate_signals(u32 node, u32 count)
> -{
> -	int ret;
> -	int i, j;
> -	struct pm_signal pm_signal_local[NR_PHYS_CTRS];
> -
> -	/*
> -	 * There is no debug setup required for the cycles event.
> -	 * Note that only events in the same group can be used.
> -	 * Otherwise, there will be conflicts in correctly routing
> -	 * the signals on the debug bus.  It is the responsiblity
> -	 * of the OProfile user tool to check the events are in
> -	 * the same group.
> -	 */
> -	i = 0;
> -	for (j = 0; j < count; j++) {
> -		if (pm_signal[j].signal_group != PPU_CYCLES_GRP_NUM) {
> -
> -			/* fw expects physical cpu # */
> -			pm_signal_local[i].cpu = node;
> -			pm_signal_local[i].signal_group
> -				= pm_signal[j].signal_group;
> -			pm_signal_local[i].bus_word = pm_signal[j].bus_word;
> -			pm_signal_local[i].sub_unit = pm_signal[j].sub_unit;
> -			pm_signal_local[i].bit = pm_signal[j].bit;
> -			i++;
> -		}
> -	}
> -
> -	if (i != 0) {
> -		ret = rtas_ibm_cbe_perftools(SUBFUNC_ACTIVATE, PASSTHRU_ENABLE,
> -					     pm_signal_local,
> -					     i * sizeof(struct pm_signal));
> -
> -		if (unlikely(ret)) {
> -			printk(KERN_WARNING "%s: rtas returned: %d\n",
> -			       __FUNCTION__, ret);
> -			return -EIO;
> -		}
> -	}
> -
> -	return 0;
> -}
> -
> -/*
>   * PM Signal functions
>   */
>  static void set_pm_event(u32 ctr, int event, u32 unit_mask)
> @@ -298,7 +173,7 @@ static void set_pm_event(u32 ctr, int ev
>  	p->signal_group = event / 100;
>  	p->bus_word = bus_word;
>  	p->sub_unit = GET_SUB_UNIT(unit_mask);
> -
> +	pmu_ops->set_pm_event_part1(p);
>  	pm_regs.pm07_cntrl[ctr] = 0;
>  	pm_regs.pm07_cntrl[ctr] |= PM07_CTR_COUNT_CYCLES(count_cycles);
>  	pm_regs.pm07_cntrl[ctr] |= PM07_CTR_POLARITY(polarity);
> @@ -324,6 +199,8 @@ static void set_pm_event(u32 ctr, int ev
>  
>  		if ((bus_type == 0) && p->signal_group >= 60)
>  			bus_type = 2;
> +		pmu_ops->set_pm_event_part2(p, &bus_type);
> +
>  		if ((bus_type == 1) && p->signal_group >= 50)
>  			bus_type = 0;
>  
> @@ -349,6 +226,9 @@ static void set_pm_event(u32 ctr, int ev
>  			}
>  		}
>  	}
> +	OP_DBG("pm07_ctrl[%d] : 0x%x", ctr, pm_regs.pm07_cntrl[ctr]);
> +	OP_DBG("group_control : 0x%x", pm_regs.group_control);
> +	OP_DBG("debug_bus_control : 0x%x", pm_regs.debug_bus_control);
>  out:
>  	;
>  }
> @@ -378,20 +258,26 @@ static void write_pm_cntrl(int cpu)
>  	 * the count mode based on the user selection of user and kernel.
>  	 */
>  	val |= CBE_PM_COUNT_MODE_SET(pm_regs.pm_cntrl.count_mode);
> -	cbe_write_pm(cpu, pm_control, val);
> +	pmu_ops->write_pm(cpu, pm_control, val);
>  }
>  
>  static inline void
>  set_count_mode(u32 kernel, u32 user)
>  {
> +	OP_DBG("set_count_mode k:%d u:%d", kernel, user);
>  	/*
>  	 * The user must specify user and kernel if they want them. If
>  	 *  neither is specified, OProfile will count in hypervisor mode.
>  	 *  pm_regs.pm_cntrl is a global
> +	 *
> +	 *  NOTE: The PS3 hypervisor rejects ALL_MODES and
> +	 *        HYPERVISOR_MODE, so on PS3 both are changed to
> +	 *        PROBLEM_MODE.
>  	 */
>  	if (kernel) {
>  		if (user)
> -			pm_regs.pm_cntrl.count_mode = CBE_COUNT_ALL_MODES;
> +			pm_regs.pm_cntrl.count_mode =
> +				pmu_ops->set_count_mode_var1;
>  		else
>  			pm_regs.pm_cntrl.count_mode =
>  				CBE_COUNT_SUPERVISOR_MODE;
> @@ -400,7 +286,7 @@ set_count_mode(u32 kernel, u32 user)
>  			pm_regs.pm_cntrl.count_mode = CBE_COUNT_PROBLEM_MODE;
>  		else
>  			pm_regs.pm_cntrl.count_mode =
> -				CBE_COUNT_HYPERVISOR_MODE;
> +				pmu_ops->set_count_mode_var2;
>  	}
>  }
>  
> @@ -408,9 +294,10 @@ static inline void enable_ctr(u32 cpu, u
>  {
>  
>  	pm07_cntrl[ctr] |= CBE_PM_CTR_ENABLE;
> -	cbe_write_pm07_control(cpu, ctr, pm07_cntrl[ctr]);
> +	pmu_ops->write_pm07_control(cpu, ctr, pm07_cntrl[ctr]);
>  }
>  
> +
>  /*
>   * Oprofile is expected to collect data on all CPUs simultaneously.
>   * However, there is one set of performance counters per node.	There are
> @@ -441,13 +328,13 @@ static void cell_virtual_cntr(unsigned l
>  	 * not both playing with the counters on the same node.
>  	 */
>  
> -	spin_lock_irqsave(&virt_cntr_lock, flags);
> +	spin_lock_irqsave(&oprofile_virt_cntr_lock, flags);
>  
> -	prev_hdw_thread = hdw_thread;
> +	prev_hdw_thread = oprofile_hdw_thread;
>  
>  	/* switch the cpu handling the interrupts */
> -	hdw_thread = 1 ^ hdw_thread;
> -	next_hdw_thread = hdw_thread;
> +	oprofile_hdw_thread = 1 ^ oprofile_hdw_thread;
> +	next_hdw_thread = oprofile_hdw_thread;
>  
>  	pm_regs.group_control = 0;
>  	pm_regs.debug_bus_control = 0;
> @@ -459,7 +346,7 @@ static void cell_virtual_cntr(unsigned l
>  	 * There are some per thread events.  Must do the
>  	 * set event, for the thread that is being started
>  	 */
> -	for (i = 0; i < num_counters; i++)
> +	for (i = 0; i < oprofile_num_counters; i++)
>  		set_pm_event(i,
>  			pmc_cntrl[next_hdw_thread][i].evnts,
>  			pmc_cntrl[next_hdw_thread][i].masks);
> @@ -469,45 +356,31 @@ static void cell_virtual_cntr(unsigned l
>  	 * we need cpu #, not node #, to pass to the cbe_xxx functions.
>  	 */
>  	for_each_online_cpu(cpu) {
> -		if (cbe_get_hw_thread_id(cpu))
> +		if (pmu_ops->get_hw_thread_id(cpu))
>  			continue;
>  
>  		/*
>  		 * stop counters, save counter values, restore counts
>  		 * for previous thread
>  		 */
> -		cbe_disable_pm(cpu);
> -		cbe_disable_pm_interrupts(cpu);
> -		for (i = 0; i < num_counters; i++) {
> -			per_cpu(pmc_values, cpu + prev_hdw_thread)[i]
> -			    = cbe_read_ctr(cpu, i);
> -
> -			if (per_cpu(pmc_values, cpu + next_hdw_thread)[i]
> -			    == 0xFFFFFFFF)
> -				/* If the cntr value is 0xffffffff, we must
> -				 * reset that to 0xfffffff0 when the current
> -				 * thread is restarted.	 This will generate a
> -				 * new interrupt and make sure that we never
> -				 * restore the counters to the max value.  If
> -				 * the counters were restored to the max value,
> -				 * they do not increment and no interrupts are
> -				 * generated.  Hence no more samples will be
> -				 * collected on that cpu.
> -				 */
> -				cbe_write_ctr(cpu, i, 0xFFFFFFF0);
> -			else
> -				cbe_write_ctr(cpu, i,
> -					      per_cpu(pmc_values,
> -						      cpu +
> -						      next_hdw_thread)[i]);
> -		}
>  
> +		pmu_ops->disable_pm(cpu);
> +		pmu_ops->disable_pm_interrupts(cpu);
> +		pmu_ops->virtual_cntr_part1(oprofile_num_counters,
> +					    prev_hdw_thread,
> +					    next_hdw_thread, cpu);
> +		/*
> +		 * Add sample data here, because the PS3 hypervisor
> +		 * does not have the performance monitor interrupt
> +		 * feature.
> +		 */
> +		pmu_ops->add_sample(cpu);
>  		/*
>  		 * Switch to the other thread. Change the interrupt
>  		 * and control regs to be scheduled on the CPU
>  		 * corresponding to the thread to execute.
>  		 */
> -		for (i = 0; i < num_counters; i++) {
> +		for (i = 0; i < oprofile_num_counters; i++) {
>  			if (pmc_cntrl[next_hdw_thread][i].enabled) {
>  				/*
>  				 * There are some per thread events.
> @@ -517,70 +390,42 @@ static void cell_virtual_cntr(unsigned l
>  				enable_ctr(cpu, i,
>  					   pm_regs.pm07_cntrl);
>  			} else {
> -				cbe_write_pm07_control(cpu, i, 0);
> +				pmu_ops->write_pm07_control(cpu, i, 0);
>  			}
>  		}
>  
>  		/* Enable interrupts on the CPU thread that is starting */
> -		cbe_enable_pm_interrupts(cpu, next_hdw_thread,
> -					 virt_cntr_inter_mask);
> -		cbe_enable_pm(cpu);
> +		pmu_ops->enable_pm_interrupts(cpu, next_hdw_thread,
> +					 oprofile_virt_cntr_inter_mask);
> +		pmu_ops->enable_pm(cpu);
>  	}
>  
> -	spin_unlock_irqrestore(&virt_cntr_lock, flags);
> +	spin_unlock_irqrestore(&oprofile_virt_cntr_lock, flags);
>  
> -	mod_timer(&timer_virt_cntr, jiffies + HZ / 10);
> +	mod_timer(&oprofile_timer_virt_cntr, jiffies + HZ / 10);
>  }
>  
>  static void start_virt_cntrs(void)
>  {
> -	init_timer(&timer_virt_cntr);
> -	timer_virt_cntr.function = cell_virtual_cntr;
> -	timer_virt_cntr.data = 0UL;
> -	timer_virt_cntr.expires = jiffies + HZ / 10;
> -	add_timer(&timer_virt_cntr);
> +	init_timer(&oprofile_timer_virt_cntr);
> +	oprofile_timer_virt_cntr.function = cell_virtual_cntr;
> +	oprofile_timer_virt_cntr.data = 0UL;
> +	oprofile_timer_virt_cntr.expires = jiffies + HZ / 10;
> +	add_timer(&oprofile_timer_virt_cntr);
>  }
>  
>  /* This function is called once for all cpus combined */
> -static int cell_reg_setup(struct op_counter_config *ctr,
> +int cell_reg_setup(struct op_counter_config *ctr,
>  			struct op_system_config *sys, int num_ctrs)
>  {
>  	int i, j, cpu;
> -	spu_cycle_reset = 0;
> -
> -	if (ctr[0].event == SPU_CYCLES_EVENT_NUM) {
> -		spu_cycle_reset = ctr[0].count;
> -
> -		/*
> -		 * Each node will need to make the rtas call to start
> -		 * and stop SPU profiling.  Get the token once and store it.
> -		 */
> -		spu_rtas_token = rtas_token("ibm,cbe-spu-perftools");
> -
> -		if (unlikely(spu_rtas_token == RTAS_UNKNOWN_SERVICE)) {
> -			printk(KERN_ERR
> -			       "%s: rtas token ibm,cbe-spu-perftools unknown\n",
> -			       __FUNCTION__);
> -			return -EIO;
> -		}
> -	}
> -
> -	pm_rtas_token = rtas_token("ibm,cbe-perftools");
> +	int ret;
>  
> -	/*
> -	 * For all events excetp PPU CYCLEs, each node will need to make
> -	 * the rtas cbe-perftools call to setup and reset the debug bus.
> -	 * Make the token lookup call once and store it in the global
> -	 * variable pm_rtas_token.
> -	 */
> -	if (unlikely(pm_rtas_token == RTAS_UNKNOWN_SERVICE)) {
> -		printk(KERN_ERR
> -		       "%s: rtas token ibm,cbe-perftools unknown\n",
> -		       __FUNCTION__);
> -		return -EIO;
> -	}
> +	ret = pmu_ops->reg_setup_part1(ctr);
> +	if (ret)
> +		return ret;
>  
> -	num_counters = num_ctrs;
> +	oprofile_num_counters = num_ctrs;
>  
>  	pm_regs.group_control = 0;
>  	pm_regs.debug_bus_control = 0;
> @@ -634,11 +479,10 @@ static int cell_reg_setup(struct op_coun
>  	 * which will give us "count" until overflow.
>  	 * Then we set the events on the enabled counters.
>  	 */
> -	for (i = 0; i < num_counters; ++i) {
> +	for (i = 0; i < oprofile_num_counters; ++i) {
>  		/* start with virtual counter set 0 */
>  		if (pmc_cntrl[0][i].enabled) {
> -			/* Using 32bit counters, reset max - count */
> -			reset_value[i] = 0xFFFFFFFF - ctr[i].count;
> +			pmu_ops->reg_setup_part2(i, ctr);
>  			set_pm_event(i,
>  				     pmc_cntrl[0][i].evnts,
>  				     pmc_cntrl[0][i].masks);
> @@ -650,17 +494,15 @@ static int cell_reg_setup(struct op_coun
>  
>  	/* initialize the previous counts for the virtual cntrs */
>  	for_each_online_cpu(cpu)
> -		for (i = 0; i < num_counters; ++i) {
> -			per_cpu(pmc_values, cpu)[i] = reset_value[i];
> -		}
> -
> +		for (i = 0; i < oprofile_num_counters; ++i)
> +			per_cpu(pmc_values, cpu)[i] = oprofile_reset_value[i];
>  	return 0;
>  }
> -
> +EXPORT_SYMBOL_GPL(cell_reg_setup);
>  
> 
>  /* This function is called once for each cpu */
> -static int cell_cpu_setup(struct op_counter_config *cntr)
> +int cell_cpu_setup(struct op_counter_config *cntr)
>  {
>  	u32 cpu = smp_processor_id();
>  	u32 num_enabled = 0;
> @@ -672,22 +514,23 @@ static int cell_cpu_setup(struct op_coun
>  	/* There is one performance monitor per processor chip (i.e. node),
>  	 * so we only need to perform this function once per node.
>  	 */
> -	if (cbe_get_hw_thread_id(cpu))
> +	if (pmu_ops->get_hw_thread_id(cpu))
>  		return 0;
>  
>  	/* Stop all counters */
> -	cbe_disable_pm(cpu);
> -	cbe_disable_pm_interrupts(cpu);
> +	pmu_ops->disable_pm(cpu);
> +	pmu_ops->disable_pm_interrupts(cpu);
>  
> -	cbe_write_pm(cpu, pm_interval, 0);
> -	cbe_write_pm(cpu, pm_start_stop, 0);
> -	cbe_write_pm(cpu, group_control, pm_regs.group_control);
> -	cbe_write_pm(cpu, debug_bus_control, pm_regs.debug_bus_control);
> +	pmu_ops->write_pm(cpu, pm_interval, 0);
> +	pmu_ops->write_pm(cpu, pm_start_stop, 0);
> +	pmu_ops->write_pm(cpu, group_control, pm_regs.group_control);
> +	pmu_ops->write_pm(cpu, debug_bus_control, pm_regs.debug_bus_control);
>  	write_pm_cntrl(cpu);
>  
> -	for (i = 0; i < num_counters; ++i) {
> +	for (i = 0; i < oprofile_num_counters; ++i) {
>  		if (ctr_enabled & (1 << i)) {
> -			pm_signal[num_enabled].cpu = cbe_cpu_to_node(cpu);
> +			pm_signal[num_enabled].cpu =
> +				pmu_ops->pmu_cpu_to_node(cpu);
>  			num_enabled++;
>  		}
>  	}
> @@ -696,277 +539,12 @@ static int cell_cpu_setup(struct op_coun
>  	 * The pm_rtas_activate_signals will return -EIO if the FW
>  	 * call failed.
>  	 */
> -	return pm_rtas_activate_signals(cbe_cpu_to_node(cpu), num_enabled);
> +	return pmu_ops->pm_activate_signals
> +		(pmu_ops->pmu_cpu_to_node(cpu), num_enabled);
>  }
> +EXPORT_SYMBOL_GPL(cell_cpu_setup);
>  
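The pmu_ops indirection in the hunks above is an ordinary table of function pointers. A minimal user-space sketch of the pattern follows; struct pmu_ops_sketch and both backends are hypothetical and do not reproduce the structure defined elsewhere in this patch.

```c
#include <stdint.h>

/* Hypothetical, cut-down ops table; the one in the patch has more hooks. */
struct pmu_ops_sketch {
	uint32_t (*get_hw_thread_id)(int cpu);
	uint32_t (*cpu_to_node)(int cpu);
};

/* Hypothetical backend A: two hardware threads per node. */
static uint32_t a_thread_id(int cpu) { return (uint32_t)cpu & 1; }
static uint32_t a_cpu_to_node(int cpu) { return (uint32_t)cpu >> 1; }

/* Hypothetical backend B: every cpu reports node 0. */
static uint32_t b_thread_id(int cpu) { return (uint32_t)cpu & 1; }
static uint32_t b_cpu_to_node(int cpu) { (void)cpu; return 0; }

static const struct pmu_ops_sketch ops_a = { a_thread_id, a_cpu_to_node };
static const struct pmu_ops_sketch ops_b = { b_thread_id, b_cpu_to_node };

/* Shared code: count distinct nodes, visiting only hardware thread 0. */
static int count_nodes(const struct pmu_ops_sketch *ops, int ncpus)
{
	int cpu, nodes = 0;
	uint32_t seen = 0;

	for (cpu = 0; cpu < ncpus; cpu++) {
		uint32_t node;

		if (ops->get_hw_thread_id(cpu))
			continue;	/* once per node, as in cell_cpu_setup() */
		node = ops->cpu_to_node(cpu);
		if (!(seen & (1u << node))) {
			seen |= 1u << node;
			nodes++;
		}
	}
	return nodes;
}
```

The shared loop never names a backend; only the table passed in differs. The sketch assumes fewer than 32 nodes so that the seen bitmask is wide enough.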
> -#define ENTRIES	 303
> -#define MAXLFSR	 0xFFFFFF
> -
> -/* precomputed table of 24 bit LFSR values */
> -static int initial_lfsr[] = {
> - 8221349, 12579195, 5379618, 10097839, 7512963, 7519310, 3955098, 10753424,
> - 15507573, 7458917, 285419, 2641121, 9780088, 3915503, 6668768, 1548716,
> - 4885000, 8774424, 9650099, 2044357, 2304411, 9326253, 10332526, 4421547,
> - 3440748, 10179459, 13332843, 10375561, 1313462, 8375100, 5198480, 6071392,
> - 9341783, 1526887, 3985002, 1439429, 13923762, 7010104, 11969769, 4547026,
> - 2040072, 4025602, 3437678, 7939992, 11444177, 4496094, 9803157, 10745556,
> - 3671780, 4257846, 5662259, 13196905, 3237343, 12077182, 16222879, 7587769,
> - 14706824, 2184640, 12591135, 10420257, 7406075, 3648978, 11042541, 15906893,
> - 11914928, 4732944, 10695697, 12928164, 11980531, 4430912, 11939291, 2917017,
> - 6119256, 4172004, 9373765, 8410071, 14788383, 5047459, 5474428, 1737756,
> - 15967514, 13351758, 6691285, 8034329, 2856544, 14394753, 11310160, 12149558,
> - 7487528, 7542781, 15668898, 12525138, 12790975, 3707933, 9106617, 1965401,
> - 16219109, 12801644, 2443203, 4909502, 8762329, 3120803, 6360315, 9309720,
> - 15164599, 10844842, 4456529, 6667610, 14924259, 884312, 6234963, 3326042,
> - 15973422, 13919464, 5272099, 6414643, 3909029, 2764324, 5237926, 4774955,
> - 10445906, 4955302, 5203726, 10798229, 11443419, 2303395, 333836, 9646934,
> - 3464726, 4159182, 568492, 995747, 10318756, 13299332, 4836017, 8237783,
> - 3878992, 2581665, 11394667, 5672745, 14412947, 3159169, 9094251, 16467278,
> - 8671392, 15230076, 4843545, 7009238, 15504095, 1494895, 9627886, 14485051,
> - 8304291, 252817, 12421642, 16085736, 4774072, 2456177, 4160695, 15409741,
> - 4902868, 5793091, 13162925, 16039714, 782255, 11347835, 14884586, 366972,
> - 16308990, 11913488, 13390465, 2958444, 10340278, 1177858, 1319431, 10426302,
> - 2868597, 126119, 5784857, 5245324, 10903900, 16436004, 3389013, 1742384,
> - 14674502, 10279218, 8536112, 10364279, 6877778, 14051163, 1025130, 6072469,
> - 1988305, 8354440, 8216060, 16342977, 13112639, 3976679, 5913576, 8816697,
> - 6879995, 14043764, 3339515, 9364420, 15808858, 12261651, 2141560, 5636398,
> - 10345425, 10414756, 781725, 6155650, 4746914, 5078683, 7469001, 6799140,
> - 10156444, 9667150, 10116470, 4133858, 2121972, 1124204, 1003577, 1611214,
> - 14304602, 16221850, 13878465, 13577744, 3629235, 8772583, 10881308, 2410386,
> - 7300044, 5378855, 9301235, 12755149, 4977682, 8083074, 10327581, 6395087,
> - 9155434, 15501696, 7514362, 14520507, 15808945, 3244584, 4741962, 9658130,
> - 14336147, 8654727, 7969093, 15759799, 14029445, 5038459, 9894848, 8659300,
> - 13699287, 8834306, 10712885, 14753895, 10410465, 3373251, 309501, 9561475,
> - 5526688, 14647426, 14209836, 5339224, 207299, 14069911, 8722990, 2290950,
> - 3258216, 12505185, 6007317, 9218111, 14661019, 10537428, 11731949, 9027003,
> - 6641507, 9490160, 200241, 9720425, 16277895, 10816638, 1554761, 10431375,
> - 7467528, 6790302, 3429078, 14633753, 14428997, 11463204, 3576212, 2003426,
> - 6123687, 820520, 9992513, 15784513, 5778891, 6428165, 8388607
> -};
>  
> -/*
> - * The hardware uses an LFSR counting sequence to determine when to capture
> - * the SPU PCs.	 An LFSR sequence is like a puesdo random number sequence
> - * where each number occurs once in the sequence but the sequence is not in
> - * numerical order. The SPU PC capture is done when the LFSR sequence reaches
> - * the last value in the sequence.  Hence the user specified value N
> - * corresponds to the LFSR number that is N from the end of the sequence.
> - *
> - * To avoid the time to compute the LFSR, a lookup table is used.  The 24 bit
> - * LFSR sequence is broken into four ranges.  The spacing of the precomputed
> - * values is adjusted in each range so the error between the user specifed
> - * number (N) of events between samples and the actual number of events based
> - * on the precomputed value will be les then about 6.2%.  Note, if the user
> - * specifies N < 2^16, the LFSR value that is 2^16 from the end will be used.
> - * This is to prevent the loss of samples because the trace buffer is full.
> - *
> - *	   User specified N		     Step between	   Index in
> - *					 precomputed values	 precomputed
> - *								    table
> - * 0		    to	2^16-1			----		      0
> - * 2^16	    to	2^16+2^19-1		2^12		    1 to 128
> - * 2^16+2^19	    to	2^16+2^19+2^22-1	2^15		  129 to 256
> - * 2^16+2^19+2^22  to	2^24-1			2^18		  257 to 302
> - *
> - *
> - * For example, the LFSR values in the second range are computed for 2^16,
> - * 2^16+2^12, ... , 2^19-2^16, 2^19 and stored in the table at indicies
> - * 1, 2,..., 127, 128.
> - *
> - * The 24 bit LFSR value for the nth number in the sequence can be
> - * calculated using the following code:
> - *
> - * #define size 24
> - * int calculate_lfsr(int n)
> - * {
> - *	int i;
> - *	unsigned int newlfsr0;
> - *	unsigned int lfsr = 0xFFFFFF;
> - *	unsigned int howmany = n;
> - *
> - *	for (i = 2; i < howmany + 2; i++) {
> - *		newlfsr0 = (((lfsr >> (size - 1 - 0)) & 1) ^
> - *		((lfsr >> (size - 1 - 1)) & 1) ^
> - *		(((lfsr >> (size - 1 - 6)) & 1) ^
> - *		((lfsr >> (size - 1 - 23)) & 1)));
> - *
> - *		lfsr >>= 1;
> - *		lfsr = lfsr | (newlfsr0 << (size - 1));
> - *	}
> - *	return lfsr;
> - * }
> - */
> -
> -#define V2_16  (0x1 << 16)
> -#define V2_19  (0x1 << 19)
> -#define V2_22  (0x1 << 22)
> -
> -static int calculate_lfsr(int n)
> -{
> -	/*
> -	 * The ranges and steps are in powers of 2 so the calculations
> -	 * can be done using shifts rather then divide.
> -	 */
> -	int index;
> -
> -	if ((n >> 16) == 0)
> -		index = 0;
> -	else if (((n - V2_16) >> 19) == 0)
> -		index = ((n - V2_16) >> 12) + 1;
> -	else if (((n - V2_16 - V2_19) >> 22) == 0)
> -		index = ((n - V2_16 - V2_19) >> 15 ) + 1 + 128;
> -	else if (((n - V2_16 - V2_19 - V2_22) >> 24) == 0)
> -		index = ((n - V2_16 - V2_19 - V2_22) >> 18 ) + 1 + 256;
> -	else
> -		index = ENTRIES-1;
> -
> -	/* make sure index is valid */
> -	if ((index > ENTRIES) || (index < 0))
> -		index = ENTRIES-1;
> -
> -	return initial_lfsr[index];
> -}
> -
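The index arithmetic behind the range table in the comment above can be exercised outside the kernel. The following is a minimal user-space sketch, not part of the patch: lfsr_table_index() is a hypothetical name that mirrors the shift logic of the quoted calculate_lfsr() with unsigned types. It assumes the final bounds check is meant to be index >= ENTRIES, since index == ENTRIES would point one past a 303-entry table.

```c
#include <stdint.h>

#define ENTRIES 303
#define V2_16 (1u << 16)
#define V2_19 (1u << 19)
#define V2_22 (1u << 22)

/* Hypothetical helper: table index for a requested sample interval n. */
static unsigned int lfsr_table_index(uint32_t n)
{
	unsigned int index;

	if (n < V2_16)
		index = 0;				/* below the minimum interval */
	else if (n - V2_16 < V2_19)
		index = ((n - V2_16) >> 12) + 1;	/* step 2^12 */
	else if (n - V2_16 - V2_19 < V2_22)
		index = ((n - V2_16 - V2_19) >> 15) + 1 + 128;	/* step 2^15 */
	else
		index = ((n - V2_16 - V2_19 - V2_22) >> 18) + 1 + 256; /* step 2^18 */

	/* Assumed intent of the bounds check: never index past the table. */
	if (index >= ENTRIES)
		index = ENTRIES - 1;
	return index;
}
```

With these shifts the range boundaries land on indices 1, 128, 129, 256, 257 and 302, which matches the table in the comment.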
> -static int pm_rtas_activate_spu_profiling(u32 node)
> -{
> -	int ret, i;
> -	struct pm_signal pm_signal_local[NR_PHYS_CTRS];
> -
> -	/*
> -	 * Set up the rtas call to configure the debug bus to
> -	 * route the SPU PCs.  Setup the pm_signal for each SPU
> -	 */
> -	for (i = 0; i < NUM_SPUS_PER_NODE; i++) {
> -		pm_signal_local[i].cpu = node;
> -		pm_signal_local[i].signal_group = 41;
> -		/* spu i on word (i/2) */
> -		pm_signal_local[i].bus_word = 1 << i / 2;
> -		/* spu i */
> -		pm_signal_local[i].sub_unit = i;
> -		pm_signal_local[i].bit = 63;
> -	}
> -
> -	ret = rtas_ibm_cbe_perftools(SUBFUNC_ACTIVATE,
> -				     PASSTHRU_ENABLE, pm_signal_local,
> -				     (NUM_SPUS_PER_NODE
> -				      * sizeof(struct pm_signal)));
> -
> -	if (unlikely(ret)) {
> -		printk(KERN_WARNING "%s: rtas returned: %d\n",
> -		       __FUNCTION__, ret);
> -		return -EIO;
> -	}
> -
> -	return 0;
> -}
> -
> -#ifdef CONFIG_CPU_FREQ
> -static int
> -oprof_cpufreq_notify(struct notifier_block *nb, unsigned long val, void *data)
> -{
> -	int ret = 0;
> -	struct cpufreq_freqs *frq = data;
> -	if ((val == CPUFREQ_PRECHANGE && frq->old < frq->new) ||
> -	    (val == CPUFREQ_POSTCHANGE && frq->old > frq->new) ||
> -	    (val == CPUFREQ_RESUMECHANGE || val == CPUFREQ_SUSPENDCHANGE))
> -		set_spu_profiling_frequency(frq->new, spu_cycle_reset);
> -	return ret;
> -}
> -
> -static struct notifier_block cpu_freq_notifier_block = {
> -	.notifier_call	= oprof_cpufreq_notify
> -};
> -#endif
> -
> -static int cell_global_start_spu(struct op_counter_config *ctr)
> -{
> -	int subfunc;
> -	unsigned int lfsr_value;
> -	int cpu;
> -	int ret;
> -	int rtas_error;
> -	unsigned int cpu_khzfreq = 0;
> -
> -	/* The SPU profiling uses time-based profiling based on
> -	 * cpu frequency, so if configured with the CPU_FREQ
> -	 * option, we should detect frequency changes and react
> -	 * accordingly.
> -	 */
> -#ifdef CONFIG_CPU_FREQ
> -	ret = cpufreq_register_notifier(&cpu_freq_notifier_block,
> -					CPUFREQ_TRANSITION_NOTIFIER);
> -	if (ret < 0)
> -		/* this is not a fatal error */
> -		printk(KERN_ERR "CPU freq change registration failed: %d\n",
> -		       ret);
> -
> -	else
> -		cpu_khzfreq = cpufreq_quick_get(smp_processor_id());
> -#endif
> -
> -	set_spu_profiling_frequency(cpu_khzfreq, spu_cycle_reset);
> -
> -	for_each_online_cpu(cpu) {
> -		if (cbe_get_hw_thread_id(cpu))
> -			continue;
> -
> -		/*
> -		 * Setup SPU cycle-based profiling.
> -		 * Set perf_mon_control bit 0 to a zero before
> -		 * enabling spu collection hardware.
> -		 */
> -		cbe_write_pm(cpu, pm_control, 0);
> -
> -		if (spu_cycle_reset > MAX_SPU_COUNT)
> -			/* use largest possible value */
> -			lfsr_value = calculate_lfsr(MAX_SPU_COUNT-1);
> -		else
> -			lfsr_value = calculate_lfsr(spu_cycle_reset);
> -
> -		/* must use a non zero value. Zero disables data collection. */
> -		if (lfsr_value == 0)
> -			lfsr_value = calculate_lfsr(1);
> -
> -		lfsr_value = lfsr_value << 8; /* shift lfsr to correct
> -						* register location
> -						*/
> -
> -		/* debug bus setup */
> -		ret = pm_rtas_activate_spu_profiling(cbe_cpu_to_node(cpu));
> -
> -		if (unlikely(ret)) {
> -			rtas_error = ret;
> -			goto out;
> -		}
> -
> -
> -		subfunc = 2;	/* 2 - activate SPU tracing, 3 - deactivate */
> -
> -		/* start profiling */
> -		ret = rtas_call(spu_rtas_token, 3, 1, NULL, subfunc,
> -		  cbe_cpu_to_node(cpu), lfsr_value);
> -
> -		if (unlikely(ret != 0)) {
> -			printk(KERN_ERR
> -			       "%s: rtas call ibm,cbe-spu-perftools failed, return = %d\n",
> -			       __FUNCTION__, ret);
> -			rtas_error = -EIO;
> -			goto out;
> -		}
> -	}
> -
> -	rtas_error = start_spu_profiling(spu_cycle_reset);
> -	if (rtas_error)
> -		goto out_stop;
> -
> -	oprofile_running = 1;
> -	return 0;
> -
> -out_stop:
> -	cell_global_stop_spu();		/* clean up the PMU/debug bus */
> -out:
> -	return rtas_error;
> -}
>  
>  static int cell_global_start_ppu(struct op_counter_config *ctr)
>  {
> @@ -978,30 +556,33 @@ static int cell_global_start_ppu(struct 
>  	 * only need to perform this function once per node.
>  	 */
>  	for_each_online_cpu(cpu) {
> -		if (cbe_get_hw_thread_id(cpu))
> +		if (pmu_ops->get_hw_thread_id(cpu))
>  			continue;
>  
>  		interrupt_mask = 0;
>  
> -		for (i = 0; i < num_counters; ++i) {
> +		for (i = 0; i < oprofile_num_counters; ++i) {
>  			if (ctr_enabled & (1 << i)) {
> -				cbe_write_ctr(cpu, i, reset_value[i]);
> +				pmu_ops->write_ctr
> +					(cpu, i, oprofile_reset_value[i]);
>  				enable_ctr(cpu, i, pm_regs.pm07_cntrl);
>  				interrupt_mask |=
>  				    CBE_PM_CTR_OVERFLOW_INTR(i);
>  			} else {
>  				/* Disable counter */
> -				cbe_write_pm07_control(cpu, i, 0);
> +				pmu_ops->write_pm07_control(cpu, i, 0);
>  			}
>  		}
>  
> -		cbe_get_and_clear_pm_interrupts(cpu);
> -		cbe_enable_pm_interrupts(cpu, hdw_thread, interrupt_mask);
> -		cbe_enable_pm(cpu);
> +		pmu_ops->get_and_clear_pm_interrupts(cpu);
> +		pmu_ops->enable_pm_interrupts
> +			(cpu, oprofile_hdw_thread, interrupt_mask);
> +		pmu_ops->enable_pm(cpu);
>  	}
>  
> -	virt_cntr_inter_mask = interrupt_mask;
> +	oprofile_virt_cntr_inter_mask = interrupt_mask;
>  	oprofile_running = 1;
> +	/* complete the previous store */
>  	smp_wmb();
>  
>  	/*
> @@ -1015,199 +596,22 @@ static int cell_global_start_ppu(struct 
>  	return 0;
>  }
>  
> -static int cell_global_start(struct op_counter_config *ctr)
> +int cell_global_start(struct op_counter_config *ctr)
>  {
>  	if (spu_cycle_reset)
> -		return cell_global_start_spu(ctr);
> +		return pmu_ops->global_start_spu(ctr);
>  	else
>  		return cell_global_start_ppu(ctr);
>  }
> +EXPORT_SYMBOL_GPL(cell_global_start);
>  
> -/*
> - * Note the generic OProfile stop calls do not support returning
> - * an error on stop.  Hence, will not return an error if the FW
> - * calls fail on stop.	Failure to reset the debug bus is not an issue.
> - * Failure to disable the SPU profiling is not an issue.  The FW calls
> - * to enable the performance counters and debug bus will work even if
> - * the hardware was not cleanly reset.
> - */
> -static void cell_global_stop_spu(void)
> -{
> -	int subfunc, rtn_value;
> -	unsigned int lfsr_value;
> -	int cpu;
> -
> -	oprofile_running = 0;
> -
> -#ifdef CONFIG_CPU_FREQ
> -	cpufreq_unregister_notifier(&cpu_freq_notifier_block,
> -				    CPUFREQ_TRANSITION_NOTIFIER);
> -#endif
> -
> -	for_each_online_cpu(cpu) {
> -		if (cbe_get_hw_thread_id(cpu))
> -			continue;
> -
> -		subfunc = 3;	/*
> -				 * 2 - activate SPU tracing,
> -				 * 3 - deactivate
> -				 */
> -		lfsr_value = 0x8f100000;
> -
> -		rtn_value = rtas_call(spu_rtas_token, 3, 1, NULL,
> -				      subfunc, cbe_cpu_to_node(cpu),
> -				      lfsr_value);
> -
> -		if (unlikely(rtn_value != 0)) {
> -			printk(KERN_ERR
> -			       "%s: rtas call ibm,cbe-spu-perftools failed, return = %d\n",
> -			       __FUNCTION__, rtn_value);
> -		}
> -
> -		/* Deactivate the signals */
> -		pm_rtas_reset_signals(cbe_cpu_to_node(cpu));
> -	}
> -
> -	stop_spu_profiling();
> -}
> -
> -static void cell_global_stop_ppu(void)
> -{
> -	int cpu;
> -
> -	/*
> -	 * This routine will be called once for the system.
> -	 * There is one performance monitor per node, so we
> -	 * only need to perform this function once per node.
> -	 */
> -	del_timer_sync(&timer_virt_cntr);
> -	oprofile_running = 0;
> -	smp_wmb();
> -
> -	for_each_online_cpu(cpu) {
> -		if (cbe_get_hw_thread_id(cpu))
> -			continue;
> -
> -		cbe_sync_irq(cbe_cpu_to_node(cpu));
> -		/* Stop the counters */
> -		cbe_disable_pm(cpu);
> -
> -		/* Deactivate the signals */
> -		pm_rtas_reset_signals(cbe_cpu_to_node(cpu));
> -
> -		/* Deactivate interrupts */
> -		cbe_disable_pm_interrupts(cpu);
> -	}
> -}
> -
> -static void cell_global_stop(void)
> +void cell_global_stop(void)
>  {
>  	if (spu_cycle_reset)
> -		cell_global_stop_spu();
> +		pmu_ops->global_stop_spu();
>  	else
> -		cell_global_stop_ppu();
> +		pmu_ops->global_stop_ppu();
>  }
> +EXPORT_SYMBOL_GPL(cell_global_stop);
>  
> -static void cell_handle_interrupt(struct pt_regs *regs,
> -				struct op_counter_config *ctr)
> -{
> -	u32 cpu;
> -	u64 pc;
> -	int is_kernel;
> -	unsigned long flags = 0;
> -	u32 interrupt_mask;
> -	int i;
> -
> -	cpu = smp_processor_id();
>  
> -	/*
> -	 * Need to make sure the interrupt handler and the virt counter
> -	 * routine are not running at the same time. See the
> -	 * cell_virtual_cntr() routine for additional comments.
> -	 */
> -	spin_lock_irqsave(&virt_cntr_lock, flags);
> -
> -	/*
> -	 * Need to disable and reenable the performance counters
> -	 * to get the desired behavior from the hardware.  This
> -	 * is hardware specific.
> -	 */
> -
> -	cbe_disable_pm(cpu);
> -
> -	interrupt_mask = cbe_get_and_clear_pm_interrupts(cpu);
> -
> -	/*
> -	 * If the interrupt mask has been cleared, then the virt cntr
> -	 * has cleared the interrupt.  When the thread that generated
> -	 * the interrupt is restored, the data count will be restored to
> -	 * 0xffffff0 to cause the interrupt to be regenerated.
> -	 */
> -
> -	if ((oprofile_running == 1) && (interrupt_mask != 0)) {
> -		pc = regs->nip;
> -		is_kernel = is_kernel_addr(pc);
> -
> -		for (i = 0; i < num_counters; ++i) {
> -			if ((interrupt_mask & CBE_PM_CTR_OVERFLOW_INTR(i))
> -			    && ctr[i].enabled) {
> -				oprofile_add_pc(pc, is_kernel, i);
> -				cbe_write_ctr(cpu, i, reset_value[i]);
> -			}
> -		}
> -
> -		/*
> -		 * The counters were frozen by the interrupt.
> -		 * Reenable the interrupt and restart the counters.
> -		 * If there was a race between the interrupt handler and
> -		 * the virtual counter routine.	 The virutal counter
> -		 * routine may have cleared the interrupts.  Hence must
> -		 * use the virt_cntr_inter_mask to re-enable the interrupts.
> -		 */
> -		cbe_enable_pm_interrupts(cpu, hdw_thread,
> -					 virt_cntr_inter_mask);
> -
> -		/*
> -		 * The writes to the various performance counters only writes
> -		 * to a latch.	The new values (interrupt setting bits, reset
> -		 * counter value etc.) are not copied to the actual registers
> -		 * until the performance monitor is enabled.  In order to get
> -		 * this to work as desired, the permormance monitor needs to
> -		 * be disabled while writing to the latches.  This is a
> -		 * HW design issue.
> -		 */
> -		cbe_enable_pm(cpu);
> -	}
> -	spin_unlock_irqrestore(&virt_cntr_lock, flags);
> -}
> -
> -/*
> - * This function is called from the generic OProfile
> - * driver.  When profiling PPUs, we need to do the
> - * generic sync start; otherwise, do spu_sync_start.
> - */
> -static int cell_sync_start(void)
> -{
> -	if (spu_cycle_reset)
> -		return spu_sync_start();
> -	else
> -		return DO_GENERIC_SYNC;
> -}
> -
> -static int cell_sync_stop(void)
> -{
> -	if (spu_cycle_reset)
> -		return spu_sync_stop();
> -	else
> -		return 1;
> -}
> -
> -struct op_powerpc_model op_model_cell = {
> -	.reg_setup = cell_reg_setup,
> -	.cpu_setup = cell_cpu_setup,
> -	.global_start = cell_global_start,
> -	.global_stop = cell_global_stop,
> -	.sync_start = cell_sync_start,
> -	.sync_stop = cell_sync_stop,
> -	.handle_interrupt = cell_handle_interrupt,
> -};
> --- a/arch/powerpc/platforms/cell/Makefile
> +++ b/arch/powerpc/platforms/cell/Makefile
> @@ -1,6 +1,6 @@
>  obj-$(CONFIG_PPC_CELL_NATIVE)		+= interrupt.o iommu.o setup.o \
>  					   cbe_regs.o spider-pic.o \
> -					   pervasive.o pmu.o io-workarounds.o
> +					   pervasive.o io-workarounds.o
>  obj-$(CONFIG_CBE_RAS)			+= ras.o
>  
>  obj-$(CONFIG_CBE_THERM)			+= cbe_thermal.o
> --- a/arch/powerpc/platforms/cell/cbe_regs.c
> +++ b/arch/powerpc/platforms/cell/cbe_regs.c
> @@ -103,6 +103,7 @@ struct cbe_pmd_shadow_regs *cbe_get_pmd_
>  		return NULL;
>  	return &map->pmd_shadow_regs;
>  }
> +EXPORT_SYMBOL_GPL(cbe_get_cpu_pmd_shadow_regs);
>  
>  struct cbe_pmd_shadow_regs *cbe_get_cpu_pmd_shadow_regs(int cpu)
>  {
> --- a/arch/powerpc/platforms/cell/interrupt.c
> +++ b/arch/powerpc/platforms/cell/interrupt.c
> @@ -412,3 +412,4 @@ void iic_set_interrupt_routing(int cpu, 
>  		iic_ir |= CBE_IIC_IR_DEST_UNIT(CBE_IIC_IR_PT_1);
>  	out_be64(&iic_regs->iic_ir, iic_ir);
>  }
> +EXPORT_SYMBOL_GPL(iic_set_interrupt_routing);
> --- a/arch/powerpc/platforms/cell/pmu.c
> +++ /dev/null
> @@ -1,423 +0,0 @@
> -/*
> - * Cell Broadband Engine Performance Monitor
> - *
> - * (C) Copyright IBM Corporation 2001,2006
> - *
> - * Author:
> - *    David Erb (djerb at us.ibm.com)
> - *    Kevin Corry (kevcorry at us.ibm.com)
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; either version 2, or (at your option)
> - * any later version.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program; if not, write to the Free Software
> - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> - */
> -
> -#include <linux/interrupt.h>
> -#include <linux/types.h>
> -#include <asm/io.h>
> -#include <asm/irq_regs.h>
> -#include <asm/machdep.h>
> -#include <asm/pmc.h>
> -#include <asm/reg.h>
> -#include <asm/spu.h>
> -#include <asm/cell-regs.h>
> -
> -#include "interrupt.h"
> -
> -/*
> - * When writing to write-only mmio addresses, save a shadow copy. All of the
> - * registers are 32-bit, but stored in the upper-half of a 64-bit field in
> - * pmd_regs.
> - */
> -
> -#define WRITE_WO_MMIO(reg, x)					\
> -	do {							\
> -		u32 _x = (x);					\
> -		struct cbe_pmd_regs __iomem *pmd_regs;		\
> -		struct cbe_pmd_shadow_regs *shadow_regs;	\
> -		pmd_regs = cbe_get_cpu_pmd_regs(cpu);		\
> -		shadow_regs = cbe_get_cpu_pmd_shadow_regs(cpu);	\
> -		out_be64(&(pmd_regs->reg), (((u64)_x) << 32));	\
> -		shadow_regs->reg = _x;				\
> -	} while (0)
> -
> -#define READ_SHADOW_REG(val, reg)				\
> -	do {							\
> -		struct cbe_pmd_shadow_regs *shadow_regs;	\
> -		shadow_regs = cbe_get_cpu_pmd_shadow_regs(cpu);	\
> -		(val) = shadow_regs->reg;			\
> -	} while (0)
> -
> -#define READ_MMIO_UPPER32(val, reg)				\
> -	do {							\
> -		struct cbe_pmd_regs __iomem *pmd_regs;		\
> -		pmd_regs = cbe_get_cpu_pmd_regs(cpu);		\
> -		(val) = (u32)(in_be64(&pmd_regs->reg) >> 32);	\
> -	} while (0)
> -
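The shadow-copy idea described in the comment above WRITE_WO_MMIO can be shown without any hardware. Below is a minimal user-space sketch: struct fake_pmd, wo_write(), shadow_read() and upper32_read() are hypothetical stand-ins, with plain memory replacing the MMIO region and the out_be64()/in_be64() accessors. It only shows that the 32-bit value lands in the upper half of the 64-bit field and that reads of a write-only register are served from the shadow.

```c
#include <stdint.h>

/* Hypothetical stand-in for one write-only register and its shadow. */
struct fake_pmd {
	uint64_t hw_reg;	/* models the 64-bit MMIO field */
	uint32_t shadow;	/* last value software wrote */
};

/* Store val in the upper 32 bits of the field and remember it. */
static void wo_write(struct fake_pmd *p, uint32_t val)
{
	p->hw_reg = (uint64_t)val << 32;
	p->shadow = val;
}

/* A write-only register cannot be read back, so use the shadow copy. */
static uint32_t shadow_read(const struct fake_pmd *p)
{
	return p->shadow;
}

/* For a readable register, take the upper half of the field instead. */
static uint32_t upper32_read(const struct fake_pmd *p)
{
	return (uint32_t)(p->hw_reg >> 32);
}
```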
> -/*
> - * Physical counter registers.
> - * Each physical counter can act as one 32-bit counter or two 16-bit counters.
> - */
> -
> -u32 cbe_read_phys_ctr(u32 cpu, u32 phys_ctr)
> -{
> -	u32 val_in_latch, val = 0;
> -
> -	if (phys_ctr < NR_PHYS_CTRS) {
> -		READ_SHADOW_REG(val_in_latch, counter_value_in_latch);
> -
> -		/* Read the latch or the actual counter, whichever is newer. */
> -		if (val_in_latch & (1 << phys_ctr)) {
> -			READ_SHADOW_REG(val, pm_ctr[phys_ctr]);
> -		} else {
> -			READ_MMIO_UPPER32(val, pm_ctr[phys_ctr]);
> -		}
> -	}
> -
> -	return val;
> -}
> -EXPORT_SYMBOL_GPL(cbe_read_phys_ctr);
> -
> -void cbe_write_phys_ctr(u32 cpu, u32 phys_ctr, u32 val)
> -{
> -	struct cbe_pmd_shadow_regs *shadow_regs;
> -	u32 pm_ctrl;
> -
> -	if (phys_ctr < NR_PHYS_CTRS) {
> -		/* Writing to a counter only writes to a hardware latch.
> -		 * The new value is not propagated to the actual counter
> -		 * until the performance monitor is enabled.
> -		 */
> -		WRITE_WO_MMIO(pm_ctr[phys_ctr], val);
> -
> -		pm_ctrl = cbe_read_pm(cpu, pm_control);
> -		if (pm_ctrl & CBE_PM_ENABLE_PERF_MON) {
> -			/* The counters are already active, so we need to
> -			 * rewrite the pm_control register to "re-enable"
> -			 * the PMU.
> -			 */
> -			cbe_write_pm(cpu, pm_control, pm_ctrl);
> -		} else {
> -			shadow_regs = cbe_get_cpu_pmd_shadow_regs(cpu);
> -			shadow_regs->counter_value_in_latch |= (1 << phys_ctr);
> -		}
> -	}
> -}
> -EXPORT_SYMBOL_GPL(cbe_write_phys_ctr);
> -
> -/*
> - * "Logical" counter registers.
> - * These will read/write 16-bits or 32-bits depending on the
> - * current size of the counter. Counters 4 - 7 are always 16-bit.
> - */
> -
> -u32 cbe_read_ctr(u32 cpu, u32 ctr)
> -{
> -	u32 val;
> -	u32 phys_ctr = ctr & (NR_PHYS_CTRS - 1);
> -
> -	val = cbe_read_phys_ctr(cpu, phys_ctr);
> -
> -	if (cbe_get_ctr_size(cpu, phys_ctr) == 16)
> -		val = (ctr < NR_PHYS_CTRS) ? (val >> 16) : (val & 0xffff);
> -
> -	return val;
> -}
> -EXPORT_SYMBOL_GPL(cbe_read_ctr);
> -
> -void cbe_write_ctr(u32 cpu, u32 ctr, u32 val)
> -{
> -	u32 phys_ctr;
> -	u32 phys_val;
> -
> -	phys_ctr = ctr & (NR_PHYS_CTRS - 1);
> -
> -	if (cbe_get_ctr_size(cpu, phys_ctr) == 16) {
> -		phys_val = cbe_read_phys_ctr(cpu, phys_ctr);
> -
> -		if (ctr < NR_PHYS_CTRS)
> -			val = (val << 16) | (phys_val & 0xffff);
> -		else
> -			val = (val & 0xffff) | (phys_val & 0xffff0000);
> -	}
> -
> -	cbe_write_phys_ctr(cpu, phys_ctr, val);
> -}
> -EXPORT_SYMBOL_GPL(cbe_write_ctr);
> -
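The 16-bit split used by cbe_read_ctr() and cbe_write_ctr() above is plain bit arithmetic. A minimal sketch, assuming NR_PHYS_CTRS is 4 (its value is not shown in this patch) and using hypothetical pure helpers in place of the register accessors:

```c
#include <stdint.h>

#define NR_PHYS_CTRS 4	/* assumption: four physical counters */

/* Physical counter backing a logical counter (n and n + 4 share one). */
static uint32_t ctr_to_phys(uint32_t ctr)
{
	return ctr & (NR_PHYS_CTRS - 1);
}

/* Read logical counter ctr from a physical value split into two halves. */
static uint32_t ctr16_get(uint32_t phys_val, uint32_t ctr)
{
	return (ctr < NR_PHYS_CTRS) ? (phys_val >> 16) : (phys_val & 0xffff);
}

/* New physical value after writing val to logical counter ctr. */
static uint32_t ctr16_set(uint32_t phys_val, uint32_t ctr, uint32_t val)
{
	if (ctr < NR_PHYS_CTRS)
		return (val << 16) | (phys_val & 0xffff);	/* upper half */
	return (val & 0xffff) | (phys_val & 0xffff0000);	/* lower half */
}
```

Under that assumption, logical counters 0 to 3 occupy the upper halves and 4 to 7 the lower halves, so a write to one half has to read the physical counter first to preserve the other half.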
> -/*
> - * Counter-control registers.
> - * Each "logical" counter has a corresponding control register.
> - */
> -
> -u32 cbe_read_pm07_control(u32 cpu, u32 ctr)
> -{
> -	u32 pm07_control = 0;
> -
> -	if (ctr < NR_CTRS)
> -		READ_SHADOW_REG(pm07_control, pm07_control[ctr]);
> -
> -	return pm07_control;
> -}
> -EXPORT_SYMBOL_GPL(cbe_read_pm07_control);
> -
> -void cbe_write_pm07_control(u32 cpu, u32 ctr, u32 val)
> -{
> -	if (ctr < NR_CTRS)
> -		WRITE_WO_MMIO(pm07_control[ctr], val);
> -}
> -EXPORT_SYMBOL_GPL(cbe_write_pm07_control);
> -
> -/*
> - * Other PMU control registers. Most of these are write-only.
> - */
> -
> -u32 cbe_read_pm(u32 cpu, enum pm_reg_name reg)
> -{
> -	u32 val = 0;
> -
> -	switch (reg) {
> -	case group_control:
> -		READ_SHADOW_REG(val, group_control);
> -		break;
> -
> -	case debug_bus_control:
> -		READ_SHADOW_REG(val, debug_bus_control);
> -		break;
> -
> -	case trace_address:
> -		READ_MMIO_UPPER32(val, trace_address);
> -		break;
> -
> -	case ext_tr_timer:
> -		READ_SHADOW_REG(val, ext_tr_timer);
> -		break;
> -
> -	case pm_status:
> -		READ_MMIO_UPPER32(val, pm_status);
> -		break;
> -
> -	case pm_control:
> -		READ_SHADOW_REG(val, pm_control);
> -		break;
> -
> -	case pm_interval:
> -		READ_MMIO_UPPER32(val, pm_interval);
> -		break;
> -
> -	case pm_start_stop:
> -		READ_SHADOW_REG(val, pm_start_stop);
> -		break;
> -	}
> -
> -	return val;
> -}
> -EXPORT_SYMBOL_GPL(cbe_read_pm);
> -
> -void cbe_write_pm(u32 cpu, enum pm_reg_name reg, u32 val)
> -{
> -	switch (reg) {
> -	case group_control:
> -		WRITE_WO_MMIO(group_control, val);
> -		break;
> -
> -	case debug_bus_control:
> -		WRITE_WO_MMIO(debug_bus_control, val);
> -		break;
> -
> -	case trace_address:
> -		WRITE_WO_MMIO(trace_address, val);
> -		break;
> -
> -	case ext_tr_timer:
> -		WRITE_WO_MMIO(ext_tr_timer, val);
> -		break;
> -
> -	case pm_status:
> -		WRITE_WO_MMIO(pm_status, val);
> -		break;
> -
> -	case pm_control:
> -		WRITE_WO_MMIO(pm_control, val);
> -		break;
> -
> -	case pm_interval:
> -		WRITE_WO_MMIO(pm_interval, val);
> -		break;
> -
> -	case pm_start_stop:
> -		WRITE_WO_MMIO(pm_start_stop, val);
> -		break;
> -	}
> -}
> -EXPORT_SYMBOL_GPL(cbe_write_pm);
> -
> -/*
> - * Get/set the size of a physical counter to either 16 or 32 bits.
> - */
> -
> -u32 cbe_get_ctr_size(u32 cpu, u32 phys_ctr)
> -{
> -	u32 pm_ctrl, size = 0;
> -
> -	if (phys_ctr < NR_PHYS_CTRS) {
> -		pm_ctrl = cbe_read_pm(cpu, pm_control);
> -		size = (pm_ctrl & CBE_PM_16BIT_CTR(phys_ctr)) ? 16 : 32;
> -	}
> -
> -	return size;
> -}
> -EXPORT_SYMBOL_GPL(cbe_get_ctr_size);
> -
> -void cbe_set_ctr_size(u32 cpu, u32 phys_ctr, u32 ctr_size)
> -{
> -	u32 pm_ctrl;
> -
> -	if (phys_ctr < NR_PHYS_CTRS) {
> -		pm_ctrl = cbe_read_pm(cpu, pm_control);
> -		switch (ctr_size) {
> -		case 16:
> -			pm_ctrl |= CBE_PM_16BIT_CTR(phys_ctr);
> -			break;
> -
> -		case 32:
> -			pm_ctrl &= ~CBE_PM_16BIT_CTR(phys_ctr);
> -			break;
> -		}
> -		cbe_write_pm(cpu, pm_control, pm_ctrl);
> -	}
> -}
> -EXPORT_SYMBOL_GPL(cbe_set_ctr_size);
> -
> -/*
> - * Enable/disable the entire performance monitoring unit.
> - * When we enable the PMU, all pending writes to counters get committed.
> - */
> -
> -void cbe_enable_pm(u32 cpu)
> -{
> -	struct cbe_pmd_shadow_regs *shadow_regs;
> -	u32 pm_ctrl;
> -
> -	shadow_regs = cbe_get_cpu_pmd_shadow_regs(cpu);
> -	shadow_regs->counter_value_in_latch = 0;
> -
> -	pm_ctrl = cbe_read_pm(cpu, pm_control) | CBE_PM_ENABLE_PERF_MON;
> -	cbe_write_pm(cpu, pm_control, pm_ctrl);
> -}
> -EXPORT_SYMBOL_GPL(cbe_enable_pm);
> -
> -void cbe_disable_pm(u32 cpu)
> -{
> -	u32 pm_ctrl;
> -	pm_ctrl = cbe_read_pm(cpu, pm_control) & ~CBE_PM_ENABLE_PERF_MON;
> -	cbe_write_pm(cpu, pm_control, pm_ctrl);
> -}
> -EXPORT_SYMBOL_GPL(cbe_disable_pm);
> -
> -/*
> - * Reading from the trace_buffer.
> - * The trace buffer is two 64-bit registers. Reading from
> - * the second half automatically increments the trace_address.
> - */
> -
> -void cbe_read_trace_buffer(u32 cpu, u64 *buf)
> -{
> -	struct cbe_pmd_regs __iomem *pmd_regs = cbe_get_cpu_pmd_regs(cpu);
> -
> -	*buf++ = in_be64(&pmd_regs->trace_buffer_0_63);
> -	*buf++ = in_be64(&pmd_regs->trace_buffer_64_127);
> -}
> -EXPORT_SYMBOL_GPL(cbe_read_trace_buffer);
> -
> -/*
> - * Enabling/disabling interrupts for the entire performance monitoring unit.
> - */
> -
> -u32 cbe_get_and_clear_pm_interrupts(u32 cpu)
> -{
> -	/* Reading pm_status clears the interrupt bits. */
> -	return cbe_read_pm(cpu, pm_status);
> -}
> -EXPORT_SYMBOL_GPL(cbe_get_and_clear_pm_interrupts);
> -
> -void cbe_enable_pm_interrupts(u32 cpu, u32 thread, u32 mask)
> -{
> -	/* Set which node and thread will handle the next interrupt. */
> -	iic_set_interrupt_routing(cpu, thread, 0);
> -
> -	/* Enable the interrupt bits in the pm_status register. */
> -	if (mask)
> -		cbe_write_pm(cpu, pm_status, mask);
> -}
> -EXPORT_SYMBOL_GPL(cbe_enable_pm_interrupts);
> -
> -void cbe_disable_pm_interrupts(u32 cpu)
> -{
> -	cbe_get_and_clear_pm_interrupts(cpu);
> -	cbe_write_pm(cpu, pm_status, 0);
> -}
> -EXPORT_SYMBOL_GPL(cbe_disable_pm_interrupts);
> -
> -static irqreturn_t cbe_pm_irq(int irq, void *dev_id)
> -{
> -	perf_irq(get_irq_regs());
> -	return IRQ_HANDLED;
> -}
> -
> -static int __init cbe_init_pm_irq(void)
> -{
> -	unsigned int irq;
> -	int rc, node;
> -
> -	for_each_node(node) {
> -		irq = irq_create_mapping(NULL, IIC_IRQ_IOEX_PMI |
> -					       (node << IIC_IRQ_NODE_SHIFT));
> -		if (irq == NO_IRQ) {
> -			printk("ERROR: Unable to allocate irq for node %d\n",
> -			       node);
> -			return -EINVAL;
> -		}
> -
> -		rc = request_irq(irq, cbe_pm_irq,
> -				 IRQF_DISABLED, "cbe-pmu-0", NULL);
> -		if (rc) {
> -			printk("ERROR: Request for irq on node %d failed\n",
> -			       node);
> -			return rc;
> -		}
> -	}
> -
> -	return 0;
> -}
> -machine_arch_initcall(cell, cbe_init_pm_irq);
> -
> -void cbe_sync_irq(int node)
> -{
> -	unsigned int irq;
> -
> -	irq = irq_find_mapping(NULL,
> -			       IIC_IRQ_IOEX_PMI
> -			       | (node << IIC_IRQ_NODE_SHIFT));
> -
> -	if (irq == NO_IRQ) {
> -		printk(KERN_WARNING "ERROR, unable to get existing irq %d " \
> -		"for node %d\n", irq, node);
> -		return;
> -	}
> -
> -	synchronize_irq(irq);
> -}
> -EXPORT_SYMBOL_GPL(cbe_sync_irq);
> -
> --- a/arch/powerpc/platforms/ps3/Kconfig
> +++ b/arch/powerpc/platforms/ps3/Kconfig
> @@ -127,9 +127,14 @@ config PS3_FLASH
>  	  be disabled on the kernel command line using "ps3flash=off", to
>  	  not allocate this fixed buffer.
>  
> +config OPROFILE_CELL
> +	def_bool y
> +	depends on PPC_PS3 && (OPROFILE = m || OPROFILE = y)
> +	select PS3_LPM
> +
>  config PS3_LPM
>  	tristate "PS3 Logical Performance Monitor support"
> -	depends on PPC_PS3
> +	depends on PPC_PS3 && ( OPROFILE_CELL = y)
>  	help
>  	  Include support for the PS3 Logical Performance Monitor.
>  
> --- a/drivers/oprofile/cpu_buffer.c
> +++ b/drivers/oprofile/cpu_buffer.c
> @@ -21,6 +21,7 @@
>  #include <linux/oprofile.h>
>  #include <linux/vmalloc.h>
>  #include <linux/errno.h>
> +#include <linux/module.h>
>   
>  #include "event_buffer.h"
>  #include "cpu_buffer.h"
> @@ -257,6 +258,7 @@ void oprofile_add_pc(unsigned long pc, i
>  	struct oprofile_cpu_buffer * cpu_buf = &cpu_buffer[smp_processor_id()];
>  	log_sample(cpu_buf, pc, is_kernel, event);
>  }
> +EXPORT_SYMBOL_GPL(oprofile_add_pc);
>  
>  void oprofile_add_trace(unsigned long pc)
>  {
> --- a/drivers/oprofile/oprof.c
> +++ b/drivers/oprofile/oprof.c
> @@ -179,10 +179,23 @@ out:
>  	return err;
>  }
>  
> -static int __init oprofile_init(void)
> +#ifdef CONFIG_OPROFILE_CELL
> +struct oprofile_registration_ops *oprofile_registration_ops;
> +EXPORT_SYMBOL_GPL(oprofile_registration_ops);
> +
> +DEFINE_SPINLOCK(oprofile_register_lock);
> +static int oprofile_lock_init(void);
> +static int oprofile_lock_exit(void);
> +static int oprofile_lock_reregister(void);
> +#endif
> +
> +static int
> +#ifndef CONFIG_OPROFILE_CELL
> +__init
> +#endif
> +oprofile_init(void)
>  {
>  	int err;
> -
>  	err = oprofile_arch_init(&oprofile_ops);
>  
>  	if (err < 0 || timer) {
> @@ -193,21 +206,99 @@ static int __init oprofile_init(void)
>  	err = oprofilefs_register();
>  	if (err)
>  		oprofile_arch_exit();
> +#ifdef CONFIG_OPROFILE_CELL
> +	else if (!oprofile_registration_ops) {
> +		oprofile_registration_ops = kzalloc(
> +			sizeof(struct oprofile_registration_ops),
> +			GFP_KERNEL);
> +		if (!oprofile_registration_ops)
> +			err = -ENOMEM;
> +	}
> +	if (err == 0) {
> +		oprofile_registration_ops->oprofile_lock_init =
> +			oprofile_lock_init;
> +		oprofile_registration_ops->oprofile_lock_exit =
> +			oprofile_lock_exit;
> +		oprofile_registration_ops->oprofile_lock_reregister =
> +			oprofile_lock_reregister;
> +		pr_debug("oprofile initialised successfully\n");
> +	} else {
> +		kfree(oprofile_registration_ops);
> +		oprofile_registration_ops = NULL;
> +	}
> +#endif
>  
>  	return err;
>  }
>  
> 
> -static void __exit oprofile_exit(void)
> +static void
> +#ifndef CONFIG_OPROFILE_CELL
> +__exit
> +#endif
> + oprofile_exit(void)
>  {
> +
>  	oprofilefs_unregister();
>  	oprofile_arch_exit();
> +#ifdef CONFIG_OPROFILE_CELL
> +	kfree(oprofile_registration_ops);
> +	oprofile_registration_ops = NULL;
> +#endif
> +
> +}
> +
> +#ifdef CONFIG_OPROFILE_CELL
> +
> +/* These functions can reenter/recurse when called from
> + * arch/powerpc/oprofile/common.c, which is why they use
> + * spin_trylock() instead of spin_lock().
> + */
> +static int oprofile_lock_init(void)
> +{
> +	int err = 0;
> +
> +	if (spin_trylock(&oprofile_register_lock)) {
> +		if (oprofile_registration_ops == NULL)
> +			oprofile_init();
> +		spin_unlock(&oprofile_register_lock);
> +	} else
> +		err = -EAGAIN;
> +	return err;
>  }
>  
> +static int oprofile_lock_exit(void)
> +{
> +	int err = 0;
> +	if (spin_trylock(&oprofile_register_lock)) {
> +		if (oprofile_registration_ops)
> +			oprofile_exit();
> +		spin_unlock(&oprofile_register_lock);
> +	} else
> +		err = -EAGAIN;
> +	return err;
> +}
> +
> +
> +static int oprofile_lock_reregister(void)
> +{
> +	int err;
> +	pr_debug("oprofile_lock_reregister\n");
> +	if (spin_trylock(&oprofile_register_lock)) {
> +		if (oprofile_registration_ops)
> +			oprofile_exit();
> +		err = oprofile_init();
> +		spin_unlock(&oprofile_register_lock);
> +	} else
> +		err = -EAGAIN;
> +	return err;
> +}
> +#endif
>   
>  module_init(oprofile_init);
>  module_exit(oprofile_exit);
>  
> +
>  module_param_named(timer, timer, int, 0644);
>  MODULE_PARM_DESC(timer, "force use of timer interrupt");
>   
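
A note on the trylock comment in the oprof.c hunk above: the idea is that a
nested caller gets -EAGAIN back instead of spinning on a lock that the same
call path already holds. The following is a minimal userspace sketch of that
behaviour, not kernel code. It assumes a C11 atomic_flag as a stand-in for the
spinlock, and register_lock, guarded_call() and reenter() are hypothetical
names that do not appear in the patch.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for oprofile_register_lock (userspace only). */
static atomic_flag register_lock = ATOMIC_FLAG_INIT;

/* Records what a nested call saw, so the caller can inspect it. */
static int nested_err;

/* Returns 1 if the lock was acquired, 0 if it was already held. */
static int try_lock(atomic_flag *lock)
{
	return !atomic_flag_test_and_set(lock);
}

static void unlock(atomic_flag *lock)
{
	atomic_flag_clear(lock);
}

/*
 * Same shape as the oprofile_lock_*() helpers in the patch: run the body
 * under the lock, or report -EAGAIN if the lock is already taken.
 */
static int guarded_call(int (*body)(void))
{
	int err = 0;

	if (try_lock(&register_lock)) {
		if (body)
			err = body();
		unlock(&register_lock);
	} else {
		err = -EAGAIN;
	}
	return err;
}

/* A body that re-enters guarded_call() while the lock is still held. */
static int reenter(void)
{
	nested_err = guarded_call(NULL);
	return 0;
}
```

Calling guarded_call(reenter) completes normally in this sketch, while the
nested call made from reenter() is turned away with -EAGAIN rather than
deadlocking. Callers therefore have to be prepared to handle -EAGAIN.
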
> --- a/drivers/oprofile/oprof.h
> +++ b/drivers/oprofile/oprof.h
> @@ -35,5 +35,17 @@ void oprofile_create_files(struct super_
>  void oprofile_timer_init(struct oprofile_operations * ops);
>  
>  int oprofile_set_backtrace(unsigned long depth);
> - 
> +
> +#ifdef CONFIG_OPROFILE_CELL
> +struct oprofile_registration_ops {
> +	int (*oprofile_lock_init)(void);
> +	int (*oprofile_lock_exit)(void);
> +	int (*oprofile_lock_reregister)(void);
> +};
> +
> +
> +extern struct oprofile_registration_ops *oprofile_registration_ops;
> +
> +#endif
> +
>  #endif /* OPROF_H */
> --- a/drivers/ps3/ps3-lpm.c
> +++ b/drivers/ps3/ps3-lpm.c
> @@ -24,8 +24,20 @@
>  #include <linux/uaccess.h>
>  #include <asm/ps3.h>
>  #include <asm/lv1call.h>
> +#include <linux/firmware.h>
> +#include <linux/io.h>
> +#include <linux/ptrace.h>
>  #include <asm/cell-pmu.h>
> -
> +#include <asm/cputable.h>
> +#include <asm/pmc.h>
> +#include <asm/irq_regs.h>
> +#include <../arch/powerpc/oprofile/cell/pr_util.h>
> +#include <asm/firmware.h>
> +#include <../drivers/oprofile/oprof.h>
> +
> +#ifndef CONFIG_OPROFILE_CELL
> +#error "ps3-lpm.c CONFIG_OPROFILE_CELL is not set."
> +#endif
>  
>  /* BOOKMARK tag macros */
>  #define PS3_PM_BOOKMARK_START                    0x8000000000000000ULL
> @@ -71,6 +83,9 @@
>  #define PM_SIG_GROUP_SPU_EVENT               43
>  #define PM_SIG_GROUP_MFC_MAX                 60
>  
> +static u32 oprofile_count_value[NR_PHYS_CTRS];
> +
> +
>  /**
>   * struct ps3_lpm_shadow_regs - Performance monitor shadow registers.
>   *
> @@ -152,6 +167,8 @@ enum {
>  
>  static struct ps3_lpm_priv *lpm_priv;
>  
> +static u32 ps3_get_ctr_size(u32 cpu, u32 phys_ctr);
> +
>  static struct device *sbd_core(void)
>  {
>  	BUG_ON(!lpm_priv || !lpm_priv->sbd);
> @@ -170,7 +187,7 @@ static struct device *sbd_core(void)
>  
>  enum {use_start_stop_bookmark = 1,};
>  
> -void ps3_set_bookmark(u64 bookmark)
> +static void ps3_set_bookmark(u64 bookmark)
>  {
>  	/*
>  	 * As per the PPE book IV, to avoid bookmark loss there must
> @@ -183,9 +200,10 @@ void ps3_set_bookmark(u64 bookmark)
>  	mtspr(SPRN_BKMK, bookmark);
>  	asm volatile("nop;nop;nop;nop;nop;nop;nop;nop;nop;");
>  }
> -EXPORT_SYMBOL_GPL(ps3_set_bookmark);
>  
> -void ps3_set_pm_bookmark(u64 tag, u64 incident, u64 th_id)
> +
> +#if 0
> +static void ps3_set_pm_bookmark(u64 tag, u64 incident, u64 th_id)
>  {
>  	u64 bookmark;
>  
> @@ -195,7 +213,7 @@ void ps3_set_pm_bookmark(u64 tag, u64 in
>  		(incident << 48) | (th_id << 32) | bookmark;
>  	ps3_set_bookmark(bookmark);
>  }
> -EXPORT_SYMBOL_GPL(ps3_set_pm_bookmark);
> +#endif
>  
>  /**
>   * ps3_read_phys_ctr - Read physical counter registers.
> @@ -204,7 +222,7 @@ EXPORT_SYMBOL_GPL(ps3_set_pm_bookmark);
>   * counters.
>   */
>  
> -u32 ps3_read_phys_ctr(u32 cpu, u32 phys_ctr)
> +static u32 ps3_read_phys_ctr(u32 cpu, u32 phys_ctr)
>  {
>  	int result;
>  	u64 counter0415;
> @@ -239,7 +257,7 @@ u32 ps3_read_phys_ctr(u32 cpu, u32 phys_
>  	}
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(ps3_read_phys_ctr);
> +
>  
>  /**
>   * ps3_write_phys_ctr - Write physical counter registers.
> @@ -248,7 +266,7 @@ EXPORT_SYMBOL_GPL(ps3_read_phys_ctr);
>   * counters.
>   */
>  
> -void ps3_write_phys_ctr(u32 cpu, u32 phys_ctr, u32 val)
> +static void ps3_write_phys_ctr(u32 cpu, u32 phys_ctr, u32 val)
>  {
>  	u64 counter0415;
>  	u64 counter0415_mask;
> @@ -300,7 +318,7 @@ void ps3_write_phys_ctr(u32 cpu, u32 phy
>  			"phys_ctr %u, val %u, %s\n", __func__, __LINE__,
>  			phys_ctr, val, ps3_result(result));
>  }
> -EXPORT_SYMBOL_GPL(ps3_write_phys_ctr);
> +
>  
>  /**
>   * ps3_read_ctr - Read counter.
> @@ -309,7 +327,7 @@ EXPORT_SYMBOL_GPL(ps3_write_phys_ctr);
>   * Counters 4, 5, 6 & 7 are always 16 bit.
>   */
>  
> -u32 ps3_read_ctr(u32 cpu, u32 ctr)
> +static u32 ps3_read_ctr(u32 cpu, u32 ctr)
>  {
>  	u32 val;
>  	u32 phys_ctr = ctr & (NR_PHYS_CTRS - 1);
> @@ -321,7 +339,7 @@ u32 ps3_read_ctr(u32 cpu, u32 ctr)
>  
>  	return val;
>  }
> -EXPORT_SYMBOL_GPL(ps3_read_ctr);
> +
>  
>  /**
>   * ps3_write_ctr - Write counter.
> @@ -330,7 +348,7 @@ EXPORT_SYMBOL_GPL(ps3_read_ctr);
>   * Counters 4, 5, 6 & 7 are always 16 bit.
>   */
>  
> -void ps3_write_ctr(u32 cpu, u32 ctr, u32 val)
> +static void ps3_write_ctr(u32 cpu, u32 ctr, u32 val)
>  {
>  	u32 phys_ctr;
>  	u32 phys_val;
> @@ -348,7 +366,7 @@ void ps3_write_ctr(u32 cpu, u32 ctr, u32
>  
>  	ps3_write_phys_ctr(cpu, phys_ctr, val);
>  }
> -EXPORT_SYMBOL_GPL(ps3_write_ctr);
> +
>  
>  /**
>   * ps3_read_pm07_control - Read counter control registers.
> @@ -356,11 +374,11 @@ EXPORT_SYMBOL_GPL(ps3_write_ctr);
>   * Each logical counter has a corresponding control register.
>   */
>  
> -u32 ps3_read_pm07_control(u32 cpu, u32 ctr)
> +static u32 ps3_read_pm07_control(u32 cpu, u32 ctr)
>  {
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(ps3_read_pm07_control);
> +
>  
>  /**
>   * ps3_write_pm07_control - Write counter control registers.
> @@ -368,7 +386,7 @@ EXPORT_SYMBOL_GPL(ps3_read_pm07_control)
>   * Each logical counter has a corresponding control register.
>   */
>  
> -void ps3_write_pm07_control(u32 cpu, u32 ctr, u32 val)
> +static void ps3_write_pm07_control(u32 cpu, u32 ctr, u32 val)
>  {
>  	int result;
>  	static const u64 mask = 0xFFFFFFFFFFFFFFFFULL;
> @@ -387,13 +405,13 @@ void ps3_write_pm07_control(u32 cpu, u32
>  			"failed: ctr %u, %s\n", __func__, __LINE__, ctr,
>  			ps3_result(result));
>  }
> -EXPORT_SYMBOL_GPL(ps3_write_pm07_control);
> +
>  
>  /**
>   * ps3_read_pm - Read Other LPM control registers.
>   */
>  
> -u32 ps3_read_pm(u32 cpu, enum pm_reg_name reg)
> +static u32 ps3_read_pm(u32 cpu, enum pm_reg_name reg)
>  {
>  	int result = 0;
>  	u64 val = 0;
> @@ -439,13 +457,13 @@ u32 ps3_read_pm(u32 cpu, enum pm_reg_nam
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(ps3_read_pm);
> +
>  
>  /**
>   * ps3_write_pm - Write Other LPM control registers.
>   */
>  
> -void ps3_write_pm(u32 cpu, enum pm_reg_name reg, u32 val)
> +static void ps3_write_pm(u32 cpu, enum pm_reg_name reg, u32 val)
>  {
>  	int result = 0;
>  	u64 dummy;
> @@ -507,7 +525,7 @@ void ps3_write_pm(u32 cpu, enum pm_reg_n
>  			"reg %u, %s\n", __func__, __LINE__, reg,
>  			ps3_result(result));
>  }
> -EXPORT_SYMBOL_GPL(ps3_write_pm);
> +
>  
>  /**
>   * ps3_get_ctr_size - Get the size of a physical counter.
> @@ -515,7 +533,7 @@ EXPORT_SYMBOL_GPL(ps3_write_pm);
>   * Returns either 16 or 32.
>   */
>  
> -u32 ps3_get_ctr_size(u32 cpu, u32 phys_ctr)
> +static u32 ps3_get_ctr_size(u32 cpu, u32 phys_ctr)
>  {
>  	u32 pm_ctrl;
>  
> @@ -528,13 +546,13 @@ u32 ps3_get_ctr_size(u32 cpu, u32 phys_c
>  	pm_ctrl = ps3_read_pm(cpu, pm_control);
>  	return (pm_ctrl & CBE_PM_16BIT_CTR(phys_ctr)) ? 16 : 32;
>  }
> -EXPORT_SYMBOL_GPL(ps3_get_ctr_size);
> +
>  
>  /**
>   * ps3_set_ctr_size - Set the size of a physical counter to 16 or 32 bits.
>   */
>  
> -void ps3_set_ctr_size(u32 cpu, u32 phys_ctr, u32 ctr_size)
> +static void ps3_set_ctr_size(u32 cpu, u32 phys_ctr, u32 ctr_size)
>  {
>  	u32 pm_ctrl;
>  
> @@ -560,7 +578,7 @@ void ps3_set_ctr_size(u32 cpu, u32 phys_
>  		BUG();
>  	}
>  }
> -EXPORT_SYMBOL_GPL(ps3_set_ctr_size);
> +
>  
>  static u64 pm_translate_signal_group_number_on_island2(u64 subgroup)
>  {
> @@ -770,7 +788,7 @@ static int __ps3_set_signal(u64 lv1_sign
>  	return ret;
>  }
>  
> -int ps3_set_signal(u64 signal_group, u8 signal_bit, u16 sub_unit,
> +static int ps3_set_signal(u64 signal_group, u8 signal_bit, u16 sub_unit,
>  		   u8 bus_word)
>  {
>  	int ret;
> @@ -830,13 +848,13 @@ int ps3_set_signal(u64 signal_group, u8 
>  
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(ps3_set_signal);
>  
> -u32 ps3_get_hw_thread_id(int cpu)
> +
> +static u32 ps3_get_hw_thread_id(int cpu)
>  {
>  	return get_hard_smp_processor_id(cpu);
>  }
> -EXPORT_SYMBOL_GPL(ps3_get_hw_thread_id);
> +
>  
>  /**
>   * ps3_enable_pm - Enable the entire performance monitoring unit.
> @@ -844,7 +862,7 @@ EXPORT_SYMBOL_GPL(ps3_get_hw_thread_id);
>   * When we enable the LPM, all pending writes to counters get committed.
>   */
>  
> -void ps3_enable_pm(u32 cpu)
> +static void ps3_enable_pm(u32 cpu)
>  {
>  	int result;
>  	u64 tmp;
> @@ -882,13 +900,13 @@ void ps3_enable_pm(u32 cpu)
>  	if (use_start_stop_bookmark && !result && insert_bookmark)
>  		ps3_set_bookmark(get_tb() | PS3_PM_BOOKMARK_START);
>  }
> -EXPORT_SYMBOL_GPL(ps3_enable_pm);
> +
>  
>  /**
>   * ps3_disable_pm - Disable the entire performance monitoring unit.
>   */
>  
> -void ps3_disable_pm(u32 cpu)
> +static void ps3_disable_pm(u32 cpu)
>  {
>  	int result;
>  	u64 tmp;
> @@ -909,7 +927,15 @@ void ps3_disable_pm(u32 cpu)
>  	dev_dbg(sbd_core(), "%s:%u: tb_count %lu (%lxh)\n", __func__, __LINE__,
>  		lpm_priv->tb_count, lpm_priv->tb_count);
>  }
> -EXPORT_SYMBOL_GPL(ps3_disable_pm);
> +
> +
> +/* FIXME (DJB): stub that only zeroes the first two words of @buf. */
> +static void ps3_read_trace_buffer(u32 cpu, u64 *buf)
> +{
> +	*buf++ = 0;
> +	*buf++ = 0;
> +}
> +
>  
>  /**
>   * ps3_lpm_copy_tb - Copy data from the trace buffer to a kernel buffer.
> @@ -922,8 +948,8 @@ EXPORT_SYMBOL_GPL(ps3_disable_pm);
>   * On error @buf will contain any successfully copied trace buffer data
>   * and bytes_copied will be set to the number of bytes successfully copied.
>   */
> -
> -int ps3_lpm_copy_tb(unsigned long offset, void *buf, unsigned long count,
> +#if 0
> +static int ps3_lpm_copy_tb(unsigned long offset, void *buf, unsigned long count,
>  		    unsigned long *bytes_copied)
>  {
>  	int result;
> @@ -964,7 +990,7 @@ int ps3_lpm_copy_tb(unsigned long offset
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(ps3_lpm_copy_tb);
> +
>  
>  /**
>   * ps3_lpm_copy_tb_to_user - Copy data from the trace buffer to a user buffer.
> @@ -978,7 +1004,7 @@ EXPORT_SYMBOL_GPL(ps3_lpm_copy_tb);
>   * and bytes_copied will be set to the number of bytes successfully copied.
>   */
>  
> -int ps3_lpm_copy_tb_to_user(unsigned long offset, void __user *buf,
> +static int ps3_lpm_copy_tb_to_user(unsigned long offset, void __user *buf,
>  			    unsigned long count, unsigned long *bytes_copied)
>  {
>  	int result;
> @@ -1027,7 +1053,7 @@ int ps3_lpm_copy_tb_to_user(unsigned lon
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(ps3_lpm_copy_tb_to_user);
> +#endif /* 0 */
>  
>  /**
>   * ps3_get_and_clear_pm_interrupts -
> @@ -1036,11 +1062,11 @@ EXPORT_SYMBOL_GPL(ps3_lpm_copy_tb_to_use
>   * Reading pm_status clears the interrupt bits.
>   */
>  
> -u32 ps3_get_and_clear_pm_interrupts(u32 cpu)
> +static u32 ps3_get_and_clear_pm_interrupts(u32 cpu)
>  {
>  	return ps3_read_pm(cpu, pm_status);
>  }
> -EXPORT_SYMBOL_GPL(ps3_get_and_clear_pm_interrupts);
> +
>  
>  /**
>   * ps3_enable_pm_interrupts -
> @@ -1049,12 +1075,12 @@ EXPORT_SYMBOL_GPL(ps3_get_and_clear_pm_i
>   * Enables the interrupt bits in the pm_status register.
>   */
>  
> -void ps3_enable_pm_interrupts(u32 cpu, u32 thread, u32 mask)
> +static void ps3_enable_pm_interrupts(u32 cpu, u32 thread, u32 mask)
>  {
>  	if (mask)
>  		ps3_write_pm(cpu, pm_status, mask);
>  }
> -EXPORT_SYMBOL_GPL(ps3_enable_pm_interrupts);
> +
>  
>  /**
>   * ps3_enable_pm_interrupts -
> @@ -1062,12 +1088,12 @@ EXPORT_SYMBOL_GPL(ps3_enable_pm_interrup
>   * Disabling interrupts for the entire performance monitoring unit.
>   */
>  
> -void ps3_disable_pm_interrupts(u32 cpu)
> +static void ps3_disable_pm_interrupts(u32 cpu)
>  {
>  	ps3_get_and_clear_pm_interrupts(cpu);
>  	ps3_write_pm(cpu, pm_status, 0);
>  }
> -EXPORT_SYMBOL_GPL(ps3_disable_pm_interrupts);
> +
>  
>  /**
>   * ps3_lpm_open - Open the logical performance monitor device.
> @@ -1080,7 +1106,7 @@ EXPORT_SYMBOL_GPL(ps3_disable_pm_interru
>   *  Unused when @tb_cache is NULL or @tb_type is PS3_LPM_TB_TYPE_NONE.
>   */
>  
> -int ps3_lpm_open(enum ps3_lpm_tb_type tb_type, void *tb_cache,
> +static int ps3_lpm_open(enum ps3_lpm_tb_type tb_type, void *tb_cache,
>  	u64 tb_cache_size)
>  {
>  	int result;
> @@ -1160,14 +1186,14 @@ fail_align:
>  	atomic_dec(&lpm_priv->open);
>  	return result;
>  }
> -EXPORT_SYMBOL_GPL(ps3_lpm_open);
> +
>  
>  /**
>   * ps3_lpm_close - Close the lpm device.
>   *
>   */
>  
> -int ps3_lpm_close(void)
> +static int ps3_lpm_close(void)
>  {
>  	dev_dbg(sbd_core(), "%s:%u\n", __func__, __LINE__);
>  
> @@ -1180,7 +1206,7 @@ int ps3_lpm_close(void)
>  	atomic_dec(&lpm_priv->open);
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(ps3_lpm_close);
> +
>  
>  static int __devinit ps3_lpm_probe(struct ps3_system_bus_device *dev)
>  {
> @@ -1229,9 +1255,248 @@ static struct ps3_system_bus_driver ps3_
>  	.shutdown	= ps3_lpm_remove,
>  };
>  
> +
> +
> +static u32 ps3_cpu_to_node(int cpu)
> +{
> +	return 0;
> +}
> +
> +static void ps3_set_pm_event_part1(struct pm_signal *p)
> +{
> +	/*
> +	 * This parameter is used to specify the target physical/logical
> +	 * PPE/SPE object.
> +	 */
> +	if (p->signal_group < 42 || 56 < p->signal_group)
> +		p->sub_unit = 1;
> +}
> +
> +static void ps3_set_pm_event_part2(struct pm_signal *p, u32  *bus_type)
> +{
> +	if ((*bus_type == 0) &&
> +	    (30 <= p->signal_group && p->signal_group <= 40))
> +		*bus_type = 2;
> +}
> +
> +
> +static void ps3_virtual_cntr_part1(int num_counters,
> +				   int prev_hdw_thread,
> +				   int next_hdw_thread, u32 cpu)
> +{
> +}
> +
> +static void ps3_add_sample(u32 cpu)
> +{
> +	struct pt_regs *regs;
> +	u64 pc;
> +	int is_kernel;
> +	int i;
> +	u32 value;
> +
> +	regs = get_irq_regs();
> +	if (oprofile_running == 1) {
> +		pc = regs->nip;
> +		is_kernel = is_kernel_addr(pc);
> +
> +		for (i = 0; i < oprofile_num_counters; ++i) {
> +			value = pmu_ops->read_ctr(cpu, i);
> +			if (value >= oprofile_count_value[i] &&
> +			    oprofile_count_value[i] != 0) {
> +				OP_DBG("pmu:add_sample ctr:%d"
> +				       " value:0x%x reset:0x%x count:0x%x",
> +				       i, value, oprofile_reset_value[i],
> +				       oprofile_count_value[i]);
> +				oprofile_add_pc(pc, is_kernel, i);
> +				ps3_write_ctr(cpu, i, oprofile_reset_value[i]);
> +			}
> +		}
> +	}
> +}
> +
> +static int ps3_reg_setup_part1(struct op_counter_config *ctr)
> +{
> +	int ret;
> +
> +	spu_cycle_reset = 0;
> +
> +	ret = ps3_lpm_open(PS3_LPM_TB_TYPE_NONE, NULL, 0);
> +	if (ret) {
> +		OP_ERR("lpm_open error. %d", ret);
> +		return -EFAULT;
> +	}
> +
> +	if (ctr[0].event == SPU_CYCLES_EVENT_NUM)
> +		spu_cycle_reset = ctr[0].count;
> +	return 0;
> +
> +}
> +
> +static void ps3_reg_setup_part2(int i, struct op_counter_config *ctr)
> +{
> +	oprofile_reset_value[i] = 0;
> +	oprofile_count_value[i] = ctr[i].count;
> +}
> +
> +static int ps3_pm_activate_signals(u32 node, u32 count)
> +{
> +	int i, j;
> +	struct pm_signal pm_signal_local[NR_PHYS_CTRS];
> +
> +	/*
> +	 * There is no debug setup required for the cycles event.
> +	 * Note that only events in the same group can be used.
> +	 * Otherwise, there will be conflicts in correctly routing
> +	 * the signals on the debug bus.  It is the responsibility
> +	 * of the OProfile user tool to check the events are in
> +	 * the same group.
> +	 */
> +	i = 0;
> +	for (j = 0; j < count; j++) {
> +		if (pm_signal[j].signal_group != PPU_CYCLES_GRP_NUM) {
> +
> +			/* fw expects physical cpu # */
> +			pm_signal_local[i].cpu = node;
> +			pm_signal_local[i].signal_group
> +			    = pm_signal[j].signal_group;
> +			pm_signal_local[i].bus_word =
> +			    pm_signal[j].bus_word;
> +			pm_signal_local[i].sub_unit =
> +			    pm_signal[j].sub_unit;
> +			pm_signal_local[i].bit = pm_signal[j].bit;
> +			ps3_set_signal(pm_signal[j].signal_group,
> +				       pm_signal[j].bit,
> +				       pm_signal[j].sub_unit,
> +				       pm_signal[j].bus_word);
> +			i++;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static int ps3_global_start_spu(struct op_counter_config *ctr)
> +{
> +	return -ENOSYS;
> +}
> +
> +static void ps3_global_stop_spu(void)
> +{
> +}
> +
> +static void ps3_global_stop_ppu(void)
> +{
> +	int cpu;
> +
> +	/*
> +	 * This routine will be called once for the system.
> +	 * There is one performance monitor per node, so we
> +	 * only need to perform this function once per node.
> +	 */
> +	del_timer_sync(&oprofile_timer_virt_cntr);
> +	oprofile_running = 0;
> +	/* complete the previous store */
> +	smp_wmb();
> +
> +	for_each_online_cpu(cpu) {
> +		if (ps3_get_hw_thread_id(cpu))
> +			continue;
> +		/* Stop the counters */
> +		ps3_disable_pm(cpu);
> +
> +		/* Deactivate the signals */
> +		ps3_set_signal(0, 0, 0, 0);	/* clear all */
> +
> +		/* Deactivate interrupts */
> +		ps3_disable_pm_interrupts(cpu);
> +	}
> +}
> +
> +
> +
> +struct pmu_ops pmu_ops_ps3 = {
> +	.read_phys_ctr               = ps3_read_phys_ctr,
> +	.write_phys_ctr              = ps3_write_phys_ctr,
> +	.read_ctr                    = ps3_read_ctr,
> +	.write_ctr	             = ps3_write_ctr,
> +	.read_pm07_control           = ps3_read_pm07_control,
> +	.write_pm07_control          = ps3_write_pm07_control,
> +	.read_pm                     = ps3_read_pm,
> +	.write_pm                    = ps3_write_pm,
> +	.get_ctr_size                = ps3_get_ctr_size,
> +	.set_ctr_size                = ps3_set_ctr_size,
> +	.enable_pm                   = ps3_enable_pm,
> +	.disable_pm                  = ps3_disable_pm,
> +	.read_trace_buffer           = ps3_read_trace_buffer,
> +	.get_and_clear_pm_interrupts = ps3_get_and_clear_pm_interrupts,
> +	.enable_pm_interrupts        = ps3_enable_pm_interrupts,
> +	.disable_pm_interrupts       = ps3_disable_pm_interrupts,
> +	.pmu_cpu_to_node             = ps3_cpu_to_node,
> +	.get_hw_thread_id            = ps3_get_hw_thread_id,
> +	.set_pm_event_part1          = ps3_set_pm_event_part1,
> +	.set_pm_event_part2          = ps3_set_pm_event_part2,
> +	.set_count_mode_var1         = CBE_COUNT_PROBLEM_MODE,
> +	.set_count_mode_var2         = CBE_COUNT_PROBLEM_MODE,
> +	.virtual_cntr_part1          = ps3_virtual_cntr_part1,
> +	.add_sample                  = ps3_add_sample,
> +	.reg_setup_part1             = ps3_reg_setup_part1,
> +	.reg_setup_part2             = ps3_reg_setup_part2,
> +	.pm_activate_signals         = ps3_pm_activate_signals,
> +	.global_start_spu            = ps3_global_start_spu,
> +	.global_stop_spu             = ps3_global_stop_spu,
> +	.global_stop_ppu             = ps3_global_stop_ppu,
> +};
> +
> +/*
> + * This function is called from the generic OProfile
> + * driver.  When profiling PPUs, we need to do the
> + * generic sync start; otherwise, do spu_sync_start.
> + */
> +static int ps3_sync_start(void)
> +{
> +	OP_ERR("PS3 oprofile support");
> +	return DO_GENERIC_SYNC;
> +}
> +
> +static int ps3_sync_stop(void)
> +{
> +	int ret;
> +
> +	ret = ps3_lpm_close();
> +	if (ret)
> +		OP_ERR("lpm_close error. %d", ret);
> +
> +	return 1;
> +	OP_ERR("PS3 oprofile support");
> +	return DO_GENERIC_SYNC;
> +}
> +
> +
> +
> +
> +struct op_powerpc_model op_model_ps3 = {
> +	.reg_setup = cell_reg_setup,
> +	.cpu_setup = cell_cpu_setup,
> +	.global_start = cell_global_start,
> +	.global_stop = cell_global_stop,
> +	.sync_start = ps3_sync_start,
> +	.sync_stop = ps3_sync_stop,
> +};
> +
> +
> +
>  static int __init ps3_lpm_init(void)
>  {
>  	pr_debug("%s:%d:\n", __func__, __LINE__);
> +	if (firmware_has_feature(FW_FEATURE_PS3_LV1)) {
> +		pr_debug("%s:%d:\n", __func__, __LINE__);
> +		pmu_ops = &pmu_ops_ps3;
> +		op_powerpc_model = &op_model_ps3;
> +		if (oprofile_registration_ops) {
> +			pr_debug("%s:%d:\n", __func__, __LINE__);
> +			oprofile_registration_ops->oprofile_lock_reregister();
> +		}
> +	} else
> +		return -ENODEV;
>  	return ps3_system_bus_driver_register(&ps3_lpm_driver);
>  }
>  
> @@ -1239,6 +1504,12 @@ static void __exit ps3_lpm_exit(void)
>  {
>  	pr_debug("%s:%d:\n", __func__, __LINE__);
>  	ps3_system_bus_driver_unregister(&ps3_lpm_driver);
> +	if (pmu_ops == &pmu_ops_ps3) {
> +		if (oprofile_registration_ops)
> +			oprofile_registration_ops->oprofile_lock_exit();
> +		pmu_ops = NULL;
> +		op_powerpc_model = NULL;
> +	}
>  }
>  
>  module_init(ps3_lpm_init);
> @@ -1248,3 +1519,7 @@ MODULE_LICENSE("GPL v2");
>  MODULE_DESCRIPTION("PS3 Logical Performance Monitor Driver");
>  MODULE_AUTHOR("Sony Corporation");
>  MODULE_ALIAS(PS3_MODULE_ALIAS_LPM);
> +
> +
> +
> +
> --- a/include/asm-powerpc/cell-pmu.h
> +++ b/include/asm-powerpc/cell-pmu.h
> @@ -24,6 +24,8 @@
>  
>  #ifndef __ASM_CELL_PMU_H__
>  #define __ASM_CELL_PMU_H__
> +#include <asm/oprofile_impl.h>
> +#include <asm/percpu.h>
>  
>  /* The Cell PMU has four hardware performance counters, which can be
>   * configured as four 32-bit counters or eight 16-bit counters.
> @@ -73,33 +75,97 @@ enum pm_reg_name {
>  	pm_start_stop,
>  };
>  
> -/* Routines for reading/writing the PMU registers. */
> -extern u32  cbe_read_phys_ctr(u32 cpu, u32 phys_ctr);
> -extern void cbe_write_phys_ctr(u32 cpu, u32 phys_ctr, u32 val);
> -extern u32  cbe_read_ctr(u32 cpu, u32 ctr);
> -extern void cbe_write_ctr(u32 cpu, u32 ctr, u32 val);
> -
> -extern u32  cbe_read_pm07_control(u32 cpu, u32 ctr);
> -extern void cbe_write_pm07_control(u32 cpu, u32 ctr, u32 val);
> -extern u32  cbe_read_pm(u32 cpu, enum pm_reg_name reg);
> -extern void cbe_write_pm(u32 cpu, enum pm_reg_name reg, u32 val);
> -
> -extern u32  cbe_get_ctr_size(u32 cpu, u32 phys_ctr);
> -extern void cbe_set_ctr_size(u32 cpu, u32 phys_ctr, u32 ctr_size);
> -
> -extern void cbe_enable_pm(u32 cpu);
> -extern void cbe_disable_pm(u32 cpu);
> -
> -extern void cbe_read_trace_buffer(u32 cpu, u64 *buf);
> -
> -extern void cbe_enable_pm_interrupts(u32 cpu, u32 thread, u32 mask);
> -extern void cbe_disable_pm_interrupts(u32 cpu);
> -extern u32  cbe_get_and_clear_pm_interrupts(u32 cpu);
> -extern void cbe_sync_irq(int node);
> -
>  #define CBE_COUNT_SUPERVISOR_MODE       0
>  #define CBE_COUNT_HYPERVISOR_MODE       1
>  #define CBE_COUNT_PROBLEM_MODE          2
>  #define CBE_COUNT_ALL_MODES             3
>  
> +
> +#define NUM_SPUS_PER_NODE    8
> +#define SPU_CYCLES_EVENT_NUM 2	/*  event number for SPU_CYCLES */
> +
> +#define PPU_CYCLES_EVENT_NUM 1	/*  event number for CYCLES */
> +#define PPU_CYCLES_GRP_NUM   1	/* special group number for identifying
> +				 * PPU_CYCLES event
> +				 */
> +#define CBE_COUNT_ALL_CYCLES 0x42800000 /* PPU cycle event specifier */
> +
> +#define PPU_NUM_THREADS 2         /* number of physical threads in
> +			       * physical processor
> +			       */
> +#define NUM_DEBUG_BUS_WORDS 4
> +#define NUM_INPUT_BUS_WORDS 2
> +
> +#define MAX_SPU_COUNT 0xFFFFFF	/* maximum 24 bit LFSR value */
> +
> +#define OP_ERR(f, x...)  pr_info("pmu: " f "\n", ## x)
> +#define OP_DBG(f, x...)  pr_debug("pmu: " f "\n", ## x)
> +
> +/*
> + * ibm,cbe-perftools rtas parameters
> + */
> +struct pm_signal {
> +	u16 cpu;		/* Processor to modify */
> +	u16 sub_unit;		/* hw subunit this applies to (if applicable)*/
> +	short int signal_group; /* Signal Group to Enable/Disable */
> +	u8 bus_word;		/* Enable/Disable on this Trace/Trigger/Event
> +				 * Bus Word(s) (bitmask)
> +				 */
> +	u8 bit;			/* Trigger/Event bit (if applicable) */
> +};
> +
> +struct pmu_ops {
> +	u32 (*read_phys_ctr)(u32 cpu, u32 phys_ctr);
> +	void (*write_phys_ctr)(u32 cpu, u32 phys_ctr, u32 val);
> +	u32 (*read_ctr)(u32 cpu, u32 ctr);
> +	void (*write_ctr)(u32 cpu, u32 ctr, u32 val);
> +	u32 (*read_pm07_control)(u32 cpu, u32 ctr);
> +	void (*write_pm07_control)(u32 cpu, u32 ctr, u32 val);
> +	u32 (*read_pm)(u32 cpu, enum pm_reg_name reg);
> +	void (*write_pm)(u32 cpu, enum pm_reg_name reg, u32 val);
> +	u32  (*get_ctr_size)(u32 cpu, u32 phys_ctr);
> +	void (*set_ctr_size)(u32 cpu, u32 phys_ctr, u32 ctr_size);
> +	void (*enable_pm)(u32 cpu);
> +	void (*disable_pm)(u32 cpu);
> +	void (*read_trace_buffer)(u32 cpu, u64 *buf);
> +	u32  (*get_and_clear_pm_interrupts)(u32 cpu);
> +	void (*enable_pm_interrupts)(u32 cpu, u32 thread, u32 mask);
> +	void (*disable_pm_interrupts)(u32 cpu);
> +	u32  (*pmu_cpu_to_node)(int cpu);
> +	u32  (*get_hw_thread_id)(int cpu);
> +	void (*set_pm_event_part1)(struct pm_signal *p);
> +	void (*set_pm_event_part2)(struct pm_signal *p, u32 *bus_type);
> +	u16   set_count_mode_var1;
> +	u16   set_count_mode_var2;
> +	void (*virtual_cntr_part1)(int num_counters, int prev_hdw_thread,
> +				   int next_hdw_thread, u32 cpu);
> +	void (*add_sample)(u32 cpu);
> +	int  (*reg_setup_part1)(struct op_counter_config *ctr);
> +	void  (*reg_setup_part2)(int i, struct op_counter_config *ctr);
> +	int  (*pm_activate_signals)(u32 node, u32 count);
> +	int  (*global_start_spu)(struct op_counter_config *ctr);
> +	void (*global_stop_spu)(void);
> +	void (*global_stop_ppu)(void);
> +};
> +
> +extern struct pmu_ops *pmu_ops;
> +extern struct op_powerpc_model *op_powerpc_model;
> +extern int oprofile_num_counters;
> +extern int oprofile_running;
> +extern unsigned int spu_cycle_reset;
> +extern u32 oprofile_hdw_thread;
> +extern struct pm_signal pm_signal[NR_PHYS_CTRS];
> +extern u32 oprofile_reset_value[NR_PHYS_CTRS];
> +extern int cell_reg_setup(struct op_counter_config *ctr,
> +			  struct op_system_config *sys, int num_ctrs);
> +extern int cell_cpu_setup(struct op_counter_config *cntr);
> +extern int cell_global_start(struct op_counter_config *ctr);
> +extern void cell_global_stop(void);
> +DECLARE_PER_CPU(unsigned long[NR_PHYS_CTRS], pmc_values);
> +extern spinlock_t oprofile_virt_cntr_lock;
> +extern u32 oprofile_virt_cntr_inter_mask;
> +extern struct timer_list oprofile_timer_virt_cntr;
> +extern int cbe_init_pm_irq(void);
> +extern void cbe_remove_pm_irq(void);
> +
>  #endif /* __ASM_CELL_PMU_H__ */
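
For readers following the counter handling: the cell-pmu.h comment above says
the four hardware counters can be configured as four 32-bit or eight 16-bit
counters, and ps3_read_ctr() maps a logical counter to a physical one with
ctr & (NR_PHYS_CTRS - 1). The sketch below illustrates that mapping in plain
C. NR_PHYS_CTRS = 4 is taken from the "four hardware performance counters"
wording. Which 16-bit half belongs to which logical counter is not visible in
the quoted hunks, so the upper/lower split here is an assumption, and
phys_ctr_index() and logical_ctr_value() are hypothetical helper names.

```c
#include <stdint.h>

#define NR_PHYS_CTRS 4	/* assumed: four physical counters */

/* Logical counters 0-7 fold onto physical counters 0-3. */
static uint32_t phys_ctr_index(uint32_t ctr)
{
	return ctr & (NR_PHYS_CTRS - 1);
}

/*
 * Extract the value of logical counter @ctr from the raw 32-bit value of
 * its physical counter. In 16-bit mode, logical counters 0-3 are assumed
 * to use the upper half and 4-7 the lower half.
 */
static uint32_t logical_ctr_value(uint32_t phys_val, uint32_t ctr,
				  uint32_t ctr_size)
{
	if (ctr_size == 16)
		return (ctr < NR_PHYS_CTRS) ? (phys_val >> 16)
					    : (phys_val & 0xffff);
	return phys_val;
}
```

Under these assumptions, logical counters 2 and 6 both resolve to physical
counter 2, and in 16-bit mode each sees a different half of its 32-bit value.
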
> --- a/include/asm-powerpc/oprofile_impl.h
> +++ b/include/asm-powerpc/oprofile_impl.h
> @@ -59,6 +59,7 @@ extern struct op_powerpc_model op_model_
>  extern struct op_powerpc_model op_model_power4;
>  extern struct op_powerpc_model op_model_7450;
>  extern struct op_powerpc_model op_model_cell;
> +extern struct op_powerpc_model op_model_ps3;
>  extern struct op_powerpc_model op_model_pa6t;
>  
> 
> --- a/include/asm-powerpc/ps3.h
> +++ b/include/asm-powerpc/ps3.h
> @@ -475,39 +475,4 @@ enum ps3_lpm_tb_type {
>  	PS3_LPM_TB_TYPE_INTERNAL = 1,
>  };
>  
> -int ps3_lpm_open(enum ps3_lpm_tb_type tb_type, void *tb_cache,
> -	u64 tb_cache_size);
> -int ps3_lpm_close(void);
> -int ps3_lpm_copy_tb(unsigned long offset, void *buf, unsigned long count,
> -	unsigned long *bytes_copied);
> -int ps3_lpm_copy_tb_to_user(unsigned long offset, void __user *buf,
> -	unsigned long count, unsigned long *bytes_copied);
> -void ps3_set_bookmark(u64 bookmark);
> -void ps3_set_pm_bookmark(u64 tag, u64 incident, u64 th_id);
> -int ps3_set_signal(u64 rtas_signal_group, u8 signal_bit, u16 sub_unit,
> -	u8 bus_word);
> -
> -u32 ps3_read_phys_ctr(u32 cpu, u32 phys_ctr);
> -void ps3_write_phys_ctr(u32 cpu, u32 phys_ctr, u32 val);
> -u32 ps3_read_ctr(u32 cpu, u32 ctr);
> -void ps3_write_ctr(u32 cpu, u32 ctr, u32 val);
> -
> -u32 ps3_read_pm07_control(u32 cpu, u32 ctr);
> -void ps3_write_pm07_control(u32 cpu, u32 ctr, u32 val);
> -u32 ps3_read_pm(u32 cpu, enum pm_reg_name reg);
> -void ps3_write_pm(u32 cpu, enum pm_reg_name reg, u32 val);
> -
> -u32 ps3_get_ctr_size(u32 cpu, u32 phys_ctr);
> -void ps3_set_ctr_size(u32 cpu, u32 phys_ctr, u32 ctr_size);
> -
> -void ps3_enable_pm(u32 cpu);
> -void ps3_disable_pm(u32 cpu);
> -void ps3_enable_pm_interrupts(u32 cpu, u32 thread, u32 mask);
> -void ps3_disable_pm_interrupts(u32 cpu);
> -
> -u32 ps3_get_and_clear_pm_interrupts(u32 cpu);
> -void ps3_sync_irq(int node);
> -u32 ps3_get_hw_thread_id(int cpu);
> -u64 ps3_get_spe_id(void *arg);
> -
>  #endif
> 
> 
> 
> 
> 



