[Skiboot] [PATCH v12 08/10] skiboot: Add opal call to enable/disable Nest IMC

Michael Neuling mikey at neuling.org
Thu Jun 15 13:47:57 AEST 2017


On Wed, 2017-06-14 at 20:53 +0530, Madhavan Srinivasan wrote:
> 
> On Wednesday 14 June 2017 10:06 AM, Michael Neuling wrote:
> > On Sun, 2017-05-21 at 20:40 +0530, Madhavan Srinivasan wrote:
> > > From: Anju T Sudhakar <anju at linux.vnet.ibm.com>
> > > 
> > > Add new opal calls to start, stop the Nest IMC microcode running in the
> > > OCC complex. Also, check the status from the control block structure
> > > before
> > > starting/stopping the IMC microcode.
> > > 
> > > This patch also add an opal call to init the counters. But for Nest IMC
> > > this
> > > function is a no-op currently.
> > > 
> > > Signed-off-by: Hemant Kumar <hemant at linux.vnet.ibm.com>
> > > Signed-off-by: Anju T Sudhakar <anju at linux.vnet.ibm.com>
> > > Signed-off-by: Madhavan Srinivasan <maddy at linux.vnet.ibm.com>
> > > ---
> > >   hw/imc.c           | 77
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >   include/opal-api.h | 11 +++++++-
> > >   2 files changed, 87 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/imc.c b/hw/imc.c
> > > index 00e19154d641..d56b3eb251b2 100644
> > > --- a/hw/imc.c
> > > +++ b/hw/imc.c
> > > @@ -336,3 +336,80 @@ err:
> > >   	prerror("IMC Devices not added\n");
> > >   	free(imc_catalog_buf);
> > >   }
> > > +
> > > +/*
> > > + * opal_imc_counters_init : This call initialize the IMC engine.
> > > + *
> > > + * For Nest IMC, this is no-op and returns OPAL_SUCCESS at this point.
> > > + */
> > > +static int64_t opal_imc_counters_init(uint32_t type)
> > > +{
> > > +	return OPAL_SUCCESS;
> > > +}
> > 
> > Why don't we do this on in imc_init()?  Why do we rely on the OS doing this?
> 
> Yes. From the nest imc point of view, we dont have any initialization
> today. But incase of core imc, we do have initialization procedure.
> That is, we need to update couple of scoms with a static value and
> one scom called "PDBAR" (scom spec did not expand what is PDBAR
> but will dig more and update it here) needs to be initialized with
> a memory location (where the core imc counter data will be posted
> per-the core logic). And this is a per-core resource.
> 
> Now, not doing this memory location by OPAL will avoid a
> complicated device tree node update from OPAL to pass this
> information to kernel. Secondly size request for the core imc
> is 8K (per core) and in a 2 socket or more we will increase
> the opal reserve memory area alot.
> 
> Now incase of doing it in kernel, at the kexec scenario, we
> first stop the core imc engine via OPAL_IMC_COUNTER_STOP
> and then we free the memory. And this is primarily done
> in the powernv device shutdown path.
> 
> And in the powernv device probe path we have a check
> for the kdump kernel, so we dont load or create IMC devices
> and hence will not do any OPAL_IMC* calls.

Ok, but the microcode will still be writing to memory in the kdump case.  It
would be good to be able to stop that with an opal call.



> > 
> > Anyway... as mentioned before, this alone doesn't compile.
> 
> Yes i will merge the patch 8 and 9 as mentioned before.
> 
> > 
> > > +opal_call(OPAL_IMC_COUNTERS_INIT, opal_imc_counters_init, 1);
> > > +
> > > +/* opal_imc_counters_control_start: This call starts the nest imc engine.
> > > */
> > > +static int64_t opal_imc_counters_start(uint32_t type)
> > > +{
> > > +	u64 op, status;
> > > +	struct imc_chip_cb *cb;
> > > +	int ret = OPAL_SUCCESS;
> > 
> > I'm not sure there is any benefit in having ret.  Makes the code less
> > readable.
> > Just do return OPAL_SUCCESS below.
> 
> Ok sure. Will make the changes.
> 
> > > +
> > > +	switch (type) {
> > > +	case OPAL_IMC_COUNTERS_NEST:
> > > +		/* Fetch the IMC control block structure */
> > > +		cb = get_imc_cb();
> > > +		status = be64_to_cpu(cb->imc_chip_run_status);
> > > +
> > > +		/* Check whether the engine is already running */
> > > +		if (status == NEST_IMC_RUNNING)
> > > +			return ret;
> > 
> > Do we need a memory barrier here?  What ensures ordering between the status
> > read
> > above and the command below?
> 
> Currently nest counters are per-chip and only one thread from a chip
> will update the command at a give time. Secondly, "command" is more
> an input to the microcode. OPAL will not read it back.

I still think there could be a race here.  Sequence seems to be:

1) Microcode sets status to running
2) Microcode start running and hence writing to buffer
3) CPU (in OPAL) reads status
4) CPU sends code

What ensures that at 3 the CPU sees the status correctly as running set in 1?

What are the consequences of sending a command when the microcode thinks it's
running in this scenario? Is it just that it keeps running?  If so, why do we
bother checking the status at all?

> > > +
> > > +		/* Set the run command */
> > > +		op = NEST_IMC_ENABLE;
> > > +
> > > +		/* Write the command to the control block now */
> > > +		cb->imc_chip_command = op;
> > 
> > Do we need a memory barrier before we can start reading these after this op?
> > and if so, who's responsibility is it do do that?  OPAL, or Linux?
> 
> Linux will pass on the state the engine should be and OPAL will update
> the command request from the linux. And microcode is the consumer
> when comes to "command" field. OPAL does not read it back. Now thinking
> more on it having a memory barrier will help. Will add it here.

I'm not sure we need anything... micro code just starts writing and linux will
just start seeing updates at some point.

Mikey


More information about the Skiboot mailing list