[Skiboot] [PATCH] System reset IPI facility and Mambo implementation

Nicholas Piggin npiggin at gmail.com
Fri Feb 3 19:28:47 AEDT 2017


On Fri, 03 Feb 2017 17:08:51 +1100
Stewart Smith <stewart at linux.vnet.ibm.com> wrote:

> Nicholas Piggin <npiggin at gmail.com> writes:
> > Add an opal call OPAL_SIGNAL_SYSTEM_RESET which allows system reset
> > exceptions to be raised on other CPUs and act as an NMI IPI. There
> > is an initial simple Mambo implementation, but allowances are made
> > for a more complex hardware implementation.
> >
> > This is based on the hardware implementation patch by Alistair Popple.
> >
> > Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
> > ---
> > Hi,
> >
> > I'm not sure what the state of Alistair's patch is, but it may
> > make testing easier to initially add this mambo implementation.  
> 
> I'm hoping he spins a V2 on top of this patch :)
> 
> A thought... this is the kind of functionality that's ideal to get some
> kind of test case written for so that we don't go and by accident break
> it. So I'm CCing in Pridhiviraj so that it's on the todo list to get
> going as part of op-test-framework tests.
> 
> > diff --git a/doc/opal-api/opal-signal-system-reset-145.txt b/doc/opal-api/opal-signal-system-reset-145.txt
> > new file mode 100644
> > index 00000000..25b96dbb
> > --- /dev/null
> > +++ b/doc/opal-api/opal-signal-system-reset-145.txt  
> 
> FYI, we've moved to .rst and ReSTructured text, but it's not a big deal,
> I can convert when merging if needed.
> 
> > @@ -0,0 +1,31 @@
> > +OPAL_SIGNAL_SYSTEM_RESET
> > +-------------------
> > +
> > +#define OPAL_SIGNAL_SYSTEM_RESET			145
> > +
> > +int64_t signal_system_reset(int32_t cpu_nr)
> > +
> > +Arguments:
> > +
> > +  int32_t cpu_nr
> > +    Either the cpu server number of the target cpu to reset, or
> > +    SYS_RESET_ALL (-1) to indicate all cpus should be reset, or
> > +    SYS_RESET_ALL_OTHERS (-2) to indicate all but the current cpu
> > +    should be reset.
> > +
> > +This OPAL call causes the specified cpu(s) to be reset to the system
> > +reset exception handler (0x100).
> > +
> > +The exact contents of system registers (e.g., SRR1 wakeup causes) may
> > +vary depending on implementation and should not be relied upon.
> > +
> > +Resetting active threads on the same core as this call is run may
> > +not be supported by some platforms. In that case, OPAL_PARTIAL will be
> > +returned and NONE of the interrupts will be delivered.  
> 
> Cool - I gather the API definition has settled down a bit?

Yes, although we're still deciding whether to expose OPAL_PARTIAL and
have the OS fix it up, or to handle it all in OPAL by having a system
reset interrupt call in OPAL (see the other reply to the Linux
implementation patch).

> For OPAL_PARTIAL, does the "and none were delivered" also apply to
> SYS_RESET_ALL_OTHERS ? I'd guess not, but we may want to be explicit.

Well, if we do expose this to Linux, it would be easier to ensure none
are delivered (and that's what my Linux implementation assumed). We
want to avoid sending multiple NMIs to any CPU.
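
To illustrate the "none are delivered" behaviour, here is a minimal
sketch of an all-or-nothing SYS_RESET_ALL_OTHERS. It is not the skiboot
code: the CPU table, threads_per_core, can_reset() and
signal_all_others() are hypothetical stand-ins, and the platform limit
is modelled simply as "cannot reset a thread on the caller's own core".
Every target is validated before any reset is sent, so an OPAL_PARTIAL
return means no CPU has seen an NMI yet.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS       8
#define OPAL_SUCCESS  0
#define OPAL_PARTIAL  (-3)

/* Hypothetical model state, not skiboot structures. */
int threads_per_core = 4;
int delivered[NR_CPUS];     /* number of resets each CPU has received */

/* Assumed platform limit: threads on the caller's core cannot be reset. */
bool can_reset(int caller, int target)
{
	return caller / threads_per_core != target / threads_per_core;
}

/* All-or-nothing variant of SYS_RESET_ALL_OTHERS. */
int64_t signal_all_others(int caller)
{
	int cpu;

	/* Pass 1: check every target before touching any of them. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == caller)
			continue;
		if (!can_reset(caller, cpu))
			return OPAL_PARTIAL;	/* nothing delivered so far */
	}

	/* Pass 2: all targets are valid, so deliver exactly one reset each. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == caller)
			continue;
		delivered[cpu]++;
	}
	return OPAL_SUCCESS;
}
```

With this shape, a caller that sees OPAL_PARTIAL can retry with
individual cpu numbers without risking a second NMI on any CPU.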


> > +static int64_t mambo_signal_system_reset(int32_t cpu_nr)
> > +{
> > +	struct cpu_thread *cpu;
> > +
> > +	if (cpu_nr < 0) {
> > +		if (cpu_nr < SYS_RESET_ALL_OTHERS)
> > +			return OPAL_PARAMETER;
> > +
> > +		for_each_cpu(cpu) {
> > +			if (cpu_nr == SYS_RESET_ALL_OTHERS && cpu == this_cpu())
> > +				continue;
> > +			mambo_system_reset_cpu(cpu);  
> 
> For SYS_RESET_ALL, if we're not running on the last CPU, we would reset
> ourselves before we could reset the others, right?

Yeah that's a good catch and I was just thinking about it. We should
do ours last.
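
A rough sketch of "do ours last", using a small mock rather than the
real skiboot structures: the integer CPU ids, the reset_order log and
the mock_* names are assumptions made for illustration, and the caller
is passed in explicitly where the patch uses this_cpu(). The loop skips
the caller for both broadcast modes, and SYS_RESET_ALL resets the
caller only after every other CPU has been signalled.

```c
#include <stdint.h>

#define NR_CPUS               4
#define SYS_RESET_ALL         (-1)
#define SYS_RESET_ALL_OTHERS  (-2)
#define OPAL_SUCCESS          0
#define OPAL_PARAMETER        (-1)

/* Mock state: records the order in which CPUs were reset. */
int reset_order[NR_CPUS];
int reset_count;

/* Stand-in for mambo_system_reset_cpu(); it only logs the target. */
void mock_system_reset_cpu(int cpu)
{
	reset_order[reset_count++] = cpu;
}

/* 'self' stands in for this_cpu() in the quoted patch. */
int64_t mock_signal_system_reset(int self, int32_t cpu_nr)
{
	int cpu;

	if (cpu_nr >= 0) {
		if (cpu_nr >= NR_CPUS)
			return OPAL_PARAMETER;
		mock_system_reset_cpu(cpu_nr);
		return OPAL_SUCCESS;
	}

	if (cpu_nr < SYS_RESET_ALL_OTHERS)
		return OPAL_PARAMETER;

	/* Reset every other CPU first, in both broadcast modes. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == self)
			continue;
		mock_system_reset_cpu(cpu);
	}

	/* Only now reset ourselves, so the loop above always completes. */
	if (cpu_nr == SYS_RESET_ALL)
		mock_system_reset_cpu(self);

	return OPAL_SUCCESS;
}
```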

The Linux implementation doesn't actually use _ALL broadcasts, but the
hypercall has it, so I thought it could go in the API. A platform can
return OPAL_PARTIAL if it doesn't want to support it. There was some
idea it might be useful to pull *all* CPUs into known states on emergency
stacks in some cases, so we may yet consider using it.

Thanks,
Nick

