[Skiboot] [PATCH] System reset IPI facility and Mambo implementation

Stewart Smith stewart at linux.vnet.ibm.com
Mon Feb 6 19:18:07 AEDT 2017


Nicholas Piggin <npiggin at gmail.com> writes:

> On Fri, 03 Feb 2017 17:08:51 +1100
> Stewart Smith <stewart at linux.vnet.ibm.com> wrote:
>
>> Nicholas Piggin <npiggin at gmail.com> writes:
>> > Add an opal call OPAL_SIGNAL_SYSTEM_RESET which allows system reset
>> > exceptions to be raised on other CPUs and act as an NMI IPI. There
>> > is an initial simple Mambo implementation, but allowances are made
>> > for a more complex hardware implementation.
>> >
>> > This is based on the hardware implementation patch by Alistair Popple.
>> >
>> > Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
>> > ---
>> > Hi,
>> >
>> > I'm not sure what the state of Alistair's patch is, but it may
>> > make testing easier to initially add this mambo implementation.  
>> 
>> I'm hoping he spins a V2 on top of this patch :)
>> 
>> A thought... this is exactly the kind of functionality that should have a
>> test case, so that we don't accidentally break it. So I'm CCing
>> Pridhiviraj to get it on the todo list as part of the op-test-framework
>> tests.
>> 
>> > diff --git a/doc/opal-api/opal-signal-system-reset-145.txt b/doc/opal-api/opal-signal-system-reset-145.txt
>> > new file mode 100644
>> > index 00000000..25b96dbb
>> > --- /dev/null
>> > +++ b/doc/opal-api/opal-signal-system-reset-145.txt  
>> 
>> FYI, we've moved to .rst (reStructuredText), but it's not a big deal;
>> I can convert when merging if needed.
>> 
>> > @@ -0,0 +1,31 @@
>> > +OPAL_SIGNAL_SYSTEM_RESET
>> > +-------------------
>> > +
>> > +#define OPAL_SIGNAL_SYSTEM_RESET			145
>> > +
>> > +int64_t signal_system_reset(int32_t cpu_nr)
>> > +
>> > +Arguments:
>> > +
>> > +  int32_t cpu_nr
>> > +    Either the cpu server number of the target cpu to reset, or
>> > +    SYS_RESET_ALL (-1) to indicate all cpus should be reset, or
>> > +    SYS_RESET_ALL_OTHERS (-2) to indicate all but the current cpu
>> > +    should be reset.
>> > +
>> > +This OPAL call causes the specified cpu(s) to be reset to the system
>> > +reset exception handler (0x100).
>> > +
>> > +The exact contents of system registers (e.g., SRR1 wakeup causes) may
>> > +vary depending on implementation and should not be relied upon.
>> > +
>> > +Resetting active threads on the same core as this call is run may
>> > +not be supported by some platforms. In that case, OPAL_PARTIAL will be
>> > +returned and NONE of the interrupts will be delivered.  
>> 
>> Cool - I gather the API definition has settled down a bit?
>
> Yes, although we're still deciding whether to expose this OPAL_PARTIAL
> and have the OS fix it up, or handle it all in opal by having a system
> reset interrupt call in opal (see the other reply to Linux implementation
> patch).
>
>> For OPAL_PARTIAL, does the "and none were delivered" also apply to
>> SYS_RESET_ALL_OTHERS ? I'd guess not, but we may want to be explicit.
>
> Well if we do expose this to Linux, it would be easier to ensure none
> are delivered (and that's what my Linux implementation assumed). We
> want to avoid sending multiple NMIs to any CPU.
>
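
To make the "none are delivered" semantics concrete, here is a minimal
sketch, not the actual patch. It assumes a hypothetical platform that cannot
reset threads sharing the caller's core. The names sketch_can_reset(),
sketch_reset_targets(), nmi_count[], NR_CPUS and THREADS_PER_CORE are made
up for illustration, and the return codes are local stand-ins. Every target
is checked before any NMI is sent, so an OPAL_PARTIAL return leaves no CPU
interrupted and a retry cannot deliver a second NMI to anyone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS           8     /* hypothetical CPU count */
#define THREADS_PER_CORE  4     /* hypothetical topology */
#define OPAL_SUCCESS      0     /* local stand-ins for the OPAL return codes */
#define OPAL_PARAMETER   (-1)
#define OPAL_PARTIAL     (-3)

/* Number of NMIs each CPU has received, for inspection only. */
static int nmi_count[NR_CPUS];

/* Assumed limitation: threads on the caller's own core cannot be reset. */
static bool sketch_can_reset(int target, int self)
{
	return target / THREADS_PER_CORE != self / THREADS_PER_CORE;
}

/* All-or-nothing delivery to an explicit list of target CPUs. */
static int64_t sketch_reset_targets(const int *targets, int n, int self)
{
	int i;

	assert(self >= 0 && self < NR_CPUS);

	/* Pass 1: validate every target before touching any of them. */
	for (i = 0; i < n; i++) {
		if (targets[i] < 0 || targets[i] >= NR_CPUS)
			return OPAL_PARAMETER;
		if (!sketch_can_reset(targets[i], self))
			return OPAL_PARTIAL;
	}

	/* Pass 2: every target is known to be reachable, so deliver. */
	for (i = 0; i < n; i++)
		nmi_count[targets[i]]++;

	return OPAL_SUCCESS;
}
```

With the caller on CPU 0, targets {4, 5} succeed, while targets {4, 1}
return OPAL_PARTIAL without CPU 4 receiving anything.
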
>
>> > +static int64_t mambo_signal_system_reset(int32_t cpu_nr)
>> > +{
>> > +	struct cpu_thread *cpu;
>> > +
>> > +	if (cpu_nr < 0) {
>> > +		if (cpu_nr < SYS_RESET_ALL_OTHERS)
>> > +			return OPAL_PARAMETER;
>> > +
>> > +		for_each_cpu(cpu) {
>> > +			if (cpu_nr == SYS_RESET_ALL_OTHERS && cpu == this_cpu())
>> > +				continue;
>> > +			mambo_system_reset_cpu(cpu);  
>> 
>> For SYS_RESET_ALL, if we're not running on the last CPU, we would reset
>> ourselves before we could reset the others, right?
>
> Yeah that's a good catch and I was just thinking about it. We should
> do ours last.
>
> The Linux implementation doesn't actually use _ALL broadcasts, but the
> hypercall has it, so I thought it could go in the API. A platform
> can return PARTIAL if it doesn't want to support it. There was some
> idea it might be useful to pull *all* CPUs into known states on emergency
> stacks in some cases, so we may yet consider using it.

ack. I'll await a v2 with the SYS_RESET_ALL fix and I think it looks
good to go.
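
For reference, a rough sketch of the ordering fix discussed above, not the
actual v2. It models CPUs as plain indices instead of struct cpu_thread, and
sketch_reset_cpu(), reset_order[], nr_resets and NR_CPUS are hypothetical
stand-ins (the real code would call mambo_system_reset_cpu() and use
this_cpu()). The point is only that the broadcast loop always skips the
caller, and SYS_RESET_ALL resets the caller once everyone else is done.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS               4     /* hypothetical CPU count */
#define SYS_RESET_ALL        (-1)
#define SYS_RESET_ALL_OTHERS (-2)
#define OPAL_SUCCESS          0     /* local stand-ins for the OPAL return codes */
#define OPAL_PARAMETER       (-1)

/* Records the order in which CPUs were reset, for inspection only. */
static int reset_order[NR_CPUS];
static int nr_resets;

/* Stand-in for mambo_system_reset_cpu(). */
static void sketch_reset_cpu(int cpu)
{
	reset_order[nr_resets++] = cpu;
}

/* 'self' plays the role of this_cpu(). */
static int64_t sketch_signal_system_reset(int32_t cpu_nr, int self)
{
	int cpu;

	assert(self >= 0 && self < NR_CPUS);
	nr_resets = 0;

	if (cpu_nr < 0) {
		if (cpu_nr < SYS_RESET_ALL_OTHERS)
			return OPAL_PARAMETER;

		/* Reset every other CPU first, for both broadcast modes. */
		for (cpu = 0; cpu < NR_CPUS; cpu++) {
			if (cpu == self)
				continue;
			sketch_reset_cpu(cpu);
		}

		/* Only now is it safe to reset the calling CPU. */
		if (cpu_nr == SYS_RESET_ALL)
			sketch_reset_cpu(self);

		return OPAL_SUCCESS;
	}

	if (cpu_nr >= NR_CPUS)
		return OPAL_PARAMETER;

	sketch_reset_cpu(cpu_nr);
	return OPAL_SUCCESS;
}
```

Called with SYS_RESET_ALL from CPU 1, the last entry in reset_order[] is
CPU 1 itself, whichever position it has in the iteration order.
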

-- 
Stewart Smith
OPAL Architect, IBM.


