[Skiboot] [PATCH 58/61] P10 Cleanup special wakeup and xive stop api usage

Nicholas Piggin npiggin at gmail.com
Tue Jul 27 13:37:56 AEST 2021


Excerpts from Pratik Sampat's message of July 24, 2021 12:58 am:
> 
> 
> On 21/07/21 5:02 pm, Nicholas Piggin wrote:
>> Excerpts from Vasant Hegde's message of July 19, 2021 11:20 pm:
>>> From: Vaidyanathan Srinivasan <svaidy at linux.ibm.com>
>>>
>>> Cleanup P9 code pending implementation for P10.
>>>
>>> P10 stop-api integration will be needed for STOP11
>>> support only.  STOP11 will be a restricted usage
>>> for testing core re-init on P10.  Only stop0,2,3
>>> will be available for general usage.
>>>
>>> Also, do not treat gated core with SPW done as error.
>>> Core gating bit is a software state updated by microcode,
>>> while SPWU done bit comes from hardware logic to indicate
>>> successful operation.  Print a warning if the status bits
>>> are out of sync, but no need to fail the special wakeup
>>> operation.
>> Hmm, I wonder if similar should be done with P9 code first,
>> and then some of the P10 updates merged into the P10 enable
>> patches.
> 
> I believe P9 didn't fail the SPW command if the core is gated, that is just the
> behavior I observe on P10, is that what you mean?

I mean changing from treating core gated as a SPW failure as P9 does,
to allowing it.

And P9 may not require it, but if it's a valid sequence on P9 as well,
I would prefer to keep them in synch. If the P9 and P10 sequences must 
be different, a small comment might be in order to explain.

>>> Signed-off-by: Vaidyanathan Srinivasan <svaidy at linux.ibm.com>
>>> Signed-off-by: Pratik R. Sampat <psampat at linux.ibm.com>
>>> Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
>>> ---
>>>   core/direct-controls.c | 34 +++++++++++++++++++++++++++-------
>>>   hw/slw.c               | 30 +++++++++---------------------
>>>   hw/xive2.c             | 26 +-------------------------
>>>   3 files changed, 37 insertions(+), 53 deletions(-)
>>>
>>> diff --git a/core/direct-controls.c b/core/direct-controls.c
>>> index 879a537af..4795c19dc 100644
>>> --- a/core/direct-controls.c
>>> +++ b/core/direct-controls.c
>>> @@ -600,15 +600,35 @@ static int p10_core_set_special_wakeup(struct cpu_thread *cpu)
>>>   			 * CORE_GATED will be unset on a successful special
>>>   			 * wakeup of the core which indicates that the core is
>>>   			 * out of stop state. If CORE_GATED is still set then
>>> -			 * raise error.
>>> +			 * check SPWU register and raise error only if SPWU_DONE
>>> +			 * is not set, else print a warning and consider SPWU
>>> +			 * operation as successful.
>>>   			 */
>>>   			if (p10_core_is_gated(cpu)) {
>>> -				/* Deassert spwu for this strange error */
>>> -				xscom_write(chip_id, spwu_addr, 0);
>>> -				prlog(PR_ERR, "Failed special wakeup on %u:%u"
>>> -						" core remains gated.\n",
>>> -						chip_id, core_id);
>>> -				return OPAL_HARDWARE;
>>> +				if(xscom_read(chip_id, spwu_addr, &val)) {
>>> +					prlog(PR_ERR, "Core %u:%u:"
>>> +					      " unable to read QME_SPWU_HYP\n",
>>> +					      chip_id, core_id);
>>> +                                        return OPAL_HARDWARE;
>>> +                                }

Shoud clear the spwu request in case of errors to be consistent.

>>> +				if (val & P10_SPWU_DONE) {
>>> +					/*
>>> +					 * If SPWU DONE bit is set then
>>> +					 * SPWU operation is complete
>>> +					 */
>>> +					prlog(PR_WARNING, "Special wakeup on "
>>> +					      "%u:%u: core remains gated while"
>>> +					      " SPWU_HYP DONE set\n",
>>> +					      chip_id, core_id);
>>> +					return 0;
>>> +				}

So succeeded, but we have this error message. Does that mean microcode 
has a bug? Under what circumstances does this happen?

>>> +                                /* Deassert spwu for this strange error */
>>> +                                xscom_write(chip_id, spwu_addr, 0);
>>> +                                prlog(PR_ERR,
>>> +                                      "Failed special wakeup on %u:%u"
>>> +                                      " core remains gated.\n",
>>> +                                      chip_id, core_id);
>>> +                                return OPAL_HARDWARE;
>> Should the equivalent be done for P9 here?
> 
> I believe P9 does not fail on special wakeup when core is gated hence, it may
> not be necessary.

Do you mean that P9 clears the core gated bit, but P10 does not?

> 
>>>   			} else {
>>>   				return 0;
>>>   			}
>>> diff --git a/hw/slw.c b/hw/slw.c
>>> index e22d1bdde..52536db06 100644
>>> --- a/hw/slw.c
>>> +++ b/hw/slw.c
>>> @@ -228,32 +228,20 @@ static bool slw_set_overrides_p10(struct proc_chip *chip, struct cpu_thread *c)
>>>   	int rc;
>>>   	uint32_t core = pir_to_core_id(c->pir);
>>>   
>>> -	/* Clear special wakeup bits that could hold power mgt */
>>> -	rc = xscom_write(chip->id,
>>> -			 XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP),
>>> -			 0);
>>> -	if (rc) {
>>> -		log_simple_error(&e_info(OPAL_RC_SLW_SET),
>>> -			"SLW: Failed to write P10_QME_SPWU_HYP\n");
>>> -		return false;
>>> -	}
>>> -	/* Read back for debug */
>>> +	/* Special wakeup bits that could hold power mgt */
>>>   	rc = xscom_read(chip->id,
>>>   			XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP),
>>>   			&tmp);
>>> -	if (tmp)
>>> +        if (rc) {
>>> +          log_simple_error(&e_info(OPAL_RC_SLW_SET),
>>> +                           "SLW: Failed to read P10_QME_SPWU_HYP\n");
>>> +          return false;
>>> +        }
>> Ditto here -- should the p9 code be changed? I wonder if the special
>> wakeup check should be made in direct controls init code rather than
>> here.
> 
> P9 made those check here too and that's why I followed pattern, however if you
> believe this belongs in direct controls I could maybe clean it up for both P9
> and P10.

Well P9 just does a blind clear, whereas you are checking it here (and 
seems you don't try to clear at all).

I just wonder if that's the better way to go for P9 code as well, so 
they don't have to diverge?

And yes I think it might be nice to just move that (at least P9 and P10)
out of SLW and into direct-controls init entirely. The only reason not 
to AFAIKS would be if some bring-up firmware without power management
just keeps special wakeup asserted.

Thanks,
Nick


More information about the Skiboot mailing list