[Skiboot] [PATCH 58/61] P10 Cleanup special wakeup and xive stop api usage
Nicholas Piggin
npiggin at gmail.com
Tue Jul 27 13:37:56 AEST 2021
Excerpts from Pratik Sampat's message of July 24, 2021 12:58 am:
>
>
> On 21/07/21 5:02 pm, Nicholas Piggin wrote:
>> Excerpts from Vasant Hegde's message of July 19, 2021 11:20 pm:
>>> From: Vaidyanathan Srinivasan <svaidy at linux.ibm.com>
>>>
>>> Cleanup P9 code pending implementation for P10.
>>>
>>> P10 stop-api integration will be needed for STOP11
>>> support only. STOP11 will be a restricted usage
>>> for testing core re-init on P10. Only stop0,2,3
>>> will be available for general usage.
>>>
>>> Also, do not treat gated core with SPW done as error.
>>> Core gating bit is a software state updated by microcode,
>>> while SPWU done bit comes from hardware logic to indicate
>>> successful operation. Print a warning if the status bits
>>> are out of sync, but no need to fail the special wakeup
>>> operation.
>> Hmm, I wonder if similar should be done with P9 code first,
>> and then some of the P10 updates merged into the P10 enable
>> patches.
>
> I believe P9 didn't fail the SPW command if the core is gated, that is just the
> behavior I observe on P10, is that what you mean?
I mean changing from treating core gated as a SPW failure as P9 does,
to allowing it.
And P9 may not require it, but if it's a valid sequence on P9 as well,
I would prefer to keep them in synch. If the P9 and P10 sequences must
be different, a small comment might be in order to explain.
>>> Signed-off-by: Vaidyanathan Srinivasan <svaidy at linux.ibm.com>
>>> Signed-off-by: Pratik R. Sampat <psampat at linux.ibm.com>
>>> Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
>>> ---
>>> core/direct-controls.c | 34 +++++++++++++++++++++++++++-------
>>> hw/slw.c | 30 +++++++++---------------------
>>> hw/xive2.c | 26 +-------------------------
>>> 3 files changed, 37 insertions(+), 53 deletions(-)
>>>
>>> diff --git a/core/direct-controls.c b/core/direct-controls.c
>>> index 879a537af..4795c19dc 100644
>>> --- a/core/direct-controls.c
>>> +++ b/core/direct-controls.c
>>> @@ -600,15 +600,35 @@ static int p10_core_set_special_wakeup(struct cpu_thread *cpu)
>>> * CORE_GATED will be unset on a successful special
>>> * wakeup of the core which indicates that the core is
>>> * out of stop state. If CORE_GATED is still set then
>>> - * raise error.
>>> + * check SPWU register and raise error only if SPWU_DONE
>>> + * is not set, else print a warning and consider SPWU
>>> + * operation as successful.
>>> */
>>> if (p10_core_is_gated(cpu)) {
>>> - /* Deassert spwu for this strange error */
>>> - xscom_write(chip_id, spwu_addr, 0);
>>> - prlog(PR_ERR, "Failed special wakeup on %u:%u"
>>> - " core remains gated.\n",
>>> - chip_id, core_id);
>>> - return OPAL_HARDWARE;
>>> + if(xscom_read(chip_id, spwu_addr, &val)) {
>>> + prlog(PR_ERR, "Core %u:%u:"
>>> + " unable to read QME_SPWU_HYP\n",
>>> + chip_id, core_id);
>>> + return OPAL_HARDWARE;
>>> + }
Shoud clear the spwu request in case of errors to be consistent.
>>> + if (val & P10_SPWU_DONE) {
>>> + /*
>>> + * If SPWU DONE bit is set then
>>> + * SPWU operation is complete
>>> + */
>>> + prlog(PR_WARNING, "Special wakeup on "
>>> + "%u:%u: core remains gated while"
>>> + " SPWU_HYP DONE set\n",
>>> + chip_id, core_id);
>>> + return 0;
>>> + }
So succeeded, but we have this error message. Does that mean microcode
has a bug? Under what circumstances does this happen?
>>> + /* Deassert spwu for this strange error */
>>> + xscom_write(chip_id, spwu_addr, 0);
>>> + prlog(PR_ERR,
>>> + "Failed special wakeup on %u:%u"
>>> + " core remains gated.\n",
>>> + chip_id, core_id);
>>> + return OPAL_HARDWARE;
>> Should the equivalent be done for P9 here?
>
> I believe P9 does not fail on special wakeup when core is gated hence, it may
> not be necessary.
Do you mean that P9 clears the core gated bit, but P10 does not?
>
>>> } else {
>>> return 0;
>>> }
>>> diff --git a/hw/slw.c b/hw/slw.c
>>> index e22d1bdde..52536db06 100644
>>> --- a/hw/slw.c
>>> +++ b/hw/slw.c
>>> @@ -228,32 +228,20 @@ static bool slw_set_overrides_p10(struct proc_chip *chip, struct cpu_thread *c)
>>> int rc;
>>> uint32_t core = pir_to_core_id(c->pir);
>>>
>>> - /* Clear special wakeup bits that could hold power mgt */
>>> - rc = xscom_write(chip->id,
>>> - XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP),
>>> - 0);
>>> - if (rc) {
>>> - log_simple_error(&e_info(OPAL_RC_SLW_SET),
>>> - "SLW: Failed to write P10_QME_SPWU_HYP\n");
>>> - return false;
>>> - }
>>> - /* Read back for debug */
>>> + /* Special wakeup bits that could hold power mgt */
>>> rc = xscom_read(chip->id,
>>> XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP),
>>> &tmp);
>>> - if (tmp)
>>> + if (rc) {
>>> + log_simple_error(&e_info(OPAL_RC_SLW_SET),
>>> + "SLW: Failed to read P10_QME_SPWU_HYP\n");
>>> + return false;
>>> + }
>> Ditto here -- should the p9 code be changed? I wonder if the special
>> wakeup check should be made in direct controls init code rather than
>> here.
>
> P9 made those check here too and that's why I followed pattern, however if you
> believe this belongs in direct controls I could maybe clean it up for both P9
> and P10.
Well P9 just does a blind clear, whereas you are checking it here (and
seems you don't try to clear at all).
I just wonder if that's the better way to go for P9 code as well, so
they don't have to diverge?
And yes I think it might be nice to just move that (at least P9 and P10)
out of SLW and into direct-controls init entirely. The only reason not
to AFAIKS would be if some bring-up firmware without power management
just keeps special wakeup asserted.
Thanks,
Nick
More information about the Skiboot
mailing list