From anton at ozlabs.org  Mon Aug  2 11:02:10 2021
From: anton at ozlabs.org (Anton Blanchard)
Date: Mon, 2 Aug 2021 11:02:10 +1000
Subject: [Skiboot] [PATCH] Don't warn about stack size on host binaries
Message-ID: <20210802110210.13d1300c@kryten.localdomain>

I'm hitting a stack size warning when building pflash:

common/arch_flash_powerpc.c: In function 'get_dev_mtd.constprop':
common/arch_flash_powerpc.c:177:1: error: the frame size of 8240 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]

That function has 2 PATH_MAX strings, each of which will use up 4kB of
stack. We've tried to work around the issue of stack size warnings on
host binaries in a few places, with limited success. This patch removes
the check completely instead. We need to modify the HOSTCFLAGS variable
assignment to be immediate for this to work.

Signed-off-by: Anton Blanchard
---
diff --git a/Makefile.main b/Makefile.main
index d21f27be..189b4ae4 100644
--- a/Makefile.main
+++ b/Makefile.main
@@ -51,7 +51,7 @@ endif
 # Host tools and options
 HOSTCC=gcc
 HOSTEND=$(shell uname -m | sed -e 's/^i.*86$$/LITTLE/' -e 's/^x86.*/LITTLE/' -e 's/^ppc64le/LITTLE/' -e 's/^ppc.*/BIG/')
-HOSTCFLAGS=-O1 $(CWARNS) -DHAVE_$(HOSTEND)_ENDIAN -MMD
+HOSTCFLAGS:=-O1 $(CWARNS) -DHAVE_$(HOSTEND)_ENDIAN -MMD
 HOSTCFLAGS += $(call try-cflag,$(HOSTCC),-std=gnu11)
 HOSTCFLAGS += $(call try-cflag,$(HOSTCC),-m64)
 HOSTCFLAGS += $(call try-cflag,$(HOSTCC),-Wjump-misses-init) \
@@ -62,9 +62,6 @@ HOSTCFLAGS += $(call try-cflag,$(HOSTCC),-Wjump-misses-init) \
 HOSTCFLAGS += -DDEBUG -DCCAN_LIST_DEBUG
 
 # We want small stack usage for skiboot
-# but host compilation of unit tests tend to inline heavily,
-# which creates larger stack frames and triggering useless warnings
-HOSTCFLAGS += -Wframe-larger-than=4096
 CWARNS += -Wframe-larger-than=1024
 
 HOSTGCOVCFLAGS = -fprofile-arcs -ftest-coverage -lgcov -O0 -g -pg
diff --git a/external/pflash/rules.mk b/external/pflash/rules.mk
index 8d5a7bfd..1d1b6048 100644
--- a/external/pflash/rules.mk
+++ b/external/pflash/rules.mk
@@ -52,7 +52,6 @@ $(LIBFLASH_OBJS): libflash-%.o : libflash/%.c | links
 $(CCAN_OBJS): ccan-list-%.o: ccan/list/%.c | links
 	$(Q_CC)$(CC) $(CFLAGS) -c $< -o $@
 
-$(EXE): CFLAGS += -Wframe-larger-than=2048
 $(EXE): $(OBJS)
 	$(Q_CC)$(CC) $(LDFLAGS) $(CFLAGS) $^ -lrt -o $@

From npiggin at gmail.com  Mon Aug  2 12:13:12 2021
From: npiggin at gmail.com (Nicholas Piggin)
Date: Mon, 02 Aug 2021 12:13:12 +1000
Subject: [Skiboot] [PATCH 58/61] P10 Cleanup special wakeup and xive stop api usage
In-Reply-To: <074b624e-5fb6-74bd-cee9-61eb6d7b7725@linux.ibm.com>
References: <20210719132012.150948-1-hegdevasant@linux.vnet.ibm.com>
 <20210719132012.150948-59-hegdevasant@linux.vnet.ibm.com>
 <1626863977.xxqfyyc4p3.astroid@bobo.none>
 <58a9b94d-7ad1-ff37-c78f-b1be3268b5d5@linux.ibm.com>
 <1627355554.k4ar6kqv7j.astroid@bobo.none>
 <074b624e-5fb6-74bd-cee9-61eb6d7b7725@linux.ibm.com>
Message-ID: <1627869842.du83029fda.astroid@bobo.none>

Excerpts from Pratik Sampat's message of July 30, 2021 5:08 pm:
> Hello,
>
> Apologies for late response, I was collating answers for the questions from
> @vaidy and the firmware team.

No problem thanks for checking.

> On 27/07/21 9:07 am, Nicholas Piggin wrote:
>> Excerpts from Pratik Sampat's message of July 24, 2021 12:58 am:
>>>
>>> On 21/07/21 5:02 pm, Nicholas Piggin wrote:
>>>> Excerpts from Vasant Hegde's message of July 19, 2021 11:20 pm:
>>>>> From: Vaidyanathan Srinivasan
>>>>>
>>>>> Cleanup P9 code pending implementation for P10.
>>>>> >>>>> P10 stop-api integration will be needed for STOP11 >>>>> support only. STOP11 will be a restricted usage >>>>> for testing core re-init on P10. Only stop0,2,3 >>>>> will be available for general usage. >>>>> >>>>> Also, do not treat gated core with SPW done as error. >>>>> Core gating bit is a software state updated by microcode, >>>>> while SPWU done bit comes from hardware logic to indicate >>>>> successful operation. Print a warning if the status bits >>>>> are out of sync, but no need to fail the special wakeup >>>>> operation. >>>> Hmm, I wonder if similar should be done with P9 code first, >>>> and then some of the P10 updates merged into the P10 enable >>>> patches. >>> I believe P9 didn't fail the SPW command if the core is gated, that is just the >>> behavior I observe on P10, is that what you mean? >> I mean changing from treating core gated as a SPW failure as P9 does, >> to allowing it. >> >> And P9 may not require it, but if it's a valid sequence on P9 as well, >> I would prefer to keep them in synch. If the P9 and P10 sequences must >> be different, a small comment might be in order to explain. >> > On P9 the behavior on the check failing is properly defined to fail. However, on > P10 there does seem to be a bug which seems to say if the core gated but SPW is > done it is still a pass. > > Sure, I could add a comment on it explaining about this in the code. Yeah that would be good. > >>>>> Signed-off-by: Vaidyanathan Srinivasan >>>>> Signed-off-by: Pratik R. Sampat >>>>> Signed-off-by: Vasant Hegde >>>>> --- >>>>> core/direct-controls.c | 34 +++++++++++++++++++++++++++------- >>>>> hw/slw.c | 30 +++++++++--------------------- >>>>> hw/xive2.c | 26 +------------------------- >>>>> 3 files changed, 37 insertions(+), 53 deletions(-) >>>>> >>>>> diff --git a/core/direct-controls.c b/core/direct-controls.c >>>>> index 879a537af..4795c19dc 100644 >>>>> --- a/core/direct-controls.c >>>>> +++ b/core/direct-controls.c >>>>> @@ -600,15 +600,35 @@ static int p10_core_set_special_wakeup(struct cpu_thread *cpu) >>>>> * CORE_GATED will be unset on a successful special >>>>> * wakeup of the core which indicates that the core is >>>>> * out of stop state. If CORE_GATED is still set then >>>>> - * raise error. >>>>> + * check SPWU register and raise error only if SPWU_DONE >>>>> + * is not set, else print a warning and consider SPWU >>>>> + * operation as successful. >>>>> */ >>>>> if (p10_core_is_gated(cpu)) { >>>>> - /* Deassert spwu for this strange error */ >>>>> - xscom_write(chip_id, spwu_addr, 0); >>>>> - prlog(PR_ERR, "Failed special wakeup on %u:%u" >>>>> - " core remains gated.\n", >>>>> - chip_id, core_id); >>>>> - return OPAL_HARDWARE; >>>>> + if(xscom_read(chip_id, spwu_addr, &val)) { >>>>> + prlog(PR_ERR, "Core %u:%u:" >>>>> + " unable to read QME_SPWU_HYP\n", >>>>> + chip_id, core_id); >>>>> + return OPAL_HARDWARE; >>>>> + } >> Shoud clear the spwu request in case of errors to be consistent. >> > If we fail to read QME_SPWU_HYP then we have to just fail and return because > chances of we writing same QME_SPWU_HYP xscom to clear SPW is less. This can > happen if xscom engine is broken or a code bug or we are targeting some dead > core/invalid core etc. > > So I would be inclined to not try any other xscom, just return error as is. Okay you're right, other code does not do this, it's only the strange gated error that does. 
> >>>>> + if (val & P10_SPWU_DONE) { >>>>> + /* >>>>> + * If SPWU DONE bit is set then >>>>> + * SPWU operation is complete >>>>> + */ >>>>> + prlog(PR_WARNING, "Special wakeup on " >>>>> + "%u:%u: core remains gated while" >>>>> + " SPWU_HYP DONE set\n", >>>>> + chip_id, core_id); >>>>> + return 0; >>>>> + } >> So succeeded, but we have this error message. Does that mean microcode >> has a bug? Under what circumstances does this happen? > > I just confirmed that this is indeed a microcode bug seen only on P10, So > instead of failing SPW and hence the callers, we know that SPW has succeeded by > checking both the sources: SPWU_HYP and SSH_HYP. So we print a warning and > make the SPW call success. May not need to be a warning then unless it would be useful for debugging. In that case possibly use PR_DEBUG. Some skiboots are way too verbose and scary which we have to watch out for. If your firmware starts printing messages like this then you'd be inclined to think the machine is failing or unstable. > >>>>> + /* Deassert spwu for this strange error */ >>>>> + xscom_write(chip_id, spwu_addr, 0); >>>>> + prlog(PR_ERR, >>>>> + "Failed special wakeup on %u:%u" >>>>> + " core remains gated.\n", >>>>> + chip_id, core_id); >>>>> + return OPAL_HARDWARE; >>>> Should the equivalent be done for P9 here? >>> I believe P9 does not fail on special wakeup when core is gated hence, it may >>> not be necessary. >> Do you mean that P9 clears the core gated bit, but P10 does not? > > No, I think you're right. Even though we haven't hit this issue in P9. I think > we should still keep the de-assert logic mimicked for P9 as well. > >>>>> } else { >>>>> return 0; >>>>> } >>>>> diff --git a/hw/slw.c b/hw/slw.c >>>>> index e22d1bdde..52536db06 100644 >>>>> --- a/hw/slw.c >>>>> +++ b/hw/slw.c >>>>> @@ -228,32 +228,20 @@ static bool slw_set_overrides_p10(struct proc_chip *chip, struct cpu_thread *c) >>>>> int rc; >>>>> uint32_t core = pir_to_core_id(c->pir); >>>>> >>>>> - /* Clear special wakeup bits that could hold power mgt */ >>>>> - rc = xscom_write(chip->id, >>>>> - XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP), >>>>> - 0); >>>>> - if (rc) { >>>>> - log_simple_error(&e_info(OPAL_RC_SLW_SET), >>>>> - "SLW: Failed to write P10_QME_SPWU_HYP\n"); >>>>> - return false; >>>>> - } >>>>> - /* Read back for debug */ >>>>> + /* Special wakeup bits that could hold power mgt */ >>>>> rc = xscom_read(chip->id, >>>>> XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP), >>>>> &tmp); >>>>> - if (tmp) >>>>> + if (rc) { >>>>> + log_simple_error(&e_info(OPAL_RC_SLW_SET), >>>>> + "SLW: Failed to read P10_QME_SPWU_HYP\n"); >>>>> + return false; >>>>> + } >>>> Ditto here -- should the p9 code be changed? I wonder if the special >>>> wakeup check should be made in direct controls init code rather than >>>> here. >>> P9 made those check here too and that's why I followed pattern, however if you >>> believe this belongs in direct controls I could maybe clean it up for both P9 >>> and P10. >> Well P9 just does a blind clear, whereas you are checking it here (and >> seems you don't try to clear at all). >> >> I just wonder if that's the better way to go for P9 code as well, so >> they don't have to diverge? > > I agree. For P9, we probably should not do a blind clear and just check for the > bit and report. I'll clean that up for P9. Thanks, I agree. > >> >> And yes I think it might be nice to just move that (at least P9 and P10) >> out of SLW and into direct-controls init entirely. 
The only reason not >> to AFAIKS would be if some bring-up firmware without power management >> just keeps special wakeup asserted. > > I would still be inclined to keep this in SLW init as now we know that it's job > is to just check and report and moving this check to direct controls may cause > more confusion. > However, if you still believe that it would be better suited in direct-controls > I would be happy to move it there after testing once if there isn't any > microcode interaction that keeps SPW asserted at that point. Sure leave it there for now, it's not a big deal. Thanks, Nick From psampat at linux.ibm.com Mon Aug 2 17:14:29 2021 From: psampat at linux.ibm.com (Pratik Sampat) Date: Mon, 2 Aug 2021 12:44:29 +0530 Subject: [Skiboot] [PATCH 58/61] P10 Cleanup special wakeup and xive stop api usage In-Reply-To: <1627869842.du83029fda.astroid@bobo.none> References: <20210719132012.150948-1-hegdevasant@linux.vnet.ibm.com> <20210719132012.150948-59-hegdevasant@linux.vnet.ibm.com> <1626863977.xxqfyyc4p3.astroid@bobo.none> <58a9b94d-7ad1-ff37-c78f-b1be3268b5d5@linux.ibm.com> <1627355554.k4ar6kqv7j.astroid@bobo.none> <074b624e-5fb6-74bd-cee9-61eb6d7b7725@linux.ibm.com> <1627869842.du83029fda.astroid@bobo.none> Message-ID: On 02/08/21 7:43 am, Nicholas Piggin wrote: > Excerpts from Pratik Sampat's message of July 30, 2021 5:08 pm: >> Hello, >> >> Apologies for late response, I was collating answers for the questions from >> @vaidy and the firmware team. > No problem thanks for checking. > >> On 27/07/21 9:07 am, Nicholas Piggin wrote: >>> Excerpts from Pratik Sampat's message of July 24, 2021 12:58 am: >>>> On 21/07/21 5:02 pm, Nicholas Piggin wrote: >>>>> Excerpts from Vasant Hegde's message of July 19, 2021 11:20 pm: >>>>>> From: Vaidyanathan Srinivasan >>>>>> >>>>>> Cleanup P9 code pending implementation for P10. >>>>>> >>>>>> P10 stop-api integration will be needed for STOP11 >>>>>> support only. STOP11 will be a restricted usage >>>>>> for testing core re-init on P10. Only stop0,2,3 >>>>>> will be available for general usage. >>>>>> >>>>>> Also, do not treat gated core with SPW done as error. >>>>>> Core gating bit is a software state updated by microcode, >>>>>> while SPWU done bit comes from hardware logic to indicate >>>>>> successful operation. Print a warning if the status bits >>>>>> are out of sync, but no need to fail the special wakeup >>>>>> operation. >>>>> Hmm, I wonder if similar should be done with P9 code first, >>>>> and then some of the P10 updates merged into the P10 enable >>>>> patches. >>>> I believe P9 didn't fail the SPW command if the core is gated, that is just the >>>> behavior I observe on P10, is that what you mean? >>> I mean changing from treating core gated as a SPW failure as P9 does, >>> to allowing it. >>> >>> And P9 may not require it, but if it's a valid sequence on P9 as well, >>> I would prefer to keep them in synch. If the P9 and P10 sequences must >>> be different, a small comment might be in order to explain. >>> >> On P9 the behavior on the check failing is properly defined to fail. However, on >> P10 there does seem to be a bug which seems to say if the core gated but SPW is >> done it is still a pass. >> >> Sure, I could add a comment on it explaining about this in the code. > Yeah that would be good. > Sure, added. >>>>>> Signed-off-by: Vaidyanathan Srinivasan >>>>>> Signed-off-by: Pratik R. 
Sampat >>>>>> Signed-off-by: Vasant Hegde >>>>>> --- >>>>>> core/direct-controls.c | 34 +++++++++++++++++++++++++++------- >>>>>> hw/slw.c | 30 +++++++++--------------------- >>>>>> hw/xive2.c | 26 +------------------------- >>>>>> 3 files changed, 37 insertions(+), 53 deletions(-) >>>>>> >>>>>> diff --git a/core/direct-controls.c b/core/direct-controls.c >>>>>> index 879a537af..4795c19dc 100644 >>>>>> --- a/core/direct-controls.c >>>>>> +++ b/core/direct-controls.c >>>>>> @@ -600,15 +600,35 @@ static int p10_core_set_special_wakeup(struct cpu_thread *cpu) >>>>>> * CORE_GATED will be unset on a successful special >>>>>> * wakeup of the core which indicates that the core is >>>>>> * out of stop state. If CORE_GATED is still set then >>>>>> - * raise error. >>>>>> + * check SPWU register and raise error only if SPWU_DONE >>>>>> + * is not set, else print a warning and consider SPWU >>>>>> + * operation as successful. >>>>>> */ >>>>>> if (p10_core_is_gated(cpu)) { >>>>>> - /* Deassert spwu for this strange error */ >>>>>> - xscom_write(chip_id, spwu_addr, 0); >>>>>> - prlog(PR_ERR, "Failed special wakeup on %u:%u" >>>>>> - " core remains gated.\n", >>>>>> - chip_id, core_id); >>>>>> - return OPAL_HARDWARE; >>>>>> + if(xscom_read(chip_id, spwu_addr, &val)) { >>>>>> + prlog(PR_ERR, "Core %u:%u:" >>>>>> + " unable to read QME_SPWU_HYP\n", >>>>>> + chip_id, core_id); >>>>>> + return OPAL_HARDWARE; >>>>>> + } >>> Shoud clear the spwu request in case of errors to be consistent. >>> >> If we fail to read QME_SPWU_HYP then we have to just fail and return because >> chances of we writing same QME_SPWU_HYP xscom to clear SPW is less. This can >> happen if xscom engine is broken or a code bug or we are targeting some dead >> core/invalid core etc. >> >> So I would be inclined to not try any other xscom, just return error as is. > Okay you're right, other code does not do this, it's only the strange > gated error that does. > >>>>>> + if (val & P10_SPWU_DONE) { >>>>>> + /* >>>>>> + * If SPWU DONE bit is set then >>>>>> + * SPWU operation is complete >>>>>> + */ >>>>>> + prlog(PR_WARNING, "Special wakeup on " >>>>>> + "%u:%u: core remains gated while" >>>>>> + " SPWU_HYP DONE set\n", >>>>>> + chip_id, core_id); >>>>>> + return 0; >>>>>> + } >>> So succeeded, but we have this error message. Does that mean microcode >>> has a bug? Under what circumstances does this happen? >> I just confirmed that this is indeed a microcode bug seen only on P10, So >> instead of failing SPW and hence the callers, we know that SPW has succeeded by >> checking both the sources: SPWU_HYP and SSH_HYP. So we print a warning and >> make the SPW call success. > May not need to be a warning then unless it would be useful for > debugging. In that case possibly use PR_DEBUG. Some skiboots are > way too verbose and scary which we have to watch out for. If your > firmware starts printing messages like this then you'd be inclined to > think the machine is failing or unstable. > Sure I understand, this warning can cause unnecessary worry. I can convert this to a PR_DEBUG then. >>>>>> + /* Deassert spwu for this strange error */ >>>>>> + xscom_write(chip_id, spwu_addr, 0); >>>>>> + prlog(PR_ERR, >>>>>> + "Failed special wakeup on %u:%u" >>>>>> + " core remains gated.\n", >>>>>> + chip_id, core_id); >>>>>> + return OPAL_HARDWARE; >>>>> Should the equivalent be done for P9 here? >>>> I believe P9 does not fail on special wakeup when core is gated hence, it may >>>> not be necessary. 
>>> Do you mean that P9 clears the core gated bit, but P10 does not?
>> No, I think you're right. Even though we haven't hit this issue in P9. I think
>> we should still keep the de-assert logic mimicked for P9 as well.
>>
>>>>>> 	} else {
>>>>>> 		return 0;
>>>>>> 	}
>>>>>> diff --git a/hw/slw.c b/hw/slw.c
>>>>>> index e22d1bdde..52536db06 100644
>>>>>> --- a/hw/slw.c
>>>>>> +++ b/hw/slw.c
>>>>>> @@ -228,32 +228,20 @@ static bool slw_set_overrides_p10(struct proc_chip *chip, struct cpu_thread *c)
>>>>>> 	int rc;
>>>>>> 	uint32_t core = pir_to_core_id(c->pir);
>>>>>>
>>>>>> -	/* Clear special wakeup bits that could hold power mgt */
>>>>>> -	rc = xscom_write(chip->id,
>>>>>> -			 XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP),
>>>>>> -			 0);
>>>>>> -	if (rc) {
>>>>>> -		log_simple_error(&e_info(OPAL_RC_SLW_SET),
>>>>>> -			"SLW: Failed to write P10_QME_SPWU_HYP\n");
>>>>>> -		return false;
>>>>>> -	}
>>>>>> -	/* Read back for debug */
>>>>>> +	/* Special wakeup bits that could hold power mgt */
>>>>>> 	rc = xscom_read(chip->id,
>>>>>> 			XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP),
>>>>>> 			&tmp);
>>>>>> -	if (tmp)
>>>>>> +	if (rc) {
>>>>>> +		log_simple_error(&e_info(OPAL_RC_SLW_SET),
>>>>>> +			"SLW: Failed to read P10_QME_SPWU_HYP\n");
>>>>>> +		return false;
>>>>>> +	}
>>>>> Ditto here -- should the p9 code be changed? I wonder if the special
>>>>> wakeup check should be made in direct controls init code rather than
>>>>> here.
>>>> P9 made those check here too and that's why I followed pattern, however if you
>>>> believe this belongs in direct controls I could maybe clean it up for both P9
>>>> and P10.
>>> Well P9 just does a blind clear, whereas you are checking it here (and
>>> seems you don't try to clear at all).
>>>
>>> I just wonder if that's the better way to go for P9 code as well, so
>>> they don't have to diverge?
>> I agree. For P9, we probably should not do a blind clear and just check for the
>> bit and report. I'll clean that up for P9.
> Thanks, I agree.
>
>>> And yes I think it might be nice to just move that (at least P9 and P10)
>>> out of SLW and into direct-controls init entirely. The only reason not
>>> to AFAIKS would be if some bring-up firmware without power management
>>> just keeps special wakeup asserted.
>> I would still be inclined to keep this in SLW init as now we know that it's job
>> is to just check and report and moving this check to direct controls may cause
>> more confusion.
>> However, if you still believe that it would be better suited in direct-controls
>> I would be happy to move it there after testing once if there isn't any
>> microcode interaction that keeps SPW asserted at that point.
> Sure leave it there for now, it's not a big deal.

Sure thing, I'll leave it as-is for now.

Thanks,
Pratik

> Thanks,
> Nick

From hegdevasant at linux.vnet.ibm.com  Tue Aug  3 00:33:53 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Mon, 2 Aug 2021 20:03:53 +0530
Subject: [Skiboot] [PATCH 1/2] POWER9 Cleanups: de-assert SPW
Message-ID: <20210802143354.971727-1-hegdevasant@linux.vnet.ibm.com>

From: "Pratik R. Sampat"

De-assert the special wakeup bit for the case when the SPWU bit is set
but the core remains gated, so that the special wakeup state stays
coherent.

Signed-off-by: Pratik R.
 Sampat
Signed-off-by: Vasant Hegde
---
 core/direct-controls.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/core/direct-controls.c b/core/direct-controls.c
index 65cf122c1..0274367da 100644
--- a/core/direct-controls.c
+++ b/core/direct-controls.c
@@ -302,6 +302,8 @@ static int p9_core_set_special_wakeup(struct cpu_thread *cpu)
 	 * CORE_GATED will be unset on a successful special
 	 * wakeup of the core which indicates that the core is
 	 * out of stop state. If CORE_GATED is still set then
 	 * raise error.
 	 */
 	if (dctl_core_is_gated(cpu)) {
+		/* Deassert spwu for this strange error */
+		xscom_write(chip_id, swake_addr, 0);
 		prlog(PR_ERR, "Failed special wakeup on %u:%u"
 				" as CORE_GATED is set\n",
 				chip_id, core_id);
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com  Tue Aug  3 00:33:54 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Mon, 2 Aug 2021 20:03:54 +0530
Subject: [Skiboot] [PATCH 2/2] POWER9 Cleanups: Don't force clear SPW bits
In-Reply-To: <20210802143354.971727-1-hegdevasant@linux.vnet.ibm.com>
References: <20210802143354.971727-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210802143354.971727-2-hegdevasant@linux.vnet.ibm.com>

From: "Pratik R. Sampat"

SLW force-cleared special wakeup bits that could hold power management.
However, SLW should expect these bits to be cleared already at this
point, hence only read and report on the SPW bits to find anomalies
instead.

Signed-off-by: Pratik R. Sampat
Signed-off-by: Vasant Hegde
---
 hw/slw.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/hw/slw.c b/hw/slw.c
index 625ee886e..a0145deb6 100644
--- a/hw/slw.c
+++ b/hw/slw.c
@@ -226,19 +226,15 @@ static bool slw_set_overrides_p9(struct proc_chip *chip, struct cpu_thread *c)
 	int rc;
 	uint32_t core = pir_to_core_id(c->pir);
 
-	/* Clear special wakeup bits that could hold power mgt */
-	rc = xscom_write(chip->id,
-			 XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_HYP),
-			 0);
+	/* Special wakeup bits that could hold power mgt */
+	rc = xscom_read(chip->id,
+			XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_HYP),
+			&tmp);
 	if (rc) {
 		log_simple_error(&e_info(OPAL_RC_SLW_SET),
-			"SLW: Failed to write EC_PPM_SPECIAL_WKUP_HYP\n");
+			"SLW: Failed to read EC_PPM_SPECIAL_WKUP_HYP\n");
 		return false;
 	}
-	/* Read back for debug */
-	rc = xscom_read(chip->id,
-			XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_HYP),
-			&tmp);
 	if (tmp)
 		prlog(PR_WARNING,
 		      "SLW: core %d EC_PPM_SPECIAL_WKUP_HYP read 0x%016llx\n",
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com  Wed Aug  4 16:48:56 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:18:56 +0530
Subject: [Skiboot] [PATCH 27/61] platforms/astbmc: Add ast2600
In-Reply-To:
References: <20210719132012.150948-1-hegdevasant@linux.vnet.ibm.com>
 <20210719132012.150948-28-hegdevasant@linux.vnet.ibm.com>
Message-ID:

On 7/29/21 1:54 PM, Joel Stanley wrote:
> On Mon, 19 Jul 2021 at 13:23, Vasant Hegde
> wrote:
>>
>> From: Reza Arbab
>>
>> Signed-off-by: Reza Arbab
>> Signed-off-by: Vasant Hegde
>> ---
>>  platforms/astbmc/astbmc.h |  2 ++
>>  platforms/astbmc/common.c | 19 +++++++++++++++++--
>>  2 files changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/platforms/astbmc/astbmc.h b/platforms/astbmc/astbmc.h
>> index 86631bc4e..00f221230 100644
>> --- a/platforms/astbmc/astbmc.h
>> +++ b/platforms/astbmc/astbmc.h
>> @@ -87,9 +87,11 @@ static struct slot_table_entry st_name[] = \
>>
>>  extern const struct bmc_hw_config bmc_hw_ast2400;
>>  extern const struct bmc_hw_config bmc_hw_ast2500;
>> +extern const struct bmc_hw_config bmc_hw_ast2600;
>>  extern const struct bmc_platform bmc_plat_ast2400_ami;
>>  extern const struct bmc_platform bmc_plat_ast2500_ami;
>>
 extern const struct bmc_platform bmc_plat_ast2500_openbmc;
>> +extern const struct bmc_platform bmc_plat_ast2600_openbmc;
>>
>>  extern void astbmc_early_init(void);
>>  extern int64_t astbmc_ipmi_reboot(void);
>> diff --git a/platforms/astbmc/common.c b/platforms/astbmc/common.c
>> index d96e070e5..83ef70ad3 100644
>> --- a/platforms/astbmc/common.c
>> +++ b/platforms/astbmc/common.c
>> @@ -266,8 +266,9 @@ static void astbmc_fixup_dt_mbox(struct dt_node *lpc)
>>          * can indicate they support mbox using the scratch register, or ipmi
>>          * by configuring the hiomap ipmi command. If neither are configured
>>          * for P8 then skiboot will drive the flash controller directly.
>> +        * XXX P10
>>          */
>> -       if (proc_gen != proc_gen_p9 && !ast_scratch_reg_is_mbox())
>> +       if (proc_gen == proc_gen_p8 && !ast_scratch_reg_is_mbox())
>>                 return;
>>
>>         /* First check if the mbox interface is already there */
>> @@ -478,7 +479,7 @@ void astbmc_early_init(void)
>>          * never MBOX. Thus only populate the MBOX node on P9 to allow
>>          * fallback.
>>          */
>> -       if (proc_gen == proc_gen_p9) {
>> +       if (proc_gen >= proc_gen_p9) {
>>                 astbmc_fixup_dt_mbox(dt_find_primary_lpc());
>>                 ast_setup_sio_mbox(MBOX_IO_BASE, MBOX_LPC_IRQ);
>>         }
>
> This part looks okay, the rainier BMC will talk without issue.
>
>> @@ -530,6 +531,14 @@ const struct bmc_hw_config bmc_hw_ast2500 = {
>>         .mcr_scu_strap = 0x00000000,
>> };
>>
>> +/* XXX P10: Update with Rainier values */
>> +const struct bmc_hw_config bmc_hw_ast2600 = {
>> +       .scu_revision_id = 0x05000303,
>> +       .mcr_configuration = 0x11200756,
>> +       .mcr_scu_mpll = 0x1008405F,
>> +       .mcr_scu_strap = 0x000030E0,
>> +};
>
> This one is a bit suspect. It won't cause any issue, only because the
> PCI id has changed, so the quirk code won't call quirk_astbmc_vga to
> populate the device tree.
>
> Probably best to omit it for now.

Sure. For now I will set this to ast2500?

-Vasant

From hegdevasant at linux.vnet.ibm.com  Wed Aug  4 17:20:38 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:38 +0530
Subject: [Skiboot] [PATCH v2 00/59] P10 Enablement
Message-ID: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>

This series adds P10 support. This includes various features along with
base P10 support like xive2, phb5, P10 stop states, 2nd DAWR support,
etc. It also adds support for the Rainier and Denali platforms.

WARNING: skiboot LID size is crossing the 512K limit

With this patchset the signed skiboot LID (skiboot.lid.xz.stb) crosses
512K (it's around 514K on my FC33 system with gcc v10.3.1). The current
Rainier (P10 BMC system) layout limits the skiboot LID (PAYLOAD
partition) size to 512K (see [1]). We are working on increasing the
PAYLOAD partition size to 1MB. Until that is fixed, build the skiboot
LID with the CONFIG_FSP=0 option. This builds the LID without the FSP
component, and the LID size will be ~470K.
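For reference, a minimal build invocation for this workaround might look
like the following sketch (CONFIG_FSP=0 is the option described above; the
job count and cross-compiler prefix are illustrative assumptions that
depend on the local toolchain):

	# Build the skiboot LID without the FSP platform code, keeping
	# the signed image under the current 512K PAYLOAD partition limit
	make -j8 CONFIG_FSP=0 CROSS=powerpc64-linux-gnu-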
[1] https://github.ibm.com/open-power/pnor/blob/master-p10/p10Layouts/defaultPnorLayout_64.xml

Alistair Popple (2):
  hw/phys-map/p10: Add P10 MMIO map
  platforms: Add Rainier

Anju T Sudhakar (1):
  hw/imc: Power10 support

Cédric Le Goater (19):
  plat/qemu/p10: add a POWER10 platform
  psi/p10: Activate P10 interrupts
  xive/p10: Add a XIVE2 driver
  psi/p10: Activate 64K ESB pages
  psi/p10: Activate StoreEOI
  xive/p10: Add option flags to the XIVE exploitation mode
  hw/phb5: Add support for PQ offloading
  hw/phb5: Add support for 'Address-Based Interrupt Trigger' mode
  psi/p10: Introduce xive2_source_mask()
  psi/p10: Mask all sources at init
  xive/p10: Introduce new capability bits
  xive/p10: Configure XIVE for fused cores
  xive/p10: Add automatic Context Save and Restore support
  xive/p10: Introduce a new OPAL_XIVE_IRQ_STORE_EOI2 flag
  xive/p10: Activate split mode for PHB ESBs when PQ_disable is available
  xive/p10: Activate has_array when PQ_disable is available
  xive/p10: Tune max_entries_in_modified when split_mode is on
  xive/p10: Change alignment of the queue overflow pages
  phb5: Activate StoreEOI for LSIs

Frederic Barrat (7):
  hdata/iohub: Read PCI Gen5 equalization settings for P10
  hw/phb5: Update PHB numbering to allow for virtual PHBs
  phb5: Add register inits specific to Gen5
  phb5: Workaround for PCI bug HW551382
  phb4: Cleanup PEC config discovery in CAPI mode
  phb4/5: Fix PHB link width detection to avoid useless retrainings
  phb5: Fix PHB max link speed definition on P10

Haren Myneni (5):
  hdat/spira: Define ibm,primary-topology-index property per chip
  hdat/spira: Add ibm,power10-vas-x string to VAS compatible property
  VAS: Define Remote Memory Access paste address on P10
  VAS: Enable VAS on P10
  NX: Set VAS RMA write BAR register on P10

Jordan Niethe (1):
  hw/phb5: Add initial support

Klaus Heinrich Kiwi (1):
  external/gard: Enable Power10

Michael Neuling (2):
  p10: Workaround core recovery issue
  phb5: Enable Gen5

Nicholas Piggin (4):
  external/mambo: skiboot.tcl add POWER10 config
  Initial POWER10 enablement
  cpufeatures: Add POWER10 support
  hw/chiptod: Add POWER10 support

Oliver O'Halloran (3):
  hw/p8-i2c: Add POWER10 support
  prd: Add base P10 support
  hw/psi-p10: Configure interrupt offset before notify addr

Pratik Rajesh Sampat (1):
  libpore: P10 stop-api support

Ravi Bangoria (1):
  hdata: Add POWER10 support

Reza Arbab (1):
  platforms/astbmc: Add ast2600

Ryan Grimm (2):
  hw/nx: Enable p10 DARN
  hw/chiptod: Retry the sync procedure on failure

Vaidyanathan Srinivasan (3):
  Basic P10 stop state support
  occ: Add POWER10 support
  xive2: Add NCU_SPEC_BAR to stop engine for restore

Vasant Hegde (6):
  external/xscom-utils: Add P10 chip info
  external/opal-prd: Fix occ, homer node label search
  hdata/P10: Fix xscom address and ibm,chip-id property
  phys/P10: Use topology index to get phys mapping
  platform: Add Denali platform support
  hw/chiptod: Abort if core frequency is not set

 asm/head.S                              |  55 +-
 asm/misc.S                              |   4 +-
 core/affinity.c                         |   2 +
 core/chip.c                             |  44 +-
 core/cpu.c                              |  32 +-
 core/cpufeatures.c                      | 104 +-
 core/direct-controls.c                  | 386 +-
 core/fast-reboot.c                      |   4 +
 core/hmi.c                              | 225 +-
 core/init.c                             |  52 +-
 core/mce.c                              | 129 +-
 core/test/run-timer.c                   |   2 +-
 .../opal-pci-set-phb-capi-mode-93.rst   |   5 +-
 doc/platforms-and-cpus.rst              |   1 +
 external/gard/gard.c                    |  15 +-
 external/gard/gard.h                    |   1 +
 external/gard/test/results/02-usage.err |   1 +
 external/gard/units.c                   |  89 +
 external/mambo/skiboot.tcl              |  34 +-
 external/opal-prd/opal-prd.c            |  16 +-
 external/xscom-utils/adu_scoms.py       |   2 +
 external/xscom-utils/getscom.c          |   3 +
 external/xscom-utils/sram.c             |   2 
+ hdata/cpu-common.c | 19 +- hdata/fsp.c | 16 +- hdata/hdata.h | 2 + hdata/i2c.c | 5 +- hdata/iohub.c | 81 +- hdata/memory.c | 8 +- hdata/spira.c | 147 +- hdata/spira.h | 37 +- hdata/test/hdata_to_dt.c | 14 +- hw/Makefile.inc | 2 +- hw/capp.c | 11 +- hw/chiptod.c | 143 +- hw/dts.c | 7 +- hw/fsp/fsp-occ.c | 3 +- hw/fsp/fsp-psi.c | 1 + hw/fsp/fsp.c | 5 + hw/homer.c | 16 + hw/imc.c | 61 +- hw/lpc.c | 7 +- hw/nx-compress.c | 36 + hw/nx.c | 29 +- hw/occ-sensor.c | 4 +- hw/occ.c | 172 +- hw/p8-i2c.c | 29 +- hw/phb4.c | 473 +- hw/phys-map.c | 105 +- hw/prd.c | 5 + hw/psi.c | 115 +- hw/slw.c | 218 +- hw/test/phys-map-test.c | 25 +- hw/vas.c | 121 +- hw/xive.c | 6 +- hw/xive2.c | 4665 +++++++++++++++++ hw/xscom.c | 25 +- include/chip.h | 52 + include/imc.h | 2 + include/nx.h | 3 + include/opal-api.h | 6 +- include/p10_stop_api.H | 239 + include/phb4-regs.h | 31 +- include/phb4.h | 23 +- include/phys-map.h | 13 +- include/processor.h | 52 +- include/psi.h | 13 +- include/skiboot.h | 1 + include/vas.h | 6 +- include/xive.h | 36 + include/xive2-regs.h | 581 ++ include/xscom-p10-regs.h | 56 + include/xscom.h | 85 + libpore/Makefile.inc | 2 +- libpore/p10_cpu_reg_restore_instruction.H | 88 + libpore/p10_hcd_header_defs.H | 152 + libpore/p10_hcd_memmap_base.H | 463 ++ libpore/p10_hcd_memmap_homer.H | 94 + libpore/p10_hcd_memmap_occ_sram.H | 174 + libpore/p10_hcode_image_defines.H | 462 ++ libpore/p10_stop_api.C | 1816 +++++++ libpore/p10_stop_api.H | 238 + libpore/p10_stop_data_struct.H | 162 + libpore/p10_stop_util.C | 190 + libpore/p10_stop_util.H | 123 + platforms/astbmc/Makefile.inc | 3 +- platforms/astbmc/astbmc.h | 2 + platforms/astbmc/common.c | 19 +- platforms/astbmc/rainier.c | 136 + platforms/ibm-fsp/hostservices.c | 4 + platforms/ibm-fsp/zz.c | 6 + 91 files changed, 12728 insertions(+), 426 deletions(-) create mode 100644 hw/xive2.c create mode 100644 include/p10_stop_api.H create mode 100644 include/xive2-regs.h create mode 100644 include/xscom-p10-regs.h create mode 100644 libpore/p10_cpu_reg_restore_instruction.H create mode 100644 libpore/p10_hcd_header_defs.H create mode 100644 libpore/p10_hcd_memmap_base.H create mode 100644 libpore/p10_hcd_memmap_homer.H create mode 100644 libpore/p10_hcd_memmap_occ_sram.H create mode 100644 libpore/p10_hcode_image_defines.H create mode 100644 libpore/p10_stop_api.C create mode 100644 libpore/p10_stop_api.H create mode 100644 libpore/p10_stop_data_struct.H create mode 100644 libpore/p10_stop_util.C create mode 100644 libpore/p10_stop_util.H create mode 100644 platforms/astbmc/rainier.c -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:39 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:39 +0530 Subject: [Skiboot] [PATCH v2 01/59] external/mambo: skiboot.tcl add POWER10 config In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-2-hegdevasant@linux.vnet.ibm.com> From: Nicholas Piggin Co-authored-by: Nicholas Piggin Signed-off-by: Nicholas Piggin Co-authored-by: Madhavan Srinivasan Signed-off-by: Madhavan Srinivasan Co-authored-by: Ravi Bangoria Signed-off-by: Ravi Bangoria [Folded Maddy's IMC changes and Ravi's DAWR changes - Vasant] Signed-off-by: Vasant Hegde --- external/mambo/skiboot.tcl | 34 +++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/external/mambo/skiboot.tcl b/external/mambo/skiboot.tcl index 3a8e19406..0ecb55a77 100644 
--- a/external/mambo/skiboot.tcl
+++ b/external/mambo/skiboot.tcl
@@ -143,6 +143,17 @@ if { $default_config == "P9" } {
     }
 }
 
+if { $default_config == "P10" } {
+    # PVR configured for POWER10 DD1.0
+    myconf config processor/initial/PVR 0x800100
+    myconf config processor/initial/SIM_CTRL1 0xc228100400000000
+
+    if { $mconf(numa) } {
+	myconf config memory_region_id_shift 44
+    }
+}
+
+
 if { $mconf(numa) } {
     myconf config memory_regions $mconf(cpus)
 }
@@ -390,8 +401,8 @@ mysim of addprop $fake_nvram_node empty "name" "ibm,fake-nvram"
 
 set opal_node [mysim of addchild $root_node "ibm,opal" ""]
 
-# Allow P9 to use all idle states
-if { $default_config == "P9" } {
+# Allow P9/P10 to use all idle states
+if { $default_config == "P9" || $default_config == "P10" } {
     set power_mgt_node [mysim of addchild $opal_node "power-mgt" ""]
     mysim of addprop $power_mgt_node int "ibm,enabled-stop-levels" 0xffffffff
 }
@@ -461,7 +472,7 @@ for { set c 0 } { $c < $mconf(cpus) } { incr c } {
 	lappend reg 0x22 0x120 1 0x22 0x0003 ;# 16G seg 16G pages
 	mysim of addprop $cpu_node array "ibm,segment-page-sizes" reg
 
-	if { $default_config == "P9" } {
+	if { $default_config == "P9" || $default_config == "P10" } {
 	    # Set actual page size encodings
 	    set reg {}
 	    # 4K pages
@@ -476,8 +487,13 @@ for { set c 0 } { $c < $mconf(cpus) } { incr c } {
 
 	    set reg {}
 	    # POWER9 PAPR defines upto bytes 62-63
+	    # POWER10 PAPR defines upto byte 64-65
 	    # header + bytes 0-5
-	    lappend reg 0x4000f63fc70080c0
+	    if { $default_config == "P9" } {
+		lappend reg 0x4000f63fc70080c0
+	    } else {
+		lappend reg 0x4200f63fc70080c0
+	    }
 	    # bytes 6-13
 	    lappend reg 0x8000000000000000
 	    # bytes 14-21
@@ -492,8 +508,12 @@ for { set c 0 } { $c < $mconf(cpus) } { incr c } {
 	    lappend reg 0x8000800080008000
 	    # bytes 54-61 58/59=seg tbl
 	    lappend reg 0x8000800080008000
-	    # bytes 62-69
-	    lappend reg 0x8000000000000000
+	    # bytes 62-69 64/65=DAWR1(P10 only)
+	    if { $default_config == "P9" } {
+		lappend reg 0x8000000000000000
+	    } else {
+		lappend reg 0x8000800000000000
+	    }
 	    mysim of addprop $cpu_node array64 "ibm,pa-features" reg
 	} else {
 	    set reg {}
@@ -514,7 +534,7 @@ for { set c 0 } { $c < $mconf(cpus) } { incr c } {
 }
 
 #Add In-Memory Collection Counter nodes
-if { $default_config == "P9" } {
+if { $default_config == "P9" || $default_config == "P10" } {
     #Add the base node "imc-counters"
     set imc_c [mysim of addchild $root_node "imc-counters" ""]
     mysim of addprop $imc_c string "compatible" "ibm,opal-in-memory-counters"
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com  Wed Aug  4 17:20:42 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:42 +0530
Subject: [Skiboot] [PATCH v2 04/59] p10: Workaround core recovery issue
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-5-hegdevasant@linux.vnet.ibm.com>

From: Michael Neuling

This works around a core recovery issue in P10. The workaround has the
CME poll for a core recovery and perform the recovery procedure itself.
For this to happen, the host leaves core recovery off (HID[5]) and then
masks the PC system checkstop. This patch does exactly that.

Firmware starts skiboot with recovery already off, so we just leave it
off for longer and then mask the PC system checkstop. This makes the
window in which a core recovery can cause an xstop longer, but the
window is still small and it can still only happen during boot.
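A note on the bit numbering used above and in the diff below: skiboot
follows the IBM convention, where bit 0 is the most significant bit of a
64-bit register, so HID[5] and FIR bit 28 are high-order bits rather than
1 << 5 and 1 << 28. A minimal sketch of the convention, mirroring the
PPC_BIT() macro skiboot defines in its headers (shown here for
illustration only):

	/* IBM bit numbering: bit 0 is the MSB of a 64-bit value */
	#define PPC_BIT(bit)	(0x8000000000000000UL >> (bit))

	/* e.g. PPC_BIT(5) selects HID0[5] (dis_recovery), and PPC_BIT(28)
	 * selects the PC system checkstop bit ORed into the FIR mask in
	 * the patch below. */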
Signed-off-by: Michael Neuling [Added mambo check - Vasant] Signed-off-by: Vasant Hegde --- asm/head.S | 4 ++-- core/init.c | 36 ++++++++++++++++++++++++++++++++++++ include/xscom-p10-regs.h | 2 ++ 3 files changed, 40 insertions(+), 2 deletions(-) diff --git a/asm/head.S b/asm/head.S index f85b0fe29..fa8933b14 100644 --- a/asm/head.S +++ b/asm/head.S @@ -828,9 +828,9 @@ init_shared_sprs: /* HID0: * Boot with PPC_BIT(5) set (dis_recovery). - * Clear bit 5 to enable recovery. + * Leave bit 5 set to disable recovery (due to HW570622) */ - LOAD_IMM64(%r3, 0) + LOAD_IMM64(%r3, PPC_BIT(5)) sync mtspr SPR_HID0,%r3 isync diff --git a/core/init.c b/core/init.c index 65f136daa..0bf4ab269 100644 --- a/core/init.c +++ b/core/init.c @@ -47,6 +47,7 @@ #include #include #include +#include enum proc_gen proc_gen; unsigned int pcie_max_link_speed; @@ -989,6 +990,38 @@ bool verify_romem(void) return true; } +static void mask_pc_system_xstop(void) +{ + struct cpu_thread *cpu; + uint32_t chip_id, core_id; + int rc; + + if (proc_gen != proc_gen_p10) + return; + + if (chip_quirk(QUIRK_MAMBO_CALLOUTS)) + return; + + /* + * On P10 Mask PC system checkstop (bit 28). This is needed + * for HW570622. We keep processor recovery disabled via + * HID[5] and mask the checkstop that it can cause. CME does + * the recovery handling for us. + */ + for_each_cpu(cpu) { + chip_id = cpu->chip_id; + core_id = pir_to_core_id(cpu->pir); + + rc = xscom_write(chip_id, + XSCOM_ADDR_P10_EC(core_id, P10_CORE_FIRMASK_OR), + PPC_BIT(28)); + if (rc) + prerror("Error setting FIR MASK rc:%d on PIR:%x\n", + rc, cpu->pir); + } +} + + /* Called from head.S, thus no prototype. */ void __noreturn __nomcount main_cpu_entry(const void *fdt); @@ -1170,6 +1203,9 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) if (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) cpu_set_ipi_enable(true); + /* Once all CPU are up apply this workaround */ + mask_pc_system_xstop(); + /* Add the /opal node to the device-tree */ add_opal_node(); diff --git a/include/xscom-p10-regs.h b/include/xscom-p10-regs.h index 8096b2f91..6045152d2 100644 --- a/include/xscom-p10-regs.h +++ b/include/xscom-p10-regs.h @@ -4,6 +4,8 @@ /* Core FIR (Fault Isolation Register) */ #define P10_CORE_FIR 0x440 +#define P10_CORE_FIRMASK_OR 0x445 + /* Core WOF (Whose On First) */ #define P10_CORE_WOF 0x448 -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:41 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:41 +0530 Subject: [Skiboot] [PATCH v2 03/59] hw/p8-i2c: Add POWER10 support In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-4-hegdevasant@linux.vnet.ibm.com> From: Oliver O'Halloran Early P8s didn't have the I2C interrupt, but all the subsequent chips have one. Flip the interrupt support checking so the old chips are the special case rather than having to add a new entry for every new chip. P10 added several additional flag registers and moved the existing flag register. The actual data bits have not changed so the existing handshake protocol between the OCC and OPAL works just fine. 
Signed-off-by: Oliver O'Halloran Signed-off-by: Vasant Hegde --- hw/p8-i2c.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/hw/p8-i2c.c b/hw/p8-i2c.c index 6e24c3e82..45815858e 100644 --- a/hw/p8-i2c.c +++ b/hw/p8-i2c.c @@ -315,21 +315,15 @@ static bool p8_i2c_has_irqs(struct p8_i2c_master *master) * DD2.0. When operating without interrupts, we need to bump the * timeouts as we rely solely on the polls from Linux which can * be up to 2s apart ! - * - * Also we don't have interrupts for the Centaur i2c. */ - switch (chip->type) { - case PROC_CHIP_P8_MURANO: + if (proc_gen >= proc_gen_p9) + return true; + else if (chip->type == PROC_CHIP_P8_MURANO) return chip->ec_level >= 0x21; - case PROC_CHIP_P8_VENICE: + else if (chip->type == PROC_CHIP_P8_VENICE) return chip->ec_level >= 0x20; - case PROC_CHIP_P8_NAPLES: - case PROC_CHIP_P9_NIMBUS: - case PROC_CHIP_P9_CUMULUS: - return true; - default: - return false; - } + + return true; } static int p8_i2c_enable_irqs(struct p8_i2c_master *master) @@ -928,8 +922,8 @@ static int p8_i2c_check_initial_status(struct p8_i2c_master_port *port) */ static bool occ_uses_master(struct p8_i2c_master *master) { - /* OCC uses I2CM Engines 1,2 and 3, only on POWER9 */ - if (master->type == I2C_POWER8 && proc_gen == proc_gen_p9) + /* OCC uses I2CM Engines 1,2 and 3, only on POWER9/10 */ + if (master->type == I2C_POWER8 && proc_gen >= proc_gen_p9) return master->engine_id >= 1; return false; @@ -1591,7 +1585,12 @@ void p8_i2c_init(void) int i; /* setup the handshake reg */ - occflg = 0x6C08A; + if (proc_gen <= proc_gen_p9) + occflg = 0x6C08A; + else if (proc_gen == proc_gen_p10) + occflg = 0x6C0AC; + else + return; prlog(PR_INFO, "I2C: OCC flag reg: %x\n", occflg); -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:43 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:43 +0530 Subject: [Skiboot] [PATCH v2 05/59] cpufeatures: Add POWER10 support In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-6-hegdevasant@linux.vnet.ibm.com> From: Nicholas Piggin Signed-off-by: Nicholas Piggin Signed-off-by: Ravi Bangoria [Folded Ravi's DAWR patch - Vasant] Signed-off-by: Vasant Hegde --- core/cpufeatures.c | 104 +++++++++++++++++++++++++++++++++++---------- 1 file changed, 82 insertions(+), 22 deletions(-) diff --git a/core/cpufeatures.c b/core/cpufeatures.c index 2e33d8ad3..5620b741d 100644 --- a/core/cpufeatures.c +++ b/core/cpufeatures.c @@ -20,6 +20,7 @@ /* Device-tree visible constants follow */ #define ISA_V2_07B 2070 #define ISA_V3_0B 3000 +#define ISA_V3_1 3100 #define USABLE_PR (1U << 0) #define USABLE_OS (1U << 1) @@ -47,12 +48,13 @@ #define CPU_P9P (1U << 4) #define CPU_P9_DD2_2 (1U << 5) #define CPU_P9_DD2_3 (1U << 6) +#define CPU_P10 (1U << 7) #define CPU_P9_DD2 (CPU_P9_DD2_0_1|CPU_P9_DD2_2|CPU_P9_DD2_3|CPU_P9P) #define CPU_P8 (CPU_P8_DD1|CPU_P8_DD2) #define CPU_P9 (CPU_P9_DD1|CPU_P9_DD2|CPU_P9P) -#define CPU_ALL (CPU_P8|CPU_P9) +#define CPU_ALL (CPU_P8|CPU_P9|CPU_P10) struct cpu_feature { const char *name; @@ -202,6 +204,16 @@ static const struct cpu_feature cpu_features_table[] = { -1, -1, -1, NULL, }, + /* + * DAWR1, DAWRX1 etc. 
+ */ + { "debug-facilities-v31", + CPU_P10, + ISA_V3_1, USABLE_HV|USABLE_OS, + HV_CUSTOM, OS_CUSTOM, + -1, -1, -1, + NULL, }, + /* * ISAv2.07B CFAR */ @@ -473,7 +485,7 @@ static const struct cpu_feature cpu_features_table[] = { * ISAv3.0B radix based MMU */ { "mmu-radix", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS, HV_CUSTOM, OS_CUSTOM, -1, -1, -1, @@ -483,7 +495,7 @@ static const struct cpu_feature cpu_features_table[] = { * ISAv3.0B hash based MMU, new hash pte format, PCTR, etc */ { "mmu-hash-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS, HV_CUSTOM, OS_CUSTOM, -1, -1, -1, @@ -493,7 +505,7 @@ static const struct cpu_feature cpu_features_table[] = { * ISAv3.0B wait instruction */ { "wait-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, @@ -504,7 +516,7 @@ static const struct cpu_feature cpu_features_table[] = { * XXX: Same question as for idle-nap */ { "idle-stop", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS, HV_CUSTOM, OS_CUSTOM, -1, -1, -1, @@ -516,7 +528,7 @@ static const struct cpu_feature cpu_features_table[] = { * system reset SRR1 reason, etc. */ { "hypervisor-virtualization-interrupt", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV, HV_CUSTOM, OS_NONE, -1, -1, -1, @@ -532,6 +544,16 @@ static const struct cpu_feature cpu_features_table[] = { -1, -1, -1, NULL, }, + /* + * POWER10 MCE / machine check exception. + */ + { "machine-check-power10", + CPU_P10, + ISA_V3_0B, USABLE_HV|USABLE_OS, + HV_CUSTOM, OS_CUSTOM, + -1, -1, -1, + NULL, }, + /* * POWER9 PMU / performance monitor unit. */ @@ -542,12 +564,22 @@ static const struct cpu_feature cpu_features_table[] = { -1, -1, -1, NULL, }, + /* + * POWER10 PMU / performance monitor unit. + */ + { "performance-monitor-power10", + CPU_P10, + ISA_V3_1, USABLE_HV|USABLE_OS, + HV_CUSTOM, OS_CUSTOM, + -1, -1, -1, + NULL, }, + /* * ISAv3.0B scv/rfscv system call instructions and exceptions, fscr bit * etc. */ { "system-call-vectored", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_OS|USABLE_PR, HV_NONE, OS_CUSTOM, -1, PPC_BITLSHIFT(51), 52, @@ -558,7 +590,7 @@ static const struct cpu_feature cpu_features_table[] = { * global msgsnd, msgsndp, msgsync, doorbell, etc. 
*/ { "processor-control-facility-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS, HV_CUSTOM, OS_NONE, PPC_BITLSHIFT(53), -1, -1, @@ -568,7 +600,7 @@ static const struct cpu_feature cpu_features_table[] = { * ISAv3.0B addpcis instruction */ { "pc-relative-addressing", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, @@ -591,7 +623,7 @@ static const struct cpu_feature cpu_features_table[] = { * Large decrementer and hypervisor decrementer */ { "timer-facilities-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS, HV_NONE, OS_NONE, -1, -1, -1, @@ -601,7 +633,7 @@ static const struct cpu_feature cpu_features_table[] = { * ISAv3.0B deliver a random number instruction (darn) */ { "random-number-generator", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, 53, @@ -614,14 +646,14 @@ static const struct cpu_feature cpu_features_table[] = { * mcrxrx, setb */ { "fixed-point-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, NULL, }, { "decimal-integer-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, @@ -631,42 +663,42 @@ static const struct cpu_feature cpu_features_table[] = { * ISAv3.0B lightweight mffs */ { "floating-point-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, "floating-point", }, { "decimal-floating-point-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, "floating-point-v3 decimal-floating-point", }, { "vector-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, "vector", }, { "vector-scalar-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, "vector-v3 vector-scalar" }, { "vector-binary128", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, 54, "vector-scalar-v3", }, { "vector-binary16", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, @@ -676,7 +708,7 @@ static const struct cpu_feature cpu_features_table[] = { * ISAv3.0B external exception for EBB */ { "event-based-branch-v3", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, @@ -686,7 +718,7 @@ static const struct cpu_feature cpu_features_table[] = { * ISAv3.0B Atomic Memory Operations (AMO) */ { "atomic-memory-operations", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, @@ -696,7 +728,7 @@ static const struct cpu_feature cpu_features_table[] = { * ISAv3.0B Copy-Paste Facility */ { "copy-paste", - CPU_P9, + CPU_P9|CPU_P10, ISA_V3_0B, USABLE_HV|USABLE_OS|USABLE_PR, HV_NONE, OS_NONE, -1, -1, -1, @@ -713,6 +745,27 @@ static const struct cpu_feature cpu_features_table[] = { -1, -1, -1, NULL, }, + /* + * Enable matrix multiply accumulate. + */ + { "matrix-multiply-accumulate", + CPU_P10, + ISA_V3_1, USABLE_PR, + HV_CUSTOM, OS_CUSTOM, + -1, -1, 49, + NULL, }, + + /* + * Enable prefix instructions. Toolchains assume this is + * enabled for when compiling for ISA 3.1. + */ + { "prefix-instructions", + CPU_P10, + ISA_V3_1, USABLE_HV|USABLE_OS|USABLE_PR, + HV_HFSCR, OS_FSCR, + 13, 13, -1, + NULL, }, + /* * Due to hardware bugs in POWER9, the hypervisor needs to assist * guests. 
@@ -973,6 +1026,13 @@ void dt_add_cpufeatures(struct dt_node *root)
 			cpu_feature_isa = ISA_V3_0B;
 			cpu_feature_cpu = CPU_P9P;
 			break;
+		case PVR_TYPE_P10:
+			if (!cpu_name)
+				cpu_name = "POWER10";
+
+			cpu_feature_isa = ISA_V3_1;
+			cpu_feature_cpu = CPU_P10;
+			break;
 		default:
 			return;
 	}
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com  Wed Aug  4 17:20:44 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:44 +0530
Subject: [Skiboot] [PATCH v2 06/59] hw/chiptod: Add POWER10 support
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-7-hegdevasant@linux.vnet.ibm.com>

From: Nicholas Piggin

POWER10 changes to use the SCOM addressing mode, as it was found to be
more robust than the core ID addressing mode.

Signed-off-by: Nicholas Piggin
Signed-off-by: Vasant Hegde
---
 hw/chiptod.c | 74 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 59 insertions(+), 15 deletions(-)

diff --git a/hw/chiptod.c b/hw/chiptod.c
index 4e62fd714..3b57f5f16 100644
--- a/hw/chiptod.c
+++ b/hw/chiptod.c
@@ -105,6 +105,9 @@
 /* -- TOD Error interrupt register -- */
 #define TOD_ERROR_INJECT	0x00040031
 
+/* PC unit PIB address which receives the timebase transfer from TOD */
+#define PC_TOD			0x4A3
+
 /* Local FIR EH.TPCHIP.TPC.LOCAL_FIR */
 #define LOCAL_CORE_FIR		0x0104000C
 #define LFIR_SWITCH_COMPLETE	PPC_BIT(18)
@@ -122,7 +125,8 @@
 static enum chiptod_type {
 	chiptod_unknown,
 	chiptod_p8,
-	chiptod_p9
+	chiptod_p9,
+	chiptod_p10,
 } chiptod_type;
 
 enum chiptod_chip_role {
@@ -595,7 +599,8 @@ static bool chiptod_poll_running(void)
 
 static bool chiptod_to_tb(void)
 {
-	uint64_t tval, tfmr, tvbits;
+	uint32_t pir = this_cpu()->pir;
+	uint64_t tval, tfmr;
 	uint64_t timeout = 0;
 
 	/* Tell the ChipTOD about our fabric address
@@ -605,25 +610,51 @@ static bool chiptod_to_tb(void)
 	 * PIR between p7 and p8, we need to do the calculation differently.
 	 *
 	 * p7: 0b00001 || 3-bit core id
-	 * p8: 0b0001 || 4-bit core id
+	 * p8:  0b0001 || 4-bit core id
+	 * p9:  0b001 || 5-bit core id
+	 * p10: 0b001 || 5-bit core id
+	 *
+	 * However in P10 we don't use the core ID addressing, but rather core
+	 * scom addressing mode, which appears to work better.
 	 */
 	if (xscom_readme(TOD_PIB_MASTER, &tval)) {
 		prerror("XSCOM error reading PIB_MASTER\n");
 		return false;
 	}
+
+	if (chiptod_type == chiptod_p10) {
+		uint32_t core_id = pir_to_core_id(pir);
+
+		if (this_cpu()->is_fused_core &&
+		    PVR_VERS_MAJ(mfspr(SPR_PVR)) == 2) {
+			/* Workaround: must address the even small core. */
+			core_id &= ~1;
+		}
+
+		tval = XSCOM_ADDR_P10_EC(core_id, PC_TOD);
+
+		tval <<= 32; /* PIB slave address goes in PPC bits [0:31] */
+
+		tval |= PPC_BIT(35); /* Enable SCOM addressing. 
*/ + } else { - tvbits = (this_cpu()->pir >> 2) & 0x7; - tvbits |= 0x08; + uint64_t tvbits; + + if (chiptod_type == chiptod_p9) { + tvbits = (pir >> 2) & 0x1f; + tvbits |= 0x20; + } else if (chiptod_type == chiptod_p8) { + tvbits = (pir >> 3) & 0xf; + tvbits |= 0x10; + } else { + tvbits = (pir >> 2) & 0x7; + tvbits |= 0x08; + } + tval &= ~TOD_PIBM_ADDR_CFG_MCAST; + tval = SETFIELD(TOD_PIBM_ADDR_CFG_SLADDR, tval, tvbits); } - tval &= ~TOD_PIBM_ADDR_CFG_MCAST; - tval = SETFIELD(TOD_PIBM_ADDR_CFG_SLADDR, tval, tvbits); + if (xscom_writeme(TOD_PIB_MASTER, tval)) { prerror("XSCOM error writing PIB_MASTER\n"); return false; @@ -868,10 +899,21 @@ static void chiptod_sync_master(void *data) static void chiptod_sync_slave(void *data) { bool *result = data; + bool do_sync = false; /* Only get primaries, not threads */ - if (this_cpu()->is_secondary) { - /* On secondaries we just cleanup the TFMR */ + if (!this_cpu()->is_secondary) + do_sync = true; + + if (chiptod_type == chiptod_p10 && this_cpu()->is_fused_core && + PVR_VERS_MAJ(mfspr(SPR_PVR)) == 2) { + /* P10 DD2 fused core workaround, must sync on small cores */ + if (this_cpu() == this_cpu()->ec_primary) + do_sync = true; + } + + if (!do_sync) { + /* Just cleanup the TFMR */ chiptod_cleanup_thread_tfmr(); *result = true; return; @@ -1667,6 +1709,8 @@ static bool chiptod_probe(void) chiptod_type = chiptod_p8; if (dt_node_is_compatible(np, "ibm,power9-chiptod")) chiptod_type = chiptod_p9; + if (dt_node_is_compatible(np, "ibm,power10-chiptod")) + chiptod_type = chiptod_p10; } if (dt_has_node_property(np, "secondary", NULL)) -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:45 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:45 +0530 Subject: [Skiboot] [PATCH v2 07/59] Basic P10 stop state support In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-8-hegdevasant@linux.vnet.ibm.com> From: Vaidyanathan Srinivasan Adds support for STOP0 lite, STOP2 and STOP3 for Power10 with the following latencies, residency requirements: latency residency stop0lite 1us 10us stop0 10us 100us stop2 20us 200us stop3 45us 450us Signed-off-by: Vaidyanathan Srinivasan Signed-off-by: Pratik R. 
Sampat Signed-off-by: Vasant Hegde --- hw/homer.c | 16 +++++++ hw/slw.c | 126 ++++++++++++++++++++++++++++++++++++++++++++++------- 2 files changed, 126 insertions(+), 16 deletions(-) diff --git a/hw/homer.c b/hw/homer.c index c5dbd58e3..3ff6ed1ae 100644 --- a/hw/homer.c +++ b/hw/homer.c @@ -15,6 +15,9 @@ #define P9_PBA_BAR0 0x5012B00 #define P9_PBA_BARMASK0 0x5012B04 +#define P10_PBA_BAR0 0x01010CDA +#define P10_PBA_BARMASK0 0x01010CDE + #define PBA_MASK_ALL_BITS 0x000001FFFFF00000ULL /* Bits 23:43 */ enum P8_BAR { @@ -31,6 +34,13 @@ enum P9_BAR { P9_BAR_SBE = 3, }; +enum P10_BAR { + P10_BAR_HOMER = 0, + P10_BAR_OCMB_THERMAL = 1, + P10_BAR_OCC_COMMON = 2, + P10_BAR_SBE = 3, +}; + static u64 pba_bar0, pba_barmask0; static u8 bar_homer, bar_slw, bar_occ_common; @@ -190,6 +200,12 @@ void homer_init(void) bar_homer = P9_BAR_HOMER; bar_occ_common = P9_BAR_OCC_COMMON; break; + case proc_gen_p10: + pba_bar0 = P10_PBA_BAR0; + pba_barmask0 = P10_PBA_BARMASK0; + bar_homer = P10_BAR_HOMER; + bar_occ_common = P10_BAR_OCC_COMMON; + break; default: return; }; diff --git a/hw/slw.c b/hw/slw.c index a0145deb6..8969096ac 100644 --- a/hw/slw.c +++ b/hw/slw.c @@ -761,6 +761,92 @@ static struct cpu_idle_states power9_fusedcore_cpu_idle_states[] = { .pm_ctrl_reg_mask = OPAL_PM_PSSCR_MASK }, }; +/* + * Note latency_ns and residency_ns are estimated values for now. + */ +static struct cpu_idle_states power10_cpu_idle_states[] = { + { + .name = "stop0_lite", /* Enter stop0 with no state loss */ + .latency_ns = 1000, + .residency_ns = 10000, + .flags = 0*OPAL_PM_DEC_STOP \ + | 0*OPAL_PM_TIMEBASE_STOP \ + | 0*OPAL_PM_LOSE_USER_CONTEXT \ + | 0*OPAL_PM_LOSE_HYP_CONTEXT \ + | 0*OPAL_PM_LOSE_FULL_CONTEXT \ + | 1*OPAL_PM_STOP_INST_FAST, + .pm_ctrl_reg_val = OPAL_PM_PSSCR_RL(0) \ + | OPAL_PM_PSSCR_MTL(0) \ + | OPAL_PM_PSSCR_TR(3), + .pm_ctrl_reg_mask = OPAL_PM_PSSCR_MASK }, + { + .name = "stop0", + .latency_ns = 10000, + .residency_ns = 100000, + .flags = 0*OPAL_PM_DEC_STOP \ + | 0*OPAL_PM_TIMEBASE_STOP \ + | 1*OPAL_PM_LOSE_USER_CONTEXT \ + | 0*OPAL_PM_LOSE_HYP_CONTEXT \ + | 0*OPAL_PM_LOSE_FULL_CONTEXT \ + | 1*OPAL_PM_STOP_INST_FAST, + .pm_ctrl_reg_val = OPAL_PM_PSSCR_RL(0) \ + | OPAL_PM_PSSCR_MTL(0) \ + | OPAL_PM_PSSCR_TR(3) \ + | OPAL_PM_PSSCR_ESL \ + | OPAL_PM_PSSCR_EC, + .pm_ctrl_reg_mask = OPAL_PM_PSSCR_MASK }, + { + .name = "stop2", + .latency_ns = 20000, + .residency_ns = 200000, + .flags = 0*OPAL_PM_DEC_STOP \ + | 0*OPAL_PM_TIMEBASE_STOP \ + | 1*OPAL_PM_LOSE_USER_CONTEXT \ + | 0*OPAL_PM_LOSE_HYP_CONTEXT \ + | 0*OPAL_PM_LOSE_FULL_CONTEXT \ + | 1*OPAL_PM_STOP_INST_FAST, + .pm_ctrl_reg_val = OPAL_PM_PSSCR_RL(2) \ + | OPAL_PM_PSSCR_MTL(2) \ + | OPAL_PM_PSSCR_TR(3) \ + | OPAL_PM_PSSCR_ESL \ + | OPAL_PM_PSSCR_EC, + .pm_ctrl_reg_mask = OPAL_PM_PSSCR_MASK }, + { + .name = "stop3", + .latency_ns = 45000, + .residency_ns = 450000, + .flags = 0*OPAL_PM_DEC_STOP \ + | 0*OPAL_PM_TIMEBASE_STOP \ + | 1*OPAL_PM_LOSE_USER_CONTEXT \ + | 0*OPAL_PM_LOSE_HYP_CONTEXT \ + | 0*OPAL_PM_LOSE_FULL_CONTEXT \ + | 1*OPAL_PM_STOP_INST_FAST, + .pm_ctrl_reg_val = OPAL_PM_PSSCR_RL(3) \ + | OPAL_PM_PSSCR_MTL(3) \ + | OPAL_PM_PSSCR_TR(3) \ + | OPAL_PM_PSSCR_ESL \ + | OPAL_PM_PSSCR_EC, + .pm_ctrl_reg_mask = OPAL_PM_PSSCR_MASK }, +#if 0 + { + .name = "stop11", + .latency_ns = 10000000, + .residency_ns = 100000000, + .flags = 1*OPAL_PM_DEC_STOP \ + | 1*OPAL_PM_TIMEBASE_STOP \ + | 1*OPAL_PM_LOSE_USER_CONTEXT \ + | 1*OPAL_PM_LOSE_HYP_CONTEXT \ + | 1*OPAL_PM_LOSE_FULL_CONTEXT \ + | 1*OPAL_PM_STOP_INST_DEEP, + .pm_ctrl_reg_val = OPAL_PM_PSSCR_RL(11) \ + 
| OPAL_PM_PSSCR_MTL(11) \ + | OPAL_PM_PSSCR_TR(3) \ + | OPAL_PM_PSSCR_ESL \ + | OPAL_PM_PSSCR_EC, + .pm_ctrl_reg_mask = OPAL_PM_PSSCR_MASK }, +#endif +}; + static void slw_late_init_p9(struct proc_chip *chip) { struct cpu_thread *c; @@ -801,7 +887,7 @@ void add_cpu_idle_state_properties(void) fdt64_t *pm_ctrl_reg_val_buf; fdt64_t *pm_ctrl_reg_mask_buf; u32 supported_states_mask; - u32 opal_disabled_states_mask = ~0xEC000000; /* all but stop11 */ + u32 opal_disabled_states_mask = ~0xFC000000; /* all but stop11 */ const char* nvram_disable_str; u32 nvram_disabled_states_mask = 0x00; u32 stop_levels; @@ -839,18 +925,26 @@ void add_cpu_idle_state_properties(void) */ chip = next_chip(NULL); assert(chip); - if (chip->type == PROC_CHIP_P9_NIMBUS || - chip->type == PROC_CHIP_P9_CUMULUS || - chip->type == PROC_CHIP_P9P) { - if (proc_chip_quirks & QUIRK_MAMBO_CALLOUTS) { - states = power9_mambo_cpu_idle_states; - nr_states = ARRAY_SIZE(power9_mambo_cpu_idle_states); - } else if (this_cpu()->is_fused_core) { - states = power9_fusedcore_cpu_idle_states; - nr_states = ARRAY_SIZE(power9_fusedcore_cpu_idle_states); - } else { - states = power9_cpu_idle_states; - nr_states = ARRAY_SIZE(power9_cpu_idle_states); + if (proc_gen >= proc_gen_p9) { + if (chip->type == PROC_CHIP_P9_NIMBUS || + chip->type == PROC_CHIP_P9_CUMULUS || + chip->type == PROC_CHIP_P9P) { + if (proc_chip_quirks & QUIRK_MAMBO_CALLOUTS) { + states = power9_mambo_cpu_idle_states; + nr_states = ARRAY_SIZE(power9_mambo_cpu_idle_states); + } else if (this_cpu()->is_fused_core) { + states = power9_fusedcore_cpu_idle_states; + nr_states = ARRAY_SIZE(power9_fusedcore_cpu_idle_states); + } else { + states = power9_cpu_idle_states; + nr_states = ARRAY_SIZE(power9_cpu_idle_states); + } + } else if (chip->type == PROC_CHIP_P10) { + states = power10_cpu_idle_states; + nr_states = ARRAY_SIZE(power10_cpu_idle_states); + } else { + prlog(PR_ERR, "determining chip type\n"); + return; } has_stop_inst = true; @@ -934,7 +1028,7 @@ void add_cpu_idle_state_properties(void) * device-tree */ if (has_stop_inst) { - /* Power 9 / POWER ISA 3.0 */ + /* Power 9/10 / POWER ISA 3.0 and above */ supported_states_mask = OPAL_PM_STOP_INST_FAST; if (wakeup_engine_state == WAKEUP_ENGINE_PRESENT) supported_states_mask |= OPAL_PM_STOP_INST_DEEP; @@ -1463,7 +1557,7 @@ int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val) return OPAL_PARAMETER; } - if (proc_gen == proc_gen_p9) { + if (proc_gen >= proc_gen_p9) { if (!has_deep_states) { prlog(PR_INFO, "SLW: Deep states not enabled\n"); return OPAL_SUCCESS; @@ -1540,7 +1634,7 @@ void slw_init(void) slw_late_init_p8(chip); } p8_sbe_init_timer(); - } else if (proc_gen == proc_gen_p9) { + } else if (proc_gen >= proc_gen_p9) { for_each_chip(chip) { slw_init_chip_p9(chip); if(slw_image_check_p9(chip)) -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:40 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:40 +0530 Subject: [Skiboot] [PATCH v2 02/59] Initial POWER10 enablement In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-3-hegdevasant@linux.vnet.ibm.com> From: Nicholas Piggin Co-authored-by: Nicholas Piggin Signed-off-by: Nicholas Piggin Co-authored-by: Vaidyanathan Srinivasan Signed-off-by: Vaidyanathan Srinivasan Co-authored-by: Michael Neuling Signed-off-by: Michael Neuling Co-authored-by: Vasant Hegde Signed-off-by: Vasant Hegde 
Co-authored-by: Mahesh Salgaonkar Signed-off-by: Mahesh Salgaonkar Co-authored-by: Cédric Le Goater Signed-off-by: Cédric Le Goater Signed-off-by: Vasant Hegde --- asm/head.S | 55 +++++- asm/misc.S | 4 +- core/affinity.c | 2 + core/chip.c | 40 +++- core/cpu.c | 32 +++- core/direct-controls.c | 363 ++++++++++++++++++++++++++++++++++--- core/hmi.c | 221 ++++++++++++++++++++-- core/init.c | 2 +- core/mce.c | 129 ++++++++++++- core/test/run-timer.c | 2 +- doc/platforms-and-cpus.rst | 1 + hw/chiptod.c | 30 ++- hw/dts.c | 7 +- hw/lpc.c | 7 +- hw/xscom.c | 25 ++- include/chip.h | 49 +++++ include/opal-api.h | 1 + include/processor.h | 52 ++++-- include/skiboot.h | 1 + include/xscom-p10-regs.h | 54 ++++++ include/xscom.h | 85 +++++++++ 21 files changed, 1067 insertions(+), 95 deletions(-) create mode 100644 include/xscom-p10-regs.h diff --git a/asm/head.S b/asm/head.S index d773bde04..f85b0fe29 100644 --- a/asm/head.S +++ b/asm/head.S @@ -324,7 +324,7 @@ boot_offset: * r28 : PVR * r27 : DTB pointer (or NULL) * r26 : PIR thread mask - * r25 : P9 fused core flag + * r25 : P9/10 fused core flag */ .global boot_entry boot_entry: @@ -342,6 +342,8 @@ boot_entry: beq 3f cmpwi cr0,%r3,PVR_TYPE_P9P beq 3f + cmpwi cr0,%r3,PVR_TYPE_P10 + beq 4f attn /* Unsupported CPU type... what do we do ? */ b . /* loop here, just in case attn is disabled */ @@ -352,8 +354,17 @@ boot_entry: mfspr %r3, SPR_SPRD andi. %r25, %r3, 1 beq 1f + b 2f - /* P8 or P9 fused -> 8 threads */ +4: /* + * P10 fused core check (SPRC/SPRD method does not work). + * PVR bit 12 set = normal code + */ + andi. %r3, %r28, 0x1000 + bne 1f + li %r25, 1 + + /* P8 or P9 fused or P10 fused -> 8 threads */ 2: li %r26,7 @@ -730,6 +741,8 @@ init_shared_sprs: beq 4f cmpwi cr0,%r3,PVR_TYPE_P9P beq 4f + cmpwi cr0,%r3,PVR_TYPE_P10 + beq 5f /* Unsupported CPU type... what do we do ?
*/ b 9f @@ -845,6 +886,16 @@ init_replicated_sprs: LOAD_IMM64(%r3,0x0000000000000010) mtspr SPR_DSCR,%r3 +5: /* P10 */ + /* LPCR: sane value */ + LOAD_IMM64(%r3,0x0040000000000000) + mtspr SPR_LPCR, %r3 + sync + isync + /* DSCR: Stride-N Stream Enable */ + LOAD_IMM64(%r3,0x0000000000000010) + mtspr SPR_DSCR,%r3 + 9: blr .global enter_nap diff --git a/asm/misc.S b/asm/misc.S index 033448975..ea4376322 100644 --- a/asm/misc.S +++ b/asm/misc.S @@ -99,13 +99,15 @@ cleanup_local_tlb: .global cleanup_global_tlb cleanup_global_tlb: - /* Only supported on P9 for now */ + /* Only supported on P9, P10 for now */ mfspr %r3,SPR_PVR srdi %r3,%r3,16 cmpwi cr0,%r3,PVR_TYPE_P9 beq cr0,1f cmpwi cr0,%r3,PVR_TYPE_P9P beq cr0,1f + cmpwi cr0,%r3,PVR_TYPE_P10 + beq cr0,1f blr /* Sync out previous updates */ diff --git a/core/affinity.c b/core/affinity.c index 47ba33cf2..0209d3cd9 100644 --- a/core/affinity.c +++ b/core/affinity.c @@ -111,6 +111,8 @@ void add_core_associativity(struct cpu_thread *cpu) core_id = (cpu->pir >> 3) & 0xf; else if (proc_gen == proc_gen_p9) core_id = (cpu->pir >> 2) & 0x1f; + else if (proc_gen == proc_gen_p10) + core_id = (cpu->pir >> 2) & 0x1f; else return; diff --git a/core/chip.c b/core/chip.c index f1269d3f9..f79e8cd04 100644 --- a/core/chip.c +++ b/core/chip.c @@ -13,7 +13,9 @@ enum proc_chip_quirks proc_chip_quirks; uint32_t pir_to_chip_id(uint32_t pir) { - if (proc_gen == proc_gen_p9) + if (proc_gen == proc_gen_p10) + return P10_PIR2GCID(pir); + else if (proc_gen == proc_gen_p9) return P9_PIR2GCID(pir); else if (proc_gen == proc_gen_p8) return P8_PIR2GCID(pir); @@ -23,41 +25,59 @@ uint32_t pir_to_chip_id(uint32_t pir) uint32_t pir_to_core_id(uint32_t pir) { - if (proc_gen == proc_gen_p9) { + if (proc_gen == proc_gen_p10) { + if (this_cpu()->is_fused_core) + return P10_PIRFUSED2NORMALCOREID(pir); + else + return P10_PIR2COREID(pir); + } else if (proc_gen == proc_gen_p9) { if (this_cpu()->is_fused_core) return P9_PIRFUSED2NORMALCOREID(pir); else return P9_PIR2COREID(pir); - } else if (proc_gen == proc_gen_p8) + } else if (proc_gen == proc_gen_p8) { return P8_PIR2COREID(pir); - else + } else { assert(false); + } } uint32_t pir_to_fused_core_id(uint32_t pir) { - if (proc_gen == proc_gen_p9) { + if (proc_gen == proc_gen_p10) { + if (this_cpu()->is_fused_core) + return P10_PIR2FUSEDCOREID(pir); + else + return P10_PIR2COREID(pir); + } else if (proc_gen == proc_gen_p9) { if (this_cpu()->is_fused_core) return P9_PIR2FUSEDCOREID(pir); else return P9_PIR2COREID(pir); - } else if (proc_gen == proc_gen_p8) + } else if (proc_gen == proc_gen_p8) { return P8_PIR2COREID(pir); - else + } else { assert(false); + } } uint32_t pir_to_thread_id(uint32_t pir) { - if (proc_gen == proc_gen_p9) { + if (proc_gen == proc_gen_p10) { + if (this_cpu()->is_fused_core) + return P10_PIRFUSED2NORMALTHREADID(pir); + else + return P10_PIR2THREADID(pir); + } else if (proc_gen == proc_gen_p9) { if (this_cpu()->is_fused_core) return P9_PIRFUSED2NORMALTHREADID(pir); else return P9_PIR2THREADID(pir); - } else if (proc_gen == proc_gen_p8) + } else if (proc_gen == proc_gen_p8) { return P8_PIR2THREADID(pir); - else + } else { assert(false); + } } struct proc_chip *next_chip(struct proc_chip *chip) diff --git a/core/cpu.c b/core/cpu.c index dbc1ff445..f58aeb27a 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -100,7 +100,7 @@ static void cpu_wake(struct cpu_thread *cpu) if (proc_gen == proc_gen_p8) { /* Poke IPI */ icp_kick_cpu(cpu); - } else if (proc_gen == proc_gen_p9) { + } else if (proc_gen == proc_gen_p9 || proc_gen == 
proc_gen_p10) { p9_dbell_send(cpu->pir); } } @@ -507,6 +507,9 @@ static void cpu_idle_pm(enum cpu_wake_cause wake_on) case proc_gen_p9: vec = cpu_idle_p9(wake_on); break; + case proc_gen_p10: + vec = cpu_idle_p9(wake_on); + break; default: vec = 0; prlog_once(PR_DEBUG, "cpu_idle_pm called with bad processor type\n"); @@ -605,7 +608,7 @@ static void cpu_pm_disable(void) cpu_relax(); } } - } else if (proc_gen == proc_gen_p9) { + } else if (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) { for_each_available_cpu(cpu) { if (cpu->in_sleep || cpu->in_idle) p9_dbell_send(cpu->pir); @@ -648,7 +651,7 @@ void cpu_set_sreset_enable(bool enabled) pm_enabled = true; } - } else if (proc_gen == proc_gen_p9) { + } else if (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) { sreset_enabled = enabled; sync(); /* @@ -676,7 +679,7 @@ void cpu_set_ipi_enable(bool enabled) pm_enabled = true; } - } else if (proc_gen == proc_gen_p9) { + } else if (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) { ipi_enabled = enabled; sync(); if (!enabled) @@ -1014,6 +1017,13 @@ void init_boot_cpu(void) hid0_hile = SPR_HID0_POWER9_HILE; hid0_attn = SPR_HID0_POWER9_ENABLE_ATTN; break; + case PVR_TYPE_P10: + proc_gen = proc_gen_p10; + hile_supported = true; + radix_supported = true; + hid0_hile = SPR_HID0_POWER10_HILE; + hid0_attn = SPR_HID0_POWER10_ENABLE_ATTN; + break; default: proc_gen = proc_gen_unknown; } @@ -1033,6 +1043,14 @@ void init_boot_cpu(void) prlog(PR_INFO, "CPU: P9 generation processor" " (max %d threads/core)\n", cpu_thread_count); break; + case proc_gen_p10: + if (is_fused_core(pvr)) + cpu_thread_count = 8; + else + cpu_thread_count = 4; + prlog(PR_INFO, "CPU: P10 generation processor" + " (max %d threads/core)\n", cpu_thread_count); + break; default: prerror("CPU: Unknown PVR, assuming 1 thread\n"); cpu_thread_count = 1; @@ -1535,7 +1553,8 @@ void cpu_fast_reboot_complete(void) current_hile_mode = HAVE_LITTLE_ENDIAN; /* and set HID0:RADIX */ - current_radix_mode = true; + if (proc_gen == proc_gen_p9) + current_radix_mode = true; } static int64_t opal_reinit_cpus(uint64_t flags) @@ -1616,7 +1635,8 @@ static int64_t opal_reinit_cpus(uint64_t flags) flags &= ~(OPAL_REINIT_CPUS_MMU_HASH | OPAL_REINIT_CPUS_MMU_RADIX); - if (radix != current_radix_mode) { + + if (proc_gen == proc_gen_p9 && radix != current_radix_mode) { if (radix) req.set_bits |= SPR_HID0_POWER9_RADIX; else diff --git a/core/direct-controls.c b/core/direct-controls.c index 0274367da..f7509dde0 100644 --- a/core/direct-controls.c +++ b/core/direct-controls.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -268,6 +269,25 @@ static int p8_sreset_thread(struct cpu_thread *cpu) * using scom registers. */ +static int p9_core_is_gated(struct cpu_thread *cpu) +{ + uint32_t chip_id = pir_to_chip_id(cpu->pir); + uint32_t core_id = pir_to_core_id(cpu->pir); + uint32_t sshhyp_addr; + uint64_t val; + + sshhyp_addr = XSCOM_ADDR_P9_EC_SLAVE(core_id, P9_EC_PPM_SSHHYP); + + if (xscom_read(chip_id, sshhyp_addr, &val)) { + prlog(PR_ERR, "Could not query core gated on %u:%u:" + " Unable to read PPM_SSHHYP.\n", + chip_id, core_id); + return OPAL_HARDWARE; + } + + return !!(val & P9_CORE_GATED); +} + static int p9_core_set_special_wakeup(struct cpu_thread *cpu) { uint32_t chip_id = pir_to_chip_id(cpu->pir); @@ -301,7 +321,7 @@ static int p9_core_set_special_wakeup(struct cpu_thread *cpu) * out of stop state. If CORE_GATED is still set then * raise error. 
*/ - if (dctl_core_is_gated(cpu)) { + if (p9_core_is_gated(cpu)) { /* Deassert spwu for this strange error */ xscom_write(chip_id, swake_addr, 0); prlog(PR_ERR, "Failed special wakeup on %u:%u" @@ -517,6 +537,295 @@ static int p9_sreset_thread(struct cpu_thread *cpu) return 0; } +/**************** POWER10 direct controls ****************/ + +/* Long running instructions may take time to complete. Timeout 100ms */ +#define P10_QUIESCE_POLL_INTERVAL 100 +#define P10_QUIESCE_TIMEOUT 100000 + +/* Waking may take up to 5ms for deepest sleep states. Set timeout to 100ms */ +#define P10_SPWU_POLL_INTERVAL 100 +#define P10_SPWU_TIMEOUT 100000 + +/* + * This implements direct control facilities of processor cores and threads + * using scom registers. + */ +static int p10_core_is_gated(struct cpu_thread *cpu) +{ + uint32_t chip_id = pir_to_chip_id(cpu->pir); + uint32_t core_id = pir_to_core_id(cpu->pir); + uint32_t ssh_addr; + uint64_t val; + + ssh_addr = XSCOM_ADDR_P10_QME_CORE(core_id, P10_QME_SSH_HYP); + + if (xscom_read(chip_id, ssh_addr, &val)) { + prlog(PR_ERR, "Could not query core gated on %u:%u:" + " Unable to read QME_SSH_HYP.\n", + chip_id, core_id); + return OPAL_HARDWARE; + } + + return !!(val & P10_SSH_CORE_GATED); +} + + +static int p10_core_set_special_wakeup(struct cpu_thread *cpu) +{ + uint32_t chip_id = pir_to_chip_id(cpu->pir); + uint32_t core_id = pir_to_core_id(cpu->pir); + uint32_t spwu_addr, ssh_addr; + uint64_t val; + int i; + + /* P10 could use SPWU_HYP done bit instead of SSH? */ + spwu_addr = XSCOM_ADDR_P10_QME_CORE(core_id, P10_QME_SPWU_HYP); + ssh_addr = XSCOM_ADDR_P10_QME_CORE(core_id, P10_QME_SSH_HYP); + + if (xscom_write(chip_id, spwu_addr, P10_SPWU_REQ)) { + prlog(PR_ERR, "Could not set special wakeup on %u:%u:" + " Unable to write QME_SPWU_HYP.\n", + chip_id, core_id); + return OPAL_HARDWARE; + } + + for (i = 0; i < P10_SPWU_TIMEOUT / P10_SPWU_POLL_INTERVAL; i++) { + if (xscom_read(chip_id, ssh_addr, &val)) { + prlog(PR_ERR, "Could not set special wakeup on %u:%u:" + " Unable to read QME_SSH_HYP.\n", + chip_id, core_id); + return OPAL_HARDWARE; + } + if (val & P10_SSH_SPWU_DONE) { + /* + * CORE_GATED will be unset on a successful special + * wakeup of the core which indicates that the core is + * out of stop state. If CORE_GATED is still set then + * raise error. + */ + if (p10_core_is_gated(cpu)) { + /* Deassert spwu for this strange error */ + xscom_write(chip_id, spwu_addr, 0); + prlog(PR_ERR, "Failed special wakeup on %u:%u" + " core remains gated.\n", + chip_id, core_id); + return OPAL_HARDWARE; + } else { + return 0; + } + } + time_wait_us(P10_SPWU_POLL_INTERVAL); + } + + prlog(PR_ERR, "Could not set special wakeup on %u:%u:" + " operation timeout.\n", + chip_id, core_id); + /* + * As per the special wakeup protocol we should not de-assert + * the special wakeup on the core until WAKEUP_DONE is set. + * So even on error do not de-assert. 
+ */ + + return OPAL_HARDWARE; +} + +static int p10_core_clear_special_wakeup(struct cpu_thread *cpu) +{ + uint32_t chip_id = pir_to_chip_id(cpu->pir); + uint32_t core_id = pir_to_core_id(cpu->pir); + uint32_t spwu_addr; + + spwu_addr = XSCOM_ADDR_P10_QME_CORE(core_id, P10_QME_SPWU_HYP); + + /* Add a small delay here if spwu problems time_wait_us(1); */ + if (xscom_write(chip_id, spwu_addr, 0)) { + prlog(PR_ERR, "Could not clear special wakeup on %u:%u:" + " Unable to write QME_SPWU_HYP.\n", + chip_id, core_id); + return OPAL_HARDWARE; + } + + return 0; +} + +static int p10_thread_quiesced(struct cpu_thread *cpu) +{ + uint32_t chip_id = pir_to_chip_id(cpu->pir); + uint32_t core_id = pir_to_core_id(cpu->pir); + uint32_t thread_id = pir_to_thread_id(cpu->pir); + uint32_t ras_addr; + uint64_t ras_status; + + ras_addr = XSCOM_ADDR_P10_EC(core_id, P10_EC_RAS_STATUS); + if (xscom_read(chip_id, ras_addr, &ras_status)) { + prlog(PR_ERR, "Could not check thread state on %u:%u:" + " Unable to read EC_RAS_STATUS.\n", + chip_id, core_id); + return OPAL_HARDWARE; + } + + /* + * p10_thread_stop for the purpose of sreset wants QUIESCED + * and MAINT bits set. Step, RAM, etc. need more, but we don't + * use those in skiboot. + * + * P10 could try wait for more here in case of errors. + */ + if (!(ras_status & P10_THREAD_QUIESCED(thread_id))) + return 0; + + if (!(ras_status & P10_THREAD_MAINT(thread_id))) + return 0; + + return 1; +} + +static int p10_cont_thread(struct cpu_thread *cpu) +{ + uint32_t chip_id = pir_to_chip_id(cpu->pir); + uint32_t core_id = pir_to_core_id(cpu->pir); + uint32_t thread_id = pir_to_thread_id(cpu->pir); + uint32_t cts_addr; + uint32_t ti_addr; + uint32_t dctl_addr; + uint64_t core_thread_state; + uint64_t thread_info; + bool active, stop; + int rc; + int i; + + rc = p10_thread_quiesced(cpu); + if (rc < 0) + return rc; + if (!rc) { + prlog(PR_ERR, "Could not cont thread %u:%u:%u:" + " Thread is not quiesced.\n", + chip_id, core_id, thread_id); + return OPAL_BUSY; + } + + cts_addr = XSCOM_ADDR_P10_EC(core_id, P10_EC_CORE_THREAD_STATE); + ti_addr = XSCOM_ADDR_P10_EC(core_id, P10_EC_THREAD_INFO); + dctl_addr = XSCOM_ADDR_P10_EC(core_id, P10_EC_DIRECT_CONTROLS); + + if (xscom_read(chip_id, cts_addr, &core_thread_state)) { + prlog(PR_ERR, "Could not resume thread %u:%u:%u:" + " Unable to read EC_CORE_THREAD_STATE.\n", + chip_id, core_id, thread_id); + return OPAL_HARDWARE; + } + if (core_thread_state & P10_THREAD_STOPPED(thread_id)) + stop = true; + else + stop = false; + + if (xscom_read(chip_id, ti_addr, &thread_info)) { + prlog(PR_ERR, "Could not resume thread %u:%u:%u:" + " Unable to read EC_THREAD_INFO.\n", + chip_id, core_id, thread_id); + return OPAL_HARDWARE; + } + if (thread_info & P10_THREAD_ACTIVE(thread_id)) + active = true; + else + active = false; + + if (!active || stop) { + if (xscom_write(chip_id, dctl_addr, P10_THREAD_CLEAR_MAINT(thread_id))) { + prlog(PR_ERR, "Could not resume thread %u:%u:%u:" + " Unable to write EC_DIRECT_CONTROLS.\n", + chip_id, core_id, thread_id); + } + } else { + if (xscom_write(chip_id, dctl_addr, P10_THREAD_START(thread_id))) { + prlog(PR_ERR, "Could not resume thread %u:%u:%u:" + " Unable to write EC_DIRECT_CONTROLS.\n", + chip_id, core_id, thread_id); + } + } + + for (i = 0; i < P10_QUIESCE_TIMEOUT / P10_QUIESCE_POLL_INTERVAL; i++) { + int rc = p10_thread_quiesced(cpu); + if (rc < 0) + break; + if (!rc) + return 0; + + time_wait_us(P10_QUIESCE_POLL_INTERVAL); + } + + prlog(PR_ERR, "Could not start thread %u:%u:%u:" + " Unable to start 
thread.\n", + chip_id, core_id, thread_id); + + return OPAL_HARDWARE; +} + +static int p10_stop_thread(struct cpu_thread *cpu) +{ + uint32_t chip_id = pir_to_chip_id(cpu->pir); + uint32_t core_id = pir_to_core_id(cpu->pir); + uint32_t thread_id = pir_to_thread_id(cpu->pir); + uint32_t dctl_addr; + int rc; + int i; + + dctl_addr = XSCOM_ADDR_P10_EC(core_id, P10_EC_DIRECT_CONTROLS); + + rc = p10_thread_quiesced(cpu); + if (rc < 0) + return rc; + if (rc) { + prlog(PR_ERR, "Could not stop thread %u:%u:%u:" + " Thread is quiesced already.\n", + chip_id, core_id, thread_id); + return OPAL_BUSY; + } + + if (xscom_write(chip_id, dctl_addr, P10_THREAD_STOP(thread_id))) { + prlog(PR_ERR, "Could not stop thread %u:%u:%u:" + " Unable to write EC_DIRECT_CONTROLS.\n", + chip_id, core_id, thread_id); + return OPAL_HARDWARE; + } + + for (i = 0; i < P10_QUIESCE_TIMEOUT / P10_QUIESCE_POLL_INTERVAL; i++) { + int rc = p10_thread_quiesced(cpu); + if (rc < 0) + break; + if (rc) + return 0; + + time_wait_us(P10_QUIESCE_POLL_INTERVAL); + } + + prlog(PR_ERR, "Could not stop thread %u:%u:%u:" + " Unable to quiesce thread.\n", + chip_id, core_id, thread_id); + + return OPAL_HARDWARE; +} + +static int p10_sreset_thread(struct cpu_thread *cpu) +{ + uint32_t chip_id = pir_to_chip_id(cpu->pir); + uint32_t core_id = pir_to_core_id(cpu->pir); + uint32_t thread_id = pir_to_thread_id(cpu->pir); + uint32_t dctl_addr; + + dctl_addr = XSCOM_ADDR_P10_EC(core_id, P10_EC_DIRECT_CONTROLS); + + if (xscom_write(chip_id, dctl_addr, P10_THREAD_SRESET(thread_id))) { + prlog(PR_ERR, "Could not sreset thread %u:%u:%u:" + " Unable to write EC_DIRECT_CONTROLS.\n", + chip_id, core_id, thread_id); + return OPAL_HARDWARE; + } + + return 0; +} + /**************** generic direct controls ****************/ int dctl_set_special_wakeup(struct cpu_thread *t) @@ -529,7 +838,9 @@ int dctl_set_special_wakeup(struct cpu_thread *t) lock(&c->dctl_lock); if (c->special_wakeup_count == 0) { - if (proc_gen == proc_gen_p9) + if (proc_gen == proc_gen_p10) + rc = p10_core_set_special_wakeup(c); + else if (proc_gen == proc_gen_p9) rc = p9_core_set_special_wakeup(c); else /* (proc_gen == proc_gen_p8) */ rc = p8_core_set_special_wakeup(c); @@ -553,7 +864,9 @@ int dctl_clear_special_wakeup(struct cpu_thread *t) if (!c->special_wakeup_count) goto out; if (c->special_wakeup_count == 1) { - if (proc_gen == proc_gen_p9) + if (proc_gen == proc_gen_p10) + rc = p10_core_clear_special_wakeup(c); + else if (proc_gen == proc_gen_p9) rc = p9_core_clear_special_wakeup(c); else /* (proc_gen == proc_gen_p8) */ rc = p8_core_clear_special_wakeup(c); @@ -569,24 +882,13 @@ out: int dctl_core_is_gated(struct cpu_thread *t) { struct cpu_thread *c = t->primary; - uint32_t chip_id = pir_to_chip_id(c->pir); - uint32_t core_id = pir_to_core_id(c->pir); - uint32_t sshhyp_addr; - uint64_t val; - if (proc_gen != proc_gen_p9) + if (proc_gen == proc_gen_p10) + return p10_core_is_gated(c); + else if (proc_gen == proc_gen_p9) + return p9_core_is_gated(c); + else return OPAL_UNSUPPORTED; - - sshhyp_addr = XSCOM_ADDR_P9_EC_SLAVE(core_id, P9_EC_PPM_SSHHYP); - - if (xscom_read(chip_id, sshhyp_addr, &val)) { - prlog(PR_ERR, "Could not query core gated on %u:%u:" - " Unable to read PPM_SSHHYP.\n", - chip_id, core_id); - return OPAL_HARDWARE; - } - - return !!(val & P9_CORE_GATED); } static int dctl_stop(struct cpu_thread *t) @@ -599,7 +901,9 @@ static int dctl_stop(struct cpu_thread *t) unlock(&c->dctl_lock); return OPAL_BUSY; } - if (proc_gen == proc_gen_p9) + if (proc_gen == proc_gen_p10) + rc = 
p10_stop_thread(t); + else if (proc_gen == proc_gen_p9) rc = p9_stop_thread(t); else /* (proc_gen == proc_gen_p8) */ rc = p8_stop_thread(t); @@ -615,7 +919,7 @@ static int dctl_cont(struct cpu_thread *t) struct cpu_thread *c = t->primary; int rc; - if (proc_gen != proc_gen_p9) + if (proc_gen != proc_gen_p10 && proc_gen != proc_gen_p9) return OPAL_UNSUPPORTED; lock(&c->dctl_lock); @@ -623,7 +927,10 @@ static int dctl_cont(struct cpu_thread *t) unlock(&c->dctl_lock); return OPAL_BUSY; } - rc = p9_cont_thread(t); + if (proc_gen == proc_gen_p10) + rc = p10_cont_thread(t); + else /* (proc_gen == proc_gen_p9) */ + rc = p9_cont_thread(t); if (!rc) t->dctl_stopped = false; unlock(&c->dctl_lock); @@ -647,7 +954,9 @@ static int dctl_sreset(struct cpu_thread *t) unlock(&c->dctl_lock); return OPAL_BUSY; } - if (proc_gen == proc_gen_p9) + if (proc_gen == proc_gen_p10) + rc = p10_sreset_thread(t); + else if (proc_gen == proc_gen_p9) rc = p9_sreset_thread(t); else /* (proc_gen == proc_gen_p8) */ rc = p8_sreset_thread(t); @@ -752,7 +1061,7 @@ int sreset_all_others(void) * Then sreset the target thread, which resumes execution on that thread. * Then de-assert special wakeup on the core. */ -static int64_t p9_sreset_cpu(struct cpu_thread *cpu) +static int64_t do_sreset_cpu(struct cpu_thread *cpu) { int rc; @@ -792,7 +1101,7 @@ int64_t opal_signal_system_reset(int cpu_nr) struct cpu_thread *cpu; int64_t ret; - if (proc_gen != proc_gen_p9) + if (proc_gen != proc_gen_p9 && proc_gen != proc_gen_p10) return OPAL_UNSUPPORTED; /* @@ -811,7 +1120,7 @@ int64_t opal_signal_system_reset(int cpu_nr) } lock(&sreset_lock); - ret = p9_sreset_cpu(cpu); + ret = do_sreset_cpu(cpu); unlock(&sreset_lock); return ret; @@ -822,7 +1131,7 @@ void direct_controls_init(void) if (chip_quirk(QUIRK_MAMBO_CALLOUTS)) return; - if (proc_gen != proc_gen_p9) + if (proc_gen != proc_gen_p9 && proc_gen != proc_gen_p10) return; opal_register(OPAL_SIGNAL_SYSTEM_RESET, opal_signal_system_reset, 1); diff --git a/core/hmi.c b/core/hmi.c index 120fe4b57..35b609047 100644 --- a/core/hmi.c +++ b/core/hmi.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include @@ -27,7 +28,7 @@ #include /* - * HMER register layout: + * P9 HMER register layout: * +===+==========+============================+========+===================+ * |Bit|Name |Description |PowerKVM|Action | * | | | |HMI | | @@ -147,6 +148,78 @@ * NOTE: Per Dave Larson, never enable 8,9,21-23 */ +/* + * P10 HMER register layout: + * Bit Name Description + * 0 malfunction_alert A processor core in the system has checkstopped + * (failed recovery). This is broadcasted to every + * processor in the system + * + * 1 reserved reserved + * + * 2 proc_rcvy_done Processor recovery occurred error-bit in fir not + * masked (see bit 11) + * + * 3 reserved reserved + * + * 4 tfac_error Timer facility experienced an error. TB, DEC, + * HDEC, PURR or SPURR may be corrupted (details in + * TFMR) + * + * 5 tfx_error Error occurred on transfer from tfac shadow to + * core + * + * 6 spurr_scale_limit Nominal frequency exceeded 399 percent + * + * 7 reserved reserved + * + * 8 xscom_fail An XSCOM operation caused by a cache inhibited + * load/store from this thread failed. A trap + * register is available. + * + * 9 xscom_done An XSCOM operation caused by a cache inhibited + * load/store from this thread completed. If + * hypervisor intends to use this bit, it is + * responsible for clearing it before performing the + * xscom operation. 
NOTE: this bit should always be + * masked in HMEER + * + * 10 reserved reserved + * + * 11 proc_rcvy_again Processor recovery occurred again before bit 2 + * was cleared + * + * 12-15 reserved reserved + * + * 16 scom_fir_hmi An error inject to PC FIR has occurred to set HMI. + * This error inject can also set FIR(61) to cause + * recovery. + * + * 17 reserved reserved + * + * 18 trig_fir_hmi Debug trigger has occurred to set HMI. This + * trigger can also set FIR(60) to cause recovery + * + * 19-20 reserved reserved + * + * 21-23 xscom_status If bit 8 is active, the reason will be detailed in + * these bits. These bits are information only and + * always masked (mask = '0') If hypervisor intends + * to use this field, it is responsible for clearing + * it before performing the xscom operation. + * + * 24:63 Not implemented Not implemented. + * + * P10 HMEER enabled bits: + * Name Action + * malfunction_alert Decode and log FIR bits. + * proc_rcvy_done Log and continue. + * tfac_error Log and attempt to recover time facilities. + * tfx_error Log and attempt to recover time facilities. + * spurr_scale_limit Log and continue. XXX? + * proc_rcvy_again Log and continue. + */ + /* Used for tracking cpu threads inside hmi handling. */ #define HMI_STATE_CLEANUP_DONE 0x100 #define CORE_THREAD_MASK 0x0ff @@ -174,13 +247,17 @@ (SPR_TFMR_TBST_CORRUPT | SPR_TFMR_TB_MISSING_SYNC | \ SPR_TFMR_TB_MISSING_STEP | SPR_TFMR_FW_CONTROL_ERR | \ SPR_TFMR_TFMR_CORRUPT | SPR_TFMR_TB_RESIDUE_ERR | \ - SPR_TFMR_HDEC_PARITY_ERROR) + SPR_TFMR_HDEC_PARITY_ERROR | SPR_TFMR_TFAC_XFER_ERROR) /* TFMR "thread" errors */ #define SPR_TFMR_THREAD_ERRORS \ (SPR_TFMR_PURR_PARITY_ERR | SPR_TFMR_SPURR_PARITY_ERR | \ SPR_TFMR_DEC_PARITY_ERR) +/* + * Starting from p9, core inits are setup to escalate all core + * local checkstop to system checkstop. Review this list when that changes.
+ */ static const struct core_xstop_bit_info { uint8_t bit; /* CORE FIR bit number */ enum OpalHMI_CoreXstopReason reason; @@ -203,10 +280,12 @@ static const struct core_xstop_bit_info { { 63, CORE_CHECKSTOP_PC_SPRD_HYP_ERR_INJ }, }; -static const struct core_recoverable_bit_info { +struct core_fir_bit_info { uint8_t bit; /* CORE FIR bit number */ const char *reason; -} recoverable_bits[] = { +}; + +static const struct core_fir_bit_info p9_recoverable_bits[] = { { 0, "IFU - SRAM (ICACHE parity, etc)" }, { 2, "IFU - RegFile" }, { 4, "IFU - Logic" }, @@ -226,6 +305,58 @@ static const struct core_recoverable_bit_info { { 43, "PC - Thread hang recovery" }, }; +static const struct core_fir_bit_info p10_core_fir_bits[] = { + { 0, "IFU - SRAM recoverable error (ICACHE parity error, etc.)" }, + { 1, "PC - TC checkstop" }, + { 2, "IFU - RegFile recoverable error" }, + { 3, "IFU - RegFile core checkstop" }, + { 4, "IFU - Logic recoverable error" }, + { 5, "IFU - Logic core checkstop" }, + { 7, "VSU - Inference accumulator recoverable error" }, + { 8, "PC - Recovery core checkstop" }, + { 9, "VSU - Slice Target File (STF) recoverable error" }, + { 11, "ISU - Logic recoverable error" }, + { 12, "ISU - Logic core checkstop" }, + { 14, "ISU - Machine check received while ME=0 checkstop" }, + { 15, "ISU - UE from L2" }, + { 16, "ISU - Number of UEs from L2 above threshold" }, + { 17, "ISU - UE on CI load" }, + { 18, "MMU - TLB recoverable error" }, + { 19, "MMU - SLB error" }, + { 21, "MMU - CXT recoverable error" }, + { 22, "MMU - Logic core checkstop" }, + { 23, "MMU - MMU system checkstop" }, + { 24, "VSU - Logic recoverable error" }, + { 25, "VSU - Logic core checkstop" }, + { 26, "PC - In maint mode and recovery in progress" }, + { 28, "PC - PC system checkstop" }, + { 29, "LSU - SRAM recoverable error (DCACHE parity error, etc.)" }, + { 30, "LSU - Set deleted" }, + { 31, "LSU - RegFile recoverable error" }, + { 32, "LSU - RegFile core checkstop" }, + { 33, "MMU - TLB multi hit error occurred" }, + { 34, "MMU - SLB multi hit error occurred" }, + { 35, "LSU - ERAT multi hit error occurred" }, + { 36, "PC - Forward progress error" }, + { 37, "LSU - Logic recoverable error" }, + { 38, "LSU - Logic core checkstop" }, + { 41, "LSU - System checkstop" }, + { 43, "PC - Thread hang recoverable error" }, + { 45, "PC - Logic core checkstop" }, + { 47, "PC - TimeBase facility checkstop" }, + { 52, "PC - Hang recovery failed core checkstop" }, + { 53, "PC - Core internal hang detected" }, + { 55, "PC - Nest hang detected" }, + { 56, "PC - Other core chiplet recoverable error" }, + { 57, "PC - Other core chiplet core checkstop" }, + { 58, "PC - Other core chiplet system checkstop" }, + { 59, "PC - SCOM satellite error detected" }, + { 60, "PC - Debug trigger error inject" }, + { 61, "PC - SCOM or firmware recoverable error inject" }, + { 62, "PC - Firmware checkstop error inject" }, + { 63, "PC - Firmware SPRC / SPRD checkstop" }, +}; + static const struct nx_xstop_bit_info { uint8_t bit; /* NX FIR bit number */ enum OpalHMI_NestAccelXstopReason reason; @@ -270,6 +401,12 @@ static int setup_scom_addresses(void) nx_dma_engine_fir = P9_NX_DMA_ENGINE_FIR; nx_pbi_fir = P9_NX_PBI_FIR; return 1; + case proc_gen_p10: + malf_alert_scom = P10_MALFUNC_ALERT; + nx_status_reg = P10_NX_STATUS_REG; + nx_dma_engine_fir = P10_NX_DMA_ENGINE_FIR; + nx_pbi_fir = P10_NX_PBI_FIR; + return 1; default: prerror("%s: Unknown CPU type\n", __func__); break; @@ -320,6 +457,10 @@ static int read_core_fir(uint32_t chip_id, uint32_t core_id, 
uint64_t *core_fir) rc = xscom_read(chip_id, XSCOM_ADDR_P9_EC(core_id, P9_CORE_FIR), core_fir); break; + case proc_gen_p10: + rc = xscom_read(chip_id, + XSCOM_ADDR_P10_EC(core_id, P10_CORE_FIR), core_fir); + break; default: rc = OPAL_HARDWARE; } @@ -335,6 +476,10 @@ static int read_core_wof(uint32_t chip_id, uint32_t core_id, uint64_t *core_wof) rc = xscom_read(chip_id, XSCOM_ADDR_P9_EC(core_id, P9_CORE_WOF), core_wof); break; + case proc_gen_p10: + rc = xscom_read(chip_id, + XSCOM_ADDR_P10_EC(core_id, P10_CORE_WOF), core_wof); + break; default: rc = OPAL_HARDWARE; } @@ -394,6 +539,13 @@ static bool decode_core_fir(struct cpu_thread *cpu, loc ? loc : "Not Available", cpu->chip_id, core_id, core_fir); + if (proc_gen == proc_gen_p10) { + for (i = 0; i < ARRAY_SIZE(p10_core_fir_bits); i++) { + if (core_fir & PPC_BIT(p10_core_fir_bits[i].bit)) + prlog(PR_INFO, " %s\n", p10_core_fir_bits[i].reason); + } + } + /* Check CORE FIR bits and populate HMI event with error info. */ for (i = 0; i < ARRAY_SIZE(xstop_bits); i++) { if (core_fir & PPC_BIT(xstop_bits[i].bit)) { @@ -910,6 +1062,7 @@ static void hmi_print_debug(const uint8_t *msg, uint64_t hmer) if (!loc) loc = "Not Available"; + /* Also covers P10 SPR_HMER_TFAC_SHADOW_XFER_ERROR */ if (hmer & (SPR_HMER_TFAC_ERROR | SPR_HMER_TFMR_PARITY_ERROR)) { prlog(PR_DEBUG, "[Loc: %s]: P:%d C:%d T:%d: TFMR(%016lx) %s\n", loc, this_cpu()->chip_id, core_id, thread_index, @@ -1231,10 +1384,16 @@ static int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt, int i; prlog(PR_DEBUG, "Core WOF = 0x%016llx recovered error:\n", core_wof); - for (i = 0; i < ARRAY_SIZE(recoverable_bits); i++) { - if (core_wof & PPC_BIT(recoverable_bits[i].bit)) - prlog(PR_DEBUG, "%s\n", - recoverable_bits[i].reason); + if (proc_gen <= proc_gen_p9) { + for (i = 0; i < ARRAY_SIZE(p9_recoverable_bits); i++) { + if (core_wof & PPC_BIT(p9_recoverable_bits[i].bit)) + prlog(PR_DEBUG, " %s\n", p9_recoverable_bits[i].reason); + } + } else if (proc_gen == proc_gen_p10) { + for (i = 0; i < ARRAY_SIZE(p10_core_fir_bits); i++) { + if (core_wof & PPC_BIT(p10_core_fir_bits[i].bit)) + prlog(PR_DEBUG, " %s\n", p10_core_fir_bits[i].reason); + } } } @@ -1245,7 +1404,8 @@ static int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt, queue_hmi_event(hmi_evt, recover, out_flags); } } - if (hmer & SPR_HMER_PROC_RECV_ERROR_MASKED) { + + if ((proc_gen <= proc_gen_p9) && (hmer & SPR_HMER_PROC_RECV_ERROR_MASKED)) { handled |= SPR_HMER_PROC_RECV_ERROR_MASKED; if (cpu_is_thread0(cpu) && hmi_evt) { hmi_evt->severity = OpalHMI_SEV_NO_ERROR; @@ -1254,6 +1414,7 @@ static int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt, } hmi_print_debug("Processor recovery Done (masked).", hmer); } + if (hmer & SPR_HMER_PROC_RECV_AGAIN) { handled |= SPR_HMER_PROC_RECV_AGAIN; if (cpu_is_thread0(cpu) && hmi_evt) { @@ -1264,17 +1425,30 @@ static int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt, hmi_print_debug("Processor recovery occurred again before" "bit2 was cleared\n", hmer); } + + /* XXX: what to do with this? */ + if (hmer & SPR_HMER_SPURR_SCALE_LIMIT) { + handled |= SPR_HMER_SPURR_SCALE_LIMIT; + if (cpu_is_thread0(cpu) && hmi_evt) { + hmi_evt->severity = OpalHMI_SEV_NO_ERROR; + hmi_evt->type = OpalHMI_ERROR_PROC_RECOV_DONE; + queue_hmi_event(hmi_evt, recover, out_flags); + } + hmi_print_debug("Turbo versus nominal frequency exceeded limit.", hmer); + } + /* Assert if we see malfunction alert, we can not continue. 
*/ if (hmer & SPR_HMER_MALFUNCTION_ALERT) { handled |= SPR_HMER_MALFUNCTION_ALERT; hmi_print_debug("Malfunction Alert", hmer); + recover = 0; if (hmi_evt) decode_malfunction(hmi_evt, out_flags); } /* Assert if we see Hypervisor resource error, we can not continue. */ - if (hmer & SPR_HMER_HYP_RESOURCE_ERR) { + if ((proc_gen <= proc_gen_p9) && (hmer & SPR_HMER_HYP_RESOURCE_ERR)) { handled |= SPR_HMER_HYP_RESOURCE_ERR; hmi_print_debug("Hypervisor resource error", hmer); @@ -1285,7 +1459,21 @@ static int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt, queue_hmi_event(hmi_evt, recover, out_flags); } } - if (hmer & SPR_HMER_TRIG_FIR_HMI) { + + /* XXX: what to do with this? */ + if ((proc_gen <= proc_gen_p9) && (hmer & SPR_HMER_THD_WAKE_BLOCKED_TM_SUSPEND)) { + handled |= SPR_HMER_THD_WAKE_BLOCKED_TM_SUSPEND; + hmer &= ~SPR_HMER_THD_WAKE_BLOCKED_TM_SUSPEND; + + hmi_print_debug("Attempted to wake thread when threads in TM suspend mode.", hmer); + if (hmi_evt) { + hmi_evt->severity = OpalHMI_SEV_NO_ERROR; + hmi_evt->type = OpalHMI_ERROR_PROC_RECOV_DONE, + queue_hmi_event(hmi_evt, recover, out_flags); + } + } + + if ((proc_gen <= proc_gen_p9) && (hmer & SPR_HMER_TRIG_FIR_HMI)) { handled |= SPR_HMER_TRIG_FIR_HMI; hmer &= ~SPR_HMER_TRIG_FIR_HMI; @@ -1296,6 +1484,17 @@ static int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt, queue_hmi_event(hmi_evt, recover, out_flags); } } + if ((proc_gen == proc_gen_p10) && (hmer & SPR_HMER_P10_TRIG_FIR_HMI)) { + handled |= SPR_HMER_P10_TRIG_FIR_HMI; + hmer &= ~SPR_HMER_P10_TRIG_FIR_HMI; + + hmi_print_debug("Clearing unknown debug trigger", hmer); + if (hmi_evt) { + hmi_evt->severity = OpalHMI_SEV_NO_ERROR; + hmi_evt->type = OpalHMI_ERROR_DEBUG_TRIG_FIR, + queue_hmi_event(hmi_evt, recover, out_flags); + } + } if (recover == 0) disable_fast_reboot("Unrecoverable HMI"); diff --git a/core/init.c b/core/init.c index 09749f475..65f136daa 100644 --- a/core/init.c +++ b/core/init.c @@ -1167,7 +1167,7 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) /* Initialize the rest of the cpu thread structs */ init_all_cpus(); - if (proc_gen == proc_gen_p9) + if (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) cpu_set_ipi_enable(true); /* Add the /opal node to the device-tree */ diff --git a/core/mce.c b/core/mce.c index 3f5091628..47674abcb 100644 --- a/core/mce.c +++ b/core/mce.c @@ -65,6 +65,42 @@ static const struct mce_ierror_table mce_p9_ierror_table[] = { "instruction fetch page table access to foreign address", }, { 0 } }; +static const struct mce_ierror_table mce_p10_ierror_table[] = { +{ 0x00000000081c0000, 0x0000000000040000, + MCE_INSNFETCH | MCE_MEMORY_ERROR | MCE_INVOLVED_EA, + "instruction fetch memory uncorrectable error", }, +{ 0x00000000081c0000, 0x0000000000080000, + MCE_INSNFETCH | MCE_SLB_ERROR | MCE_INVOLVED_EA, + "instruction fetch SLB parity error", }, +{ 0x00000000081c0000, 0x00000000000c0000, + MCE_INSNFETCH | MCE_SLB_ERROR | MCE_INVOLVED_EA, + "instruction fetch SLB multi-hit error", }, +{ 0x00000000081c0000, 0x0000000000100000, + MCE_INSNFETCH | MCE_INVOLVED_EA | MCE_ERAT_ERROR, + "instruction fetch ERAT multi-hit error", }, +{ 0x00000000081c0000, 0x0000000000140000, + MCE_INSNFETCH | MCE_INVOLVED_EA | MCE_TLB_ERROR, + "instruction fetch TLB multi-hit error", }, +{ 0x00000000081c0000, 0x0000000000180000, + MCE_INSNFETCH | MCE_MEMORY_ERROR | MCE_TABLE_WALK | MCE_INVOLVED_EA, + "instruction fetch page table access memory uncorrectable error", }, +{ 0x00000000081c0000, 0x00000000001c0000, + MCE_INSNFETCH 
| MCE_INVOLVED_EA, + "instruction fetch to control real address", }, +{ 0x00000000081c0000, 0x00000000080c0000, + MCE_INSNFETCH | MCE_INVOLVED_EA, + "instruction fetch real address error", }, +{ 0x00000000081c0000, 0x0000000008100000, + MCE_INSNFETCH | MCE_TABLE_WALK | MCE_INVOLVED_EA, + "instruction fetch page table access real address error", }, +{ 0x00000000081c0000, 0x0000000008140000, + MCE_LOADSTORE | MCE_IMPRECISE, + "store real address asynchronous error", }, +{ 0x00000000081c0000, 0x00000000081c0000, + MCE_INSNFETCH | MCE_TABLE_WALK | MCE_INVOLVED_EA, + "instruction fetch page table access to control real address", }, +{ 0 } }; + struct mce_derror_table { unsigned long dsisr_value; uint64_t type; @@ -113,6 +149,42 @@ static const struct mce_derror_table mce_p9_derror_table[] = { "load/store to foreign address", }, { 0 } }; +static const struct mce_derror_table mce_p10_derror_table[] = { +{ 0x00008000, + MCE_LOADSTORE | MCE_MEMORY_ERROR, + "load/store memory uncorrectable error", }, +{ 0x00004000, + MCE_LOADSTORE | MCE_MEMORY_ERROR | MCE_TABLE_WALK | MCE_INVOLVED_EA, + "load/store page table access memory uncorrectable error", }, +{ 0x00000800, + MCE_LOADSTORE | MCE_INVOLVED_EA | MCE_ERAT_ERROR, + "load/store ERAT multi-hit error", }, +{ 0x00000400, + MCE_LOADSTORE | MCE_INVOLVED_EA | MCE_TLB_ERROR, + "load/store TLB multi-hit error", }, +{ 0x00000200, + MCE_TLBIE_ERROR, + "TLBIE or TLBIEL instruction programming error", }, +{ 0x00000100, + MCE_LOADSTORE | MCE_INVOLVED_EA | MCE_SLB_ERROR, + "load/store SLB parity error", }, +{ 0x00000080, + MCE_LOADSTORE | MCE_INVOLVED_EA | MCE_SLB_ERROR, + "load/store SLB multi-hit error", }, +{ 0x00000040, + MCE_LOADSTORE | MCE_INVOLVED_EA, + "load real address error", }, +{ 0x00000020, + MCE_LOADSTORE | MCE_TABLE_WALK, + "load/store page table access real address error", }, +{ 0x00000010, + MCE_LOADSTORE | MCE_TABLE_WALK, + "load/store page table access to control real address", }, +{ 0x00000008, + MCE_LOADSTORE, + "load/store to control real address", }, +{ 0 } }; + static void decode_ierror(const struct mce_ierror_table table[], uint64_t srr1, uint64_t *type, @@ -145,20 +217,11 @@ static void decode_derror(const struct mce_derror_table table[], } } -void decode_mce(uint64_t srr0, uint64_t srr1, +static void decode_mce_p9(uint64_t srr0, uint64_t srr1, uint32_t dsisr, uint64_t dar, uint64_t *type, const char **error_str, uint64_t *address) { - *type = MCE_UNKNOWN; - *error_str = "unknown error"; - *address = 0; - - if (proc_gen != proc_gen_p9) { - *error_str = "unknown error (processor not supported)"; - return; - } - /* * On POWER9 DD2.1 and below, it's possible to get a machine check * caused by a paste instruction where only DSISR bit 25 is set. This @@ -198,3 +261,49 @@ void decode_mce(uint64_t srr0, uint64_t srr1, *address = srr0; } } + +static void decode_mce_p10(uint64_t srr0, uint64_t srr1, + uint32_t dsisr, uint64_t dar, + uint64_t *type, const char **error_str, + uint64_t *address) +{ + /* + * Async machine check due to bad real address from store or foreign + * link time out comes with the load/store bit (PPC bit 42) set in + * SRR1, but the cause comes in SRR1 not DSISR. Clear bit 42 so we're + * directed to the ierror table so it will find the cause (which + * describes it correctly as a store error). 
+ */ + if (SRR1_MC_LOADSTORE(srr1) && + (srr1 & 0x081c0000) == 0x08140000) { + srr1 &= ~PPC_BIT(42); + } + + if (SRR1_MC_LOADSTORE(srr1)) { + decode_derror(mce_p10_derror_table, dsisr, type, error_str); + if (*type & MCE_INVOLVED_EA) + *address = dar; + } else { + decode_ierror(mce_p10_ierror_table, srr1, type, error_str); + if (*type & MCE_INVOLVED_EA) + *address = srr0; + } +} + +void decode_mce(uint64_t srr0, uint64_t srr1, + uint32_t dsisr, uint64_t dar, + uint64_t *type, const char **error_str, + uint64_t *address) +{ + *type = MCE_UNKNOWN; + *error_str = "unknown error"; + *address = 0; + + if (proc_gen == proc_gen_p9) { + decode_mce_p9(srr0, srr1, dsisr, dar, type, error_str, address); + } else if (proc_gen == proc_gen_p10) { + decode_mce_p10(srr0, srr1, dsisr, dar, type, error_str, address); + } else { + *error_str = "unknown error (processor not supported)"; + } +} diff --git a/core/test/run-timer.c b/core/test/run-timer.c index fef5648d7..8f8b20ed3 100644 --- a/core/test/run-timer.c +++ b/core/test/run-timer.c @@ -16,7 +16,7 @@ #define smt_lowest() #define smt_medium() -enum proc_gen proc_gen = proc_gen_p9; +enum proc_gen proc_gen = proc_gen_unknown; static uint64_t stamp, last; struct lock; diff --git a/doc/platforms-and-cpus.rst b/doc/platforms-and-cpus.rst index 658e00ed0..2f5e9436f 100644 --- a/doc/platforms-and-cpus.rst +++ b/doc/platforms-and-cpus.rst @@ -17,6 +17,7 @@ Power9N 0x004e1xxx Nimbus 24 small core Power9C 0x004e2xxx Cumulus 12 small core Power9C 0x004e3xxx Cumulus 24 small core Power9P 0x004fxxxx Axone +Power10 0x0080xxxx =============== =============== ===================== Platforms diff --git a/hw/chiptod.c b/hw/chiptod.c index f445fd49a..4e62fd714 100644 --- a/hw/chiptod.c +++ b/hw/chiptod.c @@ -959,6 +959,30 @@ bool chiptod_wakeup_resync(void) return false; } +/* + * Fixup for p10 TOD bug workaround. + * + * The TOD may fail to start if all clocks in the system are derived from + * the same reference oscillator. + * + * Avoiding this is pretty easy: Whenever we clear/reset the TOD registers, + * make sure to init bits 26:31 of TOD_SLAVE_PATH_CTRL (0x40005) to 0b111111 + * instead of 0b000000. The value 0 in TOD_S_PATH_CTRL_REG(26:31) must be + * avoided, and if it does get written it must be followed up by writing a + * value of all ones to clean up the resulting bad state before the (nonzero) + * final value can be written. 
+ */ +static void fixup_tod_reg_value(struct chiptod_tod_regs *treg_entry) +{ + int32_t chip_id = this_cpu()->chip_id; + + if (proc_gen != proc_gen_p10) + return; + + if (treg_entry->xscom_addr == TOD_SLAVE_PATH_CTRL) + treg_entry->val[chip_id].data |= PPC_BITMASK(26,31); +} + static int __chiptod_recover_tod_errors(void) { uint64_t terr; @@ -997,8 +1021,12 @@ static int __chiptod_recover_tod_errors(void) return 0; } + fixup_tod_reg_value(&chiptod_tod_regs[i]); + prlog(PR_DEBUG, "Parity error, Restoring TOD register: " - "%08llx\n", chiptod_tod_regs[i].xscom_addr); + "%08llx = %016llx\n", + chiptod_tod_regs[i].xscom_addr, + chiptod_tod_regs[i].val[chip_id].data); if (xscom_writeme(chiptod_tod_regs[i].xscom_addr, chiptod_tod_regs[i].val[chip_id].data)) { prerror("XSCOM error writing 0x%08llx reg.\n", diff --git a/hw/dts.c b/hw/dts.c index b72516ab2..d8831e4d3 100644 --- a/hw/dts.c +++ b/hw/dts.c @@ -171,7 +171,11 @@ static void dts_async_read_temp(struct timer *t __unused, void *data, swkup_rc = dctl_set_special_wakeup(cpu); - rc = dts_read_core_temp_p9(cpu->pir, &dts); + if (proc_gen == proc_gen_p9) + rc = dts_read_core_temp_p9(cpu->pir, &dts); + else /* (proc_gen == proc_gen_p10) */ + rc = OPAL_UNSUPPORTED; /* XXX P10 */ + if (!rc) { if (cpu->sensor_attr == SENSOR_DTS_ATTR_TEMP_MAX) *cpu->sensor_data = cpu_to_be64(dts.temp); @@ -219,6 +223,7 @@ static int dts_read_core_temp(u32 pir, struct dts *dts, u8 attr, rc = OPAL_ASYNC_COMPLETION; unlock(&cpu->dts_lock); break; + case proc_gen_p10: /* XXX P10 */ default: rc = OPAL_UNSUPPORTED; } diff --git a/hw/lpc.c b/hw/lpc.c index c2a07a0db..bf3ab1fae 100644 --- a/hw/lpc.c +++ b/hw/lpc.c @@ -915,7 +915,8 @@ void lpc_finalize_interrupts(void) if (chip->lpc && chip->psi && (chip->type == PROC_CHIP_P9_NIMBUS || chip->type == PROC_CHIP_P9_CUMULUS || - chip->type == PROC_CHIP_P9P)) + chip->type == PROC_CHIP_P9P || + chip->type == PROC_CHIP_P10)) lpc_create_int_map(chip->lpc, chip->psi->node); } } @@ -959,6 +960,7 @@ static void lpc_init_interrupts_one(struct proc_chip *chip) case PROC_CHIP_P9_NIMBUS: case PROC_CHIP_P9_CUMULUS: case PROC_CHIP_P9P: + case PROC_CHIP_P10: /* On P9, we additionally setup the routing. 
*/ lpc->has_serirq = true; for (i = 0; i < LPC_NUM_SERIRQ; i++) { @@ -1377,7 +1379,8 @@ void lpc_register_client(uint32_t chip_id, has_routes = chip->type == PROC_CHIP_P9_NIMBUS || chip->type == PROC_CHIP_P9_CUMULUS || - chip->type == PROC_CHIP_P9P; + chip->type == PROC_CHIP_P9P || + chip->type == PROC_CHIP_P10; if (policy != IRQ_ATTR_TARGET_OPAL && !has_routes) { prerror("Chip doesn't support OS interrupt policy\n"); diff --git a/hw/xscom.c b/hw/xscom.c index c97740a62..347457242 100644 --- a/hw/xscom.c +++ b/hw/xscom.c @@ -94,7 +94,11 @@ static void xscom_reset(uint32_t gcid, bool need_delay) mtspr(SPR_HMER, HMER_CLR_MASK); /* Setup local and target scom addresses */ - if (proc_gen == proc_gen_p9) { + if (proc_gen == proc_gen_p10) { + recv_status_reg = 0x00090018; + log_reg = 0x0090012; + err_reg = 0x0090013; + } else if (proc_gen == proc_gen_p9) { recv_status_reg = 0x00090018; log_reg = 0x0090012; err_reg = 0x0090013; @@ -497,7 +501,7 @@ static int xscom_indirect_read(uint32_t gcid, uint64_t pcb_addr, uint64_t *val) { uint64_t form = xscom_indirect_form(pcb_addr); - if ((proc_gen == proc_gen_p9) && (form == 1)) + if ((proc_gen >= proc_gen_p9) && (form == 1)) return OPAL_UNSUPPORTED; return xscom_indirect_read_form0(gcid, pcb_addr, val); @@ -565,7 +569,7 @@ static int xscom_indirect_write(uint32_t gcid, uint64_t pcb_addr, uint64_t val) { uint64_t form = xscom_indirect_form(pcb_addr); - if ((proc_gen == proc_gen_p9) && (form == 1)) + if ((proc_gen >= proc_gen_p9) && (form == 1)) return xscom_indirect_write_form1(gcid, pcb_addr, val); return xscom_indirect_write_form0(gcid, pcb_addr, val); @@ -576,7 +580,7 @@ static uint32_t xscom_decode_chiplet(uint32_t partid, uint64_t *pcb_addr) uint32_t gcid = (partid & 0x0fffffff) >> 4; uint32_t core = partid & 0xf; - if (proc_gen == proc_gen_p9) { + if (proc_gen >= proc_gen_p9) { /* XXX Not supported */ *pcb_addr = 0; } else { @@ -821,7 +825,9 @@ int64_t xscom_read_cfam_chipid(uint32_t partid, uint32_t *chip_id) * something up */ if (chip_quirk(QUIRK_NO_F000F)) { - if (proc_gen == proc_gen_p9) + if (proc_gen == proc_gen_p10) + val = 0x120DA04980000000UL; /* P10 DD1.0 */ + else if (proc_gen == proc_gen_p9) val = 0x203D104980000000UL; /* P9 Nimbus DD2.3 */ else val = 0x221EF04980000000UL; /* P8 Murano DD2.1 */ @@ -873,6 +879,10 @@ static void xscom_init_chip_info(struct proc_chip *chip) chip->type = PROC_CHIP_P9P; assert(proc_gen == proc_gen_p9); break; + case 0xda: + chip->type = PROC_CHIP_P10; + assert(proc_gen == proc_gen_p10); + break; default: printf("CHIP: Unknown chip type 0x%02x !!!\n", (unsigned char)(val & 0xff)); @@ -911,7 +921,7 @@ static void xscom_init_chip_info(struct proc_chip *chip) prlog(PR_INFO,"P9 DD%i.%i%d detected\n", 0xf & (chip->ec_level >> 4), chip->ec_level & 0xf, rev); chip->ec_rev = rev; - } + } /* XXX P10 */ } /* @@ -949,7 +959,8 @@ void xscom_init(void) struct proc_chip *chip; const char *chip_name; static const char *chip_names[] = { - "UNKNOWN", "P8E", "P8", "P8NVL", "P9N", "P9C", "P9P" + "UNKNOWN", "P8E", "P8", "P8NVL", "P9N", "P9C", "P9P", + "P10", }; chip = get_chip(gcid); diff --git a/include/chip.h b/include/chip.h index 4deb96182..8bc48ba29 100644 --- a/include/chip.h +++ b/include/chip.h @@ -100,10 +100,58 @@ #define P9_PIRFUSED2NORMALTHREADID(pir) (((pir) >> 1) & 0x3) +#define P10_PIR2FUSEDCOREID(pir) P9_PIR2FUSEDCOREID(pir) +#define P10_PIRFUSED2NORMALCOREID(pir) P9_PIRFUSED2NORMALCOREID(pir) +#define P10_PIRFUSED2NORMALTHREADID(pir) P9_PIRFUSED2NORMALTHREADID(pir) + /* P9 specific ones mostly used by XIVE */ 
#define P9_PIR2LOCALCPU(pir) ((pir) & 0xff) #define P9_PIRFROMLOCALCPU(chip, cpu) (((chip) << 8) | (cpu)) +/* + * P10 PIR + * ------- + * + * PIR layout: + * + * | 49| 50| 51| 52| 53| 54| 55| 56| 57| 58| 59| 60| 61| 62| 63| + * |Spare ID |Topology ID |Sp. |Quad ID |Core ID |Thread ID| + * + * Bit 56 is a spare quad ID. In big-core mode, thread ID extends to bit 61. + * + * P10 GCID + * -------- + * + * - Global chip ID is also called Topology ID. + * - Node ID is called Group ID (? XXX P10). + * + * Global chip ID is a 4 bit number. + * + * There is a topology mode bit that can be 0 or 1, which changes GCID mapping. + * + * Topology mode 0: + * NodeID ChipID + * | | | + * |____|____|____|____| + * + * Topology mode 1: + * NodeID ChipID + * | | | + * |____|____|____|____| + */ +#define P10_PIR2GCID(pir) (((pir) >> 8) & 0xf) + +#define P10_PIR2COREID(pir) (((pir) >> 2) & 0x3f) + +#define P10_PIR2THREADID(pir) ((pir) & 0x3) + +// XXX P10 These depend on the topology mode, how to get that (system type?) +#define P10_GCID2NODEID(gcid, mode) ((mode) == 0 ? ((gcid) >> 1) & 0x7 : ((gcid) >> 2) & 0x3) +#define P10_GCID2CHIPID(gcid, mode) ((mode) == 0 ? (gcid) & 0x1 : (gcid) & 0x3) + +/* P10 specific ones mostly used by XIVE */ +#define P10_PIR2LOCALCPU(pir) ((pir) & 0xff) +#define P10_PIRFROMLOCALCPU(chip, cpu) (((chip) << 8) | (cpu)) struct dt_node; struct centaur_chip; @@ -123,6 +171,7 @@ enum proc_chip_type { PROC_CHIP_P9_NIMBUS, PROC_CHIP_P9_CUMULUS, PROC_CHIP_P9P, + PROC_CHIP_P10, }; /* Simulator quirks */ diff --git a/include/opal-api.h b/include/opal-api.h index e90cab1e9..9cba35c7d 100644 --- a/include/opal-api.h +++ b/include/opal-api.h @@ -731,6 +731,7 @@ enum OpalHMI_CoreXstopReason { CORE_CHECKSTOP_PC_AMBI_HANG_DETECTED = 0x00004000, CORE_CHECKSTOP_PC_DEBUG_TRIG_ERR_INJ = 0x00008000, CORE_CHECKSTOP_PC_SPRD_HYP_ERR_INJ = 0x00010000, + CORE_CHECKSTOP_MMU_SYSTEM = 0x00020000, }; enum OpalHMI_NestAccelXstopReason { diff --git a/include/processor.h b/include/processor.h index 70e749f1a..973d7e77b 100644 --- a/include/processor.h +++ b/include/processor.h @@ -27,6 +27,7 @@ #define MSR_LE PPC_BIT(63) /* Little Endian */ /* PIR */ +#define SPR_PIR_P10_MASK 0x7fff /* Mask of implemented bits */ #define SPR_PIR_P9_MASK 0x7fff /* Mask of implemented bits */ #define SPR_PIR_P8_MASK 0x1fff /* Mask of implemented bits */ @@ -114,6 +115,7 @@ #define SPR_TFMR_MOVE_CHIP_TOD_TO_TB PPC_BIT(18) #define SPR_TFMR_CLEAR_TB_ERRORS PPC_BIT(24) /* Bits in TFMR - thread indep. 
status bits */ +#define SPR_TFMR_TFAC_XFER_ERROR PPC_BIT(25) #define SPR_TFMR_HDEC_PARITY_ERROR PPC_BIT(26) #define SPR_TFMR_TBST_CORRUPT PPC_BIT(27) #define SPR_TFMR_TBST_ENCODED PPC_BITMASK(28,31) @@ -140,17 +142,21 @@ /* Bits in HMER/HMEER */ #define SPR_HMER_MALFUNCTION_ALERT PPC_BIT(0) #define SPR_HMER_PROC_RECV_DONE PPC_BIT(2) -#define SPR_HMER_PROC_RECV_ERROR_MASKED PPC_BIT(3) +#define SPR_HMER_PROC_RECV_ERROR_MASKED PPC_BIT(3) /* Not P10 */ #define SPR_HMER_TFAC_ERROR PPC_BIT(4) -#define SPR_HMER_TFMR_PARITY_ERROR PPC_BIT(5) +#define SPR_HMER_TFMR_PARITY_ERROR PPC_BIT(5) /* P9 */ +#define SPR_HMER_TFAC_SHADOW_XFER_ERROR PPC_BIT(5) /* P10 */ +#define SPR_HMER_SPURR_SCALE_LIMIT PPC_BIT(6) /* P10 */ #define SPR_HMER_XSCOM_FAIL PPC_BIT(8) #define SPR_HMER_XSCOM_DONE PPC_BIT(9) #define SPR_HMER_PROC_RECV_AGAIN PPC_BIT(11) -#define SPR_HMER_WARN_RISE PPC_BIT(14) -#define SPR_HMER_WARN_FALL PPC_BIT(15) +#define SPR_HMER_WARN_RISE PPC_BIT(14) /* Not P10 */ +#define SPR_HMER_WARN_FALL PPC_BIT(15) /* Not P10 */ #define SPR_HMER_SCOM_FIR_HMI PPC_BIT(16) -#define SPR_HMER_TRIG_FIR_HMI PPC_BIT(17) -#define SPR_HMER_HYP_RESOURCE_ERR PPC_BIT(20) +#define SPR_HMER_TRIG_FIR_HMI PPC_BIT(17) /* Not P10 */ +#define SPR_HMER_THD_WAKE_BLOCKED_TM_SUSPEND PPC_BIT(17) /* Not P10 */ +#define SPR_HMER_P10_TRIG_FIR_HMI PPC_BIT(18) +#define SPR_HMER_HYP_RESOURCE_ERR PPC_BIT(20) /* Not P10 */ #define SPR_HMER_XSCOM_STATUS PPC_BITMASK(21,23) /* @@ -165,14 +171,23 @@ SPR_HMER_TFMR_PARITY_ERROR |\ SPR_HMER_PROC_RECV_AGAIN) +#define SPR_HMEER_P10_HMI_ENABLE_MASK (SPR_HMER_MALFUNCTION_ALERT |\ + SPR_HMER_PROC_RECV_DONE |\ + SPR_HMER_TFAC_ERROR |\ + SPR_HMER_TFAC_SHADOW_XFER_ERROR |\ + SPR_HMER_SPURR_SCALE_LIMIT |\ + SPR_HMER_PROC_RECV_AGAIN) + /* Bits in HID0 */ #define SPR_HID0_POWER8_4LPARMODE PPC_BIT(2) #define SPR_HID0_POWER8_2LPARMODE PPC_BIT(6) #define SPR_HID0_POWER8_DYNLPARDIS PPC_BIT(15) #define SPR_HID0_POWER8_HILE PPC_BIT(19) #define SPR_HID0_POWER9_HILE PPC_BIT(4) +#define SPR_HID0_POWER10_HILE PPC_BIT(4) #define SPR_HID0_POWER8_ENABLE_ATTN PPC_BIT(31) #define SPR_HID0_POWER9_ENABLE_ATTN (PPC_BIT(2) | PPC_BIT(3)) +#define SPR_HID0_POWER10_ENABLE_ATTN (PPC_BIT(2) | PPC_BIT(3)) #define SPR_HID0_POWER9_RADIX PPC_BIT(8) /* PVR bits */ @@ -192,6 +207,7 @@ #define PVR_TYPE_P8NVL 0x004c /* Naples */ #define PVR_TYPE_P9 0x004e #define PVR_TYPE_P9P 0x004f /* Axone */ +#define PVR_TYPE_P10 0x0080 #ifdef __ASSEMBLY__ @@ -236,16 +252,22 @@ static inline bool is_power9n(uint32_t version) static inline bool is_fused_core(uint32_t version) { - if (PVR_TYPE(version) != PVR_TYPE_P9) - return false; - - switch(PVR_CHIP_TYPE(version)) { - case 0: - case 2: - return true; - default: + if (PVR_TYPE(version) == PVR_TYPE_P9) { + switch(PVR_CHIP_TYPE(version)) { + case 0: + case 2: + return true; + default: + return false; + } + + } else if(PVR_TYPE(version) == PVR_TYPE_P10) { + if(PVR_CHIP_TYPE(version) & 0x01) return false; - } + else + return true; + } else + return false; } static inline bool is_power9c(uint32_t version) diff --git a/include/skiboot.h b/include/skiboot.h index d33c02506..f3378ec28 100644 --- a/include/skiboot.h +++ b/include/skiboot.h @@ -97,6 +97,7 @@ enum proc_gen { proc_gen_unknown, proc_gen_p8, proc_gen_p9, + proc_gen_p10, }; extern enum proc_gen proc_gen; diff --git a/include/xscom-p10-regs.h b/include/xscom-p10-regs.h new file mode 100644 index 000000000..8096b2f91 --- /dev/null +++ b/include/xscom-p10-regs.h @@ -0,0 +1,54 @@ +#ifndef __XSCOM_P10_REGS_H__ +#define __XSCOM_P10_REGS_H__ + +/* Core FIR 
(Fault Isolation Register) */ +#define P10_CORE_FIR 0x440 + +/* Core WOF (Whose On First) */ +#define P10_CORE_WOF 0x448 + +#define P10_MALFUNC_ALERT 0x00090022 + +#define P10_NX_STATUS_REG 0x02011040 /* NX status register */ +#define P10_NX_DMA_ENGINE_FIR 0x02011100 /* DMA & Engine FIR Data Register */ +#define P10_NX_PBI_FIR 0x02011080 /* PowerBus Interface FIR Register */ + +#define P10_EC_CORE_THREAD_STATE 0x412 /* XXX P10 is this right? */ +#define P10_THREAD_STOPPED(t) PPC_BIT(56 + (t)) + +#define P10_EC_THREAD_INFO 0x413 +#define P10_THREAD_ACTIVE(t) PPC_BIT(t) + +#define P10_EC_RAS_STATUS 0x454 +#define P10_THREAD_MAINT(t) PPC_BIT(0 + 8*(t)) +#define P10_THREAD_QUIESCED(t) PPC_BIT(1 + 8*(t)) +#define P10_THREAD_ICT_EMPTY(t) PPC_BIT(2 + 8*(t)) + +#define P10_EC_DIRECT_CONTROLS 0x449 +#define P10_THREAD_STOP(t) PPC_BIT(7 + 8*(t)) +#define P10_THREAD_START(t) PPC_BIT(6 + 8*(t)) +#define P10_THREAD_SRESET(t) PPC_BIT(4 + 8*(t)) +#define P10_THREAD_CLEAR_MAINT(t) PPC_BIT(3 + 8*(t)) +#define P10_THREAD_PWR(t) PPC_BIT(32 + 8*(t)) + +#define P10_QME_FIR 0x000 + +#define P10_QME_SPWU_HYP 0x83c +#define P10_SPWU_REQ PPC_BIT(0) +#define P10_SPWU_DONE PPC_BIT(4) + +#define P10_QME_SSH_HYP 0x82c +#define P10_SSH_CORE_GATED PPC_BIT(0) +#define P10_SSH_SPWU_DONE PPC_BIT(1) + +#define P10_NCU_STATUS_REG 0x64f +#define P10_NCU_SPEC_BAR 0x650 +#define P10_NCU_SPEC_BAR_ENABLE PPC_BIT(0) +#define P10_NCU_SPEC_BAR_256K PPC_BIT(1) +#define P10_NCU_SPEC_BAR_ADDRMSK 0x000fffffffffc000ull /* 16k aligned */ + +#define P10_NCU_DARN_BAR 0x651 +#define P10_NCU_DARN_BAR_EN PPC_BIT(0) +#define P10_NCU_DARN_BAR_ADDRMSK 0x000ffffffffff000ull /* 4k aligned */ + +#endif /* __XSCOM_P10_REGS_H__ */ diff --git a/include/xscom.h b/include/xscom.h index db6d3fcd6..a6bb7e400 100644 --- a/include/xscom.h +++ b/include/xscom.h @@ -137,6 +137,91 @@ #define XSCOM_ADDR_P9_EC_SLAVE(core, addr) \ XSCOM_ADDR_P9_EC(core, (addr) | 0xf0000) +/* + * Additional useful definitions for P10 + */ + +/* + * POWER10 pervasive structure + * Chip has 8 EQ chiplets (aka super-chiplets), and other nest chiplets. + * Each EQ contains 4 EX regions. + * Each EX contains an ECL2, L3, MMA. + * Each ECL2 contains an EC (core), L2, and NCU. + * + * Each EQ has a Quad Management Engine (QME), responsible for power management + * for the cores, among other things. + * + * POWER10 XSCOM address format: + * + * | 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|16-31| + * MC=0 |WR|MC|SLAVE ADDR |PIB MASTER |PORT NUMBER|LOCAL| + * MC=1 |WR|MC|MC TYPE |MC GROUP|PIB MASTER |PORT NUMBER|LOCAL| + * + * * Port is also known as PSCOM endpoint. + * + * WR is set by the xscom access functions (XSCOM_DATA_IND_READ bit) + * MC is always 0 (skiboot does not use multicast scoms). + * + * For unicast: + * EQ0-7 is addressed from 0x20 to 0x27 in the top 8 bits. + * L3 is on port 1 + * NCU is on port 1 + * ECL2 (core+L2) is on port 2 (XXX P10 scoms html doc suggests port 1?) + * QME is on port E. 
+ *
+ * EQ chiplets (aka super chiplet) local address format:
+ *
+ * | 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|
+ * |C0|C1|C2|C3|RING ID    |SAT ID     |REGISTER ID |
+ *
+ * EX0-4 are selected with one-hot encoding (C0-3)
+ *
+ * QME per-core register access address format:
+ * | 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|
+ * |C0|C1|C2|C3| 1| 0| 0| 0|PER-CORE REGISTER ID   |
+ *
+ * NCU - ring 6 (port 1)
+ * L3 - ring 3 (port 1) (XXX P10 scoms html doc suggests ring 6)
+ * L2 - ring 0 (port 2) (XXX P10 scoms html doc suggests ring 4)
+ * EC (PC unit) - rings 2-5 (port 2)
+ *
+ * Other chiplets:
+ *
+ * | 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|
+ * | 1|RING ID    |SAT ID     |REGISTER ID |
+ */
+
+#define P10_CORE_EQ_CHIPLET(core)	(0x20 + ((core) >> 2))
+#define P10_CORE_PROC(core)		((core) & 0x3)
+
+#define XSCOM_P10_EQ(chiplet)		((chiplet) << 24)
+
+#define XSCOM_P10_QME(chiplet) \
+	(XSCOM_P10_EQ(chiplet) | (0xE << 16))
+
+#define XSCOM_P10_QME_CORE(chiplet, proc) \
+	(XSCOM_P10_QME(chiplet) | ((1 << (3 - proc)) << 12))
+
+#define XSCOM_P10_EC(chiplet, proc) \
+	(XSCOM_P10_EQ(chiplet) | (0x2 << 16) | ((1 << (3 - proc)) << 12))
+
+#define XSCOM_P10_NCU(chiplet, proc) \
+	(XSCOM_P10_EQ(chiplet) | (0x1 << 16) | ((1 << (3 - proc)) << 12))
+
+#define XSCOM_ADDR_P10_EQ(core, addr) \
+	(XSCOM_P10_EQ(P10_CORE_EQ_CHIPLET(core)) | (addr))
+
+#define XSCOM_ADDR_P10_QME(core, addr) \
+	(XSCOM_P10_QME(P10_CORE_EQ_CHIPLET(core)) | (addr))
+
+#define XSCOM_ADDR_P10_QME_CORE(core, addr) \
+	(XSCOM_P10_QME_CORE(P10_CORE_EQ_CHIPLET(core), P10_CORE_PROC(core)) | (addr))
+
+#define XSCOM_ADDR_P10_EC(core, addr) \
+	(XSCOM_P10_EC(P10_CORE_EQ_CHIPLET(core), P10_CORE_PROC(core)) | (addr))
+
+#define XSCOM_ADDR_P10_NCU(core, addr) \
+	(XSCOM_P10_NCU(P10_CORE_EQ_CHIPLET(core), P10_CORE_PROC(core)) | (addr))

 /* Definitions relating to indirect XSCOMs shared with centaur */
 #define XSCOM_ADDR_IND_FLAG		PPC_BIT(0)
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:46 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:46 +0530
Subject: [Skiboot] [PATCH v2 08/59] plat/qemu/p10: add a POWER10 platform
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-9-hegdevasant@linux.vnet.ibm.com>

From: Cédric Le Goater

BMC is still defined as ast2500 but it should change to ast2600
when available.
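As a rough illustration (not part of this patch): the machine added here
is matched from the device tree the same way the existing QEMU platforms
are. The probe name below is hypothetical; only the "qemu,powernv10"
compatible string comes from this series, and dt_root plus
dt_node_is_compatible() are skiboot's existing device-tree helpers.

	/* Hypothetical probe sketch, not the actual platform code */
	static bool qemu_powernv10_probe(void)
	{
		return dt_node_is_compatible(dt_root, "qemu,powernv10");
	}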
Signed-off-by: Cédric Le Goater
Signed-off-by: Vasant Hegde
---
 core/chip.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/core/chip.c b/core/chip.c
index f79e8cd04..a4ba3249e 100644
--- a/core/chip.c
+++ b/core/chip.c
@@ -173,6 +173,7 @@ void init_chips(void)
 	if (dt_node_is_compatible(dt_root, "qemu,powernv") ||
 	    dt_node_is_compatible(dt_root, "qemu,powernv8") ||
 	    dt_node_is_compatible(dt_root, "qemu,powernv9") ||
+	    dt_node_is_compatible(dt_root, "qemu,powernv10") ||
 	    dt_find_by_path(dt_root, "/qemu")) {
 		proc_chip_quirks |= QUIRK_QEMU | QUIRK_NO_CHIPTOD |
 			QUIRK_NO_DIRECT_CTL | QUIRK_NO_RNG;
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:48 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:48 +0530
Subject: [Skiboot] [PATCH v2 10/59] external/gard: Enable Power10
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-11-hegdevasant@linux.vnet.ibm.com>

From: Klaus Heinrich Kiwi

Add Power10 support for opal-gard utility.

Signed-off-by: Klaus Heinrich Kiwi
[Folded test case fix and updated commit message - Vasant]
Signed-off-by: Vasant Hegde
---
 external/gard/gard.c                    | 15 ++++-
 external/gard/gard.h                    |  1 +
 external/gard/test/results/02-usage.err |  1 +
 external/gard/units.c                   | 89 +++++++++++++++++++++++++
 4 files changed, 103 insertions(+), 3 deletions(-)

diff --git a/external/gard/gard.c b/external/gard/gard.c
index b012cf9b7..53a26d0e9 100644
--- a/external/gard/gard.c
+++ b/external/gard/gard.c
@@ -157,8 +157,12 @@ static void guess_chip_gen(void)
 		set_chip_gen(p9_chip_units);
 		return;
 
+	case 0x0080: /* power10 */
+		set_chip_gen(p10_chip_units);
+		return;
+
 	default:
-		fprintf(stderr, "Unsupported processor (pvr %#x)! Set the processor generation manually with -8 or -9\n", pvr);
+		fprintf(stderr, "Unsupported processor (pvr %#x)! Set the processor generation manually with -8, -9 or -0\n", pvr);
 		exit(1);
 	}
 }
@@ -773,7 +777,8 @@ static void usage(const char *progname)
 	fprintf(stderr, "Usage: %s [-a -e -f -p] []\n\n", progname);
 	fprintf(stderr, "-8 --p8\n");
-	fprintf(stderr, "-9 --p9\n\tSet the processor generation\n\n");
+	fprintf(stderr, "-9 --p9\n");
+	fprintf(stderr, "-0 --p10\n\tSet the processor generation\n\n");
 	fprintf(stderr, "-e --ecc\n\tForce reading/writing with ECC bytes.\n\n");
 	fprintf(stderr, "-f --file \n\tDon't search for MTD device,"
 			" read from .\n\n");
@@ -802,9 +807,10 @@ static struct option global_options[] = {
 	{ "ecc", no_argument, 0, 'e' },
 	{ "p8", no_argument, 0, '8' },
 	{ "p9", no_argument, 0, '9' },
+	{ "p10", no_argument, 0, '0' },
 	{ 0 },
 };
-static const char *global_optstring = "+ef:p89";
+static const char *global_optstring = "+ef:p890";
 
 int main(int argc, char **argv)
 {
@@ -853,6 +859,9 @@ int main(int argc, char **argv)
 		case '9':
 			set_chip_gen(p9_chip_units);
 			break;
+		case '0':
+			set_chip_gen(p10_chip_units);
+			break;
 		case '?':
 			usage(progname);
 			rc = EXIT_FAILURE;
diff --git a/external/gard/gard.h b/external/gard/gard.h
index 329772a74..d59c2a0de 100644
--- a/external/gard/gard.h
+++ b/external/gard/gard.h
@@ -71,3 +71,4 @@ struct chip_unit_desc {
 extern const struct chip_unit_desc *chip_units;
 extern const struct chip_unit_desc p8_chip_units[];
 extern const struct chip_unit_desc p9_chip_units[];
+extern const struct chip_unit_desc p10_chip_units[];
diff --git a/external/gard/test/results/02-usage.err b/external/gard/test/results/02-usage.err
index 0e0782628..453fcf52f 100644
--- a/external/gard/test/results/02-usage.err
+++ b/external/gard/test/results/02-usage.err
@@ -2,6 +2,7 @@ Usage: ./opal-gard [-a -e -f -p] []
 
 -8 --p8
 -9 --p9
+-0 --p10
 	Set the processor generation
 
 -e --ecc
diff --git a/external/gard/units.c b/external/gard/units.c
index 35d46e443..f3b435a3a 100644
--- a/external/gard/units.c
+++ b/external/gard/units.c
@@ -151,3 +151,92 @@ const struct chip_unit_desc p9_chip_units[] = {
 	{0x4F, "LAST_IN_RANGE"},
 };
 
+const struct chip_unit_desc p10_chip_units[] = {
+	{0x00, "NA"},
+	{0x01, "Sys"},
+	{0x02, "Node"},
+	{0x03, "DIMM"},
+	{0x04, "Membuf"},
+	{0x05, "Proc"},
+	{0x06, "EX"},
+	{0x07, "Core"},
+	{0x08, "L2"},
+	{0x09, "L3"},
+	{0x0A, "L4"},
+	{0x0B, "MCS"},
+	/* a hole! */
+	{0x0D, "MBA"},
+	{0x0E, "XBUS"},
+	{0x0F, "ABUS"},
+	{0x10, "PCI"},
+	{0x11, "DPSS"},
+	{0x12, "APSS"},
+	{0x13, "OCC"},
+	{0x14, "PSI"},
+	{0x15, "FSP"},
+	{0x16, "PNOR"},
+	{0x17, "OSC"},
+	{0x18, "TODCLK"},
+	{0x19, "CONTROL_NODE"},
+	{0x1A, "OSCREFCLK"},
+	{0x1B, "OSCPCICLK"},
+	{0x1C, "REFCLKENDPT"},
+	{0x1D, "PCICLKENDPT"},
+	{0x1E, "NX"},
+	{0x1F, "PORE"},
+	{0x20, "PCIESWITCH"},
+	{0x21, "CAPP"},
+	{0x22, "FSI"},
+	{0x23, "EQ"},
+	{0x24, "MCA"},
+	{0x25, "MCBIST"},
+	{0x26, "MI"},
+	{0x27, "DMI"},
+	{0x28, "OBUS"},
+	{0x2A, "SBE"},
+	{0x2B, "PPE"},
+	{0x2C, "PERV"},
+	{0x2D, "PEC"},
+	{0x2E, "PHB"},
+	{0x2F, "SYSREFCLKENDPT"},
+	{0x30, "MFREFCLKENDPT"},
+	{0x31, "TPM"},
+	{0x32, "SP"},
+	{0x33, "UART"},
+	{0x34, "PS"},
+	{0x35, "FAN"},
+	{0x36, "VRM"},
+	{0x37, "USB"},
+	{0x38, "ETH"},
+	{0x39, "PANEL"},
+	{0x3A, "BMC"},
+	{0x3B, "FLASH"},
+	{0x3C, "SEEPROM"},
+	{0x3D, "TMP"},
+	{0x3E, "GPIO_EXPANDER"},
+	{0x3F, "POWER_SEQUENCER"},
+	{0x40, "RTC"},
+	{0x41, "FANCTLR"},
+	{0x42, "OBUS_BRICK"},
+	{0x43, "NPU"},
+	{0x44, "MC"},
+	{0x45, "TEST_FAIL"},
+	{0x46, "MFREFCLK"},
+	{0x47, "SMPGROUP"},
+	{0x48, "OMI"},
+	{0x49, "MCC"},
+	{0x4A, "OMIC"},
+	{0x4B, "OCMB_CHIP"},
+	{0x4C, "MEM_PORT"},
+	{0x4D, "I2C_MUX"},
+	{0x4E, "PMIC"},
+	{0x4F, "NMMU"},
+	{0x50, "PAU"},
+	{0x51, "IOHS"},
+	{0x52, "PAUC"},
+	{0x53, "FC"},
+	{0x54, "LPCREFCLKENDPT"},
+	{0x55, "GENERIC_I2C_DEVICE"},
+	{0x56, "LAST_IN_RANGE"},
+};
+
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:47 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:47 +0530
Subject: [Skiboot] [PATCH v2 09/59] psi/p10: Activate P10 interrupts
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-10-hegdevasant@linux.vnet.ibm.com>

From: Cédric Le Goater

Behave as P9 for now until we know more on P10. Interface should be
the same, apart from the size of the ESB pages.
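For context: with XIVE, every interrupt source owns one ESB management
page of 2^esb_shift bytes. A minimal sketch of the address arithmetic
implied by the esb_shift variable used in the patch below; the helper
name is illustrative, not from the patch:

	/* Locate the ESB management page of one source. esb_shift is
	 * 12 (4K pages) for now and would become 16 once the TODO
	 * about 64K ESB pages is resolved. */
	static void *psi_esb_page(void *esb_mmio, uint32_t src,
				  uint32_t esb_shift)
	{
		return (char *)esb_mmio + ((uint64_t)src << esb_shift);
	}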
Signed-off-by: Cédric Le Goater
[Fixed spurious interrupt issue - Vasant]
Signed-off-by: Vasant Hegde
---
 hw/psi.c      | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 include/psi.h |  7 ++++++
 2 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/hw/psi.c b/hw/psi.c
index 545a81643..f95a066d3 100644
--- a/hw/psi.c
+++ b/hw/psi.c
@@ -266,7 +266,10 @@ static void psi_spurious_fsp_irq(struct psi *psi)
 
 	prlog(PR_NOTICE, "PSI: Spurious interrupt, attempting clear\n");
 
-	if (proc_gen == proc_gen_p9) {
+	if (proc_gen == proc_gen_p10) {
+		reg = PSIHB_XSCOM_P10_HBCSR_CLR;
+		bit = PSIHB_XSCOM_P10_HBSCR_FSP_IRQ;
+	} else if (proc_gen == proc_gen_p9) {
 		reg = PSIHB_XSCOM_P9_HBCSR_CLR;
 		bit = PSIHB_XSCOM_P9_HBSCR_FSP_IRQ;
 	} else if (proc_gen == proc_gen_p8) {
@@ -737,6 +740,61 @@ static void psi_init_p9_interrupts(struct psi *psi)
 	out_be64(psi->regs + PSIHB_INTERRUPT_CONTROL, 0);
 }
 
+/*
+ * P9 and P10 have the same PSIHB interface
+ */
+static const struct irq_source_ops psi_p10_irq_ops = {
+	.interrupt = psihb_p9_interrupt,
+	.attributes = psi_p9_irq_attributes,
+	.name = psi_p9_irq_name,
+};
+
+static void psi_init_p10_interrupts(struct psi *psi)
+{
+	struct proc_chip *chip;
+	u64 val;
+	/* TODO (clg) : fix ESB page size to 64k when ready */
+	uint32_t esb_shift = 12;
+
+	/* Grab chip */
+	chip = get_chip(psi->chip_id);
+	if (!chip)
+		return;
+
+	/* Configure the CI BAR */
+	phys_map_get(chip->id, PSIHB_ESB, 0, &val, NULL);
+	val |= PSIHB_ESB_CI_VALID;
+	out_be64(psi->regs + PSIHB_ESB_CI_BASE, val);
+
+	val = in_be64(psi->regs + PSIHB_ESB_CI_BASE);
+	psi->esb_mmio = (void *)(val & ~PSIHB_ESB_CI_VALID);
+	prlog(PR_DEBUG, "PSI[0x%03x]: ESB MMIO at @%p\n",
+	      psi->chip_id, psi->esb_mmio);
+
+	/* Grab and configure the notification port */
+	val = xive_get_notify_port(psi->chip_id, XIVE_HW_SRC_PSI);
+	val |= PSIHB_ESB_NOTIF_VALID;
+	out_be64(psi->regs + PSIHB_ESB_NOTIF_ADDR, val);
+
+	/* Setup interrupt offset */
+	val = xive_get_notify_base(psi->interrupt);
+	val <<= 32;
+	out_be64(psi->regs + PSIHB_IVT_OFFSET, val);
+
+	/* Register sources */
+	prlog(PR_DEBUG,
+	      "PSI[0x%03x]: Interrupts sources registered for P10 DD%i.%i\n",
+	      psi->chip_id, 0xf & (chip->ec_level >> 4), chip->ec_level & 0xf);
+
+	xive_register_hw_source(psi->interrupt, P9_PSI_NUM_IRQS,
+				esb_shift, psi->esb_mmio, XIVE_SRC_LSI,
+				psi, &psi_p10_irq_ops);
+
+	/* Reset irq handling and switch to ESB mode */
+	out_be64(psi->regs + PSIHB_INTERRUPT_CONTROL, PSIHB_IRQ_RESET);
+	out_be64(psi->regs + PSIHB_INTERRUPT_CONTROL, 0);
+}
+
 static void psi_init_interrupts(struct psi *psi)
 {
 	/* Configure the interrupt BUID and mask it */
@@ -747,6 +805,9 @@ static void psi_init_interrupts(struct psi *psi)
 	case proc_gen_p9:
 		psi_init_p9_interrupts(psi);
 		break;
+	case proc_gen_p10:
+		psi_init_p10_interrupts(psi);
+		break;
 	default:
 		/* Unknown: just no interrupts */
 		prerror("PSI: Unknown interrupt type\n");
@@ -826,6 +887,7 @@ static void psi_create_mm_dtnode(struct psi *psi)
 					"ibm,power8-psi");
 		break;
 	case proc_gen_p9:
+	case proc_gen_p10:
 		dt_add_property_strings(np, "compatible", "ibm,psi",
 					"ibm,power9-psi");
 		psi_create_p9_int_map(psi, np);
diff --git a/include/psi.h b/include/psi.h
index f7b5927ca..a7104ef0b 100644
--- a/include/psi.h
+++ b/include/psi.h
@@ -116,6 +116,13 @@
 #define PSIHB_XSCOM_P9_HBCSR_CLR	0x13
 #define PSIHB_XSCOM_P9_HBSCR_FSP_IRQ	PPC_BIT(17)
 
+#define PSIHB_XSCOM_P10_BASE		0xa
+#define PSIHB_XSCOM_P10_HBBAR_EN	PPC_BIT(63)
+#define PSIHB_XSCOM_P10_HBCSR		0xe
+#define PSIHB_XSCOM_P10_HBCSR_SET	0x12
+#define PSIHB_XSCOM_P10_HBCSR_CLR	0x13
+#define PSIHB_XSCOM_P10_HBSCR_FSP_IRQ	PPC_BIT(17)
+
 /* P9 PSI Interrupts */
 #define P9_PSI_IRQ_PSI			0
 #define P9_PSI_IRQ_OCC			1
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:49 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:49 +0530
Subject: [Skiboot] [PATCH v2 11/59] external/xscom-utils: Add P10 chip info
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-12-hegdevasant@linux.vnet.ibm.com>

Signed-off-by: Vasant Hegde
---
 external/xscom-utils/adu_scoms.py | 2 ++
 external/xscom-utils/getscom.c    | 3 +++
 external/xscom-utils/sram.c       | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/external/xscom-utils/adu_scoms.py b/external/xscom-utils/adu_scoms.py
index d651b7e9f..e90634190 100755
--- a/external/xscom-utils/adu_scoms.py
+++ b/external/xscom-utils/adu_scoms.py
@@ -176,6 +176,8 @@ class GetSCom(object):
 			name = "P9 (Cumulus) processor"
 		elif id == 0xd9:
 			name = "P9P (Axone) processor"
+		elif id == 0xda:
+			name = "P10 processor"
 		elif id == 0xe9:
 			name = "Centaur memory buffer"
 		else:
diff --git a/external/xscom-utils/getscom.c b/external/xscom-utils/getscom.c
index c18a04972..67596e618 100644
--- a/external/xscom-utils/getscom.c
+++ b/external/xscom-utils/getscom.c
@@ -56,6 +56,9 @@ static void print_chip_info(uint32_t chip_id)
 	case 0xd9:
 		name = "P9P (Axone) processor";
 		break;
+	case 0xda:
+		name = "P10 processor";
+		break;
 	case 0xe9:
 		name = "Centaur memory buffer";
 		break;
diff --git a/external/xscom-utils/sram.c b/external/xscom-utils/sram.c
index 87df70e10..efe08d8e7 100644
--- a/external/xscom-utils/sram.c
+++ b/external/xscom-utils/sram.c
@@ -28,6 +28,7 @@
 #define PVR_TYPE_P8NVL	0x004c /* Naples */
 #define PVR_TYPE_P9	0x004e
 #define PVR_TYPE_P9P	0x004f /* Axone */
+#define PVR_TYPE_P10	0x0080
 
 #ifdef __powerpc__
 static uint64_t get_xscom_base(void)
@@ -39,6 +40,7 @@ static uint64_t get_xscom_base(void)
 	switch (pvr >> 16) {
 	case PVR_TYPE_P9:
 	case PVR_TYPE_P9P:
+	case PVR_TYPE_P10: /* P10 OCB_PIB OCC Control Register is same for P9 and P10 */
 		return OCB_PIB_BASE_P9;
 	case PVR_TYPE_P8E:
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:50 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:50 +0530
Subject: [Skiboot] [PATCH v2 12/59] external/opal-prd: Fix occ, homer node
 label search
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-13-hegdevasant@linux.vnet.ibm.com>

Starting with P10, hostboot/HDAT provide a consistent reserved node
name: they provide just the node name, without the leading "ibm,"
string. That would cause the `pm-complex <*>` operation to fail.

This patch fixes the above issue. For backward compatibility I have
kept support for the old variant of the node name as well.
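A minimal sketch of the fallback pattern this patch open-codes,
assuming the find_range() helper and struct prd_range that opal-prd.c
already has; the wrapper name is hypothetical:

	static struct prd_range *find_range_any_name(const char *name,
						     uint32_t instance)
	{
		struct prd_range *range;
		char legacy[64];

		/* Older firmware reserves "ibm,"-prefixed names... */
		snprintf(legacy, sizeof(legacy), "ibm,%s", name);
		range = find_range(legacy, instance);
		if (range)
			return range;

		/* ...P10 firmware reserves the plain name. */
		return find_range(name, instance);
	}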
Signed-off-by: Vasant Hegde --- external/opal-prd/opal-prd.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/external/opal-prd/opal-prd.c b/external/opal-prd/opal-prd.c index 12269e8eb..1c610da4c 100644 --- a/external/opal-prd/opal-prd.c +++ b/external/opal-prd/opal-prd.c @@ -1508,17 +1508,23 @@ static int pm_complex_load_start(void) range = find_range("ibm,occ-common-area", 0); if (!range) { - pr_log(LOG_ERR, "PM: ibm,occ-common-area not found"); - return rc; + range = find_range("occ-common-area", 0); + if (!range) { + pr_log(LOG_ERR, "PM: occ-common-area not found"); + return rc; + } } occ_common = range->physaddr; for (i = 0; i < nr_chips; i++) { range = find_range("ibm,homer-image", chips[i]); if (!range) { - pr_log(LOG_ERR, "PM: ibm,homer-image not found 0x%lx", - chips[i]); - return -1; + range = find_range("homer-image", chips[i]); + if (!range) { + pr_log(LOG_ERR, "PM: homer-image not found 0x%lx", + chips[i]); + return -1; + } } homer = range->physaddr; -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:51 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:51 +0530 Subject: [Skiboot] [PATCH v2 13/59] occ: Add POWER10 support In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-14-hegdevasant@linux.vnet.ibm.com> From: Vaidyanathan Srinivasan Add support for parsing OCC on Power10 to populate the pstate information. Also enables OCC on P10 Denali system. Co-authored-by: Pratik R. Sampat Co-authored-by: Vaidyanathan Srinivasan Signed-off-by: Pratik R. Sampat Signed-off-by: Vaidyanathan Srinivasan Signed-off-by: Vasant Hegde --- hw/fsp/fsp-occ.c | 3 +- hw/occ-sensor.c | 4 +- hw/occ.c | 172 ++++++++++++++++++++++++++++++++++++++++++++--- 3 files changed, 165 insertions(+), 14 deletions(-) diff --git a/hw/fsp/fsp-occ.c b/hw/fsp/fsp-occ.c index 3081f89a9..58926f408 100644 --- a/hw/fsp/fsp-occ.c +++ b/hw/fsp/fsp-occ.c @@ -167,7 +167,7 @@ static void occ_do_load(u8 scope, u32 dbob_id __unused, u32 seq_id) if (err) return; - if (proc_gen == proc_gen_p9) { + if (proc_gen >= proc_gen_p9) { if (in_ipl) { /* OCC is pre-loaded in P9, so send SUCCESS to FSP */ rsp = fsp_mkmsg(FSP_CMD_LOAD_OCC_STAT, 2, 0, seq_id); @@ -316,6 +316,7 @@ static void occ_do_reset(u8 scope, u32 dbob_id, u32 seq_id) rc = host_services_occ_stop(); break; case proc_gen_p9: + case proc_gen_p10: last_seq_id = seq_id; chip = next_chip(NULL); prd_fsp_occ_reset(chip->id); diff --git a/hw/occ-sensor.c b/hw/occ-sensor.c index 8605c405e..6efaf908b 100644 --- a/hw/occ-sensor.c +++ b/hw/occ-sensor.c @@ -500,8 +500,8 @@ bool occ_sensors_init(void) int occ_num = 0, i; bool has_gpu = false; - /* OCC inband sensors is only supported in P9 */ - if (proc_gen != proc_gen_p9) + /* OCC inband sensors is only supported in P9/10 */ + if (proc_gen < proc_gen_p9) return false; /* Sensors are copied to BAR2 OCC Common Area */ diff --git a/hw/occ.c b/hw/occ.c index b09b76dc4..8d7bcbec9 100644 --- a/hw/occ.c +++ b/hw/occ.c @@ -36,12 +36,14 @@ #define MAX_PSTATES 256 #define MAX_P8_CORES 12 #define MAX_P9_CORES 24 +#define MAX_P10_CORES 32 #define MAX_OPAL_CMD_DATA_LENGTH 4090 #define MAX_OCC_RSP_DATA_LENGTH 8698 #define P8_PIR_CORE_MASK 0xFFF8 #define P9_PIR_QUAD_MASK 0xFFF0 +#define P10_PIR_CHIP_MASK 0x0000 #define FREQ_MAX_IN_DOMAIN 0 #define FREQ_MOST_RECENTLY_SET 1 @@ -120,6 +122,28 @@ struct occ_pstate_table { u8 
core_max[MAX_P9_CORES];
 		u8 pad[56];
 	} v9;
+	struct __packed { /* Version 0xA0 */
+		u8 occ_role;
+		u8 pstate_min;
+		u8 pstate_fixed_freq;
+		u8 pstate_base;
+		u8 pstate_ultra_turbo;
+		u8 pstate_fmax;
+		u8 minor;
+		u8 pstate_bottom_throttle;
+		u8 spare;
+		u8 spare1;
+		u32 reserved_32;
+		u64 reserved_64;
+		struct __packed {
+			u8 id;
+			u8 valid;
+			u16 reserved;
+			__be32 freq_khz;
+		} pstates[MAX_PSTATES];
+		u8 core_max[MAX_P10_CORES];
+		u8 pad[48];
+	} v10;
 	};
 } __packed;
 
@@ -237,7 +261,12 @@ struct occ_dynamic_data {
 	u8 major_version;
 	u8 minor_version;
 	u8 gpus_present;
-	u8 spare1;
+	struct __packed { /* Version 0x90 */
+		u8 spare1;
+	} v9;
+	struct __packed { /* Version 0xA0 */
+		u8 wof_enabled;
+	} v10;
 	u8 cpu_throttle;
 	u8 mem_throttle;
 	u8 quick_pwr_drop;
@@ -370,7 +399,7 @@ static bool wait_for_all_occ_init(void)
 	 * Tuletta), OCC is not loaded before OPAL boot. Hence
 	 * initialization can take a while.
 	 *
-	 * Note: Checking for occ_data->version == (0x01/0x02/0x90)
+	 * Note: Checking for occ_data->version == (0x01/0x02/0x90/0xA0)
 	 * is ok because we clear all of
 	 * homer_base+size before passing memory to host
 	 * services. This ensures occ_data->version == 0x0
@@ -381,7 +410,7 @@ static bool wait_for_all_occ_init(void)
 			version = occ_data->version;
 
 			if (version == 0x01 || version == 0x02 ||
-			    version == 0x90)
+			    version == 0x90 || version == 0xA0)
 				break;
 
 			time_wait_ms(100);
@@ -465,6 +494,57 @@ static bool wait_for_all_occ_init(void)
 			}
 			break;
 
+		case 0xA0:
+			/*
+			 * OCC-OPAL interface version 0xA0 has a
+			 * dynamic data section. This has an
+			 * occ_state field whose values inform about
+			 * the state of the OCC.
+			 *
+			 * 0x00 = OCC not running. No communication
+			 *        allowed.
+			 *
+			 * 0x01 = Standby. No communication allowed.
+			 *
+			 * 0x02 = Observation State. Communication
+			 *        allowed and is command dependent.
+			 *
+			 * 0x03 = Active State. Communication allowed
+			 *        and is command dependent.
+			 *
+			 * 0x04 = Safe State. No communication
+			 *        allowed. Just like CPU throttle
+			 *        status, some failures will not allow
+			 *        for OCC to update state to safe.
+			 *
+			 * 0x05 = Characterization State.
+			 *        Communication allowed and is command
+			 *        dependent.
+			 *
+			 * We will error out if OCC is not in the
+			 * Active State.
+			 *
+			 * XXX : Should we error out only if no
+			 *       communication is allowed with the
+			 *       OCC ?
+			 */
+			occ_dyn_data = get_occ_dynamic_data(chip);
+			if (occ_dyn_data->occ_state != 0x3) {
+				/**
+				 * @fwts-label OCCInactive
+				 * @fwts-advice The OCC for a chip was not active.
+				 * This means that CPU frequency scaling will
+				 * not be functional. CPU may be set to a low,
+				 * safe frequency. This means that CPU idle
+				 * states and CPU frequency scaling may not be
+				 * functional.
+ */ + prlog(PR_ERR, "OCC: Chip: %x: OCC not active\n", + chip->id); + return false; + } + break; + default: prlog(PR_ERR, "OCC: Unknown OCC-OPAL interface version.\n"); return false; @@ -476,7 +556,7 @@ static bool wait_for_all_occ_init(void) prlog(PR_DEBUG, "OCC: Chip %02x Data (%016llx) = %016llx\n", chip->id, (uint64_t)occ_data, be64_to_cpu(*(__be64 *)occ_data)); - if (version == 0x90) { + if (version == 0x90 || version == 0xA0) { occ_dyn_data = get_occ_dynamic_data(chip); prlog(PR_DEBUG, "OCC: Chip %02x Dynamic Data (%016llx) = %016llx\n", chip->id, (uint64_t)occ_dyn_data, @@ -549,6 +629,36 @@ static void parse_pstates_v9(struct occ_pstate_table *data, __be32 *dt_id, nr_pstates, j); } +static void parse_pstates_v10(struct occ_pstate_table *data, __be32 *dt_id, + __be32 *dt_freq, int nr_pstates, int pmax, int pmin) +{ + int i, j; + int invalid = 0; + + for (i = 0, j = 0; i < MAX_PSTATES && j < nr_pstates; i++) { + if (cmp_pstates(data->v10.pstates[i].id, pmax) > 0) + continue; + + if (!data->v10.pstates[i].valid) { + prlog(PR_WARNING, "OCC: Found Invalid pstate with index %d. Skipping it.\n", i); + invalid++; + continue; + } + + dt_id[j] = cpu_to_be32(data->v10.pstates[i].id); + dt_freq[j] = cpu_to_be32(be32_to_cpu(data->v10.pstates[i].freq_khz) / 1000); + j++; + + if (data->v10.pstates[i].id == pmin) + break; + } + + if ((j + invalid) != nr_pstates) { + prerror("OCC: Expected pstates(%d) not equal to (Parsed pstates(%d) + Invalid Pstates (%d))\n", + nr_pstates, j, invalid); + } +} + static void parse_vid(struct occ_pstate_table *occ_data, struct dt_node *node, u8 nr_pstates, int pmax, int pmin) @@ -588,6 +698,7 @@ static bool add_cpu_pstate_properties(struct dt_node *power_mgt, struct proc_chip *chip; uint64_t occ_data_area; struct occ_pstate_table *occ_data = NULL; + struct occ_dynamic_data *occ_dyn_data; /* Arrays for device tree */ __be32 *dt_id, *dt_freq; int pmax, pmin, pnom; @@ -647,7 +758,7 @@ static bool add_cpu_pstate_properties(struct dt_node *power_mgt, /* Parse Pmax, Pmin and Pnominal */ switch (major) { case 0: - if (proc_gen == proc_gen_p9) { + if (proc_gen >= proc_gen_p9) { /** * @fwts-label OCCInvalidVersion02 * @fwts-advice The PState table layout version is not @@ -685,6 +796,15 @@ static bool add_cpu_pstate_properties(struct dt_node *power_mgt, pnom = occ_data->v9.pstate_nom; pmax = occ_data->v9.pstate_ultra_turbo; break; + case 0xA: + pmin = occ_data->v10.pstate_min; + pnom = occ_data->v10.pstate_fixed_freq; + occ_dyn_data = get_occ_dynamic_data(chip); + if (occ_dyn_data->v10.wof_enabled) + pmax = occ_data->v10.pstate_ultra_turbo; + else + pmax = occ_data->v10.pstate_fmax; + break; default: /** * @fwts-label OCCUnsupportedVersion @@ -730,7 +850,7 @@ static bool add_cpu_pstate_properties(struct dt_node *power_mgt, nr_pstates = labs(pmax - pmin) + 1; prlog(PR_DEBUG, "OCC: Version %x Min %d Nom %d Max %d Nr States %d\n", occ_data->version, pmin, pnom, pmax, nr_pstates); - if ((major == 0x9 && nr_pstates <= 1) || + if (((major == 0x9 || major == 0xA) && nr_pstates <= 1) || (major == 0 && (nr_pstates <= 1 || nr_pstates > 128))) { /** * @fwts-label OCCInvalidPStateRange @@ -760,6 +880,10 @@ static bool add_cpu_pstate_properties(struct dt_node *power_mgt, parse_pstates_v9(occ_data, dt_id, dt_freq, nr_pstates, pmax, pmin); break; + case 0xA: + parse_pstates_v10(occ_data, dt_id, dt_freq, nr_pstates, + pmax, pmin); + break; default: return false; } @@ -801,6 +925,12 @@ static bool add_cpu_pstate_properties(struct dt_node *power_mgt, for (i = 0; i < nr_cores; i++) dt_cmax[i] 
= cpu_to_be32(occ_data->v9.core_max[i]);
 		break;
+	case 0xA:
+		pturbo = occ_data->v10.pstate_base;
+		pultra_turbo = occ_data->v10.pstate_ultra_turbo;
+		for (i = 0; i < nr_cores; i++)
+			dt_cmax[i] = cpu_to_be32(occ_data->v10.core_max[i]);
+		break;
 	default:
 		return false;
 	}
@@ -824,7 +954,7 @@ static bool add_cpu_pstate_properties(struct dt_node *power_mgt,
 		free(dt_cmax);
 	}
 
-	if (major == 0x9)
+	if (major == 0x9 || major == 0xA)
 		goto out;
 
 	dt_add_property_cells(power_mgt, "#address-cells", 2);
@@ -888,7 +1018,7 @@ static bool cpu_pstates_prepare_core(struct proc_chip *chip,
 	 *
 	 * Use the OR SCOM to set the required bits in PM_GP1 register
 	 * since the OCC might be manipulating the PM_GP1 register as well.
-	 */ 
+	 */
 	rc = xscom_write(chip->id, XSCOM_ADDR_P8_EX_SLAVE(core, EX_PM_SET_GP1),
 			 EX_PM_SETUP_GP1_PM_SPR_OVERRIDE_EN);
 	if (rc) {
@@ -970,6 +1100,7 @@ static inline u8 get_cpu_throttle(struct proc_chip *chip)
 	case 0:
 		return pdata->v2.throttle;
 	case 0x9:
+	case 0xA:
 		data = get_occ_dynamic_data(chip);
 		return data->cpu_throttle;
 	default:
@@ -1379,7 +1510,7 @@ static void occ_cmd_interface_init(void)
 	struct occ_pstate_table *pdata;
 	struct dt_node *power_mgt;
 	struct proc_chip *chip;
-	int i = 0;
+	int i = 0, major;
 
 	/* Check if the OCC data is valid */
 	for_each_chip(chip) {
@@ -1390,7 +1521,8 @@ static void occ_cmd_interface_init(void)
 
 	chip = next_chip(NULL);
 	pdata = get_occ_pstate_table(chip);
-	if ((pdata->version >> 4) != 0x9)
+	major = pdata->version >> 4;
+	if (major != 0x9 && major != 0xA)
 		return;
 
 	for_each_chip(chip)
@@ -1403,11 +1535,18 @@ static void occ_cmd_interface_init(void)
 		pdata = get_occ_pstate_table(chip);
 		data = get_occ_dynamic_data(chip);
 		chips[i].chip_id = chip->id;
-		chips[i].occ_role = pdata->v9.occ_role;
 		chips[i].occ_state = &data->occ_state;
 		chips[i].valid = &pdata->valid;
 		chips[i].cmd = &data->cmd;
 		chips[i].rsp = &data->rsp;
+		switch (major) {
+		case 0x9:
+			chips[i].occ_role = pdata->v9.occ_role;
+			break;
+		case 0xA:
+			chips[i].occ_role = pdata->v10.occ_role;
+			break;
+		}
 		init_lock(&chips[i].queue_lock);
 		chips[i].cmd_in_progress = false;
 		chips[i].request_id = 0;
@@ -1881,6 +2020,7 @@ void occ_pstates_init(void)
 		homer_opal_data_offset = P8_HOMER_OPAL_DATA_OFFSET;
 		break;
 	case proc_gen_p9:
+	case proc_gen_p10:
 		homer_opal_data_offset = P9_HOMER_OPAL_DATA_OFFSET;
 		break;
 	default:
@@ -1943,6 +2083,11 @@ void occ_pstates_init(void)
 	} else if (proc_gen == proc_gen_p9) {
 		freq_domain_mask = P9_PIR_QUAD_MASK;
 		domain_runs_at = FREQ_MAX_IN_DOMAIN;
+	} else if (proc_gen == proc_gen_p10) {
+		freq_domain_mask = P10_PIR_CHIP_MASK;
+		domain_runs_at = FREQ_MAX_IN_DOMAIN;
+	} else {
+		assert(0);
 	}
 	dt_add_property_cells(power_mgt, "freq-domain-mask", freq_domain_mask);
@@ -2112,6 +2257,11 @@ void occ_send_dummy_interrupt(void)
 			    OCB_OCI_OCIMISC_IRQ |
 			    OCB_OCI_OCIMISC_IRQ_OPAL_DUMMY);
 		break;
+	case proc_gen_p10:
+		xscom_write(psi->chip_id, P9_OCB_OCI_OCCMISC_OR,
+			    OCB_OCI_OCIMISC_IRQ |
+			    OCB_OCI_OCIMISC_IRQ_OPAL_DUMMY);
+		break;
 	default:
 		break;
 	}
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:52 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:52 +0530
Subject: [Skiboot] [PATCH v2 14/59] hdata: Add POWER10 support
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-15-hegdevasant@linux.vnet.ibm.com>

From: Ravi Bangoria

Initial P10 support
 - LPC : This provides two useful pieces of information:
	 LPC MCTP Memory Window Base Address
	 Second
vUART console details - Enable memory-buffer mmio - Fix ipmi sensors IPMI sensors are deprecated in P10. Hence do not parse ipmi sensors. - I2C support - Detect PHB5 - Create p10 xscom, xive, chiptod nodes - Set pa-features bit for 2nd DAWR Availability of 2nd DAWR depends on 0th bit of 64th byte of ibm,pa-features property. Set it for p10. Co-authored-by: Vasant Hegde Signed-off-by: Vasant Hegde Co-authored-by: Nicholas Piggin Signed-off-by: Nicholas Piggin Co-authored-by: Reza Arbab Signed-off-by: Reza Arbab Co-authored-by: Ravi Bangoria Signed-off-by: Ravi Bangoria Signed-off-by: Vasant Hegde --- hdata/cpu-common.c | 19 ++++++++++++++- hdata/fsp.c | 10 ++++++-- hdata/i2c.c | 5 ++-- hdata/iohub.c | 50 +++++++++++++++++++++++++++++--------- hdata/memory.c | 8 ++++--- hdata/spira.c | 52 ++++++++++++++++++++++++++++++++++------ hdata/spira.h | 20 ++++++++++++++-- hdata/test/hdata_to_dt.c | 14 +++++++++-- 8 files changed, 148 insertions(+), 30 deletions(-) diff --git a/hdata/cpu-common.c b/hdata/cpu-common.c index e46f919b7..bf821c154 100644 --- a/hdata/cpu-common.c +++ b/hdata/cpu-common.c @@ -46,6 +46,18 @@ struct dt_node * add_core_common(struct dt_node *cpus, 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 .. 55 */ 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 56 .. 63 */ }; + const uint8_t pa_features_p10[] = { + 66, 0, + 0xf6, 0x3f, 0xc7, 0xc0, 0x80, 0xd0, 0x80, 0x00, /* 0 .. 7 */ + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 8 .. 15 */ + 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 16 .. 23 */ + 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 24 .. 31 */ + 0x80, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, 0x00, /* 32 .. 39 */ + 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 40 .. 47 */ + 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 .. 55 */ + 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 56 .. 63 */ + 0x80, 0x00, /* 64 .. 
65 */ + }; const uint8_t *pa_features; size_t pa_features_size; @@ -83,6 +95,11 @@ struct dt_node * add_core_common(struct dt_node *cpus, pa_features_size = sizeof(pa_features_p9); } break; + case PVR_TYPE_P10: + name = "PowerPC,POWER10"; + pa_features = pa_features_p10; + pa_features_size = sizeof(pa_features_p10); + break; default: name = "PowerPC,Unknown"; pa_features = NULL; @@ -103,7 +120,7 @@ struct dt_node * add_core_common(struct dt_node *cpus, dt_add_property_cells(cpu, "ibm,processor-page-sizes", 0xc, 0x10, 0x18, 0x22); - if (proc_gen == proc_gen_p9) + if (proc_gen >= proc_gen_p9) dt_add_property_cells(cpu, "ibm,processor-radix-AP-encodings", 0x0000000c, 0xa0000010, 0x20000015, 0x4000001e); diff --git a/hdata/fsp.c b/hdata/fsp.c index 18380e7d4..458b7e636 100644 --- a/hdata/fsp.c +++ b/hdata/fsp.c @@ -355,7 +355,7 @@ static void add_ipmi_sensors(struct dt_node *bmc_node) static void bmc_create_node(const struct HDIF_common_hdr *sp) { struct dt_node *bmc_node; - u32 fw_bar, io_bar, mem_bar, internal_bar; + u32 fw_bar, io_bar, mem_bar, internal_bar, mctp_base; const struct spss_iopath *iopath; const struct spss_sp_impl *sp_impl; struct dt_node *lpcm, *lpc, *n; @@ -370,7 +370,8 @@ static void bmc_create_node(const struct HDIF_common_hdr *sp) dt_add_property_cells(bmc_node, "#size-cells", 0); /* Add sensor info under /bmc */ - add_ipmi_sensors(bmc_node); + if (proc_gen < proc_gen_p10) + add_ipmi_sensors(bmc_node); sp_impl = HDIF_get_idata(sp, SPSS_IDATA_SP_IMPL, &size); if (CHECK_SPPTR(sp_impl) && (size > 8)) { @@ -425,12 +426,17 @@ static void bmc_create_node(const struct HDIF_common_hdr *sp) mem_bar = be32_to_cpu(iopath->lpc.memory_bar); io_bar = be32_to_cpu(iopath->lpc.io_bar); internal_bar = be32_to_cpu(iopath->lpc.internal_bar); + mctp_base = be32_to_cpu(iopath->lpc.mctp_base); prlog(PR_DEBUG, "LPC: IOPATH chip id = %x\n", chip_id); prlog(PR_DEBUG, "LPC: FW BAR = %#x\n", fw_bar); prlog(PR_DEBUG, "LPC: MEM BAR = %#x\n", mem_bar); prlog(PR_DEBUG, "LPC: IO BAR = %#x\n", io_bar); prlog(PR_DEBUG, "LPC: Internal BAR = %#x\n", internal_bar); + if (proc_gen >= proc_gen_p10) { + /* MCTP is part of FW BAR */ + prlog(PR_DEBUG, "LPC: MCTP base = %#x\n", mctp_base); + } /* * The internal address space BAR actually points to the LPC master diff --git a/hdata/i2c.c b/hdata/i2c.c index 8aa93d8f5..7d5d655a5 100644 --- a/hdata/i2c.c +++ b/hdata/i2c.c @@ -250,7 +250,7 @@ int parse_i2c_devs(const struct HDIF_common_hdr *hdr, int idata_index, * This code makes a few assumptions about XSCOM addrs, etc * and will need updating for new processors */ - assert(proc_gen == proc_gen_p9); + assert(proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10); /* * Emit an error if we get a newer version. This is an interim measure @@ -301,7 +301,8 @@ int parse_i2c_devs(const struct HDIF_common_hdr *hdr, int idata_index, * engines outside this range so we don't create bogus * i2cm@ nodes. 
*/ - if (dev->i2cm_engine >= 4 && proc_gen == proc_gen_p9) + if (dev->i2cm_engine >= 4 && + (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10)) continue; bus = p8_i2c_add_port_node(xscom, dev->i2cm_engine, dev->i2cm_port, diff --git a/hdata/iohub.c b/hdata/iohub.c index fa3afbf7a..fb215e1fb 100644 --- a/hdata/iohub.c +++ b/hdata/iohub.c @@ -151,6 +151,7 @@ static struct dt_node *add_pec_stack(const struct cechub_io_hub *hub, int phb_index, u8 active_phbs) { struct dt_node *stack; + const char *compat; u64 eq[8]; u8 *gen4; int i; @@ -158,9 +159,14 @@ static struct dt_node *add_pec_stack(const struct cechub_io_hub *hub, stack = dt_new_addr(pbcq, "stack", stack_index); assert(stack); + if (proc_gen == proc_gen_p9) + compat = "ibm,power9-phb-stack"; + else + compat = "ibm,power10-phb-stack"; + dt_add_property_cells(stack, "reg", stack_index); dt_add_property_cells(stack, "ibm,phb-index", phb_index); - dt_add_property_string(stack, "compatible", "ibm,power9-phb-stack"); + dt_add_property_string(stack, "compatible", compat); /* XXX: This should probably just return if the PHB is disabled * rather than adding the extra properties. @@ -190,6 +196,7 @@ static struct dt_node *add_pec_stack(const struct cechub_io_hub *hub, return stack; } +/* Add PHB4 on p9, PHB5 on p10 */ static struct dt_node *io_add_phb4(const struct cechub_io_hub *hub, const struct HDIF_common_hdr *sp_iohubs, struct dt_node *xcom, @@ -199,10 +206,21 @@ static struct dt_node *io_add_phb4(const struct cechub_io_hub *hub, { struct dt_node *pbcq; uint8_t active_phb_mask = hub->fab_br0_pdt; - uint32_t pe_xscom = 0x4010c00 + (pec_index * 0x0000400); - uint32_t pci_xscom = 0xd010800 + (pec_index * 0x1000000); + uint32_t pe_xscom; + uint32_t pci_xscom; + const char *compat; int i; + if (proc_gen == proc_gen_p9) { + pe_xscom = 0x4010c00 + (pec_index * 0x0000400); + pci_xscom = 0xd010800 + (pec_index * 0x1000000); + compat = "ibm,power9-pbcq"; + } else { + pe_xscom = 0x3011800 - (pec_index * 0x1000000); + pci_xscom = 0x8010800 + (pec_index * 0x1000000); + compat = "ibm,power10-pbcq"; + } + /* Create PBCQ node under xscom */ pbcq = dt_new_addr(xcom, "pbcq", pe_xscom); if (!pbcq) @@ -214,7 +232,7 @@ static struct dt_node *io_add_phb4(const struct cechub_io_hub *hub, pci_xscom, 0x200); /* The hubs themselves go under the stacks */ - dt_add_property_strings(pbcq, "compatible", "ibm,power9-pbcq"); + dt_add_property_strings(pbcq, "compatible", compat); dt_add_property_cells(pbcq, "ibm,pec-index", pec_index); dt_add_property_cells(pbcq, "#address-cells", 1); dt_add_property_cells(pbcq, "#size-cells", 0); @@ -229,7 +247,7 @@ static struct dt_node *io_add_phb4(const struct cechub_io_hub *hub, */ io_get_loc_code(sp_iohubs, pbcq, "ibm,loc-code"); - prlog(PR_INFO, "CEC: Added PHB4 PBCQ %d with %d stacks\n", + prlog(PR_INFO, "CEC: Added PBCQ %d with %d stacks\n", pec_index, stacks); /* the actual PHB nodes created later on by skiboot */ @@ -267,6 +285,7 @@ static struct dt_node *io_add_p8(const struct cechub_io_hub *hub, return xscom; } +/* Add PBCQs for p9/p10 */ static struct dt_node *io_add_p9(const struct cechub_io_hub *hub, const struct HDIF_common_hdr *sp_iohubs) { @@ -280,17 +299,22 @@ static struct dt_node *io_add_p9(const struct cechub_io_hub *hub, xscom = find_xscom_for_chip(chip_id); if (!xscom) { - prerror("P9: Can't find XSCOM for chip %d\n", chip_id); + prerror("IOHUB: Can't find XSCOM for chip %d\n", chip_id); return NULL; } - prlog(PR_DEBUG, "IOHUB: PHB4 active bridge mask %x\n", + prlog(PR_DEBUG, "IOHUB: PHB active bridge mask 
%x\n", (u32) hub->fab_br0_pdt); /* Create PBCQs */ - io_add_phb4(hub, sp_iohubs, xscom, 0, 1, 0); - io_add_phb4(hub, sp_iohubs, xscom, 1, 2, 1); - io_add_phb4(hub, sp_iohubs, xscom, 2, 3, 3); + if (proc_gen == proc_gen_p9) { + io_add_phb4(hub, sp_iohubs, xscom, 0, 1, 0); + io_add_phb4(hub, sp_iohubs, xscom, 1, 2, 1); + io_add_phb4(hub, sp_iohubs, xscom, 2, 3, 3); + } else { /* p10 */ + io_add_phb4(hub, sp_iohubs, xscom, 0, 3, 0); + io_add_phb4(hub, sp_iohubs, xscom, 1, 3, 3); + } return xscom; } @@ -806,6 +830,10 @@ static void io_parse_fru(const void *sp_iohubs) prlog(PR_INFO, "CEC: Axone !\n"); io_add_p9(hub, sp_iohubs); break; + case CECHUB_HUB_RAINIER: + prlog(PR_INFO, "CEC: Rainier !\n"); + io_add_p9(hub, sp_iohubs); + break; default: prlog(PR_ERR, "CEC: Hub ID 0x%04x unsupported !\n", hub_id); @@ -817,7 +845,7 @@ static void io_parse_fru(const void *sp_iohubs) io_parse_slots(sp_iohubs, chip_id); } - if (proc_gen == proc_gen_p8 || proc_gen == proc_gen_p9) + if (proc_gen == proc_gen_p8 || proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) io_add_p8_cec_vpd(sp_iohubs); } diff --git a/hdata/memory.c b/hdata/memory.c index 6602addd9..efdb502b1 100755 --- a/hdata/memory.c +++ b/hdata/memory.c @@ -53,6 +53,8 @@ struct HDIF_ms_area_address_range { #define PHYS_ATTR_STATUS_SAVE_FAILED 0x02 #define PHYS_ATTR_STATUS_SAVED 0x04 #define PHYS_ATTR_STATUS_NOT_SAVED 0x08 +#define PHYS_ATTR_STATUS_ENCRYPTED 0x10 +#define PHYS_ATTR_STATUS_ERR_DETECTED 0x40 #define PHYS_ATTR_STATUS_MEM_INVALID 0xff /* Memory Controller ID for Nimbus P9 systems */ @@ -514,7 +516,7 @@ static void add_memory_buffer_mmio(const struct HDIF_common_hdr *msarea) struct dt_node *membuf; beint64_t *reg, *flags; - if (PVR_TYPE(mfspr(SPR_PVR)) != PVR_TYPE_P9P) + if (proc_gen <= proc_gen_p9 && PVR_TYPE(mfspr(SPR_PVR)) != PVR_TYPE_P9P) return; if (be16_to_cpu(msarea->version) < 0x50) { @@ -911,7 +913,8 @@ static bool __memory_parse(struct dt_node *root) prlog(PR_DEBUG, "MS VPD: is at %p\n", ms_vpd); msac = HDIF_get_idata(ms_vpd, MSVPD_IDATA_MS_ADDR_CONFIG, &size); - if (!CHECK_SPPTR(msac) || size < sizeof(*msac)) { + if (!CHECK_SPPTR(msac) || + size < offsetof(struct msvpd_ms_addr_config, max_possible_ms_address)) { prerror("MS VPD: bad msac size %u @ %p\n", size, msac); op_display(OP_FATAL, OP_MOD_MEM, 0x0002); return false; @@ -953,4 +956,3 @@ void memory_parse(void) abort(); } } - diff --git a/hdata/spira.c b/hdata/spira.c index 2e3b3a463..85c2fe71c 100644 --- a/hdata/spira.c +++ b/hdata/spira.c @@ -301,6 +301,7 @@ static struct dt_node *add_xscom_node(uint64_t base, uint32_t hw_id, addr = base | ((uint64_t)hw_id << PPC_BITLSHIFT(28)); break; case proc_gen_p9: + case proc_gen_p10: /* XXX P10 */ default: /* On P9 we need to put the chip ID in the natural powerbus * position. 
@@ -332,6 +333,10 @@ static struct dt_node *add_xscom_node(uint64_t base, uint32_t hw_id, dt_add_property_strings(node, "compatible", "ibm,xscom", "ibm,power9-xscom"); break; + case proc_gen_p10: + dt_add_property_strings(node, "compatible", + "ibm,xscom", "ibm,power10-xscom"); + break; default: dt_add_property_strings(node, "compatible", "ibm,xscom"); } @@ -420,6 +425,11 @@ static void add_psihb_node(struct dt_node *np) psi_slen = 0x100; psi_comp = "ibm,power9-psihb-x"; break; + case proc_gen_p10: + psi_scom = 0x3011d00; + psi_slen = 0x100; + psi_comp = "ibm,power10-psihb-x"; + break; default: psi_comp = NULL; } @@ -438,10 +448,28 @@ static void add_psihb_node(struct dt_node *np) static void add_xive_node(struct dt_node *np) { - struct dt_node *xive = dt_new_addr(np, "xive", 0x5013000); + struct dt_node *xive; + const char *comp; + u32 scom, slen; + + switch (proc_gen) { + case proc_gen_p9: + scom = 0x5013000; + slen = 0x300; + comp = "ibm,power9-xive-x"; + break; + case proc_gen_p10: + scom = 0x2010800; + slen = 0x400; + comp = "ibm,power10-xive-x"; + break; + default: + return; + } - dt_add_property_cells(xive, "reg", 0x5013000, 0x300); - dt_add_property_string(xive, "compatible", "ibm,power9-xive-x"); + xive = dt_new_addr(np, "xive", scom); + dt_add_property_cells(xive, "reg", scom, slen); + dt_add_property_string(xive, "compatible", comp); /* HACK: required for simics */ dt_add_property(xive, "force-assign-bars", NULL, 0); @@ -725,6 +753,9 @@ static void add_chiptod_node(unsigned int chip_id, int flags) case proc_gen_p9: compat_str = "ibm,power9-chiptod"; break; + case proc_gen_p10: + compat_str = "ibm,power10-chiptod"; + break; default: return; } @@ -866,6 +897,7 @@ static void add_nx_node(u32 gcid) /* POWER9 NX is not software compatible with P8 NX */ dt_add_property_strings(nx, "compatible", "ibm,power9-nx"); break; + case proc_gen_p10: /* XXX P10 */ default: return; } @@ -903,15 +935,21 @@ static void add_nx(void) static void add_nmmu(void) { struct dt_node *xscom, *nmmu; + u32 scom; - /* Nest MMU only exists on POWER9 */ - if (proc_gen != proc_gen_p9) + /* Nest MMU only exists on POWER9 or later */ + if (proc_gen < proc_gen_p9) return; + if (proc_gen == proc_gen_p9) + scom = 0x5012c40; + else + scom = 0x2010c40; + dt_for_each_compatible(dt_root, xscom, "ibm,xscom") { - nmmu = dt_new_addr(xscom, "nmmu", 0x5012c40); + nmmu = dt_new_addr(xscom, "nmmu", scom); dt_add_property_strings(nmmu, "compatible", "ibm,power9-nest-mmu"); - dt_add_property_cells(nmmu, "reg", 0x5012c40, 0x20); + dt_add_property_cells(nmmu, "reg", scom, 0x20); } } diff --git a/hdata/spira.h b/hdata/spira.h index 18d73bdfa..7c5341f94 100644 --- a/hdata/spira.h +++ b/hdata/spira.h @@ -304,7 +304,7 @@ struct spss_iopath { __be32 firmware_bar; __be32 internal_bar; - __be32 reserved2; + __be32 mctp_base; __be64 uart_base; __be32 uart_size; @@ -316,13 +316,27 @@ struct spss_iopath { #define UART_INT_LVL_LOW 0x1 #define UART_INT_RISING 0x2 #define UART_INT_LVL_HIGH 0x3 - uint8_t reserved3[2]; + uint8_t uart_valid; + uint8_t reserved3; __be64 bt_base; __be32 bt_size; uint8_t bt_sms_int_num; uint8_t bt_bmc_response_int_num; uint8_t reserved4[2]; + + __be16 kcs_data_reg_addr; + __be16 kcs_status_reg_addr; + uint8_t kcs_int_number; + + __be64 uart2_base; + __be32 uart2_size; + __be32 uart2_clk; /* UART baud clock in Hz */ + __be32 uart2_baud; /* UART baud rate */ + uint8_t uart2_int_number; + uint8_t uart2_int_type; + uint8_t uart2_valid; + uint8_t reserved5; } __packed lpc; }; } __packed; @@ -493,6 +507,7 @@ struct 
msvpd_ms_addr_config {
 	__be64 max_possible_ms_address;
 	__be32 deprecated;
 	__be64 mirrorable_memory_starting_address;
+	__be64 hrmor_stash_loc_address;
 } __packed;
 
 /* Idata index 1: Total configured mainstore */
@@ -651,6 +666,7 @@ struct cechub_io_hub {
 #define CECHUB_HUB_NIMBUS_LAGRANGE	0x0022	/* Nimbus+lagrange from spec */
 #define CECHUB_HUB_CUMULUS_DUOMO	0x0030	/* cumulus+duomo from spec */
 #define CECHUB_HUB_AXONE_HOPPER		0x0040	/* axone+hopper */
+#define CECHUB_HUB_RAINIER		0x0050
 	__be32	ec_level;
 	__be32	aff_dom2;	/* HDAT < v9.x only */
 	__be32	aff_dom3;	/* HDAT < v9.x only */
diff --git a/hdata/test/hdata_to_dt.c b/hdata/test/hdata_to_dt.c
index 90d83f937..1729f1ca9 100644
--- a/hdata/test/hdata_to_dt.c
+++ b/hdata/test/hdata_to_dt.c
@@ -2,7 +2,7 @@
 /*
  * Given a hdata dump, output the device tree.
  *
- * Copyright 2013-2019 IBM Corp.
+ * Copyright 2013-2020 IBM Corp.
 */
 
 #include 
@@ -63,11 +63,13 @@ unsigned long tb_hz = 512000000;
 #define PVR_TYPE_P8NVL	0x004c
 #define PVR_TYPE_P9	0x004e
 #define PVR_TYPE_P9P	0x004f
+#define PVR_TYPE_P10	0x0080
 
 #define PVR_P8E		0x004b0201
 #define PVR_P8		0x004d0200
 #define PVR_P8NVL	0x004c0100
 #define PVR_P9		0x004e0200
 #define PVR_P9P		0x004f0100
+#define PVR_P10		0x00800100
 
 #define SPR_PVR		0x11f	/* RO: Processor version register */
@@ -328,6 +330,10 @@ int main(int argc, char *argv[])
 			fake_pvr = PVR_P9P;
 			proc_gen = proc_gen_p9;
 			opt_count++;
+		} else if (strcmp(argv[i], "-10") == 0) {
+			fake_pvr = PVR_P10;
+			proc_gen = proc_gen_p10;
+			opt_count++;
 		}
 	}
 
@@ -347,13 +353,17 @@ int main(int argc, char *argv[])
 			" -8 Force PVR to POWER8\n"
 			" -8E Force PVR to POWER8E\n"
 			" -9 Force PVR to POWER9 (nimbus)\n"
+			" -9P Force PVR to POWER9P (Axone)\n"
+			" -10 Force PVR to POWER10\n"
 			"\n"
 			"When no PVR is specified -8 is assumed"
 			"\n"
 			"Pipe to 'dtc -I dtb -O dts' for human readable output\n");
 	}
 
-	phys_map_init(fake_pvr);
+	/* We don't have phys mapping for P8 */
+	if (proc_gen != proc_gen_p8)
+		phys_map_init(fake_pvr);
 
 	/* Copy in spira dump (assumes little has changed!). */
 	if (new_spira) {
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:53 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:53 +0530
Subject: [Skiboot] [PATCH v2 15/59] hdat/spira: Define
 ibm,primary-topology-index property per chip
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-16-hegdevasant@linux.vnet.ibm.com>

From: Haren Myneni

HDAT provides the Topology ID table and the primary topology location
on P10. The primary location points to the primary topology entry in
the ID table, which contains the primary topology index; this index
is used to define the paste base address per chip.

This patch reads the Topology ID table and the primary topology
location from hdata and retrieves the primary topology index from the
ID table. Make this primary topology index value available via the
ibm,primary-topology-index property per chip. VAS reads this property
to set up the paste base address for each chip.
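A sketch of the consumer side in hw/vas.c, assuming only the
chip->primary_topology field cached by the core/chip.c hunk below; the
helper name and the error handling are illustrative:

	static uint32_t vas_primary_topology_index(struct proc_chip *chip)
	{
		/* init_chip() stores 0xffffffff when HDAT did not
		 * provide ibm,primary-topology-index */
		assert(chip->primary_topology != 0xffffffff);
		return chip->primary_topology;
	}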
Signed-off-by: Haren Myneni Signed-off-by: Vasant Hegde --- core/chip.c | 3 +++ hdata/spira.c | 12 ++++++++++++ hdata/spira.h | 5 ++++- include/chip.h | 3 +++ 4 files changed, 22 insertions(+), 1 deletion(-) diff --git a/core/chip.c b/core/chip.c index a4ba3249e..2d95b2e05 100644 --- a/core/chip.c +++ b/core/chip.c @@ -133,6 +133,9 @@ static void init_chip(struct dt_node *dn) if (lc) chip->loc_code = strdup(lc); + chip->primary_topology = dt_prop_get_u32_def(dn, + "ibm,primary-topology-index", 0xffffffff); + prlog(PR_INFO, "CHIP: Initialised chip %d from %s\n", id, dn->name); chips[id] = chip; } diff --git a/hdata/spira.c b/hdata/spira.c index 85c2fe71c..2fd3da108 100644 --- a/hdata/spira.c +++ b/hdata/spira.c @@ -688,6 +688,18 @@ static bool add_xscom_sppcrd(uint64_t xscom_base) be32_to_cpu(cinfo->sw_xstop_fir_scom), fir_bit); } + + if (proc_gen >= proc_gen_p10) { + uint8_t primary_loc = cinfo->primary_topology_loc; + + if (primary_loc >= CHIP_MAX_TOPOLOGY_ENTRIES) { + prerror("XSCOM: Invalid primary topology index %d\n", + primary_loc); + continue; + } + dt_add_property_cells(np, "ibm,primary-topology-index", + cinfo->topology_id_table[primary_loc]); + } } return i > 0; diff --git a/hdata/spira.h b/hdata/spira.h index 7c5341f94..7da1154d7 100644 --- a/hdata/spira.h +++ b/hdata/spira.h @@ -1092,7 +1092,10 @@ struct sppcrd_chip_info { /* From latest version (possibly 0x21 and later) */ __be32 sw_xstop_fir_scom; uint8_t sw_xstop_fir_bitpos; - uint8_t reserved_1[3]; + /* Latest version for P10 */ +#define CHIP_MAX_TOPOLOGY_ENTRIES 32 + uint8_t topology_id_table[CHIP_MAX_TOPOLOGY_ENTRIES]; + uint8_t primary_topology_loc; /* Index in topology_id_table */ } __packed; /* Idata index 1 : Chip TOD */ diff --git a/include/chip.h b/include/chip.h index 8bc48ba29..bbfc65e3a 100644 --- a/include/chip.h +++ b/include/chip.h @@ -277,6 +277,9 @@ struct proc_chip { /* Used during OCC init */ bool ex_present; + + /* Used by hw/vas.c on p10 */ + uint32_t primary_topology; }; extern uint32_t pir_to_chip_id(uint32_t pir); -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:55 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:55 +0530 Subject: [Skiboot] [PATCH v2 17/59] hdata/P10: Fix xscom address and ibm, chip-id property In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-18-hegdevasant@linux.vnet.ibm.com> `xscom_id` is deprecated in P10. Instead we should use topology ID's ("Primary topology table index") to calculate xscom address. Also use ("Processor fabric topology id") for "ibm,chip-id" property. 
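A worked sketch of the address derivation this change enables; the
44-bit shift is taken from the spira.c hunk below, while the helper
name is made up:

	static uint64_t p10_xscom_addr(uint64_t base, uint8_t topology_id)
	{
		/* P10 places the primary topology index at bit 44 of
		 * the XSCOM address, where P9 shifted the chip id by
		 * 42 instead. */
		return base | ((uint64_t)topology_id << 44);
	}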
Signed-off-by: Vasant Hegde --- hdata/fsp.c | 2 +- hdata/hdata.h | 1 + hdata/spira.c | 34 +++++++++++++++++++++++----------- hdata/spira.h | 3 +++ 4 files changed, 28 insertions(+), 12 deletions(-) diff --git a/hdata/fsp.c b/hdata/fsp.c index 458b7e636..42f1121ab 100644 --- a/hdata/fsp.c +++ b/hdata/fsp.c @@ -297,7 +297,7 @@ static void add_chip_id_to_sensors(struct dt_node *sensor_node, uint32_t slca_in } dt_add_property_cells(sensor_node, - "ibm,chip-id", be32_to_cpu(cinfo->xscom_id)); + "ibm,chip-id", get_xscom_id(cinfo)); return; } } diff --git a/hdata/hdata.h b/hdata/hdata.h index cbc61c31d..bae4eaa58 100644 --- a/hdata/hdata.h +++ b/hdata/hdata.h @@ -24,6 +24,7 @@ extern void vpd_data_parse(struct dt_node *node, extern struct dt_node *find_xscom_for_chip(uint32_t chip_id); extern uint32_t pcid_to_chip_id(uint32_t proc_chip_id); +extern uint32_t get_xscom_id(const struct sppcrd_chip_info *cinfo); extern struct dt_node *add_core_common(struct dt_node *cpus, const struct sppcia_cpu_cache *cache, diff --git a/hdata/spira.c b/hdata/spira.c index b7101d72e..7d56f3f29 100644 --- a/hdata/spira.c +++ b/hdata/spira.c @@ -289,12 +289,23 @@ struct HDIF_common_hdr *__get_hdif(struct spira_ntuple *n, const char id[], return h; } -static struct dt_node *add_xscom_node(uint64_t base, uint32_t hw_id, - uint32_t proc_chip_id) +uint32_t get_xscom_id(const struct sppcrd_chip_info *cinfo) +{ + if (proc_gen <= proc_gen_p9) + return be32_to_cpu(cinfo->xscom_id); + + /* On P10 use Processor fabric topology id for chip id */ + return (uint32_t)(cinfo->fab_topology_id); +} + +static struct dt_node *add_xscom_node(uint64_t base, + const struct sppcrd_chip_info *cinfo) { struct dt_node *node; uint64_t addr, size; uint64_t freq; + uint32_t hw_id = get_xscom_id(cinfo); + uint32_t proc_chip_id = be32_to_cpu(cinfo->proc_chip_id); switch (proc_gen) { case proc_gen_p8: @@ -302,13 +313,16 @@ static struct dt_node *add_xscom_node(uint64_t base, uint32_t hw_id, addr = base | ((uint64_t)hw_id << PPC_BITLSHIFT(28)); break; case proc_gen_p9: - case proc_gen_p10: /* XXX P10 */ - default: /* On P9 we need to put the chip ID in the natural powerbus * position. 
*/ addr = base | (((uint64_t)hw_id) << 42); break; + case proc_gen_p10: + default: + /* Use Primary topology table index for xscom address */ + addr = base | (((uint64_t)cinfo->topology_id_table[cinfo->primary_topology_loc]) << 44); + break; }; size = (u64)1 << PPC_BITLSHIFT(28); @@ -611,9 +625,7 @@ static bool add_xscom_sppcrd(uint64_t xscom_base) continue; /* Create the XSCOM node */ - np = add_xscom_node(xscom_base, - be32_to_cpu(cinfo->xscom_id), - be32_to_cpu(cinfo->proc_chip_id)); + np = add_xscom_node(xscom_base, cinfo); if (!np) continue; @@ -636,7 +648,7 @@ static bool add_xscom_sppcrd(uint64_t xscom_base) SPPCRD_IDATA_KW_VPD); if (vpd_node) dt_add_property_cells(vpd_node, "ibm,chip-id", - be32_to_cpu(cinfo->xscom_id)); + get_xscom_id(cinfo)); fru_id = HDIF_get_idata(hdif, SPPCRD_IDATA_FRU_ID, NULL); if (fru_id) @@ -875,7 +887,7 @@ static bool add_chiptod_new(void) flags |= CHIPTOD_ID_FLAGS_PRIMARY; } - add_chiptod_node(be32_to_cpu(cinfo->xscom_id), flags); + add_chiptod_node(get_xscom_id(cinfo), flags); found = true; } return found; @@ -947,7 +959,7 @@ static void add_nx(void) continue; if (cinfo->nx_state) - add_nx_node(be32_to_cpu(cinfo->xscom_id)); + add_nx_node(get_xscom_id(cinfo)); } } @@ -1397,7 +1409,7 @@ uint32_t pcid_to_chip_id(uint32_t proc_chip_id) continue; } if (proc_chip_id == be32_to_cpu(cinfo->proc_chip_id)) - return be32_to_cpu(cinfo->xscom_id); + return get_xscom_id(cinfo); } /* Not found, what to do ? Assert ? For now return a number diff --git a/hdata/spira.h b/hdata/spira.h index 7da1154d7..3a8a31e1a 100644 --- a/hdata/spira.h +++ b/hdata/spira.h @@ -1096,6 +1096,9 @@ struct sppcrd_chip_info { #define CHIP_MAX_TOPOLOGY_ENTRIES 32 uint8_t topology_id_table[CHIP_MAX_TOPOLOGY_ENTRIES]; uint8_t primary_topology_loc; /* Index in topology_id_table */ + __be32 abc_bus_speed; /* SMP A */ + __be32 wxyz_bus_speed; /* SMP X */ + uint8_t fab_topology_id;/* topology id associated with the chip. */ } __packed; /* Idata index 1 : Chip TOD */ -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:54 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:54 +0530 Subject: [Skiboot] [PATCH v2 16/59] hdat/spira: Add ibm, power10-vas-x string to VAS compatible property In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-17-hegdevasant@linux.vnet.ibm.com> From: Haren Myneni VAS SCOM base address and paste address format are changed on P10. This patch adds ibm,power10-vas-x string to compatible property per each VAS node. This compatible string is used to define the paste base address later during VAS initialization. Also enables NX on P10 without adding any compatible string since the NX SCOM base address is not changed. 
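As an illustrative consumer (not in this patch), VAS setup code could
key the paste address format off the new string with skiboot's
existing device-tree helper:

	/* Hypothetical helper: true when the node advertises the P10
	 * paste address format added by this patch. */
	static bool vas_node_is_p10(const struct dt_node *vas_node)
	{
		return dt_node_is_compatible(vas_node, "ibm,power10-vas-x");
	}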
Signed-off-by: Haren Myneni
Signed-off-by: Vasant Hegde
---
 hdata/spira.c | 25 ++++++++++++++++---------
 include/vas.h |  5 +++--
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/hdata/spira.c b/hdata/spira.c
index 2fd3da108..b7101d72e 100644
--- a/hdata/spira.c
+++ b/hdata/spira.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "hdata.h"
 #include "hostservices.h"
@@ -475,17 +476,23 @@ static void add_xive_node(struct dt_node *np)
 	dt_add_property(xive, "force-assign-bars", NULL, 0);
 }
 
-/*
- * SCOM Base Address from P9 SCOM Assignment spreadsheet
- */
-#define VAS_SCOM_BASE_ADDR		0x03011800
-
 static void add_vas_node(struct dt_node *np, int idx)
 {
-	struct dt_node *vas = dt_new_addr(np, "vas", VAS_SCOM_BASE_ADDR);
+	struct dt_node *vas;
+	const char *comp;
+	uint64_t base_addr;
 
-	dt_add_property_cells(vas, "reg", VAS_SCOM_BASE_ADDR, 0x300);
-	dt_add_property_string(vas, "compatible", "ibm,power9-vas-x");
+	if (proc_gen == proc_gen_p9) {
+		base_addr = P9_VAS_SCOM_BASE_ADDR;
+		comp = "ibm,power9-vas-x";
+	} else {
+		base_addr = VAS_SCOM_BASE_ADDR;
+		comp = "ibm,power10-vas-x";
+	}
+
+	vas = dt_new_addr(np, "vas", base_addr);
+	dt_add_property_cells(vas, "reg", base_addr, 0x300);
+	dt_add_property_string(vas, "compatible", comp);
 	dt_add_property_cells(vas, "ibm,vas-id", idx);
 }
 
@@ -906,10 +913,10 @@ static void add_nx_node(u32 gcid)
 				"ibm,power8-nx");
 		break;
 	case proc_gen_p9:
+	case proc_gen_p10:
 		/* POWER9 NX is not software compatible with P8 NX */
 		dt_add_property_strings(nx, "compatible", "ibm,power9-nx");
 		break;
-	case proc_gen_p10: /* XXX P10 */
 	default:
 		return;
 	}
diff --git a/include/vas.h b/include/vas.h
index 1c06e5606..369c3807a 100644
--- a/include/vas.h
+++ b/include/vas.h
@@ -67,9 +67,10 @@ extern __attrconst uint64_t vas_get_wcbs_bar(int chipid);
 #define VAS_WINDOWS_PER_CHIP		65536 /* 64K */
 
 /*
- * SCOM Base Address from P9 SCOM Assignment spreadsheet
+ * SCOM Base Address from P9/P10 SCOM Assignment spreadsheet
 */
-#define VAS_SCOM_BASE_ADDR		0x03011800
+#define P9_VAS_SCOM_BASE_ADDR		0x03011800
+#define VAS_SCOM_BASE_ADDR		0x02011400
 
 /*
  * NOTE: VAS_SCOM_BASE_ADDR (0x3011800) includes the SCOM ring of 6. So,
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:56 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:50:56 +0530
Subject: [Skiboot] [PATCH v2 18/59] phys/P10: Use topology index to get
 phys mapping
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-19-hegdevasant@linux.vnet.ibm.com>

This fixes the multi-chip Rainier boot issue.

For Rainier:
  chip0: ibm,primary-topology-index = < 0x0>;
  chip1: ibm,primary-topology-index = < 0x4>;
  chip2: ibm,primary-topology-index = < 0x8>;
  chip3: ibm,primary-topology-index = < 0xc>;

For Denali:
  node0:
    chip0: ibm,primary-topology-index = < 0x0>;
    chip1: ibm,primary-topology-index = < 0x1>;
    chip2: ibm,primary-topology-index = < 0x2>;
    chip3: ibm,primary-topology-index = < 0x3>;
  node1:
    chip0: ibm,primary-topology-index = < 0x4>;
    chip1: ibm,primary-topology-index = < 0x5>;
    chip2: ibm,primary-topology-index = < 0x6>;
    chip3: ibm,primary-topology-index = < 0x7>;

Note that bmc_create_node() gets called very early in the boot
process, hence we have to traverse the HDAT ntuples to get the right
topology index. Maybe we can optimize pcid_to_topology_idx(), as it
is pretty much a duplicate of pcid_to_chip_id(), but for now let's
keep it as a separate function.
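A sketch of the arithmetic __phys_map_get() performs after this
change; names are shortened and chip_select_shift comes from the
per-platform map:

	static uint64_t phys_region_base(uint64_t entry_addr,
					 uint64_t topology_idx,
					 uint32_t chip_select_shift)
	{
		/* e.g. Rainier chip1 offsets by topology index 0x4,
		 * not by its chip id 1 */
		return entry_addr + (topology_idx << chip_select_shift);
	}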
Signed-off-by: Vasant Hegde Signed-off-by: Ryan Grimm Signed-off-by: Vasant Hegde --- hdata/fsp.c | 4 +++- hdata/hdata.h | 1 + hdata/spira.c | 28 ++++++++++++++++++++++++++++ hw/phys-map.c | 18 ++++++++++++++++-- hw/test/phys-map-test.c | 7 ++++++- include/phys-map.h | 3 +++ 6 files changed, 57 insertions(+), 4 deletions(-) diff --git a/hdata/fsp.c b/hdata/fsp.c index 42f1121ab..30cda53f6 100644 --- a/hdata/fsp.c +++ b/hdata/fsp.c @@ -361,6 +361,7 @@ static void bmc_create_node(const struct HDIF_common_hdr *sp) struct dt_node *lpcm, *lpc, *n; u64 lpcm_base, lpcm_end; uint32_t chip_id; + uint32_t topology_idx; int size; bmc_node = dt_new(dt_root, "bmc"); @@ -399,8 +400,9 @@ static void bmc_create_node(const struct HDIF_common_hdr *sp) * phys map offset */ chip_id = pcid_to_chip_id(be32_to_cpu(iopath->lpc.chip_id)); + topology_idx = pcid_to_topology_idx(be32_to_cpu(iopath->lpc.chip_id)); - phys_map_get(chip_id, LPC_BUS, 0, &lpcm_base, NULL); + __phys_map_get(topology_idx, chip_id, LPC_BUS, 0, &lpcm_base, NULL); lpcm = dt_new_addr(dt_root, "lpcm-opb", lpcm_base); assert(lpcm); diff --git a/hdata/hdata.h b/hdata/hdata.h index bae4eaa58..6aad82932 100644 --- a/hdata/hdata.h +++ b/hdata/hdata.h @@ -24,6 +24,7 @@ extern void vpd_data_parse(struct dt_node *node, extern struct dt_node *find_xscom_for_chip(uint32_t chip_id); extern uint32_t pcid_to_chip_id(uint32_t proc_chip_id); +extern uint32_t pcid_to_topology_idx(uint32_t proc_chip_id); extern uint32_t get_xscom_id(const struct sppcrd_chip_info *cinfo); extern struct dt_node *add_core_common(struct dt_node *cpus, diff --git a/hdata/spira.c b/hdata/spira.c index 7d56f3f29..baa23751d 100644 --- a/hdata/spira.c +++ b/hdata/spira.c @@ -1418,6 +1418,34 @@ uint32_t pcid_to_chip_id(uint32_t proc_chip_id) return (uint32_t)-1; } +uint32_t pcid_to_topology_idx(uint32_t proc_chip_id) +{ + unsigned int i; + const void *hdif; + + /* First, try the proc_chip ntuples for chip data */ + for_each_ntuple_idx(&spira.ntuples.proc_chip, hdif, i, + SPPCRD_HDIF_SIG) { + const struct sppcrd_chip_info *cinfo; + + cinfo = HDIF_get_idata(hdif, SPPCRD_IDATA_CHIP_INFO, NULL); + if (!CHECK_SPPTR(cinfo)) { + prerror("XSCOM: Bad ChipID data %d\n", i); + continue; + } + if (proc_chip_id == be32_to_cpu(cinfo->proc_chip_id)) { + if (proc_gen <= proc_gen_p9) + return get_xscom_id(cinfo); + else + return ((u32)cinfo->topology_id_table[cinfo->primary_topology_loc]); + } + } + + /* Not found, what to do ? Assert ? 
For now return a number + * guaranteed to not exist + */ + return (uint32_t)-1; +} /* Create '/ibm,opal/led' node */ static void dt_init_led_node(void) { diff --git a/hw/phys-map.c b/hw/phys-map.c index 2c4d8e45f..194e4953d 100644 --- a/hw/phys-map.c +++ b/hw/phys-map.c @@ -277,7 +277,7 @@ static inline bool phys_map_entry_null(const struct phys_map_entry *e) /* This crashes skiboot on error as any bad calls here are almost * certainly a developer error */ -void phys_map_get(uint64_t gcid, enum phys_map_type type, +void __phys_map_get(uint64_t topology_idx, uint64_t gcid, enum phys_map_type type, int index, uint64_t *addr, uint64_t *size) { const struct phys_map_entry *e; uint64_t a; @@ -302,7 +302,7 @@ void phys_map_get(uint64_t gcid, enum phys_map_type type, break; } a = e->addr; - a += gcid << phys_map->chip_select_shift; + a += topology_idx << (phys_map->chip_select_shift); if (addr) *addr = a; @@ -322,6 +322,20 @@ error: assert(0); } +void phys_map_get(uint64_t gcid, enum phys_map_type type, + int index, uint64_t *addr, uint64_t *size) +{ + struct proc_chip *chip; + uint64_t topology_idx = gcid; + + if (proc_gen >= proc_gen_p10) { + chip = get_chip(gcid); + topology_idx = chip->primary_topology; + } + + return __phys_map_get(topology_idx, gcid, type, index, addr, size); +} + void phys_map_init(unsigned long pvr) { const char *name = "unused"; diff --git a/hw/test/phys-map-test.c b/hw/test/phys-map-test.c index 2aabdb826..aa5b7339a 100644 --- a/hw/test/phys-map-test.c +++ b/hw/test/phys-map-test.c @@ -79,6 +79,11 @@ static inline bool map_call_entry_null(const struct map_call_entry *t) /* Pick a chip ID, any ID. */ #define FAKE_CHIP_ID 8 +struct proc_chip *get_chip(uint32_t chip_id __unused) +{ + return NULL; +} + static void check_map_call(void) { uint64_t start, size, end; @@ -98,7 +103,7 @@ static void check_map_call(void) /* Loop over table entries ... */ for (e = phys_map->table; !phys_map_entry_null(e); e++) { - phys_map_get(FAKE_CHIP_ID, e->type, e->index, &start, &size); + __phys_map_get(FAKE_CHIP_ID, FAKE_CHIP_ID, e->type, e->index, &start, &size); /* Check for alignment */ if ((e->type != SYSTEM_MEM) && (e->type != RESV)) { diff --git a/include/phys-map.h b/include/phys-map.h index ae7a4aa55..97351a720 100644 --- a/include/phys-map.h +++ b/include/phys-map.h @@ -48,6 +48,9 @@ enum phys_map_type { extern void phys_map_get(uint64_t gcid, enum phys_map_type type, int index, uint64_t *addr, uint64_t *size); +extern void __phys_map_get(uint64_t topology_idx, uint64_t gcid, + enum phys_map_type type, int index, uint64_t *addr, uint64_t *size); + extern void phys_map_init(unsigned long pvr); #endif /* __PHYS_MAP_H */ -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:57 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:57 +0530 Subject: [Skiboot] [PATCH v2 19/59] hdata/iohub: Read PCI Gen5 equalization settings for P10 In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-20-hegdevasant@linux.vnet.ibm.com> From: Frederic Barrat The HDAT spec added fields to define the equalization settings for the PCI Gen5 link. The format is the same as for PCI Gen4, so we just need to add extra fields to the "ibm,lane-eq" property in the device tree.
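[Editor's note] The repacking done in add_pec_stack() (see the diff below) can be shown standalone: HDAT stores 2 bytes per lane for 16 lanes, the hardware wants 1 byte per lane, so every other byte is kept, now over 32 entries to cover both the Gen4 and Gen5 blocks:

/* Hedged sketch of the in-place repack from the diff below.
 * eq[4..11] holds 2 bytes per lane as read from HDAT; afterwards the
 * first 32 bytes hold 1 byte per lane (Gen4 block, then Gen5). */
uint64_t eq[12];
uint8_t *ptr = (uint8_t *)&eq[4];
int i;

for (i = 0; i < 32; i++)
	ptr[i] = ptr[2 * i];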
Signed-off-by: Frederic Barrat Signed-off-by: Vasant Hegde --- hdata/iohub.c | 27 ++++++++++++++++++--------- hdata/spira.h | 8 +++++++- 2 files changed, 25 insertions(+), 10 deletions(-) diff --git a/hdata/iohub.c b/hdata/iohub.c index fb215e1fb..92df48b8f 100644 --- a/hdata/iohub.c +++ b/hdata/iohub.c @@ -152,8 +152,8 @@ static struct dt_node *add_pec_stack(const struct cechub_io_hub *hub, { struct dt_node *stack; const char *compat; - u64 eq[8]; - u8 *gen4; + u64 eq[12]; + u8 *ptr; int i; stack = dt_new_addr(pbcq, "stack", stack_index); @@ -181,18 +181,27 @@ static struct dt_node *add_pec_stack(const struct cechub_io_hub *hub, eq[i] = be64_to_cpu(hub->phb_lane_eq[phb_index][i]); for (i = 0; i < 4; i++) /* gen 4 eq settings */ eq[i+4] = be64_to_cpu(hub->phb4_lane_eq[phb_index][i]); + for (i = 0; i < 4; i++) /* gen 5 eq settings */ + eq[i+8] = be64_to_cpu(hub->phb5_lane_eq[phb_index][i]); /* Lane-eq settings are packed 2 bytes per lane for 16 lanes - * On P9 DD2, 1 byte per lane is used in the hardware + * On P9 DD2 and P10, 1 byte per lane is used in the hardware */ - /* Repack 2 byte lane settings into 1 byte */ - gen4 = (u8 *)&eq[4]; - for (i = 0; i < 16; i++) - gen4[i] = gen4[2*i]; + /* Repack 2 byte lane settings into 1 byte for gen 4 & 5 */ + ptr = (u8 *)&eq[4]; + for (i = 0; i < 32; i++) + ptr[i] = ptr[2*i]; - dt_add_property_u64s(stack, "ibm,lane-eq", eq[0], eq[1], - eq[2], eq[3], eq[4], eq[5]); + if (proc_gen == proc_gen_p9) + dt_add_property_u64s(stack, "ibm,lane-eq", + eq[0], eq[1], eq[2], eq[3], + eq[4], eq[5]); + else + dt_add_property_u64s(stack, "ibm,lane-eq", + eq[0], eq[1], eq[2], eq[3], + eq[4], eq[5], + eq[6], eq[7]); return stack; } diff --git a/hdata/spira.h b/hdata/spira.h index 3a8a31e1a..7fcf5c302 100644 --- a/hdata/spira.h +++ b/hdata/spira.h @@ -706,8 +706,11 @@ struct cechub_io_hub { /* HDAT >= v9.x, HDIF version 0x6A adds phb_lane_eq with four * words per PHB (4 PHBs). * - * HDAT >= 10.x, HDIF version 0x7A adds space for another two + * HDAT >= 10.x, HDIF version 0x7A adds space for another * two PHBs (6 total) and the gen4 EQ values. + * + * HDAT >= 10.5x, HDIF version 0x8B adds space for the + * gen5 EQ values. 
*/ struct { /* Gen 3 PHB eq values, 6 PHBs */ @@ -715,6 +718,9 @@ struct cechub_io_hub { /* Gen 4 PHB eq values */ __be64 phb4_lane_eq[6][4]; + + /* Gen 5 PHB eq values */ + __be64 phb5_lane_eq[6][4]; }; }; } __packed; -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:58 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:58 +0530 Subject: [Skiboot] [PATCH v2 20/59] prd: Add base P10 support In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-21-hegdevasant@linux.vnet.ibm.com> From: Oliver O'Halloran Signed-off-by: Oliver O'Halloran Signed-off-by: Vasant Hegde --- hw/prd.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/hw/prd.c b/hw/prd.c index 761d0a42b..45d765457 100644 --- a/hw/prd.c +++ b/hw/prd.c @@ -740,6 +740,11 @@ void prd_init(void) prd_ipoll_status_reg = PRD_P9_IPOLL_REG_STATUS; prd_ipoll_mask = PRD_P9_IPOLL_MASK; break; + case proc_gen_p10: /* IPOLL regs are the same for p9 and p10 */ + prd_ipoll_mask_reg = PRD_P9_IPOLL_REG_MASK; + prd_ipoll_status_reg = PRD_P9_IPOLL_REG_STATUS; + prd_ipoll_mask = PRD_P9_IPOLL_MASK; + break; default: assert(0); } -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:20:59 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:50:59 +0530 Subject: [Skiboot] [PATCH v2 21/59] hw/phys-map/p10: Add P10 MMIO map In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-22-hegdevasant@linux.vnet.ibm.com> From: Alistair Popple Adds a phys map for P10 based on the MMIO spreadsheet. Also updates the phys map test to take a parameter which selects which map to test. - Introduce new BAR for the PC subengine of XIVE2 On P10, the NVP (Process) and NVG (Group) pages share the MMIO range. The even page gives access to the NVP structure and the odd page to the NVG structure. OPAL only uses the NVP. - Introduce new BARs for the VC subengine of XIVE2 On P10, the source ESB pages and END ESB pages have now their own MMIO range. - Increase the MMIO range for the END ESB pages The range was increased to 2TB to be able to address more END entries. We now have a maximum of 16M entries per chip. The END and ESB ranges are reordered for alignment. 
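[Editor's note] As a usage sketch against the new table (assuming a chip whose primary topology index is 2): the per-chip XSCOM region, whose table base is 0x000603fc00000000 below, lands at base + (2 << 44) = 0x000623fc00000000 for that chip:

/* Hedged example: fetch the XSCOM range for a chip. On P10,
 * phys_map_get() resolves the chip ID to its primary topology index
 * before applying the 44-bit chip-select shift. */
uint64_t addr, size;
phys_map_get(chip_id, XSCOM, 0, &addr, &size);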
Signed-off-by: Alistair Popple Signed-off-by: C?dric Le Goater [Folded Cedric's patches - Vasant] Signed-off-by: Vasant Hegde --- hw/phys-map.c | 87 ++++++++++++++++++++++++++++++++++++++++- hw/test/phys-map-test.c | 18 +++++++-- include/phys-map.h | 6 ++- 3 files changed, 106 insertions(+), 5 deletions(-) diff --git a/hw/phys-map.c b/hw/phys-map.c index 194e4953d..b8fff0a4f 100644 --- a/hw/phys-map.c +++ b/hw/phys-map.c @@ -26,6 +26,84 @@ struct phys_map_info { static const struct phys_map_info *phys_map; +static const struct phys_map_entry phys_map_table_p10[] = { + /* System memory upto 4TB minus GPU memory */ + { SYSTEM_MEM, 0, 0x0000000000000000ull, 0x0000034000000000ull }, + + /* TODO: Figure out GPU memory */ + + /* 0 TB offset @ MMIO 0x0006000000000000ull */ + { PHB4_64BIT_MMIO, 0, 0x0006000000000000ull, 0x0000004000000000ull }, + { PHB4_64BIT_MMIO, 1, 0x0006004000000000ull, 0x0000004000000000ull }, + { PHB4_64BIT_MMIO, 2, 0x0006008000000000ull, 0x0000004000000000ull }, + { PHB4_32BIT_MMIO, 0, 0x000600c000000000ull, 0x0000000080000000ull }, + { PHB4_32BIT_MMIO, 1, 0x000600c080000000ull, 0x0000000080000000ull }, + { PHB4_32BIT_MMIO, 2, 0x000600c100000000ull, 0x0000000080000000ull }, + { PHB4_32BIT_MMIO, 3, 0x000600c180000000ull, 0x0000000080000000ull }, + { PHB4_32BIT_MMIO, 4, 0x000600c200000000ull, 0x0000000080000000ull }, + { PHB4_32BIT_MMIO, 5, 0x000600c280000000ull, 0x0000000080000000ull }, + { PHB4_XIVE_ESB , 0, 0x000600c300000000ull, 0x0000000020000000ull }, + { PHB4_XIVE_ESB , 1, 0x000600c320000000ull, 0x0000000020000000ull }, + { PHB4_XIVE_ESB , 2, 0x000600c340000000ull, 0x0000000020000000ull }, + { PHB4_XIVE_ESB , 3, 0x000600c360000000ull, 0x0000000020000000ull }, + { PHB4_XIVE_ESB , 4, 0x000600c380000000ull, 0x0000000020000000ull }, + { PHB4_XIVE_ESB , 5, 0x000600c3a0000000ull, 0x0000000020000000ull }, + { PHB4_REG_SPC , 0, 0x000600c3c0000000ull, 0x0000000000100000ull }, + { PHB4_REG_SPC , 1, 0x000600c3c0100000ull, 0x0000000000100000ull }, + { PHB4_REG_SPC , 2, 0x000600c3c0200000ull, 0x0000000000100000ull }, + { PHB4_REG_SPC , 3, 0x000600c3c0300000ull, 0x0000000000100000ull }, + { PHB4_REG_SPC , 4, 0x000600c3c0400000ull, 0x0000000000100000ull }, + { PHB4_REG_SPC , 5, 0x000600c3c0500000ull, 0x0000000000100000ull }, + { RESV , 0, 0x000600c3c0600000ull, 0x0000003c3fa00000ull }, + + /* 1 TB offset */ + { RESV , 1, 0x0006010000000000ull, 0x0000010000000000ull }, + + /* 2 TB offset */ + { PHB4_64BIT_MMIO, 3, 0x0006020000000000ull, 0x0000004000000000ull }, + { PHB4_64BIT_MMIO, 4, 0x0006024000000000ull, 0x0000004000000000ull }, + { PHB4_64BIT_MMIO, 5, 0x0006028000000000ull, 0x0000004000000000ull }, + { RESV , 2, 0x000602c000000000ull, 0x0000004000000000ull }, + + /* 3 TB offset */ + { LPC_BUS , 0, 0x0006030000000000ull, 0x0000000100000000ull }, + { FSP_MMIO , 0, 0x0006030100000000ull, 0x0000000100000000ull }, + { XIVE_IC , 0, 0x0006030200000000ull, 0x0000000002000000ull }, + { PSIHB_ESB , 0, 0x0006030202000000ull, 0x0000000000100000ull }, + { RESV , 3, 0x0006030202100000ull, 0x0000000000f00000ull }, + { PSIHB_REG , 0, 0x0006030203000000ull, 0x0000000000100000ull }, + { RESV , 4, 0x0006030203100000ull, 0x0000000000080000ull }, + { XIVE_TM , 0, 0x0006030203180000ull, 0x0000000000040000ull }, + { RESV , 5, 0x00060302031c0000ull, 0x0000000000010000ull }, + { NX_RNG , 0, 0x00060302031d0000ull, 0x0000000000010000ull }, + { RESV , 6, 0x00060302031e0000ull, 0x0000000004e20000ull }, + { XIVE_NVC , 0, 0x0006030208000000ull, 0x0000000008000000ull }, + { RESV , 7, 0x0006030210000000ull, 
0x00000000ee000000ull }, + { VAS_HYP_WIN , 0, 0x00060302fe000000ull, 0x0000000002000000ull }, + { VAS_USER_WIN , 0, 0x0006030300000000ull, 0x0000000100000000ull }, + + /* TODO: MC, OCMB, PAU */ + { RESV , 8, 0x0006030400000000ull, 0x000000f800000000ull }, + { XSCOM , 0, 0x000603fc00000000ull, 0x0000000400000000ull }, + + /* 4 TB offset */ + { XIVE_NVPG , 0, 0x0006040000000000ull, 0x0000010000000000ull }, + + /* 5 - 7 TB offset */ + /* for P10 the END and ESB regions are separate in the MMIO + * table */ + { XIVE_ESB , 0, 0x0006050000000000ull, 0x0000010000000000ull }, + { XIVE_END , 0, 0x0006060000000000ull, 0x0000020000000000ull }, + + /* 8 - 13 TB offset */ + { RESV , 9, 0x0006080000000000ull, 0x0000060000000000ull }, + + /* 14 TB offset */ + { RESV ,10, 0x00060e0000000000ull, 0x0000008000000000ull }, + + { NULL_MAP, 0, 0, 0 }, +}; + static const struct phys_map_entry phys_map_table_nimbus[] = { /* System memory upto 4TB minus GPU memory */ @@ -266,6 +344,11 @@ static const struct phys_map_info phys_map_axone = { .table = phys_map_table_axone, }; +static const struct phys_map_info phys_map_p10 = { + .chip_select_shift = 44, + .table = phys_map_table_p10, +}; + static inline bool phys_map_entry_null(const struct phys_map_entry *e) { if (e->type == NULL_MAP) @@ -352,9 +435,11 @@ void phys_map_init(unsigned long pvr) name = "nimbus"; phys_map = &phys_map_nimbus; } + } else if (proc_gen == proc_gen_p10) { + name = "p10"; + phys_map = &phys_map_p10; } prlog(PR_DEBUG, "Assigning physical memory map table for %s\n", name); } - diff --git a/hw/test/phys-map-test.c b/hw/test/phys-map-test.c index aa5b7339a..d507175fe 100644 --- a/hw/test/phys-map-test.c +++ b/hw/test/phys-map-test.c @@ -172,14 +172,26 @@ static void check_map_call(void) unsigned long fake_pvr[] = { 0x004e0200, /* PVR_P9 */ 0x004f0100, /* PVR_P9P */ + 0x00800100, /* PVR_P10 */ }; int main(void) { - /* Fake we are POWER9 */ - proc_gen = proc_gen_p9; - for (int i = 0; i < ARRAY_SIZE(fake_pvr); i++) { + switch(PVR_TYPE(fake_pvr[i])) { + case PVR_TYPE_P9: + case PVR_TYPE_P9P: + proc_gen = proc_gen_p9; + break; + case PVR_TYPE_P10: + proc_gen = proc_gen_p10; + break; + default: + printf("Unknown PVR 0x%lx\n", fake_pvr[i]); + return 1; + break; + } + phys_map_init(fake_pvr[i]); /* Run tests */ diff --git a/include/phys-map.h b/include/phys-map.h index 97351a720..a3394c0d0 100644 --- a/include/phys-map.h +++ b/include/phys-map.h @@ -42,7 +42,11 @@ enum phys_map_type { MC_OCMB_CFG, MC_OCMB_MMIO, XSCOM, - RESV + RESV, + XIVE_NVC, + XIVE_NVPG, + XIVE_ESB, + XIVE_END, }; extern void phys_map_get(uint64_t gcid, enum phys_map_type type, -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:00 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:00 +0530 Subject: [Skiboot] [PATCH v2 22/59] VAS: Define Remote Memory Access paste address on P10 In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-23-hegdevasant@linux.vnet.ibm.com> From: Haren Myneni Paste base address format is changed on p10. Instead of node/chip IDs, Primary topology index is used to define paste base address. Also RA(11) bit is used to define the foreign address. Changes to define the paste base address for each VAS engine with the new format on P10. 
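[Editor's note] As a cross-check of the new format, here is a hedged sketch that recomputes the worked example documented in the diff below (window id 4, report enable, topology index 0), using skiboot's big-endian bit helpers (bit 0 is the MSB):

/* Hedged sketch: expected result is 0x0010000000040400. */
uint64_t v = 0;
v = SETFIELD(PPC_BITMASK(8, 11), v, 1);		/* foreign address enable */
v = SETFIELD(PPC_BITMASK(15, 19), v, 0);	/* topology index 0 */
v = SETFIELD(PPC_BITMASK(32, 47), v, 4);	/* send window id 4 */
v = SETFIELD(PPC_BITMASK(53, 53), v, 1);	/* report enable (NX) */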
Signed-off-by: Haren Myneni Signed-off-by: Vasant Hegde --- hw/vas.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 73 insertions(+), 3 deletions(-) diff --git a/hw/vas.c b/hw/vas.c index 393ad801e..c9639831f 100644 --- a/hw/vas.c +++ b/hw/vas.c @@ -160,6 +160,11 @@ static void reset_fir(struct proc_chip *chip) #define P9_RMA_LSMP_64K_SYS_ID PPC_BITMASK(8, 12) #define P9_RMA_LSMP_64K_NODE_ID PPC_BITMASK(15, 18) #define P9_RMA_LSMP_64K_CHIP_ID PPC_BITMASK(19, 21) + +/* Paste base address format (on P10 or later) */ +#define RMA_FOREIGN_ADDR_ENABLE PPC_BITMASK(8, 11) +#define RMA_TOPOLOGY_INDEX PPC_BITMASK(15, 19) + #define RMA_LSMP_WINID_START_BIT 32 #define RMA_LSMP_WINID_NUM_BITS 16 @@ -221,6 +226,59 @@ static void p9_get_rma_bar(int chipid, uint64_t *val) *val = v; } +/* + * The start/base of the paste BAR is computed using the tables 1.1 through + * 1.3 in Section 1.3.3.1 (Send Message w/Paste Commands (cl_rma_w)) of VAS + * P10 Workbook. + * + * With 64K mode and Large SMP Mode the bits are used as follows: + * + * Bits Values Comments + * -------------------------------------- + * 0:7 0b 0000_0000 Reserved + * 8:11 0b 0001 Foreign Address Enable + * 12 0b 0 SMF + * 13:14 0b 00 Memory Select + * + * 15:19 0 throuh 16 Topology Index + * 20:23 0b 0000 Chip Internal Address + * + * 24:31 0b 0000_0000 RPN 0:7, Reserved + * 32:47 0 through 64K Send Window Id + * 48:51 0b 0000 Spare + * + * 52 0b 0 Reserved + * 53 0b 1 Report Enable (Set to 1 for NX). + * 54 0b 0 Reserved + * + * 55:56 0b 00 Snoop Bus + * 57:63 0b 0000_000 Reserved + * + * Example: For Node 0, Chip 0, Window id 4, Report Enable 1: + * + * Byte0 Byte1 Byte2 Byte3 Byte4 Byte5 Byte6 Byte7 + * 00000000 00010000 00000000 00000000 00000000 00000100 00000100 00000000 + * | | | | | + * +---+ +-------+-------+ v + * | | Report Enable + * v v + * Topology Index Window id 4 + * + * Thus the paste address for window id 4 is 0x00100000_00040400 and + * the _base_ paste address for Node 0 Chip 0 is 0x00100000_00000000. + */ + +static void get_rma_bar(struct proc_chip *chip, uint64_t *val) +{ + uint64_t v; + + v = 0ULL; + v = SETFIELD(RMA_FOREIGN_ADDR_ENABLE, v, 1); + v = SETFIELD(RMA_TOPOLOGY_INDEX, v, chip->primary_topology); + + *val = v; +} + /* * Initialize RMA BAR on this chip to correspond to its node/chip id. * This will cause VAS to accept paste commands to targeted for this chip. 
@@ -231,7 +289,10 @@ static int init_rma(struct proc_chip *chip) int rc; uint64_t val; - p9_get_rma_bar(chip->id, &val); + if (proc_gen == proc_gen_p9) + p9_get_rma_bar(chip->id, &val); + else + get_rma_bar(chip, &val); rc = vas_scom_write(chip, VAS_RMA_BAR, val); if (rc) @@ -271,9 +332,18 @@ static int init_rma(struct proc_chip *chip) static inline void get_paste_bar(int chipid, uint64_t *start, uint64_t *len) { + struct proc_chip *chip; uint64_t val; - p9_get_rma_bar(chipid, &val); + if (proc_gen == proc_gen_p9) + p9_get_rma_bar(chipid, &val); + else { + chip = get_chip(chipid); + if (!chip) + return; + + get_rma_bar(chip, &val); + } *start = val; *len = VAS_PASTE_BAR_LEN; @@ -394,8 +464,8 @@ static void create_mm_dt_node(struct proc_chip *chip) struct vas *vas; uint64_t hvwc_start, hvwc_len; uint64_t uwc_start, uwc_len; - uint64_t pbar_start, pbar_len; uint64_t pbf_start, pbf_nbits; + uint64_t pbar_start = 0, pbar_len = 0; vas = chip->vas; get_hvwc_mmio_bar(chip->id, &hvwc_start, &hvwc_len); -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:01 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:01 +0530 Subject: [Skiboot] [PATCH v2 23/59] VAS: Enable VAS on P10 In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-24-hegdevasant@linux.vnet.ibm.com> From: Haren Myneni Enable VAS on P10 based on the "ibm,power10-vas-x" compatible string and export the new compatible property to the kernel. Also do not set foreign address enable for the VAS/NX RMA BAR. From section 1.3.3.1 in the VAS workbook, RA(0:12) = 0's for the VAS/NX RMA BAR. It means the foreign address enable bit (RA(11)) should be 0 for the RMA BAR. But this bit has to be set for the paste base address, which is used for COPY/PASTE. Signed-off-by: Haren Myneni Signed-off-by: Vasant Hegde --- hw/vas.c | 29 ++++++++++++++++++++++++----- 1 file changed, 24 insertions(+), 5 deletions(-) diff --git a/hw/vas.c b/hw/vas.c index c9639831f..274008665 100644 --- a/hw/vas.c +++ b/hw/vas.c @@ -266,6 +266,9 @@ static void p9_get_rma_bar(int chipid, uint64_t *val) * * Thus the paste address for window id 4 is 0x00100000_00040400 and * the _base_ paste address for Node 0 Chip 0 is 0x00100000_00000000. + * + * Note: Bit 11 (Foreign Address Enable) is set only for paste base address. + * Not for VAS/NX RMA BAR. RA(0:12) = 0 for VAS/NX RMA BAR.
+ */ + val = SETFIELD(RMA_FOREIGN_ADDR_ENABLE, val, 1); } *start = val; @@ -462,6 +470,7 @@ static void create_mm_dt_node(struct proc_chip *chip) { struct dt_node *dn; struct vas *vas; + const char *compat; uint64_t hvwc_start, hvwc_len; uint64_t uwc_start, uwc_len; uint64_t pbf_start, pbf_nbits; @@ -473,9 +482,14 @@ static void create_mm_dt_node(struct proc_chip *chip) get_paste_bar(chip->id, &pbar_start, &pbar_len); get_paste_bitfield(&pbf_start, &pbf_nbits); + if (proc_gen == proc_gen_p9) + compat = "ibm,power9-vas"; + else + compat = "ibm,power10-vas"; + dn = dt_new_addr(dt_root, "vas", hvwc_start); - dt_add_property_strings(dn, "compatible", "ibm,power9-vas", + dt_add_property_strings(dn, "compatible", compat, "ibm,vas"); dt_add_property_u64s(dn, "reg", hvwc_start, hvwc_len, @@ -579,13 +593,18 @@ void vas_init(void) { bool enabled; struct dt_node *np; + const char *compat; - if (proc_gen != proc_gen_p9) + if (proc_gen == proc_gen_p9) + compat = "ibm,power9-vas-x"; + else if (proc_gen == proc_gen_p10) + compat = "ibm,power10-vas-x"; + else return; enabled = vas_nx_enabled(); - dt_for_each_compatible(dt_root, np, "ibm,power9-vas-x") { + dt_for_each_compatible(dt_root, np, compat) { if (init_vas_inst(np, enabled)) goto out; } @@ -594,7 +613,7 @@ void vas_init(void) return; out: - dt_for_each_compatible(dt_root, np, "ibm,power9-vas-x") + dt_for_each_compatible(dt_root, np, compat) disable_vas_inst(np); vas_err("Disabled (failed initialization)\n"); -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:03 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:03 +0530 Subject: [Skiboot] [PATCH v2 25/59] hw/nx: Enable p10 DARN In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-26-hegdevasant@linux.vnet.ibm.com> From: Ryan Grimm Init and enable NCU DARN BAR on sibling cores as well for fused core mode. 
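[Editor's note] Once the NCU DARN BAR is enabled, the OS-visible effect is that the darn instruction returns random numbers. A minimal consumer sketch (the instruction and its L operand are architected in ISA 3.0; the helper is illustrative, and older assemblers may need a raw opcode macro instead of the mnemonic):

/* Hedged sketch: read a conditioned 64-bit random number.
 * L=1 selects the conditioned 64-bit form; all-ones means "retry". */
static inline uint64_t darn64(void)
{
	uint64_t r;

	asm volatile("darn %0,1" : "=r"(r));
	return r;
}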
Signed-off-by: Ryan Grimm Signed-off-by: Vaidyanathan Srinivasan [Folded Vaidy's fused core support fix - Vasant] Signed-off-by: Vasant Hegde --- hw/nx.c | 29 ++++++++++++++++++++++------- 1 file changed, 22 insertions(+), 7 deletions(-) diff --git a/hw/nx.c b/hw/nx.c index 122048087..fdadf53c7 100644 --- a/hw/nx.c +++ b/hw/nx.c @@ -12,11 +12,12 @@ #include #include #include +#include #include #include #include -static void p9_darn_init(void) +static void darn_init(void) { struct dt_node *nx; struct proc_chip *chip; @@ -45,11 +46,25 @@ static void p9_darn_init(void) for_each_available_core_in_chip(c, chip->id) { uint64_t addr; - addr = XSCOM_ADDR_P9_EX(pir_to_core_id(c->pir), + + if (proc_gen == proc_gen_p9) { + addr = XSCOM_ADDR_P9_EX(pir_to_core_id(c->pir), P9X_EX_NCU_DARN_BAR); - xscom_write(chip->id, addr, + xscom_write(chip->id, addr, bar | P9X_EX_NCU_DARN_BAR_EN); - + } else if (proc_gen >= proc_gen_p10) { + addr = XSCOM_ADDR_P10_NCU(pir_to_core_id(c->pir), + P10_NCU_DARN_BAR); + xscom_write(chip->id, addr, + bar | P10_NCU_DARN_BAR_EN); + /* Init for sibling core also */ + if (c->is_fused_core) { + addr = XSCOM_ADDR_P10_NCU(pir_to_core_id(c->pir + 1), + P10_NCU_DARN_BAR); + xscom_write(chip->id, addr, + bar | P10_NCU_DARN_BAR_EN); + } + } } } } @@ -59,7 +74,7 @@ void nx_p9_rng_late_init(void) struct cpu_thread *c; uint64_t rc; - if (proc_gen != proc_gen_p9) + if (proc_gen < proc_gen_p9) return; if (chip_quirk(QUIRK_NO_RNG)) return; @@ -118,6 +133,6 @@ void nx_init(void) nx_init_one(node); } - if (proc_gen == proc_gen_p9) - p9_darn_init(); + if (proc_gen >= proc_gen_p9) + darn_init(); } -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:02 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:02 +0530 Subject: [Skiboot] [PATCH v2 24/59] NX: Set VAS RMA write BAR register on P10 In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-25-hegdevasant@linux.vnet.ibm.com> From: Haren Myneni For each NX instance, VAS RMA write BAR register should be set with the corresponding VAS RMA BAR value. Refer section: 5.30 VAS RMA write BAR (P10 NX work Book V1.01) Signed-off-by: Haren Myneni Signed-off-by: Vasant Hegde --- hw/nx-compress.c | 36 ++++++++++++++++++++++++++++++++++++ hw/vas.c | 18 ++++++++++++++++++ include/nx.h | 3 +++ include/vas.h | 1 + 4 files changed, 58 insertions(+) diff --git a/hw/nx-compress.c b/hw/nx-compress.c index b2302866b..9b3c6717d 100644 --- a/hw/nx-compress.c +++ b/hw/nx-compress.c @@ -115,6 +115,30 @@ static int nx_cfg_umac_status_ctrl(u32 gcid, u64 xcfg) return rc; } +static int nx_cfg_vas_rma_bar(u32 gcid, u64 xcfg) +{ + int rc = 0; + u64 cfg; + + cfg = vas_get_rma_bar(gcid); + /* + * NOTE: Write the entire bar address to SCOM. VAS/NX will extract + * the relevant (NX_P10_VAS_RMA_WRITE_BAR) bits. 
IOW, _don't_ + * just write the bit field like: + * cfg = SETFIELD(NX_P10_VAS_RMA_WRITE_BAR, 0ULL, cfg); + */ + rc = xscom_write(gcid, xcfg, cfg); + + if (rc) + prerror("NX%d: ERROR: VAS RMA WRITE BAR, %d\n", gcid, rc); + else + prlog(PR_DEBUG, "NX%d: VAS RMA WRITE BAR, 0x%016lx, " + "xcfg 0x%llx\n", gcid, (unsigned long)cfg, + xcfg); + + return rc; +} + int nx_cfg_rx_fifo(struct dt_node *node, const char *compat, const char *priority, u32 gcid, u32 pid, u32 tid, u64 umac_bar, u64 umac_notify) @@ -272,6 +296,10 @@ void nx_create_compress_node(struct dt_node *node) prlog(PR_INFO, "NX%d: 842 at 0x%x\n", gcid, pb_base); + /* + * ibm,power9-nx is compatible on P10. So using same + * compatible string. + */ if (dt_node_is_compatible(node, "ibm,power9-nx")) { u64 cfg_mmio, cfg_txwc, cfg_uctrl, cfg_dma; @@ -297,6 +325,14 @@ void nx_create_compress_node(struct dt_node *node) if (rc) return; + if (proc_gen > proc_gen_p9) { + u64 cfg_rma = pb_base + NX_P10_VAS_RMA_WRITE_BAR; + + rc = nx_cfg_vas_rma_bar(gcid, cfg_rma); + if (rc) + return; + } + p9_nx_enable_842(node, gcid, pb_base); p9_nx_enable_gzip(node, gcid, pb_base); } else diff --git a/hw/vas.c b/hw/vas.c index 274008665..0dbe0bcda 100644 --- a/hw/vas.c +++ b/hw/vas.c @@ -281,6 +281,24 @@ static void get_rma_bar(struct proc_chip *chip, uint64_t *val) *val = v; } +/* Interface for NX - make sure VAS is fully initialized first */ +__attrconst uint64_t vas_get_rma_bar(int chipid) +{ + struct proc_chip *chip; + uint64_t addr; + + if (!vas_initialized) + return 0ULL; + + chip = get_chip(chipid); + if (!chip) + return 0ULL; + + get_rma_bar(chip, &addr); + + return addr; +} + /* * Initialize RMA BAR on this chip to correspond to its node/chip id. * This will cause VAS to accept paste commands to targeted for this chip. diff --git a/include/nx.h b/include/nx.h index 5734e24a3..c42d165e9 100644 --- a/include/nx.h +++ b/include/nx.h @@ -141,6 +141,9 @@ #define NX_P9_ERAT_STATUS_CTRL NX_P9_SAT(0x3, 0x16) +/* Introduced in P10, but P10 NX SCOM address is same as P9 */ +#define NX_P10_VAS_RMA_WRITE_BAR NX_P9_SAT(0x3, 0x19) + /* NX Status Register */ #define NX_P8_STATUS NX_P8_SAT(0x1, 0x00) #define NX_P9_STATUS NX_P9_SAT(0x1, 0x00) /* DMA Status register */ diff --git a/include/vas.h b/include/vas.h index 369c3807a..1f59b1d9c 100644 --- a/include/vas.h +++ b/include/vas.h @@ -27,6 +27,7 @@ extern void vas_init(void); extern __attrconst bool vas_nx_enabled(void); extern __attrconst uint64_t vas_get_hvwc_mmio_bar(const int chipid); extern __attrconst uint64_t vas_get_wcbs_bar(int chipid); +extern __attrconst uint64_t vas_get_rma_bar(int chipid); /* * HVWC and UWC BAR. 
-- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:04 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:04 +0530 Subject: [Skiboot] [PATCH v2 26/59] hw/imc: Power10 support In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-27-hegdevasant@linux.vnet.ibm.com> From: Anju T Sudhakar POWER10 IMC support: Add POWER10 scom addresses for IMC Add support for IMC trace-mode Fix the catalog subit for POWER10 Signed-off-by: Anju T Sudhakar Signed-off-by: Madhavan Srinivasan Signed-off-by: Vasant Hegde --- hw/fsp/fsp.c | 5 +++++ hw/imc.c | 61 ++++++++++++++++++++++++++++++++++++++++++--------- include/imc.h | 2 ++ 3 files changed, 58 insertions(+), 10 deletions(-) diff --git a/hw/fsp/fsp.c b/hw/fsp/fsp.c index 70452cf98..2c5f9d71b 100644 --- a/hw/fsp/fsp.c +++ b/hw/fsp/fsp.c @@ -2373,6 +2373,9 @@ int fsp_fetch_data_queue(uint8_t flags, uint16_t id, uint32_t sub_id, #define CAPP_IDX_NIMBUS_DD23 0x203d1 #define IMA_CATALOG_NIMBUS 0x4e0200 +#define IMA_CATALOG_P10_DD1 0x800100 +#define IMA_CATALOG_P10_DD2 0x800200 + static struct { enum resource_id id; @@ -2392,6 +2395,8 @@ static struct { { RESOURCE_ID_CAPP, CAPP_IDX_NIMBUS_DD21, 0x80a02007 }, { RESOURCE_ID_CAPP, CAPP_IDX_NIMBUS_DD22, 0x80a02007 }, { RESOURCE_ID_CAPP, CAPP_IDX_NIMBUS_DD23, 0x80a02007 }, + { RESOURCE_ID_IMA_CATALOG,IMA_CATALOG_P10_DD1, 0x80f00103 }, + { RESOURCE_ID_IMA_CATALOG,IMA_CATALOG_P10_DD2, 0x80f00103 }, }; static void fsp_start_fetching_next_lid(void); diff --git a/hw/imc.c b/hw/imc.c index 7d29ce6f7..cbd68edc4 100644 --- a/hw/imc.c +++ b/hw/imc.c @@ -170,6 +170,20 @@ static unsigned int htm_scom_index_p9[] = { 0x10012700 }; +static unsigned int pdbar_scom_index_p10[] = { + 0x2001868B, + 0x2001468B, + 0x2001268B, + 0x2001168B +}; + +static unsigned int htm_scom_index_p10[] = { + 0x20018680, + 0x20014680, + 0x20012680, + 0x20011680 +}; + static struct imc_chip_cb *get_imc_cb(uint32_t chip_id) { struct proc_chip *chip = get_chip(chip_id); @@ -263,13 +277,23 @@ static bool is_imc_device_type_supported(struct dt_node *node) if (val == IMC_COUNTER_TRACE) { pvr = mfspr(SPR_PVR); - /* - * Trace mode is supported in Nimbus DD2.2 - * and later versions. - */ - if ((chip->type == PROC_CHIP_P9_NIMBUS) && - (PVR_VERS_MAJ(pvr) == 2) && (PVR_VERS_MIN(pvr) >= 2)) + + switch (chip->type) { + case PROC_CHIP_P9_NIMBUS: + /* + * Trace mode is supported in Nimbus DD2.2 + * and later versions. 
+ */ + if ((PVR_VERS_MAJ(pvr) == 2) && + (PVR_VERS_MIN(pvr) >= 2)) + return true; + break; + case PROC_CHIP_P10: return true; + default: + return false; + } + } return false; } @@ -453,8 +477,8 @@ void imc_catalog_preload(void) if (proc_chip_quirks & QUIRK_MAMBO_CALLOUTS) return; - /* Enable only for power 9 */ - if (proc_gen != proc_gen_p9) + /* Enable only for power 9/10 */ + if (proc_gen < proc_gen_p9) return; compress_buf = malloc(MAX_COMPRESSED_IMC_DTB_SIZE); @@ -559,6 +583,17 @@ static int setup_imc_scoms(void) IMC_TRACE_CPMC2SEL_VAL, IMC_TRACE_BUFF_SIZE); return 0; + case proc_gen_p10: + CORE_IMC_EVENT_MASK_ADDR = CORE_IMC_EVENT_MASK_ADDR_P10; + TRACE_IMC_ADDR = TRACE_IMC_ADDR_P10; + pdbar_scom_index = pdbar_scom_index_p10; + htm_scom_index = htm_scom_index_p10; + trace_scom_val = TRACE_IMC_SCOM(IMC_TRACE_CPMC1, + IMC_TRACE_CPMCLOAD_VAL, + IMC_TRACE_CPMC1SEL_VAL, + IMC_TRACE_CPMC2SEL_VAL, + IMC_TRACE_BUFF_SIZE); + return 0; default: prerror("%s: Unknown cpu type\n", __func__); break; @@ -586,8 +621,8 @@ void imc_init(void) goto imc_mambo; } - /* Enable only for power 9 */ - if (proc_gen != proc_gen_p9) + /* Enable only for power 9/10 */ + if (proc_gen < proc_gen_p9) return; if (!imc_xz) @@ -720,6 +755,9 @@ static uint32_t get_imc_scom_addr_for_core(int core, uint64_t addr) case proc_gen_p9: scom_addr = XSCOM_ADDR_P9_EC(core, addr); return scom_addr; + case proc_gen_p10: + scom_addr = XSCOM_ADDR_P10_EC(core, addr); + return scom_addr; default: return 0; } @@ -734,6 +772,9 @@ static uint32_t get_imc_scom_addr_for_quad(int core, uint64_t addr) case proc_gen_p9: scom_addr = XSCOM_ADDR_P9_EQ(core, addr); return scom_addr; + case proc_gen_p10: + scom_addr = XSCOM_ADDR_P10_EQ(core, addr); + return scom_addr; default: return 0; } diff --git a/include/imc.h b/include/imc.h index a446dc581..96f9ec4b6 100644 --- a/include/imc.h +++ b/include/imc.h @@ -110,6 +110,7 @@ struct imc_chip_cb * Core IMC SCOMs */ #define CORE_IMC_EVENT_MASK_ADDR_P9 0x20010AA8ull +#define CORE_IMC_EVENT_MASK_ADDR_P10 0x20020400ull #define CORE_IMC_EVENT_MASK 0x0402010000000000ull #define CORE_IMC_PDBAR_MASK 0x0003ffffffffe000ull #define CORE_IMC_HTM_MODE_ENABLE 0xE800000000000000ull @@ -133,6 +134,7 @@ struct imc_chip_cb * *CPMC1SEL *CPMC2SEL *BUFFERSIZE */ #define TRACE_IMC_ADDR_P9 0x20010AA9ull +#define TRACE_IMC_ADDR_P10 0x20020401ull #define TRACE_IMC_SAMPLESEL(x) ((uint64_t)x << 62) #define TRACE_IMC_CPMC_LOAD(x) ((0xffffffff - (uint64_t)x) << 30) #define TRACE_IMC_CPMC1SEL(x) ((uint64_t)x << 23) -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:05 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:05 +0530 Subject: [Skiboot] [PATCH v2 27/59] platforms/astbmc: Add ast2600 In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-28-hegdevasant@linux.vnet.ibm.com> From: Reza Arbab Signed-off-by: Reza Arbab Signed-off-by: Vasant Hegde --- platforms/astbmc/astbmc.h | 2 ++ platforms/astbmc/common.c | 19 +++++++++++++++++-- 2 files changed, 19 insertions(+), 2 deletions(-) diff --git a/platforms/astbmc/astbmc.h b/platforms/astbmc/astbmc.h index 86631bc4e..00f221230 100644 --- a/platforms/astbmc/astbmc.h +++ b/platforms/astbmc/astbmc.h @@ -87,9 +87,11 @@ static struct slot_table_entry st_name[] = \ extern const struct bmc_hw_config bmc_hw_ast2400; extern const struct bmc_hw_config bmc_hw_ast2500; +extern const struct bmc_hw_config bmc_hw_ast2600; 
extern const struct bmc_platform bmc_plat_ast2400_ami; extern const struct bmc_platform bmc_plat_ast2500_ami; extern const struct bmc_platform bmc_plat_ast2500_openbmc; +extern const struct bmc_platform bmc_plat_ast2600_openbmc; extern void astbmc_early_init(void); extern int64_t astbmc_ipmi_reboot(void); diff --git a/platforms/astbmc/common.c b/platforms/astbmc/common.c index d96e070e5..83ef70ad3 100644 --- a/platforms/astbmc/common.c +++ b/platforms/astbmc/common.c @@ -266,8 +266,9 @@ static void astbmc_fixup_dt_mbox(struct dt_node *lpc) * can indicate they support mbox using the scratch register, or ipmi * by configuring the hiomap ipmi command. If neither are configured * for P8 then skiboot will drive the flash controller directly. + * XXX P10 */ - if (proc_gen != proc_gen_p9 && !ast_scratch_reg_is_mbox()) + if (proc_gen == proc_gen_p8 && !ast_scratch_reg_is_mbox()) return; /* First check if the mbox interface is already there */ @@ -478,7 +479,7 @@ void astbmc_early_init(void) * never MBOX. Thus only populate the MBOX node on P9 to allow * fallback. */ - if (proc_gen == proc_gen_p9) { + if (proc_gen >= proc_gen_p9) { astbmc_fixup_dt_mbox(dt_find_primary_lpc()); ast_setup_sio_mbox(MBOX_IO_BASE, MBOX_LPC_IRQ); } @@ -530,6 +531,14 @@ const struct bmc_hw_config bmc_hw_ast2500 = { .mcr_scu_strap = 0x00000000, }; +/* XXX P10: Update with Rainier values */ +const struct bmc_hw_config bmc_hw_ast2600 = { + .scu_revision_id = 0x05000303, + .mcr_configuration = 0x11200756, + .mcr_scu_mpll = 0x1008405F, + .mcr_scu_strap = 0x000030E0, +}; + const struct bmc_platform bmc_plat_ast2400_ami = { .name = "ast2400:ami", .hw = &bmc_hw_ast2400, @@ -547,3 +556,9 @@ const struct bmc_platform bmc_plat_ast2500_openbmc = { .hw = &bmc_hw_ast2500, .sw = &bmc_sw_openbmc, }; + +const struct bmc_platform bmc_plat_ast2600_openbmc = { + .name = "ast2600:openbmc", + .hw = &bmc_hw_ast2600, + .sw = &bmc_sw_openbmc, +}; -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:06 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:06 +0530 Subject: [Skiboot] [PATCH v2 28/59] platforms: Add Rainier In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-29-hegdevasant@linux.vnet.ibm.com> From: Alistair Popple Rainier comes in two variants: 4U and 2U. PCIe slot power on from oohall with multi-socket support from fbarrat: On Rainier the PCIe slots have individual slot power controllers. These need to be enabled at boot so that we can scan the devices in the PHB root ports. This should really be integrated into the OPAL slot power control framework that was used for PCIe Hotplug support on Firenze (P8 FSP systems). Unfortunately, the way that is implemented is difficult to extend at best and needs to be refactored before we can add support for runtime power control on Rainier.
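[Editor's note] The init code below leaves a FIXME where PGOOD should be polled before the slots are released. For illustration only, a hedged sketch of such a poll, assuming a hypothetical smbus_read8() counterpart to the patch's smbus_write8(), and a status register and PGOOD bit that would have to be taken from the power controller's datasheet:

/* Hedged sketch only: smbus_read8(), register 0xA and the PGOOD bit
 * are hypothetical placeholders. */
static int64_t slot_power_wait_pgood(struct i2c_bus *bus)
{
	uint8_t status;
	int i;

	for (i = 0; i < 100; i++) {		/* ~1s in 10ms steps */
		if (smbus_read8(bus, 0xA, &status))
			return -1;
		if (status & 0x01)		/* PGOOD high */
			return 0;
		time_wait_ms(10);
	}
	return -1;
}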
Signed-off-by: Alistair Popple Signed-off-by: Oliver O'Halloran Signed-off-by: Frederic Barrat [arbab at linux.ibm.com: Use bmc_plat_ast2600_openbmc] Signed-off-by: Reza Arbab Signed-off-by: Vasant Hegde --- platforms/astbmc/Makefile.inc | 3 +- platforms/astbmc/rainier.c | 136 ++++++++++++++++++++++++++++++++++ 2 files changed, 137 insertions(+), 2 deletions(-) create mode 100644 platforms/astbmc/rainier.c diff --git a/platforms/astbmc/Makefile.inc b/platforms/astbmc/Makefile.inc index 24e94039f..070813231 100644 --- a/platforms/astbmc/Makefile.inc +++ b/platforms/astbmc/Makefile.inc @@ -7,8 +7,7 @@ ASTBMC_OBJS = pnor.o common.o slots.o \ witherspoon.o zaius.o romulus.o p9dsu.o \ vesnin.o nicole.o mihawk.o mowgli.o \ talos.o blackbird.o \ - swift.o + swift.o rainier.o ASTBMC = $(PLATDIR)/astbmc/built-in.a $(ASTBMC): $(ASTBMC_OBJS:%=$(PLATDIR)/astbmc/%) - diff --git a/platforms/astbmc/rainier.c b/platforms/astbmc/rainier.c new file mode 100644 index 000000000..17d9fe2bf --- /dev/null +++ b/platforms/astbmc/rainier.c @@ -0,0 +1,136 @@ +// SPDX-License-Identifier: Apache-2.0 +/* + * Copyright (c) 2020 IBM + */ + +#include +#include +#include +#include +#include +#include + +#include "astbmc.h" + +/* + * puti2c pu 2 1 C6 00 6 1 -quiet + * puti2c pu 2 1 C6 54 7 1 -quiet + * puti2c pu 2 1 C6 05 8 1 -quiet + * puti2c pu 2 1 C6 00 9 1 -quiet + * + * sleep 4 + * + * puti2c pu 2 1 C6 55 6 1 -quiet + * puti2c pu 2 1 C6 55 7 1 -quiet + * 2 - engine + * 1 - port + * C6 - slave addr + * 55 - data + * 7 - register + * 1 - register length? + */ + +static int64_t smbus_write8(struct i2c_bus *bus, uint8_t reg, uint8_t data) +{ + struct i2c_request req; + + memset(&req, 0, sizeof(req)); + + req.bus = bus; + req.dev_addr = 0xC6 >> 1; /* Docs use 8bit addresses */ + + req.op = SMBUS_WRITE; + req.offset = reg; + req.offset_bytes = 1; + req.rw_buf = &data; + req.rw_len = 1; + req.timeout = 100; + + return i2c_request_sync(&req); +} + +static int64_t slot_power_enable(struct i2c_bus *bus) +{ + /* FIXME: we could do this in one transaction using auto-increment */ + if (smbus_write8(bus, 0x6, 0x00)) + return -1; + if (smbus_write8(bus, 0x7, 0x54)) + return -1; + if (smbus_write8(bus, 0x8, 0x05)) + return -1; + if (smbus_write8(bus, 0x9, 0x00)) + return -1; + + /* FIXME: Poll for PGOOD going high */ + + if (smbus_write8(bus, 0x6, 0x55)) + return -1; + if (smbus_write8(bus, 0x7, 0x55)) + return -1; + + return 0; +} + +static void rainier_init_slot_power(void) +{ + struct proc_chip *chip; + struct i2c_bus *bus; + + /* + * Controller on P0 is for slots C7 -> C11 + * on P2 is for slots C0 -> C4 + * Both chips use engine 2 port 1 + * + * Rainier with only one socket is officially supported, so + * we may not have slots C0 -> C4 + */ + for_each_chip(chip) { + if (chip->id % 4) + continue; + bus = p8_i2c_add_bus(chip->id, 2, 1, 400000); + if (!bus) { + prerror("Unable to find PCIe power controller I2C bus!\n"); + return; + } + if (slot_power_enable(bus)) { + prerror("Error enabling PCIe slot power on chip %d\n", + chip->id); + } + } +} + +static void rainier_init(void) +{ + astbmc_init(); + rainier_init_slot_power(); +} + +static bool rainier_probe(void) +{ + if (!dt_node_is_compatible(dt_root, "ibm,rainier") && + !dt_node_is_compatible(dt_root, "ibm,rainier-2s2u") && + !dt_node_is_compatible(dt_root, "ibm,rainier-2s4u")) + return false; + + /* Lot of common early inits here */ + astbmc_early_init(); + + /* Setup UART for use by OPAL (Linux hvc) */ + uart_set_console_policy(UART_CONSOLE_OPAL); + + return true; +} + 
+DECLARE_PLATFORM(rainier) = { + .name = "Rainier", + .probe = rainier_probe, + .init = rainier_init, + .start_preload_resource = flash_start_preload_resource, + .resource_loaded = flash_resource_loaded, + .bmc = &bmc_plat_ast2600_openbmc, + .cec_power_down = astbmc_ipmi_power_down, + .cec_reboot = astbmc_ipmi_reboot, + .elog_commit = ipmi_elog_commit, + .exit = astbmc_exit, + .terminate = ipmi_terminate, +}; -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:07 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:07 +0530 Subject: [Skiboot] [PATCH v2 29/59] platform: Add Denali platform support In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-30-hegdevasant@linux.vnet.ibm.com> Denali is a P10 system, but the FSP interaction (MBOX protocol) is the same as on ZZ. Hence add the Denali platform detection code inside zz.c for now. We can think of adding a separate platform later. Also enable: - P10 TCE mapping support - Detect PHBs Signed-off-by: Vasant Hegde --- hdata/iohub.c | 4 ++++ hdata/spira.h | 1 + hw/fsp/fsp-psi.c | 1 + platforms/ibm-fsp/hostservices.c | 4 ++++ platforms/ibm-fsp/zz.c | 6 ++++++ 5 files changed, 16 insertions(+) diff --git a/hdata/iohub.c b/hdata/iohub.c index 92df48b8f..92655407e 100644 --- a/hdata/iohub.c +++ b/hdata/iohub.c @@ -843,6 +843,10 @@ static void io_parse_fru(const void *sp_iohubs) prlog(PR_INFO, "CEC: Rainier !\n"); io_add_p9(hub, sp_iohubs); break; + case CECHUB_HUB_DENALI: + prlog(PR_INFO, "CEC: Denali !\n"); + io_add_p9(hub, sp_iohubs); + break; default: prlog(PR_ERR, "CEC: Hub ID 0x%04x unsupported !\n", hub_id); diff --git a/hdata/spira.h b/hdata/spira.h index 7fcf5c302..afdc9228a 100644 --- a/hdata/spira.h +++ b/hdata/spira.h @@ -667,6 +667,7 @@ struct cechub_io_hub { #define CECHUB_HUB_CUMULUS_DUOMO 0x0030 /* cumulus+duomo from spec */ #define CECHUB_HUB_AXONE_HOPPER 0x0040 /* axone+hopper */ #define CECHUB_HUB_RAINIER 0x0050 +#define CECHUB_HUB_DENALI 0x0051 __be32 ec_level; __be32 aff_dom2; /* HDAT < v9.x only */ __be32 aff_dom3; /* HDAT < v9.x only */ diff --git a/hw/fsp/fsp-psi.c b/hw/fsp/fsp-psi.c index aeaf47e89..38f130dd7 100644 --- a/hw/fsp/fsp-psi.c +++ b/hw/fsp/fsp-psi.c @@ -37,6 +37,7 @@ void psi_init_for_fsp(struct psi *psi) switch (proc_gen) { case proc_gen_p8: case proc_gen_p9: + case proc_gen_p10: out_be64(psi->regs + PSIHB_TAR, PSI_TCE_TABLE_BASE | PSIHB_TAR_256K_ENTRIES); break; diff --git a/platforms/ibm-fsp/hostservices.c b/platforms/ibm-fsp/hostservices.c index 81fd6bdd3..accc0989a 100644 --- a/platforms/ibm-fsp/hostservices.c +++ b/platforms/ibm-fsp/hostservices.c @@ -551,6 +551,10 @@ int hservice_wakeup(uint32_t i_core, uint32_t i_mode) i_core &= SPR_PIR_P9_MASK; i_core <<= 2; break; + case proc_gen_p10: + i_core &= SPR_PIR_P10_MASK; + i_core <<= 2; + break; default: return OPAL_UNSUPPORTED; } diff --git a/platforms/ibm-fsp/zz.c b/platforms/ibm-fsp/zz.c index 7c6050ab7..493d6030a 100644 --- a/platforms/ibm-fsp/zz.c +++ b/platforms/ibm-fsp/zz.c @@ -160,6 +160,12 @@ static bool zz_probe(void) if (dt_node_is_compatible(dt_root, "ibm,fleetwood-m9s")) { return true; } + + /* Add Denali FSP platform and map it to ZZ */ + if (dt_node_is_compatible(dt_root, "ibm,denali")) { + return true; + } + return false; } -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:09 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:09 +0530
Subject: [Skiboot] [PATCH v2 31/59] psi/p10: Activate 64K ESB pages In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-32-hegdevasant@linux.vnet.ibm.com> From: Cédric Le Goater Signed-off-by: Cédric Le Goater Signed-off-by: Vasant Hegde --- hw/psi.c | 7 ++++--- include/psi.h | 5 +++-- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/hw/psi.c b/hw/psi.c index 26677a3b2..991ea3b1f 100644 --- a/hw/psi.c +++ b/hw/psi.c @@ -753,8 +753,7 @@ static void psi_init_p10_interrupts(struct psi *psi) { struct proc_chip *chip; u64 val; - /* TODO (clg) : fix ESB page size to 64k when ready */ - uint32_t esb_shift = 12; + uint32_t esb_shift = 16; /* Grab chip */ chip = get_chip(psi->chip_id); @@ -764,10 +763,12 @@ static void psi_init_p10_interrupts(struct psi *psi) /* Configure the CI BAR */ phys_map_get(chip->id, PSIHB_ESB, 0, &val, NULL); val |= PSIHB_ESB_CI_VALID; + if (esb_shift == 16) + val |= PSIHB10_ESB_CI_64K; out_be64(psi->regs + PSIHB_ESB_CI_BASE, val); val = in_be64(psi->regs + PSIHB_ESB_CI_BASE); - psi->esb_mmio = (void *)(val & ~PSIHB_ESB_CI_VALID); + psi->esb_mmio = (void *)(val & ~(PSIHB_ESB_CI_VALID|PSIHB10_ESB_CI_64K)); prlog(PR_DEBUG, "PSI[0x%03x]: ESB MMIO at @%p\n", psi->chip_id, psi->esb_mmio); diff --git a/include/psi.h b/include/psi.h index a7104ef0b..dbf94b4b3 100644 --- a/include/psi.h +++ b/include/psi.h @@ -94,9 +94,10 @@ #define PSIHB_IRQ_METHOD PPC_BIT(0) #define PSIHB_IRQ_RESET PPC_BIT(1) #define PSIHB_ESB_CI_BASE 0x60 -#define PSIHB_ESB_CI_VALID 1 +#define PSIHB10_ESB_CI_64K PPC_BIT(1) +#define PSIHB_ESB_CI_VALID PPC_BIT(63) #define PSIHB_ESB_NOTIF_ADDR 0x68 -#define PSIHB_ESB_NOTIF_VALID 1 +#define PSIHB_ESB_NOTIF_VALID PPC_BIT(63) #define PSIHB_IVT_OFFSET 0x70 #define PSIHB_IVT_OFF_SHIFT 32 /* -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:12 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:12 +0530 Subject: [Skiboot] [PATCH v2 34/59] xive/p10: Add option flags to the XIVE exploitation mode In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-35-hegdevasant@linux.vnet.ibm.com> From: Cédric Le Goater Slightly change the semantics of the parameter of the opal_xive_reset() OPAL call to configure the interrupt mode of the machine and, at the same time, to configure the associated options. These options only apply to the XIVE exploitation mode. Signed-off-by: Cédric Le Goater Signed-off-by: Vasant Hegde --- hw/xive2.c | 29 ++++++++++++++++++++++------- 1 file changed, 22 insertions(+), 7 deletions(-) diff --git a/hw/xive2.c b/hw/xive2.c index a7bfdcbde..4ddcf184f 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -171,6 +171,14 @@ static enum { XIVE_MODE_NONE, } xive_mode = XIVE_MODE_NONE; +/* + * The XIVE exploitation mode options indicate the active features and + * are part of the mode parameter of the opal_xive_reset() call + */ +static uint64_t xive_expl_options; + +#define XIVE_EXPL_ALL_OPTIONS 0 + /* * Each source controller has one of these.
There's one embedded in * the XIVE struct for IPIs @@ -3896,11 +3904,11 @@ void xive2_cpu_reset(void) in_be64(xs->tm_ring1 + TM_SPC_PULL_POOL_CTX); } -static int64_t __xive_reset(uint64_t version) +static int64_t __xive_reset(uint64_t mode) { struct proc_chip *chip; - xive_mode = version; + xive_mode = mode; /* Mask all interrupt sources */ irq_for_each_source(xive_reset_mask_source_cb, NULL); @@ -3938,13 +3946,20 @@ int64_t xive2_reset(void) return __xive_reset(XIVE_MODE_EXPL); } -static int64_t opal_xive_reset(uint64_t version) +static int64_t opal_xive_reset(uint64_t mode) { - prlog(PR_DEBUG, "XIVE reset, version: %d...\n", (int)version); + prlog(PR_DEBUG, "XIVE reset. mode = %llx\n", mode); - if (version != XIVE_MODE_EXPL) { - prerror("ignoring version %lld at reset. " - "XIVE exploitation mode is the default\n", version); + if (!(mode & XIVE_MODE_EXPL)) { + prlog(PR_NOTICE, "No emulation mode. XIVE exploitation mode " + "is the default\n"); + } + + xive_expl_options = mode & ~XIVE_MODE_EXPL; + if (xive_expl_options & ~XIVE_EXPL_ALL_OPTIONS) { + prerror("invalid XIVE exploitation mode option %016llx\n", + xive_expl_options); + return OPAL_PARAMETER; } return __xive_reset(XIVE_MODE_EXPL); -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:13 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:13 +0530 Subject: [Skiboot] [PATCH v2 35/59] hw/phb5: Add support for PQ offloading In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-36-hegdevasant@linux.vnet.ibm.com> From: Cédric Le Goater POWER9 DD2.0 introduced a StoreEOI operation which had benefits over the LoadEOI operation: less latency and improved performance for interrupt handling. Because of load vs. store ordering issues in some cases, it had to be deactivated. The POWER10 processor has a set of new features in the XIVE2 and the PHB5 controllers to address this problem. At the interrupt controller level, XIVE2 adds a new load offset to the ESB page which offers the capability to order loads after stores. It should be enforced by the OS when doing loads if StoreEOI is to be used. But this is not enough. The firmware should also carefully configure the PHB interrupt sources to make sure that operations on the PQ state bits of a source are routed to a single logic unit: the XIVE2 IC. The PHB5 introduces a new PQ disable configuration bit (bit 9) for this purpose. It disables the check of the PQ state bits when processing new MSI interrupts. When set, the PHB ignores its local PQ state bits and unconditionally forwards any MSI trigger to the XIVE2 interrupt controller. The XIVE2 IC knows from the trigger message that the PQ bits have not been checked and performs the check using the local PQ bits. This configuration bit only applies to MSIs; LSIs are still checked on the PHB to handle the assertion level. This requires a new XIVE interface to register a HW interrupt source using the IC ESB pages of the allocated HW interrupt numbers, and not the ESB pages of the HW source. This is what this change proposes for MSIs, LSIs still being handled the old way. PQ disable is a requirement for StoreEOI.
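[Editor's note] For context, the OS half of the contract is small: with PQ offloading and StoreEOI active, retiring an interrupt is a single MMIO store to the ESB page, while any subsequent loads should use the XIVE2 load-after-store offset mentioned above. A minimal sketch, assuming the architected 0x400 store-EOI ESB page offset (the helper name is illustrative):

/* Hedged sketch: StoreEOI from the OS side. */
static inline void xive_store_eoi(void *eoi_page)
{
	out_be64(eoi_page + 0x400, 0);	/* the store itself performs the EOI */
}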
Signed-off-by: C?dric Le Goater [FB: port to phb4.c] Signed-off-by: Frederic Barrat Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/phb4.c | 47 ++++++++++++++++++++++++++++++++++++++------- hw/xive2.c | 40 +++++++++++++++++++++++++++++++++++--- include/phb4-regs.h | 1 + include/xive.h | 2 ++ 4 files changed, 80 insertions(+), 10 deletions(-) diff --git a/hw/phb4.c b/hw/phb4.c index e074fa2a3..d2d9f9ec0 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -136,8 +136,6 @@ static void phb4_init_hw(struct phb4 *p); #define PHBLOGCFG(p, fmt, a...) do {} while (0) #endif -#define PHB4_CAN_STORE_EOI(p) XIVE_STORE_EOI_ENABLED - static bool pci_eeh_mmio; static bool pci_retry_all; static int rx_err_max = PHB4_RX_ERR_MAX; @@ -152,6 +150,24 @@ static inline bool is_phb5(void) return (proc_gen == proc_gen_p10); } +/* PQ offloading on the XIVE IC. */ +static inline bool phb_pq_disable(struct phb4 *p __unused) +{ + if (is_phb5()) + return 1; + + return false; +} + +static inline bool phb_can_store_eoi(struct phb4 *p) +{ + if (is_phb5()) + /* PQ offloading is required for StoreEOI */ + return XIVE2_STORE_EOI_ENABLED && phb_pq_disable(p); + + return XIVE_STORE_EOI_ENABLED; +} + /* Note: The "ASB" name is historical, practically this means access via * the XSCOM backdoor */ @@ -5366,8 +5382,12 @@ static void phb4_init_hw(struct phb4 *p) val = PHB_CTRLR_IRQ_PGSZ_64K; val |= PHB_CTRLR_TCE_CLB_DISABLE; // HW557787 circumvention val |= SETFIELD(PHB_CTRLR_TVT_ADDR_SEL, 0ull, TVT_2_PER_PE); - if (PHB4_CAN_STORE_EOI(p)) + if (phb_pq_disable(p)) + val |= PHB_CTRLR_IRQ_PQ_DISABLE; + if (phb_can_store_eoi(p)) { val |= PHB_CTRLR_IRQ_STORE_EOI; + PHBDBG(p, "store EOI is enabled\n"); + } if (!pci_eeh_mmio) val |= PHB_CTRLR_MMIO_EEH_DISABLE; @@ -5927,16 +5947,29 @@ static void phb4_create(struct dt_node *np) /* Compute XIVE source flags depending on PHB revision */ irq_flags = 0; - if (PHB4_CAN_STORE_EOI(p)) + if (phb_can_store_eoi(p)) irq_flags |= XIVE_SRC_STORE_EOI; else irq_flags |= XIVE_SRC_TRIGGER_PAGE; if (is_phb5()) { - /* Register all interrupt sources with XIVE */ - xive2_register_hw_source(p->base_msi, p->num_irqs - 8, 16, - p->int_mmio, irq_flags, NULL, NULL); + /* + * Register sources with XIVE. If offloading is on, use the + * ESB pages of the XIVE IC for the MSI sources instead of the + * ESB pages of the PHB. + */ + if (phb_pq_disable(p)) { + xive2_register_esb_source(p->base_msi, p->num_irqs - 8); + } else { + xive2_register_hw_source(p->base_msi, + p->num_irqs - 8, 16, + p->int_mmio, irq_flags, + NULL, NULL); + } + /* + * LSI sources always use the ESB pages of the PHB. 
+ */ xive2_register_hw_source(p->base_lsi, 8, 16, p->int_mmio + ((p->num_irqs - 8) << 16), XIVE_SRC_LSI, p, &phb4_lsi_ops); diff --git a/hw/xive2.c b/hw/xive2.c index 4ddcf184f..3f4958fce 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -2579,8 +2579,8 @@ void xive2_register_hw_source(uint32_t base, uint32_t count, uint32_t shift, false, data, ops); } -void xive2_register_ipi_source(uint32_t base, uint32_t count, void *data, - const struct irq_source_ops *ops) +static void __xive2_register_esb_source(uint32_t base, uint32_t count, + void *data, const struct irq_source_ops *ops) { struct xive_src *s; struct xive *x = xive_from_isn(base); @@ -2589,7 +2589,6 @@ void xive2_register_ipi_source(uint32_t base, uint32_t count, void *data, uint32_t flags = XIVE_SRC_EOI_PAGE1 | XIVE_SRC_TRIGGER_PAGE; assert(x); - assert(base >= x->int_base && (base + count) <= x->int_ipi_top); s = malloc(sizeof(struct xive_src)); assert(s); @@ -2605,6 +2604,41 @@ void xive2_register_ipi_source(uint32_t base, uint32_t count, void *data, flags, false, data, ops); } +/* + * Check that IPI sources have interrupt numbers in the IPI interrupt + * number range + */ +void xive2_register_ipi_source(uint32_t base, uint32_t count, void *data, + const struct irq_source_ops *ops) +{ + struct xive *x = xive_from_isn(base); + + assert(x); + assert(base >= x->int_base && (base + count) <= x->int_ipi_top); + + __xive2_register_esb_source(base, count, data, ops); +} + +/* + * Some HW sources (PHB) can disable the use of their own ESB pages + * and offload all the checks on ESB pages of the IC. The interrupt + * numbers are not necessarily in the IPI range. + */ +void xive2_register_esb_source(uint32_t base, uint32_t count) +{ + __xive2_register_esb_source(base, count, NULL, NULL); +} + +uint64_t xive2_get_esb_base(uint32_t base) +{ + struct xive *x = xive_from_isn(base); + uint32_t base_idx = GIRQ_TO_IDX(base); + + assert(x); + + return (uint64_t) x->esb_base + (1ul << XIVE_ESB_SHIFT) * base_idx; +} + static void xive_set_quirks(struct xive *x, struct proc_chip *chip __unused) { uint64_t quirks = 0; diff --git a/include/phb4-regs.h b/include/phb4-regs.h index 03b53ae01..139522814 100644 --- a/include/phb4-regs.h +++ b/include/phb4-regs.h @@ -101,6 +101,7 @@ #define PHB_VERSION 0x800 #define PHB_CTRLR 0x810 +#define PHB_CTRLR_IRQ_PQ_DISABLE PPC_BIT(9) /* PHB5 */ #define PHB_CTRLR_IRQ_PGSZ_64K PPC_BIT(11) #define PHB_CTRLR_IRQ_STORE_EOI PPC_BIT(12) #define PHB_CTRLR_MMIO_RD_STRICT PPC_BIT(13) diff --git a/include/xive.h b/include/xive.h index dc1b25d03..faaef2aeb 100644 --- a/include/xive.h +++ b/include/xive.h @@ -86,6 +86,8 @@ void xive2_register_hw_source(uint32_t base, uint32_t count, uint32_t shift, const struct irq_source_ops *ops); void xive2_register_ipi_source(uint32_t base, uint32_t count, void *data, const struct irq_source_ops *ops); +void xive2_register_esb_source(uint32_t base, uint32_t count); +uint64_t xive2_get_esb_base(uint32_t girq); void xive2_cpu_callin(struct cpu_thread *cpu); void *xive2_get_trigger_port(uint32_t girq); -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:10 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:10 +0530 Subject: [Skiboot] [PATCH v2 32/59] psi/p10: Activate StoreEOI In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-33-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater Signed-off-by: C?dric Le Goater 
From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:10 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:51:10 +0530
Subject: [Skiboot] [PATCH v2 32/59] psi/p10: Activate StoreEOI
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-33-hegdevasant@linux.vnet.ibm.com>

From: Cédric Le Goater

Signed-off-by: Cédric Le Goater
Signed-off-by: Vasant Hegde
---
 hw/psi.c      | 15 ++++++++++++++-
 include/psi.h |  1 +
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/psi.c b/hw/psi.c
index 991ea3b1f..291422539 100644
--- a/hw/psi.c
+++ b/hw/psi.c
@@ -749,11 +749,14 @@ static const struct irq_source_ops psi_p10_irq_ops = {
 	.name = psi_p9_irq_name,
 };

+#define PSIHB10_CAN_STORE_EOI(x) XIVE2_STORE_EOI_ENABLED
+
 static void psi_init_p10_interrupts(struct psi *psi)
 {
 	struct proc_chip *chip;
 	u64 val;
 	uint32_t esb_shift = 16;
+	uint32_t flags = XIVE_SRC_LSI;

 	/* Grab chip */
 	chip = get_chip(psi->chip_id);
@@ -772,6 +775,16 @@ static void psi_init_p10_interrupts(struct psi *psi)
 	prlog(PR_DEBUG, "PSI[0x%03x]: ESB MMIO at @%p\n",
 	      psi->chip_id, psi->esb_mmio);

+	/* Store EOI */
+	if (PSIHB10_CAN_STORE_EOI(psi)) {
+		val = in_be64(psi->regs + PSIHB_CR);
+		val |= PSIHB10_CR_STORE_EOI;
+		out_be64(psi->regs + PSIHB_CR, val);
+		prlog(PR_DEBUG, "PSI[0x%03x]: store EOI is enabled\n",
+		      psi->chip_id);
+		flags |= XIVE_SRC_STORE_EOI;
+	}
+
 	/* Grab and configure the notification port */
 	val = xive2_get_notify_port(psi->chip_id, XIVE_HW_SRC_PSI);
 	val |= PSIHB_ESB_NOTIF_VALID;
 	out_be64(psi->regs + PSIHB_ESB_NOTIF_ADDR, val);
@@ -788,7 +801,7 @@ static void psi_init_p10_interrupts(struct psi *psi)
 		psi->chip_id, 0xf & (chip->ec_level >> 4), chip->ec_level & 0xf);

 	xive2_register_hw_source(psi->interrupt, P9_PSI_NUM_IRQS,
-				 esb_shift, psi->esb_mmio, XIVE_SRC_LSI,
+				 esb_shift, psi->esb_mmio, flags,
 				 psi, &psi_p10_irq_ops);

 	/* Reset irq handling and switch to ESB mode */
diff --git a/include/psi.h b/include/psi.h
index dbf94b4b3..ac7afa09f 100644
--- a/include/psi.h
+++ b/include/psi.h
@@ -41,6 +41,7 @@
 #define PSIHB_CR_PSI_LINK_ENABLE	PPC_BIT(5)
 #define PSIHB_CR_FSP_RESET		PPC_BIT(6)
 #define PSIHB_CR_PSIHB_RESET		PPC_BIT(7)
+#define PSIHB10_CR_STORE_EOI		PPC_BIT(12)
 #define PSIHB_CR_PSI_IRQ		PPC_BIT(16)	/* PSIHB interrupt */
 #define PSIHB_CR_FSP_IRQ		PPC_BIT(17)	/* FSP interrupt */
 #define PSIHB_CR_FSP_LINK_ACTIVE	PPC_BIT(18)	/* FSP link active */
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:08 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:51:08 +0530
Subject: [Skiboot] [PATCH v2 30/59] xive/p10: Add a XIVE2 driver
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-31-hegdevasant@linux.vnet.ibm.com>

From: Cédric Le Goater

The XIVE2 interrupt controller of the POWER10 processor follows the
same logic as on POWER9, but the HW interface has been largely
reviewed. It has a new register interface, different BARs, extra
VSDs, a new layout for the XIVE structures, and a set of new features
which are described below.

The OPAL XIVE2 driver code activating this controller was duplicated
from P9 for clarity, as the registers and structures have changed
considerably. The same OPAL interface is implemented for OS
compatibility and it should not impact existing Linux kernels, KVM
included. Guest OSes are not impacted either. Support for new
features will be implemented in time and will require new support
from the OS.

* XIVE2 BARS

The interrupt controller BARs have a different layout, outlined
below. Each sub-engine now owns its range, and the indirect TIMA
access was replaced with a set of pages, one per CPU, under the IC
BAR:

  - IC BAR (Interrupt Controller)
    . 4 pages, one per sub-engine
    . 128 indirect TIMA pages
  - TM BAR (Thread Interrupt Management Area)
    . 4 pages
  - ESB BAR (ESB pages for IPIs)
    . up to 1TB
  - END BAR (ESB pages for ENDs)
    . up to 2TB
  - NVC BAR (Notification Virtual Crowd)
    . up to 128
  - NVPG BAR (Notification Virtual Process and Group)
    . up to 1TB
  - Direct mapped Thread Context Area (reads & writes)

OPAL does not use the grouping and crowd capability.
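For scale: each interrupt source owns two 64K ESB pages, a trigger
page and a management page, so the 1TB ESB BAR bounds the design at
1TB / 128K = 8M interrupts per chip. The 1M interrupts configured by
this driver therefore use only 128G of that window.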
* Virtual Structure Tables

XIVE2 adds new table types and also changes the field layout of the
END and NVP Virtualization Structure Descriptors.

  - EAS
  - END, new layout
  - NVT was split into:
    . NVP (Processor), 32B
    . NVG (Group), 32B
    . NVC (Crowd == P9 block group), 32B
  - IC, for remote configuration
  - SYNC, for cache injection
  - ERQ, for event input queue

The setup is slightly different on XIVE2 because the indexing has
changed for some of the tables: either the block ID or the chip
topology ID can be used.

* XIVE2 features

SCOM and MMIO registers have a new layout and XIVE2 adds new global
capability and configuration registers. The low-level hardware offers
a set of new features, among which:

  - a cache injection mechanism
  - 4 cache watch engines
  - a configurable number of priorities: 1 - 8
  - StoreEOI with load-after-store ordering, activated by default
  - new sync/kill operations for cache operations

Other features will have some impact on the Hypervisor and guest OS
when activated, but this is not required for initial support of the
controller:

  - Gen2 TIMA layout
  - a P9-compat mode, or Gen1, TIMA toggle bit for SW compatibility
  - Automatic Context save & restore
  - increase to 24-bit VP numbers
  - new escalation schemes: ESB, Adaptive, CPPR

POWER10 adds support for User interrupts. When configured, the XIVE2
controller can notify user processes directly, using the Event Based
Branch exception line of the thread. If the process is not running,
the OS is notified through an escalation event. New OPAL and PAPR
interfaces will be required and OS support needs to be studied.
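To make the priority configuration concrete, here is a minimal
sketch. The helper is hypothetical, but the values follow the
driver's defaults (CQ_XIVE_CFG_INT_PRIO_8, ie, 8 priorities): each VP
owns a contiguous block of ENDs, one per priority, and OPAL keeps the
highest one for the silent gather/escalation END:

	/* Hypothetical helper: END index backing (end_base, prio).
	 * By default xive_cfg_vp_prio(x) == 8, xive_max_prio(x) == 7
	 * and xive_escalation_prio(x) == xive_max_prio(x). */
	static uint32_t example_vp_end_idx(struct xive *x,
					   uint32_t end_base, uint8_t prio)
	{
		assert(prio <= xive_max_prio(x));
		return end_base + prio;	/* ENDs are allocated as one block */
	}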
* XIVE2 P9-compat mode, or Gen1

The thread interrupt management area (TIMA) is a set of pages mapped
in the Hypervisor and in the guest OS address space, giving access to
the interrupt thread context registers for interrupt management: ACK,
EOI, CPPR, etc.

XIVE2 changes the TIMA layout slightly, with extra bits for the new
features and larger CAM lines, and the controller provides
configuration switches for backward compatibility. This is called the
XIVE2 P9-compat mode, or Gen1 TIMA. It impacts the layout of the TIMA
and the availability of the internal features associated with it,
Automatic Save & Restore for instance. Using a P9 layout also means
setting the controller in such a mode at init time.

The XIVE2 driver in OPAL chooses to initialize the XIVE2 controller
with a XIVE2/P10 TIMA directly, because the layouts are compatible
with what Linux PowerNV and the guest OSes expect. For KVM support,
the OPAL calls abstract the HW interface and no assumption is made on
the OS CAM line width.

* Activating new XIVE2 features

Everything related to OPAL internals, such as the use of the new
cache sync mechanism, can be implemented in time without impact on
the OS. Other features will require new device tree properties
exposed to the OS and extra support in the OS. Automatic Context
save & restore is one of the first features that should be looked at.

* XICS-over-XICS driver (P8 compatibility)

The P8 emulation mode is an OPAL compat interface used by Linux
kernels which did not have XIVE native support. This was useful for
POWER9 bringup but it is much less so now. As it was adding a lot of
complexity and reducing the interrupt controller resources, this mode
is not available in the XIVE2 driver for POWER10. It will still be
possible to add this compat mode in the future if required. The OS
will have to reset the driver at boot time, like on POWER9.

* Impact on other drivers (PSI, PHB, NPU)

Interrupts are allocated in a very similar way. Each controller might
have different ESB characteristics: StoreEOI support, 64K pages for
PSI. All is in place to support these changes already.

PHB5 will have support for "address-based trigger mode", probably in
the DD2.0 time frame, when verification is completed. When activated,
the XIVE IC ESB pages will be used instead of the PHB ESB pages for a
lower interrupt latency. LSIs will still use old-fashioned triggers
without StoreEOI.

* Yet to be addressed:

  - OPAL P10 interface incomplete (stop states)
  - Clarify the PHB5 strategy regarding the use of the XIVE IC ESB
    pages instead of the PHB ones when address-based trigger mode is
    supported.

Signed-off-by: Cédric Le Goater
Signed-off-by: Vasant Hegde
---
 core/fast-reboot.c   |    4 +
 core/init.c          |   12 +-
 hw/Makefile.inc      |    2 +-
 hw/psi.c             |   25 +-
 hw/slw.c             |   12 +-
 hw/xive.c            |    6 +-
 hw/xive2.c           | 4444 ++++++++++++++++++++++++++++++++++++++++++
 include/xive.h       |   29 +
 include/xive2-regs.h |  549 ++++++
 9 files changed, 5071 insertions(+), 12 deletions(-)
 create mode 100644 hw/xive2.c
 create mode 100644 include/xive2-regs.h

diff --git a/core/fast-reboot.c b/core/fast-reboot.c
index ac9b3b284..9f92525a9 100644
--- a/core/fast-reboot.c
+++ b/core/fast-reboot.c
@@ -262,6 +262,8 @@ static void cleanup_cpu_state(void)

 	if (proc_gen == proc_gen_p9)
 		xive_cpu_reset();
+	else if (proc_gen == proc_gen_p10)
+		xive2_cpu_reset();

 	/* Per core cleanup */
 	if (cpu_is_thread0(cpu) || cpu_is_core_chiplet_primary(cpu)) {
@@ -381,6 +383,8 @@ void __noreturn fast_reboot_entry(void)

 	if (proc_gen == proc_gen_p9)
 		xive_reset();
+	else if (proc_gen == proc_gen_p10)
+		xive2_reset();

 	/* Let the CPU layer do some last minute global cleanups */
 	cpu_fast_reboot_complete();
diff --git a/core/init.c b/core/init.c
index 0bf4ab269..e38969554 100644
--- a/core/init.c
+++ b/core/init.c
@@ -1225,8 +1225,11 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt)
 	if (proc_gen == proc_gen_p8)
 		cpu_set_ipi_enable(true);

-	/* On P9, initialize XIVE */
-	init_xive();
+	/* On P9 and P10, initialize XIVE */
+	if (proc_gen == proc_gen_p9)
+		init_xive();
+	else if (proc_gen == proc_gen_p10)
+		xive2_init();

 	/* Grab centaurs from device-tree if present (only on FSP-less) */
 	centaur_init();
@@ -1437,7 +1440,10 @@ void __noreturn __secondary_cpu_entry(void)
 	mtmsrd(MSR_RI, 1);

 	/* Some XIVE setup */
-	xive_cpu_callin(cpu);
+	if (proc_gen == proc_gen_p9)
+		xive_cpu_callin(cpu);
+	else if (proc_gen == proc_gen_p10)
+		xive2_cpu_callin(cpu);

 	/* Wait for work to do */
 	while(true) {
diff --git a/hw/Makefile.inc b/hw/Makefile.inc
index a7f450cf7..37256d3cc 100644
--- a/hw/Makefile.inc
+++ b/hw/Makefile.inc
@@ -9,7 +9,7 @@ HW_OBJS += fake-nvram.o lpc-mbox.o npu2.o npu2-hw-procedures.o
 HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o
 HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o
 HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o
-HW_OBJS += ocmb.o
+HW_OBJS += ocmb.o xive2.o
 HW=hw/built-in.a

 include $(SRC)/hw/fsp/Makefile.inc
diff --git a/hw/psi.c b/hw/psi.c
index f95a066d3..26677a3b2 100644
--- a/hw/psi.c
+++ b/hw/psi.c
@@ -772,12 +772,12 @@ static void psi_init_p10_interrupts(struct psi
*psi) psi->chip_id, psi->esb_mmio); /* Grab and configure the notification port */ - val = xive_get_notify_port(psi->chip_id, XIVE_HW_SRC_PSI); + val = xive2_get_notify_port(psi->chip_id, XIVE_HW_SRC_PSI); val |= PSIHB_ESB_NOTIF_VALID; out_be64(psi->regs + PSIHB_ESB_NOTIF_ADDR, val); /* Setup interrupt offset */ - val = xive_get_notify_base(psi->interrupt); + val = xive2_get_notify_base(psi->interrupt); val <<= 32; out_be64(psi->regs + PSIHB_IVT_OFFSET, val); @@ -786,7 +786,7 @@ static void psi_init_p10_interrupts(struct psi *psi) "PSI[0x%03x]: Interrupts sources registered for P10 DD%i.%i\n", psi->chip_id, 0xf & (chip->ec_level >> 4), chip->ec_level & 0xf); - xive_register_hw_source(psi->interrupt, P9_PSI_NUM_IRQS, + xive2_register_hw_source(psi->interrupt, P9_PSI_NUM_IRQS, esb_shift, psi->esb_mmio, XIVE_SRC_LSI, psi, &psi_p10_irq_ops); @@ -956,6 +956,23 @@ static struct psi *psi_probe_p9(struct proc_chip *chip, u64 base) return psi; } +static struct psi *psi_probe_p10(struct proc_chip *chip, u64 base) +{ + struct psi *psi = NULL; + uint64_t addr; + + phys_map_get(chip->id, PSIHB_REG, 0, &addr, NULL); + xscom_write(chip->id, base + PSIHB_XSCOM_P9_BASE, + addr | PSIHB_XSCOM_P9_HBBAR_EN); + + psi = alloc_psi(chip, base); + if (!psi) + return NULL; + psi->regs = (void *)addr; + psi->interrupt = xive2_alloc_hw_irqs(chip->id, P9_PSI_NUM_IRQS, 16); + return psi; +} + static bool psi_init_psihb(struct dt_node *psihb) { uint32_t chip_id = dt_get_chip_id(psihb); @@ -974,6 +991,8 @@ static bool psi_init_psihb(struct dt_node *psihb) psi = psi_probe_p8(chip, base); else if (dt_node_is_compatible(psihb, "ibm,power9-psihb-x")) psi = psi_probe_p9(chip, base); + else if (dt_node_is_compatible(psihb, "ibm,power10-psihb-x")) + psi = psi_probe_p10(chip, base); else { prerror("PSI: Unknown processor type\n"); return false; diff --git a/hw/slw.c b/hw/slw.c index 8969096ac..9e676af74 100644 --- a/hw/slw.c +++ b/hw/slw.c @@ -965,9 +965,15 @@ void add_cpu_idle_state_properties(void) } } if ((wakeup_engine_state == WAKEUP_ENGINE_PRESENT) && has_deep_states) { - slw_late_init_p9(chip); - xive_late_init(); - nx_p9_rng_late_init(); + if (chip->type == PROC_CHIP_P9_NIMBUS || + chip->type == PROC_CHIP_P9_CUMULUS) { + slw_late_init_p9(chip); + xive_late_init(); + nx_p9_rng_late_init(); + } else if (chip->type == PROC_CHIP_P10) { + /* TODO (p10): need P10 stop state engine */ + xive2_late_init(); + } } if (wakeup_engine_state != WAKEUP_ENGINE_PRESENT) has_deep_states = false; diff --git a/hw/xive.c b/hw/xive.c index c442ea5e3..51b03549a 100644 --- a/hw/xive.c +++ b/hw/xive.c @@ -1776,7 +1776,8 @@ static void xive_create_mmio_dt_node(struct xive *x) dt_add_property_cells(xive_dt_node, "ibm,xive-eq-sizes", 12, 16, 21, 24); - dt_add_property_cells(xive_dt_node, "ibm,xive-#priorities", 8); + dt_add_property_cells(xive_dt_node, "ibm,xive-#priorities", + NUM_INT_PRIORITIES); dt_add_property(xive_dt_node, "single-escalation-support", NULL, 0); xive_add_provisioning_properties(); @@ -4191,7 +4192,8 @@ static int64_t xive_setup_silent_gather(uint64_t vp_id, bool enable) if (!memcmp(eq_orig, &eq, sizeof(eq))) rc = 0; else - rc = xive_eqc_cache_update(x, blk, idx + 7, &eq, false); + rc = xive_eqc_cache_update(x, blk, idx + XIVE_ESCALATION_PRIO, + &eq, false); if (rc) return rc; diff --git a/hw/xive2.c b/hw/xive2.c new file mode 100644 index 000000000..a7bfdcbde --- /dev/null +++ b/hw/xive2.c @@ -0,0 +1,4444 @@ +// SPDX-License-Identifier: Apache-2.0 +/* + * XIVE2: eXternal Interrupt Virtualization Engine. 
POWER10 interrupt + * controller + * + * Copyright (c) 2016-2019, IBM Corporation. + */ + +#define pr_fmt(fmt) "XIVE: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include /* TODO (p10): need P10 stop state engine */ + + +/* Verbose debug */ +#undef XIVE_VERBOSE_DEBUG +#undef DEBUG + +/* Extra debug options used in debug builds */ +#ifdef DEBUG +#define XIVE_CHECK_LOCKS +#define XIVE_DEBUG_INIT_CACHE_UPDATES +#define XIVE_EXTRA_CHECK_INIT_CACHE +#else +#undef XIVE_CHECK_LOCKS +#undef XIVE_DEBUG_INIT_CACHE_UPDATES +#undef XIVE_EXTRA_CHECK_INIT_CACHE +#endif + +/* + * VSDs, blocks, set translation etc... + * + * For the following data structures, the XIVE use a mechanism called + * Virtualization Structure Tables (VST) to manage the memory layout + * and access: ESBs (Event State Buffers), EAS (Event assignment + * structures), ENDs (Event Notification Descriptors) and NVT/NVP + * (Notification Virtual Targets/Processors). + * + * These structures divide those tables into 16 "blocks". Each XIVE + * instance has a definition for all 16 blocks that can either represent + * an actual table in memory or a remote XIVE MMIO port to access a + * block that is owned by that remote XIVE. + * + * Our SW design will consist of allocating one block per chip (and thus + * per XIVE instance) for now, thus giving us up to 16 supported chips in + * the system. We may have to revisit that if we ever support systems with + * more than 16 chips but that isn't on our radar at the moment or if we + * want to do like pHyp on some machines and dedicate 2 blocks per chip + * for some structures. + * + * Thus we need to be careful that we never expose to Linux the concept + * of block and block boundaries, but instead we provide full number ranges + * so that consecutive blocks can be supported. + * + * Similarily, for MMIO access, the BARs support what is called "set + * translation" which allows the BAR to be devided into a certain + * number of sets. Each "set" can be routed to a specific block and + * offset within a block. + */ + +#define XIVE_MAX_BLOCKS 16 +#define XIVE_VSD_SIZE 8 + +/* + * Max number of ESBs. (direct table) + * + * The max number of ESBs supported in the P10 MMIO space is 1TB/128K: 8M. + * + * 1M is our current top limit of ESB entries and EAS entries + * pre-allocated per chip. That allocates 256KB per chip for the state + * bits and 8M per chip for the EAS. + */ + +#define XIVE_INT_ORDER 20 /* 1M interrupts */ +#define XIVE_INT_COUNT (1ul << XIVE_INT_ORDER) + +/* + * First interrupt number, also the first logical interrupt number + * allocated by Linux (maximum ISA interrupt number + 1) + */ +#define XIVE_INT_FIRST 0x10 + +/* Corresponding direct table sizes */ +#define XIVE_ESB_SIZE (XIVE_INT_COUNT / 4) +#define XIVE_EAT_SIZE (XIVE_INT_COUNT * 8) + +/* Use 64K for everything by default */ +#define XIVE_ESB_SHIFT (16 + 1) /* trigger + mgmt pages */ +#define XIVE_ESB_PAGE_SIZE (1ul << XIVE_ESB_SHIFT) /* 2 pages */ + +/* + * Max number of ENDs. (indirect table) + * + * The max number of ENDs supported in the P10 MMIO space is 2TB/128K: 16M. + * Since one END is 32 bytes, a 64K indirect subpage can hold 2K ENDs. + * We need 8192 subpages, ie, 64K of memory for the indirect table. 
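+ *
+ * Note that XIVE_END_ORDER below configures only 8M ENDs, half of
+ * that maximum, so the table as allocated holds 8M / 2K = 4096
+ * subpage pointers, ie, 32K of memory (XIVE_END_TABLE_SIZE).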
+ */ +#define END_PER_PAGE (PAGE_SIZE / sizeof(struct xive_end)) + +#define XIVE_END_ORDER 23 /* 8M ENDs */ +#define XIVE_END_COUNT (1ul << XIVE_END_ORDER) +#define XIVE_END_TABLE_SIZE ((XIVE_END_COUNT / END_PER_PAGE) * XIVE_VSD_SIZE) + +#define XIVE_END_SHIFT (16 + 1) /* ESn + ESe pages */ + +/* One bit per number of priorities configured */ +#define xive_end_bitmap_size(x) (XIVE_END_COUNT >> xive_cfg_vp_prio_shift(x)) + +/* Number of priorities (and thus ENDs) we allocate for each VP */ +#define xive_cfg_vp_prio_shift(x) GETFIELD(CQ_XIVE_CFG_VP_INT_PRIO, (x)->config) +#define xive_cfg_vp_prio(x) (1 << xive_cfg_vp_prio_shift(x)) + +/* Max priority number */ +#define xive_max_prio(x) (xive_cfg_vp_prio(x) - 1) + +/* Priority used for gather/silent escalation (KVM) */ +#define xive_escalation_prio(x) xive_max_prio(x) + +/* + * Max number of VPs. (indirect table) + * + * The max number of NVPs we support in our MMIO space is 1TB/128K: 8M. + * Since one NVP is 32 bytes, a 64K indirect subpage can hold 2K NVPs. + * We need 4096 pointers, ie, 32K of memory for the indirect table. + * + * However, we use 8 priorities (by default) per NVP and the number of + * ENDs is configured to 8M. Therefore, our VP space is limited to 1M. + */ +#define VP_PER_PAGE (PAGE_SIZE / sizeof(struct xive_nvp)) + +#define XIVE_VP_ORDER(x) (XIVE_END_ORDER - xive_cfg_vp_prio_shift(x)) +#define XIVE_VP_COUNT(x) (1ul << XIVE_VP_ORDER(x)) +#define XIVE_VP_TABLE_SIZE(x) ((XIVE_VP_COUNT(x) / VP_PER_PAGE) * XIVE_VSD_SIZE) + +#define XIVE_NVP_SHIFT 17 /* NVPG BAR: two pages, even NVP, odd NVG */ + +/* VP Space maximums in Gen1 and Gen2 modes */ +#define VP_SHIFT_GEN1 19 /* in sync with END_W6_VP_OFFSET_GEN1 */ +#define VP_SHIFT_GEN2 24 /* in sync with END_W6_VP_OFFSET */ + +/* + * VP ids for HW threads. + * + * Depends on the thread id bits configuration of the IC. 8bit is the + * default for P10 and 7bit for p9. + * + * These values are global because they should be common to all chips + */ +static uint32_t xive_threadid_shift; +static uint32_t xive_hw_vp_base; +static uint32_t xive_hw_vp_count; + +/* + * The XIVE operation mode indicates the active "API" and corresponds + * to the "version/mode" parameter of the opal_xive_reset() call + */ +static enum { + /* No XICS emulation */ + XIVE_MODE_EXPL = OPAL_XIVE_MODE_EXPL, /* default */ + XIVE_MODE_NONE, +} xive_mode = XIVE_MODE_NONE; + +/* + * Each source controller has one of these. 
There's one embedded in + * the XIVE struct for IPIs + */ +struct xive_src { + struct irq_source is; + const struct irq_source_ops *orig_ops; + struct xive *xive; + void *esb_mmio; + uint32_t esb_base; + uint32_t esb_shift; + uint32_t flags; +}; + +struct xive_cpu_state { + struct xive *xive; + void *tm_ring1; + + /* Base HW VP and associated queues */ + uint32_t vp_blk; + uint32_t vp_idx; + uint32_t end_blk; + uint32_t end_idx; /* Base end index of a block of 8 */ + + struct lock lock; +}; + +enum xive_generation { + XIVE_GEN1 = 1, /* P9 compat mode */ + XIVE_GEN2 = 2, /* P10 default */ +}; + +enum xive_quirks { + /* HW527671 - 8bits Hardwired Thread Id range not implemented */ + XIVE_QUIRK_THREADID_7BITS = 0x00000001, + /* HW542974 - interrupt command priority checker not working properly */ + XIVE_QUIRK_BROKEN_PRIO_CHECK = 0x00000002, +}; + +struct xive { + uint32_t chip_id; + uint32_t block_id; + struct dt_node *x_node; + + enum xive_generation generation; + uint64_t config; + + uint64_t xscom_base; + + /* MMIO regions */ + void *ic_base; + uint64_t ic_size; + uint32_t ic_shift; + void *ic_tm_direct_base; + + void *tm_base; + uint64_t tm_size; + uint32_t tm_shift; + void *nvp_base; + uint64_t nvp_size; + void *esb_base; + uint64_t esb_size; + void *end_base; + uint64_t end_size; + + /* Set on XSCOM register access error */ + bool last_reg_error; + + /* Per-XIVE mutex */ + struct lock lock; + + /* Pre-allocated tables. + * + * We setup all the VDS for actual tables (ie, by opposition to + * forwarding ports) as either direct pre-allocated or indirect + * and partially populated. + * + * Currently, the ESB and the EAS tables are direct and fully + * pre-allocated based on XIVE_INT_COUNT. + * + * The other tables are indirect, we thus pre-allocate the indirect + * table (ie, pages of pointers) and populate enough of the pages + * for our basic setup using 64K subpages. + * + * The size of the indirect tables are driven by XIVE_VP_COUNT + * and XIVE_END_COUNT. The number of pre-allocated ones are + * driven by xive_hw_vp_count for the HW threads. The number + * of END depends on number of VP. + */ + + /* Direct SBE and EAT tables */ + void *sbe_base; + void *eat_base; + + /* Indirect END table. NULL entries are unallocated, count is + * the numbre of pointers (ie, sub page placeholders). + */ + beint64_t *end_ind_base; + uint32_t end_ind_count; + uint64_t end_ind_size; + + /* END allocation bitmap. Each bit represent #priority ENDs */ + bitmap_t *end_map; + + /* Indirect NVT/VP table. NULL entries are unallocated, count is + * the numbre of pointers (ie, sub page placeholders). + */ + beint64_t *vp_ind_base; + uint32_t vp_ind_count; + uint64_t vp_ind_size; + + /* VP space size. Depends on Gen1/2 mode */ + uint32_t vp_shift; + + /* Pool of donated pages for provisioning indirect END and VP pages */ + struct list_head donated_pages; + + /* To ease a possible change to supporting more than one block of + * interrupts per chip, we store here the "base" global number + * and max number of interrupts for this chip. The global number + * encompass the block number and index. + */ + uint32_t int_base; + uint32_t int_count; + + /* Due to the overlap between IPIs and HW sources in the EAS table, + * we keep some kind of top-down allocator. It is used for HW sources + * to "allocate" interrupt entries and will limit what can be handed + * out as IPIs. Of course this assumes we "allocate" all HW sources + * before we start handing out IPIs. 
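+ *
+ * For example, with XIVE_INT_ORDER = 20 the chip has a 1M range:
+ * int_hw_bot starts at the top and moves down as HW sources are
+ * registered, and IPIs are handed out from the bottom as long as
+ * int_ipi_top stays below int_hw_bot.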
+ * + * Note: The numbers here are global interrupt numbers so that we can + * potentially handle more than one block per chip in the future. + */ + uint32_t int_hw_bot; /* Bottom of HW allocation */ + uint32_t int_ipi_top; /* Highest IPI handed out so far + 1 */ + + /* The IPI allocation bitmap */ + bitmap_t *ipi_alloc_map; + + /* We keep track of which interrupts were ever enabled to + * speed up xive_reset + */ + bitmap_t *int_enabled_map; + + /* Embedded source IPIs */ + struct xive_src ipis; + + /* Embedded escalation interrupts */ + struct xive_src esc_irqs; + + /* In memory queue overflow */ + void *q_ovf; + + /* Cache/sync injection */ + uint64_t sync_inject_size; + void *sync_inject; + + /* INT HW Errata */ + uint64_t quirks; +}; + +#define XIVE_CAN_STORE_EOI(x) XIVE2_STORE_EOI_ENABLED + +/* First XIVE unit configured on the system */ +static struct xive *one_xive; + +/* Global DT node */ +static struct dt_node *xive_dt_node; + +/* Block <-> Chip conversions. + * + * As chipIDs may not be within the range of 16 block IDs supported by XIVE, + * we have a 2 way conversion scheme. + * + * From block to chip, use the global table below. + * + * From chip to block, a field in struct proc_chip contains the first block + * of that chip. For now we only support one block per chip but that might + * change in the future + */ +#define XIVE_INVALID_CHIP 0xffffffff +#define XIVE_MAX_CHIPS 16 +static uint32_t xive_block_to_chip[XIVE_MAX_CHIPS]; +static uint32_t xive_block_count; + +static uint32_t xive_chip_to_block(uint32_t chip_id) +{ + struct proc_chip *c = get_chip(chip_id); + + assert(c); + assert(c->xive); + return c->xive->block_id; +} + +/* + * Conversion between GIRQ and block/index. + * + * ------------------------------------ + * |000E|BLOC| INDEX| + * ------------------------------------ + * 4 4 24 + * + * the E bit indicates that this is an escalation interrupt, in + * that case, the BLOC/INDEX represents the END containing the + * corresponding escalation descriptor. + * + * Global interrupt numbers for non-escalation interrupts are thus + * limited to 28 bits. + */ + +#define INT_SHIFT 24 +#define INT_ESC_SHIFT (INT_SHIFT + 4) /* 4bits block id */ + +#if XIVE_INT_ORDER > INT_SHIFT +#error "Too many ESBs for IRQ encoding" +#endif + +#if XIVE_END_ORDER > INT_SHIFT +#error "Too many ENDs for escalation IRQ number encoding" +#endif + +#define GIRQ_TO_BLK(__g) (((__g) >> INT_SHIFT) & 0xf) +#define GIRQ_TO_IDX(__g) ((__g) & ((1 << INT_SHIFT) - 1)) +#define BLKIDX_TO_GIRQ(__b,__i) (((uint32_t)(__b)) << INT_SHIFT | (__i)) + +#define GIRQ_IS_ESCALATION(__g) ((__g) & (1 << INT_ESC_SHIFT)) +#define MAKE_ESCALATION_GIRQ(__b,__i)(BLKIDX_TO_GIRQ(__b,__i) | (1 << INT_ESC_SHIFT)) + + +/* Block/IRQ to chip# conversions */ +#define PC_BLK_TO_CHIP(__b) (xive_block_to_chip[__b]) +#define VC_BLK_TO_CHIP(__b) (xive_block_to_chip[__b]) +#define GIRQ_TO_CHIP(__isn) (VC_BLK_TO_CHIP(GIRQ_TO_BLK(__isn))) + +/* Routing of physical processors to VPs */ +#define PIR2VP_IDX( __pir) (xive_hw_vp_base | P10_PIR2LOCALCPU(__pir)) +#define PIR2VP_BLK(__pir) (xive_chip_to_block(P10_PIR2GCID(__pir))) +#define VP2PIR(__blk, __idx) (P10_PIRFROMLOCALCPU(VC_BLK_TO_CHIP(__blk), (__idx) & 0xff)) + +/* Decoding of OPAL API VP IDs. 
The VP IDs are encoded as follow + * + * Block group mode: + * + * ----------------------------------- + * |GVEOOOOO| INDEX| + * ----------------------------------- + * || | + * || Order + * |Virtual + * Group + * + * G (Group) : Set to 1 for a group VP (not currently supported) + * V (Virtual) : Set to 1 for an allocated VP (vs. a physical processor ID) + * E (Error) : Should never be 1, used internally for errors + * O (Order) : Allocation order of the VP block + * + * The conversion is thus done as follow (groups aren't implemented yet) + * + * If V=0, O must be 0 and 24-bit INDEX value is the PIR + * If V=1, the order O group is allocated such that if N is the number of + * chip bits considered for allocation (*) + * then the INDEX is constructed as follow (bit numbers such as 0=LSB) + * - bottom O-N bits is the index within the "VP block" + * - next N bits is the XIVE blockID of the VP + * - the remaining bits is the per-chip "base" + * so the conversion consists of "extracting" the block ID and moving + * down the upper bits by N bits. + * + * In non-block-group mode, the difference is that the blockID is + * on the left of the index (the entire VP block is in a single + * block ID) + */ + +#define VP_GROUP_SHIFT 31 +#define VP_VIRTUAL_SHIFT 30 +#define VP_ERROR_SHIFT 29 +#define VP_ORDER_SHIFT 24 + +#define vp_group(vp) (((vp) >> VP_GROUP_SHIFT) & 1) +#define vp_virtual(vp) (((vp) >> VP_VIRTUAL_SHIFT) & 1) +#define vp_order(vp) (((vp) >> VP_ORDER_SHIFT) & 0x1f) +#define vp_index(vp) ((vp) & ((1 << VP_ORDER_SHIFT) - 1)) + +/* VP allocation */ +static uint32_t xive_chips_alloc_bits = 0; +static struct buddy *xive_vp_buddy; +static struct lock xive_buddy_lock = LOCK_UNLOCKED; + +/* VP# decoding/encoding */ +static bool xive_decode_vp(uint32_t vp, uint32_t *blk, uint32_t *idx, + uint8_t *order, bool *group) +{ + uint32_t o = vp_order(vp); + uint32_t n = xive_chips_alloc_bits; + uint32_t index = vp_index(vp); + uint32_t imask = (1 << (o - n)) - 1; + + /* Groups not supported yet */ + if (vp_group(vp)) + return false; + if (group) + *group = false; + + /* PIR case */ + if (!vp_virtual(vp)) { + if (find_cpu_by_pir(index) == NULL) + return false; + if (blk) + *blk = PIR2VP_BLK(index); + if (idx) + *idx = PIR2VP_IDX(index); + return true; + } + + /* Ensure o > n, we have *at least* 2 VPs per block */ + if (o <= n) + return false; + + /* Combine the index base and index */ + if (idx) + *idx = ((index >> n) & ~imask) | (index & imask); + /* Extract block ID */ + if (blk) + *blk = (index >> (o - n)) & ((1 << n) - 1); + + /* Return order as well if asked for */ + if (order) + *order = o; + + return true; +} + +static uint32_t xive_encode_vp(uint32_t blk, uint32_t idx, uint32_t order) +{ + uint32_t vp = (1 << VP_VIRTUAL_SHIFT) | (order << VP_ORDER_SHIFT); + uint32_t n = xive_chips_alloc_bits; + uint32_t imask = (1 << (order - n)) - 1; + + vp |= (idx & ~imask) << n; + vp |= blk << (order - n); + vp |= idx & imask; + return vp; +} + +/* + * XSCOM/MMIO helpers + */ +#define XIVE_NO_MMIO -1 + +#define xive_regw(__x, __r, __v) \ + __xive_regw(__x, __r, X_##__r, __v, #__r) +#define xive_regr(__x, __r) \ + __xive_regr(__x, __r, X_##__r, #__r) +#define xive_regwx(__x, __r, __v) \ + __xive_regw(__x, XIVE_NO_MMIO, X_##__r, __v, #__r) +#define xive_regrx(__x, __r) \ + __xive_regr(__x, XIVE_NO_MMIO, X_##__r, #__r) + +#ifdef XIVE_VERBOSE_DEBUG +#define xive_vdbg(__x,__fmt,...) prlog(PR_DEBUG,"[ IC %02x ] " __fmt, (__x)->chip_id, ##__VA_ARGS__) +#define xive_cpu_vdbg(__c,__fmt,...) 
prlog(PR_DEBUG,"[CPU %04x] " __fmt, (__c)->pir, ##__VA_ARGS__) +#else +#define xive_vdbg(x,fmt,...) do { } while(0) +#define xive_cpu_vdbg(x,fmt,...) do { } while(0) +#endif + +#define xive_dbg(__x,__fmt,...) prlog(PR_DEBUG,"[ IC %02x ] " __fmt, (__x)->chip_id, ##__VA_ARGS__) +#define xive_cpu_dbg(__c,__fmt,...) prlog(PR_DEBUG,"[CPU %04x] " __fmt, (__c)->pir, ##__VA_ARGS__) +#define xive_notice(__x,__fmt,...) prlog(PR_NOTICE,"[ IC %02x ] " __fmt, (__x)->chip_id, ##__VA_ARGS__) +#define xive_cpu_notice(__c,__fmt,...) prlog(PR_NOTICE,"[CPU %04x] " __fmt, (__c)->pir, ##__VA_ARGS__) +#define xive_warn(__x,__fmt,...) prlog(PR_WARNING,"[ IC %02x ] " __fmt, (__x)->chip_id, ##__VA_ARGS__) +#define xive_cpu_warn(__c,__fmt,...) prlog(PR_WARNING,"[CPU %04x] " __fmt, (__c)->pir, ##__VA_ARGS__) +#define xive_err(__x,__fmt,...) prlog(PR_ERR,"[ IC %02x ] " __fmt, (__x)->chip_id, ##__VA_ARGS__) +#define xive_cpu_err(__c,__fmt,...) prlog(PR_ERR,"[CPU %04x] " __fmt, (__c)->pir, ##__VA_ARGS__) + +/* + * The XIVE subengine being accessed can be deduced from the XSCOM + * reg, and from there, the page offset in the IC BAR. + */ +static void* xive_ic_page(struct xive *x, uint32_t x_reg) +{ + uint64_t pgoff = (x_reg >> 8) & 0x3; + + return x->ic_base + (pgoff << x->ic_shift); +} + +static void __xive_regw(struct xive *x, uint32_t m_reg, uint32_t x_reg, uint64_t v, + const char *rname) +{ + bool use_xscom = (m_reg == XIVE_NO_MMIO) || !x->ic_base; + int64_t rc; + + x->last_reg_error = false; + + assert(x_reg != 0); + + if (use_xscom) { + rc = xscom_write(x->chip_id, x->xscom_base + x_reg, v); + if (rc) { + if (!rname) + rname = "???"; + xive_err(x, "Error writing register %s\n", rname); + /* Anything else we can do here ? */ + x->last_reg_error = true; + } + } else { + out_be64(xive_ic_page(x, x_reg) + m_reg, v); + } +} + +static uint64_t __xive_regr(struct xive *x, uint32_t m_reg, uint32_t x_reg, + const char *rname) +{ + bool use_xscom = (m_reg == XIVE_NO_MMIO) || !x->ic_base; + int64_t rc; + uint64_t val; + + x->last_reg_error = false; + + assert(x_reg != 0); + + if (use_xscom) { + rc = xscom_read(x->chip_id, x->xscom_base + x_reg, &val); + if (rc) { + if (!rname) + rname = "???"; + xive_err(x, "Error reading register %s\n", rname); + /* Anything else we can do here ? 
*/ + x->last_reg_error = true; + return -1ull; + } + } else { + val = in_be64(xive_ic_page(x, x_reg) + m_reg); + } + return val; +} + +/* Locate a controller from an IRQ number */ +static struct xive *xive_from_isn(uint32_t isn) +{ + uint32_t chip_id = GIRQ_TO_CHIP(isn); + struct proc_chip *c = get_chip(chip_id); + + if (!c) + return NULL; + return c->xive; +} + +static struct xive *xive_from_pc_blk(uint32_t blk) +{ + uint32_t chip_id = PC_BLK_TO_CHIP(blk); + struct proc_chip *c = get_chip(chip_id); + + if (!c) + return NULL; + return c->xive; +} + +static struct xive *xive_from_vc_blk(uint32_t blk) +{ + uint32_t chip_id = VC_BLK_TO_CHIP(blk); + struct proc_chip *c = get_chip(chip_id); + + if (!c) + return NULL; + return c->xive; +} + +static struct xive_end *xive_get_end(struct xive *x, unsigned int idx) +{ + struct xive_end *p; + + if (idx >= (x->end_ind_count * END_PER_PAGE)) + return NULL; + p = (struct xive_end *)(be64_to_cpu(x->end_ind_base[idx / END_PER_PAGE]) & + VSD_ADDRESS_MASK); + if (!p) + return NULL; + + return &p[idx % END_PER_PAGE]; +} + +static struct xive_eas *xive_get_eas(struct xive *x, unsigned int isn) +{ + struct xive_eas *eat; + uint32_t idx = GIRQ_TO_IDX(isn); + + if (GIRQ_IS_ESCALATION(isn)) { + /* Allright, an escalation EAS is buried inside an END, let's + * try to find it + */ + struct xive_end *end; + + if (x->chip_id != VC_BLK_TO_CHIP(GIRQ_TO_BLK(isn))) { + xive_err(x, "%s, ESC ISN 0x%x not on right chip\n", + __func__, isn); + return NULL; + } + end = xive_get_end(x, idx); + if (!end) { + xive_err(x, "%s, ESC ISN 0x%x END not found\n", + __func__, isn); + return NULL; + } + + /* If using single-escalation, don't let anybody get + * to the individual escalation interrupts + */ + if (xive_get_field32(END_W0_UNCOND_ESCALATE, end->w0)) + return NULL; + + /* Grab the escalation END */ + return (struct xive_eas *)(char *)&end->w4; + } else { + /* Check the block matches */ + if (isn < x->int_base || isn >= x->int_count) { + xive_err(x, "%s, ISN 0x%x not on right chip\n", + __func__, isn); + return NULL; + } + assert (idx < XIVE_INT_COUNT); + + /* If we support >1 block per chip, this should still + * work as we are likely to make the table contiguous + * anyway + */ + eat = x->eat_base; + assert(eat); + + return eat + idx; + } +} + +static struct xive_nvp *xive_get_vp(struct xive *x, unsigned int idx) +{ + struct xive_nvp *p; + + assert(idx < (x->vp_ind_count * VP_PER_PAGE)); + p = (struct xive_nvp *)(be64_to_cpu(x->vp_ind_base[idx / VP_PER_PAGE]) & + VSD_ADDRESS_MASK); + if (!p) + return NULL; + + return &p[idx % VP_PER_PAGE]; +} + +/* + * Store the END base of the VP in W5, using the new architected field + * in P10. Used to be the pressure relief interrupt field on P9. + */ +static void xive_vp_set_end_base(struct xive_nvp *vp, + uint32_t end_blk, uint32_t end_idx) +{ + vp->w5 = xive_set_field32(NVP_W5_VP_END_BLOCK, 0, end_blk) | + xive_set_field32(NVP_W5_VP_END_INDEX, 0, end_idx); + + /* This is the criteria to know if a VP was allocated */ + assert(vp->w5 != 0); +} + +static void xive_init_default_vp(struct xive_nvp *vp, + uint32_t end_blk, uint32_t end_idx) +{ + memset(vp, 0, sizeof(struct xive_nvp)); + + xive_vp_set_end_base(vp, end_blk, end_idx); + + vp->w0 = xive_set_field32(NVP_W0_VALID, 0, 1); +} + +/* + * VPs of the HW threads have their own set of ENDs which is allocated + * when XIVE is initialized. These are tagged with a FIRMWARE bit so + * that they can be identified when the driver is reset (kexec). 
+ */ +static void xive_init_hw_end(struct xive_end *end) +{ + memset(end, 0, sizeof(struct xive_end)); + end->w0 = xive_set_field32(END_W0_FIRMWARE1, 0, 1); +} + +static void *xive_get_donated_page(struct xive *x) +{ + return (void *)list_pop_(&x->donated_pages, 0); +} + +#define XIVE_ALLOC_IS_ERR(_idx) ((_idx) >= 0xfffffff0) + +#define XIVE_ALLOC_NO_SPACE 0xffffffff /* No possible space */ +#define XIVE_ALLOC_NO_IND 0xfffffffe /* Indirect need provisioning */ +#define XIVE_ALLOC_NO_MEM 0xfffffffd /* Local allocation failed */ + +static uint32_t xive_alloc_end_set(struct xive *x, bool alloc_indirect) +{ + uint32_t ind_idx; + int idx; + int end_base_idx; + + xive_vdbg(x, "Allocating END set...\n"); + + assert(x->end_map); + + /* Allocate from the END bitmap. Each bit is 8 ENDs */ + idx = bitmap_find_zero_bit(*x->end_map, 0, xive_end_bitmap_size(x)); + if (idx < 0) { + xive_dbg(x, "Allocation from END bitmap failed !\n"); + return XIVE_ALLOC_NO_SPACE; + } + + end_base_idx = idx << xive_cfg_vp_prio_shift(x); + + xive_vdbg(x, "Got ENDs 0x%x..0x%x\n", end_base_idx, + end_base_idx + xive_max_prio(x)); + + /* Calculate the indirect page where the ENDs reside */ + ind_idx = end_base_idx / END_PER_PAGE; + + /* Is there an indirect page ? If not, check if we can provision it */ + if (!x->end_ind_base[ind_idx]) { + /* Default flags */ + uint64_t vsd_flags = SETFIELD(VSD_TSIZE, 0ull, 4) | + SETFIELD(VSD_MODE, 0ull, VSD_MODE_EXCLUSIVE); + void *page; + + /* If alloc_indirect is set, allocate the memory from OPAL own, + * otherwise try to provision from the donated pool + */ + if (alloc_indirect) { + /* Allocate/provision indirect page during boot only */ + xive_vdbg(x, "Indirect empty, provisioning from local pool\n"); + page = local_alloc(x->chip_id, PAGE_SIZE, PAGE_SIZE); + if (!page) { + xive_dbg(x, "provisioning failed !\n"); + return XIVE_ALLOC_NO_MEM; + } + vsd_flags |= VSD_FIRMWARE; + } else { + xive_vdbg(x, "Indirect empty, provisioning from donated pages\n"); + page = xive_get_donated_page(x); + if (!page) { + xive_vdbg(x, "no idirect pages available !\n"); + return XIVE_ALLOC_NO_IND; + } + } + memset(page, 0, PAGE_SIZE); + x->end_ind_base[ind_idx] = cpu_to_be64(vsd_flags | + (((uint64_t)page) & VSD_ADDRESS_MASK)); + /* Any cache scrub needed ? */ + } + + bitmap_set_bit(*x->end_map, idx); + return end_base_idx; +} + +static void xive_free_end_set(struct xive *x, uint32_t ends) +{ + uint32_t idx; + uint8_t prio_mask = xive_max_prio(x); + + xive_vdbg(x, "Freeing END 0x%x..0x%x\n", ends, ends + xive_max_prio(x)); + + assert((ends & prio_mask) == 0); + assert(x->end_map); + + idx = ends >> xive_cfg_vp_prio_shift(x); + bitmap_clr_bit(*x->end_map, idx); +} + +static bool xive_provision_vp_ind(struct xive *x, uint32_t vp_idx, uint32_t order) +{ + uint32_t pbase, pend, i; + + pbase = vp_idx / VP_PER_PAGE; + pend = (vp_idx + (1 << order)) / VP_PER_PAGE; + + for (i = pbase; i <= pend; i++) { + void *page; + u64 vsd; + + /* Already provisioned ? 
*/ + if (x->vp_ind_base[i]) + continue; + + /* Try to grab a donated page */ + page = xive_get_donated_page(x); + if (!page) + return false; + + /* Install the page */ + memset(page, 0, PAGE_SIZE); + vsd = ((uint64_t)page) & VSD_ADDRESS_MASK; + vsd |= SETFIELD(VSD_TSIZE, 0ull, 4); + vsd |= SETFIELD(VSD_MODE, 0ull, VSD_MODE_EXCLUSIVE); + x->vp_ind_base[i] = cpu_to_be64(vsd); + } + return true; +} + +static void xive_init_vp_allocator(void) +{ + /* Initialize chip alloc bits */ + xive_chips_alloc_bits = ilog2(xive_block_count); + + prlog(PR_INFO, "%d chips considered for VP allocations\n", + 1 << xive_chips_alloc_bits); + + /* Allocate a buddy big enough for XIVE_VP_ORDER allocations. + * + * each bit in the buddy represents 1 << xive_chips_alloc_bits + * VPs. + */ + xive_vp_buddy = buddy_create(XIVE_VP_ORDER(one_xive)); + assert(xive_vp_buddy); + + /* + * We reserve the whole range of VP ids representing HW threads. + */ + assert(buddy_reserve(xive_vp_buddy, xive_hw_vp_base, + xive_threadid_shift)); +} + +static uint32_t xive_alloc_vps(uint32_t order) +{ + uint32_t local_order, i; + int vp; + + /* The minimum order is 2 VPs per chip */ + if (order < (xive_chips_alloc_bits + 1)) + order = xive_chips_alloc_bits + 1; + + /* We split the allocation */ + local_order = order - xive_chips_alloc_bits; + + /* We grab that in the global buddy */ + assert(xive_vp_buddy); + lock(&xive_buddy_lock); + vp = buddy_alloc(xive_vp_buddy, local_order); + unlock(&xive_buddy_lock); + if (vp < 0) + return XIVE_ALLOC_NO_SPACE; + + /* Provision on every chip considered for allocation */ + for (i = 0; i < (1 << xive_chips_alloc_bits); i++) { + struct xive *x = xive_from_pc_blk(i); + bool success; + + /* Return internal error & log rather than assert ? */ + assert(x); + lock(&x->lock); + success = xive_provision_vp_ind(x, vp, local_order); + unlock(&x->lock); + if (!success) { + lock(&xive_buddy_lock); + buddy_free(xive_vp_buddy, vp, local_order); + unlock(&xive_buddy_lock); + return XIVE_ALLOC_NO_IND; + } + } + + /* Encode the VP number. 
"blk" is 0 as this represents + * all blocks and the allocation always starts at 0 + */ + return xive_encode_vp(0, vp, order); +} + +static void xive_free_vps(uint32_t vp) +{ + uint32_t idx; + uint8_t order, local_order; + + assert(xive_decode_vp(vp, NULL, &idx, &order, NULL)); + + /* We split the allocation */ + local_order = order - xive_chips_alloc_bits; + + /* Free that in the buddy */ + lock(&xive_buddy_lock); + buddy_free(xive_vp_buddy, idx, local_order); + unlock(&xive_buddy_lock); +} + +enum xive_cache_type { + xive_cache_easc, + xive_cache_esbc, + xive_cache_endc, + xive_cache_nxc, +}; + +/* + * Cache update + */ + +#define FLUSH_CTRL_POLL_VALID PPC_BIT(0) /* POLL bit is the same for all */ + +static int64_t __xive_cache_scrub(struct xive *x, + enum xive_cache_type ctype, + uint64_t block, uint64_t idx, + bool want_inval __unused, bool want_disable __unused) +{ + uint64_t ctrl_reg, x_ctrl_reg; + uint64_t poll_val, ctrl_val; + +#ifdef XIVE_CHECK_LOCKS + assert(lock_held_by_me(&x->lock)); +#endif + switch (ctype) { + case xive_cache_easc: + poll_val = + SETFIELD(VC_EASC_FLUSH_POLL_BLOCK_ID, 0ll, block) | + SETFIELD(VC_EASC_FLUSH_POLL_OFFSET, 0ll, idx) | + VC_EASC_FLUSH_POLL_BLOCK_ID_MASK | + VC_EASC_FLUSH_POLL_OFFSET_MASK; + xive_regw(x, VC_EASC_FLUSH_POLL, poll_val); + ctrl_reg = VC_EASC_FLUSH_CTRL; + x_ctrl_reg = X_VC_EASC_FLUSH_CTRL; + break; + case xive_cache_esbc: + poll_val = + SETFIELD(VC_ESBC_FLUSH_POLL_BLOCK_ID, 0ll, block) | + SETFIELD(VC_ESBC_FLUSH_POLL_OFFSET, 0ll, idx) | + VC_ESBC_FLUSH_POLL_BLOCK_ID_MASK | + VC_ESBC_FLUSH_POLL_OFFSET_MASK; + xive_regw(x, VC_ESBC_FLUSH_POLL, poll_val); + ctrl_reg = VC_ESBC_FLUSH_CTRL; + x_ctrl_reg = X_VC_ESBC_FLUSH_CTRL; + break; + case xive_cache_endc: + poll_val = + SETFIELD(VC_ENDC_FLUSH_POLL_BLOCK_ID, 0ll, block) | + SETFIELD(VC_ENDC_FLUSH_POLL_OFFSET, 0ll, idx) | + VC_ENDC_FLUSH_POLL_BLOCK_ID_MASK | + VC_ENDC_FLUSH_POLL_OFFSET_MASK; + xive_regw(x, VC_ENDC_FLUSH_POLL, poll_val); + ctrl_reg = VC_ENDC_FLUSH_CTRL; + x_ctrl_reg = X_VC_ENDC_FLUSH_CTRL; + break; + case xive_cache_nxc: + poll_val = + SETFIELD(PC_NXC_FLUSH_POLL_BLOCK_ID, 0ll, block) | + SETFIELD(PC_NXC_FLUSH_POLL_OFFSET, 0ll, idx) | + PC_NXC_FLUSH_POLL_BLOCK_ID_MASK | + PC_NXC_FLUSH_POLL_OFFSET_MASK; + xive_regw(x, PC_NXC_FLUSH_POLL, poll_val); + ctrl_reg = PC_NXC_FLUSH_CTRL; + x_ctrl_reg = X_PC_NXC_FLUSH_CTRL; + break; + default: + return OPAL_INTERNAL_ERROR; + } + + /* XXX Add timeout !!! 
*/ + for (;;) { + ctrl_val = __xive_regr(x, ctrl_reg, x_ctrl_reg, NULL); + if (!(ctrl_val & FLUSH_CTRL_POLL_VALID)) + break; + /* Small delay */ + time_wait(100); + } + sync(); + return 0; +} + +static int64_t xive_easc_scrub(struct xive *x, uint64_t block, uint64_t idx) +{ + return __xive_cache_scrub(x, xive_cache_easc, block, idx, false, false); +} + +static int64_t xive_nxc_scrub(struct xive *x, uint64_t block, uint64_t idx) +{ + return __xive_cache_scrub(x, xive_cache_nxc, block, idx, false, false); +} + +static int64_t xive_nxc_scrub_clean(struct xive *x, uint64_t block, uint64_t idx) +{ + return __xive_cache_scrub(x, xive_cache_nxc, block, idx, true, false); +} + +static int64_t xive_endc_scrub(struct xive *x, uint64_t block, uint64_t idx) +{ + return __xive_cache_scrub(x, xive_cache_endc, block, idx, false, false); +} + +#define XIVE_CACHE_WATCH_MAX_RETRIES 10 + +static int64_t __xive_cache_watch(struct xive *x, enum xive_cache_type ctype, + uint64_t block, uint64_t idx, + uint32_t start_dword, uint32_t dword_count, + beint64_t *new_data, bool light_watch, + bool synchronous) +{ + uint64_t sreg, sregx, dreg0, dreg0x; + uint64_t dval0, sval, status; + int64_t i; + int retries = 0; + +#ifdef XIVE_CHECK_LOCKS + assert(lock_held_by_me(&x->lock)); +#endif + switch (ctype) { + case xive_cache_endc: + sreg = VC_ENDC_WATCH0_SPEC; + sregx = X_VC_ENDC_WATCH0_SPEC; + dreg0 = VC_ENDC_WATCH0_DATA0; + dreg0x = X_VC_ENDC_WATCH0_DATA0; + sval = SETFIELD(VC_ENDC_WATCH_BLOCK_ID, idx, block); + break; + case xive_cache_nxc: + sreg = PC_NXC_WATCH0_SPEC; + sregx = X_PC_NXC_WATCH0_SPEC; + dreg0 = PC_NXC_WATCH0_DATA0; + dreg0x = X_PC_NXC_WATCH0_DATA0; + sval = SETFIELD(PC_NXC_WATCH_BLOCK_ID, idx, block); + break; + default: + return OPAL_INTERNAL_ERROR; + } + + /* The full bit is in the same position for ENDC and NXC */ + if (!light_watch) + sval |= VC_ENDC_WATCH_FULL; + + for (;;) { + /* Write the cache watch spec */ + __xive_regw(x, sreg, sregx, sval, NULL); + + /* Load data0 register to populate the watch */ + dval0 = __xive_regr(x, dreg0, dreg0x, NULL); + + /* If new_data is NULL, this is a dummy watch used as a + * workaround for a HW bug + */ + if (!new_data) { + __xive_regw(x, dreg0, dreg0x, dval0, NULL); + return 0; + } + + /* Write the words into the watch facility. We write in reverse + * order in case word 0 is part of it as it must be the last + * one written. + */ + for (i = start_dword + dword_count - 1; i >= start_dword ;i--) { + uint64_t dw = be64_to_cpu(new_data[i - start_dword]); + __xive_regw(x, dreg0 + i * 8, dreg0x + i, dw, NULL); + } + + /* Write data0 register to trigger the update if word 0 wasn't + * written above + */ + if (start_dword > 0) + __xive_regw(x, dreg0, dreg0x, dval0, NULL); + + /* This may not be necessary for light updates (it's possible + * that a sync in sufficient, TBD). Ensure the above is + * complete and check the status of the watch. + */ + status = __xive_regr(x, sreg, sregx, NULL); + + /* Bits FULL and CONFLICT are in the same position in + * ENDC and NXC + */ + if (!(status & VC_ENDC_WATCH_FULL) || + !(status & VC_ENDC_WATCH_CONFLICT)) + break; + if (!synchronous) + return OPAL_BUSY; + + if (++retries == XIVE_CACHE_WATCH_MAX_RETRIES) { + xive_err(x, "Reached maximum retries %d when doing " + "a %s cache update\n", retries, + ctype == xive_cache_endc ? 
"ENDC" : "NXC"); + return OPAL_BUSY; + } + } + + /* Perform a scrub with "want_invalidate" set to false to push the + * cache updates to memory as well + */ + return __xive_cache_scrub(x, ctype, block, idx, false, false); +} + +#ifdef XIVE_DEBUG_INIT_CACHE_UPDATES +static bool xive_check_endc_update(struct xive *x, uint32_t idx, struct xive_end *end) +{ + struct xive_end *end_p = xive_get_end(x, idx); + struct xive_end end2; + + assert(end_p); + end2 = *end_p; + if (memcmp(end, &end2, sizeof(struct xive_end)) != 0) { + xive_err(x, "END update mismatch idx %d\n", idx); + xive_err(x, "want: %08x %08x %08x %08x\n", + end->w0, end->w1, end->w2, end->w3); + xive_err(x, " %08x %08x %08x %08x\n", + end->w4, end->w5, end->w6, end->w7); + xive_err(x, "got : %08x %08x %08x %08x\n", + end2.w0, end2.w1, end2.w2, end2.w3); + xive_err(x, " %08x %08x %08x %08x\n", + end2.w4, end2.w5, end2.w6, end2.w7); + return false; + } + return true; +} + +static bool xive_check_nxc_update(struct xive *x, uint32_t idx, struct xive_nvp *vp) +{ + struct xive_nvp *vp_p = xive_get_vp(x, idx); + struct xive_nvp vp2; + + assert(vp_p); + vp2 = *vp_p; + if (memcmp(vp, &vp2, sizeof(struct xive_nvp)) != 0) { + xive_err(x, "VP update mismatch idx %d\n", idx); + xive_err(x, "want: %08x %08x %08x %08x\n", + vp->w0, vp->w1, vp->w2, vp->w3); + xive_err(x, " %08x %08x %08x %08x\n", + vp->w4, vp->w5, vp->w6, vp->w7); + xive_err(x, "got : %08x %08x %08x %08x\n", + vp2.w0, vp2.w1, vp2.w2, vp2.w3); + xive_err(x, " %08x %08x %08x %08x\n", + vp2.w4, vp2.w5, vp2.w6, vp2.w7); + return false; + } + return true; +} +#else +static inline bool xive_check_endc_update(struct xive *x __unused, + uint32_t idx __unused, + struct xive_end *end __unused) +{ + return true; +} + +static inline bool xive_check_nxc_update(struct xive *x __unused, + uint32_t idx __unused, + struct xive_nvp *vp __unused) +{ + return true; +} +#endif + +static int64_t xive_escalation_ive_cache_update(struct xive *x, uint64_t block, + uint64_t idx, struct xive_eas *eas, + bool synchronous) +{ + return __xive_cache_watch(x, xive_cache_endc, block, idx, + 2, 1, &eas->w, true, synchronous); +} + +static int64_t xive_endc_cache_update(struct xive *x, uint64_t block, + uint64_t idx, struct xive_end *end, + bool synchronous) +{ + int64_t ret; + + ret = __xive_cache_watch(x, xive_cache_endc, block, idx, + 0, 4, (beint64_t *)end, false, synchronous); + xive_check_endc_update(x, idx, end); + return ret; +} + +static int64_t xive_nxc_cache_update(struct xive *x, uint64_t block, + uint64_t idx, struct xive_nvp *vp, + bool synchronous) +{ + int64_t ret; + + ret = __xive_cache_watch(x, xive_cache_nxc, block, idx, + 0, 4, (beint64_t *)vp, false, synchronous); + xive_check_nxc_update(x, idx, vp); + return ret; +} + +/* + * VSD + */ +static bool xive_set_vsd(struct xive *x, uint32_t tbl, uint32_t idx, uint64_t v) +{ + /* Set VC subengine */ + xive_regw(x, VC_VSD_TABLE_ADDR, + SETFIELD(VC_VSD_TABLE_SELECT, 0ull, tbl) | + SETFIELD(VC_VSD_TABLE_ADDRESS, 0ull, idx)); + if (x->last_reg_error) + return false; + xive_regw(x, VC_VSD_TABLE_DATA, v); + if (x->last_reg_error) + return false; + + /* also set PC subengine if table is used */ + if (tbl == VST_EAS || tbl == VST_ERQ || tbl == VST_IC) + return true; + + xive_regw(x, PC_VSD_TABLE_ADDR, + SETFIELD(PC_VSD_TABLE_SELECT, 0ull, tbl) | + SETFIELD(PC_VSD_TABLE_ADDRESS, 0ull, idx)); + if (x->last_reg_error) + return false; + xive_regw(x, PC_VSD_TABLE_DATA, v); + if (x->last_reg_error) + return false; + return true; +} + +static bool 
xive_set_local_tables(struct xive *x) +{ + uint64_t base, i; + + /* These have to be power of 2 sized */ + assert(is_pow2(XIVE_ESB_SIZE)); + assert(is_pow2(XIVE_EAT_SIZE)); + + /* All tables set as exclusive */ + base = SETFIELD(VSD_MODE, 0ull, VSD_MODE_EXCLUSIVE); + + /* ESB: direct mode */ + if (!xive_set_vsd(x, VST_ESB, x->block_id, base | + (((uint64_t)x->sbe_base) & VSD_ADDRESS_MASK) | + SETFIELD(VSD_TSIZE, 0ull, ilog2(XIVE_ESB_SIZE) - 12))) + return false; + + /* EAS: direct mode */ + if (!xive_set_vsd(x, VST_EAS, x->block_id, base | + (((uint64_t)x->eat_base) & VSD_ADDRESS_MASK) | + SETFIELD(VSD_TSIZE, 0ull, ilog2(XIVE_EAT_SIZE) - 12))) + return false; + + /* END: indirect mode with 64K subpages */ + if (!xive_set_vsd(x, VST_END, x->block_id, base | + (((uint64_t)x->end_ind_base) & VSD_ADDRESS_MASK) | + VSD_INDIRECT | SETFIELD(VSD_TSIZE, 0ull, + ilog2(x->end_ind_size) - 12))) + return false; + + /* NVP: indirect mode with 64K subpages */ + if (!xive_set_vsd(x, VST_NVP, x->block_id, base | + (((uint64_t)x->vp_ind_base) & VSD_ADDRESS_MASK) | + VSD_INDIRECT | SETFIELD(VSD_TSIZE, 0ull, + ilog2(x->vp_ind_size) - 12))) + return false; + + /* NVG: not used */ + /* NVC: not used */ + + /* INT and SYNC: indexed with the Topology# */ + if (!xive_set_vsd(x, VST_IC, x->chip_id, base | + (((uint64_t)x->ic_base) & VSD_ADDRESS_MASK) | + SETFIELD(VSD_TSIZE, 0ull, ilog2(x->ic_size) - 12))) + return false; + + if (!xive_set_vsd(x, VST_SYNC, x->chip_id, base | + (((uint64_t)x->sync_inject) & VSD_ADDRESS_MASK) | + SETFIELD(VSD_TSIZE, 0ull, ilog2(x->sync_inject_size) - 12))) + return false; + + /* + * ERQ: one 64K page for each queue overflow. Indexed with : + * + * 0:IPI, 1:HWD, 2:NxC, 3:INT, 4:OS-Queue, 5:Pool-Queue, 6:Hard-Queue + */ + for (i = 0; i < VC_QUEUE_COUNT; i++) { + u64 addr = ((uint64_t)x->q_ovf) + i * PAGE_SIZE; + u64 cfg, sreg, sregx; + + if (!xive_set_vsd(x, VST_ERQ, i, base | + (addr & VSD_ADDRESS_MASK) | + SETFIELD(VSD_TSIZE, 0ull, 4))) + return false; + + sreg = VC_QUEUES_CFG_REM0 + i * 8; + sregx = X_VC_QUEUES_CFG_REM0 + i; + cfg = __xive_regr(x, sreg, sregx, NULL); + cfg |= VC_QUEUES_CFG_MEMB_EN; + cfg = SETFIELD(VC_QUEUES_CFG_MEMB_SZ, cfg, 4); + __xive_regw(x, sreg, sregx, cfg, NULL); + } + + return true; +} + + +/* + * IC BAR layout + * + * Page 0: Internal CQ register accesses (reads & writes) + * Page 1: Internal PC register accesses (reads & writes) + * Page 2: Internal VC register accesses (reads & writes) + * Page 3: Internal TCTXT (TIMA) reg accesses (read & writes) + * Page 4: Notify Port page (writes only, w/data), + * Page 5: Reserved + * Page 6: Sync Poll page (writes only, dataless) + * Page 7: Sync Inject page (writes only, dataless) + * Page 8: LSI Trigger page (writes only, dataless) + * Page 9: LSI SB Management page (reads & writes dataless) + * Pages 10-255: Reserved + * Pages 256-383: Direct mapped Thread Context Area (reads & writes) + * covering the 128 threads in P10. 
+ * Pages 384-511: Reserved + */ + +#define XIVE_IC_CQ_PGOFF 0 +#define XIVE_IC_PC_PGOFF 1 +#define XIVE_IC_VC_PGOFF 2 +#define XIVE_IC_TCTXT_PGOFF 3 +#define XIVE_NOTIFY_PGOFF 4 +#define XIVE_SYNC_POLL_PGOFF 6 +#define XIVE_SYNC_INJECT_PGOFF 7 +#define XIVE_LSI_TRIGGER_PGOFF 8 +#define XIVE_LSI_MGMT_PGOFF 9 +#define XIVE_IC_TM_DIRECT_PGOFF 256 + +static bool xive_configure_ic_bars(struct xive *x) +{ + uint64_t chip_id = x->chip_id; + uint64_t val; + + /* Reset all bars to zero */ + xive_regwx(x, CQ_RST_CTL, CQ_RST_PB_BAR_RESET); + + /* IC BAR */ + phys_map_get(chip_id, XIVE_IC, 0, (uint64_t *)&x->ic_base, &x->ic_size); + val = (uint64_t)x->ic_base | CQ_IC_BAR_VALID | CQ_IC_BAR_64K; + x->ic_shift = 16; + + xive_regwx(x, CQ_IC_BAR, val); + if (x->last_reg_error) + return false; + + /* + * TM BAR, same address for each chip. Hence we create a fake + * chip 0 and use that for all phys_map_get(XIVE_TM) calls. + */ + phys_map_get(0, XIVE_TM, 0, (uint64_t *)&x->tm_base, &x->tm_size); + val = (uint64_t)x->tm_base | CQ_TM_BAR_VALID | CQ_TM_BAR_64K; + x->tm_shift = 16; + + xive_regwx(x, CQ_TM_BAR, val); + if (x->last_reg_error) + return false; + + /* IC BAR sub-pages shortcuts */ + x->ic_tm_direct_base = x->ic_base + + (XIVE_IC_TM_DIRECT_PGOFF << x->ic_shift); + + return true; +} + +/* + * NVPG, NVC, ESB, END BARs have common attributes: 64k page and only + * one set covering the whole BAR. + */ +static bool xive_configure_bars(struct xive *x) +{ + uint64_t chip_id = x->chip_id; + uint64_t val; + uint64_t esb_size; + uint64_t end_size; + uint64_t nvp_size; + + x->nvp_size = XIVE_VP_COUNT(x) << XIVE_NVP_SHIFT; + x->esb_size = XIVE_INT_COUNT << XIVE_ESB_SHIFT; + x->end_size = XIVE_END_COUNT << XIVE_END_SHIFT; + + /* + * NVC BAR is not configured because we do not use the XIVE2 + * Crowd capability. 
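+ *
+ * With the default 8-priority configuration, XIVE_VP_ORDER is 20,
+ * so the NVPG BAR below covers 1M VPs of two 64K pages each,
+ * ie, 2^20 << XIVE_NVP_SHIFT = 128G of MMIO space.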
+ */ + + /* NVPG BAR: two pages, even NVP, odd NVG */ + phys_map_get(chip_id, XIVE_NVPG, 0, (uint64_t *)&x->nvp_base, &nvp_size); + if (x->nvp_size > nvp_size) { + xive_err(x, "NVP table is larger than default: " + "0x%012llx > 0x%012llx\n", x->nvp_size, nvp_size); + return false; + } + + val = (uint64_t)x->nvp_base | CQ_BAR_VALID | CQ_BAR_64K | + SETFIELD(CQ_BAR_RANGE, 0ull, ilog2(x->nvp_size) - 24); + xive_regwx(x, CQ_NVPG_BAR, val); + if (x->last_reg_error) + return false; + + /* ESB BAR */ + phys_map_get(chip_id, XIVE_ESB, 0, (uint64_t *)&x->esb_base, &esb_size); + if (x->esb_size > esb_size) { + xive_err(x, "ESB table is larger than default: " + "0x%012llx > 0x%012llx\n", x->esb_size, esb_size); + return false; + } + + val = (uint64_t)x->esb_base | CQ_BAR_VALID | CQ_BAR_64K | + SETFIELD(CQ_BAR_RANGE, 0ull, ilog2(x->esb_size) - 24); + xive_regwx(x, CQ_ESB_BAR, val); + if (x->last_reg_error) + return false; + + /* END BAR */ + phys_map_get(chip_id, XIVE_END, 0, (uint64_t *)&x->end_base, &end_size); + if (x->end_size > end_size) { + xive_err(x, "END table is larger than default: " + "0x%012llx > 0x%012llx\n", x->end_size, end_size); + return false; + } + + val = (uint64_t)x->end_base | CQ_BAR_VALID | CQ_BAR_64K | + SETFIELD(CQ_BAR_RANGE, 0ull, ilog2(x->end_size) - 24); + xive_regwx(x, CQ_END_BAR, val); + if (x->last_reg_error) + return false; + + xive_dbg(x, "IC: %14p [0x%012llx]\n", x->ic_base, x->ic_size); + xive_dbg(x, "TM: %14p [0x%012llx]\n", x->tm_base, x->tm_size); + xive_dbg(x, "NVP: %14p [0x%012llx]\n", x->nvp_base, x->nvp_size); + xive_dbg(x, "ESB: %14p [0x%012llx]\n", x->esb_base, x->esb_size); + xive_dbg(x, "END: %14p [0x%012llx]\n", x->end_base, x->end_size); + + return true; +} + +static void xive_dump_mmio(struct xive *x) +{ + prlog(PR_DEBUG, " CQ_CFG_PB_GEN = %016llx\n", + in_be64(x->ic_base + CQ_CFG_PB_GEN)); + prlog(PR_DEBUG, " CQ_MSGSND = %016llx\n", + in_be64(x->ic_base + CQ_MSGSND)); +} + +static const struct { + uint64_t bitmask; + const char *name; +} xive_capabilities[] = { +}; + +static void xive_dump_capabilities(struct xive *x, uint64_t cap_val) +{ + int i; + + xive_dbg(x, "capabilities: %016llx\n", cap_val); + xive_dbg(x, "\tVersion: %lld\n", + GETFIELD(CQ_XIVE_CAP_VERSION, cap_val)); + xive_dbg(x, "\tUser interrupt priorities: [ 1 - %d ]\n", + 1 << GETFIELD(CQ_XIVE_CAP_USER_INT_PRIO, cap_val)); + xive_dbg(x, "\tVP interrupt priorities: [ %d - 8 ]\n", + 1 << GETFIELD(CQ_XIVE_CAP_VP_INT_PRIO, cap_val)); + xive_dbg(x, "\tExtended Blockid bits: %lld\n", + 4 + GETFIELD(CQ_XIVE_CAP_BLOCK_ID_WIDTH, cap_val)); + + for (i = 0; i < ARRAY_SIZE(xive_capabilities); i++) { + if (xive_capabilities[i].bitmask & cap_val) + xive_dbg(x, "\t%s\n", xive_capabilities[i].name); + } +} + +static const struct { + uint64_t bitmask; + const char *name; +} xive_configs[] = { + { CQ_XIVE_CFG_GEN1_TIMA_OS, "Gen1 mode TIMA OS" }, + { CQ_XIVE_CFG_GEN1_TIMA_HYP, "Gen1 mode TIMA Hyp" }, + { CQ_XIVE_CFG_GEN1_TIMA_HYP_BLK0, "Gen1 mode TIMA General Hypervisor Block0" }, + { CQ_XIVE_CFG_GEN1_TIMA_CROWD_DIS, "Gen1 mode TIMA Crowd disable" }, + { CQ_XIVE_CFG_GEN1_END_ESX, "Gen1 mode END ESx" }, +}; + +static void xive_dump_configuration(struct xive *x, const char *prefix, + uint64_t cfg_val) +{ + int i ; + + xive_dbg(x, "%s configuration: %016llx\n", prefix, cfg_val); + xive_dbg(x, "\tHardwired Thread Id range: %lld bits\n", + 7 + GETFIELD(CQ_XIVE_CFG_HYP_HARD_RANGE, cfg_val)); + xive_dbg(x, "\tUser Interrupt priorities: [ 1 - %d ]\n", + 1 << GETFIELD(CQ_XIVE_CFG_USER_INT_PRIO, cfg_val)); + 
xive_dbg(x, "\tVP Interrupt priorities: [ 0 - %d ]\n", xive_max_prio(x)); + xive_dbg(x, "\tBlockId bits: %lld bits\n", + 4 + GETFIELD(CQ_XIVE_CFG_BLOCK_ID_WIDTH, cfg_val)); + if (CQ_XIVE_CFG_HYP_HARD_BLKID_OVERRIDE & cfg_val) + xive_dbg(x, "\tHardwired BlockId: %lld\n", + GETFIELD(CQ_XIVE_CFG_HYP_HARD_BLOCK_ID, cfg_val)); + + for (i = 0; i < ARRAY_SIZE(xive_configs); i++) { + if (xive_configs[i].bitmask & cfg_val) + xive_dbg(x, "\t%s\n", xive_configs[i].name); + } +} + +/* + * Default XIVE configuration + */ +#define XIVE_CONFIGURATION \ + (SETFIELD(CQ_XIVE_CFG_HYP_HARD_RANGE, 0ull, CQ_XIVE_CFG_THREADID_8BITS) | \ + SETFIELD(CQ_XIVE_CFG_VP_INT_PRIO, 0ull, CQ_XIVE_CFG_INT_PRIO_8)) + +/* + * Gen1 configuration for tests (QEMU) + */ +#define XIVE_CONFIGURATION_GEN1 \ + (SETFIELD(CQ_XIVE_CFG_HYP_HARD_RANGE, 0ull, CQ_XIVE_CFG_THREADID_7BITS) | \ + SETFIELD(CQ_XIVE_CFG_VP_INT_PRIO, 0ull, CQ_XIVE_CFG_INT_PRIO_8) | \ + CQ_XIVE_CFG_GEN1_TIMA_OS | \ + CQ_XIVE_CFG_GEN1_TIMA_HYP | \ + CQ_XIVE_CFG_GEN1_TIMA_HYP_BLK0 | \ + CQ_XIVE_CFG_GEN1_TIMA_CROWD_DIS | \ + CQ_XIVE_CFG_GEN1_END_ESX) + +static void xive_config_reduced_priorities_fixup(struct xive *x) +{ + if (xive_cfg_vp_prio_shift(x) < CQ_XIVE_CFG_INT_PRIO_8 && + x->quirks & XIVE_QUIRK_BROKEN_PRIO_CHECK) { + uint64_t val = xive_regr(x, PC_ERR1_CFG1); + + val &= ~PC_ERR1_CFG1_INTERRUPT_INVALID_PRIO; + xive_dbg(x, "workaround for reduced priorities. " + "PC_ERR1_CFG1=%016llx\n", val); + xive_regw(x, PC_ERR1_CFG1, val); + } +} + +static bool xive_config_init(struct xive *x) +{ + uint64_t cap_val; + + cap_val = xive_regr(x, CQ_XIVE_CAP); + xive_dump_capabilities(x, cap_val); + + x->generation = GETFIELD(CQ_XIVE_CAP_VERSION, cap_val); + + /* + * Allow QEMU to override version for tests + */ + if (x->generation != XIVE_GEN2 && !chip_quirk(QUIRK_QEMU)) { + xive_err(x, "Invalid XIVE controller version %d\n", + x->generation); + return false; + } + + x->config = xive_regr(x, CQ_XIVE_CFG); + xive_dump_configuration(x, "default", x->config); + + /* Start with default settings */ + x->config = x->generation == XIVE_GEN1 ? XIVE_CONFIGURATION_GEN1 : + XIVE_CONFIGURATION; + + if (x->quirks & XIVE_QUIRK_THREADID_7BITS) + x->config = SETFIELD(CQ_XIVE_CFG_HYP_HARD_RANGE, x->config, + CQ_XIVE_CFG_THREADID_7BITS); + + /* + * Hardwire the block ID. The default value is the topology ID + * of the chip which is different from the block. + */ + x->config |= CQ_XIVE_CFG_HYP_HARD_BLKID_OVERRIDE | + SETFIELD(CQ_XIVE_CFG_HYP_HARD_BLOCK_ID, 0ull, x->block_id); + + xive_dump_configuration(x, "new", x->config); + xive_regw(x, CQ_XIVE_CFG, x->config); + if (xive_regr(x, CQ_XIVE_CFG) != x->config) { + xive_err(x, "configuration setting failed\n"); + } + + /* + * Disable error reporting in the FIR for info errors from the VC. + */ + xive_regw(x, CQ_FIRMASK_OR, CQ_FIR_VC_INFO_ERROR_0_2); + + /* + * Mask CI Load and Store to bad location, as IPI trigger + * pages may be mapped to user space, and a read on the + * trigger page causes a checkstop + */ + xive_regw(x, CQ_FIRMASK_OR, CQ_FIR_PB_RCMDX_CI_ERR1); + + /* + * VP space settings. P9 mode is 19bits. + */ + x->vp_shift = x->generation == XIVE_GEN1 ? + VP_SHIFT_GEN1 : VP_SHIFT_GEN2; + + /* + * VP ids for HW threads. 
These values are hardcoded in the + * CAM line of the HW context + * + * POWER10 |chip|0000000000000001|threadid| + * 28bits 4 16 8 + * + * POWER9 |chip|000000000001|thrdid | + * 23bits 4 12 7 + */ + + /* TODO (cosmetic): set VP ids for HW threads only once */ + xive_threadid_shift = 7 + GETFIELD(CQ_XIVE_CFG_HYP_HARD_RANGE, + x->config); + + xive_hw_vp_base = 1 << xive_threadid_shift; + xive_hw_vp_count = 1 << xive_threadid_shift; + + xive_dbg(x, "store EOI is %savailable\n", + XIVE_CAN_STORE_EOI(x) ? "" : "not "); + + xive_config_reduced_priorities_fixup(x); + + return true; +} + +/* Set Translation tables : 1 block per chip */ +static bool xive_setup_set_xlate(struct xive *x) +{ + unsigned int i; + + /* Configure ESBs */ + xive_regw(x, CQ_TAR, + CQ_TAR_AUTOINC | SETFIELD(CQ_TAR_SELECT, 0ull, CQ_TAR_ESB)); + if (x->last_reg_error) + return false; + for (i = 0; i < XIVE_MAX_BLOCKS; i++) { + xive_regw(x, CQ_TDR, CQ_TDR_VALID | + SETFIELD(CQ_TDR_BLOCK_ID, 0ull, x->block_id)); + if (x->last_reg_error) + return false; + } + + /* Configure ENDs */ + xive_regw(x, CQ_TAR, + CQ_TAR_AUTOINC | SETFIELD(CQ_TAR_SELECT, 0ull, CQ_TAR_END)); + if (x->last_reg_error) + return false; + for (i = 0; i < XIVE_MAX_BLOCKS; i++) { + xive_regw(x, CQ_TDR, CQ_TDR_VALID | + SETFIELD(CQ_TDR_BLOCK_ID, 0ull, x->block_id)); + if (x->last_reg_error) + return false; + } + + /* Configure NVPs */ + xive_regw(x, CQ_TAR, + CQ_TAR_AUTOINC | SETFIELD(CQ_TAR_SELECT, 0ull, CQ_TAR_NVPG)); + if (x->last_reg_error) + return false; + for (i = 0; i < XIVE_MAX_BLOCKS; i++) { + xive_regw(x, CQ_TDR, CQ_TDR_VALID | + SETFIELD(CQ_TDR_BLOCK_ID, 0ull, x->block_id)); + if (x->last_reg_error) + return false; + } + return true; +} + +static bool xive_prealloc_tables(struct xive *x) +{ + uint32_t i; + uint32_t pbase, pend; + + /* ESB has 4 entries per byte */ + x->sbe_base = local_alloc(x->chip_id, XIVE_ESB_SIZE, XIVE_ESB_SIZE); + if (!x->sbe_base) { + xive_err(x, "Failed to allocate SBE\n"); + return false; + } + + /* PQs are initialized to 0b01 which corresponds to "ints off" */ + memset(x->sbe_base, 0x55, XIVE_ESB_SIZE); + xive_dbg(x, "SBE at %p size 0x%lx\n", x->sbe_base, XIVE_ESB_SIZE); + + /* EAS entries are 8 bytes */ + x->eat_base = local_alloc(x->chip_id, XIVE_EAT_SIZE, XIVE_EAT_SIZE); + if (!x->eat_base) { + xive_err(x, "Failed to allocate EAS\n"); + return false; + } + + /* + * We clear the entries (non-valid). They will be initialized + * when actually used + */ + memset(x->eat_base, 0, XIVE_EAT_SIZE); + xive_dbg(x, "EAT at %p size 0x%lx\n", x->eat_base, XIVE_EAT_SIZE); + + /* Indirect END table. Limited to one top page. */ + x->end_ind_size = ALIGN_UP(XIVE_END_TABLE_SIZE, PAGE_SIZE); + if (x->end_ind_size > PAGE_SIZE) { + xive_err(x, "END indirect table is too big !\n"); + return false; + } + x->end_ind_base = local_alloc(x->chip_id, x->end_ind_size, + x->end_ind_size); + if (!x->end_ind_base) { + xive_err(x, "Failed to allocate END indirect table\n"); + return false; + } + memset(x->end_ind_base, 0, x->end_ind_size); + xive_dbg(x, "ENDi at %p size 0x%llx #%ld entries\n", x->end_ind_base, + x->end_ind_size, XIVE_END_COUNT); + x->end_ind_count = XIVE_END_TABLE_SIZE / XIVE_VSD_SIZE; + + /* Indirect VP table. Limited to one top page. 
*/ + x->vp_ind_size = ALIGN_UP(XIVE_VP_TABLE_SIZE(x), PAGE_SIZE); + if (x->vp_ind_size > PAGE_SIZE) { + xive_err(x, "VP indirect table is too big !\n"); + return false; + } + x->vp_ind_base = local_alloc(x->chip_id, x->vp_ind_size, + x->vp_ind_size); + if (!x->vp_ind_base) { + xive_err(x, "Failed to allocate VP indirect table\n"); + return false; + } + xive_dbg(x, "VPi at %p size 0x%llx #%ld entries\n", x->vp_ind_base, + x->vp_ind_size, XIVE_VP_COUNT(x)); + x->vp_ind_count = XIVE_VP_TABLE_SIZE(x) / XIVE_VSD_SIZE; + memset(x->vp_ind_base, 0, x->vp_ind_size); + + /* Allocate pages for the VP ids representing HW threads */ + pbase = xive_hw_vp_base / VP_PER_PAGE; + pend = (xive_hw_vp_base + xive_hw_vp_count) / VP_PER_PAGE; + + xive_dbg(x, "Allocating pages %d to %d of VPs (for %d VPs)\n", + pbase, pend, xive_hw_vp_count); + for (i = pbase; i <= pend; i++) { + void *page; + u64 vsd; + + /* Indirect entries have a VSD format */ + page = local_alloc(x->chip_id, PAGE_SIZE, PAGE_SIZE); + if (!page) { + xive_err(x, "Failed to allocate VP page\n"); + return false; + } + xive_dbg(x, "VP%d at %p size 0x%x\n", i, page, PAGE_SIZE); + memset(page, 0, PAGE_SIZE); + vsd = ((uint64_t)page) & VSD_ADDRESS_MASK; + + vsd |= SETFIELD(VSD_TSIZE, 0ull, 4); + vsd |= SETFIELD(VSD_MODE, 0ull, VSD_MODE_EXCLUSIVE); + vsd |= VSD_FIRMWARE; + x->vp_ind_base[i] = cpu_to_be64(vsd); + } + + /* + * Allocate page for cache and sync injection (512 * 128 hw + * threads) + one extra page for future use + */ + x->sync_inject_size = PAGE_SIZE + PAGE_SIZE; + x->sync_inject = local_alloc(x->chip_id, x->sync_inject_size, + x->sync_inject_size); + if (!x->sync_inject) { + xive_err(x, "Failed to allocate sync pages\n"); + return false; + } + + /* Allocate the queue overflow pages */ + x->q_ovf = local_alloc(x->chip_id, VC_QUEUE_COUNT * PAGE_SIZE, PAGE_SIZE); + if (!x->q_ovf) { + xive_err(x, "Failed to allocate queue overflow\n"); + return false; + } + return true; +} + +static void xive_add_provisioning_properties(void) +{ + beint32_t chips[XIVE_MAX_CHIPS]; + uint32_t i, count; + + dt_add_property_cells(xive_dt_node, + "ibm,xive-provision-page-size", PAGE_SIZE); + + count = 1 << xive_chips_alloc_bits; + for (i = 0; i < count; i++) + chips[i] = cpu_to_be32(xive_block_to_chip[i]); + dt_add_property(xive_dt_node, "ibm,xive-provision-chips", + chips, 4 * count); +} + +static void xive_create_mmio_dt_node(struct xive *x) +{ + uint64_t tb = (uint64_t)x->tm_base; + uint32_t stride = 1u << x->tm_shift; + + xive_dt_node = dt_new_addr(dt_root, "interrupt-controller", tb); + assert(xive_dt_node); + + dt_add_property_u64s(xive_dt_node, "reg", + tb + 0 * stride, stride, + tb + 1 * stride, stride, + tb + 2 * stride, stride, + tb + 3 * stride, stride); + + dt_add_property_strings(xive_dt_node, "compatible", + "ibm,opal-xive-pe", "ibm,opal-intc"); + + dt_add_property_cells(xive_dt_node, "ibm,xive-eq-sizes", + 12, 16, 21, 24); + + dt_add_property_cells(xive_dt_node, "ibm,xive-#priorities", + xive_cfg_vp_prio(x)); + + dt_add_property(xive_dt_node, "single-escalation-support", NULL, 0); + + if (XIVE_CAN_STORE_EOI(x)) + dt_add_property(xive_dt_node, "store-eoi", NULL, 0); + + xive_add_provisioning_properties(); + +} + +static void xive_setup_forward_ports(struct xive *x, struct proc_chip *remote_chip) +{ + struct xive *remote_xive = remote_chip->xive; + uint64_t base = SETFIELD(VSD_MODE, 0ull, VSD_MODE_FORWARD); + + if (!xive_set_vsd(x, VST_ESB, remote_xive->block_id, + base | ((uint64_t)remote_xive->esb_base) | + SETFIELD(VSD_TSIZE, 0ull, 
ilog2(x->esb_size) - 12))) + goto error; + + /* EAS: No remote */ + + if (!xive_set_vsd(x, VST_END, remote_xive->block_id, + base | ((uint64_t)remote_xive->end_base) | + SETFIELD(VSD_TSIZE, 0ull, ilog2(x->end_size) - 12))) + goto error; + + if (!xive_set_vsd(x, VST_NVP, remote_xive->block_id, + base | ((uint64_t)remote_xive->nvp_base) | + SETFIELD(VSD_TSIZE, 0ull, ilog2(x->nvp_size) - 12))) + goto error; + + /* NVG: not used */ + /* NVC: not used */ + + if (!xive_set_vsd(x, VST_IC, remote_xive->chip_id, + base | ((uint64_t)remote_xive->ic_base) | + SETFIELD(VSD_TSIZE, 0ull, ilog2(x->ic_size) - 12))) + goto error; + + if (!xive_set_vsd(x, VST_SYNC, remote_xive->chip_id, + base | ((uint64_t)remote_xive->sync_inject) | + SETFIELD(VSD_TSIZE, 0ull, ilog2(x->sync_inject_size) - 12))) + goto error; + + /* ERQ: No remote */ + + return; + + error: + xive_err(x, "Failure configuring forwarding ports\n"); +} + +static void late_init_one_xive(struct xive *x) +{ + struct proc_chip *chip; + + /* We need to setup the cross-chip forward ports. Let's + * iterate all chip and set them up accordingly + */ + for_each_chip(chip) { + /* We skip ourselves or chips without a xive */ + if (chip->xive == x || !chip->xive) + continue; + + /* Setup our forward ports to that chip */ + xive_setup_forward_ports(x, chip); + } +} + +static bool xive_check_ipi_free(struct xive *x, uint32_t irq, uint32_t count) +{ + uint32_t i, idx = GIRQ_TO_IDX(irq); + + for (i = 0; i < count; i++) + if (bitmap_tst_bit(*x->ipi_alloc_map, idx + i)) + return false; + return true; +} + +uint32_t xive2_alloc_hw_irqs(uint32_t chip_id, uint32_t count, + uint32_t align) +{ + struct proc_chip *chip = get_chip(chip_id); + struct xive *x; + uint32_t base, i; + + assert(chip); + assert(is_pow2(align)); + + x = chip->xive; + assert(x); + + lock(&x->lock); + + /* Allocate the HW interrupts */ + base = x->int_hw_bot - count; + base &= ~(align - 1); + if (base < x->int_ipi_top) { + xive_err(x, + "HW alloc request for %d interrupts aligned to %d failed\n", + count, align); + unlock(&x->lock); + return XIVE_IRQ_ERROR; + } + if (!xive_check_ipi_free(x, base, count)) { + xive_err(x, "HWIRQ boot allocator request overlaps dynamic allocator\n"); + unlock(&x->lock); + return XIVE_IRQ_ERROR; + } + + x->int_hw_bot = base; + + /* Initialize the corresponding EAS entries to sane defaults, + * IE entry is valid, not routed and masked, EQ data is set + * to the GIRQ number. 
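+ *
+ * In other words, each EAS in the range starts out roughly as
+ * (a sketch of the xive_set_field64() calls below):
+ *
+ *   w = EAS_VALID | EAS_MASKED
+ *     | SETFIELD(EAS_END_DATA, 0, base + i)
+ *
+ * valid so the ESB pages work, masked so nothing is delivered
+ * until the OS routes it, and the END data defaulting to the
+ * global IRQ number until xive_set_irq_targetting() overrides
+ * it.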
+ */ + for (i = 0; i < count; i++) { + struct xive_eas *eas = xive_get_eas(x, base + i); + + eas->w = xive_set_field64(EAS_VALID, 0, 1) | + xive_set_field64(EAS_MASKED, 0, 1) | + xive_set_field64(EAS_END_DATA, 0, base + i); + } + + unlock(&x->lock); + return base; +} + +uint32_t xive2_alloc_ipi_irqs(uint32_t chip_id, uint32_t count, + uint32_t align) +{ + struct proc_chip *chip = get_chip(chip_id); + struct xive *x; + uint32_t base, i; + + assert(chip); + assert(is_pow2(align)); + + x = chip->xive; + assert(x); + + lock(&x->lock); + + /* Allocate the IPI interrupts */ + base = x->int_ipi_top + (align - 1); + base &= ~(align - 1); + if (base >= x->int_hw_bot) { + xive_err(x, + "IPI alloc request for %d interrupts aligned to %d failed\n", + count, align); + unlock(&x->lock); + return XIVE_IRQ_ERROR; + } + if (!xive_check_ipi_free(x, base, count)) { + xive_err(x, "IPI boot allocator request overlaps dynamic allocator\n"); + unlock(&x->lock); + return XIVE_IRQ_ERROR; + } + + x->int_ipi_top = base + count; + + /* Initialize the corresponding EAS entries to sane defaults, + * IE entry is valid, not routed and masked, END data is set + * to the GIRQ number. + */ + for (i = 0; i < count; i++) { + struct xive_eas *eas = xive_get_eas(x, base + i); + + eas->w = xive_set_field64(EAS_VALID, 0, 1) | + xive_set_field64(EAS_MASKED, 0, 1) | + xive_set_field64(EAS_END_DATA, 0, base + i); + } + + unlock(&x->lock); + return base; +} + +void *xive2_get_trigger_port(uint32_t girq) +{ + uint32_t idx = GIRQ_TO_IDX(girq); + struct xive *x; + + /* Find XIVE on which the EAS resides */ + x = xive_from_isn(girq); + if (!x) + return NULL; + + if (GIRQ_IS_ESCALATION(girq)) { + /* There is no trigger page for escalation interrupts */ + return NULL; + } else { + /* Make sure it's an IPI on that chip */ + if (girq < x->int_base || + girq >= x->int_ipi_top) + return NULL; + + return x->esb_base + idx * XIVE_ESB_PAGE_SIZE; + } +} + +/* + * Notify Port page (writes only, w/data), separated into two + * categories, both sent to VC: + * - IPI queue (Addr bit 52 = 0) (for NPU) + * - HW queue (Addr bit 52 = 1) + */ +uint64_t xive2_get_notify_port(uint32_t chip_id, uint32_t ent) +{ + struct proc_chip *chip = get_chip(chip_id); + struct xive *x; + uint32_t offset = 0; + + assert(chip); + x = chip->xive; + assert(x); + + /* This is where we can assign a different HW queue to a different + * source by offsetting into the cache lines of the notify port + * + * For now we keep it very basic, this will have to be looked at + * again on real HW with some proper performance analysis. + * + * Here's what Florian says on the matter: + * + * << + * The first 2k of the notify port page can all be used for PCIe triggers + * + * However the idea would be that we try to use the first 4 cache lines to + * balance the PCIe Interrupt requests to use the least used snoop buses + * (we went from 2 to 4 snoop buses for P9). snoop 0 is heavily used + * (I think TLBIs are using that in addition to the normal addresses), + * snoop 3 is used for all Int commands, so I think snoop 2 (CL 2 in the + * page) is the least used overall. So we probably should that one for + * the Int commands from PCIe. + * + * In addition, our EAS cache supports hashing to provide "private" cache + * areas for the PHBs in the shared 1k EAS cache. This allows e.g. to avoid + * that one "thrashing" PHB thrashes the EAS cache for everyone, or provide + * a PHB with a private area that would allow high cache hits in case of a + * device using very few interrupts. 
The hashing is based on the offset within + * the cache line. So using that, you can e.g. set the EAS cache up so that + * IPIs use 512 entries, the x16 PHB uses 256 entries and the x8 PHBs 128 + * entries each - or IPIs using all entries and sharing with PHBs, so PHBs + * would use 512 entries and 256 entries respectively. + * + * This is a tuning we would probably do later in the lab, but as a "prep" + * we should set up the different PHBs such that they are using different + * 8B-aligned offsets within the cache line, so e.g. + * PH4_0 addr 0x100 (CL 2 DW0 + * PH4_1 addr 0x108 (CL 2 DW1) + * PH4_2 addr 0x110 (CL 2 DW2) + * etc. + * >> + * + * I'm using snoop1 for PHB0 and snoop2 for everybody else. + */ + + /* Florian adds : + * + * we just set them up for a start to have different offsets + * within the cache line so that we could use the allocation + * restrictions that can be enforced in the interrupt + * controller + * + * P10 might now be randomizing the cache line bits in HW to + * balance snoop bus usage + * + * TODO (phb5) : implement "address based triggers" (DD2.0?) + * + * The PHBs would no longer target the notify port page but + * the "base ESB MMIO address" of the ESB/EAS range they are + * allocated. Needs a XIVE API change for the PHBs. + */ + switch(ent) { + case XIVE_HW_SRC_PHBn(0): + offset = 0x800; + break; + case XIVE_HW_SRC_PHBn(1): + offset = 0x908; + break; + case XIVE_HW_SRC_PHBn(2): + offset = 0x910; + break; + case XIVE_HW_SRC_PHBn(3): + offset = 0x918; + break; + case XIVE_HW_SRC_PHBn(4): + offset = 0x920; + break; + case XIVE_HW_SRC_PHBn(5): + offset = 0x928; + break; + case XIVE_HW_SRC_PSI: + offset = 0x930; + break; + default: + assert(false); + return 0; + } + + return ((uint64_t)x->ic_base) + + (XIVE_NOTIFY_PGOFF << x->ic_shift) + offset; +} + +/* Manufacture the powerbus packet bits 32:63 */ +__attrconst uint32_t xive2_get_notify_base(uint32_t girq) +{ + return (GIRQ_TO_BLK(girq) << 28) | GIRQ_TO_IDX(girq); +} + +static bool xive_get_irq_targetting(uint32_t isn, uint32_t *out_target, + uint8_t *out_prio, uint32_t *out_lirq) +{ + struct xive_eas *eas; + struct xive *x, *end_x; + struct xive_end *end; + uint32_t end_blk, end_idx; + uint32_t vp_blk, vp_idx; + uint32_t prio, server; + bool is_escalation = GIRQ_IS_ESCALATION(isn); + + /* Find XIVE on which the EAS resides */ + x = xive_from_isn(isn); + if (!x) + return false; + /* Grab the EAS */ + eas = xive_get_eas(x, isn); + if (!eas) + return false; + if (!xive_get_field64(EAS_VALID, eas->w) && !is_escalation) { + xive_err(x, "ISN %x lead to invalid EAS !\n", isn); + return false; + } + + if (out_lirq) + *out_lirq = xive_get_field64(EAS_END_DATA, eas->w); + + /* Find the END and its xive instance */ + end_blk = xive_get_field64(EAS_END_BLOCK, eas->w); + end_idx = xive_get_field64(EAS_END_INDEX, eas->w); + end_x = xive_from_vc_blk(end_blk); + + /* This can fail if the interrupt hasn't been initialized yet + * but it should also be masked, so fail silently + */ + if (!end_x) + goto pick_default; + end = xive_get_end(end_x, end_idx); + if (!end) + goto pick_default; + + /* XXX Check valid and format 0 */ + + /* No priority conversion, return the actual one ! 
*/
+ if (xive_get_field64(EAS_MASKED, eas->w))
+ prio = 0xff;
+ else
+ prio = xive_get_field32(END_W7_F0_PRIORITY, end->w7);
+ if (out_prio)
+ *out_prio = prio;
+
+ vp_blk = xive_get_field32(END_W6_VP_BLOCK, end->w6);
+ vp_idx = xive_get_field32(END_W6_VP_OFFSET, end->w6);
+ server = VP2PIR(vp_blk, vp_idx);
+
+ if (out_target)
+ *out_target = server;
+
+ xive_vdbg(end_x, "END info for ISN %x: prio=%d, server=0x%x (VP %x/%x)\n",
+ isn, prio, server, vp_blk, vp_idx);
+ return true;
+
+pick_default:
+ xive_vdbg(end_x, "END info for ISN %x: Using masked defaults\n", isn);
+
+ if (out_prio)
+ *out_prio = 0xff;
+ /* Pick a default: the current CPU will be fine ... */
+ if (out_target)
+ *out_target = mfspr(SPR_PIR);
+ return true;
+}
+
+static inline bool xive_end_for_target(uint32_t target, uint8_t prio,
+ uint32_t *out_end_blk,
+ uint32_t *out_end_idx)
+{
+ struct xive *x;
+ struct xive_nvp *vp;
+ uint32_t vp_blk, vp_idx;
+ uint32_t end_blk, end_idx;
+
+ if (prio > xive_max_prio(one_xive))
+ return false;
+
+ /* Get the VP block/index from the target word */
+ if (!xive_decode_vp(target, &vp_blk, &vp_idx, NULL, NULL))
+ return false;
+
+ /* Grab the target VP's XIVE */
+ x = xive_from_pc_blk(vp_blk);
+ if (!x)
+ return false;
+
+ /* Find the VP structure where we stashed the END number */
+ vp = xive_get_vp(x, vp_idx);
+ if (!vp)
+ return false;
+
+ end_blk = xive_get_field32(NVP_W5_VP_END_BLOCK, vp->w5);
+ end_idx = xive_get_field32(NVP_W5_VP_END_INDEX, vp->w5);
+
+ /* Currently the END block and VP block should be the same */
+ if (end_blk != vp_blk) {
+ xive_err(x, "end_blk != vp_blk (%d vs. %d) for target 0x%08x/%d\n",
+ end_blk, vp_blk, target, prio);
+ assert(false);
+ }
+
+ if (out_end_blk)
+ *out_end_blk = end_blk;
+ if (out_end_idx)
+ *out_end_idx = end_idx + prio;
+
+ return true;
+}
+
+static int64_t xive_set_irq_targetting(uint32_t isn, uint32_t target,
+ uint8_t prio, uint32_t lirq,
+ bool synchronous)
+{
+ struct xive *x;
+ struct xive_eas *eas, new_eas;
+ uint32_t end_blk, end_idx;
+ bool is_escalation = GIRQ_IS_ESCALATION(isn);
+ int64_t rc;
+
+ /* Find XIVE on which the EAS resides */
+ x = xive_from_isn(isn);
+ if (!x)
+ return OPAL_PARAMETER;
+ /* Grab the EAS */
+ eas = xive_get_eas(x, isn);
+ if (!eas)
+ return OPAL_PARAMETER;
+ if (!xive_get_field64(EAS_VALID, eas->w) && !is_escalation) {
+ xive_err(x, "ISN %x leads to invalid EAS !\n", isn);
+ return OPAL_PARAMETER;
+ }
+
+ lock(&x->lock);
+
+ /* Read existing EAS */
+ new_eas = *eas;
+
+ /* Are we masking ? */
+ if (prio == 0xff && !is_escalation) {
+ new_eas.w = xive_set_field64(EAS_MASKED, new_eas.w, 1);
+ xive_vdbg(x, "ISN %x masked !\n", isn);
+
+ /* Put prio 7 in the END */
+ prio = xive_max_prio(x);
+ } else {
+ /* Unmasking */
+ new_eas.w = xive_set_field64(EAS_MASKED, new_eas.w, 0);
+ xive_vdbg(x, "ISN %x unmasked !\n", isn);
+
+ /* For normal interrupt sources, keep track of which ones
+ * we ever enabled since the last reset
+ */
+ if (!is_escalation)
+ bitmap_set_bit(*x->int_enabled_map, GIRQ_TO_IDX(isn));
+ }
+
+ /* If prio isn't 0xff, re-target the EAS. First find the END
+ * corresponding to the target
+ */
+ if (prio != 0xff) {
+ if (!xive_end_for_target(target, prio, &end_blk, &end_idx)) {
+ xive_err(x, "Can't find END for target/prio 0x%x/%d\n",
+ target, prio);
+ unlock(&x->lock);
+ return OPAL_PARAMETER;
+ }
+
+ /* Try to update it atomically to avoid an intermediate
+ * stale state
+ */
+ new_eas.w = xive_set_field64(EAS_END_BLOCK, new_eas.w, end_blk);
+ new_eas.w = xive_set_field64(EAS_END_INDEX, new_eas.w, end_idx);
+ }
+ new_eas.w = xive_set_field64(EAS_END_DATA, new_eas.w, lirq);
+
+ xive_vdbg(x, "ISN %x routed to end %x/%x lirq=%08x EAS=%016llx !\n",
+ isn, end_blk, end_idx, lirq, new_eas.w);
+
+ /* Updating the cache differs between real EAS and escalation
+ * EAS inside an END
+ */
+ if (is_escalation) {
+ rc = xive_escalation_ive_cache_update(x, x->block_id,
+ GIRQ_TO_IDX(isn), &new_eas, synchronous);
+ } else {
+ sync();
+ *eas = new_eas;
+ rc = xive_easc_scrub(x, x->block_id, GIRQ_TO_IDX(isn));
+ }
+
+ unlock(&x->lock);
+ return rc;
+}
+
+static void xive_update_irq_mask(struct xive_src *s, uint32_t idx, bool masked)
+{
+ void *mmio_base = s->esb_mmio + (1ul << s->esb_shift) * idx;
+ uint32_t offset;
+
+ /* XXX FIXME: A quick mask/unmask can make us shoot an interrupt
+ * more than once to a queue. We need to keep track better
+ */
+ if (s->flags & XIVE_SRC_EOI_PAGE1)
+ mmio_base += 1ull << (s->esb_shift - 1);
+ if (masked)
+ offset = XIVE_ESB_SET_PQ_01;
+ else
+ offset = XIVE_ESB_SET_PQ_00;
+
+ in_be64(mmio_base + offset);
+}
+
+#define XIVE_SYNC_IPI 0x000
+#define XIVE_SYNC_HW 0x080
+#define XIVE_SYNC_NxC 0x100
+#define XIVE_SYNC_INT 0x180
+#define XIVE_SYNC_OS_ESC 0x200
+#define XIVE_SYNC_POOL_ESC 0x280
+#define XIVE_SYNC_HARD_ESC 0x300
+
+static int64_t xive_sync(struct xive *x __unused)
+{
+ uint64_t r;
+ void *sync_base;
+
+ lock(&x->lock);
+
+ sync_base = x->ic_base + (XIVE_SYNC_POLL_PGOFF << x->ic_shift);
+
+ out_be64(sync_base + XIVE_SYNC_IPI, 0);
+ out_be64(sync_base + XIVE_SYNC_HW, 0);
+ out_be64(sync_base + XIVE_SYNC_NxC, 0);
+ out_be64(sync_base + XIVE_SYNC_INT, 0);
+ out_be64(sync_base + XIVE_SYNC_OS_ESC, 0);
+ out_be64(sync_base + XIVE_SYNC_POOL_ESC, 0);
+ out_be64(sync_base + XIVE_SYNC_HARD_ESC, 0);
+
+ /* XXX Add timeout */
+ for (;;) {
+ r = xive_regr(x, VC_ENDC_SYNC_DONE);
+ if ((r & VC_ENDC_SYNC_POLL_DONE) == VC_ENDC_SYNC_POLL_DONE)
+ break;
+ cpu_relax();
+ }
+ xive_regw(x, VC_ENDC_SYNC_DONE, r & ~VC_ENDC_SYNC_POLL_DONE);
+
+ /*
+ * Do a read after clearing the sync done bit to prevent any
+ * race between CI write and next sync command
+ */
+ xive_regr(x, VC_ENDC_SYNC_DONE);
+
+ unlock(&x->lock);
+ return 0;
+}
+
+static int64_t __xive_set_irq_config(struct irq_source *is, uint32_t girq,
+ uint64_t vp, uint8_t prio, uint32_t lirq,
+ bool update_esb, bool sync)
+{
+ struct xive_src *s = container_of(is, struct xive_src, is);
+ uint32_t old_target, vp_blk;
+ u8 old_prio;
+ int64_t rc;
+
+ /* Grab existing target */
+ if (!xive_get_irq_targetting(girq, &old_target, &old_prio, NULL))
+ return OPAL_PARAMETER;
+
+ /* Let XIVE configure the END. We do the update without the
+ * synchronous flag, thus a cache update failure will result
+ * in us returning OPAL_BUSY
+ */
+ rc = xive_set_irq_targetting(girq, vp, prio, lirq, false);
+ if (rc)
+ return rc;
+
+ /* Do we need to update the mask ? 
*/ + if (old_prio != prio && (old_prio == 0xff || prio == 0xff)) { + /* The source has special variants of masking/unmasking */ + if (update_esb) { + /* Ensure it's enabled/disabled in the source + * controller + */ + xive_update_irq_mask(s, girq - s->esb_base, + prio == 0xff); + } + } + + /* + * Synchronize the source and old target XIVEs to ensure that + * all pending interrupts to the old target have reached their + * respective queue. + * + * WARNING: This assumes the VP and it's queues are on the same + * XIVE instance ! + */ + if (!sync) + return OPAL_SUCCESS; + xive_sync(s->xive); + if (xive_decode_vp(old_target, &vp_blk, NULL, NULL, NULL)) { + struct xive *x = xive_from_pc_blk(vp_blk); + if (x) + xive_sync(x); + } + + return OPAL_SUCCESS; +} + +static int64_t xive_set_irq_config(uint32_t girq, uint64_t vp, uint8_t prio, + uint32_t lirq, bool update_esb) +{ + struct irq_source *is = irq_find_source(girq); + + return __xive_set_irq_config(is, girq, vp, prio, lirq, update_esb, + true); +} + +static void xive_source_interrupt(struct irq_source *is, uint32_t isn) +{ + struct xive_src *s = container_of(is, struct xive_src, is); + + if (!s->orig_ops || !s->orig_ops->interrupt) + return; + s->orig_ops->interrupt(is, isn); +} + +static uint64_t xive_source_attributes(struct irq_source *is, uint32_t isn) +{ + struct xive_src *s = container_of(is, struct xive_src, is); + + if (!s->orig_ops || !s->orig_ops->attributes) + return IRQ_ATTR_TARGET_LINUX; + return s->orig_ops->attributes(is, isn); +} + +static char *xive_source_name(struct irq_source *is, uint32_t isn) +{ + struct xive_src *s = container_of(is, struct xive_src, is); + + if (!s->orig_ops || !s->orig_ops->name) + return NULL; + return s->orig_ops->name(is, isn); +} + +static const struct irq_source_ops xive_irq_source_ops = { + .interrupt = xive_source_interrupt, + .attributes = xive_source_attributes, + .name = xive_source_name, +}; + +static void __xive_register_source(struct xive *x, struct xive_src *s, + uint32_t base, uint32_t count, + uint32_t shift, void *mmio, uint32_t flags, + bool secondary, void *data, + const struct irq_source_ops *orig_ops) +{ + s->esb_base = base; + s->esb_shift = shift; + s->esb_mmio = mmio; + s->flags = flags; + s->orig_ops = orig_ops; + s->xive = x; + s->is.start = base; + s->is.end = base + count; + s->is.ops = &xive_irq_source_ops; + s->is.data = data; + + __register_irq_source(&s->is, secondary); +} + +void xive2_register_hw_source(uint32_t base, uint32_t count, uint32_t shift, + void *mmio, uint32_t flags, void *data, + const struct irq_source_ops *ops) +{ + struct xive_src *s; + struct xive *x = xive_from_isn(base); + + assert(x); + + s = malloc(sizeof(struct xive_src)); + assert(s); + __xive_register_source(x, s, base, count, shift, mmio, flags, + false, data, ops); +} + +void xive2_register_ipi_source(uint32_t base, uint32_t count, void *data, + const struct irq_source_ops *ops) +{ + struct xive_src *s; + struct xive *x = xive_from_isn(base); + uint32_t base_idx = GIRQ_TO_IDX(base); + void *mmio_base; + uint32_t flags = XIVE_SRC_EOI_PAGE1 | XIVE_SRC_TRIGGER_PAGE; + + assert(x); + assert(base >= x->int_base && (base + count) <= x->int_ipi_top); + + s = malloc(sizeof(struct xive_src)); + assert(s); + + if (XIVE_CAN_STORE_EOI(x)) + flags |= XIVE_SRC_STORE_EOI; + + /* Callbacks assume the MMIO base corresponds to the first + * interrupt of that source structure so adjust it + */ + mmio_base = x->esb_base + (1ul << XIVE_ESB_SHIFT) * base_idx; + __xive_register_source(x, s, base, count, 
XIVE_ESB_SHIFT, mmio_base, + flags, false, data, ops); +} + +static void xive_set_quirks(struct xive *x, struct proc_chip *chip __unused) +{ + uint64_t quirks = 0; + + /* This extension is dropped for P10 */ + if (proc_gen == proc_gen_p10) + quirks |= XIVE_QUIRK_THREADID_7BITS; + + /* Broken check on invalid priority when reduced priorities is in use */ + if (proc_gen == proc_gen_p10) + quirks |= XIVE_QUIRK_BROKEN_PRIO_CHECK; + + xive_dbg(x, "setting XIVE quirks to %016llx\n", quirks); + x->quirks = quirks; +} + +static struct xive *init_one_xive(struct dt_node *np) +{ + struct xive *x; + struct proc_chip *chip; + uint32_t flags; + + x = zalloc(sizeof(struct xive)); + assert(x); + x->x_node = np; + x->xscom_base = dt_get_address(np, 0, NULL); + x->chip_id = dt_get_chip_id(np); + + /* "Allocate" a new block ID for the chip */ + x->block_id = xive_block_count++; + assert (x->block_id < XIVE_MAX_CHIPS); + xive_block_to_chip[x->block_id] = x->chip_id; + init_lock(&x->lock); + + chip = get_chip(x->chip_id); + assert(chip); + + xive_notice(x, "Initializing XIVE block ID %d...\n", x->block_id); + chip->xive = x; + + xive_set_quirks(x, chip); + + list_head_init(&x->donated_pages); + + /* Base interrupt numbers and allocator init */ + + x->int_base = BLKIDX_TO_GIRQ(x->block_id, 0); + x->int_count = x->int_base + XIVE_INT_COUNT; + x->int_hw_bot = x->int_count; + x->int_ipi_top = x->int_base; + + if (x->int_ipi_top < XIVE_INT_FIRST) + x->int_ipi_top = XIVE_INT_FIRST; + + /* Allocate a few bitmaps */ + x->end_map = local_alloc(x->chip_id, BITMAP_BYTES(xive_end_bitmap_size(x)), PAGE_SIZE); + assert(x->end_map); + memset(x->end_map, 0, BITMAP_BYTES(xive_end_bitmap_size(x))); + + /* + * Allocate END index 0 to make sure it can not be used as an + * END base for a VP. This is the criteria to know if a VP was + * allocated. + */ + bitmap_set_bit(*x->end_map, 0); + + x->int_enabled_map = local_alloc(x->chip_id, BITMAP_BYTES(XIVE_INT_COUNT), PAGE_SIZE); + assert(x->int_enabled_map); + memset(x->int_enabled_map, 0, BITMAP_BYTES(XIVE_INT_COUNT)); + x->ipi_alloc_map = local_alloc(x->chip_id, BITMAP_BYTES(XIVE_INT_COUNT), PAGE_SIZE); + assert(x->ipi_alloc_map); + memset(x->ipi_alloc_map, 0, BITMAP_BYTES(XIVE_INT_COUNT)); + + xive_dbg(x, "Handling interrupts [%08x..%08x]\n", + x->int_base, x->int_count - 1); + + /* Setup the IC BARs */ + if (!xive_configure_ic_bars(x)) + goto fail; + + /* Some basic global inits such as page sizes etc... */ + if (!xive_config_init(x)) + goto fail; + + /* Configure the set translations for MMIO */ + if (!xive_setup_set_xlate(x)) + goto fail; + + /* Dump some MMIO registers for diagnostics */ + xive_dump_mmio(x); + + /* Pre-allocate a number of tables */ + if (!xive_prealloc_tables(x)) + goto fail; + + /* Setup the XIVE structures BARs */ + if (!xive_configure_bars(x)) + goto fail; + + /* + * Configure local tables in VSDs (forward ports will be + * handled later) + */ + if (!xive_set_local_tables(x)) + goto fail; + + /* Register built-in source controllers (aka IPIs) */ + flags = XIVE_SRC_EOI_PAGE1 | XIVE_SRC_TRIGGER_PAGE; + if (XIVE_CAN_STORE_EOI(x)) + flags |= XIVE_SRC_STORE_EOI; + __xive_register_source(x, &x->ipis, x->int_base, + x->int_hw_bot - x->int_base, XIVE_ESB_SHIFT, + x->esb_base, flags, true, NULL, NULL); + + /* Register escalation sources (ENDs) + * + * The ESe PQ bits are used for coalescing and the END ESB for + * interrupt management. The word 4&5 of the END is the EAS + * for the escalation source and the indexing is the same as + * the END. 
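+ *
+ * A sketch of the numbering, assuming the usual encoding in
+ * the MAKE_ESCALATION_GIRQ()/GIRQ_IS_ESCALATION() macros: the
+ * escalation GIRQ is the END's block/index pair with an extra
+ * escalation flag bit set, so every END maps 1:1 to one
+ * escalation interrupt in the source registered below.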
+ * + * This is an OPAL primary source, IPIs are secondary. + */ + __xive_register_source(x, &x->esc_irqs, + MAKE_ESCALATION_GIRQ(x->block_id, 0), + XIVE_END_COUNT, XIVE_END_SHIFT, + x->end_base, XIVE_SRC_EOI_PAGE1, + false, NULL, NULL); + + + return x; + fail: + xive_err(x, "Initialization failed...\n"); + + /* Should this be fatal ? */ + //assert(false); + return NULL; +} + +static void xive_reset_enable_thread(struct cpu_thread *c) +{ + struct proc_chip *chip = get_chip(c->chip_id); + struct xive *x = chip->xive; + uint32_t fc, bit; + uint64_t enable; + + /* Get fused core number */ + fc = (c->pir >> 3) & 0xf; + + /* Get bit in register */ + bit = c->pir & 0x3f; + + /* Get which register to access */ + if (fc < 8) { + xive_regw(x, TCTXT_EN0_RESET, PPC_BIT(bit)); + xive_regw(x, TCTXT_EN0_SET, PPC_BIT(bit)); + + enable = xive_regr(x, TCTXT_EN0); + if (!(enable & PPC_BIT(bit))) + xive_cpu_err(c, "Failed to enable thread\n"); + } else { + xive_regw(x, TCTXT_EN1_RESET, PPC_BIT(bit)); + xive_regw(x, TCTXT_EN1_SET, PPC_BIT(bit)); + + enable = xive_regr(x, TCTXT_EN1); + if (!(enable & PPC_BIT(bit))) + xive_cpu_err(c, "Failed to enable thread\n"); + } +} + +void xive2_cpu_callin(struct cpu_thread *cpu) +{ + struct xive_cpu_state *xs = cpu->xstate; + uint8_t old_w2 __unused, w2 __unused; + + if (!xs) + return; + + /* Reset the HW thread context and enable it */ + xive_reset_enable_thread(cpu); + + /* Set VT to 1 */ + old_w2 = in_8(xs->tm_ring1 + TM_QW3_HV_PHYS + TM_WORD2); + out_8(xs->tm_ring1 + TM_QW3_HV_PHYS + TM_WORD2, 0x80); + w2 = in_8(xs->tm_ring1 + TM_QW3_HV_PHYS + TM_WORD2); + + xive_cpu_vdbg(cpu, "Initialized TIMA VP=%x/%x W01=%016llx W2=%02x->%02x\n", + xs->vp_blk, xs->vp_idx, + in_be64(xs->tm_ring1 + TM_QW3_HV_PHYS), + old_w2, w2); +} + +#ifdef XIVE_EXTRA_CHECK_INIT_CACHE +#define CHECK_INIT_CACHE_LOOP 0x100 +static void xive_special_cache_check(struct xive *x, uint32_t blk, uint32_t idx) +{ + struct xive_nvp vp = {0}; + uint32_t i; + + /* + * SIMICS checks the value of reserved fields + */ + if (chip_quirk(QUIRK_SIMICS)) + return; + + for (i = 0; i < CHECK_INIT_CACHE_LOOP; i++) { + struct xive_nvp *vp_m = xive_get_vp(x, idx); + + memset(vp_m, (~i) & 0xff, sizeof(*vp_m)); + sync(); + vp.w1 = (i << 16) | i; + assert(!xive_nxc_cache_update(x, blk, idx, &vp, true)); + if (!xive_check_nxc_update(x, idx, &vp)) { + xive_dbg(x, "NXC update test failed at %d iterations\n", i); + return; + } + } + xive_dbg(x, "NXC update test passed for %d/0x%x\n", blk, idx); +} +#else +static inline void xive_special_cache_check(struct xive *x __unused, + uint32_t blk __unused, + uint32_t idx __unused) +{ +} +#endif + +static void xive_init_cpu_exploitation(struct xive_cpu_state *xs) +{ + struct xive_end end; + struct xive_nvp vp; + struct xive *x_vp, *x_end; + int i; + + /* Grab the XIVE where the VP resides. It could be different from + * the local chip XIVE if not using block group mode + */ + x_vp = xive_from_pc_blk(xs->vp_blk); + assert(x_vp); + + /* Grab the XIVE where the END resides. It should be the same + * as the VP. 
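+ *
+ * With one block per chip, xive_provision_cpu() sets
+ * xs->end_blk = xs->vp_blk, so x_end resolves to the same
+ * instance as x_vp here; the separate lookup only matters if
+ * the VC/PC block assignment ever diverges.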
+ */ + x_end = xive_from_vc_blk(xs->end_blk); + assert(x_end); + + xive_init_hw_end(&end); + + /* Use the cache watch to update all ENDs reserved for HW VPs */ + lock(&x_end->lock); + for (i = 0; i < xive_cfg_vp_prio(x_end); i++) + xive_endc_cache_update(x_end, xs->end_blk, xs->end_idx + i, + &end, true); + unlock(&x_end->lock); + + /* Initialize/enable the VP */ + xive_init_default_vp(&vp, xs->end_blk, xs->end_idx); + + /* Use the cache watch to write it out */ + lock(&x_vp->lock); + xive_special_cache_check(x_vp, xs->vp_blk, xs->vp_idx); + xive_nxc_cache_update(x_vp, xs->vp_blk, xs->vp_idx, &vp, true); + unlock(&x_vp->lock); +} + +static void xive_configure_ex_special_bar(struct xive *x, struct cpu_thread *c) +{ + uint64_t xa, val; + int64_t rc; + + xive_cpu_vdbg(c, "Setting up special BAR\n"); + xa = XSCOM_ADDR_P10_NCU(pir_to_core_id(c->pir), P10_NCU_SPEC_BAR); + val = (uint64_t)x->tm_base | P10_NCU_SPEC_BAR_ENABLE; + if (x->tm_shift == 16) + val |= P10_NCU_SPEC_BAR_256K; + xive_cpu_vdbg(c, "NCU_SPEC_BAR_XA[%08llx]=%016llx\n", xa, val); + rc = xscom_write(c->chip_id, xa, val); + if (rc) { + xive_cpu_err(c, "Failed to setup NCU_SPEC_BAR\n"); + /* XXXX what do do now ? */ + } +} + +void xive2_late_init(void) +{ + prlog(PR_INFO, "SLW: Configuring self-restore for NCU_SPEC_BAR\n"); + /* + * TODO (p10): need P10 stop state engine and fix for STOP11 + */ +} + +static void xive_provision_cpu(struct xive_cpu_state *xs, struct cpu_thread *c) +{ + struct xive *x; + + /* VP ids for HW threads are pre-allocated */ + xs->vp_blk = PIR2VP_BLK(c->pir); + xs->vp_idx = PIR2VP_IDX(c->pir); + + /* For now we use identical block IDs for VC and PC but that might + * change. We allocate the ENDs on the same XIVE as the VP. + */ + xs->end_blk = xs->vp_blk; + + /* Grab the XIVE where the END resides. It could be different from + * the local chip XIVE if not using block group mode + */ + x = xive_from_vc_blk(xs->end_blk); + assert(x); + + /* Allocate a set of ENDs for that VP */ + xs->end_idx = xive_alloc_end_set(x, true); + assert(!XIVE_ALLOC_IS_ERR(xs->end_idx)); +} + +static void xive_init_cpu(struct cpu_thread *c) +{ + struct proc_chip *chip = get_chip(c->chip_id); + struct xive *x = chip->xive; + struct xive_cpu_state *xs; + + if (!x) + return; + + /* + * Each core pair (EX) needs this special BAR setup to have the + * right powerbus cycle for the TM area (as it has the same address + * on all chips so it's somewhat special). + * + * Because we don't want to bother trying to figure out which core + * of a pair is present we just do the setup for each of them, which + * is harmless. + */ + if (cpu_is_thread0(c)) + xive_configure_ex_special_bar(x, c); + + /* Initialize the state structure */ + c->xstate = xs = local_alloc(c->chip_id, sizeof(struct xive_cpu_state), 1); + assert(xs); + memset(xs, 0, sizeof(struct xive_cpu_state)); + xs->xive = x; + + init_lock(&xs->lock); + + /* Shortcut to TM HV ring */ + xs->tm_ring1 = x->tm_base + (1u << x->tm_shift); + + /* Provision a VP id and some ENDs for a HW thread */ + xive_provision_cpu(xs, c); + + xive_init_cpu_exploitation(xs); +} + +static uint64_t xive_convert_irq_flags(uint64_t iflags) +{ + uint64_t oflags = 0; + + if (iflags & XIVE_SRC_STORE_EOI) + oflags |= OPAL_XIVE_IRQ_STORE_EOI; + + /* OPAL_XIVE_IRQ_TRIGGER_PAGE is only meant to be set if + * the interrupt has a *separate* trigger page. 
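+ *
+ * The source flags combine as follows:
+ *
+ *   EOI_PAGE1 + TRIGGER_PAGE: page 0 triggers, page 1 EOIs
+ *                             -> separate page, flag set
+ *   TRIGGER_PAGE only:        trigger and EOI share page 0
+ *                             -> no separate page, flag unset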
+ */ + if ((iflags & XIVE_SRC_EOI_PAGE1) && + (iflags & XIVE_SRC_TRIGGER_PAGE)) + oflags |= OPAL_XIVE_IRQ_TRIGGER_PAGE; + + if (iflags & XIVE_SRC_LSI) + oflags |= OPAL_XIVE_IRQ_LSI; + + return oflags; +} + +static int64_t opal_xive_get_irq_info(uint32_t girq, + beint64_t *out_flags, + beint64_t *out_eoi_page, + beint64_t *out_trig_page, + beint32_t *out_esb_shift, + beint32_t *out_src_chip) +{ + struct irq_source *is = irq_find_source(girq); + struct xive_src *s = container_of(is, struct xive_src, is); + uint32_t idx; + uint64_t mm_base; + uint64_t eoi_page = 0, trig_page = 0; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + if (is == NULL || out_flags == NULL) + return OPAL_PARAMETER; + assert(is->ops == &xive_irq_source_ops); + + if (out_flags) + *out_flags = cpu_to_be64(xive_convert_irq_flags(s->flags)); + + idx = girq - s->esb_base; + + if (out_esb_shift) + *out_esb_shift = cpu_to_be32(s->esb_shift); + + mm_base = (uint64_t)s->esb_mmio + (1ull << s->esb_shift) * idx; + + /* The EOI page can either be the first or second page */ + if (s->flags & XIVE_SRC_EOI_PAGE1) { + uint64_t p1off = 1ull << (s->esb_shift - 1); + eoi_page = mm_base + p1off; + } else + eoi_page = mm_base; + + /* The trigger page, if it exists, is always the first page */ + if (s->flags & XIVE_SRC_TRIGGER_PAGE) + trig_page = mm_base; + + if (out_eoi_page) + *out_eoi_page = cpu_to_be64(eoi_page); + if (out_trig_page) + *out_trig_page = cpu_to_be64(trig_page); + if (out_src_chip) + *out_src_chip = cpu_to_be32(GIRQ_TO_CHIP(girq)); + + return OPAL_SUCCESS; +} + +static int64_t opal_xive_get_irq_config(uint32_t girq, + beint64_t *out_vp, + uint8_t *out_prio, + beint32_t *out_lirq) +{ + uint32_t vp; + uint32_t lirq; + uint8_t prio; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + + if (xive_get_irq_targetting(girq, &vp, &prio, &lirq)) { + *out_vp = cpu_to_be64(vp); + *out_prio = prio; + *out_lirq = cpu_to_be32(lirq); + return OPAL_SUCCESS; + } else + return OPAL_PARAMETER; +} + +static int64_t opal_xive_set_irq_config(uint32_t girq, + uint64_t vp, + uint8_t prio, + uint32_t lirq) +{ + /* + * This variant is meant for a XIVE-aware OS, thus it will + * *not* affect the ESB state of the interrupt. If used with + * a prio of FF, the EAS will be masked. In that case the + * races have to be handled by the OS. + */ + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + + return xive_set_irq_config(girq, vp, prio, lirq, false); +} + +static int64_t opal_xive_get_queue_info(uint64_t vp, uint32_t prio, + beint64_t *out_qpage, + beint64_t *out_qsize, + beint64_t *out_qeoi_page, + beint32_t *out_escalate_irq, + beint64_t *out_qflags) +{ + uint32_t blk, idx; + struct xive *x; + struct xive_end *end; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + + if (!xive_end_for_target(vp, prio, &blk, &idx)) + return OPAL_PARAMETER; + + x = xive_from_vc_blk(blk); + if (!x) + return OPAL_PARAMETER; + + end = xive_get_end(x, idx); + if (!end) + return OPAL_PARAMETER; + + if (out_escalate_irq) { + uint32_t esc_idx = idx; + + /* If escalations are routed to a single queue, fix up + * the escalation interrupt number here. 
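+ *
+ * For example, with the escalation priority at 7 (its value
+ * when all 8 priorities are available), a base END index of
+ * 0x40 is reported as escalation 0x47, i.e. the gather queue
+ * of that VP.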
+ */ + if (xive_get_field32(END_W0_UNCOND_ESCALATE, end->w0)) + esc_idx |= xive_escalation_prio(x); + *out_escalate_irq = + cpu_to_be32(MAKE_ESCALATION_GIRQ(blk, esc_idx)); + } + + /* If this is a single-escalation gather queue, that's all + * there is to return + */ + if (xive_get_field32(END_W0_SILENT_ESCALATE, end->w0)) { + if (out_qflags) + *out_qflags = 0; + if (out_qpage) + *out_qpage = 0; + if (out_qsize) + *out_qsize = 0; + if (out_qeoi_page) + *out_qeoi_page = 0; + return OPAL_SUCCESS; + } + + if (out_qpage) { + if (xive_get_field32(END_W0_ENQUEUE, end->w0)) + *out_qpage = cpu_to_be64( + ((uint64_t)xive_get_field32(END_W2_EQ_ADDR_HI, end->w2) << 32) | + xive_get_field32(END_W3_EQ_ADDR_LO, end->w3)); + else + *out_qpage = 0; + } + if (out_qsize) { + if (xive_get_field32(END_W0_ENQUEUE, end->w0)) + *out_qsize = cpu_to_be64(xive_get_field32(END_W3_QSIZE, end->w3) + 12); + else + *out_qsize = 0; + } + if (out_qeoi_page) { + *out_qeoi_page = cpu_to_be64( + (uint64_t)x->end_base + idx * XIVE_ESB_PAGE_SIZE); + } + if (out_qflags) { + *out_qflags = 0; + if (xive_get_field32(END_W0_VALID, end->w0)) + *out_qflags |= cpu_to_be64(OPAL_XIVE_EQ_ENABLED); + if (xive_get_field32(END_W0_UCOND_NOTIFY, end->w0)) + *out_qflags |= cpu_to_be64(OPAL_XIVE_EQ_ALWAYS_NOTIFY); + if (xive_get_field32(END_W0_ESCALATE_CTL, end->w0)) + *out_qflags |= cpu_to_be64(OPAL_XIVE_EQ_ESCALATE); + } + + return OPAL_SUCCESS; +} + +static void xive_cleanup_end(struct xive_end *end) +{ + end->w0 = xive_set_field32(END_W0_FIRMWARE1, 0, xive_end_is_firmware1(end)); + end->w1 = xive_set_field32(END_W1_ESe_Q, 0, 1) | + xive_set_field32(END_W1_ESn_Q, 0, 1); + end->w2 = end->w3 = end->w4 = end->w5 = end->w6 = end->w7 = 0; +} + +static int64_t opal_xive_set_queue_info(uint64_t vp, uint32_t prio, + uint64_t qpage, + uint64_t qsize, + uint64_t qflags) +{ + uint32_t blk, idx; + struct xive *x; + struct xive_end *old_end; + struct xive_end end; + uint32_t vp_blk, vp_idx; + bool group; + int64_t rc; + + if (!xive_end_for_target(vp, prio, &blk, &idx)) + return OPAL_PARAMETER; + + x = xive_from_vc_blk(blk); + if (!x) + return OPAL_PARAMETER; + + old_end = xive_get_end(x, idx); + if (!old_end) + return OPAL_PARAMETER; + + /* If this is a silent escalation queue, it cannot be + * configured directly + */ + if (xive_get_field32(END_W0_SILENT_ESCALATE, old_end->w0)) + return OPAL_PARAMETER; + + /* This shouldn't fail or xive_end_for_target would have + * failed already + */ + if (!xive_decode_vp(vp, &vp_blk, &vp_idx, NULL, &group)) + return OPAL_PARAMETER; + + /* + * Make a local copy which we will later try to commit using + * the cache watch facility + */ + end = *old_end; + + if (qflags & OPAL_XIVE_EQ_ENABLED) { + switch(qsize) { + /* Supported sizes */ + case 12: + case 16: + case 21: + case 24: + end.w3 = cpu_to_be32(qpage & END_W3_EQ_ADDR_LO); + end.w2 = cpu_to_be32((qpage >> 32) & END_W2_EQ_ADDR_HI); + end.w3 = xive_set_field32(END_W3_QSIZE, end.w3, qsize - 12); + end.w0 = xive_set_field32(END_W0_ENQUEUE, end.w0, 1); + break; + case 0: + end.w2 = end.w3 = 0; + end.w0 = xive_set_field32(END_W0_ENQUEUE, end.w0, 0); + break; + default: + return OPAL_PARAMETER; + } + + /* Ensure the priority and target are correctly set (they will + * not be right after allocation + */ + end.w6 = xive_set_field32(END_W6_VP_BLOCK, 0, vp_blk) | + xive_set_field32(END_W6_VP_OFFSET, 0, vp_idx); + end.w7 = xive_set_field32(END_W7_F0_PRIORITY, 0, prio); + /* XXX Handle group i bit when needed */ + + /* Always notify flag */ + if (qflags & 
OPAL_XIVE_EQ_ALWAYS_NOTIFY) + end.w0 = xive_set_field32(END_W0_UCOND_NOTIFY, end.w0, 1); + else + end.w0 = xive_set_field32(END_W0_UCOND_NOTIFY, end.w0, 0); + + /* Escalation flag */ + if (qflags & OPAL_XIVE_EQ_ESCALATE) + end.w0 = xive_set_field32(END_W0_ESCALATE_CTL, end.w0, 1); + else + end.w0 = xive_set_field32(END_W0_ESCALATE_CTL, end.w0, 0); + + /* Unconditionally clear the current queue pointer, set + * generation to 1 and disable escalation interrupts. + */ + end.w1 = xive_set_field32(END_W1_GENERATION, 0, 1) | + xive_set_field32(END_W1_ES, 0, xive_get_field32(END_W1_ES, old_end->w1)); + + /* Enable. We always enable backlog for an enabled queue + * otherwise escalations won't work. + */ + end.w0 = xive_set_field32(END_W0_VALID, end.w0, 1); + end.w0 = xive_set_field32(END_W0_BACKLOG, end.w0, 1); + } else + xive_cleanup_end(&end); + + /* Update END, non-synchronous */ + lock(&x->lock); + rc = xive_endc_cache_update(x, blk, idx, &end, false); + unlock(&x->lock); + + return rc; +} + +static int64_t opal_xive_get_queue_state(uint64_t vp, uint32_t prio, + beint32_t *out_qtoggle, + beint32_t *out_qindex) +{ + uint32_t blk, idx; + struct xive *x; + struct xive_end *end; + int64_t rc; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + + if (!out_qtoggle || !out_qindex || + !xive_end_for_target(vp, prio, &blk, &idx)) + return OPAL_PARAMETER; + + x = xive_from_vc_blk(blk); + if (!x) + return OPAL_PARAMETER; + + end = xive_get_end(x, idx); + if (!end) + return OPAL_PARAMETER; + + /* Scrub the queue */ + lock(&x->lock); + rc = xive_endc_scrub(x, blk, idx); + unlock(&x->lock); + if (rc) + return rc; + + /* We don't do disable queues */ + if (!xive_get_field32(END_W0_VALID, end->w0)) + return OPAL_WRONG_STATE; + + *out_qtoggle = cpu_to_be32(xive_get_field32(END_W1_GENERATION, end->w1)); + *out_qindex = cpu_to_be32(xive_get_field32(END_W1_PAGE_OFF, end->w1)); + + return OPAL_SUCCESS; +} + +static int64_t opal_xive_set_queue_state(uint64_t vp, uint32_t prio, + uint32_t qtoggle, uint32_t qindex) +{ + uint32_t blk, idx; + struct xive *x; + struct xive_end *end, new_end; + int64_t rc; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + + if (!xive_end_for_target(vp, prio, &blk, &idx)) + return OPAL_PARAMETER; + + x = xive_from_vc_blk(blk); + if (!x) + return OPAL_PARAMETER; + + end = xive_get_end(x, idx); + if (!end) + return OPAL_PARAMETER; + + /* We don't do disable queues */ + if (!xive_get_field32(END_W0_VALID, end->w0)) + return OPAL_WRONG_STATE; + + new_end = *end; + + new_end.w1 = xive_set_field32(END_W1_GENERATION, new_end.w1, qtoggle); + new_end.w1 = xive_set_field32(END_W1_PAGE_OFF, new_end.w1, qindex); + + lock(&x->lock); + rc = xive_endc_cache_update(x, blk, idx, &new_end, false); + unlock(&x->lock); + + return rc; +} + +static int64_t opal_xive_donate_page(uint32_t chip_id, uint64_t addr) +{ + struct proc_chip *c = get_chip(chip_id); + struct list_node *n; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + if (!c) + return OPAL_PARAMETER; + if (!c->xive) + return OPAL_PARAMETER; + if (addr & 0xffff) + return OPAL_PARAMETER; + + n = (struct list_node *)addr; + lock(&c->xive->lock); + list_add(&c->xive->donated_pages, n); + unlock(&c->xive->lock); + return OPAL_SUCCESS; +} + +static int64_t opal_xive_get_vp_info(uint64_t vp_id, + beint64_t *out_flags, + beint64_t *out_cam_value, + beint64_t *out_report_cl_pair, + beint32_t *out_chip_id) +{ + struct xive *x; + struct xive_nvp *vp; + uint32_t blk, idx; + bool group; + + if (!xive_decode_vp(vp_id, 
&blk, &idx, NULL, &group)) + return OPAL_PARAMETER; + /* We don't do groups yet */ + if (group) + return OPAL_PARAMETER; + x = xive_from_pc_blk(blk); + if (!x) + return OPAL_PARAMETER; + vp = xive_get_vp(x, idx); + if (!vp) + return OPAL_PARAMETER; + + if (out_flags) { + uint32_t end_blk, end_idx; + struct xive_end *end; + struct xive *end_x; + *out_flags = 0; + + /* + * We would like to a way to stash a SW bit in the VP + * to know whether silent escalation is enabled or + * not, but unlike what happens with ENDs, the PC + * cache watch doesn't implement the reserved bit in + * the VPs... so we have to go look at END 7 instead. + */ + + /* Grab END for prio 7 to check for silent escalation */ + if (!xive_end_for_target(vp_id, xive_escalation_prio(x), + &end_blk, &end_idx)) + return OPAL_PARAMETER; + + end_x = xive_from_vc_blk(end_blk); + if (!end_x) + return OPAL_PARAMETER; + + end = xive_get_end(x, end_idx); + if (!end) + return OPAL_PARAMETER; + if (xive_get_field32(NVP_W0_VALID, vp->w0)) + *out_flags |= cpu_to_be64(OPAL_XIVE_VP_ENABLED); + if (xive_get_field32(END_W0_SILENT_ESCALATE, end->w0)) + *out_flags |= cpu_to_be64(OPAL_XIVE_VP_SINGLE_ESCALATION); + } + + if (out_cam_value) { + uint64_t cam_value; + + cam_value = (blk << x->vp_shift) | idx; + + *out_cam_value = cpu_to_be64(cam_value); + } + + if (out_report_cl_pair) { + uint64_t report_cl_pair; + + report_cl_pair = ((uint64_t)(be32_to_cpu(vp->w6) & 0x0fffffff)) << 32; + report_cl_pair |= be32_to_cpu(vp->w7) & 0xffffff00; + + *out_report_cl_pair = cpu_to_be64(report_cl_pair); + } + + if (out_chip_id) + *out_chip_id = cpu_to_be32(xive_block_to_chip[blk]); + + return OPAL_SUCCESS; +} + +static int64_t xive_setup_silent_gather(uint64_t vp_id, bool enable) +{ + uint32_t blk, idx, i; + struct xive_end *end_orig; + struct xive_end end; + struct xive *x; + int64_t rc; + + /* Get base END block */ + if (!xive_end_for_target(vp_id, 0, &blk, &idx)) { + prlog(PR_ERR, "%s: Invalid VP 0x%08llx\n", __func__, vp_id); + return OPAL_PARAMETER; + } + x = xive_from_vc_blk(blk); + if (!x) { + prlog(PR_ERR, "%s: VP 0x%08llx has invalid block %d\n", __func__, + vp_id, blk); + return OPAL_PARAMETER; + } + + /* Grab prio 7 */ + end_orig = xive_get_end(x, idx + xive_escalation_prio(x)); + if (!end_orig) { + xive_err(x, "Failed to get silent gather END 0x%x for VP 0x%08llx\n", + idx + xive_escalation_prio(x), vp_id); + return OPAL_PARAMETER; + } + + /* If trying to enable silent gather, make sure prio 7 is not + * already enabled as a normal queue + */ + if (enable && xive_get_field32(END_W0_VALID, end_orig->w0) && + !xive_get_field32(END_W0_SILENT_ESCALATE, end_orig->w0)) { + xive_err(x, "silent gather END 0x%x already in use\n", + idx + xive_escalation_prio(x)); + return OPAL_PARAMETER; + } + + end = *end_orig; + + if (enable) { + /* W0: Enabled and "s" set, no other bit */ + end.w0 = xive_set_field32(END_W0_FIRMWARE1, end.w0, 0); + end.w0 = xive_set_field32(END_W0_VALID, end.w0, 1); + end.w0 = xive_set_field32(END_W0_SILENT_ESCALATE, end.w0, 1); + end.w0 = xive_set_field32(END_W0_ESCALATE_CTL, end.w0, 1); + end.w0 = xive_set_field32(END_W0_BACKLOG, end.w0, 1); + + /* Set new "N" for END escalation (vs. 
ESB) */ + end.w0 = xive_set_field32(END_W0_ESCALATE_END, end.w0, 1); + + /* W1: Mark ESn as 01, ESe as 00 */ + end.w1 = xive_set_field32(END_W1_ESn_P, end.w1, 0); + end.w1 = xive_set_field32(END_W1_ESn_Q, end.w1, 1); + end.w1 = xive_set_field32(END_W1_ESe, end.w1, 0); + } else if (xive_get_field32(END_W0_SILENT_ESCALATE, end.w0)) + xive_cleanup_end(&end); + + if (!memcmp(end_orig, &end, sizeof(end))) + rc = 0; + else + rc = xive_endc_cache_update(x, blk, idx + xive_escalation_prio(x), + &end, false); + if (rc) + return rc; + + /* Mark/unmark all other prios with the new "u" bit and update + * escalation + */ + for (i = 0; i < xive_cfg_vp_prio(x); i++) { + if (i == xive_escalation_prio(x)) + continue; + end_orig = xive_get_end(x, idx + i); + if (!end_orig) + continue; + end = *end_orig; + if (enable) { + /* Set "u" bit */ + end.w0 = xive_set_field32(END_W0_UNCOND_ESCALATE, end.w0, 1); + + /* Set new "N" for END escalation (vs. ESB) */ + /* TODO (Gen2+) : use ESB escalation configuration */ + end.w0 = xive_set_field32(END_W0_ESCALATE_END, end.w0, 1); + + /* Re-route escalation interrupt (previous + * route is lost !) to the gather queue + */ + end.w4 = xive_set_field32(END_W4_END_BLOCK, end.w4, blk); + end.w4 = xive_set_field32(END_W4_ESC_END_INDEX, + end.w4, idx + xive_escalation_prio(x)); + } else if (xive_get_field32(END_W0_UNCOND_ESCALATE, end.w0)) { + /* Clear the "u" bit, disable escalations if it was set */ + end.w0 = xive_set_field32(END_W0_UNCOND_ESCALATE, end.w0, 0); + end.w0 = xive_set_field32(END_W0_ESCALATE_CTL, end.w0, 0); + } + if (!memcmp(end_orig, &end, sizeof(end))) + continue; + rc = xive_endc_cache_update(x, blk, idx + i, &end, false); + if (rc) + break; + } + + return rc; +} + +static int64_t opal_xive_set_vp_info(uint64_t vp_id, + uint64_t flags, + uint64_t report_cl_pair) +{ + struct xive *x; + struct xive_nvp *vp, vp_new; + uint32_t blk, idx; + bool group; + int64_t rc; + + if (!xive_decode_vp(vp_id, &blk, &idx, NULL, &group)) + return OPAL_PARAMETER; + /* We don't do groups yet */ + if (group) + return OPAL_PARAMETER; + if (report_cl_pair & 0xff) + return OPAL_PARAMETER; + x = xive_from_pc_blk(blk); + if (!x) + return OPAL_PARAMETER; + vp = xive_get_vp(x, idx); + if (!vp) + return OPAL_PARAMETER; + + lock(&x->lock); + + vp_new = *vp; + if (flags & OPAL_XIVE_VP_ENABLED) { + vp_new.w0 = xive_set_field32(NVP_W0_VALID, vp_new.w0, 1); + vp_new.w6 = cpu_to_be32(report_cl_pair >> 32); + vp_new.w7 = cpu_to_be32(report_cl_pair & 0xffffffff); + + if (flags & OPAL_XIVE_VP_SINGLE_ESCALATION) + rc = xive_setup_silent_gather(vp_id, true); + else + rc = xive_setup_silent_gather(vp_id, false); + } else { + /* + * TODO (kvm): disabling a VP invalidates the associated ENDs. + * + * The loads then return all 1s which can be an issue for the + * Linux code to handle. 
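+ *
+ * That is, an ESB load on the queue page of a disabled VP no
+ * longer returns one of the 0-3 PQ values but all 1s, so an
+ * OS racing a queue access against the disable would have to
+ * treat the all-ones pattern as "queue gone" rather than as
+ * PQ state.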
+ */ + + vp_new.w0 = vp_new.w6 = vp_new.w7 = 0; + rc = xive_setup_silent_gather(vp_id, false); + } + + if (rc) { + if (rc != OPAL_BUSY) + xive_dbg(x, "Silent gather setup failed with err %lld\n", rc); + goto bail; + } + + rc = xive_nxc_cache_update(x, blk, idx, &vp_new, false); + if (rc) + goto bail; + + /* When disabling, we scrub clean (invalidate the entry) so + * we can avoid cache ops in alloc/free + */ + if (!(flags & OPAL_XIVE_VP_ENABLED)) + xive_nxc_scrub_clean(x, blk, idx); + +bail: + unlock(&x->lock); + return rc; +} + +static int64_t opal_xive_get_vp_state(uint64_t vp_id, beint64_t *out_state) +{ + struct xive *x; + struct xive_nvp *vp; + uint32_t blk, idx; + int64_t rc; + bool group; + + if (!out_state || !xive_decode_vp(vp_id, &blk, &idx, NULL, &group)) + return OPAL_PARAMETER; + if (group) + return OPAL_PARAMETER; + x = xive_from_pc_blk(blk); + if (!x) + return OPAL_PARAMETER; + vp = xive_get_vp(x, idx); + if (!vp) + return OPAL_PARAMETER; + + /* Scrub the vp */ + lock(&x->lock); + rc = xive_nxc_scrub(x, blk, idx); + unlock(&x->lock); + if (rc) + return rc; + + if (!xive_get_field32(NVP_W0_VALID, vp->w0)) + return OPAL_WRONG_STATE; + + /* + * return a state matching the layout of WORD 0-1 of the TIMA + * as this is expected by current implementation. + */ + *out_state = cpu_to_be64(((uint64_t) 0x0) << 54 | + (uint64_t)xive_get_field32(NVP_W2_CPPR, vp->w2) << 48 | + (uint64_t)xive_get_field32(NVP_W2_IPB, vp->w2) << 40 | + (uint64_t)xive_get_field32(NVP_W2_LSMFB, vp->w2) << 32); + + return OPAL_SUCCESS; +} + +static void *xive_cpu_get_tima(struct cpu_thread *c) +{ + struct xive_cpu_state *xs = c->xstate; + struct xive *x = xs->xive; + + return x->ic_tm_direct_base + ((c->pir & 0xff) << x->ic_shift); +} + +static void xive_cleanup_cpu_tima(struct cpu_thread *c) +{ + struct xive_cpu_state *xs __unused = c->xstate; + void *cpu_tm_base = xive_cpu_get_tima(c); + uint8_t old_w2 __unused, w2 __unused; + + /* Reset the HW context */ + xive_reset_enable_thread(c); + + /* Set VT to 1 */ + old_w2 = in_8(cpu_tm_base + TM_QW3_HV_PHYS + TM_WORD2); + out_8(cpu_tm_base + TM_QW3_HV_PHYS + TM_WORD2, 0x80); + w2 = in_8(cpu_tm_base + TM_QW3_HV_PHYS + TM_WORD2); + + /* Dump HV state */ + xive_cpu_vdbg(c, "[reset] VP TIMA VP=%x/%x W01=%016llx W2=%02x->%02x\n", + xs->vp_blk, xs->vp_idx, + in_be64(cpu_tm_base + TM_QW3_HV_PHYS), + old_w2, w2); +} + +static int64_t xive_vc_ind_cache_kill(struct xive *x, uint64_t type) +{ + uint64_t val; + + /* We clear the whole thing */ + xive_regw(x, VC_AT_MACRO_KILL_MASK, 0); + xive_regw(x, VC_AT_MACRO_KILL, VC_AT_MACRO_KILL_VALID | + SETFIELD(VC_AT_MACRO_KILL_VSD, 0ull, type)); + + /* XXX Add timeout */ + for (;;) { + val = xive_regr(x, VC_AT_MACRO_KILL); + if (!(val & VC_AT_MACRO_KILL_VALID)) + break; + } + return 0; +} + +static int64_t xive_pc_ind_cache_kill(struct xive *x) +{ + uint64_t val; + + /* We clear the whole thing */ + xive_regw(x, PC_AT_KILL_MASK, 0); + xive_regw(x, PC_AT_KILL, PC_AT_KILL_VALID | + SETFIELD(VC_AT_MACRO_KILL_VSD, 0ull, VST_NVP)); + + /* XXX Add timeout */ + for (;;) { + val = xive_regr(x, PC_AT_KILL); + if (!(val & PC_AT_KILL_VALID)) + break; + } + return 0; +} + +static void xive_cleanup_vp_ind(struct xive *x) +{ + int i; + + xive_dbg(x, "Cleaning up %d VP ind entries...\n", x->vp_ind_count); + for (i = 0; i < x->vp_ind_count; i++) { + if (be64_to_cpu(x->vp_ind_base[i]) & VSD_FIRMWARE) { + xive_dbg(x, " %04x ... skip (firmware)\n", i); + continue; + } + if (x->vp_ind_base[i] != 0) { + x->vp_ind_base[i] = 0; + xive_dbg(x, " %04x ... 
cleaned\n", i); + } + } + xive_pc_ind_cache_kill(x); +} + +static void xive_cleanup_end_ind(struct xive *x) +{ + int i; + + xive_dbg(x, "Cleaning up %d END ind entries...\n", x->end_ind_count); + for (i = 0; i < x->end_ind_count; i++) { + if (be64_to_cpu(x->end_ind_base[i]) & VSD_FIRMWARE) { + xive_dbg(x, " %04x ... skip (firmware)\n", i); + continue; + } + if (x->end_ind_base[i] != 0) { + x->end_ind_base[i] = 0; + xive_dbg(x, " %04x ... cleaned\n", i); + } + } + xive_vc_ind_cache_kill(x, VST_END); +} + +static void xive_reset_one(struct xive *x) +{ + struct cpu_thread *c; + bool end_firmware; + int i; + + xive_notice(x, "Resetting one xive...\n"); + + lock(&x->lock); + + /* Check all interrupts are disabled */ + i = bitmap_find_one_bit(*x->int_enabled_map, 0, XIVE_INT_COUNT); + if (i >= 0) + xive_warn(x, "Interrupt %d (and maybe more) not disabled" + " at reset !\n", i); + + /* Reset IPI allocation */ + xive_dbg(x, "freeing alloc map %p/%p\n", + x->ipi_alloc_map, *x->ipi_alloc_map); + memset(x->ipi_alloc_map, 0, BITMAP_BYTES(XIVE_INT_COUNT)); + + xive_dbg(x, "Resetting ENDs...\n"); + + /* Reset all allocated ENDs and free the user ones */ + bitmap_for_each_one(*x->end_map, xive_end_bitmap_size(x), i) { + struct xive_end end0; + struct xive_end *end; + int j; + + if (i == 0) + continue; + end_firmware = false; + for (j = 0; j < xive_cfg_vp_prio(x); j++) { + uint32_t idx = (i << xive_cfg_vp_prio_shift(x)) | j; + + end = xive_get_end(x, idx); + if (!end) + continue; + + /* We need to preserve the firmware bit, otherwise + * we will incorrectly free the ENDs that are reserved + * for the physical CPUs + */ + if (xive_get_field32(END_W0_VALID, end->w0)) { + if (!xive_end_is_firmware1(end)) + xive_dbg(x, "END 0x%x:0x%x is valid at reset: %08x %08x\n", + x->block_id, idx, end->w0, end->w1); + end0 = *end; + xive_cleanup_end(&end0); + xive_endc_cache_update(x, x->block_id, idx, &end0, true); + } + if (xive_end_is_firmware1(end)) + end_firmware = true; + } + if (!end_firmware) + bitmap_clr_bit(*x->end_map, i); + } + + /* Take out all VPs from HW and reset all CPPRs to 0 */ + for_each_present_cpu(c) { + if (c->chip_id != x->chip_id) + continue; + if (!c->xstate) + continue; + xive_cleanup_cpu_tima(c); + } + + /* Reset all user-allocated VPs. This is inefficient, we should + * either keep a bitmap of allocated VPs or add an iterator to + * the buddy which is trickier but doable. + */ + for (i = 0; i < XIVE_VP_COUNT(x); i++) { + struct xive_nvp *vp; + struct xive_nvp vp0 = {0}; + + /* Ignore the physical CPU VPs */ + if (i >= xive_hw_vp_count && + i < (xive_hw_vp_base + xive_hw_vp_count)) + continue; + + /* Is the VP valid ? 
*/ + vp = xive_get_vp(x, i); + if (!vp || !xive_get_field32(NVP_W0_VALID, vp->w0)) + continue; + + /* Clear it */ + xive_dbg(x, "VP 0x%x:0x%x is valid at reset\n", x->block_id, i); + xive_nxc_cache_update(x, x->block_id, i, &vp0, true); + } + + /* Forget about remaining donated pages */ + list_head_init(&x->donated_pages); + + /* And cleanup donated indirect VP and END pages */ + xive_cleanup_vp_ind(x); + xive_cleanup_end_ind(x); + + /* The rest must not be called with the lock held */ + unlock(&x->lock); + + /* Re-configure VPs */ + for_each_present_cpu(c) { + struct xive_cpu_state *xs = c->xstate; + + if (c->chip_id != x->chip_id || !xs) + continue; + + xive_init_cpu_exploitation(xs); + } +} + +static void xive_reset_mask_source_cb(struct irq_source *is, + void *data __unused) +{ + struct xive_src *s = container_of(is, struct xive_src, is); + struct xive *x; + uint32_t isn; + + if (is->ops != &xive_irq_source_ops) + return; + + /* Skip escalation sources */ + if (GIRQ_IS_ESCALATION(is->start)) + return; + + x = s->xive; + + /* Iterate all interrupts */ + for (isn = is->start; isn < is->end; isn++) { + /* Has it ever been enabled ? */ + if (!bitmap_tst_bit(*x->int_enabled_map, GIRQ_TO_IDX(isn))) + continue; + /* Mask it and clear the enabled map bit */ + xive_vdbg(x, "[reset] disabling source 0x%x\n", isn); + __xive_set_irq_config(is, isn, 0, 0xff, isn, true, false); + bitmap_clr_bit(*x->int_enabled_map, GIRQ_TO_IDX(isn)); + } +} + +void xive2_cpu_reset(void) +{ + struct cpu_thread *c = this_cpu(); + struct xive_cpu_state *xs = c->xstate; + + out_8(xs->tm_ring1 + TM_QW3_HV_PHYS + TM_CPPR, 0); + + in_be64(xs->tm_ring1 + TM_SPC_PULL_POOL_CTX); +} + +static int64_t __xive_reset(uint64_t version) +{ + struct proc_chip *chip; + + xive_mode = version; + + /* Mask all interrupt sources */ + irq_for_each_source(xive_reset_mask_source_cb, NULL); + + /* For each XIVE do a sync... */ + for_each_chip(chip) { + if (!chip->xive) + continue; + xive_sync(chip->xive); + } + + /* For each XIVE reset everything else... */ + for_each_chip(chip) { + if (!chip->xive) + continue; + xive_reset_one(chip->xive); + } + + /* Cleanup global VP allocator */ + buddy_reset(xive_vp_buddy); + + /* + * We reserve the whole range of VP ids for HW threads. + */ + assert(buddy_reserve(xive_vp_buddy, xive_hw_vp_base, xive_threadid_shift)); + + return OPAL_SUCCESS; +} + +/* Called by fast reboot */ +int64_t xive2_reset(void) +{ + if (xive_mode == XIVE_MODE_NONE) + return OPAL_SUCCESS; + return __xive_reset(XIVE_MODE_EXPL); +} + +static int64_t opal_xive_reset(uint64_t version) +{ + prlog(PR_DEBUG, "XIVE reset, version: %d...\n", (int)version); + + if (version != XIVE_MODE_EXPL) { + prerror("ignoring version %lld at reset. 
" + "XIVE exploitation mode is the default\n", version); + } + + return __xive_reset(XIVE_MODE_EXPL); +} + +static int64_t opal_xive_free_vp_block(uint64_t vp_base) +{ + uint32_t blk, idx, i, j, count; + uint8_t order; + bool group; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + + if (!xive_decode_vp(vp_base, &blk, &idx, &order, &group)) + return OPAL_PARAMETER; + if (group) + return OPAL_PARAMETER; + if (blk) + return OPAL_PARAMETER; + if (order < (xive_chips_alloc_bits + 1)) + return OPAL_PARAMETER; + if (idx & ((1 << (order - xive_chips_alloc_bits)) - 1)) + return OPAL_PARAMETER; + + count = 1 << order; + for (i = 0; i < count; i++) { + uint32_t vp_id = vp_base + i; + uint32_t blk, idx, end_blk, end_idx; + struct xive *x; + struct xive_nvp *vp; + + if (!xive_decode_vp(vp_id, &blk, &idx, NULL, NULL)) { + prerror("Couldn't decode VP id %u\n", vp_id); + return OPAL_INTERNAL_ERROR; + } + x = xive_from_pc_blk(blk); + if (!x) { + prerror("Instance not found for deallocated VP" + " block %d\n", blk); + return OPAL_INTERNAL_ERROR; + } + vp = xive_get_vp(x, idx); + if (!vp) { + prerror("VP not found for deallocation !"); + return OPAL_INTERNAL_ERROR; + } + + /* VP must be disabled */ + if (xive_get_field32(NVP_W0_VALID, vp->w0)) { + prlog(PR_ERR, "freeing active VP %d\n", vp_id); + return OPAL_XIVE_FREE_ACTIVE; + } + + /* Not populated */ + if (vp->w5 == 0) + continue; + + end_blk = xive_get_field32(NVP_W5_VP_END_BLOCK, vp->w5); + end_idx = xive_get_field32(NVP_W5_VP_END_INDEX, vp->w5); + + lock(&x->lock); + + /* Ensure ENDs are disabled and cleaned up. Ideally the caller + * should have done it but we double check it here + */ + for (j = 0; j < xive_cfg_vp_prio(x); j++) { + struct xive *end_x = xive_from_vc_blk(end_blk); + struct xive_end end, *orig_end = xive_get_end(end_x, end_idx + j); + + if (!xive_get_field32(END_W0_VALID, orig_end->w0)) + continue; + + prlog(PR_WARNING, "freeing VP %d with queue %d active\n", + vp_id, j); + end = *orig_end; + xive_cleanup_end(&end); + xive_endc_cache_update(x, end_blk, end_idx + j, &end, true); + } + + /* Mark it not populated so we don't try to free it again */ + vp->w5 = 0; + + if (end_blk != blk) { + prerror("Block mismatch trying to free ENDs\n"); + unlock(&x->lock); + return OPAL_INTERNAL_ERROR; + } + + xive_free_end_set(x, end_idx); + unlock(&x->lock); + } + + xive_free_vps(vp_base); + + return OPAL_SUCCESS; +} + +static int64_t opal_xive_alloc_vp_block(uint32_t alloc_order) +{ + uint32_t vp_base, ends, count, i; + int64_t rc; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + + prlog(PR_TRACE, "opal_xive_alloc_vp_block(%d)\n", alloc_order); + + vp_base = xive_alloc_vps(alloc_order); + if (XIVE_ALLOC_IS_ERR(vp_base)) { + if (vp_base == XIVE_ALLOC_NO_IND) + return OPAL_XIVE_PROVISIONING; + return OPAL_RESOURCE; + } + + /* Allocate ENDs and initialize VPs */ + count = 1 << alloc_order; + for (i = 0; i < count; i++) { + uint32_t vp_id = vp_base + i; + uint32_t blk, idx; + struct xive *x; + struct xive_nvp *vp; + + if (!xive_decode_vp(vp_id, &blk, &idx, NULL, NULL)) { + prerror("Couldn't decode VP id %u\n", vp_id); + return OPAL_INTERNAL_ERROR; + } + x = xive_from_pc_blk(blk); + if (!x) { + prerror("Instance not found for allocated VP" + " block %d\n", blk); + rc = OPAL_INTERNAL_ERROR; + goto fail; + } + vp = xive_get_vp(x, idx); + if (!vp) { + prerror("VP not found after allocation !"); + rc = OPAL_INTERNAL_ERROR; + goto fail; + } + + /* Allocate ENDs, if fails, free the VPs and return */ + lock(&x->lock); + ends = 
xive_alloc_end_set(x, false); + unlock(&x->lock); + if (XIVE_ALLOC_IS_ERR(ends)) { + if (ends == XIVE_ALLOC_NO_IND) + rc = OPAL_XIVE_PROVISIONING; + else + rc = OPAL_RESOURCE; + goto fail; + } + + /* Initialize the VP structure. We don't use a cache watch + * as we have made sure when freeing the entries to scrub + * it out of the cache. + */ + memset(vp, 0, sizeof(*vp)); + + /* Store the END base of the VP in W5 (new in p10) */ + xive_vp_set_end_base(vp, blk, ends); + } + return vp_base; + fail: + opal_xive_free_vp_block(vp_base); + + return rc; +} + +static int64_t xive_try_allocate_irq(struct xive *x) +{ + int idx, base_idx, max_count, girq; + struct xive_eas *eas; + + lock(&x->lock); + + base_idx = x->int_ipi_top - x->int_base; + max_count = x->int_hw_bot - x->int_ipi_top; + + idx = bitmap_find_zero_bit(*x->ipi_alloc_map, base_idx, max_count); + if (idx < 0) { + unlock(&x->lock); + return OPAL_RESOURCE; + } + bitmap_set_bit(*x->ipi_alloc_map, idx); + girq = x->int_base + idx; + + /* Mark the EAS valid. Don't bother with the HW cache, it's + * still masked anyway, the cache will be updated when unmasked + * and configured. + */ + eas = xive_get_eas(x, girq); + if (!eas) { + bitmap_clr_bit(*x->ipi_alloc_map, idx); + unlock(&x->lock); + return OPAL_PARAMETER; + } + eas->w = xive_set_field64(EAS_VALID, 0, 1) | + xive_set_field64(EAS_MASKED, 0, 1) | + xive_set_field64(EAS_END_DATA, 0, girq); + unlock(&x->lock); + + return girq; +} + +static int64_t opal_xive_allocate_irq(uint32_t chip_id) +{ + struct proc_chip *chip; + bool try_all = false; + int64_t rc; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + + if (chip_id == OPAL_XIVE_ANY_CHIP) { + try_all = true; + chip_id = this_cpu()->chip_id; + } + chip = get_chip(chip_id); + if (!chip) + return OPAL_PARAMETER; + + /* Try initial target chip */ + if (!chip->xive) + rc = OPAL_PARAMETER; + else + rc = xive_try_allocate_irq(chip->xive); + if (rc >= 0 || !try_all) + return rc; + + /* Failed and we try all... 
do so */ + for_each_chip(chip) { + if (!chip->xive) + continue; + rc = xive_try_allocate_irq(chip->xive); + if (rc >= 0) + break; + } + return rc; +} + +static int64_t opal_xive_free_irq(uint32_t girq) +{ + struct irq_source *is = irq_find_source(girq); + struct xive_src *s = container_of(is, struct xive_src, is); + struct xive *x = xive_from_isn(girq); + struct xive_eas *eas; + uint32_t idx; + + if (xive_mode != XIVE_MODE_EXPL) + return OPAL_WRONG_STATE; + if (!x || !is) + return OPAL_PARAMETER; + + idx = GIRQ_TO_IDX(girq); + + lock(&x->lock); + + eas = xive_get_eas(x, girq); + if (!eas) { + unlock(&x->lock); + return OPAL_PARAMETER; + } + + /* Mask the interrupt source */ + xive_update_irq_mask(s, girq - s->esb_base, true); + + /* Mark the EAS masked and invalid */ + eas->w = xive_set_field64(EAS_VALID, 0, 1) | + xive_set_field64(EAS_MASKED, 0, 1); + xive_easc_scrub(x, x->block_id, idx); + + /* Free it */ + if (!bitmap_tst_bit(*x->ipi_alloc_map, idx)) { + unlock(&x->lock); + return OPAL_PARAMETER; + } + bitmap_clr_bit(*x->ipi_alloc_map, idx); + bitmap_clr_bit(*x->int_enabled_map, idx); + unlock(&x->lock); + + return OPAL_SUCCESS; +} + +static int64_t opal_xive_dump_tm(uint32_t offset, const char *n, uint32_t pir) +{ + struct cpu_thread *c = find_cpu_by_pir(pir); + struct xive_cpu_state *xs; + struct xive *x; + void *cpu_tm_base; + uint64_t v0,v1; + + if (!c) + return OPAL_PARAMETER; + xs = c->xstate; + if (!xs || !xs->tm_ring1) + return OPAL_INTERNAL_ERROR; + x = xs->xive; + cpu_tm_base = xive_cpu_get_tima(c); + + lock(&x->lock); + v0 = in_be64(cpu_tm_base + offset); + if (offset == TM_QW3_HV_PHYS) { + v1 = in_8(cpu_tm_base + offset + 8); + v1 <<= 56; + } else { + v1 = in_be32(cpu_tm_base + offset + 8); + v1 <<= 32; + } + prlog(PR_INFO, "CPU[%04x]: TM state for QW %s\n", pir, n); + prlog(PR_INFO, "CPU[%04x]: NSR CPPR IPB LSMFB ACK# INC AGE PIPR" + " W2 W3\n", pir); + prlog(PR_INFO, "CPU[%04x]: %02x %02x %02x %02x %02x " + "%02x %02x %02x %08x %08x\n", pir, + (uint8_t)(v0 >> 58) & 0xff, (uint8_t)(v0 >> 48) & 0xff, + (uint8_t)(v0 >> 40) & 0xff, (uint8_t)(v0 >> 32) & 0xff, + (uint8_t)(v0 >> 24) & 0xff, (uint8_t)(v0 >> 16) & 0xff, + (uint8_t)(v0 >> 8) & 0xff, (uint8_t)(v0 ) & 0xff, + (uint32_t)(v1 >> 32) & 0xffffffff, + (uint32_t)(v1 & 0xffffffff)); + unlock(&x->lock); + + return OPAL_SUCCESS; +} + +static int64_t opal_xive_dump_vp(uint32_t vp_id) +{ + uint32_t blk, idx; + uint8_t order; + bool group; + struct xive *x; + struct xive_nvp *vp; + uint32_t *vpw; + + if (!xive_decode_vp(vp_id, &blk, &idx, &order, &group)) + return OPAL_PARAMETER; + + x = xive_from_vc_blk(blk); + if (!x) + return OPAL_PARAMETER; + vp = xive_get_vp(x, idx); + if (!vp) + return OPAL_PARAMETER; + lock(&x->lock); + + xive_nxc_scrub_clean(x, blk, idx); + + vpw = ((uint32_t *)vp) + (group ? 
8 : 0); + prlog(PR_INFO, "VP[%08x]: 0..3: %08x %08x %08x %08x\n", vp_id, + vpw[0], vpw[1], vpw[2], vpw[3]); + prlog(PR_INFO, "VP[%08x]: 4..7: %08x %08x %08x %08x\n", vp_id, + vpw[4], vpw[5], vpw[6], vpw[7]); + unlock(&x->lock); + + return OPAL_SUCCESS; +} + +static int64_t opal_xive_sync_irq_src(uint32_t girq) +{ + struct xive *x = xive_from_isn(girq); + + if (!x) + return OPAL_PARAMETER; + return xive_sync(x); +} + +static int64_t opal_xive_sync_irq_target(uint32_t girq) +{ + uint32_t target, vp_blk; + struct xive *x; + + if (!xive_get_irq_targetting(girq, &target, NULL, NULL)) + return OPAL_PARAMETER; + if (!xive_decode_vp(target, &vp_blk, NULL, NULL, NULL)) + return OPAL_PARAMETER; + x = xive_from_pc_blk(vp_blk); + if (!x) + return OPAL_PARAMETER; + return xive_sync(x); +} + +static int64_t opal_xive_sync(uint32_t type, uint32_t id) +{ + int64_t rc = OPAL_SUCCESS; + + if (type & XIVE_SYNC_EAS) + rc = opal_xive_sync_irq_src(id); + if (rc) + return rc; + if (type & XIVE_SYNC_QUEUE) + rc = opal_xive_sync_irq_target(id); + if (rc) + return rc; + + /* Add more ... */ + + return rc; +} + +static int64_t opal_xive_dump(uint32_t type, uint32_t id) +{ + switch (type) { + case XIVE_DUMP_TM_HYP: + return opal_xive_dump_tm(TM_QW3_HV_PHYS, "PHYS", id); + case XIVE_DUMP_TM_POOL: + return opal_xive_dump_tm(TM_QW2_HV_POOL, "POOL", id); + case XIVE_DUMP_TM_OS: + return opal_xive_dump_tm(TM_QW1_OS, "OS ", id); + case XIVE_DUMP_TM_USER: + return opal_xive_dump_tm(TM_QW0_USER, "USER", id); + case XIVE_DUMP_VP: + return opal_xive_dump_vp(id); + default: + return OPAL_PARAMETER; + } +} + +static void xive_init_globals(void) +{ + uint32_t i; + + for (i = 0; i < XIVE_MAX_CHIPS; i++) + xive_block_to_chip[i] = XIVE_INVALID_CHIP; +} + +void xive2_init(void) +{ + struct dt_node *np; + struct proc_chip *chip; + struct cpu_thread *cpu; + bool first = true; + + /* Look for xive nodes and do basic inits */ + dt_for_each_compatible(dt_root, np, "ibm,power10-xive-x") { + struct xive *x; + + /* Initialize some global stuff */ + if (first) + xive_init_globals(); + + /* Create/initialize the xive instance */ + x = init_one_xive(np); + if (first) + one_xive = x; + first = false; + } + if (first) + return; + + /* + * P8 emulation is not supported on P10 anymore. Exploitation + * is the default XIVE mode. We might introduce a GEN2 mode.
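+ * opal_xive_reset() warns about and ignores any other requested + * version, so exploitation mode is set unconditionally below.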
+ */ + xive_mode = XIVE_MODE_EXPL; + + /* Init VP allocator */ + xive_init_vp_allocator(); + + /* Create a device-tree node for Linux use */ + xive_create_mmio_dt_node(one_xive); + + /* Some inits must be done after all xive have been created + * such as setting up the forwarding ports + */ + for_each_chip(chip) { + if (chip->xive) + late_init_one_xive(chip->xive); + } + + /* Initialize per-cpu structures */ + for_each_present_cpu(cpu) { + xive_init_cpu(cpu); + } + + /* Calling boot CPU */ + xive2_cpu_callin(this_cpu()); + + /* Register XIVE exploitation calls */ + opal_register(OPAL_XIVE_RESET, opal_xive_reset, 1); + opal_register(OPAL_XIVE_GET_IRQ_INFO, opal_xive_get_irq_info, 6); + opal_register(OPAL_XIVE_GET_IRQ_CONFIG, opal_xive_get_irq_config, 4); + opal_register(OPAL_XIVE_SET_IRQ_CONFIG, opal_xive_set_irq_config, 4); + opal_register(OPAL_XIVE_GET_QUEUE_INFO, opal_xive_get_queue_info, 7); + opal_register(OPAL_XIVE_SET_QUEUE_INFO, opal_xive_set_queue_info, 5); + opal_register(OPAL_XIVE_DONATE_PAGE, opal_xive_donate_page, 2); + opal_register(OPAL_XIVE_ALLOCATE_IRQ, opal_xive_allocate_irq, 1); + opal_register(OPAL_XIVE_FREE_IRQ, opal_xive_free_irq, 1); + opal_register(OPAL_XIVE_ALLOCATE_VP_BLOCK, opal_xive_alloc_vp_block, 1); + opal_register(OPAL_XIVE_FREE_VP_BLOCK, opal_xive_free_vp_block, 1); + opal_register(OPAL_XIVE_GET_VP_INFO, opal_xive_get_vp_info, 5); + opal_register(OPAL_XIVE_SET_VP_INFO, opal_xive_set_vp_info, 3); + opal_register(OPAL_XIVE_SYNC, opal_xive_sync, 2); + opal_register(OPAL_XIVE_DUMP, opal_xive_dump, 2); + opal_register(OPAL_XIVE_GET_QUEUE_STATE, opal_xive_get_queue_state, 4); + opal_register(OPAL_XIVE_SET_QUEUE_STATE, opal_xive_set_queue_state, 4); + opal_register(OPAL_XIVE_GET_VP_STATE, opal_xive_get_vp_state, 2); +} diff --git a/include/xive.h b/include/xive.h index 477d3801d..dc1b25d03 100644 --- a/include/xive.h +++ b/include/xive.h @@ -63,4 +63,33 @@ void xive_source_mask(struct irq_source *is, uint32_t isn); void xive_cpu_reset(void); void xive_late_init(void); +/* + * POWER10 + */ + +/* + * StoreEOI requires the OS to enforce load-after-store ordering and + * the PHB5 should be configured in Address-based trigger mode with PQ + * state bit offloading. + */ +#define XIVE2_STORE_EOI_ENABLED 1 + +void xive2_init(void); +int64_t xive2_reset(void); + +uint32_t xive2_alloc_hw_irqs(uint32_t chip_id, uint32_t count, uint32_t align); +uint32_t xive2_alloc_ipi_irqs(uint32_t chip_id, uint32_t count, uint32_t align); +uint64_t xive2_get_notify_port(uint32_t chip_id, uint32_t ent); +__attrconst uint32_t xive2_get_notify_base(uint32_t girq); +void xive2_register_hw_source(uint32_t base, uint32_t count, uint32_t shift, + void *mmio, uint32_t flags, void *data, + const struct irq_source_ops *ops); +void xive2_register_ipi_source(uint32_t base, uint32_t count, void *data, + const struct irq_source_ops *ops); +void xive2_cpu_callin(struct cpu_thread *cpu); +void *xive2_get_trigger_port(uint32_t girq); + +void xive2_cpu_reset(void); +void xive2_late_init(void); + #endif /* XIVE_H */ diff --git a/include/xive2-regs.h b/include/xive2-regs.h new file mode 100644 index 000000000..6697f036e --- /dev/null +++ b/include/xive2-regs.h @@ -0,0 +1,549 @@ +// SPDX-License-Identifier: Apache-2.0 +/* + * XIVE2: eXternal Interrupt Virtualization Engine. POWER10 interrupt + * controller + * + * Copyright (c) 2019, IBM Corporation. 
+ */ + +#ifndef XIVE2_REGS_H +#define XIVE2_REGS_H + +#include + +/* + * CQ Common Queue (PowerBus bridge) Registers + */ + +/* XIVE Capabilities */ +#define X_CQ_XIVE_CAP 0x02 +#define CQ_XIVE_CAP 0x010 +#define CQ_XIVE_CAP_VERSION PPC_BITMASK(0,3) +/* 4:6 reserved */ +#define CQ_XIVE_CAP_USER_INT_PRIO PPC_BITMASK(8,9) +#define CQ_XIVE_CAP_USER_INT_PRIO_1 0 +#define CQ_XIVE_CAP_USER_INT_PRIO_1_2 1 +#define CQ_XIVE_CAP_USER_INT_PRIO_1_4 2 +#define CQ_XIVE_CAP_USER_INT_PRIO_1_8 3 +#define CQ_XIVE_CAP_VP_INT_PRIO PPC_BITMASK(10,11) +#define CQ_XIVE_CAP_VP_INT_PRIO_1_8 0 +#define CQ_XIVE_CAP_VP_INT_PRIO_2_8 1 +#define CQ_XIVE_CAP_VP_INT_PRIO_4_8 2 +#define CQ_XIVE_CAP_VP_INT_PRIO_8 3 +#define CQ_XIVE_CAP_BLOCK_ID_WIDTH PPC_BITMASK(12,13) + +/* XIVE Configuration */ +#define X_CQ_XIVE_CFG 0x03 +#define CQ_XIVE_CFG 0x018 + +/* 0:7 reserved */ +#define CQ_XIVE_CFG_USER_INT_PRIO PPC_BITMASK(8,9) +#define CQ_XIVE_CFG_VP_INT_PRIO PPC_BITMASK(10,11) +#define CQ_XIVE_CFG_INT_PRIO_1 0 +#define CQ_XIVE_CFG_INT_PRIO_2 1 +#define CQ_XIVE_CFG_INT_PRIO_4 2 +#define CQ_XIVE_CFG_INT_PRIO_8 3 +#define CQ_XIVE_CFG_BLOCK_ID_WIDTH PPC_BITMASK(12,13) +#define CQ_XIVE_CFG_BLOCK_ID_4BITS 0 +#define CQ_XIVE_CFG_BLOCK_ID_5BITS 1 +#define CQ_XIVE_CFG_BLOCK_ID_6BITS 2 +#define CQ_XIVE_CFG_BLOCK_ID_7BITS 3 +#define CQ_XIVE_CFG_HYP_HARD_RANGE PPC_BITMASK(14,15) +#define CQ_XIVE_CFG_THREADID_7BITS 0 +#define CQ_XIVE_CFG_THREADID_8BITS 1 +#define CQ_XIVE_CFG_THREADID_9BITS 2 +#define CQ_XIVE_CFG_THREADID_10BITs 3 +#define CQ_XIVE_CFG_HYP_HARD_BLKID_OVERRIDE PPC_BIT(16) +#define CQ_XIVE_CFG_HYP_HARD_BLOCK_ID PPC_BITMASK(17,23) + +#define CQ_XIVE_CFG_GEN1_TIMA_OS PPC_BIT(24) +#define CQ_XIVE_CFG_GEN1_TIMA_HYP PPC_BIT(25) +#define CQ_XIVE_CFG_GEN1_TIMA_HYP_BLK0 PPC_BIT(26) /* 0 if bit[25]=0 */ +#define CQ_XIVE_CFG_GEN1_TIMA_CROWD_DIS PPC_BIT(27) /* 0 if bit[25]=0 */ +#define CQ_XIVE_CFG_GEN1_END_ESX PPC_BIT(28) /* END ESx stores + are dropped */ + +/* Interrupt Controller Base Address Register - 512 pages (32M) */ +#define X_CQ_IC_BAR 0x08 +#define CQ_IC_BAR 0x040 +#define CQ_IC_BAR_VALID PPC_BIT(0) +#define CQ_IC_BAR_64K PPC_BIT(1) +/* 2:7 reserved */ +#define CQ_IC_BAR_ADDR PPC_BITMASK(8,42) +/* 43:63 reserved */ + +/* Thread Management Base Address Register - 4 pages */ +#define X_CQ_TM_BAR 0x09 +#define CQ_TM_BAR 0x048 +#define CQ_TM_BAR_VALID PPC_BIT(0) +#define CQ_TM_BAR_64K PPC_BIT(1) +#define CQ_TM_BAR_ADDR PPC_BITMASK(8,49) + +/* ESB Base Address Register */ +#define X_CQ_ESB_BAR 0x0A +#define CQ_ESB_BAR 0x050 +#define CQ_BAR_VALID PPC_BIT(0) +#define CQ_BAR_64K PPC_BIT(1) +/* 2:7 reserved */ +#define CQ_BAR_ADDR PPC_BITMASK(8,39) +#define CQ_BAR_SET_DIV PPC_BITMASK(56,58) +#define CQ_BAR_RANGE PPC_BITMASK(59,63) + /* 0 (16M) - 16 (16T) */ + +/* END Base Address Register */ +#define X_CQ_END_BAR 0x0B +#define CQ_END_BAR 0x058 + +/* NVPG Base Address Register */ +#define X_CQ_NVPG_BAR 0x0C +#define CQ_NVPG_BAR 0x060 + +/* NVC Base Address Register */ +#define X_CQ_NVC_BAR 0x0D +#define CQ_NVC_BAR 0x068 + +/* Table Address Register */ +#define X_CQ_TAR 0x0E +#define CQ_TAR 0x070 +#define CQ_TAR_AUTOINC PPC_BIT(0) +#define CQ_TAR_SELECT PPC_BITMASK(12,15) +#define CQ_TAR_ESB 0 /* 0 - 15 */ +#define CQ_TAR_END 2 /* 0 - 15 */ +#define CQ_TAR_NVPG 3 /* 0 - 15 */ +#define CQ_TAR_NVC 5 /* 0 - 15 */ +#define CQ_TAR_ENTRY_SELECT PPC_BITMASK(28,31) + +/* Table Data Register */ +#define X_CQ_TDR 0x0F +#define CQ_TDR 0x078 +/* for the NVPG, NVC, ESB, END Set Translation Tables */ +#define CQ_TDR_VALID PPC_BIT(0) +#define 
CQ_TDR_BLOCK_ID PPC_BITMASK(60,63) + +/* + * Processor Cores Enabled for MsgSnd + * Identifies which of the 32 possible core chiplets are enabled and + * available to receive the MsgSnd command + */ +#define X_CQ_MSGSND 0x10 +#define CQ_MSGSND 0x080 + +/* Interrupt Unit Reset Control */ +#define X_CQ_RST_CTL 0x12 +#define CQ_RST_CTL 0x090 +#define CQ_RST_SYNC_RESET PPC_BIT(0) /* Write Only */ +#define CQ_RST_QUIESCE_PB PPC_BIT(1) /* RW */ +#define CQ_RST_MASTER_IDLE PPC_BIT(2) /* Read Only */ +#define CQ_RST_SAVE_IDLE PPC_BIT(3) /* Read Only */ +#define CQ_RST_PB_BAR_RESET PPC_BIT(4) /* Write Only */ + +/* PowerBus General Configuration */ +#define X_CQ_CFG_PB_GEN 0x14 +#define CQ_CFG_PB_GEN 0x0A0 + +/* FIR + * (And-Mask) + * (Or-Mask) + */ +#define X_CQ_FIR 0x30 +#define X_CQ_FIR_AND 0x31 +#define X_CQ_FIR_OR 0x32 +#define CQ_FIR 0x180 +#define CQ_FIR_AND 0x188 +#define CQ_FIR_OR 0x190 +#define CQ_FIR_PB_RCMDX_CI_ERR1 PPC_BIT(19) +#define CQ_FIR_VC_INFO_ERROR_0_2 PPC_BITMASK(61,63) + +/* FIR Mask + * (And-Mask) + * (Or-Mask) + */ +#define X_CQ_FIRMASK 0x33 +#define X_CQ_FIRMASK_AND 0x34 +#define X_CQ_FIRMASK_OR 0x35 +#define CQ_FIRMASK 0x198 +#define CQ_FIRMASK_AND 0x1A0 +#define CQ_FIRMASK_OR 0x1A8 + +/* + * VC0 + */ + +/* VSD table address */ +#define X_VC_VSD_TABLE_ADDR 0x100 +#define VC_VSD_TABLE_ADDR 0x000 +#define VC_VSD_TABLE_AUTOINC PPC_BIT(0) +#define VC_VSD_TABLE_SELECT PPC_BITMASK(12,15) +#define VC_VSD_TABLE_ADDRESS PPC_BITMASK(28,31) + +/* VSD table data */ +#define X_VC_VSD_TABLE_DATA 0x101 +#define VC_VSD_TABLE_DATA 0x008 + +/* AIB AT macro indirect kill */ +#define X_VC_AT_MACRO_KILL 0x102 +#define VC_AT_MACRO_KILL 0x010 +#define VC_AT_MACRO_KILL_VALID PPC_BIT(0) +#define VC_AT_MACRO_KILL_VSD PPC_BITMASK(12,15) +#define VC_AT_MACRO_KILL_BLOCK_ID PPC_BITMASK(28,31) +#define VC_AT_MACRO_KILL_OFFSET PPC_BITMASK(48,60) + +/* AIB AT macro indirect kill mask (same bit definitions) */ +#define X_VC_AT_MACRO_KILL_MASK 0x103 +#define VC_AT_MACRO_KILL_MASK 0x018 + +/* Remote IRQs and ERQs configuration [n] (n = 0:6) */ +#define X_VC_QUEUES_CFG_REM0 0x117 + +#define VC_QUEUES_CFG_REM0 0x0B8 +#define VC_QUEUES_CFG_MEMB_EN PPC_BIT(38) +#define VC_QUEUES_CFG_MEMB_SZ PPC_BITMASK(42,47) + +/* + * VC1 + */ + +/* ESBC cache flush control trigger */ +#define X_VC_ESBC_FLUSH_CTRL 0x140 +#define VC_ESBC_FLUSH_CTRL 0x200 +#define VC_ESBC_FLUSH_CTRL_POLL_VALID PPC_BIT(0) +#define VC_ESBC_FLUSH_CTRL_WANT_CACHE_DISABLE PPC_BIT(2) + +/* ESBC cache flush poll trigger */ +#define X_VC_ESBC_FLUSH_POLL 0x141 +#define VC_ESBC_FLUSH_POLL 0x208 +#define VC_ESBC_FLUSH_POLL_BLOCK_ID PPC_BITMASK(0,3) +#define VC_ESBC_FLUSH_POLL_OFFSET PPC_BITMASK(4,31) /* 28-bit */ +#define VC_ESBC_FLUSH_POLL_BLOCK_ID_MASK PPC_BITMASK(32,35) +#define VC_ESBC_FLUSH_POLL_OFFSET_MASK PPC_BITMASK(36,63) /* 28-bit */ + +/* EASC flush control register */ +#define X_VC_EASC_FLUSH_CTRL 0x160 +#define VC_EASC_FLUSH_CTRL 0x300 +#define VC_EASC_FLUSH_CTRL_POLL_VALID PPC_BIT(0) +#define VC_EASC_FLUSH_CTRL_WANT_CACHE_DISABLE PPC_BIT(2) + +/* EASC flush poll register */ +#define X_VC_EASC_FLUSH_POLL 0x161 +#define VC_EASC_FLUSH_POLL 0x308 +#define VC_EASC_FLUSH_POLL_BLOCK_ID PPC_BITMASK(0,3) +#define VC_EASC_FLUSH_POLL_OFFSET PPC_BITMASK(4,31) /* 28-bit */ +#define VC_EASC_FLUSH_POLL_BLOCK_ID_MASK PPC_BITMASK(32,35) +#define VC_EASC_FLUSH_POLL_OFFSET_MASK PPC_BITMASK(36,63) /* 28-bit */ + +/* + * VC2 + */ + +/* ENDC flush control register */ +#define X_VC_ENDC_FLUSH_CTRL 0x180 +#define VC_ENDC_FLUSH_CTRL 0x400 +#define 
VC_ENDC_FLUSH_CTRL_POLL_VALID PPC_BIT(0) +#define VC_ENDC_FLUSH_CTRL_WANT_CACHE_DISABLE PPC_BIT(2) +#define VC_ENDC_FLUSH_CTRL_WANT_INVALIDATE PPC_BIT(3) +#define VC_ENDC_FLUSH_CTRL_INJECT_INVALIDATE PPC_BIT(7) + +/* ENDC flush poll register */ +#define X_VC_ENDC_FLUSH_POLL 0x181 +#define VC_ENDC_FLUSH_POLL 0x408 +#define VC_ENDC_FLUSH_POLL_BLOCK_ID PPC_BITMASK(4,7) +#define VC_ENDC_FLUSH_POLL_OFFSET PPC_BITMASK(8,31) /* 24-bit */ +#define VC_ENDC_FLUSH_POLL_BLOCK_ID_MASK PPC_BITMASK(36,39) +#define VC_ENDC_FLUSH_POLL_OFFSET_MASK PPC_BITMASK(40,63) /* 24-bit */ + +/* ENDC Sync done */ +#define X_VC_ENDC_SYNC_DONE 0x184 +#define VC_ENDC_SYNC_DONE 0x420 +#define VC_ENDC_SYNC_POLL_DONE PPC_BITMASK(0,6) +#define VC_ENDC_SYNC_QUEUE_IPI PPC_BIT(0) +#define VC_ENDC_SYNC_QUEUE_HWD PPC_BIT(1) +#define VC_ENDC_SYNC_QUEUE_NXC PPC_BIT(2) +#define VC_ENDC_SYNC_QUEUE_INT PPC_BIT(3) +#define VC_ENDC_SYNC_QUEUE_OS PPC_BIT(4) +#define VC_ENDC_SYNC_QUEUE_POOL PPC_BIT(5) +#define VC_ENDC_SYNC_QUEUE_HARD PPC_BIT(6) +#define VC_QUEUE_COUNT 7 + +/* ENDC cache watch specification 0 */ +#define X_VC_ENDC_WATCH0_SPEC 0x1A0 +#define VC_ENDC_WATCH0_SPEC 0x500 +#define VC_ENDC_WATCH_CONFLICT PPC_BIT(0) +#define VC_ENDC_WATCH_FULL PPC_BIT(8) +#define VC_ENDC_WATCH_BLOCK_ID PPC_BITMASK(28, 31) +#define VC_ENDC_WATCH_INDEX PPC_BITMASK(40, 63) + +/* ENDC cache watch data 0 */ +#define X_VC_ENDC_WATCH0_DATA0 0x1A4 + +#define VC_ENDC_WATCH0_DATA0 0x520 + +/* + * PC LSB1 + */ + +/* VSD table address register */ +#define X_PC_VSD_TABLE_ADDR 0x200 +#define PC_VSD_TABLE_ADDR 0x000 +#define PC_VSD_TABLE_AUTOINC PPC_BIT(0) +#define PC_VSD_TABLE_SELECT PPC_BITMASK(12,15) +#define PC_VSD_TABLE_ADDRESS PPC_BITMASK(28,31) + +/* VSD table data register */ +#define X_PC_VSD_TABLE_DATA 0x201 +#define PC_VSD_TABLE_DATA 0x008 + +/* AT indirect kill register */ +#define X_PC_AT_KILL 0x202 +#define PC_AT_KILL 0x010 +#define PC_AT_KILL_VALID PPC_BIT(0) +#define PC_AT_KILL_VSD_TYPE PPC_BITMASK(24,27) +/* Only NVP, NVG, NVC */ +#define PC_AT_KILL_BLOCK_ID PPC_BITMASK(28,31) +#define PC_AT_KILL_OFFSET PPC_BITMASK(48,60) + +/* AT indirect kill mask register */ +#define X_PC_AT_KILL_MASK 0x203 +#define PC_AT_KILL_MASK 0x018 +#define PC_AT_KILL_MASK_VSD_TYPE PPC_BITMASK(24,27) +#define PC_AT_KILL_MASK_BLOCK_ID PPC_BITMASK(28,31) +#define PC_AT_KILL_MASK_OFFSET PPC_BITMASK(48,60) + +/* Error1 configuration register 0 */ +#define X_PC_ERR1_CFG0 0x2C8 +#define PC_ERR1_CFG0 0x640 + +/* Error1 configuration register 1 */ +#define X_PC_ERR1_CFG1 0x2C9 +#define PC_ERR1_CFG1 0x648 +#define PC_ERR1_CFG1_INTERRUPT_INVALID_PRIO PPC_BIT(3) +/* + * PC LSB2 + */ + +/* NxC Cache flush control */ +#define X_PC_NXC_FLUSH_CTRL 0x280 +#define PC_NXC_FLUSH_CTRL 0x400 +#define PC_NXC_FLUSH_CTRL_POLL_VALID PPC_BIT(0) +#define PC_NXC_FLUSH_CTRL_WANT_CACHE_DISABLE PPC_BIT(2) +#define PC_NXC_FLUSH_CTRL_WANT_INVALIDATE PPC_BIT(3) +#define PC_NXC_FLUSH_CTRL_INJECT_INVALIDATE PPC_BIT(7) + +/* NxC Cache flush poll */ +#define X_PC_NXC_FLUSH_POLL 0x281 +#define PC_NXC_FLUSH_POLL 0x408 +#define PC_NXC_FLUSH_POLL_NXC_TYPE PPC_BITMASK(2,3) +#define PC_NXC_FLUSH_POLL_NXC_TYPE_NVP 0 +#define PC_NXC_FLUSH_POLL_NXC_TYPE_NVG 2 +#define PC_NXC_FLUSH_POLL_NXC_TYPE_NVC 3 +#define PC_NXC_FLUSH_POLL_BLOCK_ID PPC_BITMASK(4,7) +#define PC_NXC_FLUSH_POLL_OFFSET PPC_BITMASK(8,31) /* 24-bit */ +#define PC_NXC_FLUSH_POLL_NXC_TYPE_MASK PPC_BITMASK(34,35) /* 0: Ignore */ +#define PC_NXC_FLUSH_POLL_BLOCK_ID_MASK PPC_BITMASK(36,39) +#define PC_NXC_FLUSH_POLL_OFFSET_MASK PPC_BITMASK(40,63) /* 24-bit 
*/ + +/* NxC Cache Watch 0 Specification */ +#define X_PC_NXC_WATCH0_SPEC 0x2A0 +#define PC_NXC_WATCH0_SPEC 0x500 +#define PC_NXC_WATCH_CONFLICT PPC_BIT(0) +#define PC_NXC_WATCH_FULL PPC_BIT(8) +#define PC_NXC_WATCH_NXC_TYPE PPC_BITMASK(26, 27) +#define PC_NXC_WATCH_NXC_NVP 0 +#define PC_NXC_WATCH_NXC_NVG 2 +#define PC_NXC_WATCH_NXC_NVC 3 +#define PC_NXC_WATCH_BLOCK_ID PPC_BITMASK(28, 31) +#define PC_NXC_WATCH_INDEX PPC_BITMASK(40, 63) + +/* NxC Cache Watch 0 Data */ +#define X_PC_NXC_WATCH0_DATA0 0x2A4 + +#define PC_NXC_WATCH0_DATA0 0x520 + +/* + * TCTXT Registers + */ + +/* Physical Thread Enable0 register */ +#define X_TCTXT_EN0 0x300 +#define TCTXT_EN0 0x000 + +/* Physical Thread Enable0 Set register */ +#define X_TCTXT_EN0_SET 0x302 +#define TCTXT_EN0_SET 0x010 + +/* Physical Thread Enable0 Reset register */ +#define X_TCTXT_EN0_RESET 0x303 +#define TCTXT_EN0_RESET 0x018 + +/* Physical Thread Enable1 register */ +#define X_TCTXT_EN1 0x304 +#define TCTXT_EN1 0x020 + +/* Physical Thread Enable1 Set register */ +#define X_TCTXT_EN1_SET 0x306 +#define TCTXT_EN1_SET 0x030 + +/* Physical Thread Enable1 Reset register */ +#define X_TCTXT_EN1_RESET 0x307 +#define TCTXT_EN1_RESET 0x038 + +/* + * VSD Tables + */ +#define VST_ESB 0 +#define VST_EAS 1 /* Not used by PC */ +#define VST_END 2 +#define VST_NVP 3 +#define VST_NVG 4 +#define VST_NVC 5 +#define VST_IC 6 /* Not used by PC */ +#define VST_SYNC 7 +#define VST_ERQ 8 /* Not used by PC */ + +/* Bits in a VSD entry. + * + * Note: the address is naturally aligned, we don't use a PPC_BITMASK, + * but just a mask to apply to the address before OR'ing it in. + * + * Note: VSD_FIRMWARE is a SW bit ! It hijacks an unused bit in the + * VSD and is only meant to be used in indirect mode ! + */ +#define VSD_MODE PPC_BITMASK(0,1) +#define VSD_MODE_SHARED 1 +#define VSD_MODE_EXCLUSIVE 2 +#define VSD_MODE_FORWARD 3 +#define VSD_FIRMWARE PPC_BIT(2) /* Read warning */ +#define VSD_FIRMWARE2 PPC_BIT(3) /* unused */ +#define VSD_RESERVED PPC_BITMASK(4,7) /* P10 reserved */ +#define VSD_ADDRESS_MASK 0x00fffffffffff000ull +#define VSD_MIGRATION_REG PPC_BITMASK(52,55) +#define VSD_INDIRECT PPC_BIT(56) +#define VSD_TSIZE PPC_BITMASK(59,63) + +/* EAS + * + * One per interrupt source. Targets that interrupt to a given END + * and provides the corresponding logical interrupt number (END data) + * + * We also map this structure to the escalation descriptor inside + * an END, though in that case the valid and masked bits are not used.
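+ * The escalation variant reuses the same block/index/data layout: + * see the END_W4_ESC_* and END_W5_ESC_END_DATA fields below.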
+ */ +struct xive_eas { + beint64_t w; +#define EAS_VALID PPC_BIT(0) +#define EAS_END_BLOCK PPC_BITMASK(4,7) /* Destination END block# */ +#define EAS_END_INDEX PPC_BITMASK(8,31) /* Destination END index */ +#define EAS_MASKED PPC_BIT(32) /* Masked */ +#define EAS_END_DATA PPC_BITMASK(33,63) /* Data written to the EQ */ +}; + +/* EQ */ +struct xive_end { + beint32_t w0; +#define END_W0_VALID PPC_BIT32(0) /* "v" bit */ +#define END_W0_ENQUEUE PPC_BIT32(5) /* "q" bit */ +#define END_W0_UCOND_NOTIFY PPC_BIT32(6) /* "n" bit */ +#define END_W0_SILENT_ESCALATE PPC_BIT32(7) /* "s" bit */ +#define END_W0_BACKLOG PPC_BIT32(8) /* "b" bit */ +#define END_W0_UNCOND_ESCALATE PPC_BIT32(10) /* "u" bit */ +#define END_W0_ESCALATE_CTL PPC_BIT32(11) /* "e" bit */ +#define END_W0_ESCALATE_END PPC_BIT32(13) /* "N" bit */ +#define END_W0_FIRMWARE1 PPC_BIT32(16) /* Owned by FW */ +#define END_W0_FIRMWARE2 PPC_BIT32(17) /* Owned by FW */ + beint32_t w1; +#define END_W1_ES PPC_BITMASK32(0,3) +#define END_W1_ESn PPC_BITMASK32(0,1) +#define END_W1_ESn_P PPC_BIT32(0) +#define END_W1_ESn_Q PPC_BIT32(1) +#define END_W1_ESe PPC_BITMASK32(2,3) +#define END_W1_ESe_P PPC_BIT32(2) +#define END_W1_ESe_Q PPC_BIT32(3) +#define END_W1_GEN_FLIPPED PPC_BIT32(8) +#define END_W1_GENERATION PPC_BIT32(9) +#define END_W1_PAGE_OFF PPC_BITMASK32(10,31) + beint32_t w2; +#define END_W2_RESERVED PPC_BITMASK32(4,7) +#define END_W2_EQ_ADDR_HI PPC_BITMASK32(8,31) + beint32_t w3; +#define END_W3_EQ_ADDR_LO PPC_BITMASK32(0,24) +#define END_W3_QSIZE PPC_BITMASK32(28,31) + beint32_t w4; +#define END_W4_END_BLOCK PPC_BITMASK32(4,7) /* N:1 */ +#define END_W4_ESC_END_INDEX PPC_BITMASK32(8,31) /* N:1 */ +#define END_W4_ESB_BLOCK PPC_BITMASK32(0,3) /* N:0 */ +#define END_W4_ESC_ESB_INDEX PPC_BITMASK32(4,31) /* N:0 */ + beint32_t w5; +#define END_W5_ESC_END_DATA PPC_BITMASK32(1,31) + beint32_t w6; +#define END_W6_FORMAT_BIT PPC_BIT32(0) +#define END_W6_VP_BLOCK PPC_BITMASK32(4,7) +#define END_W6_VP_OFFSET PPC_BITMASK32(8,31) +#define END_W6_VP_OFFSET_GEN1 PPC_BITMASK32(13,31) + beint32_t w7; +#define END_W7_TOPO PPC_BITMASK32(0,3) /* Owned by HW */ +#define END_W7_F0_PRIORITY PPC_BITMASK32(8,15) +#define END_W7_F1_LOG_SERVER_ID PPC_BITMASK32(4,31) +}; +#define xive_end_is_firmware1(end) \ + xive_get_field32(END_W0_FIRMWARE1, (end)->w0) + +/* Notification Virtual Processor (NVP) */ +struct xive_nvp { + beint32_t w0; +#define NVP_W0_VALID PPC_BIT32(0) +#define NVP_W0_ESC_END PPC_BIT32(25) /* 'N' bit 0:ESB 1:END */ + beint32_t w1; + beint32_t w2; +#define NVP_W2_CPPR PPC_BITMASK32(0, 7) +#define NVP_W2_IPB PPC_BITMASK32(8, 15) +#define NVP_W2_LSMFB PPC_BITMASK32(16, 23) + beint32_t w3; + beint32_t w4; +#define NVP_W4_ESC_ESB_BLOCK PPC_BITMASK32(0, 3) /* N:0 */ +#define NVP_W4_ESC_ESB_INDEX PPC_BITMASK32(4, 31) /* N:0 */ +#define NVP_W4_ESC_END_BLOCK PPC_BITMASK32(4, 7) /* N:1 */ +#define NVP_W4_ESC_END_INDEX PPC_BITMASK32(8, 31) /* N:1 */ + beint32_t w5; +#define NVP_W5_PSIZE PPC_BITMASK32(0, 1) +#define NVP_W5_VP_END_BLOCK PPC_BITMASK32(4, 7) +#define NVP_W5_VP_END_INDEX PPC_BITMASK32(8, 31) + beint32_t w6; + beint32_t w7; +}; + +/* Notification Virtual Group or Crowd (NVG/NVC) */ +struct xive_nvgc { + beint32_t w0; +#define NVGC_W0_VALID PPC_BIT32(0) + beint32_t w1; + beint32_t w2; + beint32_t w3; + beint32_t w4; + beint32_t w5; + beint32_t w6; + beint32_t w7; +}; + +/* + * Thread Interrupt Management Area + * + * In Gen1 mode (P9 compat mode) word 2 is the same. 
However in Gen2 + mode (P10), the CAM line is slightly different as the VP space was + increased. + */ +#define TM10_QW0W2_VU PPC_BIT32(0) +#define TM10_QW0W2_LOGIC_SERV PPC_BITMASK32(4, 31) +#define TM10_QW1W2_VO PPC_BIT32(0) +#define TM10_QW1W2_HO PPC_BIT32(1) +#define TM10_QW1W2_NO PPC_BIT32(2) +#define TM10_QW1W2_OS_CAM PPC_BITMASK32(4, 31) +#define TM10_QW2W2_VP PPC_BIT32(0) +#define TM10_QW2W2_HP PPC_BIT32(1) +#define TM10_QW2W2_NP PPC_BIT32(2) +#define TM10_QW2W2_POOL_CAM PPC_BITMASK32(4, 31) +#define TM10_QW3W2_VT PPC_BIT32(0) +#define TM10_QW3W2_HT PPC_BIT32(1) +#define TM10_QW3W2_NT PPC_BIT32(2) +#define TM10_QW3W2_LP PPC_BIT32(6) +#define TM10_QW3W2_LE PPC_BIT32(7) + +#endif /* XIVE2_REGS_H */ -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:11 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:11 +0530 Subject: [Skiboot] [PATCH v2 33/59] hw/phb5: Add initial support In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-34-hegdevasant@linux.vnet.ibm.com> From: Jordan Niethe The PHB5 logic on P10 is pretty close to the P9's version. So we keep our base phb4 implementation and just add the few changes within if statements. Signed-off-by: Jordan Niethe [clg: misc cleanups and fixes ] Signed-off-by: Cédric Le Goater [Fixed compilation issue - Vasant] Signed-off-by: Vasant Hegde [Nick: Unify PHB4/PHB5 drivers ] Signed-off-by: Nicholas Piggin [Mikey: set default lane eq settings for phb5] Signed-off-by: Michael Neuling [FB: squash commits + small cleanup ] Signed-off-by: Frederic Barrat Signed-off-by: Vasant Hegde --- core/hmi.c | 4 + core/init.c | 2 +- .../opal-pci-set-phb-capi-mode-93.rst | 5 +- hw/capp.c | 11 +- hw/phb4.c | 200 ++++++++++++++---- hw/phys-map.c | 48 ++--- include/opal-api.h | 3 +- include/phb4-regs.h | 10 +- include/phb4.h | 22 +- include/phys-map.h | 4 + 10 files changed, 217 insertions(+), 92 deletions(-) diff --git a/core/hmi.c b/core/hmi.c index 35b609047..9363cc5fb 100644 --- a/core/hmi.c +++ b/core/hmi.c @@ -602,6 +602,10 @@ static void find_capp_checkstop_reason(int flat_chip_id, uint64_t reg; int64_t rc; + /* CAPP exists on P8 and P9 only */ + if (proc_gen != proc_gen_p8 && proc_gen != proc_gen_p9) + return; + /* Find the CAPP on the chip associated with the HMI. */ for_each_phb(phb) { /* get the CAPP info */ diff --git a/core/init.c b/core/init.c index e38969554..a8bac28a8 100644 --- a/core/init.c +++ b/core/init.c @@ -1364,7 +1364,7 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) /* Probe PHB3 on P8 */ probe_phb3(); - /* Probe PHB4 on P9 */ + /* Probe PHB4 on P9 and PHB5 on P10 */ probe_phb4(); /* Probe NPUs */ diff --git a/doc/opal-api/opal-pci-set-phb-capi-mode-93.rst b/doc/opal-api/opal-pci-set-phb-capi-mode-93.rst index ffc4c6dc9..130e382b5 100644 --- a/doc/opal-api/opal-pci-set-phb-capi-mode-93.rst +++ b/doc/opal-api/opal-pci-set-phb-capi-mode-93.rst @@ -66,10 +66,11 @@ Notes allocate extra 16/8 dma read engines to the PHB depending on its stack (stack 0/ stack 1). This is needed to improve the Direct-GPU DMA read performance for the Mellanox CX5 card. -* Mode `OPAL_PHB_CAPI_MODE_PCIE` not yet supported on Power-9. +* Mode `OPAL_PHB_CAPI_MODE_PCIE` not supported on Power-9. * Requesting mode `OPAL_PHB_CAPI_MODE_CAPI` on Power-9 will disable fast-reboot. * Modes `OPAL_PHB_CAPI_MODE_DMA`, `OPAL_PHB_CAPI_MODE_SNOOP_OFF` are - not supported on Power-9 yet.
+ not supported on Power-9. +* CAPI is only supported on Power-8 and Power-9. Return Codes ------------ diff --git a/hw/capp.c b/hw/capp.c index dde8c52f6..a1aa1caa9 100644 --- a/hw/capp.c +++ b/hw/capp.c @@ -42,15 +42,12 @@ int preload_capp_ucode(void) uint64_t rc; int ret; + /* CAPI is supported on P8 and P9 only */ p = dt_find_compatible_node(dt_root, NULL, "ibm,power8-pbcq"); - - if (!p) { + if (!p) p = dt_find_compatible_node(dt_root, NULL, "ibm,power9-pbcq"); - if (!p) { - prlog(PR_INFO, "CAPI: WARNING: no compat thing found\n"); - return OPAL_SUCCESS; - } - } + if (!p) + return OPAL_SUCCESS; chip = get_chip(dt_get_chip_id(p)); diff --git a/hw/phb4.c b/hw/phb4.c index 31f9fa250..e074fa2a3 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -142,6 +142,16 @@ static bool pci_eeh_mmio; static bool pci_retry_all; static int rx_err_max = PHB4_RX_ERR_MAX; +static inline bool is_phb4(void) +{ + return (proc_gen == proc_gen_p9); +} + +static inline bool is_phb5(void) +{ + return (proc_gen == proc_gen_p10); +} + /* Note: The "ASB" name is historical, practically this means access via * the XSCOM backdoor */ @@ -988,7 +998,7 @@ static int64_t phb4_wait_bit(struct phb4 *p, uint32_t reg, * XXX Add timeout... */ /* XXX SIMICS is nasty... */ - if ((reg == PHB_TCE_KILL || reg == PHB_DMARD_SYNC) && + if ((reg == PHB_TCE_KILL || reg == PHB_DMA_READ_WRITE_SYNC) && chip_quirk(QUIRK_SIMICS)) return OPAL_SUCCESS; @@ -1084,7 +1094,17 @@ static int64_t phb4_tce_kill(struct phb *phb, uint32_t kill_type, } /* Start DMA sync process */ - out_be64(p->regs + PHB_DMARD_SYNC, PHB_DMARD_SYNC_START); + if (is_phb5()){ + val = in_be64(p->regs + PHB_DMA_READ_WRITE_SYNC) & + (PHB_DMA_READ_SYNC_COMPLETE | + PHB_DMA_WRITE_SYNC_COMPLETE); + out_be64(p->regs + PHB_DMA_READ_WRITE_SYNC, + val | PHB_DMA_READ_SYNC_START); + + } else { + out_be64(p->regs + PHB_DMA_READ_WRITE_SYNC, + PHB_DMA_READ_SYNC_START); + } /* Wait for kill to complete */ rc = phb4_wait_bit(p, PHB_Q_DMA_R, PHB_Q_DMA_R_TCE_KILL_STATUS, 0); @@ -1092,9 +1112,9 @@ static int64_t phb4_tce_kill(struct phb *phb, uint32_t kill_type, return rc; /* Wait for DMA sync to complete */ - return phb4_wait_bit(p, PHB_DMARD_SYNC, - PHB_DMARD_SYNC_COMPLETE, - PHB_DMARD_SYNC_COMPLETE); + return phb4_wait_bit(p, PHB_DMA_READ_WRITE_SYNC, + PHB_DMA_READ_SYNC_COMPLETE, + PHB_DMA_READ_SYNC_COMPLETE); } /* phb4_ioda_reset - Reset the IODA tables @@ -3537,7 +3557,11 @@ static void phb4_int_unmask_all(struct phb4 *p) { /* Init_126..130 - Re-enable error interrupts */ out_be64(p->regs + PHB_ERR_IRQ_ENABLE, 0xca8880cc00000000ull); - out_be64(p->regs + PHB_TXE_ERR_IRQ_ENABLE, 0x2008400e08200000ull); + + if (is_phb5()) + out_be64(p->regs + PHB_TXE_ERR_IRQ_ENABLE, 0x200850be08200020ull); + else + out_be64(p->regs + PHB_TXE_ERR_IRQ_ENABLE, 0x2008400e08200000ull); out_be64(p->regs + PHB_RXE_ARB_ERR_IRQ_ENABLE, 0xc40038fc01804070ull); out_be64(p->regs + PHB_RXE_MRG_ERR_IRQ_ENABLE, 0x00006100008000a8ull); out_be64(p->regs + PHB_RXE_TCE_ERR_IRQ_ENABLE, 0x60510050c0000000ull); @@ -4162,6 +4186,10 @@ static int64_t phb4_get_capp_info(int chip_id, struct phb *phb, struct phb4 *p = phb_to_phb4(phb); uint32_t offset; + /* Not even supposed to be here on P10, but doesn't hurt */ + if (is_phb5()) + return OPAL_UNSUPPORTED; + if (chip_id != p->chip_id) return OPAL_PARAMETER; @@ -4364,8 +4392,11 @@ static void phb4_init_capp_errors(struct phb4 *p) out_be64(p->regs + 0x0cb0, 0x35777073ff000000ull); } - /* - * The capi indicator is over the 8 most significant bits on p9 (and +/* + * The capi, NBW and ASN 
indicators are used only on P9 to flag some + * types of incoming traffic for the PHB and have been removed on P10. + * + * The capi indicator is over the 8 most significant bits (and * not 16). We stay away from bits 59 (TVE select), 60 and 61 (MSI) * * For the mask, we keep bit 59 in, as capi messages must hit TVE#0. @@ -4689,6 +4720,10 @@ static int64_t phb4_set_capi_mode(struct phb *phb, uint64_t mode, struct capp *capp = p->capp; uint64_t reg, ret; + /* No CAPI on P10. OpenCAPI only */ + if (is_phb5()) + return OPAL_UNSUPPORTED; + /* cant do a mode switch when capp is in recovery mode */ ret = capp_xscom_read(capp, CAPP_ERR_STATUS_CTRL, ®); if (ret != OPAL_SUCCESS) @@ -4954,7 +4989,7 @@ static void phb4_init_ioda3(struct phb4 *p) /* Init_19 - Interrupt Notify Base Index */ out_be64(p->regs + PHB_INT_NOTIFY_INDEX, - xive_get_notify_base(p->base_msi)); + xive2_get_notify_base(p->base_msi)); /* Init_19x - Not in spec: Initialize source ID */ PHBDBG(p, "Reset state SRC_ID: %016llx\n", @@ -4979,9 +5014,11 @@ static void phb4_init_ioda3(struct phb4 *p) /* Init_24 - CRW Base Address Reg */ /* See enable_capi_mode() */ - /* Init_25 - ASN Compare/Mask */ - out_be64(p->regs + PHB_ASN_CMPM, ((u64)ASNIND << 48) | - ((u64)ASNMASK << 32) | PHB_ASN_CMPM_ENABLE); + if (is_phb4()) { + /* Init_25 - ASN Compare/Mask - P9 only */ + out_be64(p->regs + PHB_ASN_CMPM, ((u64)ASNIND << 48) | + ((u64)ASNMASK << 32) | PHB_ASN_CMPM_ENABLE); + } /* Init_26 - CAPI Compare/Mask */ /* See enable_capi_mode() */ @@ -5123,18 +5160,26 @@ static void phb4_init_errors(struct phb4 *p) /* Init_73..81 - TXE errors */ out_be64(p->regs + 0x0d08, 0x0000000000000000ull); + /* Errata: Clear bit 17, otherwise a CFG write UR/CA will incorrectly * freeze a "random" PE (whatever last PE did an MMIO) */ - out_be64(p->regs + 0x0d28, 0x0000000a00000000ull); - if (phb4_is_dd20(p)) { - out_be64(p->regs + 0x0d00, 0xf3acff0ff7ddfff0ull); - out_be64(p->regs + 0x0d18, 0xf3acff0ff7ddfff0ull); - out_be64(p->regs + 0x0d30, 0xdfffbd05f7ddfff0ull); /* XXX CAPI has diff. value */ - } else { + if (is_phb5()) { + out_be64(p->regs + 0x0d28, 0x0000500a00000000ull); out_be64(p->regs + 0x0d00, 0xffffffffffffffffull); out_be64(p->regs + 0x0d18, 0xffffff0fffffffffull); - out_be64(p->regs + 0x0d30, 0xdff7bd05f7ddfff0ull); + out_be64(p->regs + 0x0d30, 0xdff7af41f7ddffdfull); + } else { + out_be64(p->regs + 0x0d28, 0x0000000a00000000ull); + if (phb4_is_dd20(p)) { + out_be64(p->regs + 0x0d00, 0xf3acff0ff7ddfff0ull); + out_be64(p->regs + 0x0d18, 0xf3acff0ff7ddfff0ull); + out_be64(p->regs + 0x0d30, 0xdfffbd05f7ddfff0ull); /* XXX CAPI has diff. value */ + } else { + out_be64(p->regs + 0x0d00, 0xffffffffffffffffull); + out_be64(p->regs + 0x0d18, 0xffffff0fffffffffull); + out_be64(p->regs + 0x0d30, 0xdff7bd05f7ddfff0ull); + } } out_be64(p->regs + 0x0d40, 0x0000000000000000ull); @@ -5241,7 +5286,7 @@ static void phb4_init_hw(struct phb4 *p) { uint64_t val, creset; - PHBDBG(p, "Initializing PHB4...\n"); + PHBDBG(p, "Initializing PHB...\n"); /* Init_1 - Sync reset * @@ -5288,6 +5333,18 @@ static void phb4_init_hw(struct phb4 *p) out_be64(p->regs + PHB_PCIE_DLP_CTL, val); } + if (is_phb5()) { + /* disable scaled flow control for now. 
SW527785 */ + PHBDBG(p, "LINK: Disabling scaled flow control\n"); + val = in_be64(p->regs + PHB_PCIE_DLP_CTL); + val |= PHB_PCIE_DLP_CTL_SFC_DISABLE; + out_be64(p->regs + PHB_PCIE_DLP_CTL, val); + + /* lane equalization settings need to be tuned on P10 */ + out_be64(p->regs + PHB_PCIE_PDL_PHY_EQ_CNTL, + 0x80F4FFFFFF0F9C00); + } + /* Init_14 - Clear link training */ phb4_pcicfg_write32(&p->phb, 0, 0x78, 0x07FE0000 | p->max_link_speed); @@ -5698,6 +5755,13 @@ static __be64 lane_eq_default[8] = { CPU_TO_BE64(0x7777777777777777UL), CPU_TO_BE64(0x7777777777777777UL), }; +static __be64 lane_eq_phb5_default[8] = { + CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), + CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), + CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), + CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), +}; + static void phb4_create(struct dt_node *np) { const struct dt_property *prop; @@ -5816,7 +5880,10 @@ static void phb4_create(struct dt_node *np) } } else { PHBDBG(p, "Using default lane equalization settings\n"); - p->lane_eq = lane_eq_default; + if (is_phb5()) + p->lane_eq = lane_eq_phb5_default; + else + p->lane_eq = lane_eq_default; } if (p->lane_eq) { PHBDBG(p, "Override lane equalization settings:\n"); @@ -5830,7 +5897,10 @@ static void phb4_create(struct dt_node *np) * 2K or 4K interrupts ... for now we just use 4K but that * needs to be fixed */ - irq_base = xive_alloc_hw_irqs(p->chip_id, p->num_irqs, p->num_irqs); + if (is_phb5()) + irq_base = xive2_alloc_hw_irqs(p->chip_id, p->num_irqs, p->num_irqs); + else + irq_base = xive_alloc_hw_irqs(p->chip_id, p->num_irqs, p->num_irqs); if (irq_base == XIVE_IRQ_ERROR) { PHBERR(p, "Failed to allocate %d interrupt sources\n", p->num_irqs); @@ -5838,8 +5908,6 @@ static void phb4_create(struct dt_node *np) } p->base_msi = irq_base; p->base_lsi = irq_base + p->num_irqs - 8; - p->irq_port = xive_get_notify_port(p->chip_id, - XIVE_HW_SRC_PHBn(p->index)); p->num_pes = p->max_num_pes; /* Allocate the SkiBoot internal in-memory tables for the PHB */ @@ -5854,7 +5922,8 @@ static void phb4_create(struct dt_node *np) phb4_init_hw(p); /* init capp that might get attached to the phb */ - phb4_init_capp(p); + if (is_phb4()) + phb4_init_capp(p); /* Compute XIVE source flags depending on PHB revision */ irq_flags = 0; @@ -5863,13 +5932,23 @@ static void phb4_create(struct dt_node *np) else irq_flags |= XIVE_SRC_TRIGGER_PAGE; - /* Register all interrupt sources with XIVE */ - xive_register_hw_source(p->base_msi, p->num_irqs - 8, 16, - p->int_mmio, irq_flags, NULL, NULL); + if (is_phb5()) { + /* Register all interrupt sources with XIVE */ + xive2_register_hw_source(p->base_msi, p->num_irqs - 8, 16, + p->int_mmio, irq_flags, NULL, NULL); - xive_register_hw_source(p->base_lsi, 8, 16, - p->int_mmio + ((p->num_irqs - 8) << 16), - XIVE_SRC_LSI, p, &phb4_lsi_ops); + xive2_register_hw_source(p->base_lsi, 8, 16, + p->int_mmio + ((p->num_irqs - 8) << 16), + XIVE_SRC_LSI, p, &phb4_lsi_ops); + } else { + /* Register all interrupt sources with XIVE */ + xive_register_hw_source(p->base_msi, p->num_irqs - 8, 16, + p->int_mmio, irq_flags, NULL, NULL); + + xive_register_hw_source(p->base_lsi, 8, 16, + p->int_mmio + ((p->num_irqs - 8) << 16), + XIVE_SRC_LSI, p, &phb4_lsi_ops); + } /* Platform additional setup */ if (platform.pci_setup_phb) @@ -5889,6 +5968,7 @@ static void phb4_create(struct dt_node *np) static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, 
uint32_t nest_base, uint32_t pci_base) { + enum phys_map_type phys_mmio64, phys_mmio32, phys_xive_esb, phys_reg_spc; uint32_t pci_stack, nest_stack, etu_base, gcid, phb_num, stk_index; uint64_t val, phb_bar = 0, irq_bar = 0, bar_en; uint64_t mmio0_bar = 0, mmio0_bmask, mmio0_sz; @@ -5902,12 +5982,27 @@ static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, unsigned int max_link_speed; int rc; + assert(is_phb5() || is_phb4()); /* Sanity check */ + gcid = dt_get_chip_id(stk_node); stk_index = dt_prop_get_u32(stk_node, "reg"); phb_num = dt_prop_get_u32(stk_node, "ibm,phb-index"); path = dt_get_path(stk_node); - prlog(PR_INFO, "PHB: Chip %d Found PHB4 PBCQ%d Stack %d at %s\n", - gcid, pec_index, stk_index, path); + if (is_phb5()) { + phys_mmio64 = PHB5_64BIT_MMIO; + phys_mmio32 = PHB5_32BIT_MMIO; + phys_xive_esb = PHB5_XIVE_ESB; + phys_reg_spc = PHB5_REG_SPC; + prlog(PR_INFO, "PHB: Chip %d Found PHB5 PBCQ%d Stack %d at %s\n", + gcid, pec_index, stk_index, path); + } else { + phys_mmio64 = PHB4_64BIT_MMIO; + phys_mmio32 = PHB4_32BIT_MMIO; + phys_xive_esb = PHB4_XIVE_ESB; + phys_reg_spc = PHB4_REG_SPC; + prlog(PR_INFO, "PHB: Chip %d Found PHB4 PBCQ%d Stack %d at %s\n", + gcid, pec_index, stk_index, path); + } free(path); pci_stack = pci_base + 0x40 * (stk_index + 1); @@ -5921,7 +6016,7 @@ static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, bar_en = 0; /* Initialize PHB register BAR */ - phys_map_get(gcid, PHB4_REG_SPC, phb_num, &phb_bar, NULL); + phys_map_get(gcid, phys_reg_spc, phb_num, &phb_bar, NULL); rc = xscom_write(gcid, nest_stack + XPEC_NEST_STK_PHB_REG_BAR, phb_bar << 8); @@ -5935,18 +6030,18 @@ static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, bar_en |= XPEC_NEST_STK_BAR_EN_PHB; /* Same with INT BAR (ESB) */ - phys_map_get(gcid, PHB4_XIVE_ESB, phb_num, &irq_bar, NULL); + phys_map_get(gcid, phys_xive_esb, phb_num, &irq_bar, NULL); xscom_write(gcid, nest_stack + XPEC_NEST_STK_IRQ_BAR, irq_bar << 8); bar_en |= XPEC_NEST_STK_BAR_EN_INT; /* Same with MMIO windows */ - phys_map_get(gcid, PHB4_64BIT_MMIO, phb_num, &mmio0_bar, &mmio0_sz); + phys_map_get(gcid, phys_mmio64, phb_num, &mmio0_bar, &mmio0_sz); mmio0_bmask = (~(mmio0_sz - 1)) & 0x00FFFFFFFFFFFFFFULL; xscom_write(gcid, nest_stack + XPEC_NEST_STK_MMIO_BAR0, mmio0_bar << 8); xscom_write(gcid, nest_stack + XPEC_NEST_STK_MMIO_BAR0_MASK, mmio0_bmask << 8); - phys_map_get(gcid, PHB4_32BIT_MMIO, phb_num, &mmio1_bar, &mmio1_sz); + phys_map_get(gcid, phys_mmio32, phb_num, &mmio1_bar, &mmio1_sz); mmio1_bmask = (~(mmio1_sz - 1)) & 0x00FFFFFFFFFFFFFFULL; xscom_write(gcid, nest_stack + XPEC_NEST_STK_MMIO_BAR1, mmio1_bar << 8); xscom_write(gcid, nest_stack + XPEC_NEST_STK_MMIO_BAR1_MASK, mmio1_bmask << 8); @@ -5994,7 +6089,10 @@ static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, if (!np) return; - dt_add_property_strings(np, "compatible", "ibm,power9-pciex", "ibm,ioda3-phb"); + if (is_phb5()) + dt_add_property_strings(np, "compatible", "ibm,power10-pciex", "ibm,ioda3-phb"); + else + dt_add_property_strings(np, "compatible", "ibm,power9-pciex", "ibm,ioda3-phb"); dt_add_property_strings(np, "device_type", "pciex"); dt_add_property_u64s(np, "reg", phb_bar, 0x1000, @@ -6078,12 +6176,24 @@ void probe_phb4(void) rx_err_max = MAX(rx_err_max, 0); rx_err_max = MIN(rx_err_max, 255); } - prlog(PR_DEBUG, "PHB4: Maximum RX errors during training: %d\n", rx_err_max); - /* Look for PBCQ XSCOM nodes */ - dt_for_each_compatible(dt_root, np, "ibm,power9-pbcq") - 
phb4_probe_pbcq(np); - /* Look for newly created PHB nodes */ - dt_for_each_compatible(dt_root, np, "ibm,power9-pciex") - phb4_create(np); + if (is_phb5()) { + prlog(PR_DEBUG, "PHB5: Maximum RX errors during training: %d\n", rx_err_max); + /* Look for PBCQ XSCOM nodes */ + dt_for_each_compatible(dt_root, np, "ibm,power10-pbcq") + phb4_probe_pbcq(np); + + /* Look for newly created PHB nodes */ + dt_for_each_compatible(dt_root, np, "ibm,power10-pciex") + phb4_create(np); + } else { + prlog(PR_DEBUG, "PHB4: Maximum RX errors during training: %d\n", rx_err_max); + /* Look for PBCQ XSCOM nodes */ + dt_for_each_compatible(dt_root, np, "ibm,power9-pbcq") + phb4_probe_pbcq(np); + + /* Look for newly created PHB nodes */ + dt_for_each_compatible(dt_root, np, "ibm,power9-pciex") + phb4_create(np); + } } diff --git a/hw/phys-map.c b/hw/phys-map.c index b8fff0a4f..d6ff99fd8 100644 --- a/hw/phys-map.c +++ b/hw/phys-map.c @@ -33,36 +33,36 @@ static const struct phys_map_entry phys_map_table_p10[] = { /* TODO: Figure out GPU memory */ /* 0 TB offset @ MMIO 0x0006000000000000ull */ - { PHB4_64BIT_MMIO, 0, 0x0006000000000000ull, 0x0000004000000000ull }, - { PHB4_64BIT_MMIO, 1, 0x0006004000000000ull, 0x0000004000000000ull }, - { PHB4_64BIT_MMIO, 2, 0x0006008000000000ull, 0x0000004000000000ull }, - { PHB4_32BIT_MMIO, 0, 0x000600c000000000ull, 0x0000000080000000ull }, - { PHB4_32BIT_MMIO, 1, 0x000600c080000000ull, 0x0000000080000000ull }, - { PHB4_32BIT_MMIO, 2, 0x000600c100000000ull, 0x0000000080000000ull }, - { PHB4_32BIT_MMIO, 3, 0x000600c180000000ull, 0x0000000080000000ull }, - { PHB4_32BIT_MMIO, 4, 0x000600c200000000ull, 0x0000000080000000ull }, - { PHB4_32BIT_MMIO, 5, 0x000600c280000000ull, 0x0000000080000000ull }, - { PHB4_XIVE_ESB , 0, 0x000600c300000000ull, 0x0000000020000000ull }, - { PHB4_XIVE_ESB , 1, 0x000600c320000000ull, 0x0000000020000000ull }, - { PHB4_XIVE_ESB , 2, 0x000600c340000000ull, 0x0000000020000000ull }, - { PHB4_XIVE_ESB , 3, 0x000600c360000000ull, 0x0000000020000000ull }, - { PHB4_XIVE_ESB , 4, 0x000600c380000000ull, 0x0000000020000000ull }, - { PHB4_XIVE_ESB , 5, 0x000600c3a0000000ull, 0x0000000020000000ull }, - { PHB4_REG_SPC , 0, 0x000600c3c0000000ull, 0x0000000000100000ull }, - { PHB4_REG_SPC , 1, 0x000600c3c0100000ull, 0x0000000000100000ull }, - { PHB4_REG_SPC , 2, 0x000600c3c0200000ull, 0x0000000000100000ull }, - { PHB4_REG_SPC , 3, 0x000600c3c0300000ull, 0x0000000000100000ull }, - { PHB4_REG_SPC , 4, 0x000600c3c0400000ull, 0x0000000000100000ull }, - { PHB4_REG_SPC , 5, 0x000600c3c0500000ull, 0x0000000000100000ull }, + { PHB5_64BIT_MMIO, 0, 0x0006000000000000ull, 0x0000004000000000ull }, + { PHB5_64BIT_MMIO, 1, 0x0006004000000000ull, 0x0000004000000000ull }, + { PHB5_64BIT_MMIO, 2, 0x0006008000000000ull, 0x0000004000000000ull }, + { PHB5_32BIT_MMIO, 0, 0x000600c000000000ull, 0x0000000080000000ull }, + { PHB5_32BIT_MMIO, 1, 0x000600c080000000ull, 0x0000000080000000ull }, + { PHB5_32BIT_MMIO, 2, 0x000600c100000000ull, 0x0000000080000000ull }, + { PHB5_32BIT_MMIO, 3, 0x000600c180000000ull, 0x0000000080000000ull }, + { PHB5_32BIT_MMIO, 4, 0x000600c200000000ull, 0x0000000080000000ull }, + { PHB5_32BIT_MMIO, 5, 0x000600c280000000ull, 0x0000000080000000ull }, + { PHB5_XIVE_ESB , 0, 0x000600c300000000ull, 0x0000000020000000ull }, + { PHB5_XIVE_ESB , 1, 0x000600c320000000ull, 0x0000000020000000ull }, + { PHB5_XIVE_ESB , 2, 0x000600c340000000ull, 0x0000000020000000ull }, + { PHB5_XIVE_ESB , 3, 0x000600c360000000ull, 0x0000000020000000ull }, + { PHB5_XIVE_ESB , 4, 0x000600c380000000ull, 
0x0000000020000000ull }, + { PHB5_XIVE_ESB , 5, 0x000600c3a0000000ull, 0x0000000020000000ull }, + { PHB5_REG_SPC , 0, 0x000600c3c0000000ull, 0x0000000000100000ull }, + { PHB5_REG_SPC , 1, 0x000600c3c0100000ull, 0x0000000000100000ull }, + { PHB5_REG_SPC , 2, 0x000600c3c0200000ull, 0x0000000000100000ull }, + { PHB5_REG_SPC , 3, 0x000600c3c0300000ull, 0x0000000000100000ull }, + { PHB5_REG_SPC , 4, 0x000600c3c0400000ull, 0x0000000000100000ull }, + { PHB5_REG_SPC , 5, 0x000600c3c0500000ull, 0x0000000000100000ull }, { RESV , 0, 0x000600c3c0600000ull, 0x0000003c3fa00000ull }, /* 1 TB offset */ { RESV , 1, 0x0006010000000000ull, 0x0000010000000000ull }, /* 2 TB offset */ - { PHB4_64BIT_MMIO, 3, 0x0006020000000000ull, 0x0000004000000000ull }, - { PHB4_64BIT_MMIO, 4, 0x0006024000000000ull, 0x0000004000000000ull }, - { PHB4_64BIT_MMIO, 5, 0x0006028000000000ull, 0x0000004000000000ull }, + { PHB5_64BIT_MMIO, 3, 0x0006020000000000ull, 0x0000004000000000ull }, + { PHB5_64BIT_MMIO, 4, 0x0006024000000000ull, 0x0000004000000000ull }, + { PHB5_64BIT_MMIO, 5, 0x0006028000000000ull, 0x0000004000000000ull }, { RESV , 2, 0x000602c000000000ull, 0x0000004000000000ull }, /* 3 TB offset */ diff --git a/include/opal-api.h b/include/opal-api.h index 9cba35c7d..eb6d83527 100644 --- a/include/opal-api.h +++ b/include/opal-api.h @@ -799,7 +799,8 @@ enum { enum { OPAL_PHB_ERROR_DATA_TYPE_PHB3 = 2, - OPAL_PHB_ERROR_DATA_TYPE_PHB4 = 3 + OPAL_PHB_ERROR_DATA_TYPE_PHB4 = 3, + OPAL_PHB_ERROR_DATA_TYPE_PHB5 = 3 /* TODO change this */ }; enum { diff --git a/include/phb4-regs.h b/include/phb4-regs.h index b6e778744..03b53ae01 100644 --- a/include/phb4-regs.h +++ b/include/phb4-regs.h @@ -53,9 +53,11 @@ #define PHB_M64_AOMASK 0x1d0 #define PHB_M64_UPPER_BITS 0x1f0 #define PHB_NXLATE_PREFIX 0x1f8 -#define PHB_DMARD_SYNC 0x200 -#define PHB_DMARD_SYNC_START PPC_BIT(0) -#define PHB_DMARD_SYNC_COMPLETE PPC_BIT(1) +#define PHB_DMA_READ_WRITE_SYNC 0x200 +#define PHB_DMA_READ_SYNC_START PPC_BIT(0) +#define PHB_DMA_READ_SYNC_COMPLETE PPC_BIT(1) +#define PHB_DMA_WRITE_SYNC_START PPC_BIT(2) /* PHB5 */ +#define PHB_DMA_WRITE_SYNC_COMPLETE PPC_BIT(3) /* PHB5 */ #define PHB_RTC_INVALIDATE 0x208 #define PHB_RTC_INVALIDATE_ALL PPC_BIT(0) #define PHB_RTC_INVALIDATE_RID PPC_BITMASK(16,31) @@ -274,6 +276,7 @@ #define PHB_PCIE_DLP_CTL 0x1A78 #define PHB_PCIE_DLP_CTL_BYPASS_PH2 PPC_BIT(4) #define PHB_PCIE_DLP_CTL_BYPASS_PH3 PPC_BIT(5) +#define PHB_PCIE_DLP_CTL_SFC_DISABLE PPC_BIT(60) #define PHB_PCIE_DLP_TRWCTL 0x1A80 #define PHB_PCIE_DLP_TRWCTL_EN PPC_BIT(0) @@ -293,6 +296,7 @@ #define PHB_PCIE_LANE_EQ_CNTL21 0x1AF8 #define PHB_PCIE_TRACE_CTRL 0x1B20 #define PHB_PCIE_MISC_STRAP 0x1B30 +#define PHB_PCIE_PDL_PHY_EQ_CNTL 0x1B38 /* Error */ #define PHB_REGB_ERR_STATUS 0x1C00 diff --git a/include/phb4.h b/include/phb4.h index abba2d9c6..217f68462 100644 --- a/include/phb4.h +++ b/include/phb4.h @@ -154,9 +154,9 @@ struct phb4_err { #define PHB4_ETU_IN_RESET 0x00000020 struct phb4 { - unsigned int index; /* 0..5 index inside p9 */ + unsigned int index; /* 0..5 index inside p9/p10 */ unsigned int flags; - unsigned int chip_id; /* Chip ID (== GCID on p9) */ + unsigned int chip_id; /* Chip ID (== GCID on p9/p10) */ unsigned int pec; bool broken; unsigned int rev; /* 00MMmmmm */ @@ -245,16 +245,20 @@ static inline void phb4_set_err_pending(struct phb4 *p, bool pending) p->err_pending = pending; } -#define PHB4_PER_CHIP 6 /* Max 6 PHBs per chip on p9 */ -#define PHB4_MAX_PHBS_PER_CHIP_P9 PHB4_PER_CHIP -#define PHB4_MAX_PHBS_PER_CHIP_P9P 0x10 /* extra for virt 
PHBs */ +#define MAX_PHBS_PER_CHIP_P10 6 /* Max 6 PHBs per chip on p10 */ +#define MAX_PHBS_PER_CHIP_P9 6 /* Max 6 PHBs per chip on p9 */ +#define MAX_PHBS_PER_CHIP_P9P 0x10 /* extra for virt PHBs */ static inline int phb4_get_opal_id(unsigned int chip_id, unsigned int index) { - if (PVR_TYPE(mfspr(SPR_PVR)) == PVR_TYPE_P9) - return chip_id * PHB4_MAX_PHBS_PER_CHIP_P9 + index; - else - return chip_id * PHB4_MAX_PHBS_PER_CHIP_P9P + index; + if (proc_gen == proc_gen_p10) { + return chip_id * MAX_PHBS_PER_CHIP_P10 + index; + } else { + if (PVR_TYPE(mfspr(SPR_PVR)) == PVR_TYPE_P9) + return chip_id * MAX_PHBS_PER_CHIP_P9 + index; + else + return chip_id * MAX_PHBS_PER_CHIP_P9P + index; + } } void phb4_pec2_dma_engine_realloc(struct phb4 *p); diff --git a/include/phys-map.h b/include/phys-map.h index a3394c0d0..1dd337a56 100644 --- a/include/phys-map.h +++ b/include/phys-map.h @@ -20,6 +20,10 @@ enum phys_map_type { PHB4_32BIT_MMIO, PHB4_XIVE_ESB, PHB4_REG_SPC, + PHB5_64BIT_MMIO, + PHB5_32BIT_MMIO, + PHB5_XIVE_ESB, + PHB5_REG_SPC, NPU_OCAPI_MMIO, XIVE_VC, XIVE_PC, -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:15 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:15 +0530 Subject: [Skiboot] [PATCH v2 37/59] psi/p10: Introduce xive2_source_mask() In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-38-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater Commit fa161cd89fbf ("hw/psi-p9: Mask OPAL-owned LSIs without handlers") introduced xive_source_mask(). Do the same for P10. Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/psi.c | 11 ++++++++++- hw/xive2.c | 7 +++++++ include/xive.h | 2 ++ 3 files changed, 19 insertions(+), 1 deletion(-) diff --git a/hw/psi.c b/hw/psi.c index 291422539..e9b8e2ea7 100644 --- a/hw/psi.c +++ b/hw/psi.c @@ -564,7 +564,16 @@ static void psi_p9_mask_unhandled_irq(struct irq_source *is, uint32_t isn) * have a handler for the interrupt then it needs to be masked to * prevent the IRQ from locking up the thread which handles it. 
*/ - xive_source_mask(is, isn); + switch (proc_gen) { + case proc_gen_p9: + xive_source_mask(is, isn); + break; + case proc_gen_p10: + xive2_source_mask(is, isn); + return; + default: + assert(false); + } } diff --git a/hw/xive2.c b/hw/xive2.c index cba050fa1..f565be1fd 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -2532,6 +2532,13 @@ static char *xive_source_name(struct irq_source *is, uint32_t isn) return s->orig_ops->name(is, isn); } +void xive2_source_mask(struct irq_source *is, uint32_t isn) +{ + struct xive_src *s = container_of(is, struct xive_src, is); + + xive_update_irq_mask(s, isn - s->esb_base, true); +} + static const struct irq_source_ops xive_irq_source_ops = { .interrupt = xive_source_interrupt, .attributes = xive_source_attributes, diff --git a/include/xive.h b/include/xive.h index faaef2aeb..8d5fbeddb 100644 --- a/include/xive.h +++ b/include/xive.h @@ -91,6 +91,8 @@ uint64_t xive2_get_esb_base(uint32_t girq); void xive2_cpu_callin(struct cpu_thread *cpu); void *xive2_get_trigger_port(uint32_t girq); +void xive2_source_mask(struct irq_source *is, uint32_t isn); + void xive2_cpu_reset(void); void xive2_late_init(void); -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:16 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:16 +0530 Subject: [Skiboot] [PATCH v2 38/59] psi/p10: Mask all sources at init In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-39-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/psi.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/hw/psi.c b/hw/psi.c index e9b8e2ea7..954b7bf68 100644 --- a/hw/psi.c +++ b/hw/psi.c @@ -766,6 +766,8 @@ static void psi_init_p10_interrupts(struct psi *psi) u64 val; uint32_t esb_shift = 16; uint32_t flags = XIVE_SRC_LSI; + struct irq_source *is; + int isn; /* Grab chip */ chip = get_chip(psi->chip_id); @@ -813,6 +815,11 @@ static void psi_init_p10_interrupts(struct psi *psi) esb_shift, psi->esb_mmio, flags, psi, &psi_p10_irq_ops); + /* Mask all sources */ + is = irq_find_source(psi->interrupt); + for (isn = is->start; isn < is->end; isn++) + xive2_source_mask(is, isn); + /* Reset irq handling and switch to ESB mode */ out_be64(psi->regs + PSIHB_INTERRUPT_CONTROL, PSIHB_IRQ_RESET); out_be64(psi->regs + PSIHB_INTERRUPT_CONTROL, 0); -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:14 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:14 +0530 Subject: [Skiboot] [PATCH v2 36/59] hw/phb5: Add support for 'Address-Based Interrupt Trigger' mode In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-37-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater The PHB5 introduces a new Address-Based Interrupt mode which extends the notification offloading to the ESB pages. When ABT is activated, the PHB maps the interrupt source number into the interrupt command address. The PHB triggers the interrupt using directly the IC ESB page of the interrupt number and does not use the notify page of the IC anymore. The PHB interrrupt configuration under ABT is a little different. 
The 'Interrupt Notify Base Address' register points to the base address of the IC ESB pages and not to the notify page of the IC anymore as on P9. The 'Interrupt Notify Base Index' register is unused. This should improve overall performance. The P10 IC can handle higher interrupt rates compared to P9 and the PHB latency should be improved under ABT. Debug is easier as the interrupt number is now exposed on the PowerBUS. Signed-off-by: C?dric Le Goater [FB: port to phb4.c] Signed-off-by: Frederic Barrat Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/phb4.c | 63 ++++++++++++++++++++++++++++++++++++++++----- hw/xive2.c | 6 ----- include/phb4-regs.h | 2 ++ 3 files changed, 59 insertions(+), 12 deletions(-) diff --git a/hw/phb4.c b/hw/phb4.c index d2d9f9ec0..d2fc274b3 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -159,6 +159,18 @@ static inline bool phb_pq_disable(struct phb4 *p __unused) return false; } +/* + * Use the ESB page of the XIVE IC for event notification. Latency + * improvement. + */ +static inline bool phb_abt_mode(struct phb4 *p __unused) +{ + if (is_phb5()) + return 1; + + return false; +} + static inline bool phb_can_store_eoi(struct phb4 *p) { if (is_phb5()) @@ -5000,12 +5012,49 @@ static const struct phb_ops phb4_ops = { static void phb4_init_ioda3(struct phb4 *p) { - /* Init_18 - Interrupt Notify Base Address */ - out_be64(p->regs + PHB_INT_NOTIFY_ADDR, p->irq_port); + if (is_phb5()) { + /* + * When ABT is on, the MSIs on the PHB use the PQ state bits + * of the IC and MSI triggers from the PHB are forwarded + * directly to the IC ESB page. However, the LSIs are still + * controlled locally on the PHB and LSI triggers use a + * special offset for trigger injection. + */ + if (phb_abt_mode(p)) { + uint64_t mmio_base = xive2_get_esb_base(p->base_msi); + + PHBDBG(p, "Using ABT mode. ESB: 0x%016llx\n", mmio_base); + + /* Init_18 - Interrupt Notify Base Address */ + out_be64(p->regs + PHB_INT_NOTIFY_ADDR, + PHB_INT_NOTIFY_ADDR_64K | mmio_base); + + /* Interrupt Notify Base Index is unused */ + } else { + p->irq_port = xive2_get_notify_port(p->chip_id, + XIVE_HW_SRC_PHBn(p->index)); + + PHBDBG(p, "Using IC notif page at 0x%016llx\n", + p->irq_port); - /* Init_19 - Interrupt Notify Base Index */ - out_be64(p->regs + PHB_INT_NOTIFY_INDEX, - xive2_get_notify_base(p->base_msi)); + /* Init_18 - Interrupt Notify Base Address */ + out_be64(p->regs + PHB_INT_NOTIFY_ADDR, p->irq_port); + + /* Init_19 - Interrupt Notify Base Index */ + out_be64(p->regs + PHB_INT_NOTIFY_INDEX, + xive2_get_notify_base(p->base_msi)); + } + + } else { /* p9 */ + p->irq_port = xive_get_notify_port(p->chip_id, + XIVE_HW_SRC_PHBn(p->index)); + /* Init_18 - Interrupt Notify Base Address */ + out_be64(p->regs + PHB_INT_NOTIFY_ADDR, p->irq_port); + + /* Init_19 - Interrupt Notify Base Index */ + out_be64(p->regs + PHB_INT_NOTIFY_INDEX, + xive_get_notify_base(p->base_msi)); + } /* Init_19x - Not in spec: Initialize source ID */ PHBDBG(p, "Reset state SRC_ID: %016llx\n", @@ -5384,6 +5433,8 @@ static void phb4_init_hw(struct phb4 *p) val |= SETFIELD(PHB_CTRLR_TVT_ADDR_SEL, 0ull, TVT_2_PER_PE); if (phb_pq_disable(p)) val |= PHB_CTRLR_IRQ_PQ_DISABLE; + if (phb_abt_mode(p)) + val |= PHB_CTRLR_IRQ_ABT_MODE; if (phb_can_store_eoi(p)) { val |= PHB_CTRLR_IRQ_STORE_EOI; PHBDBG(p, "store EOI is enabled\n"); @@ -5958,7 +6009,7 @@ static void phb4_create(struct dt_node *np) * ESB pages of the XIVE IC for the MSI sources instead of the * ESB pages of the PHB. 
*/ - if (phb_pq_disable(p)) { + if (phb_pq_disable(p) || phb_abt_mode(p)) { xive2_register_esb_source(p->base_msi, p->num_irqs - 8); } else { xive2_register_hw_source(p->base_msi, diff --git a/hw/xive2.c b/hw/xive2.c index 3f4958fce..cba050fa1 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -2141,12 +2141,6 @@ uint64_t xive2_get_notify_port(uint32_t chip_id, uint32_t ent) * * P10 might now be randomizing the cache line bits in HW to * balance snoop bus usage - * - * TODO (phb5) : implement "address based triggers" (DD2.0?) - * - * The PHBs would no longer target the notify port page but - * the "base ESB MMIO address" of the ESB/EAS range they are - * allocated. Needs a XIVE API change for the PHBs. */ switch(ent) { case XIVE_HW_SRC_PHBn(0): diff --git a/include/phb4-regs.h b/include/phb4-regs.h index 139522814..99633e103 100644 --- a/include/phb4-regs.h +++ b/include/phb4-regs.h @@ -97,11 +97,13 @@ #define PHB_PAPR_ERR_INJ_MASK_MMIO PPC_BITMASK(16,63) #define PHB_ETU_ERR_SUMMARY 0x2c8 #define PHB_INT_NOTIFY_ADDR 0x300 +#define PHB_INT_NOTIFY_ADDR_64K PPC_BIT(1) /* PHB5 */ #define PHB_INT_NOTIFY_INDEX 0x308 #define PHB_VERSION 0x800 #define PHB_CTRLR 0x810 #define PHB_CTRLR_IRQ_PQ_DISABLE PPC_BIT(9) /* PHB5 */ +#define PHB_CTRLR_IRQ_ABT_MODE PPC_BIT(10) /* PHB5 */ #define PHB_CTRLR_IRQ_PGSZ_64K PPC_BIT(11) #define PHB_CTRLR_IRQ_STORE_EOI PPC_BIT(12) #define PHB_CTRLR_MMIO_RD_STRICT PPC_BIT(13) -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:19 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:19 +0530 Subject: [Skiboot] [PATCH v2 41/59] xive/p10: Configure XIVE for fused cores In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-42-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/xive2.c | 17 ++++++++++++++++- include/xive2-regs.h | 12 ++++++++++++ 2 files changed, 28 insertions(+), 1 deletion(-) diff --git a/hw/xive2.c b/hw/xive2.c index 0005a8314..67b497082 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -1594,6 +1594,19 @@ static bool xive_has_cap(struct xive *x, uint64_t cap) #define XIVE_CAN_STORE_EOI(x) xive_has_cap(x, CQ_XIVE_CAP_STORE_EOI) +static void xive_config_fused_core(struct xive *x) +{ + uint64_t val = xive_regr(x, TCTXT_CFG); + + if (this_cpu()->is_fused_core) { + val |= TCTXT_CFG_FUSE_CORE_EN; + xive_dbg(x, "configured for fused cores. " + "PC_TCTXT_CFG=%016llx\n", val); + } else + val &= ~TCTXT_CFG_FUSE_CORE_EN; + xive_regw(x, TCTXT_CFG, val); +} + static void xive_config_reduced_priorities_fixup(struct xive *x) { if (xive_cfg_vp_prio_shift(x) < CQ_XIVE_CFG_INT_PRIO_8 && @@ -1686,6 +1699,8 @@ static bool xive_config_init(struct xive *x) xive_dbg(x, "store EOI is %savailable\n", XIVE_CAN_STORE_EOI(x) ? "" : "not "); + xive_config_fused_core(x); + xive_config_reduced_priorities_fixup(x); return true; @@ -2981,7 +2996,7 @@ static void xive_init_cpu(struct cpu_thread *c) * of a pair is present we just do the setup for each of them, which * is harmless. 
*/ - if (cpu_is_thread0(c)) + if (cpu_is_thread0(c) || cpu_is_core_chiplet_primary(c)) xive_configure_ex_special_bar(x, c); /* Initialize the state structure */ diff --git a/include/xive2-regs.h b/include/xive2-regs.h index 79c36ebca..6295dd191 100644 --- a/include/xive2-regs.h +++ b/include/xive2-regs.h @@ -392,6 +392,18 @@ #define X_TCTXT_EN1_RESET 0x307 #define TCTXT_EN1_RESET 0x038 +/* TCTXT Config register */ +#define X_TCTXT_CFG 0x328 +#define TCTXT_CFG 0x140 +#define TCTXT_CFG_FUSE_CORE_EN PPC_BIT(0) +#define TCTXT_CFG_PHYP_CORE_MODE PPC_BIT(1) /* O:Linux 1:pHyp */ +#define TCTXT_CFG_GEN1_HYP_TARGET_DIS PPC_BIT(4) +#define TCTXT_CFG_GEN1_OS_ST_ACK PPC_BIT(5) +#define TCTXT_CFG_GEN1_OGEN_FINE PPC_BIT(6) +#define TCTXT_CFG_INT_MSGSND_DIS PPC_BIT(17) +#define TCTXT_CFG_HOSTBOOT_MODE PPC_BIT(20) +#define TCTXT_CFG_COMPLEX_STORE_DIS PPC_BITMASK(25, 27) + /* * VSD Tables */ -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:20 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:20 +0530 Subject: [Skiboot] [PATCH v2 42/59] xive/p10: Add automatic Context Save and Restore support In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-43-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater The save-restore feature is forced when available. It would have been better to introduce some negotiation but the CAM line value is returned by get_vp_info() before the save-restore feature can be enabled by KVM in xive_native_enable_vp(). This is compatible with the current KVM implementation for P9. Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/xive2.c | 48 ++++++++++++++++++++++++++++++++++++++++++++ include/opal-api.h | 1 + include/xive2-regs.h | 8 +++++++- 3 files changed, 56 insertions(+), 1 deletion(-) diff --git a/hw/xive2.c b/hw/xive2.c index 67b497082..7ece64251 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -1512,6 +1512,7 @@ static const struct { { CQ_XIVE_CAP_PHB_ABT, "PHB address based trigger mode support" }, { CQ_XIVE_CAP_EXPLOITATION_MODE, "Exploitation mode" }, { CQ_XIVE_CAP_STORE_EOI, "StoreEOI mode support" }, + { CQ_XIVE_CAP_VP_SAVE_RESTORE, "VP Context Save and Restore" }, }; static void xive_dump_capabilities(struct xive *x, uint64_t cap_val) @@ -1543,6 +1544,8 @@ static const struct { { CQ_XIVE_CFG_GEN1_TIMA_HYP_BLK0, "Gen1 mode TIMA General Hypervisor Block0" }, { CQ_XIVE_CFG_GEN1_TIMA_CROWD_DIS, "Gen1 mode TIMA Crowd disable" }, { CQ_XIVE_CFG_GEN1_END_ESX, "Gen1 mode END ESx" }, + { CQ_XIVE_CFG_EN_VP_SAVE_RESTORE, "VP Context Save and Restore" }, + { CQ_XIVE_CFG_EN_VP_SAVE_REST_STRICT, "VP Context Save and Restore strict" }, }; static void xive_dump_configuration(struct xive *x, const char *prefix, @@ -1594,6 +1597,11 @@ static bool xive_has_cap(struct xive *x, uint64_t cap) #define XIVE_CAN_STORE_EOI(x) xive_has_cap(x, CQ_XIVE_CAP_STORE_EOI) +static bool xive_cfg_save_restore(struct xive *x) +{ + return !!(x->config & CQ_XIVE_CFG_EN_VP_SAVE_RESTORE); +} + static void xive_config_fused_core(struct xive *x) { uint64_t val = xive_regr(x, TCTXT_CFG); @@ -1654,6 +1662,14 @@ static bool xive_config_init(struct xive *x) x->config |= CQ_XIVE_CFG_HYP_HARD_BLKID_OVERRIDE | SETFIELD(CQ_XIVE_CFG_HYP_HARD_BLOCK_ID, 0ull, x->block_id); + /* + * Enable "VP Context Save and Restore" by default. 
it is + * compatible with KVM which currently does the context + * save&restore in the entry/exit path of the vCPU + */ + if (x->capabilities & CQ_XIVE_CAP_VP_SAVE_RESTORE) + x->config |= CQ_XIVE_CFG_EN_VP_SAVE_RESTORE; + xive_dump_configuration(x, "new", x->config); xive_regw(x, CQ_XIVE_CFG, x->config); if (xive_regr(x, CQ_XIVE_CFG) != x->config) { @@ -1903,6 +1919,9 @@ static void xive_create_mmio_dt_node(struct xive *x) if (XIVE_CAN_STORE_EOI(x)) dt_add_property(xive_dt_node, "store-eoi", NULL, 0); + if (xive_cfg_save_restore(x)) + dt_add_property(xive_dt_node, "vp-save-restore", NULL, 0); + xive_add_provisioning_properties(); } @@ -3470,6 +3489,8 @@ static int64_t opal_xive_get_vp_info(uint64_t vp_id, return OPAL_PARAMETER; if (xive_get_field32(NVP_W0_VALID, vp->w0)) *out_flags |= cpu_to_be64(OPAL_XIVE_VP_ENABLED); + if (xive_cfg_save_restore(x)) + *out_flags |= cpu_to_be64(OPAL_XIVE_VP_SAVE_RESTORE); if (xive_get_field32(END_W0_SILENT_ESCALATE, end->w0)) *out_flags |= cpu_to_be64(OPAL_XIVE_VP_SINGLE_ESCALATION); } @@ -3479,6 +3500,13 @@ static int64_t opal_xive_get_vp_info(uint64_t vp_id, cam_value = (blk << x->vp_shift) | idx; + /* + * If save-restore is enabled, force the CAM line + * value with the H bit. + */ + if (xive_cfg_save_restore(x)) + cam_value |= TM10_QW1W2_HO; + *out_cam_value = cpu_to_be64(cam_value); } @@ -3626,6 +3654,10 @@ static int64_t opal_xive_set_vp_info(uint64_t vp_id, if (!vp) return OPAL_PARAMETER; + /* Consistency check. */ + if ((flags & OPAL_XIVE_VP_SAVE_RESTORE) && !xive_cfg_save_restore(x)) + return OPAL_PARAMETER; + lock(&x->lock); vp_new = *vp; @@ -3638,6 +3670,22 @@ static int64_t opal_xive_set_vp_info(uint64_t vp_id, rc = xive_setup_silent_gather(vp_id, true); else rc = xive_setup_silent_gather(vp_id, false); + + /* + * Prepare NVP to be HW owned for automatic save-restore + */ + if (xive_cfg_save_restore(x)) { + /* + * Set NVP privilege level. Default to OS. + * This check only makes sense for KVM guests + * currently. We would need an extra flag to + * distinguish from pool level. + */ + vp_new.w0 = xive_set_field32(NVP_W0_VPRIV, vp_new.w0, 0); + + vp_new.w2 = xive_set_field32(NVP_W2_CPPR, vp_new.w2, 0xFF); + vp_new.w0 = xive_set_field32(NVP_W0_HW, vp_new.w0, 1); + } } else { /* * TODO (kvm): disabling a VP invalidates the associated ENDs. 
diff --git a/include/opal-api.h b/include/opal-api.h index eb6d83527..d7b301a30 100644 --- a/include/opal-api.h +++ b/include/opal-api.h @@ -1177,6 +1177,7 @@ enum { enum { OPAL_XIVE_VP_ENABLED = 0x00000001, OPAL_XIVE_VP_SINGLE_ESCALATION = 0x00000002, + OPAL_XIVE_VP_SAVE_RESTORE = 0x00000004, }; /* "Any chip" replacement for chip ID for allocation functions */ diff --git a/include/xive2-regs.h b/include/xive2-regs.h index 6295dd191..ad1a9b79f 100644 --- a/include/xive2-regs.h +++ b/include/xive2-regs.h @@ -31,7 +31,7 @@ #define CQ_XIVE_CAP_VP_INT_PRIO_4_8 2 #define CQ_XIVE_CAP_VP_INT_PRIO_8 3 #define CQ_XIVE_CAP_BLOCK_ID_WIDTH PPC_BITMASK(12,13) - +#define CQ_XIVE_CAP_VP_SAVE_RESTORE PPC_BIT(38) #define CQ_XIVE_CAP_PHB_PQ_DISABLE PPC_BIT(56) #define CQ_XIVE_CAP_PHB_ABT PPC_BIT(57) #define CQ_XIVE_CAP_EXPLOITATION_MODE PPC_BIT(58) @@ -68,6 +68,10 @@ #define CQ_XIVE_CFG_GEN1_TIMA_CROWD_DIS PPC_BIT(27) /* 0 if bit[25]=0 */ #define CQ_XIVE_CFG_GEN1_END_ESX PPC_BIT(28) /* END ESx stores are dropped */ +#define CQ_XIVE_CFG_EN_VP_SAVE_RESTORE PPC_BIT(38) /* 0 if bit[25]=1 */ +#define CQ_XIVE_CFG_EN_VP_SAVE_REST_STRICT PPC_BIT(39) /* 0 if bit[25]=1 */ + +#define CQ_XIVE_CFG_EN_VP_SAVE_RESTORE PPC_BIT(38) /* 0 if bit[25]=1 */ /* Interrupt Controller Base Address Register - 512 pages (32M) */ #define X_CQ_IC_BAR 0x08 @@ -508,6 +512,8 @@ struct xive_end { struct xive_nvp { beint32_t w0; #define NVP_W0_VALID PPC_BIT32(0) +#define NVP_W0_HW PPC_BIT32(7) +#define NVP_W0_VPRIV PPC_BITMASK32(14,15) #define NVP_W0_ESC_END PPC_BIT32(25) /* 'N' bit 0:ESB 1:END */ beint32_t w1; beint32_t w2; -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:17 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:17 +0530 Subject: [Skiboot] [PATCH v2 39/59] xive/p10: Introduce new capability bits In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-40-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater These bits control the availability of interrupt features : StoreEOI, PHB PQ_disable, PHB Address-Based Trigger and the overall XIVE exploitation mode. These bits can be set at early boot time of the system to activate/deactivate a feature for testing purposes. The default value should be '1'. The 'XIVE exploitation mode' bit is a software bit that skiboot could use to disable the XIVE OS interface and propose a P8 style XICS interface instead. There are no plans for that for the moment. The 'PHB PQ_disable', 'PHB Address-Based Trigger' bits are only used by the PHB5 driver and we deduce their availability from the capabilities of the first XIVE chip. If called from a PHB4 driver, the capabilities should be set to false. 
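For illustration only, a minimal sketch of how a consumer is expected to gate a feature on these helpers. phb_irq_flags() is a hypothetical wrapper; is_phb5(), xive2_cap_store_eoi() and XIVE_SRC_STORE_EOI are the names used elsewhere in this series:

    /* Sketch: only advertise StoreEOI when the XIVE IC supports it */
    static uint64_t phb_irq_flags(struct phb4 *p)
    {
    	uint64_t flags = 0;

    	/* On PHB5, feature bits are deduced from the first XIVE chip */
    	if (is_phb5() && xive2_cap_store_eoi())
    		flags |= XIVE_SRC_STORE_EOI;

    	return flags;
    }
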
Signed-off-by: C?dric Le Goater [FB: port to phb4.c] Signed-off-by: Frederic Barrat Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/phb4.c | 4 ++-- hw/xive2.c | 56 ++++++++++++++++++++++++++++++++++++++------ include/xive.h | 5 +++- include/xive2-regs.h | 6 +++++ 4 files changed, 61 insertions(+), 10 deletions(-) diff --git a/hw/phb4.c b/hw/phb4.c index d2fc274b3..de314b13f 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -154,7 +154,7 @@ static inline bool is_phb5(void) static inline bool phb_pq_disable(struct phb4 *p __unused) { if (is_phb5()) - return 1; + return xive2_cap_phb_pq_disable(); return false; } @@ -166,7 +166,7 @@ static inline bool phb_pq_disable(struct phb4 *p __unused) static inline bool phb_abt_mode(struct phb4 *p __unused) { if (is_phb5()) - return 1; + return xive2_cap_phb_abt(); return false; } diff --git a/hw/xive2.c b/hw/xive2.c index f565be1fd..0005a8314 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -224,6 +224,7 @@ struct xive { struct dt_node *x_node; enum xive_generation generation; + uint64_t capabilities; uint64_t config; uint64_t xscom_base; @@ -341,8 +342,6 @@ struct xive { uint64_t quirks; }; -#define XIVE_CAN_STORE_EOI(x) XIVE2_STORE_EOI_ENABLED - /* First XIVE unit configured on the system */ static struct xive *one_xive; @@ -1509,6 +1508,10 @@ static const struct { uint64_t bitmask; const char *name; } xive_capabilities[] = { + { CQ_XIVE_CAP_PHB_PQ_DISABLE, "PHB PQ disable mode support" }, + { CQ_XIVE_CAP_PHB_ABT, "PHB address based trigger mode support" }, + { CQ_XIVE_CAP_EXPLOITATION_MODE, "Exploitation mode" }, + { CQ_XIVE_CAP_STORE_EOI, "StoreEOI mode support" }, }; static void xive_dump_capabilities(struct xive *x, uint64_t cap_val) @@ -1584,6 +1587,13 @@ static void xive_dump_configuration(struct xive *x, const char *prefix, CQ_XIVE_CFG_GEN1_TIMA_CROWD_DIS | \ CQ_XIVE_CFG_GEN1_END_ESX) +static bool xive_has_cap(struct xive *x, uint64_t cap) +{ + return !!x && !!(x->capabilities & cap); +} + +#define XIVE_CAN_STORE_EOI(x) xive_has_cap(x, CQ_XIVE_CAP_STORE_EOI) + static void xive_config_reduced_priorities_fixup(struct xive *x) { if (xive_cfg_vp_prio_shift(x) < CQ_XIVE_CFG_INT_PRIO_8 && @@ -1599,12 +1609,10 @@ static void xive_config_reduced_priorities_fixup(struct xive *x) static bool xive_config_init(struct xive *x) { - uint64_t cap_val; - - cap_val = xive_regr(x, CQ_XIVE_CAP); - xive_dump_capabilities(x, cap_val); + x->capabilities = xive_regr(x, CQ_XIVE_CAP); + xive_dump_capabilities(x, x->capabilities); - x->generation = GETFIELD(CQ_XIVE_CAP_VERSION, cap_val); + x->generation = GETFIELD(CQ_XIVE_CAP_VERSION, x->capabilities); /* * Allow QEMU to override version for tests @@ -4420,6 +4428,40 @@ static void xive_init_globals(void) xive_block_to_chip[i] = XIVE_INVALID_CHIP; } +/* + * The global availability of some capabilities used in other drivers + * (PHB, PSI) is deduced from the capabilities of the first XIVE chip + * of the system. It should be common to all chips. + */ +bool xive2_cap_phb_pq_disable(void) +{ + return xive_has_cap(one_xive, CQ_XIVE_CAP_PHB_PQ_DISABLE); +} + +bool xive2_cap_phb_abt(void) +{ + if (!xive_has_cap(one_xive, CQ_XIVE_CAP_PHB_ABT)) + return false; + + /* + * We need 'PQ disable' to use ABT mode, else the OS will use + * two different sets of ESB pages (PHB and IC) to control the + * interrupt sources. Can not work. + */ + if (!xive2_cap_phb_pq_disable()) { + prlog_once(PR_ERR, "ABT mode is set without PQ disable. 
" + "Ignoring bogus configuration\n"); + return false; + } + + return true; +} + +bool xive2_cap_store_eoi(void) +{ + return xive_has_cap(one_xive, CQ_XIVE_CAP_STORE_EOI); +} + void xive2_init(void) { struct dt_node *np; diff --git a/include/xive.h b/include/xive.h index 8d5fbeddb..1a8a2e027 100644 --- a/include/xive.h +++ b/include/xive.h @@ -72,9 +72,12 @@ void xive_late_init(void); * the PHB5 should be configured in Address-based trigger mode with PQ * state bit offloading. */ -#define XIVE2_STORE_EOI_ENABLED 1 +#define XIVE2_STORE_EOI_ENABLED xive2_cap_store_eoi() void xive2_init(void); +bool xive2_cap_phb_pq_disable(void); +bool xive2_cap_phb_abt(void); +bool xive2_cap_store_eoi(void); int64_t xive2_reset(void); uint32_t xive2_alloc_hw_irqs(uint32_t chip_id, uint32_t count, uint32_t align); diff --git a/include/xive2-regs.h b/include/xive2-regs.h index 6697f036e..79c36ebca 100644 --- a/include/xive2-regs.h +++ b/include/xive2-regs.h @@ -32,6 +32,12 @@ #define CQ_XIVE_CAP_VP_INT_PRIO_8 3 #define CQ_XIVE_CAP_BLOCK_ID_WIDTH PPC_BITMASK(12,13) +#define CQ_XIVE_CAP_PHB_PQ_DISABLE PPC_BIT(56) +#define CQ_XIVE_CAP_PHB_ABT PPC_BIT(57) +#define CQ_XIVE_CAP_EXPLOITATION_MODE PPC_BIT(58) +#define CQ_XIVE_CAP_STORE_EOI PPC_BIT(59) +/* 62:63 reserved */ + /* XIVE Configuration */ #define X_CQ_XIVE_CFG 0x03 #define CQ_XIVE_CFG 0x018 -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:18 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:18 +0530 Subject: [Skiboot] [PATCH v2 40/59] hw/psi-p10: Configure interrupt offset before notify addr In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-41-hegdevasant@linux.vnet.ibm.com> From: Oliver O'Halloran When configuring the XIVE notification address any currently pending interrupts will be delivered once the the valid bit in the BAR is set. Currently we enable the notify BAR before we've configured the global interrupt number offset for the PSI interrupts. If any PSI interrupt is we'll send an interrupt trigger notification to the XIVE with the wrong interrupt vector (0..15). This can potentially cause a checkstop since there may not be an EAS / IVT configure for that vector. Fix this by registering and masking all the PSI interrupts after we've configured the ESB BAR, but before configuring the notification address and offset. 
Signed-off-by: Oliver O'Halloran Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/psi.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/hw/psi.c b/hw/psi.c index 954b7bf68..de074ce4a 100644 --- a/hw/psi.c +++ b/hw/psi.c @@ -796,16 +796,6 @@ static void psi_init_p10_interrupts(struct psi *psi) flags |= XIVE_SRC_STORE_EOI; } - /* Grab and configure the notification port */ - val = xive2_get_notify_port(psi->chip_id, XIVE_HW_SRC_PSI); - val |= PSIHB_ESB_NOTIF_VALID; - out_be64(psi->regs + PSIHB_ESB_NOTIF_ADDR, val); - - /* Setup interrupt offset */ - val = xive2_get_notify_base(psi->interrupt); - val <<= 32; - out_be64(psi->regs + PSIHB_IVT_OFFSET, val); - /* Register sources */ prlog(PR_DEBUG, "PSI[0x%03x]: Interrupts sources registered for P10 DD%i.%i\n", @@ -820,6 +810,16 @@ static void psi_init_p10_interrupts(struct psi *psi) for (isn = is->start; isn < is->end; isn++) xive2_source_mask(is, isn); + /* Setup interrupt offset */ + val = xive2_get_notify_base(psi->interrupt); + val <<= 32; + out_be64(psi->regs + PSIHB_IVT_OFFSET, val); + + /* Grab and configure the notification port */ + val = xive2_get_notify_port(psi->chip_id, XIVE_HW_SRC_PSI); + val |= PSIHB_ESB_NOTIF_VALID; + out_be64(psi->regs + PSIHB_ESB_NOTIF_ADDR, val); + /* Reset irq handling and switch to ESB mode */ out_be64(psi->regs + PSIHB_INTERRUPT_CONTROL, PSIHB_IRQ_RESET); out_be64(psi->regs + PSIHB_INTERRUPT_CONTROL, 0); -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:22 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:22 +0530 Subject: [Skiboot] [PATCH v2 44/59] xive/p10: Activate split mode for PHB ESBs when PQ_disable is available In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-45-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater 1/3rd of the cache is reserved for PHB ESBs and the rest to IPIs. This is sufficient to keep all the PHB ESBs in cache and avoid ESB cache misses during IO interrupt processing. Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/xive2.c | 25 +++++++++++++++++++++++++ include/xive2-regs.h | 5 +++++ 2 files changed, 30 insertions(+) diff --git a/hw/xive2.c b/hw/xive2.c index 2291e9379..0f9c93d6a 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -1602,6 +1602,29 @@ static bool xive_cfg_save_restore(struct xive *x) return !!(x->config & CQ_XIVE_CFG_EN_VP_SAVE_RESTORE); } +/* + * When PQ_disable is available, configure the ESB cache to improve + * performance for PHB ESBs. + * + * split_mode : + * 1/3rd of the cache is reserved for PHB ESBs and the rest to + * IPIs. This is sufficient to keep all the PHB ESBs in cache and + * avoid ESB cache misses during IO interrupt processing. + */ +static void xive_config_esb_cache(struct xive *x) +{ + uint64_t val = xive_regr(x, VC_ESBC_CFG); + + if (xive_has_cap(x, CQ_XIVE_CAP_PHB_PQ_DISABLE)) { + val |= VC_ESBC_CFG_SPLIT_MODE; + xive_dbg(x, "ESB cache configured with split mode. 
" + "VC_ESBC_CFG=%016llx\n", val); + } else + val &= ~VC_ESBC_CFG_SPLIT_MODE; + + xive_regw(x, VC_ESBC_CFG, val); +} + static void xive_config_fused_core(struct xive *x) { uint64_t val = xive_regr(x, TCTXT_CFG); @@ -1717,6 +1740,8 @@ static bool xive_config_init(struct xive *x) xive_config_fused_core(x); + xive_config_esb_cache(x); + xive_config_reduced_priorities_fixup(x); return true; diff --git a/include/xive2-regs.h b/include/xive2-regs.h index ad1a9b79f..4638c3d89 100644 --- a/include/xive2-regs.h +++ b/include/xive2-regs.h @@ -227,6 +227,11 @@ #define VC_ESBC_FLUSH_POLL_BLOCK_ID_MASK PPC_BITMASK(32,35) #define VC_ESBC_FLUSH_POLL_OFFSET_MASK PPC_BITMASK(36,63) /* 28-bit */ +/* ESBC configuration */ +#define X_VC_ESBC_CFG 0x148 +#define VC_ESBC_CFG 0x240 +#define VC_ESBC_CFG_SPLIT_MODE PPC_BIT(56) + /* EASC flush control register */ #define X_VC_EASC_FLUSH_CTRL 0x160 #define VC_EASC_FLUSH_CTRL 0x300 -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:23 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:23 +0530 Subject: [Skiboot] [PATCH v2 45/59] xive/p10: Activate has_array when PQ_disable is available In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-46-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater hash_array is an Internal cache hashing optimization. It tracks for ESBs where the original trigger came from so that we avoid getting the EAS into the cache twice. Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/xive2.c | 11 ++++++++--- include/xive2-regs.h | 2 ++ 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/hw/xive2.c b/hw/xive2.c index 0f9c93d6a..1ad1f138d 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -1610,15 +1610,20 @@ static bool xive_cfg_save_restore(struct xive *x) * 1/3rd of the cache is reserved for PHB ESBs and the rest to * IPIs. This is sufficient to keep all the PHB ESBs in cache and * avoid ESB cache misses during IO interrupt processing. + * + * hash_array_enable : + * Internal cache hashing optimization. The hash_array tracks for + * ESBs where the original trigger came from so that we avoid + * getting the EAS into the cache twice. */ static void xive_config_esb_cache(struct xive *x) { uint64_t val = xive_regr(x, VC_ESBC_CFG); if (xive_has_cap(x, CQ_XIVE_CAP_PHB_PQ_DISABLE)) { - val |= VC_ESBC_CFG_SPLIT_MODE; - xive_dbg(x, "ESB cache configured with split mode. " - "VC_ESBC_CFG=%016llx\n", val); + val |= VC_ESBC_CFG_SPLIT_MODE | VC_ESBC_CFG_HASH_ARRAY_ENABLE; + xive_dbg(x, "ESB cache configured with split mode " + "and hash array. 
VC_ESBC_CFG=%016llx\n", val); } else val &= ~VC_ESBC_CFG_SPLIT_MODE; diff --git a/include/xive2-regs.h b/include/xive2-regs.h index 4638c3d89..c2ed265f6 100644 --- a/include/xive2-regs.h +++ b/include/xive2-regs.h @@ -230,6 +230,8 @@ /* ESBC configuration */ #define X_VC_ESBC_CFG 0x148 #define VC_ESBC_CFG 0x240 +#define VC_ESBC_CFG_HASH_ARRAY_ENABLE PPC_BIT(40) +#define VC_ESBC_CFG_HASH_STORE_MODE PPC_BITMASK(41,42) #define VC_ESBC_CFG_SPLIT_MODE PPC_BIT(56) /* EASC flush control register */ -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:25 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:25 +0530 Subject: [Skiboot] [PATCH v2 47/59] xive/p10: Change alignment of the queue overflow pages In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-48-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater The Memory Coherence Directory uses 16M "granule" to track shared copies of a cache line. If any cache line within the 16M range gets touched by someone outside of the group, the MCD forces accesses to any cache line within the range to include everyone that might have a shared copy. Allocate the queue overflow pages and use a 16M alignment to avoid sharing with other structures and reduce traffic on the PowerBus. Signed-off-by: C?dric Le Goater Signed-off-by: Vasant Hegde --- hw/xive2.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/hw/xive2.c b/hw/xive2.c index 56b02fc67..a7b45a005 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -1492,6 +1492,8 @@ static bool xive_configure_bars(struct xive *x) xive_dbg(x, "NVP: %14p [0x%012llx]\n", x->nvp_base, x->nvp_size); xive_dbg(x, "ESB: %14p [0x%012llx]\n", x->esb_base, x->esb_size); xive_dbg(x, "END: %14p [0x%012llx]\n", x->end_base, x->end_size); + xive_dbg(x, "OVF: %14p [0x%012x]\n", x->q_ovf, + VC_QUEUE_COUNT * PAGE_SIZE); return true; } @@ -1898,8 +1900,22 @@ static bool xive_prealloc_tables(struct xive *x) return false; } - /* Allocate the queue overflow pages */ - x->q_ovf = local_alloc(x->chip_id, VC_QUEUE_COUNT * PAGE_SIZE, PAGE_SIZE); + /* + * The Memory Coherence Directory uses 16M "granule" to track + * shared copies of a cache line. If any cache line within the + * 16M range gets touched by someone outside of the group, the + * MCD forces accesses to any cache line within the range to + * include everyone that might have a shared copy. + */ +#define QUEUE_OVF_ALIGN (16 << 20) /* MCD granule size */ + + /* + * Allocate the queue overflow pages and use a 16M alignment + * to avoid sharing with other structures and reduce traffic + * on the PowerBus. + */ + x->q_ovf = local_alloc(x->chip_id, VC_QUEUE_COUNT * PAGE_SIZE, + QUEUE_OVF_ALIGN); if (!x->q_ovf) { xive_err(x, "Failed to allocate queue overflow\n"); return false; -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:21 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:21 +0530 Subject: [Skiboot] [PATCH v2 43/59] xive/p10: Introduce a new OPAL_XIVE_IRQ_STORE_EOI2 flag In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-44-hegdevasant@linux.vnet.ibm.com> From: C?dric Le Goater StoreEOI (the capability to EOI with a store) requires load-after-store ordering in some cases to be reliable. 
P10 introduced a new offset for load operations to enforce correct
ordering and the XIVE driver has the required support since kernel 5.8,
commit b1f9be9392f0.

OPAL on P10 will advertise support of StoreEOI with a new flag.

Signed-off-by: Cédric Le Goater
Signed-off-by: Vasant Hegde
---
 hw/xive2.c         | 2 +-
 include/opal-api.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/xive2.c b/hw/xive2.c
index 7ece64251..2291e9379 100644
--- a/hw/xive2.c
+++ b/hw/xive2.c
@@ -3040,7 +3040,7 @@ static uint64_t xive_convert_irq_flags(uint64_t iflags)
 	uint64_t oflags = 0;
 
 	if (iflags & XIVE_SRC_STORE_EOI)
-		oflags |= OPAL_XIVE_IRQ_STORE_EOI;
+		oflags |= OPAL_XIVE_IRQ_STORE_EOI2;
 
 	/* OPAL_XIVE_IRQ_TRIGGER_PAGE is only meant to be set if
	 * the interrupt has a *separate* trigger page.

diff --git a/include/opal-api.h b/include/opal-api.h
index d7b301a30..348fda8c6 100644
--- a/include/opal-api.h
+++ b/include/opal-api.h
@@ -1164,6 +1164,7 @@ enum {
 	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008, /* DD1.0 workaround */
 	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010, /* DD1.0 workaround */
 	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020, /* DD1.0 workaround */
+	OPAL_XIVE_IRQ_STORE_EOI2	= 0x00000040,
 };
 
 /* Flags for OPAL_XIVE_GET/SET_QUEUE_INFO */
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com  Wed Aug  4 17:21:26 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:51:26 +0530
Subject: [Skiboot] [PATCH v2 48/59] hw/phb5: Update PHB numbering to allow
 for virtual PHBs
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-49-hegdevasant@linux.vnet.ibm.com>

From: Frederic Barrat

Make room for a per-chip numbering of virtual PHBs used by opencapi.
We can have up to 12 opencapi PHBs (two per PAU) on P10.
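As a worked example of the numbering this makes room for (sketch; the mapping follows from phb4_get_opal_id() and MAX_PHBS_PER_CHIP_P10 = 0x12 in the diff below, i.e. 18 = 6 PCI + 12 opencapi ids per chip):

    /* On P10: opal_id = chip_id * 18 + index */
    phb4_get_opal_id(0, 0);	/* ->  0: chip 0, first PCI PHB */
    phb4_get_opal_id(0, 5);	/* ->  5: chip 0, last PCI PHB */
    /* indexes 6..17 of each chip are left for the opencapi virtual PHBs */
    phb4_get_opal_id(1, 0);	/* -> 18: chip 1, first PCI PHB */
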
Signed-off-by: Frederic Barrat
Signed-off-by: Vasant Hegde
---
 include/phb4.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/phb4.h b/include/phb4.h
index 217f68462..0bbfc926c 100644
--- a/include/phb4.h
+++ b/include/phb4.h
@@ -245,9 +245,9 @@ static inline void phb4_set_err_pending(struct phb4 *p, bool pending)
 	p->err_pending = pending;
 }
 
-#define MAX_PHBS_PER_CHIP_P10	6	/* Max 6 PHBs per chip on p10 */
 #define MAX_PHBS_PER_CHIP_P9	6	/* Max 6 PHBs per chip on p9 */
 #define MAX_PHBS_PER_CHIP_P9P	0x10	/* extra for virt PHBs */
+#define MAX_PHBS_PER_CHIP_P10	0x12	/* 6 PCI + 12 opencapi */
 
 static inline int phb4_get_opal_id(unsigned int chip_id, unsigned int index)
 {
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com  Wed Aug  4 17:21:27 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:51:27 +0530
Subject: [Skiboot] [PATCH v2 49/59] phb5: Activate StoreEOI for LSIs
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-50-hegdevasant@linux.vnet.ibm.com>

From: Cédric Le Goater

Signed-off-by: Cédric Le Goater
Signed-off-by: Vasant Hegde
---
 hw/phb4.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/phb4.c b/hw/phb4.c
index de314b13f..6700c7fbb 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -6023,7 +6023,7 @@ static void phb4_create(struct dt_node *np)
 		 */
 		xive2_register_hw_source(p->base_lsi, 8, 16,
 					 p->int_mmio + ((p->num_irqs - 8) << 16),
-					 XIVE_SRC_LSI, p, &phb4_lsi_ops);
+					 XIVE_SRC_LSI | irq_flags, p, &phb4_lsi_ops);
 	} else {
 		/* Register all interrupt sources with XIVE */
 		xive_register_hw_source(p->base_msi, p->num_irqs - 8, 16,
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com  Wed Aug  4 17:21:28 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:51:28 +0530
Subject: [Skiboot] [PATCH v2 50/59] phb5: Add register inits specific to Gen5
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-51-hegdevasant@linux.vnet.ibm.com>

From: Frederic Barrat

Update init sequence to take into account Gen5. Define default
equalization settings if HDAT is not used.
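In terms of sizing, the change to the "ibm,lane-eq" length check in the diff below boils down to this (sketch):

    eq_reg_count    = is_phb5() ? 8 : 6;	/* two extra Gen5 registers */
    lane_eq_len_req = eq_reg_count * 8;		/* 64 bytes on PHB5, 48 on PHB4 */
    /* an "ibm,lane-eq" property shorter than this is reported as too short */
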
Signed-off-by: Frederic Barrat Signed-off-by: Vasant Hegde --- hw/phb4.c | 18 +++++++++++++----- include/phb4-regs.h | 6 ++++-- 2 files changed, 17 insertions(+), 7 deletions(-) diff --git a/hw/phb4.c b/hw/phb4.c index 6700c7fbb..0e98042ce 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -5387,8 +5387,12 @@ static void phb4_init_hw(struct phb4 *p) out_be64(p->regs + PHB_PCIE_LANE_EQ_CNTL1, be64_to_cpu(p->lane_eq[1])); out_be64(p->regs + PHB_PCIE_LANE_EQ_CNTL2, be64_to_cpu(p->lane_eq[2])); out_be64(p->regs + PHB_PCIE_LANE_EQ_CNTL3, be64_to_cpu(p->lane_eq[3])); - out_be64(p->regs + PHB_PCIE_LANE_EQ_CNTL20, be64_to_cpu(p->lane_eq[4])); - out_be64(p->regs + PHB_PCIE_LANE_EQ_CNTL21, be64_to_cpu(p->lane_eq[5])); + out_be64(p->regs + PHB_PCIE_LANE_EQ_CNTL40, be64_to_cpu(p->lane_eq[4])); + out_be64(p->regs + PHB_PCIE_LANE_EQ_CNTL41, be64_to_cpu(p->lane_eq[5])); + if (is_phb5()) { + out_be64(p->regs + PHB_PCIE_LANE_EQ_CNTL50, be64_to_cpu(p->lane_eq[6])); + out_be64(p->regs + PHB_PCIE_LANE_EQ_CNTL51, be64_to_cpu(p->lane_eq[7])); + } } if (!p->lane_eq_en) { /* Read modify write and set to 2 bits */ @@ -5830,7 +5834,7 @@ static __be64 lane_eq_phb5_default[8] = { CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), - CPU_TO_BE64(0x4444444444444444UL), CPU_TO_BE64(0x4444444444444444UL), + CPU_TO_BE64(0x9999999999999999UL), CPU_TO_BE64(0x9999999999999999UL), }; static void phb4_create(struct dt_node *np) @@ -5842,7 +5846,7 @@ static void phb4_create(struct dt_node *np) struct dt_node *iplp; char *path; uint32_t irq_base, irq_flags; - int i; + int i, eq_reg_count; int chip_id; chip_id = dt_prop_get_u32(np, "ibm,chip-id"); @@ -5942,7 +5946,11 @@ static void phb4_create(struct dt_node *np) /* Check for lane equalization values from HB or HDAT */ p->lane_eq_en = true; p->lane_eq = dt_prop_get_def_size(np, "ibm,lane-eq", NULL, &lane_eq_len); - lane_eq_len_req = 6 * 8; + if (is_phb5()) + eq_reg_count = 8; + else + eq_reg_count = 6; + lane_eq_len_req = eq_reg_count * 8; if (p->lane_eq) { if (lane_eq_len < lane_eq_len_req) { PHBERR(p, "Device-tree has ibm,lane-eq too short: %ld" diff --git a/include/phb4-regs.h b/include/phb4-regs.h index 99633e103..8ab78c377 100644 --- a/include/phb4-regs.h +++ b/include/phb4-regs.h @@ -295,8 +295,10 @@ #define PHB_PCIE_LANE_EQ_CNTL1 0x1AD8 #define PHB_PCIE_LANE_EQ_CNTL2 0x1AE0 #define PHB_PCIE_LANE_EQ_CNTL3 0x1AE8 -#define PHB_PCIE_LANE_EQ_CNTL20 0x1AF0 -#define PHB_PCIE_LANE_EQ_CNTL21 0x1AF8 +#define PHB_PCIE_LANE_EQ_CNTL40 0x1AF0 +#define PHB_PCIE_LANE_EQ_CNTL41 0x1AF8 +#define PHB_PCIE_LANE_EQ_CNTL50 0x1B00 +#define PHB_PCIE_LANE_EQ_CNTL51 0x1B08 #define PHB_PCIE_TRACE_CTRL 0x1B20 #define PHB_PCIE_MISC_STRAP 0x1B30 #define PHB_PCIE_PDL_PHY_EQ_CNTL 0x1B38 -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:29 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:29 +0530 Subject: [Skiboot] [PATCH v2 51/59] phb5: Enable Gen5 In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-52-hegdevasant@linux.vnet.ibm.com> From: Michael Neuling Registers for Gen5 have been initialized in a previous patch. So let's activate it! 
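The resulting speed selection can be summarized as below. This is a sketch: dt_speed stands for the device-tree value and pcie_max_link_speed is the existing NVRAM override; the actual code lives in phb4_get_max_link_speed():

    hw_max = is_phb5() ? 5 : 4;		/* hardware ceiling */
    max = hw_max;			/* default */
    if (dt_speed)
    	max = dt_speed;			/* device tree may lower it */
    if (pcie_max_link_speed)
    	max = pcie_max_link_speed;	/* NVRAM has priority... */
    if (max > hw_max)
    	max = hw_max;			/* ...but is clamped to the ceiling */
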
Signed-off-by: Michael Neuling Signed-off-by: Vasant Hegde --- hw/phb4.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/hw/phb4.c b/hw/phb4.c index 0e98042ce..9bc8d47ee 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -3008,12 +3008,16 @@ static int64_t phb4_poll_link(struct pci_slot *slot) static unsigned int phb4_get_max_link_speed(struct phb4 *p, struct dt_node *np) { - unsigned int max_link_speed; + unsigned int max_link_speed, hw_max_link_speed; struct proc_chip *chip; chip = get_chip(p->chip_id); + hw_max_link_speed = 4; + if (is_phb5()) + hw_max_link_speed = 5; + /* Priority order: NVRAM -> dt -> GEN3 dd2.00 -> GEN4 */ - max_link_speed = 4; + max_link_speed = hw_max_link_speed; if (p->rev == PHB4_REV_NIMBUS_DD20 && ((0xf & chip->ec_level) == 0) && chip->ec_rev == 0) max_link_speed = 3; @@ -3033,8 +3037,8 @@ static unsigned int phb4_get_max_link_speed(struct phb4 *p, struct dt_node *np) } if (pcie_max_link_speed) max_link_speed = pcie_max_link_speed; - if (max_link_speed > 4) /* clamp to 4 */ - max_link_speed = 4; + if (max_link_speed > hw_max_link_speed) + max_link_speed = hw_max_link_speed; return max_link_speed; } -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:30 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:30 +0530 Subject: [Skiboot] [PATCH v2 52/59] phb5: Workaround for PCI bug HW551382 In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-53-hegdevasant@linux.vnet.ibm.com> From: Frederic Barrat The workaround forces a state machine deep in the PHB to start from scratch and to block its evolution until after the link has been reset. It applies on all paths where the link can go down unexpectedly, though it's probably useless on the creset path, since we're going to deep-reset the PHB anyway. But it doesn't hurt and it keeps the set/unset path symmetrical. 
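On each affected reset path, the pattern is the same (sketch; set_sys_disable_detect() is the helper added by this patch):

    /* Freeze the DLP state machine before the link goes down... */
    set_sys_disable_detect(p, true);

    /* ...assert PERST / reset and retrain the link... */

    /* ...and let the state machine start from scratch afterwards */
    set_sys_disable_detect(p, false);
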
Signed-off-by: Frederic Barrat Signed-off-by: Vasant Hegde --- hw/phb4.c | 35 +++++++++++++++++++++++++++++++++++ include/phb4-regs.h | 2 +- 2 files changed, 36 insertions(+), 1 deletion(-) diff --git a/hw/phb4.c b/hw/phb4.c index 9bc8d47ee..e30339fab 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -3072,6 +3072,18 @@ static void phb4_assert_perst(struct pci_slot *slot, bool assert) phb4_pcicfg_write16(&p->phb, 0, p->ecap + PCICAP_EXP_LCTL, linkctl); } +static void set_sys_disable_detect(struct phb4 *p, bool set) +{ + uint64_t val; + + val = in_be64(p->regs + PHB_PCIE_DLP_TRAIN_CTL); + if (set) + val |= PHB_PCIE_DLP_SYS_DISABLEDETECT; + else + val &= ~PHB_PCIE_DLP_SYS_DISABLEDETECT; + out_be64(p->regs + PHB_PCIE_DLP_TRAIN_CTL, val); +} + static int64_t phb4_hreset(struct pci_slot *slot) { struct phb4 *p = phb_to_phb4(slot->phb); @@ -3088,6 +3100,12 @@ static int64_t phb4_hreset(struct pci_slot *slot) return OPAL_SUCCESS; } + /* circumvention for HW551382 */ + if (is_phb5()) { + PHBINF(p, "HRESET: Workaround for HW551382\n"); + set_sys_disable_detect(p, true); + } + PHBDBG(p, "HRESET: Prepare for link down\n"); phb4_prepare_link_change(slot, false); /* fall through */ @@ -3120,6 +3138,8 @@ static int64_t phb4_hreset(struct pci_slot *slot) pci_slot_set_state(slot, PHB4_SLOT_HRESET_DELAY2); return pci_slot_set_sm_timeout(slot, secs_to_tb(1)); case PHB4_SLOT_HRESET_DELAY2: + if (is_phb5()) + set_sys_disable_detect(p, false); pci_slot_set_state(slot, PHB4_SLOT_LINK_START); return slot->ops.poll_link(slot); default: @@ -3146,6 +3166,12 @@ static int64_t phb4_freset(struct pci_slot *slot) phb4_prepare_link_change(slot, false); if (!p->skip_perst) { + /* circumvention for HW551382 */ + if (is_phb5()) { + PHBINF(p, "FRESET: Workaround for HW551382\n"); + set_sys_disable_detect(p, true); + } + PHBDBG(p, "FRESET: Assert\n"); phb4_assert_perst(slot, true); pci_slot_set_state(slot, PHB4_SLOT_FRESET_ASSERT_DELAY); @@ -3169,6 +3195,9 @@ static int64_t phb4_freset(struct pci_slot *slot) if (pci_tracing) phb4_link_trace(p, PHB_PCIE_DLP_LTSSM_L0, 3000); + if (is_phb5()) + set_sys_disable_detect(p, false); + pci_slot_set_state(slot, PHB4_SLOT_LINK_START); return slot->ops.poll_link(slot); default: @@ -3398,6 +3427,12 @@ static int64_t phb4_creset(struct pci_slot *slot) p->creset_start_time = mftb(); + /* circumvention for HW551382 */ + if (is_phb5()) { + PHBINF(p, "CRESET: Workaround for HW551382\n"); + set_sys_disable_detect(p, true); + } + phb4_prepare_link_change(slot, false); /* Clear error inject register, preventing recursive errors */ xscom_write(p->chip_id, p->pe_xscom + 0x2, 0x0); diff --git a/include/phb4-regs.h b/include/phb4-regs.h index 8ab78c377..85d2cf2ea 100644 --- a/include/phb4-regs.h +++ b/include/phb4-regs.h @@ -275,7 +275,7 @@ #define PHB_PCIE_DLP_DL_PGRESET PPC_BIT(22) #define PHB_PCIE_DLP_TRAINING PPC_BIT(20) #define PHB_PCIE_DLP_INBAND_PRESENCE PPC_BIT(19) - +#define PHB_PCIE_DLP_SYS_DISABLEDETECT PPC_BIT(12) #define PHB_PCIE_DLP_CTL 0x1A78 #define PHB_PCIE_DLP_CTL_BYPASS_PH2 PPC_BIT(4) #define PHB_PCIE_DLP_CTL_BYPASS_PH3 PPC_BIT(5) -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:31 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:31 +0530 Subject: [Skiboot] [PATCH v2 53/59] phb4: Cleanup PEC config discovery in CAPI mode In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-54-hegdevasant@linux.vnet.ibm.com> From: 
Frederic Barrat Small cleanup when reading the PEC config when setting up CAPI, in preparation for P10. Scom addresses vary between P9 and P10 and we'll be accessing more than one PCI chiplet. No functional change. Signed-off-by: Frederic Barrat Signed-off-by: Vasant Hegde --- hw/phb4.c | 17 +++++++++-------- include/phb4-regs.h | 10 +++++++--- 2 files changed, 16 insertions(+), 11 deletions(-) diff --git a/hw/phb4.c b/hw/phb4.c index e30339fab..8857a8ab5 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -4282,7 +4282,7 @@ static int64_t phb4_get_capp_info(int chip_id, struct phb *phb, static void phb4_init_capp_regs(struct phb4 *p, uint32_t capp_eng) { - uint64_t reg; + uint64_t addr, reg; uint32_t offset; uint8_t link_width_x16 = 1; @@ -4293,9 +4293,10 @@ static void phb4_init_capp_regs(struct phb4 *p, uint32_t capp_eng) /* Check if PEC2 is in x8 or x16 mode. * PEC0 is always in x16 */ - xscom_read(p->chip_id, XPEC_PCI2_CPLT_CONF1, ®); - link_width_x16 = ((reg & XPEC_PCI2_IOVALID_MASK) == - XPEC_PCI2_IOVALID_X16); + addr = XPEC_P9_PCI_CPLT_CONF1 + 2 * XPEC_PCI_CPLT_OFFSET; + xscom_read(p->chip_id, addr, ®); + link_width_x16 = ((reg & XPEC_P9_PCI_IOVALID_MASK) == + XPEC_P9_PCI_IOVALID_X16); } /* APC Master PowerBus Control Register */ @@ -4515,7 +4516,7 @@ static void phb4_init_capp_errors(struct phb4 *p) static int64_t enable_capi_mode(struct phb4 *p, uint64_t pe_number, uint32_t capp_eng) { - uint64_t reg, start_addr, end_addr, stq_eng, dma_eng; + uint64_t addr, reg, start_addr, end_addr, stq_eng, dma_eng; uint64_t mbt0, mbt1; int i, window_num = -1; @@ -4553,9 +4554,9 @@ static int64_t enable_capi_mode(struct phb4 *p, uint64_t pe_number, if (p->index == CAPP1_PHB_INDEX) { /* Check if PEC is in x8 or x16 mode */ - xscom_read(p->chip_id, XPEC_PCI2_CPLT_CONF1, ®); - - if ((reg & XPEC_PCI2_IOVALID_MASK) == XPEC_PCI2_IOVALID_X16) { + addr = XPEC_P9_PCI_CPLT_CONF1 + 2 * XPEC_PCI_CPLT_OFFSET; + xscom_read(p->chip_id, addr, ®); + if ((reg & XPEC_P9_PCI_IOVALID_MASK) == XPEC_P9_PCI_IOVALID_X16) { /* PBCQ is operating as a x16 stack * - The maximum number of engines give to CAPP will be * 14 and will be assigned in the order of STQ 15 to 2. diff --git a/include/phb4-regs.h b/include/phb4-regs.h index 85d2cf2ea..b4a94c056 100644 --- a/include/phb4-regs.h +++ b/include/phb4-regs.h @@ -385,9 +385,13 @@ /* PCI Chiplet Config Register */ -#define XPEC_PCI2_CPLT_CONF1 0x000000000F000009ULL -#define XPEC_PCI2_IOVALID_MASK PPC_BITMASK(4, 6) -#define XPEC_PCI2_IOVALID_X16 PPC_BIT(4) +#define XPEC_PCI_CPLT_OFFSET 0x1000000ULL +#define XPEC_P9_PCI_CPLT_CONF1 0x000000000D000009ULL +#define XPEC_P9_PCI_IOVALID_MASK PPC_BITMASK(4, 6) +#define XPEC_P9_PCI_IOVALID_X16 PPC_BIT(4) +#define XPEC_P9_PCI_LANE_CFG PPC_BITMASK(10, 11) +#define XPEC_P10_PCI_CPLT_CONF1 0x0000000008000009ULL +#define XPEC_P10_PCI_LANE_CFG PPC_BITMASK(0, 1) /* * IODA3 on-chip tables -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:32 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:32 +0530 Subject: [Skiboot] [PATCH v2 54/59] phb4/5: Fix PHB link width detection to avoid useless retrainings In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-55-hegdevasant@linux.vnet.ibm.com> From: Frederic Barrat On P9 and P10, the PCI express controller (PEC) controls a set of 16 lanes, which can be grouped to form link(s) of various width (4, 8 or 16 lanes). 
A PCI host bridge (PHB) handles each link. How many PHBs are active
in each PEC is configurable per chip and can vary between the two
chips of a system, so PHBs can end up with different link widths. The
link width of the PHB is used to check if the link is trained
optimally and can cause link training retries if that's not the case.

We were reading the max link width of a PHB from the link capability
register of the PCI express capability of the root bridge. But that
value is always an overshoot, as it needs to accommodate any PEC
configuration. It was hard to notice on P9, as a PEC needs to be
trifurcated before a difference appears, and the device-supported
width can also mask it. But on P10, it's also noticeable on a
bifurcated configuration, so it's a bit easier to spot.

For example, on P10, PHB0 reports a supported width of 16 in its link
capability register because that's what is needed in case of no
furcation, but if the PEC is bifurcated or trifurcated, only 8 lanes
are wired. So we won't be able to train at more than x8. If we
believe the PHB is x16-capable, then we'll retrain the link,
potentially several times, thinking it's not optimal, which is a
waste of time.

This patch finds out the real maximum link width of each PHB, which
may require checking the PEC configuration. The logic is the same on
P9 and P10, though the hardware implementations differ slightly.

Signed-off-by: Frederic Barrat
Signed-off-by: Vasant Hegde
---
 hw/phb4.c      | 91 ++++++++++++++++++++++++++++++++++++++++++++------
 include/phb4.h |  1 +
 2 files changed, 81 insertions(+), 11 deletions(-)

diff --git a/hw/phb4.c b/hw/phb4.c
index 8857a8ab5..b173e25ca 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -2726,18 +2726,14 @@ static bool phb4_link_optimal(struct pci_slot *slot, uint32_t *vdid)
 	uint64_t reg;
 	uint32_t id;
 	uint16_t bdfn, lane_errs;
-	uint8_t trained_speed, phb_speed, dev_speed, target_speed, rx_errs;
-	uint8_t trained_width, phb_width, dev_width, target_width;
+	uint8_t trained_speed, dev_speed, target_speed, rx_errs;
+	uint8_t trained_width, dev_width, target_width;
 	bool optimal_speed, optimal_width, optimal, retry_enabled, rx_err_ok;
 
 	/* Current trained state */
 	phb4_get_link_info(slot, &trained_speed, &trained_width);
 
-	/* Get PHB capability */
-	/* NOTE: phb_speed will account for the software speed limit */
-	phb4_get_info(slot->phb, 0, &phb_speed, &phb_width);
-
 	/* Get device capability */
 	bdfn = 0x0100; /* bus=1 dev=0 device=0 */
 	/* Since this is the first access, we need to wait for CRS */
@@ -2746,9 +2742,9 @@ static bool phb4_link_optimal(struct pci_slot *slot, uint32_t *vdid)
 	phb4_get_info(slot->phb, bdfn, &dev_speed, &dev_width);
 
 	/* Work out if we are optimally trained */
-	target_speed = MIN(phb_speed, dev_speed);
+	target_speed = MIN(p->max_link_speed, dev_speed);
 	optimal_speed = (trained_speed >= target_speed);
-	target_width = MIN(phb_width, dev_width);
+	target_width = MIN(p->max_link_width, dev_width);
 	optimal_width = (trained_width >= target_width);
 	optimal = optimal_width && optimal_speed;
 	retry_enabled = (phb4_chip_retry_workaround() &&
@@ -2764,9 +2760,11 @@ static bool phb4_link_optimal(struct pci_slot *slot, uint32_t *vdid)
 	       DEVICE(id), optimal ? "Optimal" : "Degraded",
 	       retry_enabled ? "enabled" : "disabled");
 	PHBDBG(p, "LINK: Speed Train:GEN%i PHB:GEN%i DEV:GEN%i%s\n",
-	       trained_speed, phb_speed, dev_speed, optimal_speed ? "" : " *");
+	       trained_speed, p->max_link_speed, dev_speed,
+	       optimal_speed ? "" : " *");
 	PHBDBG(p, "LINK: Width Train:x%02i PHB:x%02i DEV:x%02i%s\n",
-	       trained_width, phb_width, dev_width, optimal_width ? "" : " *");
+	       trained_width, p->max_link_width, dev_width,
+	       optimal_width ? "" : " *");
 	PHBDBG(p, "LINK: RX Errors Now:%i Max:%i Lane:0x%04x%s\n",
 	       rx_errs, rx_err_max, lane_errs, rx_err_ok ? "" : " *");
 
@@ -3043,6 +3041,75 @@ static unsigned int phb4_get_max_link_speed(struct phb4 *p, struct dt_node *np)
 	return max_link_speed;
 }
 
+static unsigned int __phb4_get_max_link_width(struct phb4 *p)
+{
+	uint64_t addr, reg;
+	unsigned int lane_config, width = 16;
+
+	/*
+	 * On P9, only PEC2 is configurable (no-/bi-/tri-furcation)
+	 */
+	switch (p->pec) {
+	case 0:
+		width = 16;
+		break;
+	case 1:
+		width = 8;
+		break;
+	case 2:
+		addr = XPEC_P9_PCI_CPLT_CONF1 + 2 * XPEC_PCI_CPLT_OFFSET;
+		xscom_read(p->chip_id, addr, &reg);
+		lane_config = GETFIELD(XPEC_P9_PCI_LANE_CFG, reg);
+
+		if (lane_config == 0b10 && p->index >= 4)
+			width = 4;
+		else
+			width = 8;
+	}
+	return width;
+}
+
+static unsigned int __phb5_get_max_link_width(struct phb4 *p)
+{
+	uint64_t addr, reg;
+	unsigned int lane_config, width = 16;
+
+	/*
+	 * On P10, the 2 PECs are identical and each can have a
+	 * different furcation, so we always need to check the PEC
+	 * config
+	 */
+	addr = XPEC_P10_PCI_CPLT_CONF1 + p->pec * XPEC_PCI_CPLT_OFFSET;
+	xscom_read(p->chip_id, addr, &reg);
+	lane_config = GETFIELD(XPEC_P10_PCI_LANE_CFG, reg);
+
+	switch (lane_config) {
+	case 0b00:
+		width = 16;
+		break;
+	case 0b01:
+		width = 8;
+		break;
+	case 0b10:
+		if (p->index == 0 || p->index == 3)
+			width = 8;
+		else
+			width = 4;
+		break;
+	default:
+		PHBERR(p, "Unexpected PEC lane config value %#x\n",
+		       lane_config);
+	}
+	return width;
+}
+
+static unsigned int phb4_get_max_link_width(struct phb4 *p)
+{
+	if (is_phb5())
+		return __phb5_get_max_link_width(p);
+	else
+		return __phb4_get_max_link_width(p);
+}
 
 static void phb4_assert_perst(struct pci_slot *slot, bool assert)
 {
@@ -5981,7 +6048,9 @@ static void phb4_create(struct dt_node *np)
 		goto failed;
 
 	p->max_link_speed = phb4_get_max_link_speed(p, np);
-	PHBINF(p, "Max link speed: GEN%i\n", p->max_link_speed);
+	p->max_link_width = phb4_get_max_link_width(p);
+	PHBINF(p, "Max link speed: GEN%i, max link width %i\n",
+	       p->max_link_speed, p->max_link_width);
 
 	/* Check for lane equalization values from HB or HDAT */
 	p->lane_eq_en = true;
diff --git a/include/phb4.h b/include/phb4.h
index 0bbfc926c..4f1fb31c5 100644
--- a/include/phb4.h
+++ b/include/phb4.h
@@ -197,6 +197,7 @@ struct phb4 {
 	bool lane_eq_en;
 	unsigned int max_link_speed;
 	unsigned int dt_max_link_speed;
+	unsigned int max_link_width;
 
 	uint64_t mrt_size;
 	uint64_t mbt_size;
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:33 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:51:33 +0530
Subject: [Skiboot] [PATCH v2 55/59] phb5: Fix PHB max link speed definition on P10
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-56-hegdevasant@linux.vnet.ibm.com>

From: Frederic Barrat

Not all PHBs are capable of GEN5 speed on P10. In all PEC
configurations, the first PHB is the only one which can handle GEN5.
Signed-off-by: Frederic Barrat Signed-off-by: Vasant Hegde --- hw/phb4.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/phb4.c b/hw/phb4.c index b173e25ca..79083d4a1 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -3011,10 +3011,10 @@ static unsigned int phb4_get_max_link_speed(struct phb4 *p, struct dt_node *np) chip = get_chip(p->chip_id); hw_max_link_speed = 4; - if (is_phb5()) + if (is_phb5() && (p->index == 0 || p->index == 3)) hw_max_link_speed = 5; - /* Priority order: NVRAM -> dt -> GEN3 dd2.00 -> GEN4 */ + /* Priority order: NVRAM -> dt -> GEN3 dd2.00 -> hw default */ max_link_speed = hw_max_link_speed; if (p->rev == PHB4_REV_NIMBUS_DD20 && ((0xf & chip->ec_level) == 0) && chip->ec_rev == 0) -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:35 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:35 +0530 Subject: [Skiboot] [PATCH v2 57/59] xive2: Add NCU_SPEC_BAR to stop engine for restore In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-58-hegdevasant@linux.vnet.ibm.com> From: Vaidyanathan Srinivasan P10 Stop engines have apis similar to P9 to set xscom restores after wakeup from deep-sleep states. This xscom restore will be used to support STOP11 on P10. Signed-off-by: Vaidyanathan Srinivasan Signed-off-by: Pratik Rajesh Sampat Signed-off-by: Vasant Hegde --- hw/xive2.c | 29 ++++++++++++++++++++++++----- 1 file changed, 24 insertions(+), 5 deletions(-) diff --git a/hw/xive2.c b/hw/xive2.c index a7b45a005..aece99a0d 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -20,8 +20,7 @@ #include #include #include -#include /* TODO (p10): need P10 stop state engine */ - +#include /* Verbose debug */ #undef XIVE_VERBOSE_DEBUG @@ -3014,10 +3013,30 @@ static void xive_configure_ex_special_bar(struct xive *x, struct cpu_thread *c) void xive2_late_init(void) { + struct cpu_thread *c; + prlog(PR_INFO, "SLW: Configuring self-restore for NCU_SPEC_BAR\n"); - /* - * TODO (p10): need P10 stop state engine and fix for STOP11 - */ + for_each_present_cpu(c) { + if(cpu_is_thread0(c)) { + struct proc_chip *chip = get_chip(c->chip_id); + struct xive *x = chip->xive; + uint64_t xa, val, rc; + xa = XSCOM_ADDR_P10_NCU(pir_to_core_id(c->pir), P10_NCU_SPEC_BAR); + val = (uint64_t)x->tm_base | P10_NCU_SPEC_BAR_ENABLE; + /* Bail out if wakeup engine has already failed */ + if (wakeup_engine_state != WAKEUP_ENGINE_PRESENT) { + prlog(PR_ERR, "XIVE proc_stop_api fail detected\n"); + break; + } + rc = proc_stop_save_scom((void *)chip->homer_base, xa, val, + PROC_STOP_SCOM_REPLACE, PROC_STOP_SECTION_L3); + if (rc) { + xive_cpu_err(c, "proc_stop_save_scom failed for NCU_SPEC_BAR rc=%lld\n", + rc); + wakeup_engine_state = WAKEUP_ENGINE_FAILED; + } + } + } } static void xive_provision_cpu(struct xive_cpu_state *xs, struct cpu_thread *c) -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:36 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Wed, 4 Aug 2021 12:51:36 +0530 Subject: [Skiboot] [PATCH v2 58/59] hw/chiptod: Retry the sync procedure on failure In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <20210804072137.1147875-59-hegdevasant@linux.vnet.ibm.com> From: Ryan Grimm The chiptod sync will sometimes fail and then sync successfully after a retry. 
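The control flow is just a bounded retry loop around the existing
job (a minimal sketch; run_sync() is a hypothetical stand-in for the
cpu_wait_job(cpu_queue_job(...)) sequence in the patch):

        int i = NUM_SYNC_RETRIES;       /* 10 */
        bool sres;

        do {
                sres = false;
                run_sync(&sres);
        } while (!sres && i--);

        /* one initial attempt plus up to NUM_SYNC_RETRIES retries */
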
So, retry an arbitrary 10 times before we either abort() on main
procedure fail or disable threads on secondary procedure fail.

Also, put a message in the log if secondaries fail so we have
evidence when they aren't enabled.

Signed-off-by: Ryan Grimm
Signed-off-by: Vasant Hegde
---
 hw/chiptod.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/hw/chiptod.c b/hw/chiptod.c
index 3b57f5f16..fd9414990 100644
--- a/hw/chiptod.c
+++ b/hw/chiptod.c
@@ -221,6 +221,8 @@ static uint64_t base_tfmr;
 static struct lock chiptod_lock = LOCK_UNLOCKED;
 static bool chiptod_unrecoverable;
 
+#define NUM_SYNC_RETRIES 10
+
 static void _chiptod_cache_tod_regs(int32_t chip_id)
 {
 	int i;
@@ -892,7 +894,7 @@ static void chiptod_sync_master(void *data)
 	*result = true;
 	return;
  error:
-	prerror("Master sync failed! TFMR=0x%016lx\n", mfspr(SPR_TFMR));
+	prerror("Master sync failed! TFMR=0x%016lx, retrying...\n", mfspr(SPR_TFMR));
 	*result = false;
 }
 
@@ -962,7 +964,7 @@ static void chiptod_sync_slave(void *data)
 	*result = true;
 	return;
  error:
-	prerror("Slave sync failed ! TFMR=0x%016lx\n", mfspr(SPR_TFMR));
+	prerror("Slave sync failed ! TFMR=0x%016lx, retrying...\n", mfspr(SPR_TFMR));
 	*result = false;
 }
 
@@ -1818,6 +1820,7 @@ void chiptod_init(void)
 {
 	struct cpu_thread *cpu0, *cpu;
 	bool sres;
+	int i;
 
 	/* Mambo and qemu doesn't simulate the chiptod */
 	if (chip_quirk(QUIRK_NO_CHIPTOD))
@@ -1841,10 +1844,14 @@ void chiptod_init(void)
 
 	prlog(PR_DEBUG, "Base TFMR=0x%016llx\n", base_tfmr);
 
-	/* Schedule master sync */
-	sres = false;
-	cpu_wait_job(cpu_queue_job(cpu0, "chiptod_sync_master",
+	i = NUM_SYNC_RETRIES;
+	do {
+		/* Schedule master sync */
+		sres = false;
+		cpu_wait_job(cpu_queue_job(cpu0, "chiptod_sync_master",
 				   chiptod_sync_master, &sres), true);
+	} while (!sres && i--);
+
 	if (!sres) {
 		op_display(OP_FATAL, OP_MOD_CHIPTOD, 2);
 		abort();
@@ -1858,13 +1865,19 @@ void chiptod_init(void)
 		if (cpu == cpu0)
 			continue;
 
-		/* Queue job */
-		sres = false;
-		cpu_wait_job(cpu_queue_job(cpu, "chiptod_sync_slave",
-					   chiptod_sync_slave, &sres),
-			     true);
+		i = NUM_SYNC_RETRIES;
+		do {
+			/* Queue job */
+			sres = false;
+			cpu_wait_job(cpu_queue_job(cpu, "chiptod_sync_slave",
+						   chiptod_sync_slave, &sres),
+				     true);
+		} while (!sres && i--);
+
 		if (!sres) {
 			op_display(OP_WARN, OP_MOD_CHIPTOD, 3|(cpu->pir << 8));
+			prerror("CHIPTOD: Failed to sync PIR 0x%04x\n",
+				cpu->pir);
 
 			/* Disable threads */
 			cpu_disable_all_threads(cpu);
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:37 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:51:37 +0530
Subject: [Skiboot] [PATCH v2 59/59] hw/chiptod: Abort if core frequency is not set
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-60-hegdevasant@linux.vnet.ibm.com>

Signed-off-by: Vasant Hegde
Signed-off-by: Reza Arbab
Signed-off-by: Vasant Hegde
---
 hw/chiptod.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/chiptod.c b/hw/chiptod.c
index fd9414990..7c0a1ffc7 100644
--- a/hw/chiptod.c
+++ b/hw/chiptod.c
@@ -499,6 +499,12 @@ static void chiptod_setup_base_tfmr(void)
 		core_freq = dt_prop_get_u64(cpu, "ibm,extended-clock-frequency");
 	else
 		core_freq = dt_prop_get_u32(cpu, "clock-frequency");
+
+	if (!core_freq) {
+		prlog(PR_ERR, "CPU clock frequency is not set\n");
+		abort();
+	}
+
 	tod_freq = 32000000;
 
 	/* Calculate the "Max Cycles Between Steps"
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:34 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:51:34 +0530
Subject: [Skiboot] [PATCH v2 56/59] libpore: P10 stop-api support
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-57-hegdevasant@linux.vnet.ibm.com>

From: Pratik Rajesh Sampat

Update libpore with the P10 STOP API. Add minor changes to make the
P9 stop-api and the P10 stop-api co-exist in OPAL. These calls are
required for STOP11 support on P10.

STOP0,2,3 on P10 do not lose full core state or scoms. The stop-api
based restore of SPRs or xscoms is required only for STOP11 on P10.
STOP11 on P10 will be a limited lab test/stress feature and not a
product feature. (Same case as P9)

Co-authored-by: Pratik Rajesh Sampat
Signed-off-by: Pratik Rajesh Sampat
Co-authored-by: Vaidyanathan Srinivasan
Signed-off-by: Vaidyanathan Srinivasan
Co-authored-by: Ryan Grimm
Signed-off-by: Ryan Grimm
Signed-off-by: Vasant Hegde
---
 core/direct-controls.c                    |   31 +-
 hw/slw.c                                  |   86 +-
 include/p10_stop_api.H                    |  239 +++
 libpore/Makefile.inc                      |    2 +-
 libpore/p10_cpu_reg_restore_instruction.H |   88 +
 libpore/p10_hcd_header_defs.H             |  152 ++
 libpore/p10_hcd_memmap_base.H             |  463 ++++++
 libpore/p10_hcd_memmap_homer.H            |   94 ++
 libpore/p10_hcd_memmap_occ_sram.H         |  174 ++
 libpore/p10_hcode_image_defines.H         |  462 ++++++
 libpore/p10_stop_api.C                    | 1816 +++++++++++++++++++++
 libpore/p10_stop_api.H                    |  238 +++
 libpore/p10_stop_data_struct.H            |  162 ++
 libpore/p10_stop_util.C                   |  190 +++
 libpore/p10_stop_util.H                   |  123 ++
 15 files changed, 4311 insertions(+), 9 deletions(-)
 create mode 100644 include/p10_stop_api.H
 create mode 100644 libpore/p10_cpu_reg_restore_instruction.H
 create mode 100644 libpore/p10_hcd_header_defs.H
 create mode 100644 libpore/p10_hcd_memmap_base.H
 create mode 100644 libpore/p10_hcd_memmap_homer.H
 create mode 100644 libpore/p10_hcd_memmap_occ_sram.H
 create mode 100644 libpore/p10_hcode_image_defines.H
 create mode 100644 libpore/p10_stop_api.C
 create mode 100644 libpore/p10_stop_api.H
 create mode 100644 libpore/p10_stop_data_struct.H
 create mode 100644 libpore/p10_stop_util.C
 create mode 100644 libpore/p10_stop_util.H

diff --git a/core/direct-controls.c b/core/direct-controls.c
index f7509dde0..37bcf9826 100644
--- a/core/direct-controls.c
+++ b/core/direct-controls.c
@@ -602,14 +602,37 @@ static int p10_core_set_special_wakeup(struct cpu_thread *cpu)
 	 * CORE_GATED will be unset on a successful special
 	 * wakeup of the core which indicates that the core is
 	 * out of stop state. If CORE_GATED is still set then
-	 * raise error.
+	 * check SPWU register and raise error only if SPWU_DONE
+	 * is not set, else print a warning and consider SPWU
+	 * operation as successful.
+	 * This is in conjunction with a microcode bug, which
+	 * calls out the fact that SPW can succeed in the case
+	 * the core is gated but SPWU_HYP bit is set.
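+	 * In other words (an editor's summary of the cases handled
+	 * below, not from the microcode docs):
+	 *   CORE_GATED=0              -> wakeup done, success
+	 *   CORE_GATED=1, SPWU_DONE=1 -> warn, but treat as success
+	 *   CORE_GATED=1, SPWU_DONE=0 -> deassert SPWU, return error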
*/ if (p10_core_is_gated(cpu)) { + if(xscom_read(chip_id, spwu_addr, &val)) { + prlog(PR_ERR, "Core %u:%u:" + " unable to read QME_SPWU_HYP\n", + chip_id, core_id); + return OPAL_HARDWARE; + } + if (val & P10_SPWU_DONE) { + /* + * If SPWU DONE bit is set then + * SPWU operation is complete + */ + prlog(PR_DEBUG, "Special wakeup on " + "%u:%u: core remains gated while" + " SPWU_HYP DONE set\n", + chip_id, core_id); + return 0; + } /* Deassert spwu for this strange error */ xscom_write(chip_id, spwu_addr, 0); - prlog(PR_ERR, "Failed special wakeup on %u:%u" - " core remains gated.\n", - chip_id, core_id); + prlog(PR_ERR, + "Failed special wakeup on %u:%u" + " core remains gated.\n", + chip_id, core_id); return OPAL_HARDWARE; } else { return 0; diff --git a/hw/slw.c b/hw/slw.c index 9e676af74..56ba05b0a 100644 --- a/hw/slw.c +++ b/hw/slw.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include #include @@ -24,7 +25,7 @@ #include #include -#include +#include #include #include @@ -220,6 +221,30 @@ static bool slw_set_overrides(struct proc_chip *chip, struct cpu_thread *c) return true; } +static bool slw_set_overrides_p10(struct proc_chip *chip, struct cpu_thread *c) +{ + uint64_t tmp; + int rc; + uint32_t core = pir_to_core_id(c->pir); + + /* Special wakeup bits that could hold power mgt */ + rc = xscom_read(chip->id, + XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP), + &tmp); + if (rc) { + log_simple_error(&e_info(OPAL_RC_SLW_SET), + "SLW: Failed to read P10_QME_SPWU_HYP\n"); + return false; + } + if (tmp & P10_SPWU_REQ) + prlog(PR_WARNING, + "SLW: core %d P10_QME_SPWU_HYP requested 0x%016llx\n", + core, tmp); + + return true; +} + + static bool slw_set_overrides_p9(struct proc_chip *chip, struct cpu_thread *c) { uint64_t tmp; @@ -872,6 +897,31 @@ static void slw_late_init_p9(struct proc_chip *chip) } } +static void slw_late_init_p10(struct proc_chip *chip) +{ + struct cpu_thread *c; + int rc; + + prlog(PR_INFO, "SLW: Configuring self-restore for HRMOR\n"); + for_each_available_cpu(c) { + if (c->chip_id != chip->id) + continue; + /* + * Clear HRMOR. Need to update only for thread + * 0 of each core. 
Doing it anyway for all threads + */ + rc = proc_stop_save_cpureg((void *)chip->homer_base, + PROC_STOP_SPR_HRMOR, 0, + c->pir); + if (rc) { + log_simple_error(&e_info(OPAL_RC_SLW_REG), + "SLW: Failed to set HRMOR for CPU %x,RC=0x%x\n", + c->pir, rc); + prlog(PR_ERR, "Disabling deep stop states\n"); + } + } +} + /* Add device tree properties to describe idle states */ void add_cpu_idle_state_properties(void) { @@ -971,7 +1021,7 @@ void add_cpu_idle_state_properties(void) xive_late_init(); nx_p9_rng_late_init(); } else if (chip->type == PROC_CHIP_P10) { - /* TODO (p10): need P10 stop state engine */ + slw_late_init_p10(chip); xive2_late_init(); } } @@ -1380,6 +1430,20 @@ static void slw_init_chip_p9(struct proc_chip *chip) } +static void slw_init_chip_p10(struct proc_chip *chip) +{ + struct cpu_thread *c; + + prlog(PR_DEBUG, "SLW: Init chip 0x%x\n", chip->id); + + /* At power ON setup inits for power-mgt */ + for_each_available_core_in_chip(c, chip->id) + slw_set_overrides_p10(chip, c); + + +} + + static bool slw_image_check_p9(struct proc_chip *chip) { @@ -1575,8 +1639,13 @@ int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val) wakeup_engine_state,chip->id); return OPAL_INTERNAL_ERROR; } - rc = p9_stop_save_cpureg((void *)chip->homer_base, + if (proc_gen == proc_gen_p9) { + rc = p9_stop_save_cpureg((void *)chip->homer_base, + sprn, val, cpu_pir); + } else { + rc = proc_stop_save_cpureg((void *)chip->homer_base, sprn, val, cpu_pir); + } } else if (proc_gen == proc_gen_p8) { int spr_is_supported = 0; @@ -1640,7 +1709,7 @@ void slw_init(void) slw_late_init_p8(chip); } p8_sbe_init_timer(); - } else if (proc_gen >= proc_gen_p9) { + } else if (proc_gen == proc_gen_p9) { for_each_chip(chip) { slw_init_chip_p9(chip); if(slw_image_check_p9(chip)) @@ -1648,6 +1717,15 @@ void slw_init(void) if (wakeup_engine_state == WAKEUP_ENGINE_PRESENT) slw_late_init_p9(chip); } + } else if (proc_gen == proc_gen_p10) { + for_each_chip(chip) { + slw_init_chip_p10(chip); + if(slw_image_check_p9(chip)) + wakeup_engine_state = WAKEUP_ENGINE_PRESENT; + if (wakeup_engine_state == WAKEUP_ENGINE_PRESENT) { + slw_late_init_p10(chip); + } + } } add_cpu_idle_state_properties(); } diff --git a/include/p10_stop_api.H b/include/p10_stop_api.H new file mode 100644 index 000000000..2bcf03a45 --- /dev/null +++ b/include/p10_stop_api.H @@ -0,0 +1,239 @@ +/* IBM_PROLOG_BEGIN_TAG */ +/* This is an automatically generated prolog. */ +/* */ +/* $Source: src/import/chips/p10/procedures/utils/stopreg/p10_stop_api.C $ */ +/* */ +/* OpenPOWER HostBoot Project */ +/* */ +/* Contributors Listed Below - COPYRIGHT 2015,2021 */ +/* [+] International Business Machines Corp. */ +/* */ +/* */ +/* Licensed under the Apache License, Version 2.0 (the "License"); */ +/* you may not use this file except in compliance with the License. */ +/* You may obtain a copy of the License at */ +/* */ +/* http://www.apache.org/licenses/LICENSE-2.0 */ +/* */ +/* Unless required by applicable law or agreed to in writing, software */ +/* distributed under the License is distributed on an "AS IS" BASIS, */ +/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or */ +/* implied. See the License for the specific language governing */ +/* permissions and limitations under the License. */ +/* */ +/* IBM_PROLOG_END_TAG */ +#ifndef __P10_STOP_IMAGE_API_ +#define __P10_STOP_IMAGE_API_ + +#include + +#ifdef __SKIBOOT__ + #include + #include +#endif + +/// +/// @file p10_stop_api.H +/// @brief describes STOP API which create/manipulate STOP image. 
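+///
+/// For example, OPAL's xive2 code creates an SCOM restore entry
+/// roughly like this (an illustrative sketch of the call shape,
+/// mirroring the hw/xive2.c hunk elsewhere in this series):
+///
+///   rc = proc_stop_save_scom((void *)chip->homer_base, xa, val,
+///                            PROC_STOP_SCOM_REPLACE,
+///                            PROC_STOP_SECTION_L3);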
+/// +// *HWP HW Owner : Greg Still +// *HWP FW Owner : Prem Shanker Jha +// *HWP Team : PM +// *HWP Level : 2 +// *HWP Consumed by : HB:HYP + +#ifdef __cplusplus +namespace stopImageSection +{ +#endif + +/** + * @brief all SPRs and MSR for which register restore is to be supported. + * @note STOP API design has built in support to accomodate 8 register of + * scope core and thread each. + */ +typedef enum +{ + PROC_STOP_SPR_DAWR = 180, // thread register + PROC_STOP_SPR_CIABR = 187, // thread register + PROC_STOP_SPR_DAWRX = 188, // thread register + PROC_STOP_SPR_HSPRG0 = 304, // thread register + PROC_STOP_SPR_HRMOR = 313, // core register + PROC_STOP_SPR_LPCR = 318, // thread register + PROC_STOP_SPR_HMEER = 337, // core register + PROC_STOP_SPR_PTCR = 464, // core register + PROC_STOP_SPR_USPRG0 = 496, // thread register + PROC_STOP_SPR_USPRG1 = 497, // thread register + PROC_STOP_SPR_URMOR = 505, // core register + PROC_STOP_SPR_SMFCTRL = 511, // thread register + PROC_STOP_SPR_LDBAR = 850, // thread register + PROC_STOP_SPR_PSSCR = 855, // thread register + PROC_STOP_SPR_PMCR = 884, // core register + PROC_STOP_SPR_HID = 1008, // core register + PROC_STOP_SPR_MSR = 2000, // thread register + +} CpuReg_p10_t; + +// /** +// * @brief lists all the bad error codes. +// */ +// typedef enum +// { +// STOP_SAVE_SUCCESS = 0, +// STOP_SAVE_ARG_INVALID_IMG = 1, +// STOP_SAVE_ARG_INVALID_REG = 2, +// STOP_SAVE_ARG_INVALID_THREAD = 3, +// STOP_SAVE_ARG_INVALID_MODE = 4, +// STOP_SAVE_ARG_INVALID_CORE = 5, +// STOP_SAVE_SPR_ENTRY_NOT_FOUND = 6, +// STOP_SAVE_SPR_ENTRY_UPDATE_FAILED = 7, +// STOP_SAVE_SCOM_INVALID_OPERATION = 8, +// STOP_SAVE_SCOM_INVALID_SECTION = 9, +// STOP_SAVE_SCOM_INVALID_ADDRESS = 10, +// STOP_SAVE_SCOM_INVALID_CHIPLET = 11, +// STOP_SAVE_SCOM_ENTRY_UPDATE_FAILED = 12, +// STOP_SAVE_INVALID_FUSED_CORE_STATUS = 13, +// STOP_SAVE_FAIL = 14, // for internal failure within firmware. +// STOP_SAVE_SPR_ENTRY_MISSING = 15, +// STOP_SAVE_MAX_ENTRY_REACHED = 16, +// STOP_SAVE_SPR_BIT_POS_RESERVE = 17, +// } StopReturnCode_t; + +/** + * @brief summarizes all operations supported on scom entries of STOP image. + */ +typedef enum +{ + //enum members which are project agnostic + PROC_STOP_SCOM_OP_MIN = 0, + PROC_STOP_SCOM_APPEND = 1, + PROC_STOP_SCOM_REPLACE = 2, + PROC_STOP_SCOM_OR = 3, + PROC_STOP_SCOM_AND = 4, + PROC_STOP_SCOM_NOOP = 5, + PROC_STOP_SCOM_RESET = 6, + PROC_STOP_SCOM_OR_APPEND = 7, + PROC_STOP_SCOM_AND_APPEND = 8, + PROC_STOP_SCOM_OP_MAX = 9, + +} StopReturnCode_p10_t; + +/** + * @brief All subsections that contain scom entries in a STOP image. + */ +typedef enum +{ + PROC_STOP_SECTION_CORE = 1, + PROC_STOP_SECTION_L2 = 1, + PROC_STOP_SECTION_L3 = 2, + PROC_STOP_SECTION_CACHE = 2, +} ScomSection_p10_t; + +/** + * @brief versions pertaining relvant to STOP API. + */ +typedef enum +{ + STOP_API_VER = 0x00, + STOP_API_VER_CONTROL = 0x02, +} VersionList_t; + +/** + * @brief Summarizes bit position allocated to SPRs in save bit mask vector. + */ +typedef enum +{ + BIT_POS_CIABR = 0, + BIT_POS_DAWR = 1, + BIT_POS_DAWRX = 2, + BIT_POS_HSPRG0 = 3, + BIT_POS_LDBAR = 4, + BIT_POS_LPCR = 5, + BIT_POS_PSSCR = 6, + BIT_POS_MSR = 7, + BIT_POS_HID = 21, + BIT_POS_HMEER = 22, + BIT_POS_PMCR = 23, + BIT_POS_PTCR = 24, + BIT_POS_SMFCTRL = 28, + BIT_POS_USPRG0 = 29, + BIT_POS_USPRG1 = 30, +} SprBitPositionList_t; + + +#ifdef __cplusplus +extern "C" { +#endif +/** + * @brief creates SCOM restore entry for a given scom adress in HOMER. 
+ * @param i_pImage points to start address of HOMER image. + * @param i_scomAddress address associated with SCOM restore entry. + * @param i_scomData data associated with SCOM restore entry. + * @param i_operation operation type requested for API. + * @param i_section section of HOMER in which restore entry needs to be created. + * @return STOP_SAVE_SUCCESS if API succeeds, error code otherwise. + * @note It is an API for creating SCOM restore entry in HOMER. It is agnostic to + * generation of POWER processor. + */ + +StopReturnCode_t proc_stop_save_scom( void* const i_pImage, + const uint32_t i_scomAddress, + const uint64_t i_scomData, + const StopReturnCode_p10_t i_operation, + const ScomSection_p10_t i_section ); + +/** + * @brief initializes self save restore region of HOMER. + * @param[in] i_pImage points to base of HOMER image. + * @param[in] i_corePos position of the physical core. + * @return STOP_SAVE_SUCCESS if API succeeds, error code otherwise. + * @note It is an API for initializing self restore region in HOMER. It is agnostic to + * generation of POWER processor. + */ +StopReturnCode_t proc_stop_init_cpureg( void* const i_pImage, const uint32_t i_corePos ); + +/** + * @brief enables self save for a given set of SPRs + * @param[in] i_pImage points to start address of HOMER image. + * @param[in] i_pir PIR value associated with core and thread. + * @param[in] i_saveRegVector bit vector representing the SPRs that needs to be self saved. + * @return STOP_SAVE_SUCCESS if API succeeds, error code otherwise. + * @note It is an API for enabling self save of SPRs and it is agnostic to + * generation of POWER processor. + */ +StopReturnCode_t proc_stop_save_cpureg_control( void* i_pImage, + const uint64_t i_pir, + const uint32_t i_saveRegVector ); + +/** + * @brief creates an SPR restore entry in HOMER + * @param[in] i_pImage points to start address of HOMER image. + * @param[in] i_regId SPR number to be saved in HOMER + * @param[in] i_regData SPR data to be saved in HOMER + * @param[in] i_pir PIR value associated with core and thread. + * @return STOP_SAVE_SUCCESS if API succeeds, error code otherwise. + * @note It is an API for enabling self save of SPRs and it is agnostic to + * generation of POWER processor. + */ +StopReturnCode_t proc_stop_save_cpureg( void* const i_pImage, + const CpuReg_p10_t i_regId, + const uint64_t i_regData, + const uint64_t i_pir ); + +/** + * @brief initializes self-save region with specific instruction. + * @param[in] i_pImage points to start address of HOMER image. + * @param[in] i_corePos physical core's relative position within processor chip. + * @return STOP_SAVE_SUCCESS if self-save is initialized successfully, + * error code otherwise. + * @note API is project agnostic and is intended only for use case of HOMER build. + * There is no explicit effort to support any other use case. 
+ */ +StopReturnCode_t proc_stop_init_self_save( void* const i_pImage, const uint32_t i_corePos ); + +#ifdef __cplusplus +} // extern "C" +}; // namespace stopImageSection ends +#endif //__cplusplus + +#endif //__P10_STOP_IMAGE_API_ diff --git a/libpore/Makefile.inc b/libpore/Makefile.inc index 1060a0492..06d9c8902 100644 --- a/libpore/Makefile.inc +++ b/libpore/Makefile.inc @@ -1,4 +1,4 @@ -LIBPORE_SRCS = p8_pore_table_gen_api_fixed.C p9_stop_api.C p9_stop_util.C +LIBPORE_SRCS = p8_pore_table_gen_api_fixed.C p9_stop_api.C p9_stop_util.C p10_stop_api.C p10_stop_util.C LIBPORE_SRCS += p8_pore_table_static_data.c sbe_xip_image.c pore_inline_assembler.c LIBPORE_OBJS_1 = $(LIBPORE_SRCS:%.c=%.o) LIBPORE_OBJS = $(LIBPORE_OBJS_1:%.C=%.o) diff --git a/libpore/p10_cpu_reg_restore_instruction.H b/libpore/p10_cpu_reg_restore_instruction.H new file mode 100644 index 000000000..4da194d1e --- /dev/null +++ b/libpore/p10_cpu_reg_restore_instruction.H @@ -0,0 +1,88 @@ +/* IBM_PROLOG_BEGIN_TAG */ +/* This is an automatically generated prolog. */ +/* */ +/* $Source: src/import/chips/p10/procedures/utils/stopreg/p10_cpu_reg_restore_instruction.H $ */ +/* */ +/* OpenPOWER HostBoot Project */ +/* */ +/* Contributors Listed Below - COPYRIGHT 2015,2019 */ +/* [+] International Business Machines Corp. */ +/* */ +/* */ +/* Licensed under the Apache License, Version 2.0 (the "License"); */ +/* you may not use this file except in compliance with the License. */ +/* You may obtain a copy of the License at */ +/* */ +/* http://www.apache.org/licenses/LICENSE-2.0 */ +/* */ +/* Unless required by applicable law or agreed to in writing, software */ +/* distributed under the License is distributed on an "AS IS" BASIS, */ +/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or */ +/* implied. See the License for the specific language governing */ +/* permissions and limitations under the License. */ +/* */ +/* IBM_PROLOG_END_TAG */ + +/// +/// @file p10_cpu_reg_restore_instruction.H +/// @brief enumerates all the opcodes used for SPR restoration. +/// +// *HWP HW Owner : Greg Still +// *HWP FW Owner : Prem Shanker Jha +// *HWP Team : PM +// *HWP Level : 2 +// *HWP Consumed by : HB:HYP + +#ifndef __REG_RESTORE_INSTRUCTION_H +#define __REG_RESTORE_INSTRUCTION_H + +#include + +#ifdef __cplusplus +extern "C" { + +namespace stopImageSection +{ +#endif + +/** + * @brief enumerates opcodes for few instructions. + */ +enum +{ + ORI_OPCODE = 24, + RFI_OPCODE = 19, + RFI_CONST = 50, + MFMSR_CONST = 83, + ORIS_OPCODE = 25, + OPCODE_31 = 31, + XOR_CONST = 316, + RLDICR_OPCODE = 30, + RLDICR_CONST = 1, + MTSPR_CONST1 = 467, + MTMSRD_CONST1 = 178, + MR_R0_TO_R10 = 0x7c0a0378, //mr r10, r0 + MR_R0_TO_R21 = 0x7c150378, //mr r21, r0 + MR_R0_TO_R9 = 0x7c090378, //mr r9, r0 + URMOR_CORRECTION = 0x7d397ba6, + MFSPR_CONST = 339, + BLR_INST = 0x4e800020, + MTSPR_BASE_OPCODE = 0x7c0003a6, + MFSPR_BASE_OPCODE = 0x7c0002a6, + ATTN_OPCODE = 0x00000200, + OPCODE_18 = 18, + SELF_SAVE_FUNC_ADD = 0x2300, + SELF_SAVE_OFFSET = 0x180, + SKIP_SPR_REST_INST = 0x4800001c, //b . 
+0x01c + MFLR_R30 = 0x7fc802a6, + SKIP_SPR_SELF_SAVE = 0x3bff0020, //addi r31 r31, 0x20 + MTLR_INST = 0x7fc803a6 //mtlr r30 +}; + +#ifdef __cplusplus +} // namespace stopImageSection ends + +} // extern "C" +#endif //__cplusplus + +#endif //__REG_RESTORE_INSTRUCTION_H diff --git a/libpore/p10_hcd_header_defs.H b/libpore/p10_hcd_header_defs.H new file mode 100644 index 000000000..d02a72524 --- /dev/null +++ b/libpore/p10_hcd_header_defs.H @@ -0,0 +1,152 @@ +/* IBM_PROLOG_BEGIN_TAG */ +/* This is an automatically generated prolog. */ +/* */ +/* $Source: src/import/chips/p10/procedures/hwp/lib/p10_hcd_header_defs.H $ */ +/* */ +/* OpenPOWER HostBoot Project */ +/* */ +/* Contributors Listed Below - COPYRIGHT 2016,2019 */ +/* [+] International Business Machines Corp. */ +/* */ +/* */ +/* Licensed under the Apache License, Version 2.0 (the "License"); */ +/* you may not use this file except in compliance with the License. */ +/* You may obtain a copy of the License at */ +/* */ +/* http://www.apache.org/licenses/LICENSE-2.0 */ +/* */ +/* Unless required by applicable law or agreed to in writing, software */ +/* distributed under the License is distributed on an "AS IS" BASIS, */ +/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or */ +/* implied. See the License for the specific language governing */ +/* permissions and limitations under the License. */ +/* */ +/* IBM_PROLOG_END_TAG */ +/// +/// @file p10_hcd_header_defs.H +/// @brief defines header constants based on file types +/// +/// This header contains those cpp manifest constants required for processing +/// the linker scripts used to generate OCC code images. As these are used +/// by linker scripts as well as by C++ code, these cannot be solely be put +/// into a namespace. Prefixing these with the region name is the attempt +/// to make these globally unique when this header is included in C++ code. +/// +// *HWP HWP Owner: David Du +// *HWP Backup HWP Owner: Greg Still +// *HWP FW Owner: Prem Jha +// *HWP Team: PM +// *HWP Level: 2 +// *HWP Consumed by: PM +// + +#ifndef __HCD_HEADER_DEFS_H__ +#define __HCD_HEADER_DEFS_H__ + +/// Macros for generating an Hcode header section +/// +/// The CPP macros HCD_HDR_UINTxx generate equivalent code depending on +/// whether they are being called from assembler (where they actually +/// create the header section data) or from C (where they specifiy a +/// C-structure form of the contents of the header section. 
+/// +/// In assembler each invocation also creates space in the header section + +#ifdef __ASSEMBLER__ + +// *INDENT-OFF* + .macro hcd_header_uint64, symbol:req, value = 0 + .global \symbol +\symbol\(): + .quad (\value) + .endm + + .macro hcd_header_uint32, symbol:req, value = 0 + .global \symbol + \symbol\(): + .long (\value) + .endm + + .macro hcd_header_uint16, symbol:req, value = 0 + .global \symbol +\symbol\(): + .short (\value) + .endm + + .macro hcd_header_uint8, symbol:req, value = 0 + .global \symbol +\symbol\(): + .byte (\value) + .endm + + .macro hcd_header_uint8_vec, symbol:req, number:req, value = 0 + .global \symbol +\symbol\(): + .rept (\number) + .byte (\value) + .endr + .endm + + .macro hcd_header_attn, symbol:req, number = 1 + .global \symbol +\symbol\(): + .rept (\number) + .long 0x00000200 + .endr + .endm + + .macro hcd_header_attn_pad, align:req + .balignl (\align), 0x00000200 + .endm + + .macro hcd_header_pad, align:req + .balignl (\align), 0 + .endm +// *INDENT-ON* + +#define ULL(x) x +#define HCD_CONST(name, expr) .set name, expr; +#define HCD_CONST64(name, expr) .set name, expr; + +#define HCD_HDR_UINT64(symbol, value) hcd_header_uint64 symbol value +#define HCD_HDR_UINT32(symbol, value) hcd_header_uint32 symbol value +#define HCD_HDR_UINT16(symbol, value) hcd_header_uint16 symbol value +#define HCD_HDR_UINT8(symbol, value) hcd_header_uint8 symbol value +#define HCD_HDR_UINT8_VEC(symbol, number, value) hcd_header_uint8_vec symbol number value +#define HCD_HDR_ATTN(symbol, number) hcd_header_attn symbol number +#define HCD_HDR_ATTN_PAD(align) hcd_header_attn_pad align +#define HCD_HDR_PAD(align) hcd_header_pad align + +#else // NOT __ASSEMBLER__ + +#ifdef __LINKERSCRIPT__ + + #define ULL(x) x + #define POUND_DEFINE #define + #define HCD_CONST(name, expr) POUND_DEFINE name expr + #define HCD_CONST64(name, expr) POUND_DEFINE name expr + +#else + + #define ULL(x) x##ull + #define HCD_CONST(name, expr) enum { name = expr }; + #define HCD_CONST64(name, expr) enum { name = expr }; + + #define HCD_HDR_UINT64(symbol, value) uint64_t symbol + #define HCD_HDR_UINT32(symbol, value) uint32_t symbol + #define HCD_HDR_UINT16(symbol, value) uint16_t symbol + #define HCD_HDR_UINT8(symbol, value) uint8_t symbol + #define HCD_HDR_UINT8_VEC(symbol, number, value) uint8_t symbol[number] + #define HCD_HDR_ATTN(symbol, number) uint32_t symbol[number] + #define HCD_HDR_ATTN_PAD(align) + #define HCD_HDR_PAD(align) + +#endif // __LINKERSCRIPT__ +#endif // __ASSEMBLER__ + +// Stringification + +#define STR_HELPER(x) #x +#define STR(x) STR_HELPER(x) + +#endif // __HCD_HEADER_DEFS_H__ diff --git a/libpore/p10_hcd_memmap_base.H b/libpore/p10_hcd_memmap_base.H new file mode 100644 index 000000000..4dac9c93b --- /dev/null +++ b/libpore/p10_hcd_memmap_base.H @@ -0,0 +1,463 @@ +/* IBM_PROLOG_BEGIN_TAG */ +/* This is an automatically generated prolog. */ +/* */ +/* $Source: src/import/chips/p10/procedures/hwp/lib/p10_hcd_memmap_base.H $ */ +/* */ +/* OpenPOWER HostBoot Project */ +/* */ +/* Contributors Listed Below - COPYRIGHT 2019,2020 */ +/* [+] International Business Machines Corp. */ +/* */ +/* */ +/* Licensed under the Apache License, Version 2.0 (the "License"); */ +/* you may not use this file except in compliance with the License. 
*/ +/* You may obtain a copy of the License at */ +/* */ +/* http://www.apache.org/licenses/LICENSE-2.0 */ +/* */ +/* Unless required by applicable law or agreed to in writing, software */ +/* distributed under the License is distributed on an "AS IS" BASIS, */ +/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or */ +/* implied. See the License for the specific language governing */ +/* permissions and limitations under the License. */ +/* */ +/* IBM_PROLOG_END_TAG */ +/// +/// @file p10_hcd_memmap_base.H +/// @brief defines region constants shared by different memory components. +/// + +// *HWP HWP Owner: David Du +// *HWP Backup HWP Owner: Greg Still +// *HWP FW Owner: Prem S Jha +// *HWP Team: PM +// *HWP Level: 2 +// *HWP Consumed by: PM:Hostboot:Phyp + +#ifndef __HCD_MEMMAP_BASE_H__ +#define __HCD_MEMMAP_BASE_H__ + +#include + +// ------------------------------------------------------------------- +// Note: There can be NO semicolons(";") at end of macros in this file +// There can ONLY have HCD_CONST/HCD_CONST64 macros in this file +// ------------------------------------------------------------------- + +/// Image Magic Numbers + +HCD_CONST64(CPMR_MAGIC_NUMBER, ULL(0x43504d525f312e30)) // CPMR_1.0 +HCD_CONST64(QME_MAGIC_NUMBER , ULL(0x514d455f5f312e30)) // QME__1.0 + +HCD_CONST64(XPMR_MAGIC_NUMBER, ULL(0x58504d525f312e30)) // XPMR_1.0 +HCD_CONST64(XGPE_MAGIC_NUMBER, ULL(0x584750455f312e30)) // XGPE_1.0 + +HCD_CONST64(PPMR_MAGIC_NUMBER, ULL(0x50504d525f312e30)) // PPMR_1.0 +HCD_CONST64(PGPE_MAGIC_NUMBER, ULL(0x504750455F312E30)) // PGPE_1.0 + +HCD_CONST(QME_BUILD_VERSION, 0x001) // QME__1.0 +HCD_CONST(XGPE_BUILD_VERSION, 0x001) // XGPE_1.0 +HCD_CONST(PGPE_BUILD_VERSION, 0x001) // PGPE_1.0 + + +HCD_CONST(CPMR_REGION_CHECK_WORD, (0x43504d52)) // CPMR +HCD_CONST(SCOM_REST_MAGIC_WORD, (0x5343))//SC +HCD_CONST(CPMR_BUILD_VER, 1) + + +/// Size constants + +HCD_CONST(HALF_KB, 512) +HCD_CONST(ONE_KB, 1024) +HCD_CONST(HALF_MB, (1024 * 512)) +HCD_CONST(ONE_MB, (1024 * 1024)) +HCD_CONST(TWO_MB, (2 * 1024 * 1024)) + +/// Memory constants + +HCD_CONST(QME_SRAM_SIZE, (64 * ONE_KB)) + +HCD_CONST(HOMER_MEMORY_SIZE, (4 * ONE_MB)) +HCD_CONST(HOMER_OPMR_REGION_NUM, 0) +HCD_CONST(HOMER_XPMR_REGION_NUM, 1) +HCD_CONST(HOMER_CPMR_REGION_NUM, 2) +HCD_CONST(HOMER_PPMR_REGION_NUM, 3) + +/// Chip constants +HCD_CONST(OCC_HOST_AREA_SIZE, ONE_MB) + +HCD_CONST(MAX_THREADS_PER_CORE, 4) +HCD_CONST(MAX_CORES_PER_CHIP, 32) + +HCD_CONST(MAX_QMES_PER_CHIP, 8) +HCD_CONST(MAX_EXES_PER_CHIP, 16) + +HCD_CONST(MAX_QUADS_PER_CHIP, 8) +HCD_CONST(MAX_CACHES_PER_CHIP, 32) + +HCD_CONST(MAX_CORES_PER_QME, 4) +HCD_CONST(MAX_CORES_PER_EX, 2) + +HCD_CONST(MAX_QMES_PER_QUAD, 1) +HCD_CONST(MAX_EXES_PER_QUAD, 2) +HCD_CONST(MAX_CORES_PER_QUAD, 4) +HCD_CONST(MAX_L3_PER_QUAD, 4) + +HCD_CONST(MAX_QUAD_ID_SUPPORTED, 7) +HCD_CONST(MAX_CORE_ID_SUPPORTED, 31) +HCD_CONST(MAX_THREAD_ID_SUPPORTED, 3) + +/// Image build constants + +HCD_CONST(HARDWARE_IMG_SIZE, ONE_MB) + +HCD_CONST(FUSED_CORE_MODE, 0xBB) +HCD_CONST(NONFUSED_CORE_MODE, 0xAA) + +HCD_CONST(SELF_RESTORE_BLR_INST, 0x4e800020) +HCD_CONST(CORE_RESTORE_PAD_OPCODE, 0x00000200) //ATTN Opcode + +HCD_CONST(SCOM_RESTORE_PAD_OPCODE, 0x00000000) //zero pads +HCD_CONST(SCOM_RESTORE_ENTRY_SIZE, 12) //4B address,8B data + +HCD_CONST(QME_BLOCK_READ_LEN, 32) +HCD_CONST(QME_BLK_SIZE_SHIFT, 0x05) + +HCD_CONST(RING_ALIGN_BOUNDARY, 0x08) +HCD_CONST64(DARN_BAR_EN_POS, ULL(0x8000000000000000)) + +//FFDC Region +HCD_CONST(FFDC_REGION_XPMR_BASE_OFFSET, 0xE0000) //Offset wrt to XPMR base 
+HCD_CONST(FFDC_REGION_SIZE, (80 * ONE_KB)) +//end offset of FFDC region wrt to XPMR base +HCD_CONST(FFDC_REGION_XPMR_END_OFFSET, (FFDC_REGION_XPMR_BASE_OFFSET + + FFDC_REGION_SIZE )) +//--------------------------------------------------------------------------------------- + +//XPMR Header +HCD_CONST(XGPE_BUILD_VER, 1) +HCD_CONST(XPMR_BUILD_VER, 1) +HCD_CONST(XPMR_HEADER_SIZE, 512) +HCD_CONST(XGPE_INT_VECTOR_SIZE, 384) +HCD_CONST(XGPE_HEADER_IMAGE_OFFSET, XGPE_INT_VECTOR_SIZE) +HCD_CONST(XGPE_BOOT_COPIER_OFFSET, 512) +HCD_CONST(XGPE_BOOT_COPIER_LENGTH, ONE_KB) +HCD_CONST(XGPE_BOOT_LOADER_OFFSET, + XGPE_BOOT_COPIER_OFFSET + XGPE_BOOT_COPIER_LENGTH ) +HCD_CONST(XGPE_BOOT_LOADER_LENGTH, ONE_KB) +HCD_CONST(XGPE_HCODE_OFFSET, + XGPE_BOOT_LOADER_OFFSET + XGPE_BOOT_LOADER_LENGTH ) +HCD_CONST(XGPE_SRAM_SIZE, (64 * ONE_KB)) +HCD_CONST(XGPE_HCODE_SIZE, (64 * ONE_KB)) +HCD_CONST(XPMR_BOOT_REGION, (XPMR_HEADER_SIZE + XGPE_BOOT_COPIER_LENGTH + + XGPE_BOOT_LOADER_LENGTH )) + +HCD_CONST(XGPE_HCODE_RESET_ADDR_VAL, 0x40) +HCD_CONST(XGPE_DBG_PTR_AREA_SIZE, 64) + +HCD_CONST(XPMR_MAGIC_WORD_BYTE, 0x00) +HCD_CONST(XPMR_BOOT_COPIER_OFFSET_BYTE, 0x08) +HCD_CONST(XPMR_BOOT_LOADER_OFFSET_BYTE, 0x10) +HCD_CONST(XPMR_BOOT_LOADER_LENGTH_BYTE, 0x14) +HCD_CONST(XPMR_BUILD_DATE_BYTE, 0x18) +HCD_CONST(XPMR_BUILD_VER_BYTE, 0x1c) +HCD_CONST(XPMR_XGPE_HCODE_OFFSET_BYTE, 0x28) +HCD_CONST(XPMR_XGPE_HCODE_LENGTH_BYTE, 0x2c) +HCD_CONST(XPMR_XGPE_BOOT_PROG_CODE_BYTE, 0x30) +HCD_CONST(XPMR_XGPE_SRAM_IMAGE_SIZE_BYTE, 0x34) +HCD_CONST(XGPE_IMAGE_XPMR_OFFSET, + (XGPE_BOOT_LOADER_OFFSET + XGPE_BOOT_LOADER_LENGTH)) + +//--------------------------------------------------------------------------------------- + +/// CPMR Header + +HCD_CONST(CPMR_HOMER_OFFSET, (HOMER_CPMR_REGION_NUM* ONE_MB)) +HCD_CONST(CPMR_HEADER_SIZE, 256) + +HCD_CONST(CPMR_ATTN_WORD0_BYTE, 0x00) +HCD_CONST(CPMR_ATTN_WORD1_BYTE, 0x04) +HCD_CONST(CPMR_MAGIC_NUMBER_BYTE, 0x08) +HCD_CONST(CPMR_BUILD_DATE_BYTE, 0x10) +HCD_CONST(CPMR_BUILD_VER_BYTE, 0x14) +HCD_CONST(CPMR_SELF_RESTORE_VER_BYTE, 0x1C) +HCD_CONST(CPMR_STOP_API_VER_BYTE, 0x1D) +HCD_CONST(CPMR_FUSED_CORE_FLAG, 0x1F) +HCD_CONST(CPMR_QME_HCODE_OFFSET_BYTE, 0x20) +HCD_CONST(CPMR_QME_HCODE_LENGTH_BYTE, 0x24) +HCD_CONST(CPMR_CORE_COMMON_RING_OFFSET_BYTE, 0x28) +HCD_CONST(CPMR_CORE_COMMON_RING_LENGTH_BYTE, 0x2C) +HCD_CONST(CPMR_QME_LOCAL_PSTATE_OFFSET_BYTE, 0x30) +HCD_CONST(CPMR_QME_LOCAL_PSTATE_LENGTH_BYTE, 0x34) +HCD_CONST(CPMR_CORE_SPECIFIC_RING_OFFSET_BYTE, 0x38) +HCD_CONST(CPMR_CORE_SPECIFIC_RING_LENGTH_BYTE, 0x3C) +HCD_CONST(CPMR_CORE_SCOM_RESTORE_OFFSET_BYTE, 0x40) +HCD_CONST(CPMR_CORE_SCOM_RESTORE_LENGTH_BYTE, 0x44) +HCD_CONST(CPMR_SELF_RESTORE_OFFSET_BYTE, 0x48) +HCD_CONST(CPMR_SELF_RESTORE_LENGTH_BYTE, 0x4C) +HCD_CONST(CPMR_MAX_CORE_L2_SCOM_ENTRIES, 0x50) +HCD_CONST(CPMR_MAX_QUAD_L3_SCOM_ENTRIES, 0x54) +HCD_CONST(CPMR_MAX_CORE_L2_SCOM_OFFSET, 0x58) +HCD_CONST(CPMR_MAX_CORE_L2_SCOM_LENGTH, 0x5C) +HCD_CONST(CPMR_MAX_QUAD_SCOM_OFFSET, 0x60) +HCD_CONST(CPMR_MAX_QUAD_SCOM_LENGTH, 0x64) + +/// Self Restore without SMF Support + +HCD_CONST(SELF_RESTORE_CPMR_OFFSET, CPMR_HEADER_SIZE) +HCD_CONST(SELF_RESTORE_INT_SIZE, (8 * ONE_KB)) +HCD_CONST(SELF_RESTORE_FFDC_OFFSET, (224 * ONE_KB)) +HCD_CONST(SELF_RESTORE_FFDC_LENGTH, (32 * ONE_KB)) +HCD_CONST(SELF_RESTORE_FFDC_PER_CORE, 864) +HCD_CONST(SELF_RESTORE_FFDC_PER_CORE_IN_HOMER, 1024) +HCD_CONST(SELF_RESTORE_FFDC_PER_QUAD_IN_HOMER, (SELF_RESTORE_FFDC_PER_CORE_IN_HOMER * 4)) +HCD_CONST(SELF_RESTORE_FFDC_BLK_CNT, 27) + +// Self Restore Region With SMF Support 
+HCD_CONST(SMF_THREAD_LAUNCHER_SIZE, 1024) +HCD_CONST(SMF_SELF_RESTORE_CODE_SIZE, + (SELF_RESTORE_INT_SIZE + SMF_THREAD_LAUNCHER_SIZE)) + +HCD_CONST(SMF_CORE_RESTORE_THREAD_AREA_SIZE, HALF_KB) +HCD_CONST(SMF_SELF_SAVE_THREAD_AREA_SIZE, 256) +HCD_CONST(SMF_CORE_RESTORE_CORE_AREA_SIZE, HALF_KB) +HCD_CONST(SMF_CORE_SAVE_CORE_AREA_SIZE, HALF_KB) + +HCD_CONST(SMF_SELF_RESTORE_CORE_REGS_SIZE, + MAX_CORES_PER_CHIP * ((SMF_CORE_RESTORE_THREAD_AREA_SIZE* MAX_THREADS_PER_CORE ) + + (SMF_SELF_SAVE_THREAD_AREA_SIZE* MAX_THREADS_PER_CORE ) + + SMF_CORE_RESTORE_CORE_AREA_SIZE + + SMF_CORE_SAVE_CORE_AREA_SIZE )) + +HCD_CONST(SMF_SELF_RESTORE_SIZE_TOTAL, + (SMF_SELF_RESTORE_CODE_SIZE + SMF_SELF_RESTORE_CORE_REGS_SIZE)) +/// Core Scom + +HCD_CONST(SELF_SAVE_RESTORE_REGION_SIZE, (256 * ONE_KB)) +HCD_CONST(SCOM_RESTORE_CPMR_OFFSET, (256 * ONE_KB)) +HCD_CONST(SCOM_RESTORE_HOMER_OFFSET, + (SCOM_RESTORE_CPMR_OFFSET + CPMR_HOMER_OFFSET)) + +HCD_CONST(MAX_CORE_SCOM_ENTRIES, 16) +HCD_CONST(MAX_L2_SCOM_ENTRIES, 32) +HCD_CONST(MAX_L3_SCOM_ENTRIES, 64) +HCD_CONST(MAX_EQ_SCOM_ENTRIES, 16) +HCD_CONST(MAX_SCOM_RESTORE_ENTRIES_PER_CORE, (MAX_CORE_SCOM_ENTRIES + + MAX_L2_SCOM_ENTRIES + MAX_L3_SCOM_ENTRIES + + MAX_EQ_SCOM_ENTRIES)) + + +HCD_CONST(SCOM_RESTORE_SIZE_PER_CORE, + (SCOM_RESTORE_ENTRY_SIZE* MAX_SCOM_RESTORE_ENTRIES_PER_CORE)) // 128 * 16 +HCD_CONST(SCOM_RESTORE_SIZE_PER_QME, + (SCOM_RESTORE_SIZE_PER_CORE* MAX_CORES_PER_QME)) // 128 * 16 * 4 + +HCD_CONST(SCOM_RESTORE_SIZE_TOTAL, (96 * ONE_KB)) + +HCD_CONST(SCOM_RESTORE_EL_AREA, + MAX_CORE_SCOM_ENTRIES* SCOM_RESTORE_ENTRY_SIZE) +HCD_CONST(SCOM_RESTORE_L2_AREA, + MAX_L2_SCOM_ENTRIES* SCOM_RESTORE_ENTRY_SIZE) +HCD_CONST(SCOM_RESTORE_L3_AREA, + MAX_L3_SCOM_ENTRIES* SCOM_RESTORE_ENTRY_SIZE) +HCD_CONST(SCOM_RESTORE_EQ_AREA, + MAX_EQ_SCOM_ENTRIES* SCOM_RESTORE_ENTRY_SIZE) +HCD_CONST(SCOM_RESTORE_VER, 1) +HCD_CONST(SCOM_RESTORE_L2_CORE, + (MAX_CORE_SCOM_ENTRIES + MAX_L2_SCOM_ENTRIES)) +HCD_CONST(SCOM_RESTORE_L3_CACHE, + (MAX_EQ_SCOM_ENTRIES + MAX_L3_SCOM_ENTRIES)) +/// QME Image + +HCD_CONST(QME_IMAGE_CPMR_OFFSET, 0x58000) // assumes SCOMs take up the first 96KB of second 256KB +//HCD_CONST(QME_IMAGE_SIZE, 0) +HCD_CONST(QME_INT_VECTOR_SIZE, 384) // 0x180 +HCD_CONST(QME_HCODE_OFFSET, (SELF_SAVE_RESTORE_REGION_SIZE + SCOM_RESTORE_SIZE_TOTAL)) + +/// QME Header + +HCD_CONST(QME_HEADER_CPMR_OFFSET, + (QME_IMAGE_CPMR_OFFSET + QME_INT_VECTOR_SIZE)) +HCD_CONST(QME_HEADER_IMAGE_OFFSET, QME_INT_VECTOR_SIZE) +HCD_CONST(QME_HEADER_SIZE, 128) // 0x80, +0x180=0x200 + +HCD_CONST(QME_MAGIC_NUMBER_BYTE, 0x00) +HCD_CONST(QME_HCODE_OFFSET_BYTE, 0x08) +HCD_CONST(QME_HCODE_LENGTH_BYTE, 0x0C) +HCD_CONST(QME_COMMON_RING_OFFSET_BYTE, 0x10) +HCD_CONST(QME_OVERRIDE_RING_OFFSET_BYTE, 0x14) +HCD_CONST(QME_COMMON_RING_LENGTH_BYTE, 0x18) +HCD_CONST(QME_LOCAL_PSTATE_OFFSET_BYTE, 0x1C) +HCD_CONST(QME_LOCAL_PSTATE_LENGTH_BYTE, 0x20) +HCD_CONST(QME_SPECIFIC_RING_OFFSET_BYTE, 0x24) +HCD_CONST(QME_SPECIFIC_RING_LENGTH_BYTE, 0x28) +HCD_CONST(QME_QUAD_SCOM_RESTORE_OFFSET_BYTE, 0x2C) +HCD_CONST(QME_QUAD_SCOM_RESTORE_LENGTH_BYTE, 0x30) +HCD_CONST(QME_ATTR_TANK_ADDRESS, 0x34) +HCD_CONST(QME_LOCATION_ID_BYTE, 0x38) +HCD_CONST(QME_TIME_BASE, 0x3C) +HCD_CONST(QME_CPMR_HOMER_ADDRESS_BYTE, 0x40) + +HCD_CONST(QME_HCODE_OFF_IMAGE_OFFSET, (QME_HEADER_IMAGE_OFFSET + QME_HCODE_OFFSET_BYTE)) +HCD_CONST(QME_HCODE_LEN_IMAGE_OFFSET, (QME_HEADER_IMAGE_OFFSET + QME_HCODE_LENGTH_BYTE)) + +/// QME Hcode + +HCD_CONST(QME_HCODE_IMAGE_OFFSET, (QME_INT_VECTOR_SIZE + QME_HEADER_SIZE)) // 0x200 +HCD_CONST(QME_HCODE_SIZE, (43 * 
ONE_KB)) +HCD_CONST(QME_COMMON_RING_SIZE, (5 * ONE_KB)) +HCD_CONST(QME_INST_RING_SIZE, (5 * ONE_KB)) +HCD_CONST(QME_DEBUG_PTRS_OFFSET, 0x200) +HCD_CONST(QME_DEBUG_PTRS_SIZE, 0x10) +HCD_CONST(QME_DUMP_PTRS_OFFSET, QME_DEBUG_PTRS_OFFSET + QME_DEBUG_PTRS_SIZE) +HCD_CONST(QME_DUMP_PTRS_SIZE, 0x300) +HCD_CONST(QME_ATTR_PTRS_OFFSET, QME_DUMP_PTRS_OFFSET + QME_DUMP_PTRS_SIZE) +HCD_CONST(QME_INSTRUMENTATION_SIZE, HALF_KB) +HCD_CONST(QME_SRAM_HCODE_OFFSET, 0) +HCD_CONST(QME_OVERRIDE_RING_SIZE, (2 * ONE_KB)) + +// QME Hcode + Core Scan + Pstate +HCD_CONST(QME_REGION_SIZE, (128 * ONE_KB)) + +// Debug + +HCD_CONST(CPMR_TRACE_REGION_OFFSET, (512 * ONE_KB)) +HCD_CONST(QME_TRACE_REGION_SIZE, (16 * ONE_KB)) +HCD_CONST(CPMR_TRACE_REGION_SIZE, (QME_TRACE_REGION_SIZE* MAX_QMES_PER_CHIP)) // 192K +HCD_CONST(CPMR_DEBUG_REGION_OFFSET, CPMR_TRACE_REGION_OFFSET + CPMR_TRACE_REGION_SIZE) +HCD_CONST(CPMR_DEBUG_REGION_SIZE, (64 * ONE_KB)) // 192K + 64K = 256K + +HCD_CONST(CACHE_CHIPLET_ID_MIN, 0x20 ) +HCD_CONST(CACHE_CHIPLET_ID_MAX, 0x27 ) + +//--------------------------------------------------------------------------------------- + +/// PPMR Header +HCD_CONST(PPMR_BUILD_VERSION, 1) +HCD_CONST(PPMR_HEADER_SIZE, 512) +HCD_CONST(PGPE_INT_VECTOR_SIZE, 384) +HCD_CONST(PGPE_HEADER_IMAGE_OFFSET, PGPE_INT_VECTOR_SIZE) +HCD_CONST(PGPE_BOOT_COPIER_OFFSET, PPMR_HEADER_SIZE) +HCD_CONST(PGPE_BOOT_COPIER_LENGTH, ONE_KB) +HCD_CONST(PGPE_BOOT_LOADER_OFFSET, + (PGPE_BOOT_COPIER_OFFSET + PGPE_BOOT_COPIER_LENGTH) ) + +HCD_CONST(PGPE_BOOT_LOADER_LENGTH, ONE_KB) +HCD_CONST(PGPE_HCODE_OFFSET, + PGPE_BOOT_LOADER_OFFSET + PGPE_BOOT_LOADER_LENGTH ) +HCD_CONST(PPMR_HOMER_OFFSET, (HOMER_PPMR_REGION_NUM* ONE_MB)) + +HCD_CONST(PPMR_MAGIC_NUMBER_BYTE, 0x00) +HCD_CONST(PPMR_BOOT_COPIER_OFFSET_BYTE, 0x08) +HCD_CONST(PPMR_BOOT_LOADER_OFFSET_BYTE, 0x10) +HCD_CONST(PPMR_BOOT_LOADER_LENGTH_BYTE, 0x14) +HCD_CONST(PPMR_BUILD_DATE_BYTE, 0x18) +HCD_CONST(PPMR_BUILD_VER_BYTE, 0x1C) +HCD_CONST(PPMR_PGPE_HCODE_OFFSET_BYTE, 0x28) +HCD_CONST(PPMR_PGPE_HCODE_LENGTH_BYTE, 0x2C) +HCD_CONST(PPMR_GLOBAL_PSTATE_OFFSET_BYTE, 0x30) +HCD_CONST(PPMR_GLOBAL_PSTATE_LENGTH_BYTE, 0x34) +HCD_CONST(PPMR_LOCAL_PSTATE_OFFSET_BYTE, 0x38) +HCD_CONST(PPMR_LOCAL_PSTATE_LENGTH_BYTE, 0x3C) +HCD_CONST(PPMR_OCC_PSTATE_OFFSET_BYTE, 0x40) +HCD_CONST(PPMR_OCC_PSTATE_LENGTH_BYTE, 0x44) +HCD_CONST(PPMR_PSTATE_TABLE_OFFSET_BYTE, 0x48) +HCD_CONST(PPMR_PSTATE_TABLE_LENGTH_BYTE, 0x4C) +HCD_CONST(PPMR_PGPE_SRAM_IMAGE_SIZE_BYTE, 0x50) +HCD_CONST(PPMR_PGPE_BOOT_PROG_CODE_BYTE, 0x54) +HCD_CONST(PPMR_WOF_TABLE_OFFSET, 0x58) +HCD_CONST(PPMR_WOF_TABLE_LENGTH, 0x5C) +HCD_CONST(PPMR_AUX_TASK_OFFSET, 0x60) +HCD_CONST(PPMR_AUX_TASK_LENGTH, 0x64) +HCD_CONST(PPMR_DEEP_OP_TRACE_OFFSET, 0x68) +HCD_CONST(PPMR_DEEP_OP_TRACE_LENGTH, 0x6C) + +/// PGPE Boot + +HCD_CONST(PGPE_BOOT_COPIER_PPMR_OFFSET, PPMR_HEADER_SIZE) +HCD_CONST(PGPE_BOOT_COPIER_SIZE, ONE_KB) + +HCD_CONST(PGPE_BOOT_LOADER_PPMR_OFFSET, + (PGPE_BOOT_COPIER_PPMR_OFFSET + PGPE_BOOT_COPIER_SIZE)) +HCD_CONST(PGPE_BOOT_LOADER_SIZE, ONE_KB) +HCD_CONST(PGPE_BOOT_LOADER_RESET_ADDR_VAL, 0x40) +HCD_CONST(XGPE_BOOT_LOADER_RESET_ADDR_VAL, PGPE_BOOT_LOADER_RESET_ADDR_VAL) + +HCD_CONST(PGPE_INSTRUMENTATION_SIZE, (2 * ONE_KB)) +/// PGPE Image +HCD_CONST(PGPE_IMAGE_PPMR_OFFSET, + (PGPE_BOOT_LOADER_PPMR_OFFSET + PGPE_BOOT_LOADER_SIZE)) + +HCD_CONST(PGPE_HCODE_RESET_ADDR_VAL, 0x40) +HCD_CONST(PGPE_DBG_PTR_AREA_SIZE, 64) + +/// PGPE Header + +HCD_CONST(PGPE_HEADER_SIZE, 128) + +HCD_CONST(PGPE_MAGIC_NUMBER_BYTE, 0x00) +HCD_CONST(PGPE_SYSTEM_RESET_ADDR_BYTE, 0x08) 
+HCD_CONST(PGPE_SHARED_SRAM_ADDR_BYTE, 0x0C) +HCD_CONST(PGPE_IVPR_ADDR_BYTE, 0x10) +HCD_CONST(PGPE_SHARED_SRAM_LENGTH_BYTE, 0x14) +HCD_CONST(PGPE_BUILD_DATE_BYTE, 0x18) +HCD_CONST(PGPE_BUILD_VER_BYTE, 0x1C) +HCD_CONST(PGPE_PGPE_FLAGS_BYTE, 0x20) +HCD_CONST(PGPE_PGPE_TIMEBASE_HZ, 0x24) +HCD_CONST(PGPE_GLOBAL_PSTATE_SRAM_ADDR_BYTE, 0x28) +HCD_CONST(PGPE_HCODE_LENGTH_BYTE, 0x2C) +HCD_CONST(PGPE_GLOBAL_PSTATE_MEM_OFFSET_BYTE, 0x30) +HCD_CONST(PGPE_GLOBAL_PSTATE_PPB_SIZE_BYTE, 0x34) +HCD_CONST(PGPE_GEN_PSTATE_TABLE_MEM_OFFSET_BYTE, 0x38) +HCD_CONST(PGPE_GEN_PSTATE_TABLE_SIZE_BYTE, 0x3C) +HCD_CONST(PGPE_OCC_PSTATE_TABLE_MEM_OFFSET_BYTE, 0x40) +HCD_CONST(PGPE_OCC_PSTATE_TABLE_SIZE_BYTE, 0x44) +HCD_CONST(PGPE_BEACON_ADDR_BYTE, 0x48) +HCD_CONST(PGPE_RESERVE_1, 0x4C) +HCD_CONST(PGPE_WOF_STATE_ADDR_BYTE, 0x50) +HCD_CONST(PGPE_RESERVE_2, 0x54) +HCD_CONST(PGPE_WOF_TABLE_ADDR_BYTE, 0x58) +HCD_CONST(PGPE_WOF_TABLE_LENGTH_BYTE, 0x5C) +HCD_CONST(PGPE_RESERVE_3, 0x60) +HCD_CONST(PGPE_RESERVE_4, 0x64) +HCD_CONST(PGPE_RESERVE_5, 0x68) +HCD_CONST(PGPE_OP_TRACE_PTR_BYTE, 0x6C) +HCD_CONST(PGPE_DEEP_OP_TRACE_MEM_ADDR_BYTE, 0x70) +HCD_CONST(PGPE_DEEP_OP_TRACE_LENGTH_BYTE, 0x74) + +HCD_CONST(PGPE_RESET_ADDR_IMAGE_OFFSET, (PGPE_HEADER_IMAGE_OFFSET + PGPE_SYSTEM_RESET_ADDR_BYTE)) +HCD_CONST(PGPE_BUILD_DATE_IMAGE_OFFSET, (PGPE_HEADER_IMAGE_OFFSET + PGPE_BUILD_DATE_BYTE)) +HCD_CONST(PGPE_BUILD_VER_IMAGE_OFFSET, (PGPE_HEADER_IMAGE_OFFSET + PGPE_BUILD_VER_BYTE)) + +//PPMR Misc +HCD_CONST(PPMR_MEM_MASK, 0x80300000) + +/// PGPE Hcode +HCD_CONST(PPMR_BOOT_REGION, (PPMR_HEADER_SIZE + PGPE_BOOT_COPIER_SIZE + PGPE_BOOT_LOADER_SIZE )) +HCD_CONST(PGPE_SRAM_BOOT_REGION, (PPMR_HEADER_SIZE + PGPE_BOOT_LOADER_SIZE )) +HCD_CONST(PGPE_GLOBAL_PSTATE_PARAM_BLOCK_SIZE, (6 * ONE_KB)) +HCD_CONST(PGPE_OCC_SHARED_SRAM_SIZE, (2 * ONE_KB)) +HCD_CONST(PGPE_DEBUG_PTRS_OFFSET, 0x200) +HCD_CONST(PGPE_DEBUG_PTRS_SIZE, 0x24) + + +/// Pstate Parameter Block + Pstate Table + +HCD_CONST(OCC_PSTATE_PARAM_BLOCK_PPMR_OFFSET, (128 * ONE_KB)) +HCD_CONST(OCC_PSTATE_PARAM_BLOCK_SIZE, (8 * ONE_KB)) // this is over allocated +HCD_CONST(OCC_PSTATE_PARAM_BLOCK_REGION_SIZE, (16 * ONE_KB)) + +HCD_CONST(PGPE_PSTATE_OUTPUT_TABLES_PPMR_OFFSET, (144 * ONE_KB)) +HCD_CONST(PGPE_PSTATE_OUTPUT_TABLES_SIZE, (8 * ONE_KB)) // this is over allocated +HCD_CONST(PGPE_PSTATE_OUTPUT_TABLES_REGION_SIZE, (16 * ONE_KB)) + +HCD_CONST(OCC_WOF_TABLES_PPMR_OFFSET, (768 * ONE_KB)) +HCD_CONST(OCC_WOF_TABLES_SIZE, (256 * ONE_KB)) +HCD_CONST(PPMR_RESERVE_PSTATE_TABLE_TO_WOF, + ( OCC_WOF_TABLES_PPMR_OFFSET - ( PGPE_PSTATE_OUTPUT_TABLES_PPMR_OFFSET + PGPE_PSTATE_OUTPUT_TABLES_REGION_SIZE ) )) + +HCD_CONST(WOF_TABLE_RESERVE, + OCC_WOF_TABLES_PPMR_OFFSET - (PGPE_PSTATE_OUTPUT_TABLES_PPMR_OFFSET + PGPE_PSTATE_OUTPUT_TABLES_REGION_SIZE)) +HCD_CONST(PGPE_AUX_TASK_SIZE, (2 * ONE_KB)) + +#endif /* __HCD_MEMMAP_BASE_H__ */ diff --git a/libpore/p10_hcd_memmap_homer.H b/libpore/p10_hcd_memmap_homer.H new file mode 100644 index 000000000..6338bf2dd --- /dev/null +++ b/libpore/p10_hcd_memmap_homer.H @@ -0,0 +1,94 @@ +/* IBM_PROLOG_BEGIN_TAG */ +/* This is an automatically generated prolog. */ +/* */ +/* $Source: src/import/chips/p10/procedures/hwp/lib/p10_hcd_memmap_homer.H $ */ +/* */ +/* OpenPOWER HostBoot Project */ +/* */ +/* Contributors Listed Below - COPYRIGHT 2015,2019 */ +/* [+] International Business Machines Corp. */ +/* */ +/* */ +/* Licensed under the Apache License, Version 2.0 (the "License"); */ +/* you may not use this file except in compliance with the License. 
*/ +/* You may obtain a copy of the License at */ +/* */ +/* http://www.apache.org/licenses/LICENSE-2.0 */ +/* */ +/* Unless required by applicable law or agreed to in writing, software */ +/* distributed under the License is distributed on an "AS IS" BASIS, */ +/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or */ +/* implied. See the License for the specific language governing */ +/* permissions and limitations under the License. */ +/* */ +/* IBM_PROLOG_END_TAG */ +/// +/// @file p9_hcd_memmap_homer.H +/// @brief defines region constants of homer. +/// + +// *HWP HWP Owner: David Du +// *HWP Backup HWP Owner: Greg Still +// *HWP FW Owner: Prem S Jha +// *HWP Team: PM +// *HWP Level: 2 +// *HWP Consumed by: PM:Hostboot:Phyp + +#ifndef __P9_HCD_MEMMAP_HOMER_H__ +#define __P9_HCD_MEMMAP_HOMER_H__ + +#include +#include + +// ------------------------------------------------------------------- +// Note: There can be NO semicolons(";") at end of macros in this file +// There can ONLY have HCD_CONST/HCD_CONST64 macros in this file +//-------------------------------------------------------------------- + +/// HOMER + +HCD_CONST(HOMER_BASE_ADDR, 0x80000000) +HCD_CONST(IMG_HDR_ALIGN_SIZE, 128) + +/// OPMR +HCD_CONST(OPMR_REGION_SIZE, ONE_MB ) + + +/// XPMR +HCD_CONST(XPMR_HOMER_OFFSET, (HOMER_XPMR_REGION_NUM* ONE_MB)) +HCD_CONST(HOMER_XPMR_BASE_ADDR, (HOMER_BASE_ADDR + (XPMR_HOMER_OFFSET))) +HCD_CONST(HOMER_XPMR_HEADER_ADDR, HOMER_XPMR_BASE_ADDR) +HCD_CONST(HOMER_XGPE_BOOT_COPIER_ADDR, (HOMER_XPMR_HEADER_ADDR + XPMR_HEADER_SIZE)) +HCD_CONST(XGPE_BOOT_COPIER_SIZE, (ONE_KB)) +HCD_CONST(HOMER_XGPE_BOOT_LOADER_OFFSET_ADDR, + (HOMER_XPMR_HEADER_ADDR + XPMR_BOOT_LOADER_OFFSET_BYTE)) +HCD_CONST(HOMER_XGPE_BOOT_LOADER_LENGTH_ADDR, + (HOMER_XPMR_HEADER_ADDR + XPMR_BOOT_LOADER_LENGTH_BYTE)) + +/// CPMR + +HCD_CONST(HOMER_CPMR_BASE_ADDR, (HOMER_BASE_ADDR + (CPMR_HOMER_OFFSET))) +HCD_CONST(HOMER_CPMR_HEADER_ADDR, HOMER_CPMR_BASE_ADDR) +HCD_CONST(HOMER_CPMR_TRACE_ADDR, (HOMER_CPMR_BASE_ADDR + CPMR_TRACE_REGION_OFFSET)) +HCD_CONST(HOMER_CPMR_DEBUG_ADDR, (HOMER_CPMR_BASE_ADDR + CPMR_DEBUG_REGION_OFFSET)) + + +/// PPMR + +HCD_CONST(HOMER_PPMR_BASE_ADDR, (HOMER_BASE_ADDR + (PPMR_HOMER_OFFSET))) +HCD_CONST(HOMER_PPMR_HEADER_ADDR, HOMER_PPMR_BASE_ADDR) +HCD_CONST(HOMER_PGPE_BOOT_LOADER_OFFSET_ADDR, + (HOMER_PPMR_HEADER_ADDR + PPMR_BOOT_LOADER_OFFSET_BYTE)) +HCD_CONST(HOMER_PGPE_BOOT_LOADER_LENGTH_ADDR, + (HOMER_PPMR_HEADER_ADDR + PPMR_BOOT_LOADER_LENGTH_BYTE)) +HCD_CONST(HOMER_PGPE_BOOT_COPIER_ADDR, + (HOMER_PPMR_HEADER_ADDR + PPMR_HEADER_SIZE)) + +HCD_CONST(HOMER_OCC_PSTATE_PARAM_BLOCK_ADDR, + (HOMER_PPMR_BASE_ADDR + OCC_PSTATE_PARAM_BLOCK_PPMR_OFFSET)) +HCD_CONST(HOMER_PGPE_PSTATE_OUTPUT_TABLES_ADDR, + (HOMER_PPMR_BASE_ADDR + PGPE_PSTATE_OUTPUT_TABLES_PPMR_OFFSET)) +HCD_CONST(HOMER_OCC_WOF_TABLES_ADDR, + (HOMER_PPMR_BASE_ADDR + OCC_WOF_TABLES_PPMR_OFFSET)) + +#endif /* __P9_HCD_MEMMAP_HOMER_H__ */ diff --git a/libpore/p10_hcd_memmap_occ_sram.H b/libpore/p10_hcd_memmap_occ_sram.H new file mode 100644 index 000000000..255748bc8 --- /dev/null +++ b/libpore/p10_hcd_memmap_occ_sram.H @@ -0,0 +1,174 @@ +/* IBM_PROLOG_BEGIN_TAG */ +/* This is an automatically generated prolog. */ +/* */ +/* $Source: src/import/chips/p10/procedures/hwp/lib/p10_hcd_memmap_occ_sram.H $ */ +/* */ +/* OpenPOWER HostBoot Project */ +/* */ +/* Contributors Listed Below - COPYRIGHT 2015,2020 */ +/* [+] International Business Machines Corp. 
*/ +/* */ +/* */ +/* Licensed under the Apache License, Version 2.0 (the "License"); */ +/* you may not use this file except in compliance with the License. */ +/* You may obtain a copy of the License at */ +/* */ +/* http://www.apache.org/licenses/LICENSE-2.0 */ +/* */ +/* Unless required by applicable law or agreed to in writing, software */ +/* distributed under the License is distributed on an "AS IS" BASIS, */ +/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or */ +/* implied. See the License for the specific language governing */ +/* permissions and limitations under the License. */ +/* */ +/* IBM_PROLOG_END_TAG */ +/// +/// @file p10_hcd_memmap_occ_sram.H +/// @brief defines region constants of occ sram. +/// + +// *HWP HWP Owner: David Du +// *HWP Backup HWP Owner: Greg Still +// *HWP FW Owner: Prem S Jha +// *HWP Team: PM +// *HWP Level: 2 +// *HWP Consumed by: HB, XGPE,PGPE + +#ifndef __HCD_MEMMAP_OCC_SRAM_H__ +#define __HCD_MEMMAP_OCC_SRAM_H__ + +#include +#include + +// ------------------------------------------------------------------- +// Note: There can be NO semicolons(";") at end of macros in this file +// There can ONLY have HCD_CONST/HCD_CONST64 macros in this file +// ------------------------------------------------------------------- + +/// OCC SRAM + +HCD_CONST(OCC_SRAM_BASE_ADDR, 0xFFF00000) +HCD_CONST(GPE0_SRAM_BASE_ADDR, 0xFFF01000) +HCD_CONST(GPE1_SRAM_BASE_ADDR, 0xFFF10000) +HCD_CONST(PGPE_SRAM_BASE_ADDR, 0xFFF20000) +HCD_CONST(XGPE_SRAM_BASE_ADDR, 0xFFF30000) +HCD_CONST(OCC_SRAM_SIZE, ONE_MB) +HCD_CONST(OCC_SRAM_END_ADDR, ( OCC_SRAM_BASE_ADDR + OCC_SRAM_SIZE)) + +/// Base Addresses for various debug/trace regions in OCC SRAM +HCD_CONST(OCC_SRAM_TRACE_BUF_BASE_ERR, 0xFFFB4000) +HCD_CONST(OCC_SRAM_TRACE_BUF_BASE_INF, 0xFFFB6000) +HCD_CONST(OCC_SRAM_TRACE_BUF_BASE_IMP, 0xFFFB8000) +HCD_CONST(OCC_SRAM_TRACE_BUF_BASE_SSX_PTR, 0xFFF40824) +HCD_CONST(OCC_SRAM_PGPE_REGION_SIZE, (64 * ONE_KB)) +HCD_CONST(OCC_SHARED_SRAM_ADDR_START, + ((PGPE_SRAM_BASE_ADDR + OCC_SRAM_PGPE_REGION_SIZE) - PGPE_OCC_SHARED_SRAM_SIZE)) + +// Offset to trace buf ptr and trace buffer size from base +HCD_CONST(GPE_DEBUG_PTR_OFFSET, 0x180) + +// Size of various traces regions in OCC SRAM +HCD_CONST(OCC_SRAM_TRACE_BUF_SSX_SIZE_PTR, 0xFFF40828) +HCD_CONST(OCC_SRAM_TRACE_BUF_ERR_SIZE, (8 * ONE_KB)) +HCD_CONST(OCC_SRAM_TRACE_BUF_INF_SIZE, (8 * ONE_KB)) +HCD_CONST(OCC_SRAM_TRACE_BUF_IMP_SIZE, (8 * ONE_KB)) + +HCD_CONST(OCC_SRAM_IPC_REGION_SIZE, (4 * ONE_KB)) +HCD_CONST(OCC_SRAM_GPE0_REGION_SIZE, (60 * ONE_KB)) +HCD_CONST(OCC_SRAM_GPE1_REGION_SIZE, (64 * ONE_KB)) +HCD_CONST(OCC_SRAM_OCC_REGION_SIZE, (512 * ONE_KB)) +HCD_CONST(OCC_SRAM_XGPE_REGION_SIZE, (64 * ONE_KB)) + + +HCD_CONST(PPE_RESET_VECTOR, 0x40) +//-------------------------------------------------------------------------------------- + +/// PGPE Base + +HCD_CONST(OCC_SRAM_PGPE_BASE_ADDR, PGPE_SRAM_BASE_ADDR) +HCD_CONST(OCC_SRAM_PGPE_END_ADDR, + (PGPE_SRAM_BASE_ADDR + OCC_SRAM_PGPE_REGION_SIZE)) +HCD_CONST(OCC_SRAM_PGPE_HCODE_RESET_ADDR, + (PGPE_SRAM_BASE_ADDR + PGPE_HCODE_RESET_ADDR_VAL)) +HCD_CONST(OCC_SRAM_PGPE_HEADER_ADDR, + (OCC_SRAM_PGPE_BASE_ADDR + PGPE_INT_VECTOR_SIZE)) +//PGPE image size is sum of various parts hence located here instead of p10_hcd_memmap_base.H +HCD_CONST(PGPE_HCODE_SIZE, (OCC_SRAM_PGPE_REGION_SIZE - ( PGPE_OCC_SHARED_SRAM_SIZE + + PGPE_GLOBAL_PSTATE_PARAM_BLOCK_SIZE + PGPE_SRAM_BOOT_REGION ))) +HCD_CONST(PGPE_IMAGE_SIZE, (PGPE_HCODE_SIZE + PGPE_GLOBAL_PSTATE_PARAM_BLOCK_SIZE + + 
PGPE_OCC_SHARED_SRAM_SIZE + PGPE_SRAM_BOOT_REGION)) +HCD_CONST(PGPE_IMAGE_RESERVE_SIZE, + (OCC_PSTATE_PARAM_BLOCK_PPMR_OFFSET - PGPE_IMAGE_PPMR_OFFSET - PGPE_IMAGE_SIZE)) + + +/// PGPE Boot + +HCD_CONST(OCC_SRAM_PGPE_COPY_BOOT_LOADER_SIZE, ONE_KB) +HCD_CONST(OCC_SRAM_PGPE_COPY_PPMR_HEADER_SIZE, 512) +HCD_CONST(OCC_SRAM_PGPE_BOOT_LOADER_ADDR, + (OCC_SRAM_END_ADDR - OCC_SRAM_PGPE_COPY_BOOT_LOADER_SIZE)) +HCD_CONST(OCC_SRAM_PGPE_BOOT_LOADER_RESET_ADDR, + (OCC_SRAM_PGPE_BOOT_LOADER_ADDR + PGPE_BOOT_LOADER_RESET_ADDR_VAL)) +HCD_CONST(OCC_SRAM_PGPE_PPMR_HEADER_ADDR, + (OCC_SRAM_PGPE_BOOT_LOADER_ADDR - OCC_SRAM_PGPE_COPY_PPMR_HEADER_SIZE)) +HCD_CONST(OCC_SRAM_PGPE_OPTRACE_ADDR, OCC_SRAM_PGPE_BOOT_LOADER_ADDR) +HCD_CONST(OCC_SRAM_PGPE_OPTRACE_SIZE, OCC_SRAM_PGPE_COPY_BOOT_LOADER_SIZE) + +/// PGPE Copy + +HCD_CONST(OCC_SRAM_PGPE_HCODE_OFFSET_ADDR, + (OCC_SRAM_PGPE_PPMR_HEADER_ADDR + PPMR_PGPE_HCODE_OFFSET_BYTE)) +HCD_CONST(OCC_SRAM_PGPE_HCODE_LENGTH_ADDR, + (OCC_SRAM_PGPE_PPMR_HEADER_ADDR + PPMR_PGPE_HCODE_LENGTH_BYTE)) +HCD_CONST(OCC_SRAM_PGPE_IMAGE_LENGTH_ADDR, + (OCC_SRAM_PGPE_PPMR_HEADER_ADDR + PPMR_PGPE_SRAM_IMAGE_SIZE_BYTE)) + +// Misc constants used in PGPE boot loader and boot copier. +HCD_CONST(PGPE_BOOT_COPY_SUCCESS, 0x42432d53 ) // ASCII code for BC-S +HCD_CONST(PGPE_BOOT_COPIER_FAIL, 0x42432d46 ) // ASCII code for BC-F +HCD_CONST(PGPE_BOOT_LOADER_SUCCESS, 0x424c2d53 ) // ASCII code for BL-S +HCD_CONST(PGPE_BOOT_LOADER_FAIL, 0x424c2d46 ) // ASCII code for BL-F + +//-------------------------------------------------------------------------------------- + +// Misc constants used in XGPE boot loader and boot copier. +HCD_CONST(DIVDE_BY_8, 3) +HCD_CONST(DOUBLE_WORD_SIZE, 8) +HCD_CONST(XGPE_IMG_OFFSET_POS, 40) +HCD_CONST(BOOT_COPIER_LEN_ZERO, 0) +HCD_CONST(ENABLE_TRAP, 0) +HCD_CONST(XGPE_BOOT_COPY_SUCCESS, 0x42432d53 ) // ASCII code for BC-S +HCD_CONST(XGPE_BOOT_COPIER_FAIL, 0x42432d46 ) // ASCII code for BC-F +HCD_CONST(XGPE_BOOT_LOADER_SUCCESS, 0x424c2d53 ) // ASCII code for BL-S +HCD_CONST(XGPE_BOOT_LOADER_FAIL, 0x424c2d46 ) // ASCII code for BL-F + +/// XGPE Base +HCD_CONST(OCC_SRAM_XGPE_SYSTEM_RESET_ADDR, + (XGPE_SRAM_BASE_ADDR + XGPE_HCODE_RESET_ADDR_VAL)) +HCD_CONST(OCC_SRAM_XGPE_IVPR_ADDR, XGPE_SRAM_BASE_ADDR) +HCD_CONST(OCC_SRAM_XGPE_GPPB_ADDR, + (PGPE_SRAM_BASE_ADDR + PGPE_HEADER_IMAGE_OFFSET + PGPE_GLOBAL_PSTATE_SRAM_ADDR_BYTE)) +HCD_CONST(OCC_SRAM_XGPE_GPPB_LEN, + (PGPE_SRAM_BASE_ADDR + PGPE_HEADER_IMAGE_OFFSET + PGPE_GLOBAL_PSTATE_PPB_SIZE_BYTE)) + +/// XGPE Boot +HCD_CONST(OCC_SRAM_XGPE_COPY_BOOT_LOADER_SIZE, ONE_KB) +HCD_CONST(OCC_SRAM_XGPE_COPY_XPMR_HEADER_SIZE, 512) +HCD_CONST(OCC_SRAM_XGPE_BOOT_LOADER_ADDR, + (OCC_SRAM_END_ADDR - OCC_SRAM_XGPE_COPY_BOOT_LOADER_SIZE)) +HCD_CONST(OCC_SRAM_XGPE_BOOT_LOADER_RESET_ADDR, + (OCC_SRAM_XGPE_BOOT_LOADER_ADDR + XGPE_BOOT_LOADER_RESET_ADDR_VAL)) +HCD_CONST(OCC_SRAM_XGPE_XPMR_HEADER_ADDR, + (OCC_SRAM_XGPE_BOOT_LOADER_ADDR - OCC_SRAM_XGPE_COPY_XPMR_HEADER_SIZE)) + +/// XGPE Copy +HCD_CONST(OCC_SRAM_XGPE_HCODE_OFFSET_ADDR, + (OCC_SRAM_XGPE_XPMR_HEADER_ADDR + XPMR_XGPE_HCODE_OFFSET_BYTE)) +HCD_CONST(OCC_SRAM_XGPE_HCODE_LENGTH_ADDR, + (OCC_SRAM_XGPE_XPMR_HEADER_ADDR + XPMR_XGPE_HCODE_LENGTH_BYTE)) +HCD_CONST(OCC_SRAM_XGPE_IMAGE_LENGTH_ADDR, + (OCC_SRAM_XGPE_XPMR_HEADER_ADDR + XPMR_XGPE_SRAM_IMAGE_SIZE_BYTE)) +HCD_CONST(OCC_SRAM_XGPE_HCODE_RESET_ADDR, + (XGPE_SRAM_BASE_ADDR + XGPE_HCODE_RESET_ADDR_VAL)) + +#endif /* __HCD_MEMMAP_OCC_SRAM_H__ */ diff --git a/libpore/p10_hcode_image_defines.H b/libpore/p10_hcode_image_defines.H new file mode 
100644 index 000000000..6a14cb241 --- /dev/null +++ b/libpore/p10_hcode_image_defines.H @@ -0,0 +1,462 @@ +/* IBM_PROLOG_BEGIN_TAG */ +/* This is an automatically generated prolog. */ +/* */ +/* $Source: src/import/chips/p10/procedures/hwp/lib/p10_hcode_image_defines.H $ */ +/* */ +/* OpenPOWER HostBoot Project */ +/* */ +/* Contributors Listed Below - COPYRIGHT 2019,2020 */ +/* [+] International Business Machines Corp. */ +/* */ +/* */ +/* Licensed under the Apache License, Version 2.0 (the "License"); */ +/* you may not use this file except in compliance with the License. */ +/* You may obtain a copy of the License at */ +/* */ +/* http://www.apache.org/licenses/LICENSE-2.0 */ +/* */ +/* Unless required by applicable law or agreed to in writing, software */ +/* distributed under the License is distributed on an "AS IS" BASIS, */ +/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or */ +/* implied. See the License for the specific language governing */ +/* permissions and limitations under the License. */ +/* */ +/* IBM_PROLOG_END_TAG */ +/// +/// @file p10_hcode_image_defines.H +/// @brief defines constants associated with hcode image build. +/// +// *HWP HWP Owner: Greg Still +// *HWP FW Owner: Prem S Jha +// *HWP Team: PM +// *HWP Level: 2 +// *HWP Consumed by: Hostboot: HBRT + +#ifndef __HW_IMG_DEFINE +#define __HW_IMG_DEFINE + +#include +#include +#include +#include + +//-------------------------------------------------------------------------- +// local structs and constants +// ------------------------------------------------------------------------- +#ifndef __ASSEMBLER__ +#ifdef __cplusplus +#ifndef __PPE_PLAT + +#define IMG_HDR_ALIGN_SIZE 32 + +namespace hcodeImageBuild +{ +#endif //__PPE_PLAT +#endif //__cplusplus +#endif //__ASSEMBLER__ + +/** + * CPMR Header + * + * This header is only consumed by Hcode Image Build and + * lab tools, not by PPE code. It is generated with assembler + * primitives during QME build and placed in HOMER by + * Hcode Image Build. 
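+ *
+ * (A rough illustration, not from the patch itself: the HCD_HDR_* macros
+ * used below expand two ways. In C they declare a struct member, e.g.
+ * HCD_HDR_UINT32( iv_buildDate, 0 ) becomes roughly "uint32_t iv_buildDate;",
+ * while under __ASSEMBLER__ they emit the matching data directives, so the
+ * assembler-built image and the C view of the header stay in lockstep. The
+ * exact expansions live in the included header definitions.)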
+ */ + +#ifdef __ASSEMBLER__ +.macro .cpmr_header +.section ".cpmr" , "aw" +.balign 8 +#else +typedef struct +{ +#endif +HCD_HDR_ATTN ( iv_attnOpcodes, 2); +HCD_HDR_UINT64( iv_cpmrMagicWord, CPMR_MAGIC_NUMBER); +HCD_HDR_UINT32( iv_buildDate, 0); +HCD_HDR_UINT32( iv_version, 0); +HCD_HDR_UINT8_VEC (iv_reserveFlags, 4, 0); +HCD_HDR_UINT8 ( iv_selfRestoreVer, 0); +HCD_HDR_UINT8 ( iv_stopApiVer, 0); +HCD_HDR_UINT8 ( iv_urmorFix, 0); +HCD_HDR_UINT8 ( iv_fusedMode, 0); +HCD_HDR_UINT32( iv_qmeImgOffset, 0); +HCD_HDR_UINT32( iv_qmeImgLength, 0); +HCD_HDR_UINT32( iv_commonRingOffset, 0); +HCD_HDR_UINT32( iv_commonRingLength, 0); +HCD_HDR_UINT32( iv_localPstateOffset, 0); +HCD_HDR_UINT32( iv_localPstateLength, 0); +HCD_HDR_UINT32( iv_specRingOffset, 0); +HCD_HDR_UINT32( iv_specRingLength, 0); +HCD_HDR_UINT32( iv_scomRestoreOffset, 0); +HCD_HDR_UINT32( iv_scomRestoreLength, 0); +HCD_HDR_UINT32( iv_selfRestoreOffset, 0); +HCD_HDR_UINT32( iv_selfRestoreLength, 0); +HCD_HDR_UINT32( iv_maxCoreL2ScomEntry, 0); +HCD_HDR_UINT32( iv_maxEqL3ScomEntry, 0); +HCD_HDR_UINT32( iv_coreL2ScomOffset, 0); +HCD_HDR_UINT32( iv_coreL2ScomLength, 0); +HCD_HDR_UINT32( iv_eqL3ScomOffset, 0); +HCD_HDR_UINT32( iv_eqL3ScomLength, 0); +HCD_HDR_PAD(CPMR_HEADER_SIZE); +#ifdef __ASSEMBLER__ +.endm +#else +} __attribute__((packed, aligned(256))) CpmrHeader_t; +#endif + +/** + * QME Header + * + * The QME header is loaded in the QME SRAM so it is "tight" (little extra space) + * Thus, this "structure" is NOT padded to a specific size and is limited to + * 64B. Also, structure member names are preceded with "g_" as these becoming + * global variables in the QME Hcode. + */ +#ifdef __ASSEMBLER__ +.macro .qme_header +.section ".qme_image_header" , "aw" +.balign 8 +#else +typedef struct +{ +#endif +HCD_HDR_UINT64( g_qme_magic_number, QME_MAGIC_NUMBER ); +HCD_HDR_UINT32( g_qme_hcode_offset, 0 ); +HCD_HDR_UINT32( g_qme_hcode_length, 0 ); +HCD_HDR_UINT32( g_qme_common_ring_offset, 0 ); +HCD_HDR_UINT32( g_qme_cmn_ring_ovrd_offset, 0 ); +HCD_HDR_UINT32( g_qme_common_ring_length, 0 ); +HCD_HDR_UINT32( g_qme_pstate_region_offset, 0 ); +HCD_HDR_UINT32( g_qme_pstate_region_length, 0 ); +HCD_HDR_UINT32( g_qme_inst_spec_ring_offset, 0 ); +HCD_HDR_UINT32( g_qme_max_spec_ring_length, 0 ); +HCD_HDR_UINT32( g_qme_scom_offset, 0 ); +HCD_HDR_UINT32( g_qme_scom_length, 0 ); +HCD_HDR_UINT32( g_qme_attr_tank_address, 0 ); +HCD_HDR_UINT16( g_qme_location_id, 0 ); +HCD_HDR_UINT16( g_qme_reserved , 0 ); +HCD_HDR_UINT32( g_qme_timebase_hz, 0 ); //Retain next field at 8B boundary +HCD_HDR_UINT64( g_qme_cpmr_PhyAddr, 0 ); +HCD_HDR_UINT64( g_qme_unsec_cpmr_PhyAddr, 0 ); +HCD_HDR_UINT32( g_qme_custom_length, 0 ); +HCD_HDR_UINT32( g_qme_elog_addr, 0 ); +HCD_HDR_PAD(IMG_HDR_ALIGN_SIZE); +#ifdef __ASSEMBLER__ +.endm +#else +//QME Header size is 96B +} __attribute__((packed, aligned(32))) QmeHeader_t; +#endif + +#ifndef __ASSEMBLER__ + +typedef struct QMEImageFlags +{ + uint32_t fused_mode : 1; + uint32_t reserved0 : 31; +} QMEImageFlags_t; + +#endif //__ASSEMBLER__ + +#ifdef __ASSEMBLER__ +.macro .ppmr_header +.section ".ppmr" , "aw" +.balign 8 +#else +typedef struct +{ +#endif +HCD_HDR_UINT64( iv_ppmrMagicWord, PPMR_MAGIC_NUMBER); +HCD_HDR_UINT32( iv_bootCopierOffset, 0); +HCD_HDR_UINT32( iv_reserved1, 0); +HCD_HDR_UINT32( iv_bootLoaderOffset, 0); +HCD_HDR_UINT32( iv_bootLoaderLength, 0); +HCD_HDR_UINT32( iv_buildDate, 0); +HCD_HDR_UINT32( iv_buildVer, 0); +HCD_HDR_UINT64( iv_reserved2, 0); +HCD_HDR_UINT32( iv_hcodeOffset, 0); +HCD_HDR_UINT32( iv_hcodeLength, 0); 
+HCD_HDR_UINT32( iv_gpspbOffset, 0); +HCD_HDR_UINT32( iv_gpspbLength, 0); +HCD_HDR_UINT32( iv_lpspbOffset, 0); +HCD_HDR_UINT32( iv_lpspbLength, 0); +HCD_HDR_UINT32( iv_opspbOffset, 0); +HCD_HDR_UINT32( iv_opspbLength, 0); +HCD_HDR_UINT32( iv_pstateOffset, 0); +HCD_HDR_UINT32( iv_pstateLength, 0); +HCD_HDR_UINT32( iv_sramSize, 0); +HCD_HDR_UINT32( iv_progCode, 0); +HCD_HDR_UINT32( iv_wofTableOffset, 0); +HCD_HDR_UINT32( iv_wofTableLength, 0); +HCD_HDR_UINT32( iv_deepOptraceOffset, 0); +HCD_HDR_UINT32( iv_deepOptraceLength, 0); + +#ifdef __ASSEMBLER__ +.endm +#else +} __attribute__((packed, aligned(32))) PpmrHeader_t; +#endif + +#ifdef __ASSEMBLER__ +.macro .pgpe_header +.section ".pgpe_hdr" , "aw" +.balign 8 +#else +typedef struct +{ +#endif +HCD_HDR_UINT64( g_pgpe_magicWord, PGPE_MAGIC_NUMBER); +HCD_HDR_UINT32( g_pgpe_sysResetAddress, 0); +HCD_HDR_UINT32( g_pgpe_sharedSramAddress, 0); +HCD_HDR_UINT32( g_pgpe_ivprAddress, 0); +HCD_HDR_UINT32( g_pgpe_sharedLength, 0); +HCD_HDR_UINT32( g_pgpe_buildDate, 0); +HCD_HDR_UINT32( g_pgpe_buildVer, 0); +HCD_HDR_UINT32( g_pgpe_reserved0, 0); +HCD_HDR_UINT32( g_pgpe_timeBaseHz, 0); +HCD_HDR_UINT32( g_pgpe_gpspbSramAddress, 0); +HCD_HDR_UINT32( g_pgpe_hcodeLength, 0); +HCD_HDR_UINT32( g_pgpe_gpspbMemOffset, 0); +HCD_HDR_UINT32( g_pgpe_gpspbMemLength, 0); +HCD_HDR_UINT32( g_pgpe_genPsTableMemOffset, 0); +HCD_HDR_UINT32( g_pgpe_genPsTableMemLength, 0); +HCD_HDR_UINT32( g_pgpe_opspbTableAddress, 0); +HCD_HDR_UINT32( g_pgpe_opspbTableLength, 0); +HCD_HDR_UINT32( g_pgpe_beaconAddress, 0); +HCD_HDR_UINT32( g_pgpe_reserved1, 0); +HCD_HDR_UINT32( g_pgpe_pgpeWofStateAddress, 0); +HCD_HDR_UINT32( g_pgpe_reserved2, 0); +HCD_HDR_UINT32( g_pgpe_wofTableAddress, 0); +HCD_HDR_UINT32( g_pgpe_wofTableLength, 0); +HCD_HDR_UINT32( g_pgpe_reserved3, 0); +HCD_HDR_UINT32( g_pgpe_reserved4, 0); +HCD_HDR_UINT32( g_pgpe_reserved5, 0); +HCD_HDR_UINT32( g_pgpe_opTracePtr, 0); +HCD_HDR_UINT32( g_pgpe_deepOpTraceMemAddress, 0); +HCD_HDR_UINT32( g_pgpe_deepOpTraceLength, 0); +#ifdef __ASSEMBLER__ +.endm +#else +} __attribute__((packed, aligned(32))) PgpeHeader_t; +#endif + +#ifdef __ASSEMBLER__ +.macro .xpmr_hdr +.section ".xpmr" , "aw" +.balign 8 +#else +typedef struct +{ +#endif +HCD_HDR_UINT64( iv_xpmrMagicWord, XPMR_MAGIC_NUMBER); +HCD_HDR_UINT32( iv_bootCopierOffset, 0); +HCD_HDR_UINT32( iv_reserve1, 0); +HCD_HDR_UINT32( iv_bootLoaderOffset, 0); +HCD_HDR_UINT32( iv_bootLoaderLength, 0); +HCD_HDR_UINT32( iv_buildDate, 0); +HCD_HDR_UINT32( iv_version, 0); +HCD_HDR_UINT32( iv_reserve2, 0); +HCD_HDR_UINT32( iv_reserve3, 0); +HCD_HDR_UINT32( iv_xgpeHcodeOffset, 0); +HCD_HDR_UINT32( iv_xgpeHcodeLength, 0); +HCD_HDR_UINT32( iv_xgpeBootProgCode, 0); +HCD_HDR_UINT32( iv_xgpeSramSize, 0); +HCD_HDR_PAD(XPMR_HEADER_SIZE); +#ifdef __ASSEMBLER__ +.endm +#else +} __attribute__((packed, aligned(512))) XpmrHeader_t; +#endif + +#ifdef __ASSEMBLER__ +.macro .xgpe_header +.section ".xgpe_header" , "aw" +.balign 8 +#else +typedef struct +{ +#endif +HCD_HDR_UINT64( g_xgpe_magicWord, XGPE_MAGIC_NUMBER); +HCD_HDR_UINT32( g_xgpe_sysResetAddress, 0 ); //FIXME need to add correct address +HCD_HDR_UINT32( g_xgpe_sharedSramAddress, 0 ); //FIXME need to add correct address +HCD_HDR_UINT32( g_xgpe_ivprAddress, 0 ); //FIXME need to add correct address +HCD_HDR_UINT32( g_xgpe_sharedSramLength, 0 ); +HCD_HDR_UINT32( g_xgpe_buildDate, 0 ); +HCD_HDR_UINT32( g_xgpe_buildVer, 0 ); +HCD_HDR_UINT16( g_xgpe_xgpeFlags, 0 ); +HCD_HDR_UINT16( g_xgpe_reserve1, 0 ); +HCD_HDR_UINT32( g_xgpe_timeBaseHz, 0 ); +HCD_HDR_UINT32( 
g_xgpe_gpspbSramAddress, 0 ); +HCD_HDR_UINT32( g_xgpe_hcodeLength, 0 ); +HCD_HDR_UINT32( g_xgpe_reserve2, 0 ); +HCD_HDR_UINT32( g_xgpe_gpspbLength, 0 ); +HCD_HDR_UINT32( g_xgpe_coreThrottleAssertCnt, 0 ); +HCD_HDR_UINT32( g_xgpe_coreThrottleDeAssertCnt, 0 ); +HCD_HDR_UINT32( g_xgpe_charactControls, 0 ); +HCD_HDR_UINT32( g_xgpe_xgpeOpTracePointer, 0 ); +HCD_HDR_UINT32( g_xgpe_xgpeDeepOpTraceMemAddr, 0 ); +HCD_HDR_UINT32( g_xgpe_xgpeDeepOpTraceLength, 0 ); +HCD_HDR_PAD(IMG_HDR_ALIGN_SIZE); +#ifdef __ASSEMBLER__ +.endm +#else +} __attribute__((packed, aligned(32))) XgpeHeader_t; +#endif + +#ifndef __ASSEMBLER__ + +/** + * @brief enumerates all return codes associated with hcode image build. + */ +enum ImgBldRetCode_t +{ + IMG_BUILD_SUCCESS = 0, + BUILD_FAIL_XGPE_IMAGE = 1, + BUILD_FAIL_SELF_REST_IMAGE = 2, + BUILD_FAIL_QME_IMAGE = 3, + BUILD_FAIL_PGPE_IMAGE = 4, + BUILD_FAIL_XGPE_QPMR = 5, + BUILD_FAIL_XGPE_BL1 = 6, + BUILD_FAIL_XGPE_BL2 = 7, + BUILD_FAIL_XGPE_INT_VECT = 8, + BUILD_FAIL_XGPE_HDR = 9, + BUILD_FAIL_XGPE_HCODE = 10, + BUILD_FAIL_XGPE_CMN_RINGS = 11, + BUILD_FAIL_XGPE_SPEC_RINGS = 12, + BUILD_FAIL_CPMR_HDR = 13, + BUILD_FAIL_SRESET_HNDLR = 14, + BUILD_FAIL_THRD_LAUNCHER = 15, + BUILD_FAIL_SPR_RESTORE = 16, + BUILD_FAIL_SCOM_RESTORE = 17, + BUILD_FAIL_QME_IMG_HDR = 18, + BUILD_FAIL_QME_HCODE = 19, + BUILD_FAIL_CMN_RINGS = 20, + BUILD_FAIL_QME_QUAD_PSTATE = 21, + BUILD_FAIL_SPEC_RINGS = 22, + BUILD_FAIL_INT_VECT = 23, + BUILD_FAIL_PGPE_BL1 = 24, + BUILD_FAIL_PGPE_BL2 = 25, + BUILD_FAIL_PGPE_HCODE = 26, + BUILD_FAIL_OVERRIDE = 27, + BUILD_SEC_SIZE_OVERFLOW = 28, + BUILD_FAIL_INVALID_SECTN = 29, + BUILD_FAIL_RING_EXTRACTN = 30, + QME_SRAM_IMG_SIZE_ERR = 31, + XGPE_SRAM_IMG_SIZE_ERR = 32, + PGPE_SRAM_IMG_SIZE_ERR = 33, + BUILD_FAIL_PGPE_PPMR = 34, + BUILD_FAIL_XIP_CUST_ERR = 35, + BUILD_ERR_INTERNAL = 0xffff, +}; + +/** + * @brief models SCOM restore header region. + */ +typedef struct +{ + uint16_t iv_magicMark; + uint8_t iv_version; + uint8_t iv_reserved1; + uint8_t iv_reserved2[4]; + uint16_t iv_coreOffset; + uint16_t iv_coreLength; + uint16_t iv_eqOffset; + uint16_t iv_eqLength; + uint16_t iv_l2Offset; + uint16_t iv_l2Length; + uint16_t iv_l3Offset; + uint16_t iv_l3Length; +} ScomRestoreHeader_t; + +/** + * @brief models a CPU register restoration area in STOP section of homer image. + */ +typedef struct +{ + uint8_t iv_threadRestoreArea[MAX_THREADS_PER_CORE][SMF_CORE_RESTORE_THREAD_AREA_SIZE]; + uint8_t iv_threadSaveArea[MAX_THREADS_PER_CORE][SMF_SELF_SAVE_THREAD_AREA_SIZE]; + uint8_t iv_coreRestoreArea[SMF_CORE_RESTORE_CORE_AREA_SIZE]; + uint8_t iv_coreSaveArea[SMF_CORE_SAVE_CORE_AREA_SIZE]; +} SmfSprRestoreRegion_t; + +/** + * @brief models image section of CPMR in HOMER. + */ +typedef union CPMRSelfRestoreLayout +{ + uint8_t iv_region[SMF_SELF_RESTORE_CODE_SIZE]; + struct + { + CpmrHeader_t iv_CPMRHeader; + uint8_t iv_exe[SMF_SELF_RESTORE_CODE_SIZE - sizeof(CpmrHeader_t)]; + } elements; +} CPMRSelfRestoreLayout_t; + +/** + * @brief models image section associated with core self restore in HOMER. + */ +typedef struct +{ + CPMRSelfRestoreLayout_t iv_CPMR_SR; + uint8_t iv_coreSelfRestore[SMF_SELF_RESTORE_CORE_REGS_SIZE]; + uint8_t iv_reserve[SCOM_RESTORE_CPMR_OFFSET - SMF_SELF_RESTORE_SIZE_TOTAL]; + uint8_t iv_coreScom[SCOM_RESTORE_SIZE_TOTAL]; +} SelfRestoreLayout_t; + +typedef struct +{ + SelfRestoreLayout_t iv_selfRestoreRegion; + uint8_t iv_qmeSramRegion[QME_REGION_SIZE]; +} CPMRLayout_t; + +/** + * @brief models image section associated with PGPE in HOMER. 
+ */
+typedef struct
+{
+    uint8_t iv_ppmrHeader[PPMR_HEADER_SIZE];
+    uint8_t iv_bootCopier[PGPE_BOOT_COPIER_SIZE];
+    uint8_t iv_bootLoader[PGPE_BOOT_LOADER_SIZE];
+    uint8_t iv_pgpeSramRegion[OCC_SRAM_PGPE_REGION_SIZE];
+    uint8_t iv_reserve1[OCC_PSTATE_PARAM_BLOCK_PPMR_OFFSET - (PPMR_BOOT_REGION + OCC_SRAM_PGPE_REGION_SIZE)];
+    uint8_t iv_occPstateParamBlock[OCC_PSTATE_PARAM_BLOCK_REGION_SIZE];
+    uint8_t iv_pstateTable[PGPE_PSTATE_OUTPUT_TABLES_REGION_SIZE];
+    uint8_t iv_reserve2[PPMR_RESERVE_PSTATE_TABLE_TO_WOF];
+    uint8_t iv_wofTable[OCC_WOF_TABLES_SIZE];
+} PPMRLayout_t;
+
+/**
+ * @brief models XPMR in HOMER.
+ */
+typedef struct
+{
+    uint8_t iv_xpmrHeader[XPMR_HEADER_SIZE];
+    uint8_t iv_bootCopier[XGPE_BOOT_COPIER_LENGTH];
+    uint8_t iv_bootLoader[XGPE_BOOT_LOADER_LENGTH];
+    uint8_t iv_xgpeSramRegion[XGPE_SRAM_SIZE];
+} XPMRLayout_t;
+
+/**
+ * @brief models layout of HOMER.
+ */
+typedef struct
+{
+    uint8_t iv_occHostRegion[OCC_HOST_AREA_SIZE];
+    XPMRLayout_t iv_xpmrRegion;
+    uint8_t iv_xpmrReserve[ONE_MB - sizeof( XPMRLayout_t )];
+    CPMRLayout_t iv_cpmrRegion;
+    uint8_t iv_cpmrReserve[ONE_MB - sizeof( CPMRLayout_t )];
+    PPMRLayout_t iv_ppmrRegion;
+    uint8_t iv_ppmrReserve[ONE_MB - sizeof( PPMRLayout_t )];
+} Homerlayout_t;
+
+#ifdef __cplusplus
+#ifndef __PPE_PLAT
+}// namespace hcodeImageBuild ends
+#endif //__PPE_PLAT
+#endif //__cplusplus
+
+#endif //__ASSEMBLER__
+#endif //__HW_IMG_DEFINE
diff --git a/libpore/p10_stop_api.C b/libpore/p10_stop_api.C
new file mode 100644
index 000000000..4a8efa7cd
--- /dev/null
+++ b/libpore/p10_stop_api.C
@@ -0,0 +1,1816 @@
+/* IBM_PROLOG_BEGIN_TAG */
+/* This is an automatically generated prolog. */
+/* */
+/* $Source: chips/p10/procedures/utils/stopreg/p10_stop_api.C $ */
+/* */
+/* IBM CONFIDENTIAL */
+/* */
+/* EKB Project */
+/* */
+/* COPYRIGHT 2015,2019 */
+/* [+] International Business Machines Corp. */
+/* */
+/* */
+/* The source code for this program is not published or otherwise */
+/* divested of its trade secrets, irrespective of what has been */
+/* deposited with the U.S. Copyright Office. */
+/* */
+/* IBM_PROLOG_END_TAG */
+
+///
+/// @file p10_stop_api.C
+/// @brief implements the STOP API which creates/manipulates the STOP image.
+///
+// *HWP HW Owner : Greg Still
+// *HWP FW Owner : Prem Shanker Jha
+// *HWP Team : PM
+// *HWP Level : 2
+// *HWP Consumed by : HB:HYP
+
+// *INDENT-OFF*
+#ifdef PPC_HYP
+    #include
+#endif
+
+#include "p10_stop_api.H"
+#include "p10_cpu_reg_restore_instruction.H"
+#include "p10_stop_data_struct.H"
+#include
+#include "p10_stop_util.H"
+#include "p10_hcode_image_defines.H"
+#ifdef __cplusplus
+extern "C" {
+
+using namespace hcodeImageBuild;
+namespace stopImageSection
+{
+#endif
+// a true in the table below means the register is of thread scope,
+// whereas a false means the register is of core scope.
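+//
+// For orientation, iv_isThreadScope below decides how many save/restore
+// areas an SPR touches; consumers such as proc_stop_init_cpureg() further
+// down do, in sketch form:
+//
+//     if( g_sprRegister_p10[idx].iv_isThreadScope )
+//     {
+//         /* walk all MAX_THREADS_PER_CORE thread restore areas */
+//     }
+//     else
+//     {
+//         /* touch the single per-core restore area */
+//     }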
+
+const StopSprReg_t g_sprRegister_p10[] =
+{
+    { PROC_STOP_SPR_CIABR,   true,  0   },
+    { PROC_STOP_SPR_DAWR,    true,  1   },
+    { PROC_STOP_SPR_DAWRX,   true,  2   },
+    { PROC_STOP_SPR_HSPRG0,  true,  3   },
+    { PROC_STOP_SPR_LDBAR,   true,  4   },
+    { PROC_STOP_SPR_LPCR,    true,  5   },
+    { PROC_STOP_SPR_PSSCR,   true,  6   },
+    { PROC_STOP_SPR_MSR,     true,  7   },
+    { PROC_STOP_SPR_HRMOR,   false, 255 },
+    { PROC_STOP_SPR_HID,     false, 21  },
+    { PROC_STOP_SPR_HMEER,   false, 22  },
+    { PROC_STOP_SPR_PMCR,    false, 23  },
+    { PROC_STOP_SPR_PTCR,    false, 24  },
+    { PROC_STOP_SPR_SMFCTRL, true,  28  },
+    { PROC_STOP_SPR_USPRG0,  true,  29  },
+    { PROC_STOP_SPR_USPRG1,  true,  30  },
+    { PROC_STOP_SPR_URMOR,   false, 255 },
+};
+
+const uint32_t MAX_SPR_SUPPORTED_P10 = 17;
+const uint32_t DEFAULT_CORE_SCOM_SUPPORTED = 15;
+const uint32_t DEFAULT_QUAD_SCOM_SUPPORTED = 255;
+
+//-----------------------------------------------------------------------------
+
+/**
+ * @brief validates input arguments passed to proc_stop_save_cpureg_control.
+ * @param[in] i_pImage points to start of HOMER
+ * @param[in] i_coreId id of the core
+ * @param[in] i_threadId id of the thread
+ * @param[in] i_saveMaskVector SPR save bit mask vector
+ * @return STOP_SAVE_SUCCESS if function succeeds, error code otherwise.
+ */
+STATIC StopReturnCode_t validateArgumentSaveRegMask( void* const i_pImage,
+                                                     uint32_t const i_coreId,
+                                                     uint32_t const i_threadId,
+                                                     uint64_t i_saveMaskVector )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+
+    do
+    {
+        if( !i_pImage )
+        {
+            l_rc = STOP_SAVE_ARG_INVALID_IMG;
+            break;
+        }
+
+        if( i_coreId > MAX_CORE_ID_SUPPORTED )
+        {
+            l_rc = STOP_SAVE_ARG_INVALID_CORE;
+            break;
+        }
+
+        if( i_threadId > MAX_THREAD_ID_SUPPORTED )
+        {
+            l_rc = STOP_SAVE_ARG_INVALID_THREAD;
+            break;
+        }
+
+        if( ( 0 == i_saveMaskVector ) || ( BAD_SAVE_MASK & i_saveMaskVector ) )
+        {
+            l_rc = STOP_SAVE_ARG_INVALID_REG;
+            break;
+        }
+
+    }
+    while(0);
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+/**
+ * @brief validates input arguments provided by STOP API caller.
+ * @param[in] i_pImage pointer to beginning of chip's HOMER image.
+ * @param[in] i_regId SPR register id
+ * @param[in] i_coreId core id
+ * @param[in|out] i_pThreadId points to thread id
+ * @param[in|out] i_pThreadLevelReg points to scope information of SPR
+ * @return STOP_SAVE_SUCCESS if arguments found valid, error code otherwise.
+ * @note for a register of core scope, the function shall force *i_pThreadId
+ * to zero.
+ */
+STATIC StopReturnCode_t validateSprImageInputs( void* const i_pImage,
+                                                const CpuReg_t i_regId,
+                                                const uint32_t i_coreId,
+                                                uint32_t* i_pThreadId,
+                                                bool* i_pThreadLevelReg )
+{
+    uint32_t index = 0;
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+    bool sprSupported = false;
+    *i_pThreadLevelReg = false;
+
+    do
+    {
+        if( NULL == i_pImage )
+        {
+            // Error: HOMER image start location is not valid
+            // Cannot proceed further. So, let us exit.
+            l_rc = STOP_SAVE_ARG_INVALID_IMG;
+            MY_ERR( "invalid image location " );
+
+            break;
+        }
+
+        // STOP API manages STOP image based on physical core Id. PIR value
+        // is interpreted to calculate the physical core number and virtual
+        // thread number.
+        if( MAX_CORE_ID_SUPPORTED < i_coreId )
+        {
+            // Error: invalid core number. Given core number exceeds maximum
+            // cores supported by chip.
+ + // Physical core number is calculated based on following formula: + // core id = 4 * quad id (0..5) + core no within quad ( 0..3) + l_rc = STOP_SAVE_ARG_INVALID_CORE; + MY_ERR( "invalid core id " ); + break; + } + + if( MAX_THREAD_ID_SUPPORTED < *i_pThreadId ) + { + //Error: invalid core thread. Given core thread exceeds maximum + //threads supported in a core. + + // 64 bit PIR value is interpreted to calculate virtual thread + // Id. In fuse mode, b61 and b62 gives virtual thread id whereas in + // non fuse mode, b62 and b63 is read to determine the same. + + l_rc = STOP_SAVE_ARG_INVALID_THREAD; + MY_ERR( "invalid thread " ); + break; + } + + for( index = 0; index < MAX_SPR_SUPPORTED_P10; ++index ) + { + if( i_regId == (CpuReg_t )g_sprRegister_p10[index].iv_sprId ) + { + // given register is in the list of register supported + sprSupported = true; + *i_pThreadLevelReg = g_sprRegister_p10[index].iv_isThreadScope; + *i_pThreadId = *i_pThreadLevelReg ? *i_pThreadId : 0; + break; + } + } + + if( !sprSupported ) + { + // Following SPRs are supported + // trace out all registers supported + MY_ERR("Register not supported" ); + // error code to caller. + l_rc = STOP_SAVE_ARG_INVALID_REG; + break; + } + + } + while(0); + + if( l_rc ) + { + MY_ERR( "regId %08d, coreId %d, " + "threadId %d return code 0x%08x", i_regId, + i_coreId, *i_pThreadId, l_rc ); + } + + return l_rc; +} + +//----------------------------------------------------------------------------- + +/** + * @brief generates ori instruction code. + * @param[in] i_Rs Source register number + * @param[in] i_Ra destination register number + * @param[in] i_data 16 bit immediate data + * @return returns 32 bit number representing ori instruction. + */ +STATIC uint32_t getOriInstruction( const uint16_t i_Rs, const uint16_t i_Ra, + const uint16_t i_data ) +{ + uint32_t oriInstOpcode = 0; + oriInstOpcode = 0; + oriInstOpcode = ORI_OPCODE << 26; + oriInstOpcode |= i_Rs << 21; + oriInstOpcode |= i_Ra << 16; + oriInstOpcode |= i_data; + + return SWIZZLE_4_BYTE(oriInstOpcode); +} + +//----------------------------------------------------------------------------- + +/** + * @brief generates 32 bit key used for SPR lookup in core section. + */ +STATIC uint32_t genKeyForSprLookup( const CpuReg_t i_regId ) +{ + return getOriInstruction( 24, 0, (uint16_t) i_regId ); +} + +//----------------------------------------------------------------------------- + +/** + * @brief generates xor instruction code. + * @param[in] i_Rs source register number for xor operation + * @param[in] i_Ra destination register number for xor operation result + * @param[in] i_Rb source register number for xor operation + * @return returns 32 bit number representing xor immediate instruction. + */ +STATIC uint32_t getXorInstruction( const uint16_t i_Ra, const uint16_t i_Rs, + const uint16_t i_Rb ) +{ + uint32_t xorRegInstOpcode; + xorRegInstOpcode = XOR_CONST << 1; + xorRegInstOpcode |= OPCODE_31 << 26; + xorRegInstOpcode |= i_Rs << 21; + xorRegInstOpcode |= i_Ra << 16; + xorRegInstOpcode |= i_Rb << 11; + + return SWIZZLE_4_BYTE(xorRegInstOpcode); +} + +//----------------------------------------------------------------------------- + +/** + * @brief generates oris instruction code. + * @param[in] i_Rs source register number + * @param[in] i_Ra destination register number + * @param[in] i_data 16 bit immediate data + * @return returns 32 bit number representing oris immediate instruction. 
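+ *
+ * Worked example (illustrative; it assumes ORI_OPCODE/ORIS_OPCODE carry the
+ * standard Power ISA primary opcodes 24 and 25): getOriInstruction(0, 0, 0x39)
+ * yields (24 << 26) | 0x39 = 0x60000039, i.e. "ori r0, r0, 0x39", and
+ * getOrisInstruction(0, 0, 0x1234) yields 0x64001234, i.e.
+ * "oris r0, r0, 0x1234"; both are passed through SWIZZLE_4_BYTE before being
+ * stored into the image.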
+ */ +STATIC uint32_t getOrisInstruction( const uint16_t i_Rs, const uint16_t i_Ra, + const uint16_t i_data ) +{ + uint32_t orisInstOpcode; + orisInstOpcode = 0; + orisInstOpcode = ORIS_OPCODE << 26; + orisInstOpcode |= ( i_Rs & 0x001F ) << 21 | ( i_Ra & 0x001F ) << 16; + orisInstOpcode |= i_data; + + return SWIZZLE_4_BYTE(orisInstOpcode); +} + +//----------------------------------------------------------------------------- + +/** + * @brief generates instruction for mtspr + * @param[in] i_Rs source register number + * @param[in] i_Spr represents spr where data is to be moved. + * @return returns 32 bit number representing mtspr instruction. + */ +STATIC uint32_t getMtsprInstruction( const uint16_t i_Rs, const uint16_t i_Spr ) +{ + uint32_t mtsprInstOpcode = 0; + uint32_t temp = (( i_Spr & 0x03FF ) << 11); + mtsprInstOpcode = (uint8_t)i_Rs << 21; + mtsprInstOpcode |= ( temp & 0x0000F800 ) << 5; + mtsprInstOpcode |= ( temp & 0x001F0000 ) >> 5; + mtsprInstOpcode |= MTSPR_BASE_OPCODE; + + return SWIZZLE_4_BYTE(mtsprInstOpcode); +} + +//----------------------------------------------------------------------------- + +/** + * @brief generates instruction for mfmsr + * @param[in] i_Rt target register for SPR content. + * @return returns 32 bit number representing mfmsr instruction. + */ +STATIC uint32_t getMfmsrInstruction( const uint16_t i_Rt ) +{ + uint32_t mfmsrInstOpcode = ((OPCODE_31 << 26) | (i_Rt << 21) | ((MFMSR_CONST)<< 1)); + + return SWIZZLE_4_BYTE(mfmsrInstOpcode); +} + +//----------------------------------------------------------------------------- + +/** + * @brief generates rldicr instruction. + * @param[in] i_Rs source register number + * @param[in] i_Ra destination register number + * @param[in] i_sh bit position by which contents of i_Rs are to be shifted + * @param[in] i_me bit position up to which mask should be 1. + * @return returns 32 bit number representing rldicr instruction. + */ +STATIC uint32_t getRldicrInstruction( const uint16_t i_Ra, const uint16_t i_Rs, + const uint16_t i_sh, uint16_t i_me ) +{ + uint32_t rldicrInstOpcode = 0; + rldicrInstOpcode = ((RLDICR_OPCODE << 26 ) | ( i_Rs << 21 ) | ( i_Ra << 16 )); + rldicrInstOpcode |= ( ( i_sh & 0x001F ) << 11 ) | (RLDICR_CONST << 2 ); + rldicrInstOpcode |= (( i_sh & 0x0020 ) >> 4); + rldicrInstOpcode |= (i_me & 0x001F ) << 6; + rldicrInstOpcode |= (i_me & 0x0020 ); + return SWIZZLE_4_BYTE(rldicrInstOpcode); +} + +//----------------------------------------------------------------------------- + +STATIC uint32_t getMfsprInstruction( const uint16_t i_Rt, const uint16_t i_sprNum ) +{ + uint32_t mfsprInstOpcode = 0; + uint32_t temp = (( i_sprNum & 0x03FF ) << 11); + mfsprInstOpcode = (uint8_t)i_Rt << 21; + mfsprInstOpcode |= (( temp & 0x0000F800 ) << 5); + mfsprInstOpcode |= (( temp & 0x001F0000 ) >> 5); + mfsprInstOpcode |= MFSPR_BASE_OPCODE; + + return SWIZZLE_4_BYTE(mfsprInstOpcode); +} + +//----------------------------------------------------------------------------- + +STATIC uint32_t getBranchLinkRegInstruction(void) +{ + uint32_t branchConstInstOpcode = 0; + branchConstInstOpcode = (( OPCODE_18 << 26 ) | ( SELF_SAVE_FUNC_ADD ) | 0x03 ); + + return SWIZZLE_4_BYTE(branchConstInstOpcode); +} +//----------------------------------------------------------------------------- + +/** + * @brief looks up entry for given SPR in given thread/core section. + * @param[in] i_pThreadSectLoc start of given thread section or core section. + * @param[in] i_lookUpKey search key for lookup of given SPR entry. 
+ * @param[in] i_isThreadReg true if register is of thread scope, false
+ *            otherwise.
+ * @param[in|out] io_pSprEntryLoc Input: NULL
+ *                Output: location of given entry or end of table.
+ * @return STOP_SAVE_SUCCESS if entry is found, STOP_SAVE_FAIL in case of
+ *         an error.
+ */
+STATIC StopReturnCode_t lookUpSprInImage( uint32_t* i_pThreadSectLoc, const uint32_t i_lookUpKey,
+                                          const bool i_isThreadReg, void** io_pSprEntryLoc )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_FAIL;
+    uint32_t temp = 0;
+    uint32_t* i_threadSectEnd = NULL;
+    uint32_t bctr_inst = SWIZZLE_4_BYTE(BLR_INST);
+    *io_pSprEntryLoc = NULL;
+
+    do
+    {
+        if( !i_pThreadSectLoc )
+        {
+            MY_ERR( "Bad SPR Start Location" );
+            break;
+        }
+
+        temp = i_isThreadReg ? (uint32_t)(SMF_CORE_RESTORE_THREAD_AREA_SIZE) :
+               (uint32_t)(SMF_CORE_RESTORE_CORE_AREA_SIZE);
+
+        i_threadSectEnd = i_pThreadSectLoc + ( temp >> 2 );
+
+        temp = 0;
+
+        while( ( i_pThreadSectLoc <= i_threadSectEnd ) &&
+               ( temp != bctr_inst ) )
+        {
+            temp = *i_pThreadSectLoc;
+
+            if( ( temp == i_lookUpKey ) || ( temp == bctr_inst ) )
+            {
+                *io_pSprEntryLoc = i_pThreadSectLoc;
+                l_rc = STOP_SAVE_SUCCESS;
+                break;
+            }
+
+            i_pThreadSectLoc = i_pThreadSectLoc + SIZE_PER_SPR_RESTORE_INST;
+        }
+    }
+    while(0);
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+/**
+ * @brief updates an SPR STOP image entry.
+ * @param[in] i_pSprEntryLocation location of entry.
+ * @param[in] i_regId register Id associated with SPR.
+ * @param[in] i_regData data to be written to SPR entry.
+ * @param[in] i_mode update mode (INIT_SPR_REGION or UPDATE_SPR_ENTRY).
+ * @return STOP_SAVE_SUCCESS if update works, STOP_SAVE_FAIL otherwise.
+ */
+STATIC StopReturnCode_t updateSprEntryInImage( uint32_t* i_pSprEntryLocation,
+                                               const CpuReg_t i_regId,
+                                               const uint64_t i_regData,
+                                               const enum SprEntryUpdateMode i_mode
+                                             )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+    uint32_t tempInst = 0;
+    uint64_t tempRegData = 0;
+    bool newEntry = true;
+    uint16_t regRs = 0; //to use R0 for SPR restore instruction generation
+    uint16_t regRa = 0;
+
+    do
+    {
+        if( !i_pSprEntryLocation )
+        {
+            MY_ERR("invalid location of SPR image entry" );
+            l_rc = STOP_SAVE_FAIL;
+            break;
+        }
+
+        tempInst = genKeyForSprLookup( i_regId );
+
+        if( *i_pSprEntryLocation == tempInst )
+        {
+            newEntry = false;
+        }
+
+        //Add SPR search instruction i.e. "ori r0, r0, SPRID"
+        *i_pSprEntryLocation = tempInst;
+        i_pSprEntryLocation += SIZE_PER_SPR_RESTORE_INST;
+
+        if( INIT_SPR_REGION == i_mode )
+        {
+            //adding inst 'b . + 0x1C'
+            *i_pSprEntryLocation = SWIZZLE_4_BYTE(SKIP_SPR_REST_INST);
+        }
+        else
+        {
+            //clear R0 i.e. "xor ra, rs, rb"
+            tempInst = getXorInstruction( regRs, regRs, regRs );
+            *i_pSprEntryLocation = tempInst;
+        }
+
+        i_pSprEntryLocation += SIZE_PER_SPR_RESTORE_INST;
+
+        tempRegData = i_regData >> 48;
+        //get bits b0-b15 (the highest-order 16 bits) of SPR restore value in R0
+        tempInst = getOrisInstruction( regRs, regRa, (uint16_t)tempRegData );
+        *i_pSprEntryLocation = tempInst;
+        i_pSprEntryLocation += SIZE_PER_SPR_RESTORE_INST;
+
+        tempRegData = ((i_regData >> 32) & 0x0000FFFF );
+        //get bits b16-b31 of SPR restore value in R0
+        tempInst = getOriInstruction( regRs, regRa, (uint16_t)tempRegData );
+        *i_pSprEntryLocation = tempInst;
+        i_pSprEntryLocation += SIZE_PER_SPR_RESTORE_INST;
+
+        //Rotate R0 left by 32 bit positions and zero the lower-order 32 bits.
+        //Place the result in R0
+        tempInst = getRldicrInstruction(regRa, regRs, 32, 31);
+        *i_pSprEntryLocation = tempInst;
+        i_pSprEntryLocation += SIZE_PER_SPR_RESTORE_INST;
+
+        tempRegData = ((i_regData >> 16) & 0x000000FFFF );
+        //get bits b32-b47 of SPR restore value to R0
+        tempInst = getOrisInstruction( regRs, regRa, (uint16_t)tempRegData );
+        *i_pSprEntryLocation = tempInst;
+        i_pSprEntryLocation += SIZE_PER_SPR_RESTORE_INST;
+
+        tempRegData = (uint16_t)i_regData;
+        //get bits b48-b63 of SPR restore value to R0
+        tempInst = getOriInstruction( regRs, regRa, (uint16_t)i_regData );
+        *i_pSprEntryLocation = tempInst;
+        i_pSprEntryLocation += SIZE_PER_SPR_RESTORE_INST;
+
+        if( PROC_STOP_SPR_MSR == i_regId )
+        {
+            //MSR cannot be restored completely with the mtmsrd instruction,
+            //as it does not update the ME, LE and HV bits. In the self
+            //restore code, in order to restore MSR, the contents of R21 are
+            //moved to SRR1. An RFID is then executed, which copies SRR1 to
+            //MSR and thereby restores the LE bit we are specifically
+            //interested in. The instruction below moves the MSR value
+            //(in R0) to R21.
+            tempInst = SWIZZLE_4_BYTE( MR_R0_TO_R21 );
+        }
+        else if ( PROC_STOP_SPR_HRMOR == i_regId )
+        {
+            //Case HRMOR, move contents of R0 to a placeholder GPR (R10)
+            //Thread Launcher expects HRMOR value in R10
+            tempInst = SWIZZLE_4_BYTE( MR_R0_TO_R10 );
+        }
+        else if( PROC_STOP_SPR_URMOR == i_regId )
+        {
+            //Case URMOR, move contents of R0 to a placeholder GPR (R9)
+            //Thread Launcher expects URMOR value in R9
+            tempInst = SWIZZLE_4_BYTE( MR_R0_TO_R9 );
+        }
+        else
+        {
+            // Case other SPRs, move contents of R0 to SPR
+            // For a UV system, even HRMOR is treated like any other SPR.
+            tempInst =
+                getMtsprInstruction( 0, (uint16_t)i_regId );
+        }
+
+        *i_pSprEntryLocation = tempInst;
+
+        if( newEntry )
+        {
+            i_pSprEntryLocation += SIZE_PER_SPR_RESTORE_INST;
+            //at the end of SPR restore, add instruction BLR to go back to
+            //the thread launcher.
+            tempInst = SWIZZLE_4_BYTE(BLR_INST);
+            *i_pSprEntryLocation = tempInst;
+        }
+    }
+    while(0);
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+STATIC StopReturnCode_t initSelfSaveEntry( void* const i_pImage, uint16_t i_sprNum )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+    uint32_t* i_pSprSave = (uint32_t*)i_pImage;
+
+    //ori r0, r0, 0x00nn
+    *i_pSprSave = getOriInstruction( 0, 0, i_sprNum );
+
+    i_pSprSave++;
+
+    //addi r31, r31, 0x20
+    *i_pSprSave = SWIZZLE_4_BYTE(SKIP_SPR_SELF_SAVE);
+    i_pSprSave++;
+
+    //nop
+    *i_pSprSave = getOriInstruction( 0, 0, 0 );
+    i_pSprSave++;
+
+    //mtlr r30
+    *i_pSprSave = SWIZZLE_4_BYTE( MTLR_INST );
+    i_pSprSave++;
+
+    //blr
+    *i_pSprSave = SWIZZLE_4_BYTE(BLR_INST);
+    i_pSprSave++;
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+STATIC StopReturnCode_t getSprRegIndexAdjustment( const uint32_t i_saveMaskPos, uint32_t* i_sprAdjIndex )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+
+    do
+    {
+        if( (( i_saveMaskPos >= SPR_BIT_POS_8 ) && ( i_saveMaskPos <= SPR_BIT_POS_20 )) ||
+            (( i_saveMaskPos >= SPR_BIT_POS_25 ) && ( i_saveMaskPos <= SPR_BIT_POS_27 )) )
+        {
+            l_rc = STOP_SAVE_SPR_BIT_POS_RESERVE;
+            break;
+        }
+
+        if( (i_saveMaskPos > SPR_BIT_POS_20) && (i_saveMaskPos < SPR_BIT_POS_25) )
+        {
+            *i_sprAdjIndex = 12;
+        }
+        else if( i_saveMaskPos > SPR_BIT_POS_27 )
+        {
+            *i_sprAdjIndex = 15;
+        }
+        else
+        {
+            *i_sprAdjIndex = 0;
+        }
+
+    }
+    while(0);
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+/**
+ * @brief returns core region and relative id w.r.t. the quad
+ * @param[in] i_scomAddress scom address associated with a core
+ * @param[out] o_scomRegion SCOM region in HOMER
+ * @param[out] o_coreRelativeInst core relative id
+ * @return STOP_SAVE_SUCCESS if function succeeds, error code otherwise
+ */
+STATIC StopReturnCode_t decodeScomAddress( const uint32_t i_scomAddress, uint32_t * o_scomRegion,
+                                           uint32_t * o_coreRelativeInst )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+    uint32_t l_regionSelect = ( i_scomAddress & CORE_REGION_MASK );
+    uint32_t l_endPoint = ( i_scomAddress & EP_SELECT_MASK );
+    l_endPoint = ( l_endPoint >> 16 );
+    l_regionSelect = l_regionSelect >> 12;
+
+    if( 1 == l_endPoint )
+    {
+        *o_scomRegion = PROC_STOP_SECTION_L3;
+    }
+    else if ( 2 == l_endPoint )
+    {
+        *o_scomRegion = PROC_STOP_SECTION_CORE;
+    }
+    else
+    {
+        //unknown endpoint: do not leave *o_scomRegion unset
+        l_rc = STOP_SAVE_SCOM_INVALID_ADDRESS;
+    }
+
+    switch( l_regionSelect )
+    {
+        case 8:
+            *o_coreRelativeInst = 0;
+            break;
+
+        case 4:
+            *o_coreRelativeInst = 1;
+            break;
+
+        case 2:
+            *o_coreRelativeInst = 2;
+            break;
+
+        case 1:
+            *o_coreRelativeInst = 3;
+            break;
+
+        default:
+            l_rc = STOP_SAVE_SCOM_INVALID_ADDRESS;
+            break;
+    }
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+/**
+ * @brief validates all the input arguments.
+ * @param[in] i_pImage pointer to start of HOMER image for proc chip.
+ * @param[in] i_scomAddress SCOM address of register.
+ * @param[in] i_chipletId core or cache chiplet id
+ * @param[in] i_operation operation requested for SCOM entry.
+ * @param[in] i_section image section on which operation is to be performed
+ * @return STOP_SAVE_SUCCESS if arguments found valid, error code otherwise.
+ * @note Function does not validate that the given SCOM address really
+ * belongs to the given section.
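+ * @note (illustrative) the section/region consistency check below accepts,
+ * e.g., a core-endpoint address only with PROC_STOP_SECTION_CORE or
+ * PROC_STOP_SECTION_L2, and an L3-endpoint address only with
+ * PROC_STOP_SECTION_L3 or PROC_STOP_SECTION_CACHE.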
+ */
+STATIC StopReturnCode_t validateScomImageInputs( void* const i_pImage,
+                                                 const uint32_t i_scomAddress,
+                                                 const uint8_t i_chipletId,
+                                                 const ScomOperation_t i_operation,
+                                                 const ScomSection_t i_section )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+    uint32_t l_scomRegion = 0;
+    uint32_t l_coreId = 0;
+
+    do
+    {
+        if( !i_pImage )
+        {
+            //Error Invalid image pointer
+            l_rc = STOP_SAVE_ARG_INVALID_IMG;
+            MY_ERR("invalid image location ");
+            break;
+        }
+
+        if( 0 == i_scomAddress )
+        {
+            l_rc = STOP_SAVE_SCOM_INVALID_ADDRESS;
+            MY_ERR("invalid SCOM address");
+            break;
+        }
+
+        if(( CACHE_CHIPLET_ID_MIN > i_chipletId ) ||
+           ( CACHE_CHIPLET_ID_MAX < i_chipletId ))
+        {
+            l_rc = STOP_SAVE_SCOM_INVALID_CHIPLET;
+            MY_ERR("chiplet id not valid");
+            break;
+        }
+
+        if(( PROC_STOP_SCOM_OP_MIN >= i_operation ) ||
+           ( PROC_STOP_SCOM_OP_MAX <= i_operation ))
+        {
+            //invalid SCOM image operation requested
+            l_rc = STOP_SAVE_SCOM_INVALID_OPERATION;
+            MY_ERR("invalid SCOM image operation");
+            break;
+        }
+
+        l_rc = decodeScomAddress( i_scomAddress, &l_scomRegion, &l_coreId );
+
+        if( l_rc )
+        {
+            MY_ERR( "Bad Scom Address 0x%08x", i_scomAddress );
+            break;
+        }
+
+        if( PROC_STOP_SECTION_CORE == l_scomRegion )
+        {
+            if( ( i_section != PROC_STOP_SECTION_CORE ) &&
+                ( i_section != PROC_STOP_SECTION_L2 ) )
+            {
+                MY_ERR( "SCOM address doesn't match with section type passed,"
+                        " EP : %d , Section Type %d", l_scomRegion, i_section );
+                l_rc = STOP_SAVE_SCOM_INVALID_SECTION;
+                break;
+            }
+        }
+
+        if( PROC_STOP_SECTION_L3 == l_scomRegion )
+        {
+            if( ( i_section != PROC_STOP_SECTION_L3 ) &&
+                ( i_section != PROC_STOP_SECTION_CACHE ) )
+            {
+                MY_ERR( "SCOM address doesn't match with section type passed,"
+                        " EP : %d , Section Type %d", l_scomRegion, i_section );
+                l_rc = STOP_SAVE_SCOM_INVALID_SECTION;
+                break;
+            }
+        }
+    }
+    while(0);
+
+    if( l_rc )
+    {
+        MY_ERR("SCOMAddress 0x%08x chipletId 0x%08x operation"
+               " 0x%08x section 0x%08x", i_scomAddress, i_chipletId,
+               i_operation, i_section );
+    }
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+/**
+ * @brief determines HOMER region for SCOM restore entry request.
+ * @param[in] i_pImage points to base of HOMER image.
+ * @param[in] i_sectn SCOM restore section
+ * @param[in] i_instanceId core instance id
+ * @param[out] o_entryDat meta data pertaining to SCOM restore entry analysis
+ * @return STOP_SAVE_SUCCESS if HWP succeeds, error code otherwise.
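+ * @note For orientation: the per-quad block computed below starts at
+ *     SCOM_RESTORE_HOMER_OFFSET + quad id * 4 * (iv_maxCoreL2ScomEntry +
+ *     iv_maxEqL3ScomEntry) * SCOM_RESTORE_ENTRY_SIZE,
+ * after which the core/L3 offsets from ScomRestoreHeader_t and the core's
+ * position within the quad select the final sub-region.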
+ */
+STATIC StopReturnCode_t lookUpScomRestoreRegion( void * i_pImage, const ScomSection_t i_sectn, uint32_t i_instanceId,
+                                                 ScomEntryDat_t * o_entryDat )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+    CpmrHeader_t * l_pCpmrHdr = NULL;
+    ScomRestoreHeader_t *l_scomHdr = NULL;
+    uint32_t l_relativeCorePos = 0;
+    uint32_t l_offset = 0;
+    uint32_t l_quadId = 0;
+    uint32_t l_scomLen = 0;
+
+    MY_INF( ">> lookUpScomRestoreRegion" );
+
+    o_entryDat->iv_subRegionBaseOffset = 0;
+    o_entryDat->iv_subRegionLength = 0;
+    l_quadId = ( i_instanceId >> 2 );
+
+    l_relativeCorePos = i_instanceId % MAX_CORES_PER_QUAD;
+    l_pCpmrHdr = ( CpmrHeader_t *) ( (uint8_t *) i_pImage + CPMR_HOMER_OFFSET );
+    l_scomLen = SWIZZLE_4_BYTE(l_pCpmrHdr->iv_maxCoreL2ScomEntry) +
+                SWIZZLE_4_BYTE(l_pCpmrHdr->iv_maxEqL3ScomEntry);
+    l_scomLen = ( l_scomLen * SCOM_RESTORE_ENTRY_SIZE );
+
+    l_offset = ( l_scomLen * l_quadId * MAX_CORES_PER_QUAD ) + SCOM_RESTORE_HOMER_OFFSET;
+
+    MY_INF( "QUAD_ID 0x%08x BASE OFFSET 0x%08x", l_quadId, l_offset );
+
+    l_scomHdr = ( ScomRestoreHeader_t *) ( (uint8_t *) i_pImage + l_offset );
+
+    if( ( PROC_STOP_SECTION_CORE == i_sectn ) || ( PROC_STOP_SECTION_L2 == i_sectn ) )
+    {
+        MY_INF( "Core Offset 0x%04x", SWIZZLE_2_BYTE(l_scomHdr->iv_coreOffset) );
+        l_offset += SWIZZLE_2_BYTE(l_scomHdr->iv_coreOffset);
+        o_entryDat->iv_subRegionLength = SWIZZLE_2_BYTE(l_scomHdr->iv_coreLength);
+        l_offset += ( SWIZZLE_4_BYTE(l_pCpmrHdr->iv_maxCoreL2ScomEntry) * l_relativeCorePos );
+    }
+    else if( ( PROC_STOP_SECTION_L3 == i_sectn ) || ( PROC_STOP_SECTION_CACHE == i_sectn ) )
+    {
+        MY_INF( "Cache Offset 0x%04x", SWIZZLE_2_BYTE(l_scomHdr->iv_l3Offset) );
+        l_offset += SWIZZLE_2_BYTE(l_scomHdr->iv_l3Offset);
+        o_entryDat->iv_subRegionLength = SWIZZLE_2_BYTE(l_scomHdr->iv_l3Length);
+        l_offset += ( SWIZZLE_4_BYTE(l_pCpmrHdr->iv_maxEqL3ScomEntry) * l_relativeCorePos );
+    }
+    else
+    {
+        o_entryDat->iv_subRegionBaseOffset = 0;
+        l_rc = STOP_SAVE_SCOM_INVALID_SECTION;
+    }
+
+    if( !l_rc )
+    {
+        o_entryDat->iv_subRegionBaseOffset = l_offset;
+    }
+
+    MY_INF( "SCOM Section Offset 0x%08x", l_offset );
+
+    MY_INF( "<< lookUpScomRestoreRegion" );
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+STATIC StopReturnCode_t lookUpScomRestoreEntry( void * i_pImage, const ScomSection_t i_sectn,
+                                                const uint32_t i_scomAddress, ScomEntryDat_t * o_pScomDat )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+    CpmrHeader_t * l_pCpmrHdr = NULL;
+    uint8_t * l_pScomByte = NULL;
+    ScomEntry_t * l_pScom = NULL;
+    uint32_t l_entryLimit = 0;
+    uint32_t l_entry = 0;
+    uint32_t l_temp = 0;
+
+    MY_INF( ">> lookUpScomRestoreEntry" );
+
+    o_pScomDat->iv_slotFound = 0x00;
+    o_pScomDat->iv_entryOffset = 0x00;
+    o_pScomDat->iv_lastEntryOffset = 0x00;
+    o_pScomDat->iv_entryMatchOffset = 0x00;
+    o_pScomDat->iv_matchFound = 0x00;
+    l_pCpmrHdr = ( CpmrHeader_t * ) ( (uint8_t *) i_pImage + CPMR_HOMER_OFFSET );
+    l_pScomByte = ( uint8_t * )( (uint8_t *) i_pImage + o_pScomDat->iv_subRegionBaseOffset );
+    l_pScom = (ScomEntry_t *)( l_pScomByte );
+
+    switch( i_sectn )
+    {
+        case PROC_STOP_SECTION_CORE:
+            l_entryLimit = SWIZZLE_4_BYTE(l_pCpmrHdr->iv_maxCoreL2ScomEntry);
+            break;
+
+        case PROC_STOP_SECTION_L3:
+            l_entryLimit = SWIZZLE_4_BYTE(l_pCpmrHdr->iv_maxEqL3ScomEntry);
+            break;
+
+        default:
+            l_rc = STOP_SAVE_SCOM_INVALID_SECTION;
+            break;
+    }
+
+    if( l_rc )
+    {
+        return l_rc;
+    }
+
+    for( l_entry = 0; l_entry < l_entryLimit; l_entry++ )
+    {
+        if( !( l_pScom->iv_scomAddress & SWIZZLE_4_BYTE(SCOM_ENTRY_VALID) ) )
+        {
+            o_pScomDat->iv_slotFound = 0x01;
+            o_pScomDat->iv_entryOffset = l_entry;
+            break;
+        }
+
+        l_pScom++;
+    }
+
+    l_pScom = (ScomEntry_t *)( l_pScomByte );
+
+    for( l_entry = 0; l_entry < l_entryLimit; l_entry++ )
+    {
+        if( l_pScom->iv_scomAddress & SWIZZLE_4_BYTE(LAST_SCOM_ENTRY) )
+        {
+            o_pScomDat->iv_lastEntryOffset = l_entry;
+            MY_INF( "SCOM Restore Entry Limit 0x%08x",
+                    o_pScomDat->iv_lastEntryOffset );
+            break;
+        }
+        l_pScom++;
+    }
+
+    l_pScom = (ScomEntry_t *)( l_pScomByte );
+
+    for( l_entry = 0;
l_entry < l_entryLimit; l_entry++ ) + { + l_temp = l_pScom->iv_scomAddress & SWIZZLE_4_BYTE(SCOM_ADDR_MASK); + + if( SWIZZLE_4_BYTE((i_scomAddress & SCOM_ADDR_MASK)) == l_temp ) + { + o_pScomDat->iv_entryMatchOffset = l_entry; + o_pScomDat->iv_matchFound = 0x01; + MY_INF( "Existing Entry Slot No 0x%08x", l_entry ); + break; + } + l_pScom++; + } + + o_pScomDat->iv_entryLimit = l_entryLimit; + + MY_INF( "<< lookUpScomRestoreEntry" ); + return l_rc; +} + +//----------------------------------------------------------------------------- + +#define UNUSED(x) (void)(x) + +/** + * @brief edits a SCOM restore entry associated with the given core. + * @param[in] i_pScom points to SCOM restore location + * @param[in] i_scomAddr SCOM address of register. + * @param[in] i_scomData data associated with SCOM register. + * @param[in] i_operation operation to be performed on SCOM entry. + * @param[in] i_pScomDat points to meta data associated with entry analysis + * @return STOP_SAVE_SUCCESS if existing entry is updated, STOP_SAVE_FAIL + * otherwise. + */ +STATIC StopReturnCode_t editScomEntry( uint8_t * i_pScom, uint32_t i_scomAddr, + uint64_t i_scomData, ScomOperation_t i_operation, + ScomEntryDat_t * i_pScomDat ) +{ + StopReturnCode_t l_rc = STOP_SAVE_SUCCESS; + ScomEntry_t * l_pScom = (ScomEntry_t *)i_pScom; + UNUSED(i_scomAddr); + + MY_INF( ">> editScomEntry " ); + + l_pScom = l_pScom + i_pScomDat->iv_entryMatchOffset; + + switch( i_operation ) + { + case PROC_STOP_SCOM_OR: + case PROC_STOP_SCOM_OR_APPEND: + l_pScom->iv_scomData |= SWIZZLE_8_BYTE(i_scomData); + break; + + case PROC_STOP_SCOM_AND: + case PROC_STOP_SCOM_AND_APPEND: + l_pScom->iv_scomData &= SWIZZLE_8_BYTE(i_scomData); + break; + + case PROC_STOP_SCOM_REPLACE: + l_pScom->iv_scomData = SWIZZLE_8_BYTE(i_scomData); + break; + + default: + break; + } + + MY_INF( "<< editScomEntry " ); + return l_rc; +} + +//----------------------------------------------------------------------------- + +/** + * @brief update SCOM restore entry list associated with the given core. + * @param[in] i_pImage points to base of HOMER image. + * @param[in] i_scomAddr address of SCOM register. + * @param[in] i_scomData data associated with SCOM register. + * @param[in] i_sectn SCOM restore section in HOMER. + * @param[in] i_operation operation type requested on restore entry. + * @param[in] i_pScomDat points entry analysis meta data. + * @return STOP_SAVE_SUCCESS if new entry is added, STOP_SAVE_FAIL otherwise. 
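+ * @note Append semantics, as implemented below: the entry currently flagged
+ * LAST_SCOM_ENTRY has that flag cleared, and the new entry is written in the
+ * following slot marked SCOM_ENTRY_VALID | LAST_SCOM_ENTRY | SCOM_ENTRY_VER,
+ * provided iv_lastEntryOffset is still below the section's entry limit.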
+ */
+STATIC StopReturnCode_t updateScomEntry( void * i_pImage, uint32_t i_scomAddr,
+                                         uint64_t i_scomData, const ScomSection_t i_sectn,
+                                         ScomOperation_t i_operation, ScomEntryDat_t * i_pScomDat )
+{
+    StopReturnCode_t l_rc = STOP_SAVE_SUCCESS;
+    CpmrHeader_t * l_pCpmrHdr = NULL;
+    ScomEntry_t * l_pScom = NULL;
+    uint32_t l_maxScomEntry = 0;
+    l_pCpmrHdr = ( CpmrHeader_t * ) ( (uint8_t *) i_pImage + CPMR_HOMER_OFFSET );
+    l_pScom = ( ScomEntry_t * )( (uint8_t *) i_pImage + i_pScomDat->iv_subRegionBaseOffset );
+    switch( i_operation )
+    {
+        case PROC_STOP_SCOM_OR_APPEND:
+        case PROC_STOP_SCOM_AND_APPEND:
+        case PROC_STOP_SCOM_APPEND:
+        case PROC_STOP_SCOM_REPLACE:
+
+            l_pScom = l_pScom + i_pScomDat->iv_lastEntryOffset;
+
+            if( i_pScomDat->iv_entryLimit > i_pScomDat->iv_lastEntryOffset )
+            {
+                l_pScom->iv_scomAddress &= ~(SWIZZLE_LAST_SCOM_ENTRY);
+                l_pScom++; // takes us to offset stored in iv_entryOffset
+                l_pScom->iv_scomAddress = i_scomAddr & SCOM_ADDR_MASK;
+                l_pScom->iv_scomAddress |= (SCOM_ENTRY_VALID | LAST_SCOM_ENTRY | SCOM_ENTRY_VER);
+
+                if( PROC_STOP_SECTION_CORE == i_sectn )
+                {
+                    l_maxScomEntry = SWIZZLE_4_BYTE(l_pCpmrHdr->iv_maxCoreL2ScomEntry);
+                    l_pScom->iv_scomAddress |= CORE_SECTION_ID_CODE;
+                }
+                else
+                {
+                    l_maxScomEntry = SWIZZLE_4_BYTE(l_pCpmrHdr->iv_maxEqL3ScomEntry);
+                    l_pScom->iv_scomAddress |= L3_SECTION_ID_CODE;
+                }
+
+                l_pScom->iv_scomAddress |= ( l_maxScomEntry << MAX_SCOM_ENTRY_POS );
+                l_pScom->iv_scomAddress = SWIZZLE_4_BYTE(l_pScom->iv_scomAddress);
+                l_pScom->iv_scomData = SWIZZLE_8_BYTE(i_scomData);
+
+                MY_INF( "SCOM Data 0x%08x", SWIZZLE_4_BYTE(l_pScom->iv_scomAddress) );
+            }
+            else
+            {
+                MY_ERR( "Current Entry Count 0x%08x More than Max Entry Count 0x%08x",
+                        i_pScomDat->iv_lastEntryOffset, i_pScomDat->iv_entryLimit );
+                l_rc = STOP_SAVE_MAX_ENTRY_REACHED;
+            }
+
+            break;
+        default:
+            break;
+    }
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+/**
+ * @brief searches a self save entry of an SPR in self-save segment.
+ * @param[in] i_sprBitPos bit position associated with SPR in save mask vector.
+ * @param[in] i_pSprSaveStart start location of SPR save segment
+ * @param[in] i_searchLength length of SPR save segment
+ * @param[out] i_pSaveSprLoc start location of save entry for a given SPR.
+ * @return STOP_SAVE_SUCCESS if look up succeeds, error code otherwise.
+ */
+STATIC StopReturnCode_t lookUpSelfSaveSpr( uint32_t i_sprBitPos, uint32_t* i_pSprSaveStart,
+                                           uint32_t i_searchLength, uint32_t** i_pSaveSprLoc )
+{
+    int32_t l_saveWordLength = (int32_t)(i_searchLength >> 2);
+    uint32_t l_oriInst = getOriInstruction( 0, 0, i_sprBitPos );
+    StopReturnCode_t l_rc = STOP_SAVE_FAIL;
+
+    while( l_saveWordLength > 0 )
+    {
+        if( l_oriInst == *i_pSprSaveStart )
+        {
+            *i_pSaveSprLoc = i_pSprSaveStart;
+            l_rc = STOP_SAVE_SUCCESS;
+            break;
+        }
+
+        i_pSprSaveStart++;
+        l_saveWordLength--;
+    }
+
+    return l_rc;
+}
+
+//-----------------------------------------------------------------------------
+
+/**
+ * @brief updates a self save entry of an SPR in self-save segment.
+ * @param[in] i_pSaveReg start of editable location of a SPR save entry.
+ * @param[in] i_sprNum Id of the SPR for which entry needs to be edited.
+ * @return STOP_SAVE_SUCCESS if update succeeds, error code otherwise.
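+ * @note In effect the patched-in pair becomes "mfspr r1, SPR" (or "mfmsr r1"
+ * for the MSR) followed by an absolute branch-and-link into the common
+ * self-save routine at SELF_SAVE_FUNC_ADD, overwriting the skip and nop
+ * words planted by initSelfSaveEntry().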
+ */ +STATIC StopReturnCode_t updateSelfSaveEntry( uint32_t* i_pSaveReg, uint16_t i_sprNum ) +{ + StopReturnCode_t l_rc = STOP_SAVE_SUCCESS; + + do + { + if( !i_pSaveReg ) + { + l_rc = STOP_SAVE_FAIL; + MY_ERR( "Failed to update self save area for SPR 0x%04x", i_sprNum ); + break; + } + + if( PROC_STOP_SPR_MSR == i_sprNum ) + { + *i_pSaveReg = getMfmsrInstruction( 1 ); + } + else + { + *i_pSaveReg = getMfsprInstruction( 1, i_sprNum ); + } + + i_pSaveReg++; + + *i_pSaveReg = getBranchLinkRegInstruction( ); + } + while(0); + + return l_rc; +} + +//----------------------------------------------------------------------------- + +StopReturnCode_t proc_stop_init_cpureg( void* const i_pImage, const uint32_t i_corePos ) +{ + + StopReturnCode_t l_rc = STOP_SAVE_SUCCESS; + uint32_t* l_pRestoreStart = NULL; + void* l_pTempLoc = NULL; + Homerlayout_t* l_pHomer = NULL; + SmfSprRestoreRegion_t * l_pSprRest = NULL; + uint32_t l_threadPos = 0; + uint32_t l_lookUpKey = 0; + uint32_t l_sprIndex = 0; + + MY_INF( ">> proc_stop_init_cpureg" ); + + do + { + if( !i_pImage ) + { + l_rc = STOP_SAVE_ARG_INVALID_IMG; + break; + } + + if( i_corePos > MAX_CORE_ID_SUPPORTED ) + { + l_rc = STOP_SAVE_ARG_INVALID_CORE; + break; + } + + l_pHomer = ( Homerlayout_t * ) i_pImage; + + for( l_sprIndex = 0; l_sprIndex < MAX_SPR_SUPPORTED_P10; l_sprIndex++ ) + { + //Check if a given SPR needs to be self-saved each time on STOP entry + + l_lookUpKey = genKeyForSprLookup( ( CpuReg_t )g_sprRegister_p10[l_sprIndex].iv_sprId ); + l_pSprRest = + ( SmfSprRestoreRegion_t * ) &l_pHomer->iv_cpmrRegion.iv_selfRestoreRegion.iv_coreSelfRestore[0]; + + l_pSprRest += i_corePos; + + if( g_sprRegister_p10[l_sprIndex].iv_isThreadScope ) + { + for( l_threadPos = 0; l_threadPos < MAX_THREADS_PER_CORE; l_threadPos++ ) + { + l_pRestoreStart = + (uint32_t*)&l_pSprRest->iv_threadRestoreArea[l_threadPos][0]; + + + l_rc = lookUpSprInImage( (uint32_t*)l_pRestoreStart, l_lookUpKey, + g_sprRegister_p10[l_sprIndex].iv_isThreadScope, + &l_pTempLoc ); + + if( l_rc ) + { + MY_ERR( "Thread SPR lookup failed in proc_stop_init_cpureg SPR %d Core %d Thread %d Index %d", + g_sprRegister_p10[l_sprIndex].iv_sprId, i_corePos, l_threadPos, l_sprIndex ); + break; + } + + l_rc = updateSprEntryInImage( (uint32_t*) l_pTempLoc, + ( CpuReg_t )g_sprRegister_p10[l_sprIndex].iv_sprId, + 0x00, + INIT_SPR_REGION ); + + if( l_rc ) + { + MY_ERR( "Thread SPR region init failed. Core %d SPR Id %d", + i_corePos, g_sprRegister_p10[l_sprIndex].iv_sprId ); + break; + } + + }//end for thread + + if( l_rc ) + { + break; + } + + }//end if SPR threadscope + else + { + l_pRestoreStart = (uint32_t*)&l_pSprRest->iv_coreRestoreArea[0]; + + l_rc = lookUpSprInImage( (uint32_t*)l_pRestoreStart, l_lookUpKey, + g_sprRegister_p10[l_sprIndex].iv_isThreadScope, &l_pTempLoc ); + + if( l_rc ) + { + MY_ERR( "Core SPR lookup failed in proc_stop_init_cpureg" ); + break; + } + + l_rc = updateSprEntryInImage( (uint32_t*) l_pTempLoc, + ( CpuReg_t )g_sprRegister_p10[l_sprIndex].iv_sprId, + 0x00, + INIT_SPR_REGION ); + + if( l_rc ) + { + MY_ERR( "Core SPR region init failed. 
Core %d SPR Id %d SPR Index %d", + i_corePos, g_sprRegister_p10[l_sprIndex].iv_sprId, l_sprIndex ); + break; + } + + }// end else + + }// end for l_sprIndex + + } + while(0); + + MY_INF( "<< proc_stop_init_cpureg" ); + + return l_rc; +} + +//----------------------------------------------------------------------------------------------------- + +StopReturnCode_t proc_stop_save_scom( void* const i_pImage, + const uint32_t i_scomAddress, + const uint64_t i_scomData, + const ScomOperation_t i_operation, + const ScomSection_t i_section ) +{ + StopReturnCode_t l_rc = STOP_SAVE_SUCCESS; + uint32_t l_quadId = 0; + uint32_t l_coreId = 0; + uint32_t l_coreRegion = 0; + uint8_t * l_pScom = NULL; + ScomEntryDat_t l_entryDat; + + MY_INF( ">> proc_stop_save_scom" ); + + do + { + l_quadId = i_scomAddress >> 24; + l_quadId = l_quadId & 0x3F; + + l_rc = validateScomImageInputs( i_pImage, i_scomAddress, + l_quadId, i_operation, i_section ); + if( l_rc ) + { + MY_ERR( "invalid argument: aborting"); + break; + } + + l_rc = decodeScomAddress( i_scomAddress, &l_coreRegion, &l_coreId ); + + if( l_rc ) + { + MY_ERR( "Failed To get Core Details For Address 0x%08x", i_scomAddress ); + break; + } + + //Converting Superchiplet Id to instance number + l_quadId = l_quadId - MIN_SUPERCHIPLET_ID; + + //getting core position relative to the chip + l_coreId += ( l_quadId << 2 ); + + MY_INF( "Quad Id 0x%08x COre Id 0x%08x", l_quadId, l_coreId ); + + // Let us find the start address of SCOM area + + l_rc = lookUpScomRestoreRegion( i_pImage, + i_section, + l_coreId, + &l_entryDat ); + if( l_rc ) + { + MY_ERR( "Failed To Find SCOM Section Requested 0x%08x", + ( uint32_t) i_section ); + break; + } + + l_pScom = (uint8_t *)( (uint8_t *)i_pImage + l_entryDat.iv_subRegionBaseOffset ); + + l_rc = lookUpScomRestoreEntry( i_pImage, + i_section, + i_scomAddress, + &l_entryDat ); + if( l_rc ) + { + MY_ERR( "Failed To Find SCOM Entry Slot 0x%08x", (uint32_t) l_rc ); + break; + } + + switch( i_operation ) + { + case PROC_STOP_SCOM_APPEND: + l_rc = updateScomEntry( i_pImage, + i_scomAddress, + i_scomData, + i_section, + i_operation, + &l_entryDat ); + break; + + case PROC_STOP_SCOM_OR: + case PROC_STOP_SCOM_AND: + //case PROC_STOP_SCOM_NOOP: + + if( l_entryDat.iv_matchFound ) + { + l_rc = editScomEntry( l_pScom, + i_scomAddress, + i_scomData, + i_operation, + &l_entryDat ); + } + + break; + + case PROC_STOP_SCOM_RESET: + + l_rc = lookUpScomRestoreRegion( i_pImage, + PROC_STOP_SECTION_CORE, + l_coreId, + &l_entryDat ); + if( l_rc ) + { + MY_ERR( "Failed To Reset SCOM Section Requested 0x%08x", + ( uint32_t) i_section ); + break; + } + + memset( (uint8_t *)((uint8_t *)i_pImage + l_entryDat.iv_subRegionBaseOffset), + 0x00, l_entryDat.iv_subRegionLength ); + + l_rc = lookUpScomRestoreRegion( i_pImage, + PROC_STOP_SECTION_CACHE, + l_coreId, + &l_entryDat ); + if( l_rc ) + { + MY_ERR( "Failed To Reset SCOM Section Requested 0x%08x", + ( uint32_t) i_section ); + break; + } + + memset( (uint8_t *)((uint8_t *)i_pImage + l_entryDat.iv_subRegionBaseOffset), + 0x00, l_entryDat.iv_subRegionLength ); + + break; + + case PROC_STOP_SCOM_OR_APPEND: + case PROC_STOP_SCOM_AND_APPEND: + case PROC_STOP_SCOM_REPLACE: + + if( l_entryDat.iv_matchFound ) + { + l_rc = editScomEntry( l_pScom, + i_scomAddress, + i_scomData, + i_operation, + &l_entryDat ); + } + else + { + l_rc = updateScomEntry( i_pImage, + i_scomAddress, + i_scomData, + i_section, + i_operation, + &l_entryDat ); + } + + break; + + default: + l_rc = STOP_SAVE_SCOM_INVALID_OPERATION; + break; + } + } 
+ while(0); + + if( l_rc ) + { + MY_ERR("SCOM image operation 0x%08x failed for chiplet 0x%08x addr" + "0x%08x", i_operation, l_quadId , + i_scomAddress ); + } + else + { + + } + + MY_INF( "<< proc_stop_save_scom" ); + + return l_rc; +} + +//----------------------------------------------------------------------------------------------------- + +StopReturnCode_t proc_stop_save_cpureg_control( void* i_pImage, + const uint64_t i_pir, + const uint32_t i_saveRegVector ) +{ + StopReturnCode_t l_rc = STOP_SAVE_SUCCESS; + uint32_t l_coreId = 0; + uint32_t l_threadId = 0; + uint32_t l_sprPos = 0; + uint32_t l_sprIndex = 0; + uint32_t l_lookupLength = 0; + uint32_t l_lookUpKey = 0; + uint32_t* l_pSaveStart = NULL; + uint32_t* l_pRestoreStart = NULL; + uint32_t* l_pSprSave = NULL; + void* l_pTempLoc = NULL; + uint32_t * l_pTempWord = NULL; + Homerlayout_t* l_pHomer = NULL; + SmfSprRestoreRegion_t * l_pSpr = NULL; + MY_INF(">> proc_stop_save_cpureg_control" ); + + do + { + l_rc = getCoreAndThread_p10( i_pImage, i_pir, &l_coreId, &l_threadId ); + + if( l_rc ) + { + MY_ERR( "Error in getting core no 0x%08x and thread no 0x%08x from PIR 0x%016lx", + l_coreId, l_threadId, i_pir ); + break; + } + + l_rc = validateArgumentSaveRegMask( i_pImage, l_coreId, l_threadId, i_saveRegVector ); + + if( l_rc ) + { + MY_ERR( "Invalid argument rc 0x%08x", (uint32_t) l_rc ); + break; + } + + l_pHomer = ( Homerlayout_t * )i_pImage; + l_pSpr = ( SmfSprRestoreRegion_t *) &l_pHomer->iv_cpmrRegion.iv_selfRestoreRegion.iv_coreSelfRestore[0]; + l_pSpr += l_coreId; + + for( l_sprIndex = 0; l_sprIndex < MAX_SPR_SUPPORTED_P10; l_sprIndex++ ) + { + l_sprPos = g_sprRegister_p10[l_sprIndex].iv_saveMaskPos; + + if( l_sprPos > MAX_SPR_BIT_POS ) + { + continue; + } + + //Check if a given SPR needs to be self-saved each time on STOP entry + + if( i_saveRegVector & ( TEST_BIT_PATTERN >> l_sprPos ) ) + { + + if( g_sprRegister_p10[l_sprIndex].iv_isThreadScope ) + { + l_lookupLength = SMF_SELF_SAVE_THREAD_AREA_SIZE; + l_pSaveStart = + (uint32_t*)&l_pSpr->iv_threadSaveArea[l_threadId][0]; + l_pRestoreStart = + (uint32_t*)&l_pSpr->iv_threadRestoreArea[l_threadId][0]; + } + else + { + l_lookupLength = SMF_CORE_SAVE_CORE_AREA_SIZE; + l_pSaveStart = (uint32_t*)&l_pSpr->iv_coreSaveArea[0]; + l_pRestoreStart = (uint32_t*)&l_pSpr->iv_coreRestoreArea[0]; + } + + // an SPR restore section for given core already exists + l_lookUpKey = genKeyForSprLookup( ( CpuReg_t )g_sprRegister_p10[l_sprIndex].iv_sprId ); + + l_rc = lookUpSprInImage( (uint32_t*)l_pRestoreStart, l_lookUpKey, + g_sprRegister_p10[l_sprIndex].iv_isThreadScope, &l_pTempLoc ); + + if( l_rc ) + { + //SPR specified in the save mask but there is no restore entry present in the memory + //Self-Save instruction will edit it during STOP entry to make it a valid entry + + l_rc = proc_stop_save_cpureg( i_pImage, + (CpuReg_t)g_sprRegister_p10[l_sprIndex].iv_sprId, + 0x00, //creates a dummy entry + i_pir ); + } + + //Find if SPR-Save eye catcher exist in self-save segment of SPR restore region. 
+ l_rc = lookUpSelfSaveSpr( l_sprPos, l_pSaveStart, l_lookupLength, &l_pSprSave ); + + if( l_rc ) + { + MY_INF( "Failed to find SPR No %02d save entry", l_sprPos ); + l_rc = STOP_SAVE_SPR_ENTRY_MISSING; + break; + } + + l_pSprSave++; //point to next instruction location + + //update specific instructions of self save region to enable saving for SPR + l_rc = updateSelfSaveEntry( l_pSprSave, g_sprRegister_p10[l_sprIndex].iv_sprId ); + + if( l_rc ) + { + MY_ERR( "Failed to update self save instructions for 0x%08x", + (uint32_t) g_sprRegister_p10[l_sprIndex].iv_sprId ); + } + + if( l_pTempLoc ) + { + l_pTempWord = (uint32_t *)l_pTempLoc; + l_pTempWord++; + *l_pTempWord = getXorInstruction( 0, 0, 0 ); + } + + }// end if( i_saveRegVector..) + }// end for + } + while(0); + + MY_INF("<< proc_stop_save_cpureg_control" ); + + return l_rc; + +} + +//----------------------------------------------------------------------------------------------------- + +StopReturnCode_t proc_stop_save_cpureg( void* const i_pImage, + const CpuReg_t i_regId, + const uint64_t i_regData, + const uint64_t i_pir ) +{ + StopReturnCode_t l_rc = STOP_SAVE_SUCCESS; // procedure return code + SmfSprRestoreRegion_t* l_sprRegion = NULL; + Homerlayout_t* l_pHomer = NULL; + + MY_INF(">> proc_stop_save_cpureg" ); + + do + { + uint32_t threadId = 0; + uint32_t coreId = 0; + uint32_t lookUpKey = 0; + void* pSprEntryLocation = NULL; // an offset w.r.t. to start of image + void* pThreadLocation = NULL; + bool threadScopeReg = false; + + l_rc = getCoreAndThread_p10( i_pImage, i_pir, &coreId, &threadId ); + + if( l_rc ) + { + MY_ERR("Failed to determine Core Id and Thread Id from PIR 0x%016lx", + i_pir); + break; + } + + MY_INF( " PIR 0x%016lx coreId %d threadid %d " + " registerId %d", i_pir, coreId, + threadId, i_regId ); + + // First of all let us validate all input arguments. + l_rc = validateSprImageInputs( i_pImage, + i_regId, + coreId, + &threadId, + &threadScopeReg ); + if( l_rc ) + { + // Error: bad argument traces out error code + MY_ERR("Bad input argument rc %d", l_rc ); + + break; + } + + + l_pHomer = ( Homerlayout_t *) i_pImage; + l_sprRegion = ( SmfSprRestoreRegion_t* )&l_pHomer->iv_cpmrRegion.iv_selfRestoreRegion.iv_coreSelfRestore[0]; + l_sprRegion += coreId; + + if( threadScopeReg ) + { + pThreadLocation = (uint32_t *)&l_sprRegion->iv_threadRestoreArea[threadId][0]; + } + else + { + pThreadLocation = (uint32_t *)&l_sprRegion->iv_coreRestoreArea[0]; + } + + if( ( SWIZZLE_4_BYTE(BLR_INST) == *(uint32_t*)pThreadLocation ) || + ( SWIZZLE_4_BYTE(ATTN_OPCODE) == *(uint32_t*) pThreadLocation ) ) + { + // table for given core id doesn't exit. It needs to be + // defined. + pSprEntryLocation = pThreadLocation; + } + else + { + // an SPR restore section for given core already exists + lookUpKey = genKeyForSprLookup( i_regId ); + l_rc = lookUpSprInImage( (uint32_t*)pThreadLocation, + lookUpKey, + threadScopeReg, + &pSprEntryLocation ); + } + + if( l_rc ) + { + MY_ERR("Invalid or corrupt SPR entry. 
CoreId 0x%08x threadId " + "0x%08x regId 0x%08x lookUpKey 0x%08x " + , coreId, threadId, i_regId, lookUpKey ); + break; + } + + l_rc = updateSprEntryInImage( (uint32_t*) pSprEntryLocation, + i_regId, + i_regData, + UPDATE_SPR_ENTRY ); + + if( l_rc ) + { + MY_ERR( " Failed to update the SPR entry of PIR 0x%016lx reg" + "0x%08x", (uint64_t)i_pir, i_regId ); + break; + } + + } + while(0); + + MY_INF("<< proc_stop_save_cpureg" ); + + return l_rc; +} + +//----------------------------------------------------------------------------------------------------- + +StopReturnCode_t proc_stop_init_self_save( void* const i_pImage, const uint32_t i_corePos ) +{ + + SmfSprRestoreRegion_t * l_pSelfSave = NULL; + StopReturnCode_t l_rc = STOP_SAVE_SUCCESS; + uint32_t* l_pSaveStart = NULL; + Homerlayout_t * l_pHomer = NULL; + uint32_t l_threadPos = 0; + uint32_t l_sprBitPos = 0; + uint32_t l_sprIndexAdj = 0; + + MY_INF(">> proc_stop_init_self_save" ); + + do + { + if( !i_pImage ) + { + l_rc = STOP_SAVE_ARG_INVALID_IMG; + break; + } + + if( i_corePos > MAX_CORE_ID_SUPPORTED ) + { + l_rc = STOP_SAVE_ARG_INVALID_CORE; + break; + } + + l_pHomer = ( Homerlayout_t* ) i_pImage; + l_pSelfSave = + ( SmfSprRestoreRegion_t *) &l_pHomer->iv_cpmrRegion.iv_selfRestoreRegion.iv_coreSelfRestore[0]; + + l_pSelfSave += i_corePos; + + for( l_threadPos = 0; l_threadPos < MAX_THREADS_PER_CORE; l_threadPos++ ) + { + l_pSaveStart = + (uint32_t*)&l_pSelfSave->iv_threadSaveArea[l_threadPos][0]; + + //Adding instruction 'mflr r30' + *l_pSaveStart = SWIZZLE_4_BYTE(MFLR_R30); + l_pSaveStart++; + + for( l_sprBitPos = 0; l_sprBitPos <= MAX_SPR_BIT_POS; l_sprBitPos++ ) + { + l_rc = getSprRegIndexAdjustment( l_sprBitPos, &l_sprIndexAdj ); + + if( STOP_SAVE_SPR_BIT_POS_RESERVE == l_rc ) + { + //Failed to find SPR index adjustment + continue; + } + + if( !g_sprRegister_p10[l_sprBitPos - l_sprIndexAdj].iv_isThreadScope ) + { + continue; + } + + //Initialize self save region with SPR save entry for each thread + //level SPR + l_rc = initSelfSaveEntry( l_pSaveStart, + g_sprRegister_p10[l_sprBitPos - l_sprIndexAdj].iv_saveMaskPos ); + + if( l_rc ) + { + MY_ERR( "Failed to init thread self-save region for core %d thread %d", + i_corePos, l_threadPos ); + break; + } + + l_pSaveStart++; + l_pSaveStart++; + l_pSaveStart++; + } + + }// for thread = 0; + + if( l_rc ) + { + //breakout if saw an error while init of thread SPR region + break; + } + + l_pSaveStart = + (uint32_t*)&l_pSelfSave->iv_coreSaveArea[0]; + + *l_pSaveStart = SWIZZLE_4_BYTE(MFLR_R30); + l_pSaveStart++; + + for( l_sprBitPos = 0; l_sprBitPos <= MAX_SPR_BIT_POS; l_sprBitPos++ ) + { + l_rc = getSprRegIndexAdjustment( l_sprBitPos, &l_sprIndexAdj ); + + if( STOP_SAVE_SPR_BIT_POS_RESERVE == l_rc ) + { + //Failed to find SPR index adjustment + continue; + } + + if( g_sprRegister_p10[l_sprBitPos - l_sprIndexAdj].iv_isThreadScope ) + { + continue; + } + + //Initialize self save region with SPR save entry for each core + //level SPR + l_rc = initSelfSaveEntry( l_pSaveStart, + g_sprRegister_p10[l_sprBitPos - l_sprIndexAdj].iv_saveMaskPos ); + + if( l_rc ) + { + MY_ERR( "Failed to init core self-save region for core %d thread %d", + i_corePos, l_threadPos ); + break; + } + + l_pSaveStart++; + l_pSaveStart++; + l_pSaveStart++; + } + } + while(0); + + MY_INF("<< proc_stop_init_self_save" ); + + return l_rc; +} + +//----------------------------------------------------------------------------------------------------- +#ifdef __cplusplus +} //namespace stopImageSection ends +} //extern "C" +#endif 
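Before the header diff below, a minimal caller-side sketch of how these entry points fit together during HOMER build (illustrative only — the wrapper function, homer_base, core position, PIR and lpcr_val are placeholders, not part of the patch):

#include "p10_stop_api.H"

/* Sketch: set up one core's self-restore area, then register LPCR both
 * for a fixed restore value and for self-save on each STOP entry. */
static StopReturnCode_t setup_core_self_restore(void *homer_base,
                                                uint32_t core_pos,
                                                uint64_t pir,
                                                uint64_t lpcr_val)
{
    StopReturnCode_t rc;

    /* lay out the restore (BLR/ATTN) and save-area templates for the core */
    rc = proc_stop_init_cpureg(homer_base, core_pos);
    if (rc)
        return rc;

    rc = proc_stop_init_self_save(homer_base, core_pos);
    if (rc)
        return rc;

    /* ask the STOP engine to restore LPCR to a fixed value on wakeup... */
    rc = proc_stop_save_cpureg(homer_base, PROC_STOP_SPR_LPCR, lpcr_val, pir);
    if (rc)
        return rc;

    /* ...and/or have it self-saved on each STOP entry (bit 0 is the MSB,
     * matching the TEST_BIT_PATTERN convention used by the API) */
    return proc_stop_save_cpureg_control(homer_base, pir,
                                         0x80000000u >> BIT_POS_LPCR);
}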
diff --git a/libpore/p10_stop_api.H b/libpore/p10_stop_api.H
new file mode 100644
index 000000000..a70d2b281
--- /dev/null
+++ b/libpore/p10_stop_api.H
@@ -0,0 +1,238 @@
+/* IBM_PROLOG_BEGIN_TAG */
+/* This is an automatically generated prolog. */
+/* */
+/* $Source: src/import/chips/p10/procedures/utils/stopreg/p10_stop_api.C $ */
+/* */
+/* OpenPOWER HostBoot Project */
+/* */
+/* Contributors Listed Below - COPYRIGHT 2015,2021 */
+/* [+] International Business Machines Corp. */
+/* */
+/* */
+/* Licensed under the Apache License, Version 2.0 (the "License"); */
+/* you may not use this file except in compliance with the License. */
+/* You may obtain a copy of the License at */
+/* */
+/* http://www.apache.org/licenses/LICENSE-2.0 */
+/* */
+/* Unless required by applicable law or agreed to in writing, software */
+/* distributed under the License is distributed on an "AS IS" BASIS, */
+/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or */
+/* implied. See the License for the specific language governing */
+/* permissions and limitations under the License. */
+/* */
+/* IBM_PROLOG_END_TAG */
+#ifndef __P10_STOP_IMAGE_API_
+#define __P10_STOP_IMAGE_API_
+
+#include
+
+#ifdef __SKIBOOT__
+ #include
+#endif
+
+///
+/// @file p10_stop_api.H
+/// @brief describes STOP API which creates/manipulates STOP image.
+///
+// *HWP HW Owner : Greg Still
+// *HWP FW Owner : Prem Shanker Jha
+// *HWP Team : PM
+// *HWP Level : 2
+// *HWP Consumed by : HB:HYP
+
+#ifdef __cplusplus
+namespace stopImageSection
+{
+#endif
+
+/**
+ * @brief all SPRs and MSR for which register restore is to be supported.
+ * @note STOP API design has built in support to accommodate 8 registers
+ * each of core and thread scope.
+ */
+typedef enum
+{
+    PROC_STOP_SPR_DAWR = 180, // thread register
+    PROC_STOP_SPR_CIABR = 187, // thread register
+    PROC_STOP_SPR_DAWRX = 188, // thread register
+    PROC_STOP_SPR_HSPRG0 = 304, // thread register
+    PROC_STOP_SPR_HRMOR = 313, // core register
+    PROC_STOP_SPR_LPCR = 318, // thread register
+    PROC_STOP_SPR_HMEER = 337, // core register
+    PROC_STOP_SPR_PTCR = 464, // core register
+    PROC_STOP_SPR_USPRG0 = 496, // thread register
+    PROC_STOP_SPR_USPRG1 = 497, // thread register
+    PROC_STOP_SPR_URMOR = 505, // core register
+    PROC_STOP_SPR_SMFCTRL = 511, // thread register
+    PROC_STOP_SPR_LDBAR = 850, // thread register
+    PROC_STOP_SPR_PSSCR = 855, // thread register
+    PROC_STOP_SPR_PMCR = 884, // core register
+    PROC_STOP_SPR_HID = 1008, // core register
+    PROC_STOP_SPR_MSR = 2000, // thread register
+
+} CpuReg_t;
+
+/**
+ * @brief lists all the error codes.
+ */
+typedef enum
+{
+    STOP_SAVE_SUCCESS = 0,
+    STOP_SAVE_ARG_INVALID_IMG = 1,
+    STOP_SAVE_ARG_INVALID_REG = 2,
+    STOP_SAVE_ARG_INVALID_THREAD = 3,
+    STOP_SAVE_ARG_INVALID_MODE = 4,
+    STOP_SAVE_ARG_INVALID_CORE = 5,
+    STOP_SAVE_SPR_ENTRY_NOT_FOUND = 6,
+    STOP_SAVE_SPR_ENTRY_UPDATE_FAILED = 7,
+    STOP_SAVE_SCOM_INVALID_OPERATION = 8,
+    STOP_SAVE_SCOM_INVALID_SECTION = 9,
+    STOP_SAVE_SCOM_INVALID_ADDRESS = 10,
+    STOP_SAVE_SCOM_INVALID_CHIPLET = 11,
+    STOP_SAVE_SCOM_ENTRY_UPDATE_FAILED = 12,
+    STOP_SAVE_INVALID_FUSED_CORE_STATUS = 13,
+    STOP_SAVE_FAIL = 14, // for internal failure within firmware.
+    STOP_SAVE_SPR_ENTRY_MISSING = 15,
+    STOP_SAVE_MAX_ENTRY_REACHED = 16,
+    STOP_SAVE_SPR_BIT_POS_RESERVE = 17,
+} StopReturnCode_t;
+
+/**
+ * @brief summarizes all operations supported on scom entries of STOP image.
+ */
+typedef enum
+{
+    //enum members which are project agnostic
+    PROC_STOP_SCOM_OP_MIN = 0,
+    PROC_STOP_SCOM_APPEND = 1,
+    PROC_STOP_SCOM_REPLACE = 2,
+    PROC_STOP_SCOM_OR = 3,
+    PROC_STOP_SCOM_AND = 4,
+    PROC_STOP_SCOM_NOOP = 5,
+    PROC_STOP_SCOM_RESET = 6,
+    PROC_STOP_SCOM_OR_APPEND = 7,
+    PROC_STOP_SCOM_AND_APPEND = 8,
+    PROC_STOP_SCOM_OP_MAX = 9,
+
+} ScomOperation_t;
+
+/**
+ * @brief All subsections that contain scom entries in a STOP image.
+ */
+typedef enum
+{
+    PROC_STOP_SECTION_CORE = 1,
+    PROC_STOP_SECTION_L2 = 1,
+    PROC_STOP_SECTION_L3 = 2,
+    PROC_STOP_SECTION_CACHE = 2,
+} ScomSection_t;
+
+/**
+ * @brief versions relevant to STOP API.
+ */
+typedef enum
+{
+    STOP_API_VER = 0x00,
+    STOP_API_VER_CONTROL = 0x02,
+} VersionList_t;
+
+/**
+ * @brief Summarizes bit position allocated to SPRs in save bit mask vector.
+ */
+typedef enum
+{
+    BIT_POS_CIABR = 0,
+    BIT_POS_DAWR = 1,
+    BIT_POS_DAWRX = 2,
+    BIT_POS_HSPRG0 = 3,
+    BIT_POS_LDBAR = 4,
+    BIT_POS_LPCR = 5,
+    BIT_POS_PSSCR = 6,
+    BIT_POS_MSR = 7,
+    BIT_POS_HID = 21,
+    BIT_POS_HMEER = 22,
+    BIT_POS_PMCR = 23,
+    BIT_POS_PTCR = 24,
+    BIT_POS_SMFCTRL = 28,
+    BIT_POS_USPRG0 = 29,
+    BIT_POS_USPRG1 = 30,
+} SprBitPositionList_t;
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/**
+ * @brief creates SCOM restore entry for a given scom address in HOMER.
+ * @param i_pImage points to start address of HOMER image.
+ * @param i_scomAddress address associated with SCOM restore entry.
+ * @param i_scomData data associated with SCOM restore entry.
+ * @param i_operation operation type requested for API.
+ * @param i_section section of HOMER in which restore entry needs to be created.
+ * @return STOP_SAVE_SUCCESS if API succeeds, error code otherwise.
+ * @note It is an API for creating SCOM restore entry in HOMER. It is agnostic to
+ * generation of POWER processor.
+ */
+
+StopReturnCode_t proc_stop_save_scom( void* const i_pImage,
+                                      const uint32_t i_scomAddress,
+                                      const uint64_t i_scomData,
+                                      const ScomOperation_t i_operation,
+                                      const ScomSection_t i_section );
+
+/**
+ * @brief initializes self save restore region of HOMER.
+ * @param[in] i_pImage points to base of HOMER image.
+ * @param[in] i_corePos position of the physical core.
+ * @return STOP_SAVE_SUCCESS if API succeeds, error code otherwise.
+ * @note It is an API for initializing self restore region in HOMER. It is agnostic to
+ * generation of POWER processor.
+ */
+StopReturnCode_t proc_stop_init_cpureg( void* const i_pImage, const uint32_t i_corePos );
+
+/**
+ * @brief enables self save for a given set of SPRs
+ * @param[in] i_pImage points to start address of HOMER image.
+ * @param[in] i_pir PIR value associated with core and thread.
+ * @param[in] i_saveRegVector bit vector representing the SPRs that need to be self-saved.
+ * @return STOP_SAVE_SUCCESS if API succeeds, error code otherwise.
+ * @note It is an API for enabling self save of SPRs and it is agnostic to
+ * generation of POWER processor.
+ */
+StopReturnCode_t proc_stop_save_cpureg_control( void* i_pImage,
+                                                const uint64_t i_pir,
+                                                const uint32_t i_saveRegVector );
+
+/**
+ * @brief creates an SPR restore entry in HOMER
+ * @param[in] i_pImage points to start address of HOMER image.
+ * @param[in] i_regId SPR number to be saved in HOMER
+ * @param[in] i_regData SPR data to be saved in HOMER
+ * @param[in] i_pir PIR value associated with core and thread.
+ * @return STOP_SAVE_SUCCESS if API succeeds, error code otherwise.
+ * @note It is an API for creating an SPR restore entry in HOMER and it is
+ * agnostic to generation of POWER processor.
+ */
+StopReturnCode_t proc_stop_save_cpureg( void* const i_pImage,
+                                        const CpuReg_t i_regId,
+                                        const uint64_t i_regData,
+                                        const uint64_t i_pir );
+
+/**
+ * @brief initializes self-save region with specific instructions.
+ * @param[in] i_pImage points to start address of HOMER image.
+ * @param[in] i_corePos physical core's relative position within processor chip.
+ * @return STOP_SAVE_SUCCESS if self-save is initialized successfully,
+ * error code otherwise.
+ * @note API is project agnostic and is intended only for use case of HOMER build.
+ * There is no explicit effort to support any other use case.
+ */
+StopReturnCode_t proc_stop_init_self_save( void* const i_pImage, const uint32_t i_corePos );
+
+#ifdef __cplusplus
+} // extern "C"
+}; // namespace stopImageSection ends
+#endif //__cplusplus
+
+#endif //__P10_STOP_IMAGE_API_

diff --git a/libpore/p10_stop_data_struct.H b/libpore/p10_stop_data_struct.H
new file mode 100644
index 000000000..3a16fcda9
--- /dev/null
+++ b/libpore/p10_stop_data_struct.H
@@ -0,0 +1,162 @@
+/* IBM_PROLOG_BEGIN_TAG */
+/* This is an automatically generated prolog. */
+/* */
+/* $Source: chips/p10/procedures/utils/stopreg/p10_stop_data_struct.H $ */
+/* */
+/* IBM CONFIDENTIAL */
+/* */
+/* EKB Project */
+/* */
+/* COPYRIGHT 2015,2020 */
+/* [+] International Business Machines Corp. */
+/* */
+/* */
+/* The source code for this program is not published or otherwise */
+/* divested of its trade secrets, irrespective of what has been */
+/* deposited with the U.S. Copyright Office. */
+/* */
+/* IBM_PROLOG_END_TAG */
+
+///
+/// @file p10_stop_data_struct.H
+/// @brief describes data structures internal to STOP API.
+///
+// *HWP HW Owner : Greg Still
+// *HWP FW Owner : Prem Shanker Jha
+// *HWP Team : PM
+// *HWP Level : 2
+// *HWP Consumed by : HB:HYP
+#ifndef __STOP_DATA_STRUCT_
+#define __STOP_DATA_STRUCT_
+
+#include "p10_hcd_memmap_base.H"
+
+#ifdef __SKIBOOT__
+ #include
+#endif
+
+#ifdef __FAPI_2_
+ #include
+#endif
+
+#ifdef PPC_HYP
+
+ #define STATIC
+
+#else
+
+ #define STATIC static
+
+#endif
+
+
+#ifdef __DEBUG_
+ #include
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+namespace stopImageSection
+{
+#endif
+
+/**
+ * @brief Misc constants pertaining to instruction opcodes.
+ */
+enum
+{
+    MAX_SPR_RESTORE_INST = 0x08,
+    SIZE_PER_SPR_RESTORE_INST = ((4 * sizeof(uint8_t)) / sizeof(uint32_t)),
+    MAX_THREAD_LEVEL_SPRS = 11,
+    MAX_CORE_LEVEL_SPRS = 6,
+    MAX_SPR_BIT_POS = 30,
+    SPR_BIT_POS_8 = 8,
+    SPR_BIT_POS_20 = 20,
+    SPR_BIT_POS_25 = 25,
+    SPR_BIT_POS_27 = 27,
+};
+
+/**
+ * @brief various operations supported on SPR restore entry.
+ */
+enum SprEntryUpdateMode
+{
+    INIT_SPR_REGION = 0x01,
+    UPDATE_SPR_ENTRY = 0x02,
+};
+
+/**
+ * @brief models an individual SCOM restore entry.
+ */
+typedef struct
+{
+    uint32_t iv_scomAddress;
+    uint64_t iv_scomData;
+} __attribute__((packed)) ScomEntry_t;
+
+/**
+ * @brief describes details pertaining to SCOM entry
+ */
+typedef struct
+{
+    uint32_t iv_subRegionBaseOffset;
+    uint32_t iv_subRegionLength;
+    uint8_t iv_slotFound;
+    uint8_t iv_lastEntryOffset;
+    uint16_t iv_entryOffset;
+    uint8_t iv_entryMatchOffset;
+    uint8_t iv_matchFound;
+    uint8_t iv_entryLimit;
+    uint8_t iv_reserved;
+} ScomEntryDat_t;
+
+/**
+ * @brief summarizes attributes associated with a SPR register.
+ */
+typedef struct
+{
+    uint32_t iv_sprId;
+    bool iv_isThreadScope;
+    uint32_t iv_saveMaskPos;
+} StopSprReg_t;
+
+/**
+ * @brief Misc constants.
+ */
+enum
+{
+    SIZE_SCOM_ENTRY = sizeof( ScomEntry_t ),
+    SCOM_ENTRY_START = 0xDEADDEAD,
+    BAD_SAVE_MASK = 0x007FF000,
+    MAX_SPR_INDEX = 31,
+    TEST_BIT_PATTERN = 0x80000000,
+    EP_SELECT_MASK = 0x000F0000,
+    CORE_REGION_MASK = 0x0000F000,
+    SCOM_ENTRY_VALID = 0x80000000,
+    LAST_SCOM_ENTRY = 0x40000000,
+    SWIZZLE_LAST_SCOM_ENTRY = 0x00000040,
+    SCOM_ADDR_MASK = 0x0000FFFF,
+    SCOM_ADDR_CHIPLET_MASK = 0x000FFFFF,
+    SCOM_ENTRY_VER = 0x10000000, //Ver 1.0
+    CORE_SECTION_ID_CODE = 0x00000000, //Core Section Id 0
+    L3_SECTION_ID_CODE = 0x03000000, //L3 Section Id 3 b4:b7
+    MAX_SCOM_ENTRY_POS = 0x10,
+    MIN_SUPERCHIPLET_ID = 0x20,
+
+};
+
+#ifdef __DEBUG_
+ #define MY_ERR( _fmt_, _args_...) printf( "\n"); printf( _fmt_, ##_args_)
+ #define MY_INF(_fmt_, _args_...) printf( "\n"); printf( _fmt_, ##_args_)
+#else
+ #define MY_ERR( _fmt_, _args_...)
+ #define MY_INF(_fmt_, _args_...)
+#endif
+
+#ifdef __cplusplus
+} // extern "C"
+
+} //namespace stopImageSection ends
+#endif //__cplusplus
+
+#endif

diff --git a/libpore/p10_stop_util.C b/libpore/p10_stop_util.C
new file mode 100644
index 000000000..ba3ec1535
--- /dev/null
+++ b/libpore/p10_stop_util.C
@@ -0,0 +1,190 @@
+/* IBM_PROLOG_BEGIN_TAG */
+/* This is an automatically generated prolog. */
+/* */
+/* $Source: chips/p10/procedures/utils/stopreg/p10_stop_util.C $ */
+/* */
+/* IBM CONFIDENTIAL */
+/* */
+/* EKB Project */
+/* */
+/* COPYRIGHT 2019 */
+/* [+] International Business Machines Corp. */
+/* */
+/* */
+/* The source code for this program is not published or otherwise */
+/* divested of its trade secrets, irrespective of what has been */
+/* deposited with the U.S. Copyright Office. */
+/* */
+/* IBM_PROLOG_END_TAG */
+
+///
+/// @file p10_stop_util.C
+/// @brief implements some utility functions for STOP API.
+///
+// *HWP HW Owner : Greg Still
+// *HWP FW Owner : Prem Shanker Jha
+// *HWP Team : PM
+// *HWP Level : 2
+// *HWP Consumed by : HB:HYP
+#ifdef PPC_HYP
+ #include
+#endif
+
+#include "p10_stop_api.H"
+#include "p10_stop_util.H"
+#include "p10_stop_data_struct.H"
+#include "p10_hcd_memmap_base.H"
+#include "p10_hcode_image_defines.H"
+#include "stddef.h"
+
+#ifdef __cplusplus
+using namespace hcodeImageBuild;
+namespace stopImageSection
+{
+#endif
+
+//-----------------------------------------------------------------------
+
+/**
+ * @brief Returns proc chip's fuse mode status.
+ * @param i_pImage points to start of chip's HOMER image.
+ * @param o_fusedMode points to fuse mode information.
+ * @return STOP_SAVE_SUCCESS if function succeeds, error code otherwise.
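+ * @note Fused-core mode changes how PIR bits map to core and thread ids,
+ * so getCoreAndThread_p10() below consults this helper before decoding
+ * a PIR.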
+ */ +STATIC StopReturnCode_t isFusedMode( void* const i_pImage, bool* o_fusedMode ) +{ + StopReturnCode_t l_rc = STOP_SAVE_SUCCESS; + uint64_t l_cpmrCheckWord = 0; + uint32_t* l_pMagic = NULL; + CpmrHeader_t* l_pCpmr = NULL; + *o_fusedMode = false; + + do + { + + if( !i_pImage ) + { + MY_ERR( "invalid pointer to HOMER image"); + l_rc = STOP_SAVE_ARG_INVALID_IMG; + break; + } + + l_pMagic = (uint32_t*)( (uint8_t*)i_pImage + CPMR_HOMER_OFFSET + 8 ); + l_cpmrCheckWord = SWIZZLE_4_BYTE( *l_pMagic ); + + if( CPMR_REGION_CHECK_WORD != l_cpmrCheckWord ) + { + MY_ERR("corrupt or invalid HOMER image location 0x%016lx", + l_cpmrCheckWord ); + l_rc = STOP_SAVE_ARG_INVALID_IMG; + break; + } + + l_pCpmr = (CpmrHeader_t*)( (uint8_t*)i_pImage + CPMR_HOMER_OFFSET ); + + if( (uint8_t) FUSED_CORE_MODE == l_pCpmr->iv_fusedMode ) + { + *o_fusedMode = true; + break; + } + + if( (uint8_t) NONFUSED_CORE_MODE == l_pCpmr->iv_fusedMode ) + { + break; + } + + MY_ERR("Unexpected value 0x%08x for fused mode. Bad or corrupt " + "HOMER location", l_pCpmr->iv_fusedMode ); + l_rc = STOP_SAVE_INVALID_FUSED_CORE_STATUS ; + + } + while(0); + + return l_rc; +} + +//---------------------------------------------------------------------- + +StopReturnCode_t getCoreAndThread_p10( void* const i_pImage, const uint64_t i_pir, + uint32_t* o_pCoreId, uint32_t* o_pThreadId ) +{ + StopReturnCode_t l_rc = STOP_SAVE_SUCCESS; + + do + { + // for SPR restore using 'Virtual Thread' and 'Physical Core' number + // In Fused Mode: + // bit b28 and b31 of PIR give physical core and b29 and b30 gives + // virtual thread id. + // In Non Fused Mode + // bit 28 and b29 of PIR give both logical and physical core number + // whereas b30 and b31 gives logical and virtual thread id. + bool fusedMode = false; + uint8_t coreThreadInfo = (uint8_t)i_pir; + *o_pCoreId = 0; + *o_pThreadId = 0; + l_rc = isFusedMode( i_pImage, &fusedMode ); + + if( l_rc ) + { + MY_ERR(" Checking Fused mode. Read failed 0x%08x", l_rc ); + break; + } + + if( fusedMode ) + { + if( coreThreadInfo & FUSED_CORE_BIT1 ) + { + *o_pThreadId = 2; + } + + if( coreThreadInfo & FUSED_CORE_BIT2 ) + { + *o_pThreadId += 1; + } + + if( coreThreadInfo & FUSED_CORE_BIT0 ) + { + *o_pCoreId = 2; + } + + if( coreThreadInfo & FUSED_CORE_BIT3 ) + { + *o_pCoreId += 1; + } + } + else + { + if( coreThreadInfo & FUSED_CORE_BIT0 ) + { + *o_pCoreId = 2; + } + + if ( coreThreadInfo & FUSED_CORE_BIT1 ) + { + *o_pCoreId += 1; + } + + if( coreThreadInfo & FUSED_CORE_BIT2 ) + { + *o_pThreadId = 2; + } + + if( coreThreadInfo & FUSED_CORE_BIT3 ) + { + *o_pThreadId += 1; + } + } + + MY_INF("Core Type %s", fusedMode ? "Fused" : "Un-Fused" ); + //quad field is not affected by fuse mode + *o_pCoreId += 4 * (( coreThreadInfo & 0x70 ) >> 4 ); + } + while(0); + + return l_rc; +} + +#ifdef __cplusplus +}//namespace stopImageSection ends +#endif diff --git a/libpore/p10_stop_util.H b/libpore/p10_stop_util.H new file mode 100644 index 000000000..7836dbc86 --- /dev/null +++ b/libpore/p10_stop_util.H @@ -0,0 +1,123 @@ +/* IBM_PROLOG_BEGIN_TAG */ +/* This is an automatically generated prolog. */ +/* */ +/* $Source: src/import/chips/p10/procedures/hwp/lib/p10_stop_util.H $ */ +/* */ +/* OpenPOWER HostBoot Project */ +/* */ +/* Contributors Listed Below - COPYRIGHT 2016,2019 */ +/* [+] International Business Machines Corp. */ +/* */ +/* */ +/* Licensed under the Apache License, Version 2.0 (the "License"); */ +/* you may not use this file except in compliance with the License. 
*/
+/* You may obtain a copy of the License at */
+/* */
+/* http://www.apache.org/licenses/LICENSE-2.0 */
+/* */
+/* Unless required by applicable law or agreed to in writing, software */
+/* distributed under the License is distributed on an "AS IS" BASIS, */
+/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or */
+/* implied. See the License for the specific language governing */
+/* permissions and limitations under the License. */
+/* */
+/* IBM_PROLOG_END_TAG */
+#ifndef __P10_STOP_UTIL_
+#define __P10_STOP_UTIL_
+
+#include
+
+#ifdef _AIX
+ #define __BYTE_ORDER __BIG_ENDIAN
+#elif __SKIBOOT__
+ #include
+#else
+ #include
+#endif
+
+#ifndef __PPE_PLAT
+ #include "p10_stop_api.H"
+#endif
+
+#ifdef FAPI_2
+ #include
+#endif
+
+///
+/// @file p10_stop_util.H
+/// @brief describes some utility functions for STOP API.
+///
+// *HWP HW Owner : Greg Still
+// *HWP FW Owner : Prem Shanker Jha
+// *HWP Team : PM
+// *HWP Level : 2
+// *HWP Consumed by : HB:HYP
+#ifndef __PPE_PLAT
+#ifdef __cplusplus
+namespace stopImageSection
+{
+#endif
+#endif //__PPE_PLAT
+/**
+ * @brief helper function to swizzle given input data
+ * @note swizzles bytes to handle endianness issue.
+ */
+#if( __BYTE_ORDER == __BIG_ENDIAN )
+
+// NOP if it is a big endian system
+#define SWIZZLE_2_BYTE(WORD) WORD
+#define SWIZZLE_4_BYTE(WORD) WORD
+#define SWIZZLE_8_BYTE(WORD) WORD
+
+#else
+#define SWIZZLE_2_BYTE(WORD) \
+ ( (((WORD) >> 8) & 0x00FF) | (((WORD) << 8) & 0xFF00) )
+
+#define SWIZZLE_4_BYTE(WORD) \
+ ( { uint64_t l_tmp64 = WORD; \
+ (uint32_t)( (((l_tmp64) >> 24) & 0x000000FF) | (((l_tmp64) >> 8) & 0x0000FF00) | \
+ (((l_tmp64) << 8) & 0x00FF0000) | (((l_tmp64) << 24) & 0xFF000000) ) ;\
+ })
+
+#define SWIZZLE_8_BYTE(WORD) \
+ ( (((WORD) >> 56) & 0x00000000000000FF) | \
+ (((WORD) >> 40) & 0x000000000000FF00) | \
+ (((WORD) >> 24) & 0x0000000000FF0000) | \
+ (((WORD) >> 8) & 0x00000000FF000000) | \
+ (((WORD) << 8) & 0x000000FF00000000) | \
+ (((WORD) << 24) & 0x0000FF0000000000) | \
+ (((WORD) << 40) & 0x00FF000000000000) | \
+ (((WORD) << 56) & 0xFF00000000000000) )
+#endif
+
+/**
+ * @brief enumerates bit positions of interest for PIR.
+ */
+enum
+{
+    FUSED_CORE_BIT0 = 0x08,
+    FUSED_CORE_BIT1 = 0x04,
+    FUSED_CORE_BIT2 = 0x02,
+    FUSED_CORE_BIT3 = 0x01,
+    QUAD_BITS = 0x70,
+};
+
+#ifndef __PPE_PLAT
+/**
+ * @brief returns core id and thread id by parsing a given PIR.
+ * @param i_pStopImage points to STOP image associated with a proc chip.
+ * @param i_pir PIR associated with a core's thread.
+ * @param o_coreId points to core id value obtained from PIR.
+ * @param o_threadId points to thread id value obtained from PIR.
+ * @return SUCCESS if function succeeds, error code otherwise.
+ */
+StopReturnCode_t getCoreAndThread_p10( void* const i_pStopImage,
+                                       const uint64_t i_pir,
+                                       uint32_t* o_coreId,
+                                       uint32_t* o_threadId );
+#ifdef __cplusplus
+} // namespace stopImageSection ends
+
+#endif
+#endif //__PPE_PLAT
+#endif
-- 
2.31.1

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 17:21:24 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 12:51:24 +0530
Subject: [Skiboot] [PATCH v2 46/59] xive/p10: Tune max_entries_in_modified when split_mode is on
In-Reply-To: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
Message-ID: <20210804072137.1147875-47-hegdevasant@linux.vnet.ibm.com>

From: Cédric Le Goater

This reduces the number of entries currently modified in the ESB cache.
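Roughly, with skiboot's SETFIELD(mask, reg, value) helper — which clears the bits under mask in reg and ORs value in at the mask's least-significant bit — the new line in the diff below is a sketch-equivalent of:

 /* cap concurrently-modified ESB cache entries at 0xE (14);
  * PPC_BITMASK(59,63) covers the register's five least-significant
  * bits (PPC bit 0 is the MSB), so the field shift is zero
  * (illustrative expansion only) */
 val = (val & ~VC_ESBC_CFG_MAX_ENTRIES_IN_MODIFIED) | 0xEull;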
Signed-off-by: Cédric Le Goater
Signed-off-by: Vasant Hegde
---
 hw/xive2.c | 1 +
 include/xive2-regs.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/xive2.c b/hw/xive2.c
index 1ad1f138d..56b02fc67 100644
--- a/hw/xive2.c
+++ b/hw/xive2.c
@@ -1622,6 +1622,7 @@ static void xive_config_esb_cache(struct xive *x)
 	if (xive_has_cap(x, CQ_XIVE_CAP_PHB_PQ_DISABLE)) {
 		val |= VC_ESBC_CFG_SPLIT_MODE | VC_ESBC_CFG_HASH_ARRAY_ENABLE;
+		val = SETFIELD(VC_ESBC_CFG_MAX_ENTRIES_IN_MODIFIED, val, 0xE);
 		xive_dbg(x, "ESB cache configured with split mode "
 			 "and hash array. VC_ESBC_CFG=%016llx\n", val);
 	} else

diff --git a/include/xive2-regs.h b/include/xive2-regs.h
index c2ed265f6..1f7a3e721 100644
--- a/include/xive2-regs.h
+++ b/include/xive2-regs.h
@@ -233,6 +233,7 @@
 #define VC_ESBC_CFG_HASH_ARRAY_ENABLE PPC_BIT(40)
 #define VC_ESBC_CFG_HASH_STORE_MODE PPC_BITMASK(41,42)
 #define VC_ESBC_CFG_SPLIT_MODE PPC_BIT(56)
+#define VC_ESBC_CFG_MAX_ENTRIES_IN_MODIFIED PPC_BITMASK(59,63)

 /* EASC flush control register */
 #define X_VC_EASC_FLUSH_CTRL 0x160
-- 
2.31.1

From clg at kaod.org Wed Aug 4 18:43:43 2021
From: clg at kaod.org (=?UTF-8?Q?C=c3=a9dric_Le_Goater?=)
Date: Wed, 4 Aug 2021 10:43:43 +0200
Subject: [Skiboot] [PATCH v2 57/59] xive2: Add NCU_SPEC_BAR to stop engine for restore
In-Reply-To: <20210804072137.1147875-58-hegdevasant@linux.vnet.ibm.com>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
 <20210804072137.1147875-58-hegdevasant@linux.vnet.ibm.com>
Message-ID: <5a557906-295b-4f69-67b5-731ebaf0b731@kaod.org>

On 8/4/21 9:21 AM, Vasant Hegde wrote:
> From: Vaidyanathan Srinivasan
> 
> P10 Stop engines have apis similar to P9 to set xscom restores
> after wakeup from deep-sleep states.
> 
> This xscom restore will be used to support STOP11 on P10.
> 
> Signed-off-by: Vaidyanathan Srinivasan
> Signed-off-by: Pratik Rajesh Sampat
> Signed-off-by: Vasant Hegde
> ---
> hw/xive2.c | 29 ++++++++++++++++++++++++-----
> 1 file changed, 24 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/xive2.c b/hw/xive2.c
> index a7b45a005..aece99a0d 100644
> --- a/hw/xive2.c
> +++ b/hw/xive2.c
> @@ -20,8 +20,7 @@
> #include
> #include
> #include
> -#include /* TODO (p10): need P10 stop state engine */
> -
> +#include
>
> /* Verbose debug */
> #undef XIVE_VERBOSE_DEBUG
> @@ -3014,10 +3013,30 @@ static void xive_configure_ex_special_bar(struct xive *x, struct cpu_thread *c)
> 
> void xive2_late_init(void)
> {
> +	struct cpu_thread *c;
> +
> 	prlog(PR_INFO, "SLW: Configuring self-restore for NCU_SPEC_BAR\n");
> -	/*
> -	 * TODO (p10): need P10 stop state engine and fix for STOP11
> -	 */
> +	for_each_present_cpu(c) {
> +		if(cpu_is_thread0(c)) {
> +			struct proc_chip *chip = get_chip(c->chip_id);
> +			struct xive *x = chip->xive;
> +			uint64_t xa, val, rc;
> +			xa = XSCOM_ADDR_P10_NCU(pir_to_core_id(c->pir), P10_NCU_SPEC_BAR);
> +			val = (uint64_t)x->tm_base | P10_NCU_SPEC_BAR_ENABLE;
> +			/* Bail out if wakeup engine has already failed */
> +			if (wakeup_engine_state != WAKEUP_ENGINE_PRESENT) {
> +				prlog(PR_ERR, "XIVE proc_stop_api fail detected\n");
> +				break;
> +			}
> +			rc = proc_stop_save_scom((void *)chip->homer_base, xa, val,
> +						 PROC_STOP_SCOM_REPLACE, PROC_STOP_SECTION_L3);
> +			if (rc) {
> +				xive_cpu_err(c, "proc_stop_save_scom failed for NCU_SPEC_BAR rc=%lld\n",
> +					     rc);
> +				wakeup_engine_state = WAKEUP_ENGINE_FAILED;
> +			}
> +		}
> +	}
> }
> 
> static void xive_provision_cpu(struct xive_cpu_state *xs, struct cpu_thread *c)
> 

From hegdevasant at linux.vnet.ibm.com Wed Aug 4 19:35:15 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Wed, 4 Aug 2021 15:05:15 +0530
Subject: [Skiboot] [PATCH v2 57/59] xive2: Add NCU_SPEC_BAR to stop engine for restore
In-Reply-To: <5a557906-295b-4f69-67b5-731ebaf0b731@kaod.org>
References: <20210804072137.1147875-1-hegdevasant@linux.vnet.ibm.com>
 <20210804072137.1147875-58-hegdevasant@linux.vnet.ibm.com>
 <5a557906-295b-4f69-67b5-731ebaf0b731@kaod.org>
Message-ID: <13d24990-a23b-f57b-44b1-d1b182734a9c@linux.vnet.ibm.com>

On 8/4/21 2:13 PM, Cédric Le Goater wrote:
> On 8/4/21 9:21 AM, Vasant Hegde wrote:
>> From: Vaidyanathan Srinivasan
>>
>> P10 Stop engines have apis similar to P9 to set xscom restores
>> after wakeup from deep-sleep states.
>>
>> This xscom restore will be used to support STOP11 on P10.
>>
>> Signed-off-by: Vaidyanathan Srinivasan
>> Signed-off-by: Pratik Rajesh Sampat
>> Signed-off-by: Vasant Hegde
>> ---
>> hw/xive2.c | 29 ++++++++++++++++++++++++-----
>> 1 file changed, 24 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/xive2.c b/hw/xive2.c
>> index a7b45a005..aece99a0d 100644
>> --- a/hw/xive2.c
>> +++ b/hw/xive2.c
>> @@ -20,8 +20,7 @@
>> #include
>> #include
>> #include
>> -#include /* TODO (p10): need P10 stop state engine */
>> -
>> +#include
>>
> 
> This change could be merged in the initial patch :/ Sorry about that.

I wanted to remove `p9_stop_api.H` from the initial patch and add
p10_stop_api.H in this patch. Somehow I missed it before sending.
If this is the only change then I can take care of it before merging.

-Vasant

From hegdevasant at linux.vnet.ibm.com Thu Aug 5 22:23:02 2021
From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde)
Date: Thu, 5 Aug 2021 17:53:02 +0530
Subject: [Skiboot] [PATCH 1/2] POWER9 Cleanups: de-assert SPW
In-Reply-To: <20210802143354.971727-1-hegdevasant@linux.vnet.ibm.com>
References: <20210802143354.971727-1-hegdevasant@linux.vnet.ibm.com>
Message-ID:

On 8/2/21 8:03 PM, Vasant Hegde wrote:
> From: "Pratik R. Sampat"
> 
> De-assert special wakeup bits for the case when SPWU bit is set, however
> the core is gated to maintain a coherent state for special wakeup.

Thanks! Merged series to master as of 5f670896.
-Vasant From hegdevasant at linux.vnet.ibm.com Thu Aug 5 22:26:40 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Thu, 5 Aug 2021 17:56:40 +0530 Subject: [Skiboot] [PATCH] Don't warn about stack size on host binaries In-Reply-To: <20210802110210.13d1300c@kryten.localdomain> References: <20210802110210.13d1300c@kryten.localdomain> Message-ID: On 8/2/21 6:32 AM, Anton Blanchard wrote: > I'm hitting a stack size warning when building pflash: > > common/arch_flash_powerpc.c: In function ?get_dev_mtd.constprop?: > common/arch_flash_powerpc.c:177:1: error: the frame size of 8240 > bytes is larger than 2048 bytes [-Werror=frame-larger-than=] > > That function has 2 PATH_MAX strings, each of which will use up 4kB of > stack. > > We've tried to work around the issue of stack size warnings on host > binaries in a few places, with limited success. This patch removes the > check completely instead. We need to modify the HOSTCFLAGS variable > assignment to be immediate for this to work. > > Signed-off-by: Anton Blanchard Thanks! Merged to master as fcc828cdc. -Vasant From hegdevasant at linux.vnet.ibm.com Thu Aug 5 22:27:15 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Thu, 5 Aug 2021 17:57:15 +0530 Subject: [Skiboot] [PATCH] Makefile: Avoid errors with GCC 11 In-Reply-To: <20210729075438.652246-1-joel@jms.id.au> References: <20210729075438.652246-1-joel@jms.id.au> Message-ID: <2c27ed38-88c8-b3af-a2d2-3a09adda9e63@linux.vnet.ibm.com> On 7/29/21 1:24 PM, Joel Stanley wrote: > GCC's string and memory functions blow up as the compiler thinks the > objects have no size: > Thanks! Merged to master as 8246de86. -Vasant From hegdevasant at linux.vnet.ibm.com Fri Aug 6 18:52:26 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Fri, 6 Aug 2021 14:22:26 +0530 Subject: [Skiboot] [PATCH] hello_world: Add p10 mambo tests Message-ID: <20210806085226.1440163-1-hegdevasant@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde --- test/hello_world/Makefile.check | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/test/hello_world/Makefile.check b/test/hello_world/Makefile.check index 0390cf662..8cf15cb2f 100644 --- a/test/hello_world/Makefile.check +++ b/test/hello_world/Makefile.check @@ -4,14 +4,18 @@ HELLO_WORLD_STB_TEST := test/hello_world/hello_kernel/hello_kernel.stb .PHONY: hello_world-tests hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-smt-mambo) hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-smt-p9-mambo) +hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-smt-p10-mambo) hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-mambo) hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-p9-mambo) +hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-p10-mambo) hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-qemu) hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-mambo) hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-p9-mambo) +hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-p10-mambo) hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-mambo) hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-p9-mambo) +hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-p10-mambo) boot-tests: hello_world-tests check: hello_world-tests @@ -29,12 +33,18 @@ $(HELLO_WORLD_TEST:%=%-check-smt-mambo): %-check-smt-mambo: % skiboot.lid $(HELLO_WORLD_TEST:%=%-check-smt-p9-mambo): %-check-smt-p9-mambo: % skiboot.lid $(call Q , BOOT TEST , THREADS=2 ./test/hello_world/run_mambo_p9_hello_world.sh , $@) +$(HELLO_WORLD_TEST:%=%-check-smt-p10-mambo): 
%-check-smt-p10-mambo: % skiboot.lid + $(call Q , BOOT TEST , THREADS=2 ./test/hello_world/run_mambo_p10_hello_world.sh , $@) + $(HELLO_WORLD_TEST:%=%-check-mambo): %-check-mambo: % skiboot.lid $(call Q , BOOT TEST , ./test/hello_world/run_mambo_hello_world.sh, $@) $(HELLO_WORLD_TEST:%=%-check-p9-mambo): %-check-p9-mambo: % skiboot.lid $(call Q , BOOT TEST , ./test/hello_world/run_mambo_p9_hello_world.sh, $@) +$(HELLO_WORLD_TEST:%=%-check-p10-mambo): %-check-p10-mambo: % skiboot.lid + $(call Q , BOOT TEST , ./test/hello_world/run_mambo_p10_hello_world.sh, $@) + # and now, with secure and trusted boot: $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-mambo): %-check-stb-smt-mambo: % skiboot.lid.stb $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 THREADS=2 ./test/hello_world/run_mambo_hello_world.sh , $@) @@ -42,12 +52,18 @@ $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-mambo): %-check-stb-smt-mambo: % skiboo $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-p9-mambo): %-check-stb-smt-p9-mambo: % skiboot.lid.stb $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 THREADS=2 ./test/hello_world/run_mambo_p9_hello_world.sh , $@) +$(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-p10-mambo): %-check-stb-smt-p10-mambo: % skiboot.lid.stb + $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 THREADS=2 ./test/hello_world/run_mambo_p10_hello_world.sh , $@) + $(HELLO_WORLD_STB_TEST:%=%-check-stb-mambo): %-check-stb-mambo: % skiboot.lid.stb $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 ./test/hello_world/run_mambo_hello_world.sh, $@) $(HELLO_WORLD_STB_TEST:%=%-check-stb-p9-mambo): %-check-stb-p9-mambo: % skiboot.lid.stb $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 ./test/hello_world/run_mambo_p9_hello_world.sh, $@) +$(HELLO_WORLD_STB_TEST:%=%-check-stb-p10-mambo): %-check-stb-p10-mambo: % skiboot.lid.stb + $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 ./test/hello_world/run_mambo_p10_hello_world.sh, $@) + # qemu $(HELLO_WORLD_TEST:%=%-check-qemu): %-check-qemu: % skiboot.lid -- 2.31.1 From npiggin at gmail.com Sat Aug 7 14:13:46 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:13:46 +1000 Subject: [Skiboot] [PATCH v1 1/2] asm/head: Fix P10 HILE for little endian build In-Reply-To: <20210807041347.395291-1-npiggin@gmail.com> References: <20210807041347.395291-1-npiggin@gmail.com> Message-ID: <20210807041347.395291-2-npiggin@gmail.com> Fixes: 891ed8df67 ("Initial POWER10 enablement") Signed-off-by: Nicholas Piggin --- asm/head.S | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/asm/head.S b/asm/head.S index fa8933b14..b2755344d 100644 --- a/asm/head.S +++ b/asm/head.S @@ -829,8 +829,13 @@ init_shared_sprs: /* HID0: * Boot with PPC_BIT(5) set (dis_recovery). * Leave bit 5 set to disable recovery (due to HW570622) + * Set/clear bit 4 (HILE) depending on skiboot endian */ +#if HAVE_BIG_ENDIAN LOAD_IMM64(%r3, PPC_BIT(5)) +#else + LOAD_IMM64(%r3, PPC_BIT(5) | PPC_BIT(4)) +#endif sync mtspr SPR_HID0,%r3 isync -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:13:45 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:13:45 +1000 Subject: [Skiboot] [PATCH v1 0/2] Default to little endian Message-ID: <20210807041347.395291-1-npiggin@gmail.com> Starting from P10 I propose we default to little endian skiboot. 
Thanks, Nick Nicholas Piggin (2): asm/head: Fix P10 HILE for little endian build Build skiboot little-endian by default Makefile | 12 +++++++++--- asm/head.S | 5 +++++ 2 files changed, 14 insertions(+), 3 deletions(-) -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:13:47 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:13:47 +1000 Subject: [Skiboot] [PATCH v1 2/2] Build skiboot little-endian by default In-Reply-To: <20210807041347.395291-1-npiggin@gmail.com> References: <20210807041347.395291-1-npiggin@gmail.com> Message-ID: <20210807041347.395291-3-npiggin@gmail.com> LE is the way to go. Significantly smaller, less stack, faster, and with later OPAL calling convention changes, it can avoid endian flips when called from an LE OS, and there are other new features in the pipeline that may initially only be implemented for LE OS and LE skiboot. This reduces skiboot.lid.xz size by 10KiB. Signed-off-by: Nicholas Piggin --- Makefile | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/Makefile b/Makefile index d236df9ec..6e5b91d84 100644 --- a/Makefile +++ b/Makefile @@ -48,12 +48,18 @@ KERNEL ?= # STACK_CHECK ?= $(DEBUG) +BIG_ENDIAN ?= 0 +ifeq ($(BIG_ENDIAN),1) +LITTLE_ENDIAN = 0 +else +LITTLE_ENDIAN ?= 1 +endif + # # Experimental (unsupported) build options # -# Little-endian does not yet build. Include it here to set ELF ABI. -LITTLE_ENDIAN ?= 0 -# ELF v2 ABI is more efficient and compact +# ELF v2 ABI is more efficient and compact. +# This can be set for big-endian builds. Clearing it for LE probably won't work. ELF_ABI_v2 ?= $(LITTLE_ENDIAN) # Discard unreferenced code and data at link-time DEAD_CODE_ELIMINATION ?= 0 -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:50 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:50 +1000 Subject: [Skiboot] [PATCH v2 00/10] hwprobe patches Message-ID: <20210807042100.399449-1-npiggin@gmail.com> Since v1: - Rebased on upstream Nicholas Piggin (2): Remove support for POWER8 DD1 hw/slw: Move P8 bits behind CONFIG_P8 Stewart Smith (8): Introduce hwprobe facility to avoid hard-coding probe functions hwprobe: convert PHB, NPU subsystems to hwprobe Add CONFIG_P8 with PHB3 behind it hwprobe: convert vas_init(), nx_init() npu: move npu_set_fence_state() to phb_ops npu: Move npu.o and npu-hw-procedules.o under CONIFG_P8 platforms: put P8 platforms behind CONFIG_P8 npu: Add CONFIG_NPU to optionally skip NPU code Makefile | 4 + Makefile.main | 19 +- core/Makefile.inc | 1 + core/cpu.c | 30 +- core/fast-reboot.c | 2 + core/hmi.c | 12 +- core/hwprobe.c | 70 +++++ core/init.c | 18 +- core/platform.c | 1 - hw/Makefile.inc | 20 +- hw/npu.c | 9 +- hw/npu2-common.c | 2 + hw/npu2.c | 1 + hw/npu3.c | 2 + hw/nx.c | 2 + hw/phb3.c | 2 +- hw/phb4.c | 2 + hw/slw.c | 491 ++++++--------------------------- hw/vas.c | 2 + include/npu.h | 1 - include/npu2.h | 6 + include/pci.h | 6 + include/skiboot.h | 44 ++- libpore/Makefile.inc | 8 +- platforms/astbmc/Makefile.inc | 23 +- platforms/ibm-fsp/Makefile.inc | 7 +- skiboot.lds.S | 6 + 27 files changed, 330 insertions(+), 461 deletions(-) create mode 100644 core/hwprobe.c -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:51 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:51 +1000 Subject: [Skiboot] [PATCH v2 01/10] Remove support for POWER8 DD1 In-Reply-To: <20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-2-npiggin@gmail.com> This 
significantly simplifies the SLW code. HILE is now always supported. Reviewed-by: Stewart Smith Signed-off-by: Nicholas Piggin --- core/cpu.c | 23 ++-- hw/slw.c | 323 ---------------------------------------------- include/skiboot.h | 5 - 3 files changed, 9 insertions(+), 342 deletions(-) diff --git a/core/cpu.c b/core/cpu.c index f58aeb27a..60a9ea1c3 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -35,7 +35,6 @@ unsigned int cpu_thread_count; unsigned int cpu_max_pir; struct cpu_thread *boot_cpu; static struct lock reinit_lock = LOCK_UNLOCKED; -static bool hile_supported; static bool radix_supported; static unsigned long hid0_hile; static unsigned long hid0_attn; @@ -999,27 +998,23 @@ void init_boot_cpu(void) case PVR_TYPE_P8E: case PVR_TYPE_P8: proc_gen = proc_gen_p8; - hile_supported = PVR_VERS_MAJ(mfspr(SPR_PVR)) >= 2; hid0_hile = SPR_HID0_POWER8_HILE; hid0_attn = SPR_HID0_POWER8_ENABLE_ATTN; break; case PVR_TYPE_P8NVL: proc_gen = proc_gen_p8; - hile_supported = true; hid0_hile = SPR_HID0_POWER8_HILE; hid0_attn = SPR_HID0_POWER8_ENABLE_ATTN; break; case PVR_TYPE_P9: case PVR_TYPE_P9P: proc_gen = proc_gen_p9; - hile_supported = true; radix_supported = true; hid0_hile = SPR_HID0_POWER9_HILE; hid0_attn = SPR_HID0_POWER9_ENABLE_ATTN; break; case PVR_TYPE_P10: proc_gen = proc_gen_p10; - hile_supported = true; radix_supported = true; hid0_hile = SPR_HID0_POWER10_HILE; hid0_attn = SPR_HID0_POWER10_ENABLE_ATTN; @@ -1056,6 +1051,11 @@ void init_boot_cpu(void) cpu_thread_count = 1; } + if (proc_gen == proc_gen_p8 && (PVR_VERS_MAJ(mfspr(SPR_PVR)) == 1)) { + prerror("CPU: POWER8 DD1 is not supported\n"); + abort(); + } + if (is_power9n(pvr) && (PVR_VERS_MAJ(pvr) == 1)) { prerror("CPU: POWER9N DD1 is not supported\n"); abort(); @@ -1597,7 +1597,7 @@ static int64_t opal_reinit_cpus(uint64_t flags) } /* * Now we need to mark ourselves "active" or we'll be skipped - * by the various "for_each_active_..." calls done by slw_reinit() + * by the various "for_each_active_..." */ this_cpu()->state = cpu_state_active; this_cpu()->in_reinit = true; @@ -1611,10 +1611,8 @@ static int64_t opal_reinit_cpus(uint64_t flags) */ cpu_cleanup_all(); - /* If HILE change via HID0 is supported ... 
*/ - if (hile_supported && - (flags & (OPAL_REINIT_CPUS_HILE_BE | - OPAL_REINIT_CPUS_HILE_LE))) { + if (flags & (OPAL_REINIT_CPUS_HILE_BE | + OPAL_REINIT_CPUS_HILE_LE)) { bool hile = !!(flags & OPAL_REINIT_CPUS_HILE_LE); flags &= ~(OPAL_REINIT_CPUS_HILE_BE | OPAL_REINIT_CPUS_HILE_LE); @@ -1669,10 +1667,7 @@ static int64_t opal_reinit_cpus(uint64_t flags) rc = OPAL_SUCCESS; } - /* Handle P8 DD1 SLW reinit */ - if (flags != 0 && proc_gen == proc_gen_p8 && !hile_supported) - rc = slw_reinit(flags); - else if (flags != 0) + if (flags != 0) rc = OPAL_UNSUPPORTED; /* And undo the above */ diff --git a/hw/slw.c b/hw/slw.c index 56ba05b0a..178ee4f85 100644 --- a/hw/slw.c +++ b/hw/slw.c @@ -29,10 +29,6 @@ #include #include -static uint32_t slw_saved_reset[0x100]; - -static bool slw_current_le = false; - enum wakeup_engine_states wakeup_engine_state = WAKEUP_ENGINE_NOT_PRESENT; bool has_deep_states = false; @@ -52,125 +48,6 @@ DEFINE_LOG_ENTRY(OPAL_RC_SLW_REG, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, OPAL_PLATFORM_FIRMWARE, OPAL_INFO, OPAL_NA); -static void slw_do_rvwinkle(void *data) -{ - struct cpu_thread *cpu = this_cpu(); - struct cpu_thread *master = data; - uint64_t lpcr = mfspr(SPR_LPCR); - struct proc_chip *chip; - - /* Setup our ICP to receive IPIs */ - icp_prep_for_pm(); - - /* Setup LPCR to wakeup on external interrupts only */ - mtspr(SPR_LPCR, ((lpcr & ~SPR_LPCR_P8_PECE) | SPR_LPCR_P8_PECE2)); - isync(); - - prlog(PR_DEBUG, "SLW: CPU PIR 0x%04x going to rvwinkle...\n", - cpu->pir); - - /* Tell that we got it */ - cpu->state = cpu_state_rvwinkle; - - enter_p8_pm_state(1); - - /* Restore SPRs */ - init_shared_sprs(); - init_replicated_sprs(); - - /* Ok, it's ours again */ - cpu->state = cpu_state_active; - - prlog(PR_DEBUG, "SLW: CPU PIR 0x%04x woken up !\n", cpu->pir); - - /* Cleanup our ICP */ - reset_cpu_icp(); - - /* Resync timebase */ - chiptod_wakeup_resync(); - - /* Restore LPCR */ - mtspr(SPR_LPCR, lpcr); - isync(); - - /* If we are passed a master pointer we are the designated - * waker, let's proceed. If not, return, we are finished. - */ - if (!master) - return; - - prlog(PR_DEBUG, "SLW: CPU PIR 0x%04x waiting for master...\n", - cpu->pir); - - /* Allriiiight... now wait for master to go down */ - while(master->state != cpu_state_rvwinkle) - sync(); - - /* XXX Wait one second ! (should check xscom state ? 
) */ - time_wait_ms(1000); - - for_each_chip(chip) { - struct cpu_thread *c; - uint64_t tmp; - for_each_available_core_in_chip(c, chip->id) { - xscom_read(chip->id, - XSCOM_ADDR_P8_EX_SLAVE(pir_to_core_id(c->pir), - EX_PM_IDLE_STATE_HISTORY_PHYP), - &tmp); - prlog(PR_TRACE, "SLW: core %x:%x" - " history: 0x%016llx (mid2)\n", - chip->id, pir_to_core_id(c->pir), - tmp); - } - } - - prlog(PR_DEBUG, "SLW: Waking master (PIR 0x%04x)...\n", master->pir); - - /* Now poke all the secondary threads on the master's core */ - for_each_cpu(cpu) { - if (!cpu_is_sibling(cpu, master) || (cpu == master)) - continue; - icp_kick_cpu(cpu); - - /* Wait for it to claim to be back (XXX ADD TIMEOUT) */ - while(cpu->state != cpu_state_active) - sync(); - } - - /* Now poke the master and be gone */ - icp_kick_cpu(master); -} - -static void slw_patch_reset(void) -{ - uint32_t *src, *dst, *sav; - - src = &reset_patch_start; - dst = (uint32_t *)0x100; - sav = slw_saved_reset; - while(src < &reset_patch_end) { - *(sav++) = *(dst); - *(dst++) = *(src++); - } - sync_icache(); -} - -static void slw_unpatch_reset(void) -{ - extern uint32_t reset_patch_start; - extern uint32_t reset_patch_end; - uint32_t *src, *dst, *sav; - - src = &reset_patch_start; - dst = (uint32_t *)0x100; - sav = slw_saved_reset; - while(src < &reset_patch_end) { - *(dst++) = *(sav++); - src++; - } - sync_icache(); -} - static bool slw_general_init(struct proc_chip *chip, struct cpu_thread *c) { uint32_t core = pir_to_core_id(c->pir); @@ -274,15 +151,6 @@ static bool slw_set_overrides_p9(struct proc_chip *chip, struct cpu_thread *c) return true; } -static bool slw_unset_overrides(struct proc_chip *chip, struct cpu_thread *c) -{ - uint32_t core = pir_to_core_id(c->pir); - - /* XXX FIXME: Save and restore the overrides */ - prlog(PR_DEBUG, "SLW: slw_unset_overrides %x:%x\n", chip->id, core); - return true; -} - static bool slw_set_idle_mode(struct proc_chip *chip, struct cpu_thread *c) { uint32_t core = pir_to_core_id(c->pir); @@ -1201,197 +1069,6 @@ void add_cpu_idle_state_properties(void) free(pm_ctrl_reg_mask_buf); } -static void slw_cleanup_core(struct proc_chip *chip, struct cpu_thread *c) -{ - uint64_t tmp; - int rc; - - /* Display history to check transition */ - rc = xscom_read(chip->id, - XSCOM_ADDR_P8_EX_SLAVE(pir_to_core_id(c->pir), - EX_PM_IDLE_STATE_HISTORY_PHYP), - &tmp); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_GET), - "SLW: Failed to read PM_IDLE_STATE_HISTORY\n"); - /* XXX error handling ? return false; */ - } - - prlog(PR_DEBUG, "SLW: core %x:%x history: 0x%016llx (new1)\n", - chip->id, pir_to_core_id(c->pir), tmp); - - rc = xscom_read(chip->id, - XSCOM_ADDR_P8_EX_SLAVE(pir_to_core_id(c->pir), - EX_PM_IDLE_STATE_HISTORY_PHYP), - &tmp); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_GET), - "SLW: Failed to read PM_IDLE_STATE_HISTORY\n"); - /* XXX error handling ? return false; */ - } - - prlog(PR_DEBUG, "SLW: core %x:%x history: 0x%016llx (new2)\n", - chip->id, pir_to_core_id(c->pir), tmp); - - /* - * XXX FIXME: Error out if the transition didn't reach rvwinkle ? 
- */ - - /* - * XXX FIXME: We should restore a bunch of the EX bits we - * overwrite to sane values here - */ - slw_unset_overrides(chip, c); -} - -static void slw_cleanup_chip(struct proc_chip *chip) -{ - struct cpu_thread *c; - - for_each_available_core_in_chip(c, chip->id) - slw_cleanup_core(chip, c); -} - -static void slw_patch_scans(struct proc_chip *chip, bool le_mode) -{ - int64_t rc; - uint64_t old_val, new_val; - - rc = sbe_xip_get_scalar((void *)chip->slw_base, - "skip_ex_override_ring_scans", &old_val); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_REG), - "SLW: Failed to read scan override on chip %d\n", - chip->id); - return; - } - - new_val = le_mode ? 0 : 1; - - prlog(PR_TRACE, "SLW: Chip %d, LE value was: %lld, setting to %lld\n", - chip->id, old_val, new_val); - - rc = sbe_xip_set_scalar((void *)chip->slw_base, - "skip_ex_override_ring_scans", new_val); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_REG), - "SLW: Failed to set LE mode on chip %d\n", chip->id); - return; - } -} - -int64_t slw_reinit(uint64_t flags) -{ - struct proc_chip *chip; - struct cpu_thread *cpu; - bool has_waker = false; - bool target_le = slw_current_le; - - if (flags & OPAL_REINIT_CPUS_HILE_BE) - target_le = false; - if (flags & OPAL_REINIT_CPUS_HILE_LE) - target_le = true; - - prlog(PR_TRACE, "SLW Reinit from CPU PIR 0x%04x," - " HILE set to %s endian...\n", - this_cpu()->pir, - target_le ? "little" : "big"); - - /* Prepare chips/cores for rvwinkle */ - for_each_chip(chip) { - if (!chip->slw_base) { - log_simple_error(&e_info(OPAL_RC_SLW_INIT), - "SLW: Not found on chip %d\n", chip->id); - return OPAL_HARDWARE; - } - - slw_patch_scans(chip, target_le); - } - slw_current_le = target_le; - - /* XXX Save HIDs ? Or do that in head.S ... */ - - slw_patch_reset(); - - /* rvwinkle everybody and pick one to wake me once I rvwinkle myself */ - for_each_available_cpu(cpu) { - struct cpu_thread *master = NULL; - - if (cpu == this_cpu()) - continue; - - /* Pick up a waker for myself: it must not be a sibling of - * the current CPU and must be a thread 0 (so it gets to - * sync its timebase before doing time_wait_ms() - */ - if (!has_waker && !cpu_is_sibling(cpu, this_cpu()) && - cpu_is_thread0(cpu)) { - has_waker = true; - master = this_cpu(); - } - __cpu_queue_job(cpu, "slw_do_rvwinkle", - slw_do_rvwinkle, master, true); - - /* Wait for it to claim to be down */ - while(cpu->state != cpu_state_rvwinkle) - sync(); - } - - /* XXX Wait one second ! (should check xscom state ? ) */ - prlog(PR_TRACE, "SLW: Waiting one second...\n"); - time_wait_ms(1000); - prlog(PR_TRACE, "SLW: Done.\n"); - - for_each_chip(chip) { - struct cpu_thread *c; - uint64_t tmp; - for_each_available_core_in_chip(c, chip->id) { - xscom_read(chip->id, - XSCOM_ADDR_P8_EX_SLAVE(pir_to_core_id(c->pir), - EX_PM_IDLE_STATE_HISTORY_PHYP), - &tmp); - prlog(PR_DEBUG, "SLW: core %x:%x" - " history: 0x%016llx (mid)\n", - chip->id, pir_to_core_id(c->pir), tmp); - } - } - - - /* Wake everybody except on my core */ - for_each_cpu(cpu) { - if (cpu->state != cpu_state_rvwinkle || - cpu_is_sibling(cpu, this_cpu())) - continue; - icp_kick_cpu(cpu); - - /* Wait for it to claim to be back (XXX ADD TIMEOUT) */ - while(cpu->state != cpu_state_active) - sync(); - } - - /* Did we find a waker ? 
If we didn't, that means we had no - * other core in the system, we can't do it - */ - if (!has_waker) { - prlog(PR_TRACE, "SLW: No candidate waker, giving up !\n"); - return OPAL_HARDWARE; - } - - /* Our siblings are rvwinkling, and our waker is waiting for us - * so let's just go down now - */ - slw_do_rvwinkle(NULL); - - slw_unpatch_reset(); - - for_each_chip(chip) - slw_cleanup_chip(chip); - - prlog(PR_TRACE, "SLW Reinit complete !\n"); - - return OPAL_SUCCESS; -} - static void slw_patch_regs(struct proc_chip *chip) { struct cpu_thread *c; diff --git a/include/skiboot.h b/include/skiboot.h index f3378ec28..fa5323231 100644 --- a/include/skiboot.h +++ b/include/skiboot.h @@ -311,11 +311,6 @@ extern enum wakeup_engine_states wakeup_engine_state; extern bool has_deep_states; extern void nx_p9_rng_late_init(void); - - -/* SLW reinit function for switching core settings */ -extern int64_t slw_reinit(uint64_t flags); - /* Patch SPR in SLW image */ extern int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val); -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:52 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:52 +1000 Subject: [Skiboot] [PATCH v2 02/10] Introduce hwprobe facility to avoid hard-coding probe functions In-Reply-To: <20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-3-npiggin@gmail.com> From: Stewart Smith hwprobe is a little system to have different hardware probing modules run in the dependency order they choose rather than hard coding that order in core/init.c. Signed-off-by: Stewart Smith --- core/Makefile.inc | 1 + core/hwprobe.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++ core/init.c | 3 ++ include/skiboot.h | 39 +++++++++++++++++++++++++- skiboot.lds.S | 6 ++++ 5 files changed, 118 insertions(+), 1 deletion(-) create mode 100644 core/hwprobe.c diff --git a/core/Makefile.inc b/core/Makefile.inc index 829800e5b..f80019b6a 100644 --- a/core/Makefile.inc +++ b/core/Makefile.inc @@ -13,6 +13,7 @@ CORE_OBJS += timer.o i2c.o rtc.o flash.o sensor.o ipmi-opal.o CORE_OBJS += flash-subpartition.o bitmap.o buddy.o pci-quirk.o powercap.o psr.o CORE_OBJS += pci-dt-slot.o direct-controls.o cpufeatures.o CORE_OBJS += flash-firmware-versions.o opal-dump.o +CORE_OBJS += hwprobe.o ifeq ($(SKIBOOT_GCOV),1) CORE_OBJS += gcov-profiling.o diff --git a/core/hwprobe.c b/core/hwprobe.c new file mode 100644 index 000000000..de331af48 --- /dev/null +++ b/core/hwprobe.c @@ -0,0 +1,70 @@ +// SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later +/* Copyright 2021 Stewart Smith */ + +#define pr_fmt(fmt) "HWPROBE: " fmt +#include +#include + +static bool hwprobe_deps_satisfied(const struct hwprobe *hwp) +{ + struct hwprobe *hwprobe; + const char *dep; + unsigned int i; + + if (hwp->deps == NULL) + return true; + + dep = hwp->deps[0]; + + prlog(PR_TRACE, "Checking deps for %s\n", hwp->name); + + while (dep != NULL) { + prlog(PR_TRACE, "Checking %s dep %s\n", hwp->name, dep); + hwprobe = &__hwprobes_start; + for (i = 0; &hwprobe[i] < &__hwprobes_end; i++) { + if(strcmp(hwprobe[i].name,dep) == 0 && + !hwprobe[i].probed) + return false; + } + dep++; + } + + prlog(PR_TRACE, "deps for %s are satisfied!\n", hwp->name); + return true; + +} + +void probe_hardware(void) +{ + struct hwprobe *hwprobe; + unsigned int i; + bool work_todo = true; + bool did_something = true; + + while (work_todo) { + work_todo = false; + did_something = false; + hwprobe = &__hwprobes_start; + 
prlog(PR_DEBUG, "Begin loop\n"); + for (i = 0; &hwprobe[i] < &__hwprobes_end; i++) { + if (hwprobe[i].probed) + continue; + if (hwprobe_deps_satisfied(&hwprobe[i])) { + prlog(PR_DEBUG, "Probing %s...\n", hwprobe[i].name); + if (hwprobe[i].probe) + hwprobe[i].probe(); + did_something = true; + hwprobe[i].probed = true; + } else { + prlog(PR_DEBUG, "Dependencies for %s not yet satisfied, skipping\n", + hwprobe[i].name); + work_todo = true; + } + } + + if (work_todo && !did_something) { + prlog(PR_ERR, "Cannot satisfy dependencies! Bailing out\n"); + break; + } + } +} diff --git a/core/init.c b/core/init.c index a8bac28a8..61934c9fe 100644 --- a/core/init.c +++ b/core/init.c @@ -1372,6 +1372,9 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) probe_npu2(); probe_npu3(); + /* Probe all HWPROBE hardware we have code linked for*/ + probe_hardware(); + /* Initialize PCI */ pci_init_slots(); diff --git a/include/skiboot.h b/include/skiboot.h index fa5323231..f83fcbdf6 100644 --- a/include/skiboot.h +++ b/include/skiboot.h @@ -1,5 +1,7 @@ // SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later -/* Copyright 2013-2019 IBM Corp. */ +/* Copyright 2013-2019 IBM Corp. + * Copyright 2021 Stewart Smith + */ #ifndef __SKIBOOT_H #define __SKIBOOT_H @@ -341,4 +343,39 @@ extern int fake_nvram_info(uint32_t *total_size); extern int fake_nvram_start_read(void *dst, uint32_t src, uint32_t len); extern int fake_nvram_write(uint32_t offset, void *src, uint32_t size); +/* + * A bunch of hardware needs to be probed, sometimes in a particular order. + * Very simple dependency graph, with a even simpler way to resolve it. + * But it means we can now at link time choose what hardware we support. + * This struct should not be defined directly but with the macros. + */ +struct hwprobe { + const char *name; + void (*probe)(void); + + bool probed; + + /* NULL or NULL-terminated array of strings */ + const char **deps; +}; + +#define DEFINE_HWPROBE(__name, __probe) \ +static const struct hwprobe __used __section(".hwprobes") hwprobe_##__name = { \ + .name = #__name, \ + .probe = __probe, \ + .deps = NULL, \ +} + +#define DEFINE_HWPROBE_DEPS(__name, __probe, ...) \ +static const struct hwprobe __used __section(".hwprobes") hwprobe_##__name = { \ + .name = #__name, \ + .probe = __probe, \ + .deps = (const char *[]){ __VA_ARGS__, NULL}, \ +} + +extern struct hwprobe __hwprobes_start; +extern struct hwprobe __hwprobes_end; + +extern void probe_hardware(void); + #endif /* __SKIBOOT_H */ diff --git a/skiboot.lds.S b/skiboot.lds.S index 5a7f9e316..c8e6e747c 100644 --- a/skiboot.lds.S +++ b/skiboot.lds.S @@ -164,6 +164,12 @@ SECTIONS __platforms_end = .; } + .hwprobes : { + __hwprobes_start = .; + KEEP(*(.hwprobes)) + __hwprobes_end = .; + } + /* Relocations */ . 
= ALIGN(0x10); .dynamic : { -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:53 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:53 +1000 Subject: [Skiboot] [PATCH v2 03/10] hwprobe: convert PHB, NPU subsystems to hwprobe In-Reply-To: <20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-4-npiggin@gmail.com> From: Stewart Smith [npiggin: split out from initial hwprobe patch] Signed-off-by: Stewart Smith --- core/init.c | 13 +------------ hw/npu.c | 2 ++ hw/npu2-common.c | 2 ++ hw/npu3.c | 2 ++ hw/phb3.c | 2 +- hw/phb4.c | 2 ++ 6 files changed, 10 insertions(+), 13 deletions(-) diff --git a/core/init.c b/core/init.c index 61934c9fe..5e2b18d85 100644 --- a/core/init.c +++ b/core/init.c @@ -1361,18 +1361,7 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) /* NX init */ nx_init(); - /* Probe PHB3 on P8 */ - probe_phb3(); - - /* Probe PHB4 on P9 and PHB5 on P10 */ - probe_phb4(); - - /* Probe NPUs */ - probe_npu(); - probe_npu2(); - probe_npu3(); - - /* Probe all HWPROBE hardware we have code linked for*/ + /* Probe all HWPROBE hardware we have code linked for */ probe_hardware(); /* Initialize PCI */ diff --git a/hw/npu.c b/hw/npu.c index dba7ee50f..2b5364c33 100644 --- a/hw/npu.c +++ b/hw/npu.c @@ -1691,3 +1691,5 @@ void probe_npu(void) dt_for_each_compatible(dt_root, np, "ibm,power8-npu-pciex") npu_create_phb(np); } + +DEFINE_HWPROBE_DEPS(npu, probe_npu, "phb3"); diff --git a/hw/npu2-common.c b/hw/npu2-common.c index 3bc9bcee6..87ebf8232 100644 --- a/hw/npu2-common.c +++ b/hw/npu2-common.c @@ -679,3 +679,5 @@ void probe_npu2(void) setup_devices(npu); } } + +DEFINE_HWPROBE_DEPS(npu2, probe_npu2, "phb4"); diff --git a/hw/npu3.c b/hw/npu3.c index 03461373e..92af96b23 100644 --- a/hw/npu3.c +++ b/hw/npu3.c @@ -547,3 +547,5 @@ void probe_npu3(void) npu3_init(npu); } } + +DEFINE_HWPROBE_DEPS(npu3, probe_npu3, "phb4"); diff --git a/hw/phb3.c b/hw/phb3.c index 8af6b6164..320023e57 100644 --- a/hw/phb3.c +++ b/hw/phb3.c @@ -5049,4 +5049,4 @@ void probe_phb3(void) phb3_create(np); } - +DEFINE_HWPROBE(phb3, probe_phb3); diff --git a/hw/phb4.c b/hw/phb4.c index 79083d4a1..ec07fe2bb 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -6398,3 +6398,5 @@ void probe_phb4(void) phb4_create(np); } } + +DEFINE_HWPROBE(phb4, probe_phb4); -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:55 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:55 +1000 Subject: [Skiboot] [PATCH v2 04/10] Add CONFIG_P8 with PHB3 behind it In-Reply-To: <20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-5-npiggin@gmail.com> From: Stewart Smith We can use a base CPU of POWER9 if we don't have P8. 
We can also hide PHB3 code behind this, and shave 12kb off skiboot.lid.xz [npiggin: add cpp define, fail gracefully on P8] Signed-off-by: Stewart Smith --- Makefile | 2 ++ Makefile.main | 15 ++++++++++++++- core/cpu.c | 11 +++++++++-- hw/Makefile.inc | 8 ++++++-- 4 files changed, 31 insertions(+), 5 deletions(-) diff --git a/Makefile b/Makefile index 6e5b91d84..a9807c4dc 100644 --- a/Makefile +++ b/Makefile @@ -65,6 +65,8 @@ ELF_ABI_v2 ?= $(LITTLE_ENDIAN) DEAD_CODE_ELIMINATION ?= 0 # Try to build without FSP code CONFIG_FSP?=1 +# Try to build without POWER8 support +CONFIG_P8?=1 # # Where is the source directory, must be a full path (no ~) diff --git a/Makefile.main b/Makefile.main index c8a63e8b1..2a346a6c9 100644 --- a/Makefile.main +++ b/Makefile.main @@ -96,7 +96,11 @@ CPPFLAGS += -DDEBUG -DCCAN_LIST_DEBUG endif CFLAGS := -fno-strict-aliasing -pie -fpie -fno-pic -m64 -fno-asynchronous-unwind-tables +ifeq ($(CONFIG_P8),1) CFLAGS += -mcpu=power8 +else +CFLAGS += -mcpu=power9 +endif CFLAGS += -Wl,--oformat,elf64-powerpc -ggdb # r13,r14,r15 are preserved for OS to use as fixed registers. # These could be saved and restored in and out of skiboot, but it's more @@ -156,6 +160,10 @@ else CFLAGS += -fno-stack-protector endif +# Add preprocessor defines for CONFIG_ options here +ifeq ($(CONFIG_P8),1) +CFLAGS += -DCONFIG_P8=1 +endif CFLAGS += $(call try-cflag,$(CC),-Wjump-misses-init) \ $(call try-cflag,$(CC),-Wsuggest-attribute=const) \ @@ -173,7 +181,12 @@ LDFLAGS := -m64 -static -nostdlib -pie LDFLAGS += -Wl,-pie LDFLAGS += -Wl,-Ttext-segment,$(LD_TEXT) -Wl,-N -Wl,--build-id=none LDFLAGS += -Wl,--no-multi-toc -LDFLAGS += -mcpu=power8 -Wl,--oformat,elf64-powerpc +ifeq ($(CONFIG_P8),1) +LDFLAGS += -mcpu=power8 +else +LDFLAGS += -mcpu=power9 +endif +LDFLAGS += -Wl,--oformat,elf64-powerpc LDFLAGS_FINAL = -m elf64lppc --no-multi-toc -N --build-id=none --whole-archive LDFLAGS_FINAL += -static -nostdlib -pie -Ttext-segment=$(LD_TEXT) --oformat=elf64-powerpc LDFLAGS_FINAL += --orphan-handling=warn diff --git a/core/cpu.c b/core/cpu.c index 60a9ea1c3..d4d33b836 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -1051,9 +1051,16 @@ void init_boot_cpu(void) cpu_thread_count = 1; } - if (proc_gen == proc_gen_p8 && (PVR_VERS_MAJ(mfspr(SPR_PVR)) == 1)) { - prerror("CPU: POWER8 DD1 is not supported\n"); + if (proc_gen == proc_gen_p8) { +#ifdef CONFIG_P8 + if (PVR_VERS_MAJ(mfspr(SPR_PVR)) == 1) { + prerror("CPU: POWER8 DD1 is not supported\n"); + abort(); + } +#else + prerror("CPU: POWER8 detected but CONFIG_P8 not set\n"); abort(); +#endif } if (is_power9n(pvr) && (PVR_VERS_MAJ(pvr) == 1)) { diff --git a/hw/Makefile.inc b/hw/Makefile.inc index 37256d3cc..d436da222 100644 --- a/hw/Makefile.inc +++ b/hw/Makefile.inc @@ -3,13 +3,17 @@ SUBDIRS += hw HW_OBJS = xscom.o chiptod.o lpc.o lpc-uart.o psi.o HW_OBJS += homer.o slw.o occ.o fsi-master.o centaur.o imc.o HW_OBJS += nx.o nx-rng.o nx-crypto.o nx-compress.o nx-842.o nx-gzip.o -HW_OBJS += phb3.o sfc-ctrl.o fake-rtc.o bt.o p8-i2c.o prd.o -HW_OBJS += dts.o lpc-rtc.o npu.o npu-hw-procedures.o xive.o phb4.o +HW_OBJS += sfc-ctrl.o fake-rtc.o bt.o p8-i2c.o prd.o +HW_OBJS += dts.o lpc-rtc.o xive.o phb4.o HW_OBJS += fake-nvram.o lpc-mbox.o npu2.o npu2-hw-procedures.o HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o HW_OBJS += ocmb.o xive2.o +HW_OBJS += npu.o npu-hw-procedures.o +ifeq ($(CONFIG_P8),1) +HW_OBJS += phb3.o 
+endif HW=hw/built-in.a include $(SRC)/hw/fsp/Makefile.inc -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:55 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:55 +1000 Subject: [Skiboot] [PATCH v2 05/10] hw/slw: Move P8 bits behind CONFIG_P8 In-Reply-To: <20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-6-npiggin@gmail.com> This saves about 3kB from skiboot.lid.xz Signed-off-by: Nicholas Piggin --- core/fast-reboot.c | 2 + hw/slw.c | 176 ++++++++++++++++++++++--------------------- libpore/Makefile.inc | 8 +- 3 files changed, 100 insertions(+), 86 deletions(-) diff --git a/core/fast-reboot.c b/core/fast-reboot.c index 9f92525a9..2696348af 100644 --- a/core/fast-reboot.c +++ b/core/fast-reboot.c @@ -272,6 +272,7 @@ static void cleanup_cpu_state(void) /* XXX Update the SLW copies ! Also dbl check HIDs etc... */ init_shared_sprs(); +#ifdef CONFIG_P8 if (proc_gen == proc_gen_p8) { /* If somebody was in fast_sleep, we may have a * workaround to undo @@ -287,6 +288,7 @@ static void cleanup_cpu_state(void) */ cleanup_local_tlb(); } +#endif /* And we might have lost TB sync */ chiptod_wakeup_resync(); diff --git a/hw/slw.c b/hw/slw.c index 178ee4f85..cf633d2ad 100644 --- a/hw/slw.c +++ b/hw/slw.c @@ -32,19 +32,20 @@ enum wakeup_engine_states wakeup_engine_state = WAKEUP_ENGINE_NOT_PRESENT; bool has_deep_states = false; -DEFINE_LOG_ENTRY(OPAL_RC_SLW_INIT, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, - OPAL_PLATFORM_FIRMWARE, OPAL_PREDICTIVE_ERR_GENERAL, - OPAL_NA); - DEFINE_LOG_ENTRY(OPAL_RC_SLW_SET, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, OPAL_PLATFORM_FIRMWARE, OPAL_INFO, OPAL_NA); -DEFINE_LOG_ENTRY(OPAL_RC_SLW_GET, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, +DEFINE_LOG_ENTRY(OPAL_RC_SLW_REG, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, OPAL_PLATFORM_FIRMWARE, OPAL_INFO, OPAL_NA); -DEFINE_LOG_ENTRY(OPAL_RC_SLW_REG, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, +#ifdef CONFIG_P8 +DEFINE_LOG_ENTRY(OPAL_RC_SLW_INIT, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, + OPAL_PLATFORM_FIRMWARE, OPAL_PREDICTIVE_ERR_GENERAL, + OPAL_NA); + +DEFINE_LOG_ENTRY(OPAL_RC_SLW_GET, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, OPAL_PLATFORM_FIRMWARE, OPAL_INFO, OPAL_NA); @@ -98,59 +99,6 @@ static bool slw_set_overrides(struct proc_chip *chip, struct cpu_thread *c) return true; } -static bool slw_set_overrides_p10(struct proc_chip *chip, struct cpu_thread *c) -{ - uint64_t tmp; - int rc; - uint32_t core = pir_to_core_id(c->pir); - - /* Special wakeup bits that could hold power mgt */ - rc = xscom_read(chip->id, - XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP), - &tmp); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_SET), - "SLW: Failed to read P10_QME_SPWU_HYP\n"); - return false; - } - if (tmp & P10_SPWU_REQ) - prlog(PR_WARNING, - "SLW: core %d P10_QME_SPWU_HYP requested 0x%016llx\n", - core, tmp); - - return true; -} - - -static bool slw_set_overrides_p9(struct proc_chip *chip, struct cpu_thread *c) -{ - uint64_t tmp; - int rc; - uint32_t core = pir_to_core_id(c->pir); - - /* Special wakeup bits that could hold power mgt */ - rc = xscom_read(chip->id, - XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_HYP), - &tmp); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_SET), - "SLW: Failed to read EC_PPM_SPECIAL_WKUP_HYP\n"); - return false; - } - if (tmp) - prlog(PR_WARNING, - "SLW: core %d EC_PPM_SPECIAL_WKUP_HYP read 0x%016llx\n", - core, tmp); - rc = xscom_read(chip->id, - XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_OTR), - &tmp); - if (tmp) - prlog(PR_WARNING, - "SLW: 
core %d EC_PPM_SPECIAL_WKUP_OTR read 0x%016llx\n", - core, tmp); - return true; -} - static bool slw_set_idle_mode(struct proc_chip *chip, struct cpu_thread *c) { uint32_t core = pir_to_core_id(c->pir); @@ -242,6 +190,60 @@ static bool idle_prepare_core(struct proc_chip *chip, struct cpu_thread *c) return true; } +#endif + +static bool slw_set_overrides_p10(struct proc_chip *chip, struct cpu_thread *c) +{ + uint64_t tmp; + int rc; + uint32_t core = pir_to_core_id(c->pir); + + /* Special wakeup bits that could hold power mgt */ + rc = xscom_read(chip->id, + XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP), + &tmp); + if (rc) { + log_simple_error(&e_info(OPAL_RC_SLW_SET), + "SLW: Failed to read P10_QME_SPWU_HYP\n"); + return false; + } + if (tmp & P10_SPWU_REQ) + prlog(PR_WARNING, + "SLW: core %d P10_QME_SPWU_HYP requested 0x%016llx\n", + core, tmp); + + return true; +} + + +static bool slw_set_overrides_p9(struct proc_chip *chip, struct cpu_thread *c) +{ + uint64_t tmp; + int rc; + uint32_t core = pir_to_core_id(c->pir); + + /* Special wakeup bits that could hold power mgt */ + rc = xscom_read(chip->id, + XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_HYP), + &tmp); + if (rc) { + log_simple_error(&e_info(OPAL_RC_SLW_SET), + "SLW: Failed to read EC_PPM_SPECIAL_WKUP_HYP\n"); + return false; + } + if (tmp) + prlog(PR_WARNING, + "SLW: core %d EC_PPM_SPECIAL_WKUP_HYP read 0x%016llx\n", + core, tmp); + rc = xscom_read(chip->id, + XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_OTR), + &tmp); + if (tmp) + prlog(PR_WARNING, + "SLW: core %d EC_PPM_SPECIAL_WKUP_OTR read 0x%016llx\n", + core, tmp); + return true; +} /* Define device-tree fields */ #define MAX_NAME_LEN 16 @@ -1069,31 +1071,6 @@ void add_cpu_idle_state_properties(void) free(pm_ctrl_reg_mask_buf); } -static void slw_patch_regs(struct proc_chip *chip) -{ - struct cpu_thread *c; - void *image = (void *)chip->slw_base; - int rc; - - for_each_available_cpu(c) { - if (c->chip_id != chip->id) - continue; - - /* Clear HRMOR */ - rc = p8_pore_gen_cpureg_fixed(image, P8_SLW_MODEBUILD_SRAM, - P8_SPR_HRMOR, 0, - cpu_get_core_index(c), - cpu_get_thread_index(c)); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_REG), - "SLW: Failed to set HRMOR for CPU %x\n", - c->pir); - } - - /* XXX Add HIDs etc... */ - } -} - static void slw_init_chip_p9(struct proc_chip *chip) { struct cpu_thread *c; @@ -1135,6 +1112,32 @@ static bool slw_image_check_p9(struct proc_chip *chip) } +#ifdef CONFIG_P8 +static void slw_patch_regs(struct proc_chip *chip) +{ + struct cpu_thread *c; + void *image = (void *)chip->slw_base; + int rc; + + for_each_available_cpu(c) { + if (c->chip_id != chip->id) + continue; + + /* Clear HRMOR */ + rc = p8_pore_gen_cpureg_fixed(image, P8_SLW_MODEBUILD_SRAM, + P8_SPR_HRMOR, 0, + cpu_get_core_index(c), + cpu_get_thread_index(c)); + if (rc) { + log_simple_error(&e_info(OPAL_RC_SLW_REG), + "SLW: Failed to set HRMOR for CPU %x\n", + c->pir); + } + + /* XXX Add HIDs etc... 
*/ + } +} + static bool slw_image_check_p8(struct proc_chip *chip) { int64_t rc; @@ -1284,6 +1287,7 @@ static int64_t opal_config_cpu_idle_state(uint64_t state, uint64_t enter) } opal_call(OPAL_CONFIG_CPU_IDLE_STATE, opal_config_cpu_idle_state, 2); +#endif int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val) { @@ -1324,6 +1328,7 @@ int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val) sprn, val, cpu_pir); } +#ifdef CONFIG_P8 } else if (proc_gen == proc_gen_p8) { int spr_is_supported = 0; void *image; @@ -1347,6 +1352,7 @@ int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val) sprn, val, cpu_get_core_index(c), cpu_get_thread_index(c)); +#endif } else { log_simple_error(&e_info(OPAL_RC_SLW_REG), "SLW: proc_gen not supported\n"); @@ -1378,6 +1384,7 @@ void slw_init(void) return; } if (proc_gen == proc_gen_p8) { +#ifdef CONFIG_P8 for_each_chip(chip) { slw_init_chip_p8(chip); if(slw_image_check_p8(chip)) @@ -1386,6 +1393,7 @@ void slw_init(void) slw_late_init_p8(chip); } p8_sbe_init_timer(); +#endif } else if (proc_gen == proc_gen_p9) { for_each_chip(chip) { slw_init_chip_p9(chip); diff --git a/libpore/Makefile.inc b/libpore/Makefile.inc index 06d9c8902..a60674856 100644 --- a/libpore/Makefile.inc +++ b/libpore/Makefile.inc @@ -1,5 +1,9 @@ -LIBPORE_SRCS = p8_pore_table_gen_api_fixed.C p9_stop_api.C p9_stop_util.C p10_stop_api.C p10_stop_util.C -LIBPORE_SRCS += p8_pore_table_static_data.c sbe_xip_image.c pore_inline_assembler.c +LIBPORE_SRCS = p9_stop_api.C p9_stop_util.C p10_stop_api.C p10_stop_util.C +LIBPORE_SRCS += sbe_xip_image.c pore_inline_assembler.c +ifeq ($(CONFIG_P8),1) +LIBPORE_SRCS += p8_pore_table_gen_api_fixed.C p8_pore_table_static_data.c +endif + LIBPORE_OBJS_1 = $(LIBPORE_SRCS:%.c=%.o) LIBPORE_OBJS = $(LIBPORE_OBJS_1:%.C=%.o) SUBDIRS += libpore -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:56 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:56 +1000 Subject: [Skiboot] [PATCH v2 06/10] hwprobe: convert vas_init(), nx_init() In-Reply-To: <20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-7-npiggin@gmail.com> From: Stewart Smith [npiggin: remove imc_init because it moved later in boot (fbcbd4e47c)] Signed-off-by: Stewart Smith --- core/init.c | 6 ------ hw/nx.c | 2 ++ hw/vas.c | 2 ++ 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/core/init.c b/core/init.c index 5e2b18d85..0ec5d6ac3 100644 --- a/core/init.c +++ b/core/init.c @@ -1355,12 +1355,6 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) /* Catalog decompression routine */ imc_decompress_catalog(); - /* Virtual Accelerator Switchboard */ - vas_init(); - - /* NX init */ - nx_init(); - /* Probe all HWPROBE hardware we have code linked for */ probe_hardware(); diff --git a/hw/nx.c b/hw/nx.c index fdadf53c7..b1cab5774 100644 --- a/hw/nx.c +++ b/hw/nx.c @@ -136,3 +136,5 @@ void nx_init(void) if (proc_gen >= proc_gen_p9) darn_init(); } + +DEFINE_HWPROBE_DEPS(nx, nx_init, "vas"); diff --git a/hw/vas.c b/hw/vas.c index 0dbe0bcda..96ca055cc 100644 --- a/hw/vas.c +++ b/hw/vas.c @@ -637,3 +637,5 @@ out: vas_err("Disabled (failed initialization)\n"); return; } + +DEFINE_HWPROBE(vas, vas_init); -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:57 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:57 +1000 Subject: [Skiboot] [PATCH v2 07/10] npu: move npu_set_fence_state() to phb_ops In-Reply-To: 
<20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-8-npiggin@gmail.com> From: Stewart Smith This lets us consider not building in npu.o Signed-off-by: Stewart Smith --- core/hmi.c | 2 +- hw/npu.c | 7 +++++-- include/npu.h | 1 - include/pci.h | 3 +++ 4 files changed, 9 insertions(+), 4 deletions(-) diff --git a/core/hmi.c b/core/hmi.c index 9363cc5fb..55eaa59c6 100644 --- a/core/hmi.c +++ b/core/hmi.c @@ -924,7 +924,7 @@ static void find_npu_checkstop_reason(int flat_chip_id, npu_fir_action0, npu_fir_action1); /* Set the NPU to fenced since it can't recover. */ - npu_set_fence_state(p, true); + phb->ops->set_fence_state(phb, true); /* Set up the HMI event */ hmi_evt->severity = OpalHMI_SEV_WARNING; diff --git a/hw/npu.c b/hw/npu.c index 2b5364c33..6992e7e72 100644 --- a/hw/npu.c +++ b/hw/npu.c @@ -925,7 +925,9 @@ static int64_t npu_eeh_next_error(struct phb *phb, } /* For use in error injection and handling. */ -void npu_set_fence_state(struct npu *p, bool fence) { +static void npu_set_fence_state(struct phb *phb, bool fence) { + struct npu *p = phb_to_npu(phb); + p->fenced = fence; if (fence) @@ -968,7 +970,7 @@ static int64_t npu_err_inject(struct phb *phb, uint64_t pe_number, return OPAL_PARAMETER; } else if (type == 1) { /* Emulate fence mode. */ - npu_set_fence_state(p, true); + npu_set_fence_state(phb, true); } else { /* Cause a freeze with an invalid MMIO read. If the BAR is not * enabled, this will checkstop the machine. @@ -1012,6 +1014,7 @@ static const struct phb_ops npu_ops = { .get_diag_data2 = NULL, .set_capi_mode = NULL, .set_capp_recovery = NULL, + .set_fence_state = npu_set_fence_state, }; static void assign_mmio_bars(uint32_t gcid, uint32_t xscom, diff --git a/include/npu.h b/include/npu.h index 50cc9c9fc..45818a28f 100644 --- a/include/npu.h +++ b/include/npu.h @@ -153,7 +153,6 @@ int64_t npu_dev_procedure(void *dev, struct pci_cfg_reg_filter *pcrf, uint32_t offset, uint32_t len, uint32_t *data, bool write); -void npu_set_fence_state(struct npu *p, bool fence); void npu_dev_procedure_reset(struct npu_dev *dev); #define NPUDBG(p, fmt, a...) prlog(PR_DEBUG, "NPU%d: " fmt, \ diff --git a/include/pci.h b/include/pci.h index eb23a6d9b..05d02171b 100644 --- a/include/pci.h +++ b/include/pci.h @@ -340,6 +340,9 @@ struct phb_ops { /* Get/set PBCQ Tunnel BAR register */ void (*get_tunnel_bar)(struct phb *phb, uint64_t *addr); int64_t (*set_tunnel_bar)(struct phb *phb, uint64_t addr); + + /* Currently only used by NPU HMI code */ + void (*set_fence_state)(struct phb *phb, bool fence); }; enum phb_type { -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:58 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:58 +1000 Subject: [Skiboot] [PATCH v2 08/10] npu: Move npu.o and npu-hw-procedures.o under CONFIG_P8 In-Reply-To: <20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-9-npiggin@gmail.com> From: Stewart Smith This saves an extra 6kb of skiboot.lid.xz. 
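The saving is possible because the previous patch routed core/hmi.c's only call into npu.o through the phb_ops table, so nothing outside hw/npu.c references its symbols at link time any more. A minimal sketch of that indirection, with hypothetical cut-down stand-ins for skiboot's phb structures (the NULL check is an extra precaution added here for illustration; the patch itself can omit it because find_npu_checkstop_reason() only runs for NPU PHBs):

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical, cut-down stand-ins for skiboot's phb structures */
    struct phb;
    struct phb_ops {
            void (*set_fence_state)(struct phb *phb, bool fence);
    };
    struct phb {
            const struct phb_ops *ops;
    };

    /* Generic caller, e.g. the HMI path: no link-time dependency on npu.o.
     * Only the NPU backend fills in .set_fence_state in its ops table. */
    static void fence_phb(struct phb *phb)
    {
            if (phb->ops->set_fence_state != NULL)
                    phb->ops->set_fence_state(phb, true);
    }
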
Signed-off-by: Stewart Smith --- hw/Makefile.inc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/Makefile.inc b/hw/Makefile.inc index d436da222..ff207b166 100644 --- a/hw/Makefile.inc +++ b/hw/Makefile.inc @@ -10,9 +10,9 @@ HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o HW_OBJS += ocmb.o xive2.o -HW_OBJS += npu.o npu-hw-procedures.o ifeq ($(CONFIG_P8),1) HW_OBJS += phb3.o +HW_OBJS += npu.o npu-hw-procedures.o endif HW=hw/built-in.a -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:20:59 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:20:59 +1000 Subject: [Skiboot] [PATCH v2 09/10] platforms: put P8 platforms behind CONFIG_P8 In-Reply-To: <20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-10-npiggin@gmail.com> From: Stewart Smith Shaves an additional 4kb off skiboot.lid.xz. Signed-off-by: Stewart Smith --- platforms/astbmc/Makefile.inc | 12 ++++++++---- platforms/ibm-fsp/Makefile.inc | 7 ++++++- 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/platforms/astbmc/Makefile.inc b/platforms/astbmc/Makefile.inc index 070813231..1cdf37f2a 100644 --- a/platforms/astbmc/Makefile.inc +++ b/platforms/astbmc/Makefile.inc @@ -1,13 +1,17 @@ SUBDIRS += $(PLATDIR)/astbmc ASTBMC_OBJS = pnor.o common.o slots.o \ - palmetto.o habanero.o firestone.o \ - p8dtu.o p8dnu.o \ - garrison.o barreleye.o \ witherspoon.o zaius.o romulus.o p9dsu.o \ - vesnin.o nicole.o mihawk.o mowgli.o \ + nicole.o mihawk.o mowgli.o \ talos.o blackbird.o \ swift.o rainier.o +ifeq ($(CONFIG_P8),1) +ASTBMC_OBJS += palmetto.o habanero.o firestone.o \ + p8dtu.o p8dnu.o \ + garrison.o barreleye.o \ + vesnin.o +endif + ASTBMC = $(PLATDIR)/astbmc/built-in.a $(ASTBMC): $(ASTBMC_OBJS:%=$(PLATDIR)/astbmc/%) diff --git a/platforms/ibm-fsp/Makefile.inc b/platforms/ibm-fsp/Makefile.inc index 8883f09c1..fd80a79a9 100644 --- a/platforms/ibm-fsp/Makefile.inc +++ b/platforms/ibm-fsp/Makefile.inc @@ -1,7 +1,12 @@ SUBDIRS += $(PLATDIR)/ibm-fsp IBM_FSP_OBJS = common.o lxvpd.o hostservices.o fsp-vpd.o \ - firenze.o firenze-pci.o zz.o + firenze-pci.o zz.o + +ifeq ($(CONFIG_P8),1) +IBM_FSP_OBJS += firenze.o +endif + IBM_FSP = $(PLATDIR)/ibm-fsp/built-in.a ifeq ($(CONFIG_FSP),1) -- 2.23.0 From npiggin at gmail.com Sat Aug 7 14:21:00 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Sat, 7 Aug 2021 14:21:00 +1000 Subject: [Skiboot] [PATCH v2 10/10] npu: Add CONFIG_NPU to optionally skip NPU code In-Reply-To: <20210807042100.399449-1-npiggin@gmail.com> References: <20210807042100.399449-1-npiggin@gmail.com> Message-ID: <20210807042100.399449-11-npiggin@gmail.com> From: Stewart Smith Saves a whopping 39kb of skiboot.lid.xz. 
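The include/npu2.h hunk below shows the idiom that keeps callers compiling when the subsystem is configured out: declare the real function under the CONFIG switch and provide an empty static inline otherwise, so the call sites need no #ifdefs and the compiler discards the call entirely. As a generic sketch (CONFIG_FOO and foo_presence_detect() are made-up names standing in for CONFIG_NPU and npu2_i2c_presence_detect()):

    /* foo.h -- hypothetical header for a subsystem behind CONFIG_FOO */
    #ifdef CONFIG_FOO
    void foo_presence_detect(void);                 /* real version lives in foo.c */
    #else
    static inline void foo_presence_detect(void) { }  /* compiles away to nothing */
    #endif
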
Signed-off-by: Stewart Smith --- Makefile | 2 ++ Makefile.main | 4 ++++ core/hmi.c | 10 +++++++++- core/platform.c | 1 - hw/Makefile.inc | 12 +++++++++--- hw/npu2.c | 1 + include/npu2.h | 6 ++++++ include/pci.h | 3 +++ platforms/astbmc/Makefile.inc | 15 +++++++++++---- 9 files changed, 45 insertions(+), 9 deletions(-) diff --git a/Makefile b/Makefile index a9807c4dc..115c97fcd 100644 --- a/Makefile +++ b/Makefile @@ -67,6 +67,8 @@ DEAD_CODE_ELIMINATION ?= 0 CONFIG_FSP?=1 # Try to build without POWER8 support CONFIG_P8?=1 +# Try and build without any NPU support +CONFIG_NPU?=1 # # Where is the source directory, must be a full path (no ~) diff --git a/Makefile.main b/Makefile.main index 2a346a6c9..dce0338da 100644 --- a/Makefile.main +++ b/Makefile.main @@ -165,6 +165,10 @@ ifeq ($(CONFIG_P8),1) CFLAGS += -DCONFIG_P8=1 endif +ifeq ($(CONFIG_NPU),1) +CFLAGS += -DCONFIG_NPU=1 +endif + CFLAGS += $(call try-cflag,$(CC),-Wjump-misses-init) \ $(call try-cflag,$(CC),-Wsuggest-attribute=const) \ $(call try-cflag,$(CC),-Wsuggest-attribute=noreturn) \ diff --git a/core/hmi.c b/core/hmi.c index 55eaa59c6..279f8b8cf 100644 --- a/core/hmi.c +++ b/core/hmi.c @@ -717,6 +717,7 @@ static void find_nx_checkstop_reason(int flat_chip_id, queue_hmi_event(hmi_evt, 0, out_flags); } +#ifdef CONFIG_NPU static bool phb_is_npu2(struct dt_node *dn) { return (dt_node_is_compatible(dn, "ibm,power9-npu-pciex") || @@ -847,7 +848,7 @@ static void find_npu2_checkstop_reason(int flat_chip_id, npu2_hmi_verbose = true; if (npu2_hmi_verbose) { - npu2_dump_scoms(flat_chip_id); + phb->ops->dump_debug_data(flat_chip_id); prlog(PR_ERR, " _________________________ \n"); prlog(PR_ERR, "< It's Debug time! >\n"); prlog(PR_ERR, " ------------------------- \n"); @@ -935,6 +936,13 @@ static void find_npu_checkstop_reason(int flat_chip_id, /* The HMI is "recoverable" because it shouldn't crash the system */ queue_hmi_event(hmi_evt, 1, out_flags); } +#else +static void find_npu_checkstop_reason(int flat_chip_id __unused, + struct OpalHMIEvent *hmi_evt __unused, + uint64_t *out_flags __unused) +{ +} +#endif static void decode_malfunction(struct OpalHMIEvent *hmi_evt, uint64_t *out_flags) { diff --git a/core/platform.c b/core/platform.c index 320fdea03..3f4c8bdd5 100644 --- a/core/platform.c +++ b/core/platform.c @@ -226,7 +226,6 @@ static struct platform generic_platform = { .start_preload_resource = generic_start_preload_resource, .resource_loaded = generic_resource_loaded, .ocapi = &generic_ocapi, - .npu2_device_detect = npu2_i2c_presence_detect, /* Assumes ZZ */ }; const struct bmc_platform *bmc_platform = &generic_bmc; diff --git a/hw/Makefile.inc b/hw/Makefile.inc index ff207b166..627b1a022 100644 --- a/hw/Makefile.inc +++ b/hw/Makefile.inc @@ -5,15 +5,21 @@ HW_OBJS += homer.o slw.o occ.o fsi-master.o centaur.o imc.o HW_OBJS += nx.o nx-rng.o nx-crypto.o nx-compress.o nx-842.o nx-gzip.o HW_OBJS += sfc-ctrl.o fake-rtc.o bt.o p8-i2c.o prd.o HW_OBJS += dts.o lpc-rtc.o xive.o phb4.o -HW_OBJS += fake-nvram.o lpc-mbox.o npu2.o npu2-hw-procedures.o -HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o -HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o +HW_OBJS += fake-nvram.o lpc-mbox.o +ifeq ($(CONFIG_NPU),1) +HW_OBJS += npu2.o npu2-hw-procedures.o +HW_OBJS += npu2-common.o npu2-opencapi.o HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o +endif +HW_OBJS += phys-map.o sbe-p9.o capp.o +HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o HW_OBJS += ocmb.o xive2.o ifeq 
($(CONFIG_P8),1) HW_OBJS += phb3.o +ifeq ($(CONFIG_NPU),1) HW_OBJS += npu.o npu-hw-procedures.o endif +endif HW=hw/built-in.a include $(SRC)/hw/fsp/Makefile.inc diff --git a/hw/npu2.c b/hw/npu2.c index cf57eeb0c..e18a1b7b1 100644 --- a/hw/npu2.c +++ b/hw/npu2.c @@ -1316,6 +1316,7 @@ static const struct phb_ops npu_ops = { .set_capi_mode = NULL, .set_capp_recovery = NULL, .tce_kill = npu2_tce_kill, + .dump_debug_data = npu2_dump_scoms, }; static void assign_mmio_bars(uint64_t gcid, uint32_t scom, uint64_t reg[2], uint64_t mm_win[2]) diff --git a/include/npu2.h b/include/npu2.h index eb7c45587..6ab33c702 100644 --- a/include/npu2.h +++ b/include/npu2.h @@ -212,7 +212,13 @@ static inline struct phb *npu2_dev_to_phb(struct npu2_dev *ndev) } } +#ifdef CONFIG_NPU void npu2_i2c_presence_detect(struct npu2 *npu); +#else +static inline void npu2_i2c_presence_detect(struct npu2 *npu __unused) +{ +} +#endif int npu2_opencapi_init_npu(struct npu2 *npu); int npu2_nvlink_init_npu(struct npu2 *npu); void npu2_nvlink_create_phb(struct npu2 *npu, struct dt_node *dn); diff --git a/include/pci.h b/include/pci.h index 05d02171b..c70a507dc 100644 --- a/include/pci.h +++ b/include/pci.h @@ -343,6 +343,9 @@ struct phb_ops { /* Currently only used by NPU HMI code */ void (*set_fence_state)(struct phb *phb, bool fence); + + /* The most terrible of situations, dump debug data to console. */ + void (*dump_debug_data)(int flat_chip_id); }; enum phb_type { diff --git a/platforms/astbmc/Makefile.inc b/platforms/astbmc/Makefile.inc index 1cdf37f2a..be2267d3f 100644 --- a/platforms/astbmc/Makefile.inc +++ b/platforms/astbmc/Makefile.inc @@ -1,16 +1,23 @@ SUBDIRS += $(PLATDIR)/astbmc ASTBMC_OBJS = pnor.o common.o slots.o \ - witherspoon.o zaius.o romulus.o p9dsu.o \ - nicole.o mihawk.o mowgli.o \ + witherspoon.o romulus.o p9dsu.o \ + nicole.o mowgli.o \ talos.o blackbird.o \ - swift.o rainier.o + rainier.o + +ifeq ($(CONFIG_NPU),1) +ASTBMC_OBJS += zaius.o mihawk.o swift.o +endif ifeq ($(CONFIG_P8),1) ASTBMC_OBJS += palmetto.o habanero.o firestone.o \ p8dtu.o p8dnu.o \ - garrison.o barreleye.o \ + barreleye.o \ vesnin.o +ifeq ($(CONFIG_NPU),1) +ASTBMC_OBJS += garrison.o +endif endif ASTBMC = $(PLATDIR)/astbmc/built-in.a -- 2.23.0 From clg at kaod.org Sat Aug 7 17:38:21 2021 From: clg at kaod.org (=?UTF-8?q?C=C3=A9dric=20Le=20Goater?=) Date: Sat, 7 Aug 2021 09:38:21 +0200 Subject: [Skiboot] [PATCH 3/3] interrupts: Do not advertise XICS support on P10 In-Reply-To: <20210807073821.192901-1-clg@kaod.org> References: <20210807073821.192901-1-clg@kaod.org> Message-ID: <20210807073821.192901-3-clg@kaod.org> We only support the XIVE interface. 
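Note that existing consumers of get_ics_phandle() need no change; the P10 special case is hidden inside the helper, which now forwards to the new xive2_get_phandle() export. A hypothetical caller, for illustration only:

    /* Works on P9 (XICS node present) and on P10 (XIVE-only), since
     * get_ics_phandle() picks the right interrupt controller internally. */
    dt_add_property_cells(np, "interrupt-parent", get_ics_phandle());
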
Signed-off-by: Cédric Le Goater --- include/xive.h | 1 + core/interrupts.c | 12 +++++++++++- hw/xive2.c | 5 +++++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/include/xive.h b/include/xive.h index 1a8a2e02714b..0c8041efc07e 100644 --- a/include/xive.h +++ b/include/xive.h @@ -79,6 +79,7 @@ bool xive2_cap_phb_pq_disable(void); bool xive2_cap_phb_abt(void); bool xive2_cap_store_eoi(void); int64_t xive2_reset(void); +uint32_t xive2_get_phandle(void); uint32_t xive2_alloc_hw_irqs(uint32_t chip_id, uint32_t count, uint32_t align); uint32_t xive2_alloc_ipi_irqs(uint32_t chip_id, uint32_t count, uint32_t align); diff --git a/core/interrupts.c b/core/interrupts.c index 0a617d385aee..5d2d04db589e 100644 --- a/core/interrupts.c +++ b/core/interrupts.c @@ -18,6 +18,7 @@ #include #include #include +#include /* ICP registers */ #define ICP_XIRR 0x4 /* 32-bit access */ @@ -157,9 +158,14 @@ uint32_t get_psi_interrupt(uint32_t chip_id) struct dt_node *add_ics_node(void) { - struct dt_node *ics = dt_new_addr(dt_root, "interrupt-controller", 0); + struct dt_node *ics; bool has_xive; + bool has_xive_only = proc_gen >= proc_gen_p10; + if (has_xive_only) + return NULL; + + ics = dt_new_addr(dt_root, "interrupt-controller", 0); if (!ics) return NULL; @@ -181,6 +187,10 @@ struct dt_node *add_ics_node(void) uint32_t get_ics_phandle(void) { struct dt_node *i; + bool has_xive_only = proc_gen >= proc_gen_p10; + + if (has_xive_only) + return xive2_get_phandle(); for (i = dt_first(dt_root); i; i = dt_next(dt_root, i)) { if (streq(i->name, "interrupt-controller@0")) { diff --git a/hw/xive2.c b/hw/xive2.c index d0094e9bad99..810ab91d8e0b 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -1975,6 +1975,11 @@ static void xive_create_mmio_dt_node(struct xive *x) } +uint32_t xive2_get_phandle(void) +{ + return xive_dt_node->phandle; +} + static void xive_setup_forward_ports(struct xive *x, struct proc_chip *remote_chip) { struct xive *remote_xive = remote_chip->xive; -- 2.31.1 From clg at kaod.org Sat Aug 7 17:38:20 2021 From: clg at kaod.org (=?UTF-8?q?C=C3=A9dric=20Le=20Goater?=) Date: Sat, 7 Aug 2021 09:38:20 +0200 Subject: [Skiboot] [PATCH 2/3] xive/p10: Fix mismatch errors when DEBUG=1 In-Reply-To: <20210807073821.192901-1-clg@kaod.org> References: <20210807073821.192901-1-clg@kaod.org> Message-ID: <20210807073821.192901-2-clg@kaod.org> HW has some reserved fields which break the comparison when checking END cache updates. 
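The fix follows the usual pattern for comparing a software image against a hardware-owned structure: take a copy of the entry read back from the cache, clear the bits the hardware owns, and only then memcmp. The core of the hw/xive2.c hunk below, annotated (report_mismatch() is a made-up stand-in for the xive_err() dump in the real code):

    struct xive_end end2 = *end_p;  /* END entry read back from the cache */

    /* Reserved fields are hardware-owned; mask them out so they cannot
     * produce false mismatches against the software image. */
    end2.w0 &= ~END_W0_RESERVED;
    end2.w1 &= ~END_W1_RESERVED;
    end2.w7 &= ~END_W7_F0_RESERVED;

    if (memcmp(end, &end2, sizeof(struct xive_end)) != 0)
            report_mismatch();
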
Signed-off-by: Cédric Le Goater --- include/xive2-regs.h | 3 +++ hw/xive2.c | 3 +++ 2 files changed, 6 insertions(+) diff --git a/include/xive2-regs.h b/include/xive2-regs.h index 1f7a3e721b64..367c1ea96308 100644 --- a/include/xive2-regs.h +++ b/include/xive2-regs.h @@ -479,6 +479,7 @@ struct xive_end { #define END_W0_ESCALATE_END PPC_BIT32(13) /* "N" bit */ #define END_W0_FIRMWARE1 PPC_BIT32(16) /* Owned by FW */ #define END_W0_FIRMWARE2 PPC_BIT32(17) /* Owned by FW */ +#define END_W0_RESERVED PPC_BITMASK32(24,31) beint32_t w1; #define END_W1_ES PPC_BITMASK32(0,3) #define END_W1_ESn PPC_BITMASK32(0,1) #define END_W1_ESn_P PPC_BIT32(0) #define END_W1_ESe PPC_BITMASK32(2,3) #define END_W1_ESe_P PPC_BIT32(2) #define END_W1_ESe_Q PPC_BIT32(3) +#define END_W1_RESERVED PPC_BITMASK32(6,7) #define END_W1_GEN_FLIPPED PPC_BIT32(8) #define END_W1_GENERATION PPC_BIT32(9) #define END_W1_PAGE_OFF PPC_BITMASK32(10,31) @@ -511,6 +513,7 @@ struct xive_end { beint32_t w7; #define END_W7_TOPO PPC_BITMASK32(0,3) /* Owned by HW */ #define END_W7_F0_PRIORITY PPC_BITMASK32(8,15) +#define END_W7_F0_RESERVED PPC_BITMASK32(16,31) #define END_W7_F1_LOG_SERVER_ID PPC_BITMASK32(4,31) }; #define xive_end_is_firmware1(end) \ diff --git a/hw/xive2.c b/hw/xive2.c index c09dd555f4ef..d0094e9bad99 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -1166,6 +1166,9 @@ static bool xive_check_endc_update(struct xive *x, uint32_t idx, struct xive_end assert(end_p); end2 = *end_p; + end2.w0 &= ~END_W0_RESERVED; + end2.w1 &= ~END_W1_RESERVED; + end2.w7 &= ~END_W7_F0_RESERVED; if (memcmp(end, &end2, sizeof(struct xive_end)) != 0) { xive_err(x, "END update mismatch idx %d\n", idx); xive_err(x, "want: %08x %08x %08x %08x\n", -- 2.31.1 From clg at kaod.org Sat Aug 7 17:38:19 2021 From: clg at kaod.org (=?UTF-8?q?C=C3=A9dric=20Le=20Goater?=) Date: Sat, 7 Aug 2021 09:38:19 +0200 Subject: [Skiboot] [PATCH 1/3] xive/p10: Fix xive_special_cache_check when DEBUG=1 Message-ID: <20210807073821.192901-1-clg@kaod.org> The special cache check done when skiboot is compiled with DEBUG is incompatible with Automatic Context Save and Restore. Random data is written in the NVP to check that cache updates are correct but this can lead to a checkstop raised by the XIVE interrupt controller. When the NVP Valid (0) bit, the hardware controlled H (7) bit, and the Checked Out bit (45) are all ones at the same time, the HW thinks that the NVP entry is checked out by a thread and does not allow the cache write to occur. Make sure that the valid bit is not set on the NVP. 
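In short, the random fill must never make the NVP look simultaneously valid and checked out while the cache write is in flight. The two key lines of the fix, annotated:

    /* Fill the in-memory NVP with a pseudo-random pattern for the
     * cache-update check... */
    memset(vp_m, (~i) & 0xff, sizeof(*vp_m));
    /* ...but force the Valid bit off, so the V/H/Checked-Out combination
     * that makes the HW refuse the write can never occur. */
    vp_m->w0 = xive_set_field32(NVP_W0_VALID, vp_m->w0, 0);
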
Signed-off-by: Cédric Le Goater --- hw/xive2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/xive2.c b/hw/xive2.c index d5814bcbfd0c..c09dd555f4ef 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -25,7 +25,6 @@ /* Verbose debug */ #undef XIVE_VERBOSE_DEBUG -#undef DEBUG /* Extra debug options used in debug builds */ #ifdef DEBUG @@ -2938,6 +2937,7 @@ static void xive_special_cache_check(struct xive *x, uint32_t blk, uint32_t idx) struct xive_nvp *vp_m = xive_get_vp(x, idx); memset(vp_m, (~i) & 0xff, sizeof(*vp_m)); + vp_m->w0 = xive_set_field32(NVP_W0_VALID, vp_m->w0, 0); sync(); vp.w1 = (i << 16) | i; assert(!xive_nxc_cache_update(x, blk, idx, &vp, true)); -- 2.31.1 From npiggin at gmail.com Wed Aug 11 15:46:52 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:46:52 +1000 Subject: [Skiboot] [PATCH v3 01/10] Remove support for POWER8 DD1 In-Reply-To: <20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-2-npiggin@gmail.com> This significantly simplifies the SLW code. HILE is now always supported. Reviewed-by: Stewart Smith Signed-off-by: Nicholas Piggin --- core/cpu.c | 23 ++-- hw/slw.c | 323 ---------------------------------------------- include/skiboot.h | 5 - 3 files changed, 9 insertions(+), 342 deletions(-) diff --git a/core/cpu.c b/core/cpu.c index f58aeb27a..60a9ea1c3 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -35,7 +35,6 @@ unsigned int cpu_thread_count; unsigned int cpu_max_pir; struct cpu_thread *boot_cpu; static struct lock reinit_lock = LOCK_UNLOCKED; -static bool hile_supported; static bool radix_supported; static unsigned long hid0_hile; static unsigned long hid0_attn; @@ -999,27 +998,23 @@ void init_boot_cpu(void) case PVR_TYPE_P8E: case PVR_TYPE_P8: proc_gen = proc_gen_p8; - hile_supported = PVR_VERS_MAJ(mfspr(SPR_PVR)) >= 2; hid0_hile = SPR_HID0_POWER8_HILE; hid0_attn = SPR_HID0_POWER8_ENABLE_ATTN; break; case PVR_TYPE_P8NVL: proc_gen = proc_gen_p8; - hile_supported = true; hid0_hile = SPR_HID0_POWER8_HILE; hid0_attn = SPR_HID0_POWER8_ENABLE_ATTN; break; case PVR_TYPE_P9: case PVR_TYPE_P9P: proc_gen = proc_gen_p9; - hile_supported = true; radix_supported = true; hid0_hile = SPR_HID0_POWER9_HILE; hid0_attn = SPR_HID0_POWER9_ENABLE_ATTN; break; case PVR_TYPE_P10: proc_gen = proc_gen_p10; - hile_supported = true; radix_supported = true; hid0_hile = SPR_HID0_POWER10_HILE; hid0_attn = SPR_HID0_POWER10_ENABLE_ATTN; @@ -1056,6 +1051,11 @@ void init_boot_cpu(void) cpu_thread_count = 1; } + if (proc_gen == proc_gen_p8 && (PVR_VERS_MAJ(mfspr(SPR_PVR)) == 1)) { + prerror("CPU: POWER8 DD1 is not supported\n"); + abort(); + } + if (is_power9n(pvr) && (PVR_VERS_MAJ(pvr) == 1)) { prerror("CPU: POWER9N DD1 is not supported\n"); abort(); @@ -1597,7 +1597,7 @@ static int64_t opal_reinit_cpus(uint64_t flags) } /* * Now we need to mark ourselves "active" or we'll be skipped - * by the various "for_each_active_..." calls done by slw_reinit() + * by the various "for_each_active_..." */ this_cpu()->state = cpu_state_active; this_cpu()->in_reinit = true; @@ -1611,10 +1611,8 @@ */ cpu_cleanup_all(); - /* If HILE change via HID0 is supported ... 
*/ - if (hile_supported && - (flags & (OPAL_REINIT_CPUS_HILE_BE | - OPAL_REINIT_CPUS_HILE_LE))) { + if (flags & (OPAL_REINIT_CPUS_HILE_BE | + OPAL_REINIT_CPUS_HILE_LE)) { bool hile = !!(flags & OPAL_REINIT_CPUS_HILE_LE); flags &= ~(OPAL_REINIT_CPUS_HILE_BE | OPAL_REINIT_CPUS_HILE_LE); @@ -1669,10 +1667,7 @@ static int64_t opal_reinit_cpus(uint64_t flags) rc = OPAL_SUCCESS; } - /* Handle P8 DD1 SLW reinit */ - if (flags != 0 && proc_gen == proc_gen_p8 && !hile_supported) - rc = slw_reinit(flags); - else if (flags != 0) + if (flags != 0) rc = OPAL_UNSUPPORTED; /* And undo the above */ diff --git a/hw/slw.c b/hw/slw.c index 56ba05b0a..178ee4f85 100644 --- a/hw/slw.c +++ b/hw/slw.c @@ -29,10 +29,6 @@ #include #include -static uint32_t slw_saved_reset[0x100]; - -static bool slw_current_le = false; - enum wakeup_engine_states wakeup_engine_state = WAKEUP_ENGINE_NOT_PRESENT; bool has_deep_states = false; @@ -52,125 +48,6 @@ DEFINE_LOG_ENTRY(OPAL_RC_SLW_REG, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, OPAL_PLATFORM_FIRMWARE, OPAL_INFO, OPAL_NA); -static void slw_do_rvwinkle(void *data) -{ - struct cpu_thread *cpu = this_cpu(); - struct cpu_thread *master = data; - uint64_t lpcr = mfspr(SPR_LPCR); - struct proc_chip *chip; - - /* Setup our ICP to receive IPIs */ - icp_prep_for_pm(); - - /* Setup LPCR to wakeup on external interrupts only */ - mtspr(SPR_LPCR, ((lpcr & ~SPR_LPCR_P8_PECE) | SPR_LPCR_P8_PECE2)); - isync(); - - prlog(PR_DEBUG, "SLW: CPU PIR 0x%04x going to rvwinkle...\n", - cpu->pir); - - /* Tell that we got it */ - cpu->state = cpu_state_rvwinkle; - - enter_p8_pm_state(1); - - /* Restore SPRs */ - init_shared_sprs(); - init_replicated_sprs(); - - /* Ok, it's ours again */ - cpu->state = cpu_state_active; - - prlog(PR_DEBUG, "SLW: CPU PIR 0x%04x woken up !\n", cpu->pir); - - /* Cleanup our ICP */ - reset_cpu_icp(); - - /* Resync timebase */ - chiptod_wakeup_resync(); - - /* Restore LPCR */ - mtspr(SPR_LPCR, lpcr); - isync(); - - /* If we are passed a master pointer we are the designated - * waker, let's proceed. If not, return, we are finished. - */ - if (!master) - return; - - prlog(PR_DEBUG, "SLW: CPU PIR 0x%04x waiting for master...\n", - cpu->pir); - - /* Allriiiight... now wait for master to go down */ - while(master->state != cpu_state_rvwinkle) - sync(); - - /* XXX Wait one second ! (should check xscom state ? 
) */ - time_wait_ms(1000); - - for_each_chip(chip) { - struct cpu_thread *c; - uint64_t tmp; - for_each_available_core_in_chip(c, chip->id) { - xscom_read(chip->id, - XSCOM_ADDR_P8_EX_SLAVE(pir_to_core_id(c->pir), - EX_PM_IDLE_STATE_HISTORY_PHYP), - &tmp); - prlog(PR_TRACE, "SLW: core %x:%x" - " history: 0x%016llx (mid2)\n", - chip->id, pir_to_core_id(c->pir), - tmp); - } - } - - prlog(PR_DEBUG, "SLW: Waking master (PIR 0x%04x)...\n", master->pir); - - /* Now poke all the secondary threads on the master's core */ - for_each_cpu(cpu) { - if (!cpu_is_sibling(cpu, master) || (cpu == master)) - continue; - icp_kick_cpu(cpu); - - /* Wait for it to claim to be back (XXX ADD TIMEOUT) */ - while(cpu->state != cpu_state_active) - sync(); - } - - /* Now poke the master and be gone */ - icp_kick_cpu(master); -} - -static void slw_patch_reset(void) -{ - uint32_t *src, *dst, *sav; - - src = &reset_patch_start; - dst = (uint32_t *)0x100; - sav = slw_saved_reset; - while(src < &reset_patch_end) { - *(sav++) = *(dst); - *(dst++) = *(src++); - } - sync_icache(); -} - -static void slw_unpatch_reset(void) -{ - extern uint32_t reset_patch_start; - extern uint32_t reset_patch_end; - uint32_t *src, *dst, *sav; - - src = &reset_patch_start; - dst = (uint32_t *)0x100; - sav = slw_saved_reset; - while(src < &reset_patch_end) { - *(dst++) = *(sav++); - src++; - } - sync_icache(); -} - static bool slw_general_init(struct proc_chip *chip, struct cpu_thread *c) { uint32_t core = pir_to_core_id(c->pir); @@ -274,15 +151,6 @@ static bool slw_set_overrides_p9(struct proc_chip *chip, struct cpu_thread *c) return true; } -static bool slw_unset_overrides(struct proc_chip *chip, struct cpu_thread *c) -{ - uint32_t core = pir_to_core_id(c->pir); - - /* XXX FIXME: Save and restore the overrides */ - prlog(PR_DEBUG, "SLW: slw_unset_overrides %x:%x\n", chip->id, core); - return true; -} - static bool slw_set_idle_mode(struct proc_chip *chip, struct cpu_thread *c) { uint32_t core = pir_to_core_id(c->pir); @@ -1201,197 +1069,6 @@ void add_cpu_idle_state_properties(void) free(pm_ctrl_reg_mask_buf); } -static void slw_cleanup_core(struct proc_chip *chip, struct cpu_thread *c) -{ - uint64_t tmp; - int rc; - - /* Display history to check transition */ - rc = xscom_read(chip->id, - XSCOM_ADDR_P8_EX_SLAVE(pir_to_core_id(c->pir), - EX_PM_IDLE_STATE_HISTORY_PHYP), - &tmp); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_GET), - "SLW: Failed to read PM_IDLE_STATE_HISTORY\n"); - /* XXX error handling ? return false; */ - } - - prlog(PR_DEBUG, "SLW: core %x:%x history: 0x%016llx (new1)\n", - chip->id, pir_to_core_id(c->pir), tmp); - - rc = xscom_read(chip->id, - XSCOM_ADDR_P8_EX_SLAVE(pir_to_core_id(c->pir), - EX_PM_IDLE_STATE_HISTORY_PHYP), - &tmp); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_GET), - "SLW: Failed to read PM_IDLE_STATE_HISTORY\n"); - /* XXX error handling ? return false; */ - } - - prlog(PR_DEBUG, "SLW: core %x:%x history: 0x%016llx (new2)\n", - chip->id, pir_to_core_id(c->pir), tmp); - - /* - * XXX FIXME: Error out if the transition didn't reach rvwinkle ? 
- */ - - /* - * XXX FIXME: We should restore a bunch of the EX bits we - * overwrite to sane values here - */ - slw_unset_overrides(chip, c); -} - -static void slw_cleanup_chip(struct proc_chip *chip) -{ - struct cpu_thread *c; - - for_each_available_core_in_chip(c, chip->id) - slw_cleanup_core(chip, c); -} - -static void slw_patch_scans(struct proc_chip *chip, bool le_mode) -{ - int64_t rc; - uint64_t old_val, new_val; - - rc = sbe_xip_get_scalar((void *)chip->slw_base, - "skip_ex_override_ring_scans", &old_val); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_REG), - "SLW: Failed to read scan override on chip %d\n", - chip->id); - return; - } - - new_val = le_mode ? 0 : 1; - - prlog(PR_TRACE, "SLW: Chip %d, LE value was: %lld, setting to %lld\n", - chip->id, old_val, new_val); - - rc = sbe_xip_set_scalar((void *)chip->slw_base, - "skip_ex_override_ring_scans", new_val); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_REG), - "SLW: Failed to set LE mode on chip %d\n", chip->id); - return; - } -} - -int64_t slw_reinit(uint64_t flags) -{ - struct proc_chip *chip; - struct cpu_thread *cpu; - bool has_waker = false; - bool target_le = slw_current_le; - - if (flags & OPAL_REINIT_CPUS_HILE_BE) - target_le = false; - if (flags & OPAL_REINIT_CPUS_HILE_LE) - target_le = true; - - prlog(PR_TRACE, "SLW Reinit from CPU PIR 0x%04x," - " HILE set to %s endian...\n", - this_cpu()->pir, - target_le ? "little" : "big"); - - /* Prepare chips/cores for rvwinkle */ - for_each_chip(chip) { - if (!chip->slw_base) { - log_simple_error(&e_info(OPAL_RC_SLW_INIT), - "SLW: Not found on chip %d\n", chip->id); - return OPAL_HARDWARE; - } - - slw_patch_scans(chip, target_le); - } - slw_current_le = target_le; - - /* XXX Save HIDs ? Or do that in head.S ... */ - - slw_patch_reset(); - - /* rvwinkle everybody and pick one to wake me once I rvwinkle myself */ - for_each_available_cpu(cpu) { - struct cpu_thread *master = NULL; - - if (cpu == this_cpu()) - continue; - - /* Pick up a waker for myself: it must not be a sibling of - * the current CPU and must be a thread 0 (so it gets to - * sync its timebase before doing time_wait_ms() - */ - if (!has_waker && !cpu_is_sibling(cpu, this_cpu()) && - cpu_is_thread0(cpu)) { - has_waker = true; - master = this_cpu(); - } - __cpu_queue_job(cpu, "slw_do_rvwinkle", - slw_do_rvwinkle, master, true); - - /* Wait for it to claim to be down */ - while(cpu->state != cpu_state_rvwinkle) - sync(); - } - - /* XXX Wait one second ! (should check xscom state ? ) */ - prlog(PR_TRACE, "SLW: Waiting one second...\n"); - time_wait_ms(1000); - prlog(PR_TRACE, "SLW: Done.\n"); - - for_each_chip(chip) { - struct cpu_thread *c; - uint64_t tmp; - for_each_available_core_in_chip(c, chip->id) { - xscom_read(chip->id, - XSCOM_ADDR_P8_EX_SLAVE(pir_to_core_id(c->pir), - EX_PM_IDLE_STATE_HISTORY_PHYP), - &tmp); - prlog(PR_DEBUG, "SLW: core %x:%x" - " history: 0x%016llx (mid)\n", - chip->id, pir_to_core_id(c->pir), tmp); - } - } - - - /* Wake everybody except on my core */ - for_each_cpu(cpu) { - if (cpu->state != cpu_state_rvwinkle || - cpu_is_sibling(cpu, this_cpu())) - continue; - icp_kick_cpu(cpu); - - /* Wait for it to claim to be back (XXX ADD TIMEOUT) */ - while(cpu->state != cpu_state_active) - sync(); - } - - /* Did we find a waker ? 
If we didn't, that means we had no - * other core in the system, we can't do it - */ - if (!has_waker) { - prlog(PR_TRACE, "SLW: No candidate waker, giving up !\n"); - return OPAL_HARDWARE; - } - - /* Our siblings are rvwinkling, and our waker is waiting for us - * so let's just go down now - */ - slw_do_rvwinkle(NULL); - - slw_unpatch_reset(); - - for_each_chip(chip) - slw_cleanup_chip(chip); - - prlog(PR_TRACE, "SLW Reinit complete !\n"); - - return OPAL_SUCCESS; -} - static void slw_patch_regs(struct proc_chip *chip) { struct cpu_thread *c; diff --git a/include/skiboot.h b/include/skiboot.h index f3378ec28..fa5323231 100644 --- a/include/skiboot.h +++ b/include/skiboot.h @@ -311,11 +311,6 @@ extern enum wakeup_engine_states wakeup_engine_state; extern bool has_deep_states; extern void nx_p9_rng_late_init(void); - - -/* SLW reinit function for switching core settings */ -extern int64_t slw_reinit(uint64_t flags); - /* Patch SPR in SLW image */ extern int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val); -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:46:53 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:46:53 +1000 Subject: [Skiboot] [PATCH v3 02/10] Introduce hwprobe facility to avoid hard-coding probe functions In-Reply-To: <20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-3-npiggin@gmail.com> From: Stewart Smith hwprobe is a little system to have different hardware probing modules run in the dependency order they choose rather than hard coding that order in core/init.c. Signed-off-by: Stewart Smith --- core/Makefile.inc | 1 + core/hwprobe.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++ core/init.c | 3 ++ include/skiboot.h | 39 +++++++++++++++++++++++++- skiboot.lds.S | 6 ++++ 5 files changed, 118 insertions(+), 1 deletion(-) create mode 100644 core/hwprobe.c diff --git a/core/Makefile.inc b/core/Makefile.inc index 829800e5b..f80019b6a 100644 --- a/core/Makefile.inc +++ b/core/Makefile.inc @@ -13,6 +13,7 @@ CORE_OBJS += timer.o i2c.o rtc.o flash.o sensor.o ipmi-opal.o CORE_OBJS += flash-subpartition.o bitmap.o buddy.o pci-quirk.o powercap.o psr.o CORE_OBJS += pci-dt-slot.o direct-controls.o cpufeatures.o CORE_OBJS += flash-firmware-versions.o opal-dump.o +CORE_OBJS += hwprobe.o ifeq ($(SKIBOOT_GCOV),1) CORE_OBJS += gcov-profiling.o diff --git a/core/hwprobe.c b/core/hwprobe.c new file mode 100644 index 000000000..0a641ada5 --- /dev/null +++ b/core/hwprobe.c @@ -0,0 +1,70 @@ +// SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later +/* Copyright 2021 Stewart Smith */ + +#define pr_fmt(fmt) "HWPROBE: " fmt +#include +#include + +static bool hwprobe_deps_satisfied(const struct hwprobe *hwp) +{ + struct hwprobe *hwprobe; + const char **dep; + unsigned int i; + + dep = hwp->deps; + if (dep == NULL) + return true; + + + prlog(PR_TRACE, "Checking deps for %s\n", hwp->name); + + while (*dep != NULL) { + prlog(PR_TRACE, "Checking %s dep %s\n", hwp->name, *dep); + hwprobe = &__hwprobes_start; + for (i = 0; &hwprobe[i] < &__hwprobes_end; i++) { + if(strcmp(hwprobe[i].name, *dep) == 0 && + !hwprobe[i].probed) + return false; + } + dep++; + } + + prlog(PR_TRACE, "deps for %s are satisfied!\n", hwp->name); + return true; + +} + +void probe_hardware(void) +{ + struct hwprobe *hwprobe; + unsigned int i; + bool work_todo = true; + bool did_something = true; + + while (work_todo) { + work_todo = false; + did_something = false; + hwprobe = &__hwprobes_start; + 
prlog(PR_DEBUG, "Begin loop\n"); + for (i = 0; &hwprobe[i] < &__hwprobes_end; i++) { + if (hwprobe[i].probed) + continue; + if (hwprobe_deps_satisfied(&hwprobe[i])) { + prlog(PR_DEBUG, "Probing %s...\n", hwprobe[i].name); + if (hwprobe[i].probe) + hwprobe[i].probe(); + did_something = true; + hwprobe[i].probed = true; + } else { + prlog(PR_DEBUG, "Dependencies for %s not yet satisfied, skipping\n", + hwprobe[i].name); + work_todo = true; + } + } + + if (work_todo && !did_something) { + prlog(PR_ERR, "Cannot satisfy dependencies! Bailing out\n"); + break; + } + } +} diff --git a/core/init.c b/core/init.c index a8bac28a8..61934c9fe 100644 --- a/core/init.c +++ b/core/init.c @@ -1372,6 +1372,9 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) probe_npu2(); probe_npu3(); + /* Probe all HWPROBE hardware we have code linked for*/ + probe_hardware(); + /* Initialize PCI */ pci_init_slots(); diff --git a/include/skiboot.h b/include/skiboot.h index fa5323231..f83fcbdf6 100644 --- a/include/skiboot.h +++ b/include/skiboot.h @@ -1,5 +1,7 @@ // SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later -/* Copyright 2013-2019 IBM Corp. */ +/* Copyright 2013-2019 IBM Corp. + * Copyright 2021 Stewart Smith + */ #ifndef __SKIBOOT_H #define __SKIBOOT_H @@ -341,4 +343,39 @@ extern int fake_nvram_info(uint32_t *total_size); extern int fake_nvram_start_read(void *dst, uint32_t src, uint32_t len); extern int fake_nvram_write(uint32_t offset, void *src, uint32_t size); +/* + * A bunch of hardware needs to be probed, sometimes in a particular order. + * Very simple dependency graph, with a even simpler way to resolve it. + * But it means we can now at link time choose what hardware we support. + * This struct should not be defined directly but with the macros. + */ +struct hwprobe { + const char *name; + void (*probe)(void); + + bool probed; + + /* NULL or NULL-terminated array of strings */ + const char **deps; +}; + +#define DEFINE_HWPROBE(__name, __probe) \ +static const struct hwprobe __used __section(".hwprobes") hwprobe_##__name = { \ + .name = #__name, \ + .probe = __probe, \ + .deps = NULL, \ +} + +#define DEFINE_HWPROBE_DEPS(__name, __probe, ...) \ +static const struct hwprobe __used __section(".hwprobes") hwprobe_##__name = { \ + .name = #__name, \ + .probe = __probe, \ + .deps = (const char *[]){ __VA_ARGS__, NULL}, \ +} + +extern struct hwprobe __hwprobes_start; +extern struct hwprobe __hwprobes_end; + +extern void probe_hardware(void); + #endif /* __SKIBOOT_H */ diff --git a/skiboot.lds.S b/skiboot.lds.S index 5a7f9e316..c8e6e747c 100644 --- a/skiboot.lds.S +++ b/skiboot.lds.S @@ -164,6 +164,12 @@ SECTIONS __platforms_end = .; } + .hwprobes : { + __hwprobes_start = .; + KEEP(*(.hwprobes)) + __hwprobes_end = .; + } + /* Relocations */ . 
= ALIGN(0x10); .dynamic : { -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:46:54 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:46:54 +1000 Subject: [Skiboot] [PATCH v3 03/10] hwprobe: convert PHB, NPU subsystems to hwprobe In-Reply-To: <20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-4-npiggin@gmail.com> From: Stewart Smith [npiggin: split out from initial hwprobe patch] Signed-off-by: Stewart Smith --- core/init.c | 13 +------------ hw/npu.c | 2 ++ hw/npu2-common.c | 2 ++ hw/npu3.c | 2 ++ hw/phb3.c | 2 +- hw/phb4.c | 2 ++ 6 files changed, 10 insertions(+), 13 deletions(-) diff --git a/core/init.c b/core/init.c index 61934c9fe..5e2b18d85 100644 --- a/core/init.c +++ b/core/init.c @@ -1361,18 +1361,7 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) /* NX init */ nx_init(); - /* Probe PHB3 on P8 */ - probe_phb3(); - - /* Probe PHB4 on P9 and PHB5 on P10 */ - probe_phb4(); - - /* Probe NPUs */ - probe_npu(); - probe_npu2(); - probe_npu3(); - - /* Probe all HWPROBE hardware we have code linked for*/ + /* Probe all HWPROBE hardware we have code linked for */ probe_hardware(); /* Initialize PCI */ diff --git a/hw/npu.c b/hw/npu.c index dba7ee50f..2b5364c33 100644 --- a/hw/npu.c +++ b/hw/npu.c @@ -1691,3 +1691,5 @@ void probe_npu(void) dt_for_each_compatible(dt_root, np, "ibm,power8-npu-pciex") npu_create_phb(np); } + +DEFINE_HWPROBE_DEPS(npu, probe_npu, "phb3"); diff --git a/hw/npu2-common.c b/hw/npu2-common.c index 3bc9bcee6..87ebf8232 100644 --- a/hw/npu2-common.c +++ b/hw/npu2-common.c @@ -679,3 +679,5 @@ void probe_npu2(void) setup_devices(npu); } } + +DEFINE_HWPROBE_DEPS(npu2, probe_npu2, "phb4"); diff --git a/hw/npu3.c b/hw/npu3.c index 03461373e..92af96b23 100644 --- a/hw/npu3.c +++ b/hw/npu3.c @@ -547,3 +547,5 @@ void probe_npu3(void) npu3_init(npu); } } + +DEFINE_HWPROBE_DEPS(npu3, probe_npu3, "phb4"); diff --git a/hw/phb3.c b/hw/phb3.c index 8af6b6164..320023e57 100644 --- a/hw/phb3.c +++ b/hw/phb3.c @@ -5049,4 +5049,4 @@ void probe_phb3(void) phb3_create(np); } - +DEFINE_HWPROBE(phb3, probe_phb3); diff --git a/hw/phb4.c b/hw/phb4.c index 79083d4a1..ec07fe2bb 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -6398,3 +6398,5 @@ void probe_phb4(void) phb4_create(np); } } + +DEFINE_HWPROBE(phb4, probe_phb4); -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:46:51 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:46:51 +1000 Subject: [Skiboot] [PATCH v3 00/10] hwprobe patches Message-ID: <20210811054701.861123-1-npiggin@gmail.com> Since v1: - Rebased on upstream Since v2: - Fixed bug in deps traversal code that could cause some modules to not be probed (see the sketch below). - Tested and boots with little-endian patch on a P9. 
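For reference, the traversal bug fixed here: the v2 loop in hwprobe_deps_satisfied() took a char pointer to the first dependency string and advanced it byte by byte, so it compared the candidate name against ever-shorter suffixes of deps[0] and never reached deps[1] and beyond. The v3 loop walks the NULL-terminated array itself. Excerpted from the two versions of the function:

    /* v2 (buggy): dep is a const char *, stepping through the
     * characters of the first string rather than the array */
    dep = hwp->deps[0];
    while (dep != NULL) {
            /* ... strcmp(hwprobe[i].name, dep) ... */
            dep++;
    }

    /* v3 (fixed): dep is a const char **, stepping through the
     * NULL-terminated array of dependency names */
    dep = hwp->deps;
    while (*dep != NULL) {
            /* ... strcmp(hwprobe[i].name, *dep) ... */
            dep++;
    }
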
Nicholas Piggin (2): Remove support for POWER8 DD1 hw/slw: Move P8 bits behind CONFIG_P8 Stewart Smith (8): Introduce hwprobe facility to avoid hard-coding probe functions hwprobe: convert PHB, NPU subsystems to hwprobe Add CONFIG_P8 with PHB3 behind it hwprobe: convert vas_init(), nx_init() npu: move npu_set_fence_state() to phb_ops npu: Move npu.o and npu-hw-procedures.o under CONFIG_P8 platforms: put P8 platforms behind CONFIG_P8 npu: Add CONFIG_NPU to optionally skip NPU code Makefile | 4 + Makefile.main | 19 +- core/Makefile.inc | 1 + core/cpu.c | 30 +- core/fast-reboot.c | 2 + core/hmi.c | 12 +- core/hwprobe.c | 70 +++++ core/init.c | 18 +- core/platform.c | 1 - hw/Makefile.inc | 20 +- hw/npu.c | 9 +- hw/npu2-common.c | 2 + hw/npu2.c | 1 + hw/npu3.c | 2 + hw/nx.c | 2 + hw/phb3.c | 2 +- hw/phb4.c | 2 + hw/slw.c | 491 ++++++--------------------------- hw/vas.c | 2 + include/npu.h | 1 - include/npu2.h | 6 + include/pci.h | 6 + include/skiboot.h | 44 ++- libpore/Makefile.inc | 8 +- platforms/astbmc/Makefile.inc | 23 +- platforms/ibm-fsp/Makefile.inc | 7 +- skiboot.lds.S | 6 + 27 files changed, 330 insertions(+), 461 deletions(-) create mode 100644 core/hwprobe.c -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:46:55 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:46:55 +1000 Subject: [Skiboot] [PATCH v3 04/10] Add CONFIG_P8 with PHB3 behind it In-Reply-To: <20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-5-npiggin@gmail.com> From: Stewart Smith We can use a base CPU of POWER9 if we don't have P8. We can also hide PHB3 code behind this, and shave 12kb off skiboot.lid.xz. [npiggin: add cpp define, fail gracefully on P8] Signed-off-by: Stewart Smith --- Makefile | 2 ++ Makefile.main | 15 ++++++++++++++- core/cpu.c | 11 +++++++++-- hw/Makefile.inc | 8 ++++++-- 4 files changed, 31 insertions(+), 5 deletions(-) diff --git a/Makefile b/Makefile index 6e5b91d84..a9807c4dc 100644 --- a/Makefile +++ b/Makefile @@ -65,6 +65,8 @@ ELF_ABI_v2 ?= $(LITTLE_ENDIAN) DEAD_CODE_ELIMINATION ?= 0 # Try to build without FSP code CONFIG_FSP?=1 +# Try to build without POWER8 support +CONFIG_P8?=1 # # Where is the source directory, must be a full path (no ~) diff --git a/Makefile.main b/Makefile.main index c8a63e8b1..2a346a6c9 100644 --- a/Makefile.main +++ b/Makefile.main @@ -96,7 +96,11 @@ CPPFLAGS += -DDEBUG -DCCAN_LIST_DEBUG endif CFLAGS := -fno-strict-aliasing -pie -fpie -fno-pic -m64 -fno-asynchronous-unwind-tables +ifeq ($(CONFIG_P8),1) CFLAGS += -mcpu=power8 +else +CFLAGS += -mcpu=power9 +endif CFLAGS += -Wl,--oformat,elf64-powerpc -ggdb # r13,r14,r15 are preserved for OS to use as fixed registers.
# These could be saved and restored in and out of skiboot, but it's more @@ -156,6 +160,10 @@ else CFLAGS += -fno-stack-protector endif +# Add preprocessor defines for CONFIG_ options here +ifeq ($(CONFIG_P8),1) +CFLAGS += -DCONFIG_P8=1 +endif CFLAGS += $(call try-cflag,$(CC),-Wjump-misses-init) \ $(call try-cflag,$(CC),-Wsuggest-attribute=const) \ @@ -173,7 +181,12 @@ LDFLAGS := -m64 -static -nostdlib -pie LDFLAGS += -Wl,-pie LDFLAGS += -Wl,-Ttext-segment,$(LD_TEXT) -Wl,-N -Wl,--build-id=none LDFLAGS += -Wl,--no-multi-toc -LDFLAGS += -mcpu=power8 -Wl,--oformat,elf64-powerpc +ifeq ($(CONFIG_P8),1) +LDFLAGS += -mcpu=power8 +else +LDFLAGS += -mcpu=power9 +endif +LDFLAGS += -Wl,--oformat,elf64-powerpc LDFLAGS_FINAL = -m elf64lppc --no-multi-toc -N --build-id=none --whole-archive LDFLAGS_FINAL += -static -nostdlib -pie -Ttext-segment=$(LD_TEXT) --oformat=elf64-powerpc LDFLAGS_FINAL += --orphan-handling=warn diff --git a/core/cpu.c b/core/cpu.c index 60a9ea1c3..d4d33b836 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -1051,9 +1051,16 @@ void init_boot_cpu(void) cpu_thread_count = 1; } - if (proc_gen == proc_gen_p8 && (PVR_VERS_MAJ(mfspr(SPR_PVR)) == 1)) { - prerror("CPU: POWER8 DD1 is not supported\n"); + if (proc_gen == proc_gen_p8) { +#ifdef CONFIG_P8 + if (PVR_VERS_MAJ(mfspr(SPR_PVR)) == 1) { + prerror("CPU: POWER8 DD1 is not supported\n"); + abort(); + } +#else + prerror("CPU: POWER8 detected but CONFIG_P8 not set\n"); abort(); +#endif } if (is_power9n(pvr) && (PVR_VERS_MAJ(pvr) == 1)) { diff --git a/hw/Makefile.inc b/hw/Makefile.inc index 37256d3cc..d436da222 100644 --- a/hw/Makefile.inc +++ b/hw/Makefile.inc @@ -3,13 +3,17 @@ SUBDIRS += hw HW_OBJS = xscom.o chiptod.o lpc.o lpc-uart.o psi.o HW_OBJS += homer.o slw.o occ.o fsi-master.o centaur.o imc.o HW_OBJS += nx.o nx-rng.o nx-crypto.o nx-compress.o nx-842.o nx-gzip.o -HW_OBJS += phb3.o sfc-ctrl.o fake-rtc.o bt.o p8-i2c.o prd.o -HW_OBJS += dts.o lpc-rtc.o npu.o npu-hw-procedures.o xive.o phb4.o +HW_OBJS += sfc-ctrl.o fake-rtc.o bt.o p8-i2c.o prd.o +HW_OBJS += dts.o lpc-rtc.o xive.o phb4.o HW_OBJS += fake-nvram.o lpc-mbox.o npu2.o npu2-hw-procedures.o HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o HW_OBJS += ocmb.o xive2.o +HW_OBJS += npu.o npu-hw-procedures.o +ifeq ($(CONFIG_P8),1) +HW_OBJS += phb3.o +endif HW=hw/built-in.a include $(SRC)/hw/fsp/Makefile.inc -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:46:56 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:46:56 +1000 Subject: [Skiboot] [PATCH v3 05/10] hw/slw: Move P8 bits behind CONFIG_P8 In-Reply-To: <20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-6-npiggin@gmail.com> This saves about 3kB from skiboot.lid.xz Signed-off-by: Nicholas Piggin --- core/fast-reboot.c | 2 + hw/slw.c | 176 ++++++++++++++++++++++--------------------- libpore/Makefile.inc | 8 +- 3 files changed, 100 insertions(+), 86 deletions(-) diff --git a/core/fast-reboot.c b/core/fast-reboot.c index 9f92525a9..2696348af 100644 --- a/core/fast-reboot.c +++ b/core/fast-reboot.c @@ -272,6 +272,7 @@ static void cleanup_cpu_state(void) /* XXX Update the SLW copies ! Also dbl check HIDs etc... 
*/ init_shared_sprs(); +#ifdef CONFIG_P8 if (proc_gen == proc_gen_p8) { /* If somebody was in fast_sleep, we may have a * workaround to undo @@ -287,6 +288,7 @@ static void cleanup_cpu_state(void) */ cleanup_local_tlb(); } +#endif /* And we might have lost TB sync */ chiptod_wakeup_resync(); diff --git a/hw/slw.c b/hw/slw.c index 178ee4f85..cf633d2ad 100644 --- a/hw/slw.c +++ b/hw/slw.c @@ -32,19 +32,20 @@ enum wakeup_engine_states wakeup_engine_state = WAKEUP_ENGINE_NOT_PRESENT; bool has_deep_states = false; -DEFINE_LOG_ENTRY(OPAL_RC_SLW_INIT, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, - OPAL_PLATFORM_FIRMWARE, OPAL_PREDICTIVE_ERR_GENERAL, - OPAL_NA); - DEFINE_LOG_ENTRY(OPAL_RC_SLW_SET, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, OPAL_PLATFORM_FIRMWARE, OPAL_INFO, OPAL_NA); -DEFINE_LOG_ENTRY(OPAL_RC_SLW_GET, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, +DEFINE_LOG_ENTRY(OPAL_RC_SLW_REG, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, OPAL_PLATFORM_FIRMWARE, OPAL_INFO, OPAL_NA); -DEFINE_LOG_ENTRY(OPAL_RC_SLW_REG, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, +#ifdef CONFIG_P8 +DEFINE_LOG_ENTRY(OPAL_RC_SLW_INIT, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, + OPAL_PLATFORM_FIRMWARE, OPAL_PREDICTIVE_ERR_GENERAL, + OPAL_NA); + +DEFINE_LOG_ENTRY(OPAL_RC_SLW_GET, OPAL_PLATFORM_ERR_EVT, OPAL_SLW, OPAL_PLATFORM_FIRMWARE, OPAL_INFO, OPAL_NA); @@ -98,59 +99,6 @@ static bool slw_set_overrides(struct proc_chip *chip, struct cpu_thread *c) return true; } -static bool slw_set_overrides_p10(struct proc_chip *chip, struct cpu_thread *c) -{ - uint64_t tmp; - int rc; - uint32_t core = pir_to_core_id(c->pir); - - /* Special wakeup bits that could hold power mgt */ - rc = xscom_read(chip->id, - XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP), - &tmp); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_SET), - "SLW: Failed to read P10_QME_SPWU_HYP\n"); - return false; - } - if (tmp & P10_SPWU_REQ) - prlog(PR_WARNING, - "SLW: core %d P10_QME_SPWU_HYP requested 0x%016llx\n", - core, tmp); - - return true; -} - - -static bool slw_set_overrides_p9(struct proc_chip *chip, struct cpu_thread *c) -{ - uint64_t tmp; - int rc; - uint32_t core = pir_to_core_id(c->pir); - - /* Special wakeup bits that could hold power mgt */ - rc = xscom_read(chip->id, - XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_HYP), - &tmp); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_SET), - "SLW: Failed to read EC_PPM_SPECIAL_WKUP_HYP\n"); - return false; - } - if (tmp) - prlog(PR_WARNING, - "SLW: core %d EC_PPM_SPECIAL_WKUP_HYP read 0x%016llx\n", - core, tmp); - rc = xscom_read(chip->id, - XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_OTR), - &tmp); - if (tmp) - prlog(PR_WARNING, - "SLW: core %d EC_PPM_SPECIAL_WKUP_OTR read 0x%016llx\n", - core, tmp); - return true; -} - static bool slw_set_idle_mode(struct proc_chip *chip, struct cpu_thread *c) { uint32_t core = pir_to_core_id(c->pir); @@ -242,6 +190,60 @@ static bool idle_prepare_core(struct proc_chip *chip, struct cpu_thread *c) return true; } +#endif + +static bool slw_set_overrides_p10(struct proc_chip *chip, struct cpu_thread *c) +{ + uint64_t tmp; + int rc; + uint32_t core = pir_to_core_id(c->pir); + + /* Special wakeup bits that could hold power mgt */ + rc = xscom_read(chip->id, + XSCOM_ADDR_P10_QME_CORE(core, P10_QME_SPWU_HYP), + &tmp); + if (rc) { + log_simple_error(&e_info(OPAL_RC_SLW_SET), + "SLW: Failed to read P10_QME_SPWU_HYP\n"); + return false; + } + if (tmp & P10_SPWU_REQ) + prlog(PR_WARNING, + "SLW: core %d P10_QME_SPWU_HYP requested 0x%016llx\n", + core, tmp); + + return true; +} + + +static bool slw_set_overrides_p9(struct proc_chip 
*chip, struct cpu_thread *c) +{ + uint64_t tmp; + int rc; + uint32_t core = pir_to_core_id(c->pir); + + /* Special wakeup bits that could hold power mgt */ + rc = xscom_read(chip->id, + XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_HYP), + &tmp); + if (rc) { + log_simple_error(&e_info(OPAL_RC_SLW_SET), + "SLW: Failed to read EC_PPM_SPECIAL_WKUP_HYP\n"); + return false; + } + if (tmp) + prlog(PR_WARNING, + "SLW: core %d EC_PPM_SPECIAL_WKUP_HYP read 0x%016llx\n", + core, tmp); + rc = xscom_read(chip->id, + XSCOM_ADDR_P9_EC_SLAVE(core, EC_PPM_SPECIAL_WKUP_OTR), + &tmp); + if (tmp) + prlog(PR_WARNING, + "SLW: core %d EC_PPM_SPECIAL_WKUP_OTR read 0x%016llx\n", + core, tmp); + return true; +} /* Define device-tree fields */ #define MAX_NAME_LEN 16 @@ -1069,31 +1071,6 @@ void add_cpu_idle_state_properties(void) free(pm_ctrl_reg_mask_buf); } -static void slw_patch_regs(struct proc_chip *chip) -{ - struct cpu_thread *c; - void *image = (void *)chip->slw_base; - int rc; - - for_each_available_cpu(c) { - if (c->chip_id != chip->id) - continue; - - /* Clear HRMOR */ - rc = p8_pore_gen_cpureg_fixed(image, P8_SLW_MODEBUILD_SRAM, - P8_SPR_HRMOR, 0, - cpu_get_core_index(c), - cpu_get_thread_index(c)); - if (rc) { - log_simple_error(&e_info(OPAL_RC_SLW_REG), - "SLW: Failed to set HRMOR for CPU %x\n", - c->pir); - } - - /* XXX Add HIDs etc... */ - } -} - static void slw_init_chip_p9(struct proc_chip *chip) { struct cpu_thread *c; @@ -1135,6 +1112,32 @@ static bool slw_image_check_p9(struct proc_chip *chip) } +#ifdef CONFIG_P8 +static void slw_patch_regs(struct proc_chip *chip) +{ + struct cpu_thread *c; + void *image = (void *)chip->slw_base; + int rc; + + for_each_available_cpu(c) { + if (c->chip_id != chip->id) + continue; + + /* Clear HRMOR */ + rc = p8_pore_gen_cpureg_fixed(image, P8_SLW_MODEBUILD_SRAM, + P8_SPR_HRMOR, 0, + cpu_get_core_index(c), + cpu_get_thread_index(c)); + if (rc) { + log_simple_error(&e_info(OPAL_RC_SLW_REG), + "SLW: Failed to set HRMOR for CPU %x\n", + c->pir); + } + + /* XXX Add HIDs etc... 
*/ + } +} + static bool slw_image_check_p8(struct proc_chip *chip) { int64_t rc; @@ -1284,6 +1287,7 @@ static int64_t opal_config_cpu_idle_state(uint64_t state, uint64_t enter) } opal_call(OPAL_CONFIG_CPU_IDLE_STATE, opal_config_cpu_idle_state, 2); +#endif int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val) { @@ -1324,6 +1328,7 @@ int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val) sprn, val, cpu_pir); } +#ifdef CONFIG_P8 } else if (proc_gen == proc_gen_p8) { int spr_is_supported = 0; void *image; @@ -1347,6 +1352,7 @@ int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val) sprn, val, cpu_get_core_index(c), cpu_get_thread_index(c)); +#endif } else { log_simple_error(&e_info(OPAL_RC_SLW_REG), "SLW: proc_gen not supported\n"); @@ -1378,6 +1384,7 @@ void slw_init(void) return; } if (proc_gen == proc_gen_p8) { +#ifdef CONFIG_P8 for_each_chip(chip) { slw_init_chip_p8(chip); if(slw_image_check_p8(chip)) @@ -1386,6 +1393,7 @@ void slw_init(void) slw_late_init_p8(chip); } p8_sbe_init_timer(); +#endif } else if (proc_gen == proc_gen_p9) { for_each_chip(chip) { slw_init_chip_p9(chip); diff --git a/libpore/Makefile.inc b/libpore/Makefile.inc index 06d9c8902..a60674856 100644 --- a/libpore/Makefile.inc +++ b/libpore/Makefile.inc @@ -1,5 +1,9 @@ -LIBPORE_SRCS = p8_pore_table_gen_api_fixed.C p9_stop_api.C p9_stop_util.C p10_stop_api.C p10_stop_util.C -LIBPORE_SRCS += p8_pore_table_static_data.c sbe_xip_image.c pore_inline_assembler.c +LIBPORE_SRCS = p9_stop_api.C p9_stop_util.C p10_stop_api.C p10_stop_util.C +LIBPORE_SRCS += sbe_xip_image.c pore_inline_assembler.c +ifeq ($(CONFIG_P8),1) +LIBPORE_SRCS += p8_pore_table_gen_api_fixed.C p8_pore_table_static_data.c +endif + LIBPORE_OBJS_1 = $(LIBPORE_SRCS:%.c=%.o) LIBPORE_OBJS = $(LIBPORE_OBJS_1:%.C=%.o) SUBDIRS += libpore -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:46:57 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:46:57 +1000 Subject: [Skiboot] [PATCH v3 06/10] hwprobe: convert vas_init(), nx_init() In-Reply-To: <20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-7-npiggin@gmail.com> From: Stewart Smith [npiggin: remove imc_init because it moved later in boot (fbcbd4e47c)] Signed-off-by: Stewart Smith --- core/init.c | 6 ------ hw/nx.c | 2 ++ hw/vas.c | 2 ++ 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/core/init.c b/core/init.c index 5e2b18d85..0ec5d6ac3 100644 --- a/core/init.c +++ b/core/init.c @@ -1355,12 +1355,6 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) /* Catalog decompression routine */ imc_decompress_catalog(); - /* Virtual Accelerator Switchboard */ - vas_init(); - - /* NX init */ - nx_init(); - /* Probe all HWPROBE hardware we have code linked for */ probe_hardware(); diff --git a/hw/nx.c b/hw/nx.c index fdadf53c7..b1cab5774 100644 --- a/hw/nx.c +++ b/hw/nx.c @@ -136,3 +136,5 @@ void nx_init(void) if (proc_gen >= proc_gen_p9) darn_init(); } + +DEFINE_HWPROBE_DEPS(nx, nx_init, "vas"); diff --git a/hw/vas.c b/hw/vas.c index 0dbe0bcda..96ca055cc 100644 --- a/hw/vas.c +++ b/hw/vas.c @@ -637,3 +637,5 @@ out: vas_err("Disabled (failed initialization)\n"); return; } + +DEFINE_HWPROBE(vas, vas_init); -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:46:58 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:46:58 +1000 Subject: [Skiboot] [PATCH v3 07/10] npu: move npu_set_fence_state() to phb_ops In-Reply-To: 
<20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-8-npiggin@gmail.com> From: Stewart Smith This lets us consider not building in npu.o. Signed-off-by: Stewart Smith --- core/hmi.c | 2 +- hw/npu.c | 7 +++++-- include/npu.h | 1 - include/pci.h | 3 +++ 4 files changed, 9 insertions(+), 4 deletions(-) diff --git a/core/hmi.c b/core/hmi.c index 9363cc5fb..55eaa59c6 100644 --- a/core/hmi.c +++ b/core/hmi.c @@ -924,7 +924,7 @@ static void find_npu_checkstop_reason(int flat_chip_id, npu_fir_action0, npu_fir_action1); /* Set the NPU to fenced since it can't recover. */ - npu_set_fence_state(p, true); + phb->ops->set_fence_state(phb, true); /* Set up the HMI event */ hmi_evt->severity = OpalHMI_SEV_WARNING; diff --git a/hw/npu.c b/hw/npu.c index 2b5364c33..6992e7e72 100644 --- a/hw/npu.c +++ b/hw/npu.c @@ -925,7 +925,9 @@ static int64_t npu_eeh_next_error(struct phb *phb, } /* For use in error injection and handling. */ -void npu_set_fence_state(struct npu *p, bool fence) { +static void npu_set_fence_state(struct phb *phb, bool fence) { + struct npu *p = phb_to_npu(phb); + p->fenced = fence; if (fence) @@ -968,7 +970,7 @@ static int64_t npu_err_inject(struct phb *phb, uint64_t pe_number, return OPAL_PARAMETER; } else if (type == 1) { /* Emulate fence mode. */ - npu_set_fence_state(p, true); + npu_set_fence_state(phb, true); } else { /* Cause a freeze with an invalid MMIO read. If the BAR is not * enabled, this will checkstop the machine. @@ -1012,6 +1014,7 @@ static const struct phb_ops npu_ops = { .get_diag_data2 = NULL, .set_capi_mode = NULL, .set_capp_recovery = NULL, + .set_fence_state = npu_set_fence_state, }; static void assign_mmio_bars(uint32_t gcid, uint32_t xscom, diff --git a/include/npu.h b/include/npu.h index 50cc9c9fc..45818a28f 100644 --- a/include/npu.h +++ b/include/npu.h @@ -153,7 +153,6 @@ int64_t npu_dev_procedure(void *dev, struct pci_cfg_reg_filter *pcrf, uint32_t offset, uint32_t len, uint32_t *data, bool write); -void npu_set_fence_state(struct npu *p, bool fence); void npu_dev_procedure_reset(struct npu_dev *dev); #define NPUDBG(p, fmt, a...) prlog(PR_DEBUG, "NPU%d: " fmt, \ diff --git a/include/pci.h b/include/pci.h index eb23a6d9b..05d02171b 100644 --- a/include/pci.h +++ b/include/pci.h @@ -340,6 +340,9 @@ struct phb_ops { /* Get/set PBCQ Tunnel BAR register */ void (*get_tunnel_bar)(struct phb *phb, uint64_t *addr); int64_t (*set_tunnel_bar)(struct phb *phb, uint64_t addr); + + /* Currently only used by NPU HMI code */ + void (*set_fence_state)(struct phb *phb, bool fence); }; enum phb_type { -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:46:59 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:46:59 +1000 Subject: [Skiboot] [PATCH v3 08/10] npu: Move npu.o and npu-hw-procedures.o under CONFIG_P8 In-Reply-To: <20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-9-npiggin@gmail.com> From: Stewart Smith This saves an extra 6kb of skiboot.lid.xz.
Signed-off-by: Stewart Smith --- hw/Makefile.inc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/Makefile.inc b/hw/Makefile.inc index d436da222..ff207b166 100644 --- a/hw/Makefile.inc +++ b/hw/Makefile.inc @@ -10,9 +10,9 @@ HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o HW_OBJS += ocmb.o xive2.o -HW_OBJS += npu.o npu-hw-procedures.o ifeq ($(CONFIG_P8),1) HW_OBJS += phb3.o +HW_OBJS += npu.o npu-hw-procedures.o endif HW=hw/built-in.a -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:47:00 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:47:00 +1000 Subject: [Skiboot] [PATCH v3 09/10] platforms: put P8 platforms behind CONFIG_P8 In-Reply-To: <20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-10-npiggin@gmail.com> From: Stewart Smith Shaves an additional 4kb off skiboot.lid.xz. Signed-off-by: Stewart Smith --- platforms/astbmc/Makefile.inc | 12 ++++++++---- platforms/ibm-fsp/Makefile.inc | 7 ++++++- 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/platforms/astbmc/Makefile.inc b/platforms/astbmc/Makefile.inc index 070813231..1cdf37f2a 100644 --- a/platforms/astbmc/Makefile.inc +++ b/platforms/astbmc/Makefile.inc @@ -1,13 +1,17 @@ SUBDIRS += $(PLATDIR)/astbmc ASTBMC_OBJS = pnor.o common.o slots.o \ - palmetto.o habanero.o firestone.o \ - p8dtu.o p8dnu.o \ - garrison.o barreleye.o \ witherspoon.o zaius.o romulus.o p9dsu.o \ - vesnin.o nicole.o mihawk.o mowgli.o \ + nicole.o mihawk.o mowgli.o \ talos.o blackbird.o \ swift.o rainier.o +ifeq ($(CONFIG_P8),1) +ASTBMC_OBJS += palmetto.o habanero.o firestone.o \ + p8dtu.o p8dnu.o \ + garrison.o barreleye.o \ + vesnin.o +endif + ASTBMC = $(PLATDIR)/astbmc/built-in.a $(ASTBMC): $(ASTBMC_OBJS:%=$(PLATDIR)/astbmc/%) diff --git a/platforms/ibm-fsp/Makefile.inc b/platforms/ibm-fsp/Makefile.inc index 8883f09c1..fd80a79a9 100644 --- a/platforms/ibm-fsp/Makefile.inc +++ b/platforms/ibm-fsp/Makefile.inc @@ -1,7 +1,12 @@ SUBDIRS += $(PLATDIR)/ibm-fsp IBM_FSP_OBJS = common.o lxvpd.o hostservices.o fsp-vpd.o \ - firenze.o firenze-pci.o zz.o + firenze-pci.o zz.o + +ifeq ($(CONFIG_P8),1) +IBM_FSP_OBJS += firenze.o +endif + IBM_FSP = $(PLATDIR)/ibm-fsp/built-in.a ifeq ($(CONFIG_FSP),1) -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:47:01 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:47:01 +1000 Subject: [Skiboot] [PATCH v3 10/10] npu: Add CONFIG_NPU to optionally skip NPU code In-Reply-To: <20210811054701.861123-1-npiggin@gmail.com> References: <20210811054701.861123-1-npiggin@gmail.com> Message-ID: <20210811054701.861123-11-npiggin@gmail.com> From: Stewart Smith Saves a whopping 39kb of skiboot.lid.xz. 
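A note on the mechanism before the diff: call sites stay free of #ifdefs because the header grows a static inline no-op when the option is off. The real instance is the npu2_i2c_presence_detect() hunk in include/npu2.h below; generically the shape is (a sketch only, the foo_* names are illustrative):

/* Sketch of the compile-out pattern used by CONFIG_NPU below */
#ifdef CONFIG_FOO
void foo_presence_detect(struct foo *f);
#else
static inline void foo_presence_detect(struct foo *f __unused)
{
}
#endif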
Signed-off-by: Stewart Smith --- Makefile | 2 ++ Makefile.main | 4 ++++ core/hmi.c | 10 +++++++++- core/platform.c | 1 - hw/Makefile.inc | 12 +++++++++--- hw/npu2.c | 1 + include/npu2.h | 6 ++++++ include/pci.h | 3 +++ platforms/astbmc/Makefile.inc | 15 +++++++++++---- 9 files changed, 45 insertions(+), 9 deletions(-) diff --git a/Makefile b/Makefile index a9807c4dc..115c97fcd 100644 --- a/Makefile +++ b/Makefile @@ -67,6 +67,8 @@ DEAD_CODE_ELIMINATION ?= 0 CONFIG_FSP?=1 # Try to build without POWER8 support CONFIG_P8?=1 +# Try and build without any NPU support +CONFIG_NPU?=1 # # Where is the source directory, must be a full path (no ~) diff --git a/Makefile.main b/Makefile.main index 2a346a6c9..dce0338da 100644 --- a/Makefile.main +++ b/Makefile.main @@ -165,6 +165,10 @@ ifeq ($(CONFIG_P8),1) CFLAGS += -DCONFIG_P8=1 endif +ifeq ($(CONFIG_NPU),1) +CFLAGS += -DCONFIG_NPU=1 +endif + CFLAGS += $(call try-cflag,$(CC),-Wjump-misses-init) \ $(call try-cflag,$(CC),-Wsuggest-attribute=const) \ $(call try-cflag,$(CC),-Wsuggest-attribute=noreturn) \ diff --git a/core/hmi.c b/core/hmi.c index 55eaa59c6..279f8b8cf 100644 --- a/core/hmi.c +++ b/core/hmi.c @@ -717,6 +717,7 @@ static void find_nx_checkstop_reason(int flat_chip_id, queue_hmi_event(hmi_evt, 0, out_flags); } +#ifdef CONFIG_NPU static bool phb_is_npu2(struct dt_node *dn) { return (dt_node_is_compatible(dn, "ibm,power9-npu-pciex") || @@ -847,7 +848,7 @@ static void find_npu2_checkstop_reason(int flat_chip_id, npu2_hmi_verbose = true; if (npu2_hmi_verbose) { - npu2_dump_scoms(flat_chip_id); + phb->ops->dump_debug_data(flat_chip_id); prlog(PR_ERR, " _________________________ \n"); prlog(PR_ERR, "< It's Debug time! >\n"); prlog(PR_ERR, " ------------------------- \n"); @@ -935,6 +936,13 @@ static void find_npu_checkstop_reason(int flat_chip_id, /* The HMI is "recoverable" because it shouldn't crash the system */ queue_hmi_event(hmi_evt, 1, out_flags); } +#else +static void find_npu_checkstop_reason(int flat_chip_id __unused, + struct OpalHMIEvent *hmi_evt __unused, + uint64_t *out_flags __unused) +{ +} +#endif static void decode_malfunction(struct OpalHMIEvent *hmi_evt, uint64_t *out_flags) { diff --git a/core/platform.c b/core/platform.c index 320fdea03..3f4c8bdd5 100644 --- a/core/platform.c +++ b/core/platform.c @@ -226,7 +226,6 @@ static struct platform generic_platform = { .start_preload_resource = generic_start_preload_resource, .resource_loaded = generic_resource_loaded, .ocapi = &generic_ocapi, - .npu2_device_detect = npu2_i2c_presence_detect, /* Assumes ZZ */ }; const struct bmc_platform *bmc_platform = &generic_bmc; diff --git a/hw/Makefile.inc b/hw/Makefile.inc index ff207b166..627b1a022 100644 --- a/hw/Makefile.inc +++ b/hw/Makefile.inc @@ -5,15 +5,21 @@ HW_OBJS += homer.o slw.o occ.o fsi-master.o centaur.o imc.o HW_OBJS += nx.o nx-rng.o nx-crypto.o nx-compress.o nx-842.o nx-gzip.o HW_OBJS += sfc-ctrl.o fake-rtc.o bt.o p8-i2c.o prd.o HW_OBJS += dts.o lpc-rtc.o xive.o phb4.o -HW_OBJS += fake-nvram.o lpc-mbox.o npu2.o npu2-hw-procedures.o -HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o -HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o +HW_OBJS += fake-nvram.o lpc-mbox.o +ifeq ($(CONFIG_NPU),1) +HW_OBJS += npu2.o npu2-hw-procedures.o +HW_OBJS += npu2-common.o npu2-opencapi.o HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o +endif +HW_OBJS += phys-map.o sbe-p9.o capp.o +HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o HW_OBJS += ocmb.o xive2.o ifeq 
($(CONFIG_P8),1) HW_OBJS += phb3.o ifeq ($(CONFIG_NPU),1) HW_OBJS += npu.o npu-hw-procedures.o endif +endif HW=hw/built-in.a include $(SRC)/hw/fsp/Makefile.inc diff --git a/hw/npu2.c b/hw/npu2.c index cf57eeb0c..e18a1b7b1 100644 --- a/hw/npu2.c +++ b/hw/npu2.c @@ -1316,6 +1316,7 @@ static const struct phb_ops npu_ops = { .set_capi_mode = NULL, .set_capp_recovery = NULL, .tce_kill = npu2_tce_kill, + .dump_debug_data = npu2_dump_scoms, }; static void assign_mmio_bars(uint64_t gcid, uint32_t scom, uint64_t reg[2], uint64_t mm_win[2]) diff --git a/include/npu2.h b/include/npu2.h index eb7c45587..6ab33c702 100644 --- a/include/npu2.h +++ b/include/npu2.h @@ -212,7 +212,13 @@ static inline struct phb *npu2_dev_to_phb(struct npu2_dev *ndev) } } +#ifdef CONFIG_NPU void npu2_i2c_presence_detect(struct npu2 *npu); +#else +static inline void npu2_i2c_presence_detect(struct npu2 *npu __unused) +{ +} +#endif int npu2_opencapi_init_npu(struct npu2 *npu); int npu2_nvlink_init_npu(struct npu2 *npu); void npu2_nvlink_create_phb(struct npu2 *npu, struct dt_node *dn); diff --git a/include/pci.h b/include/pci.h index 05d02171b..c70a507dc 100644 --- a/include/pci.h +++ b/include/pci.h @@ -343,6 +343,9 @@ struct phb_ops { /* Currently only used by NPU HMI code */ void (*set_fence_state)(struct phb *phb, bool fence); + + /* The most terrible of situations, dump debug data to console. */ + void (*dump_debug_data)(int flat_chip_id); }; enum phb_type { diff --git a/platforms/astbmc/Makefile.inc b/platforms/astbmc/Makefile.inc index 1cdf37f2a..be2267d3f 100644 --- a/platforms/astbmc/Makefile.inc +++ b/platforms/astbmc/Makefile.inc @@ -1,16 +1,23 @@ SUBDIRS += $(PLATDIR)/astbmc ASTBMC_OBJS = pnor.o common.o slots.o \ - witherspoon.o zaius.o romulus.o p9dsu.o \ - nicole.o mihawk.o mowgli.o \ + witherspoon.o romulus.o p9dsu.o \ + nicole.o mowgli.o \ talos.o blackbird.o \ - swift.o rainier.o + rainier.o + +ifeq ($(CONFIG_NPU),1) +ASTBMC_OBJS += zaius.o mihawk.o swift.o +endif ifeq ($(CONFIG_P8),1) ASTBMC_OBJS += palmetto.o habanero.o firestone.o \ p8dtu.o p8dnu.o \ - garrison.o barreleye.o \ + barreleye.o \ vesnin.o +ifeq ($(CONFIG_NPU),1) +ASTBMC_OBJS += garrison.o +endif endif ASTBMC = $(PLATDIR)/astbmc/built-in.a -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:48:45 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:48:45 +1000 Subject: [Skiboot] [PATCH v1 0/6] idle synchronisation improvements Message-ID: <20210811054851.861482-1-npiggin@gmail.com> This patch series makes the locking and synchronisation of the idle code simpler and more correct. It fixes the occasional "cpu_idle_p9 called while pm disabled" message.
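The shape of the new synchronisation, condensed from patch 1 (a sketch of the code there, with the IPI kick, SMT priority hints and P8 details omitted): idlers register under a lock, and a CPU changing idle settings first raises a flag, kicks sleepers, and waits for the idle count to drain:

static void enter_idle(void)
{
	for (;;) {
		lock(&idle_lock);
		if (!reconfigure_idle) {
			nr_cpus_idle++;
			break;
		}
		unlock(&idle_lock);
		while (reconfigure_idle)	/* wait out the reconfigurer */
			cpu_relax();
	}
	unlock(&idle_lock);
}

static void reconfigure_idle_start(void)
{
	/* (the real code also serialises concurrent reconfigurers) */
	lock(&idle_lock);
	reconfigure_idle = true;	/* new idlers now bounce off */
	unlock(&idle_lock);
	sync();		/* order the flag vs. cpu->in_sleep loads */
	/* ...IPI any sleeping CPUs here, then wait for idlers to drain... */
	while (nr_cpus_idle != 0)
		cpu_relax();
}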
Thanks, Nick Nicholas Piggin (6): core/cpu: rewrite idle synchronisation core/cpu: remove POWER8 IPI loop core/cpu: refactor IPI sending core/cpu: move cpu_wake out of job_lock core/cpu: make cpu idle states simpler core/cpu: move sleep/wake synchronisation out from low level code core/cpu.c | 405 +++++++++++++++++++++++++------------------------- include/cpu.h | 3 +- 2 files changed, 202 insertions(+), 206 deletions(-) -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:48:46 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:48:46 +1000 Subject: [Skiboot] [PATCH v1 1/6] core/cpu: rewrite idle synchronisation In-Reply-To: <20210811054851.861482-1-npiggin@gmail.com> References: <20210811054851.861482-1-npiggin@gmail.com> Message-ID: <20210811054851.861482-2-npiggin@gmail.com> Idle reconfiguration is somewhat racy and not obviously correct, with pm_enabled changing while in low level idle routines, which can result in messages like "cpu_idle_p9 called pm disabled". This changes CPU idle synchronisation to always kick all other CPUs out of idle code first whenever idle settings (pm_enabled, IPIs, sreset etc) are to be changed. Signed-off-by: Nicholas Piggin --- core/cpu.c | 243 +++++++++++++++++++++++++++++++---------------------- 1 file changed, 143 insertions(+), 100 deletions(-) diff --git a/core/cpu.c b/core/cpu.c index d4d33b836..0dc013b5b 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -38,6 +38,7 @@ static struct lock reinit_lock = LOCK_UNLOCKED; static bool radix_supported; static unsigned long hid0_hile; static unsigned long hid0_attn; +static bool reconfigure_idle = false; static bool sreset_enabled; static bool ipi_enabled; static bool pm_enabled; @@ -391,7 +392,7 @@ static unsigned int cpu_idle_p8(enum cpu_wake_cause wake_on) sync(); /* Check for jobs again */ - if (cpu_check_jobs(cpu) || !pm_enabled) + if (cpu_check_jobs(cpu) || reconfigure_idle) goto skip_sleep; /* Setup wakup cause in LPCR: EE (for IPI) */ @@ -406,7 +407,7 @@ static unsigned int cpu_idle_p8(enum cpu_wake_cause wake_on) sync(); /* Check if PM got disabled */ - if (!pm_enabled) + if (reconfigure_idle) goto skip_sleep; /* EE and DEC */ @@ -447,7 +448,7 @@ static unsigned int cpu_idle_p9(enum cpu_wake_cause wake_on) sync(); /* Check for jobs again */ - if (cpu_check_jobs(cpu) || !pm_enabled) + if (cpu_check_jobs(cpu) || reconfigure_idle) goto skip_sleep; /* HV DBELL for IPI */ @@ -460,7 +461,7 @@ static unsigned int cpu_idle_p9(enum cpu_wake_cause wake_on) sync(); /* Check if PM got disabled */ - if (!pm_enabled) + if (reconfigure_idle) goto skip_sleep; /* HV DBELL and DEC */ @@ -534,70 +535,66 @@ static void cpu_idle_pm(enum cpu_wake_cause wake_on) } } -void cpu_idle_job(void) +static struct lock idle_lock = LOCK_UNLOCKED; +static int nr_cpus_idle = 0; + +static void enter_idle(void) { - if (pm_enabled) { - cpu_idle_pm(cpu_wake_on_job); - } else { - struct cpu_thread *cpu = this_cpu(); + for (;;) { + lock(&idle_lock); + if (!reconfigure_idle) { + nr_cpus_idle++; + break; + } + unlock(&idle_lock); + /* Another CPU is reconfiguring idle */ smt_lowest(); - /* Check for jobs again */ - while (!cpu_check_jobs(cpu)) { - if (pm_enabled) - break; + while (reconfigure_idle) cpu_relax(); - barrier(); - } smt_medium(); } + + unlock(&idle_lock); } -void cpu_idle_delay(unsigned long delay) +static void exit_idle(void) { - unsigned long now = mftb(); - unsigned long end = now + delay; - unsigned long min_pm = usecs_to_tb(10); - - if (pm_enabled && delay > min_pm) { -pm: - for (;;) { - if (delay >= 
0x7fffffff) - delay = 0x7fffffff; - mtspr(SPR_DEC, delay); + lock(&idle_lock); + assert(nr_cpus_idle > 0); + nr_cpus_idle--; + unlock(&idle_lock); +} - cpu_idle_pm(cpu_wake_on_dec); +static void reconfigure_idle_start(void) +{ + struct cpu_thread *cpu; - now = mftb(); - if (tb_compare(now, end) == TB_AAFTERB) - break; - delay = end - now; - if (!(pm_enabled && delay > min_pm)) - goto no_pm; + for (;;) { + lock(&idle_lock); + if (!reconfigure_idle) { + reconfigure_idle = true; + break; } - } else { -no_pm: + unlock(&idle_lock); + + /* Someone else is reconfiguring */ smt_lowest(); - for (;;) { - now = mftb(); - if (tb_compare(now, end) == TB_AAFTERB) - break; - delay = end - now; - if (pm_enabled && delay > min_pm) { - smt_medium(); - goto pm; - } - } + while (reconfigure_idle) + cpu_relax(); smt_medium(); } -} -static void cpu_pm_disable(void) -{ - struct cpu_thread *cpu; - unsigned int timeout; + unlock(&idle_lock); - pm_enabled = false; + /* + * Now kick everyone out of idle. + */ + + /* + * Order earlier store to reconfigure_idle=true vs load from + * cpu->in_sleep and cpu->in_idle. + */ sync(); if (proc_gen == proc_gen_p8) { @@ -612,22 +609,88 @@ static void cpu_pm_disable(void) if (cpu->in_sleep || cpu->in_idle) p9_dbell_send(cpu->pir); } + } - /* This code is racy with cpus entering idle, late ones miss the dbell */ + smt_lowest(); + while (nr_cpus_idle != 0) + cpu_relax(); + smt_medium(); - smt_lowest(); - for_each_available_cpu(cpu) { - timeout = 0x08000000; - while ((cpu->in_sleep || cpu->in_idle) && --timeout) - barrier(); - if (!timeout) { - prlog(PR_DEBUG, "cpu_pm_disable TIMEOUT on cpu 0x%04x to exit idle\n", - cpu->pir); - p9_dbell_send(cpu->pir); - } + /* + * Order load of nr_cpus_idle with loads of data the idle CPUs + * might have previously stored to before coming out of idle. + */ + lwsync(); +} + +static void reconfigure_idle_end(void) +{ + assert(reconfigure_idle); + lock(&idle_lock); + reconfigure_idle = false; + unlock(&idle_lock); +} + +void cpu_idle_job(void) +{ + struct cpu_thread *cpu = this_cpu(); + + do { + enter_idle(); + + if (pm_enabled) { + cpu_idle_pm(cpu_wake_on_job); + } else { + smt_lowest(); + for (;;) { + if (cpu_check_jobs(cpu)) + break; + if (reconfigure_idle) + break; + cpu_relax(); + } + smt_medium(); } - smt_medium(); - } + + exit_idle(); + + } while (!cpu_check_jobs(cpu)); +} + +void cpu_idle_delay(unsigned long delay) +{ + unsigned long now = mftb(); + unsigned long end = now + delay; + unsigned long min_pm = usecs_to_tb(10); + + do { + enter_idle(); + + delay = end - now; + + if (pm_enabled && delay > min_pm) { + if (delay >= 0x7fffffff) + delay = 0x7fffffff; + mtspr(SPR_DEC, delay); + + cpu_idle_pm(cpu_wake_on_dec); + } else { + smt_lowest(); + for (;;) { + if (tb_compare(mftb(), end) == TB_AAFTERB) + break; + if (reconfigure_idle) + break; + cpu_relax(); + } + smt_medium(); + } + + exit_idle(); + + now = mftb(); + + } while (tb_compare(now, end) != TB_AAFTERB); } void cpu_set_sreset_enable(bool enabled) @@ -639,28 +702,16 @@ void cpu_set_sreset_enable(bool enabled) /* Public P8 Mambo has broken NAP */ if (chip_quirk(QUIRK_MAMBO_CALLOUTS)) return; + } - sreset_enabled = enabled; - sync(); + reconfigure_idle_start(); - if (!enabled) { - cpu_pm_disable(); - } else { - if (ipi_enabled) - pm_enabled = true; - } + sreset_enabled = enabled; - } else if (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) { - sreset_enabled = enabled; - sync(); - /* - * Kick everybody out of PM so they can adjust the PM - * mode they are using (EC=0/1). 
- */ - cpu_pm_disable(); - if (ipi_enabled) - pm_enabled = true; - } + if (proc_gen == proc_gen_p8) + pm_enabled = ipi_enabled && sreset_enabled; + + reconfigure_idle_end(); } void cpu_set_ipi_enable(bool enabled) @@ -668,24 +719,16 @@ void cpu_set_ipi_enable(bool enabled) if (ipi_enabled == enabled) return; - if (proc_gen == proc_gen_p8) { - ipi_enabled = enabled; - sync(); - if (!enabled) { - cpu_pm_disable(); - } else { - if (sreset_enabled) - pm_enabled = true; - } + reconfigure_idle_start(); - } else if (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) { - ipi_enabled = enabled; - sync(); - if (!enabled) - cpu_pm_disable(); - else - pm_enabled = true; - } + ipi_enabled = enabled; + + if (proc_gen == proc_gen_p8) + pm_enabled = ipi_enabled && sreset_enabled; + else + pm_enabled = ipi_enabled; + + reconfigure_idle_end(); } void cpu_process_local_jobs(void) -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:48:47 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:48:47 +1000 Subject: [Skiboot] [PATCH v1 2/6] core/cpu: remove POWER8 IPI loop In-Reply-To: <20210811054851.861482-1-npiggin@gmail.com> References: <20210811054851.861482-1-npiggin@gmail.com> Message-ID: <20210811054851.861482-3-npiggin@gmail.com> POWER8 should not have to loop sending IPIs until the destination wakes up. One should be enough. Signed-off-by: Nicholas Piggin --- core/cpu.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/core/cpu.c b/core/cpu.c index 0dc013b5b..940b02ce4 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -599,10 +599,8 @@ static void reconfigure_idle_start(void) if (proc_gen == proc_gen_p8) { for_each_available_cpu(cpu) { - while (cpu->in_sleep || cpu->in_idle) { + if (cpu->in_sleep || cpu->in_idle) icp_kick_cpu(cpu); - cpu_relax(); - } } } else if (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) { for_each_available_cpu(cpu) { -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:48:48 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:48:48 +1000 Subject: [Skiboot] [PATCH v1 3/6] core/cpu: refactor IPI sending In-Reply-To: <20210811054851.861482-1-npiggin@gmail.com> References: <20210811054851.861482-1-npiggin@gmail.com> Message-ID: <20210811054851.861482-4-npiggin@gmail.com> Pull the IPI sending code into its own function where it is used in two places. cpu_wake() already checks in_idle, so its caller does not need to check pm_enabled. Signed-off-by: Nicholas Piggin --- core/cpu.c | 31 +++++++++++++------------------ 1 file changed, 13 insertions(+), 18 deletions(-) diff --git a/core/cpu.c b/core/cpu.c index 940b02ce4..d77ab7c93 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -90,13 +90,8 @@ void __nomcount cpu_relax(void) barrier(); } -static void cpu_wake(struct cpu_thread *cpu) +static void cpu_send_ipi(struct cpu_thread *cpu) { - /* Is it idle ? If not, no need to wake */ - sync(); - if (!cpu->in_idle) - return; - if (proc_gen == proc_gen_p8) { /* Poke IPI */ icp_kick_cpu(cpu); @@ -105,6 +100,14 @@ static void cpu_wake(struct cpu_thread *cpu) } } +static void cpu_wake(struct cpu_thread *cpu) +{ + /* Is it idle ? If not, no need to wake */ + sync(); + if (cpu->in_idle) + cpu_send_ipi(cpu); +} + /* * If chip_id is >= 0, schedule the job on that node. * Otherwise schedule the job anywhere. 
@@ -189,8 +192,7 @@ static void queue_job_on_cpu(struct cpu_thread *cpu, struct cpu_job *job) cpu->job_has_no_return = true; else cpu->job_count++; - if (pm_enabled) - cpu_wake(cpu); + cpu_wake(cpu); unlock(&cpu->job_lock); } @@ -597,16 +599,9 @@ static void reconfigure_idle_start(void) */ sync(); - if (proc_gen == proc_gen_p8) { - for_each_available_cpu(cpu) { - if (cpu->in_sleep || cpu->in_idle) - icp_kick_cpu(cpu); - } - } else if (proc_gen == proc_gen_p9 || proc_gen == proc_gen_p10) { - for_each_available_cpu(cpu) { - if (cpu->in_sleep || cpu->in_idle) - p9_dbell_send(cpu->pir); - } + for_each_available_cpu(cpu) { + if (cpu->in_sleep || cpu->in_idle) + cpu_send_ipi(cpu); } smt_lowest(); -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:48:49 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:48:49 +1000 Subject: [Skiboot] [PATCH v1 4/6] core/cpu: move cpu_wake out of job_lock In-Reply-To: <20210811054851.861482-1-npiggin@gmail.com> References: <20210811054851.861482-1-npiggin@gmail.com> Message-ID: <20210811054851.861482-5-npiggin@gmail.com> There is no need to keep the IPI initiation under the job_lock. If the target does wake after the job is queued and before we can send the IPI, it will check for new jobs. Signed-off-by: Nicholas Piggin --- core/cpu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/core/cpu.c b/core/cpu.c index d77ab7c93..0b05d28c6 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -192,8 +192,9 @@ static void queue_job_on_cpu(struct cpu_thread *cpu, struct cpu_job *job) cpu->job_has_no_return = true; else cpu->job_count++; - cpu_wake(cpu); unlock(&cpu->job_lock); + + cpu_wake(cpu); } struct cpu_job *__cpu_queue_job(struct cpu_thread *cpu, -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:48:50 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:48:50 +1000 Subject: [Skiboot] [PATCH v1 5/6] core/cpu: make cpu idle states simpler In-Reply-To: <20210811054851.861482-1-npiggin@gmail.com> References: <20210811054851.861482-1-npiggin@gmail.com> Message-ID: <20210811054851.861482-6-npiggin@gmail.com> in_idle is true for any kind of idle. This is not used anywhere except for state assertions, but it could be used to remove the idle_lock and global counter if that becomes a significant cost. in_sleep is true for sleep that requires an IPI to wake up. in_job_sleep is true for sleep that wants an IPI sent after a job is queued, and implies in_sleep. Signed-off-by: Nicholas Piggin --- core/cpu.c | 26 +++++++++++++++++++------- include/cpu.h | 3 ++- 2 files changed, 21 insertions(+), 8 deletions(-) diff --git a/core/cpu.c b/core/cpu.c index 0b05d28c6..2cafa4301 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -104,7 +104,7 @@ static void cpu_wake(struct cpu_thread *cpu) { /* Is it idle ? 
If not, no need to wake */ sync(); - if (cpu->in_idle) + if (cpu->in_job_sleep) cpu_send_ipi(cpu); } @@ -391,7 +391,8 @@ static unsigned int cpu_idle_p8(enum cpu_wake_cause wake_on) /* Synchronize with wakers */ if (wake_on == cpu_wake_on_job) { /* Mark ourselves in idle so other CPUs know to send an IPI */ - cpu->in_idle = true; + cpu->in_sleep = true; + cpu->in_job_sleep = true; sync(); /* Check for jobs again */ @@ -425,8 +426,8 @@ skip_sleep: /* Restore */ sync(); - cpu->in_idle = false; cpu->in_sleep = false; + cpu->in_job_sleep = false; reset_cpu_icp(); return vec; @@ -447,7 +448,8 @@ static unsigned int cpu_idle_p9(enum cpu_wake_cause wake_on) /* Synchronize with wakers */ if (wake_on == cpu_wake_on_job) { /* Mark ourselves in idle so other CPUs know to send an IPI */ - cpu->in_idle = true; + cpu->in_sleep = true; + cpu->in_job_sleep = true; sync(); /* Check for jobs again */ @@ -493,8 +495,8 @@ skip_sleep: /* Restore */ sync(); - cpu->in_idle = false; cpu->in_sleep = false; + cpu->in_job_sleep = false; return vec; } @@ -543,10 +545,15 @@ static int nr_cpus_idle = 0; static void enter_idle(void) { + struct cpu_thread *cpu = this_cpu(); + + assert(!cpu->in_idle); + for (;;) { lock(&idle_lock); if (!reconfigure_idle) { nr_cpus_idle++; + cpu->in_idle = true; break; } unlock(&idle_lock); @@ -563,9 +570,14 @@ static void enter_idle(void) static void exit_idle(void) { + struct cpu_thread *cpu = this_cpu(); + + assert(cpu->in_idle); + lock(&idle_lock); assert(nr_cpus_idle > 0); nr_cpus_idle--; + cpu->in_idle = false; unlock(&idle_lock); } @@ -596,12 +608,12 @@ static void reconfigure_idle_start(void) /* * Order earlier store to reconfigure_idle=true vs load from - * cpu->in_sleep and cpu->in_idle. + * cpu->in_sleep. */ sync(); for_each_available_cpu(cpu) { - if (cpu->in_sleep || cpu->in_idle) + if (cpu->in_sleep) cpu_send_ipi(cpu); } diff --git a/include/cpu.h b/include/cpu.h index b0c78ce62..1be5cb0d4 100644 --- a/include/cpu.h +++ b/include/cpu.h @@ -60,8 +60,9 @@ struct cpu_thread { bool in_poller; bool in_reinit; bool in_fast_sleep; - bool in_sleep; bool in_idle; + bool in_sleep; + bool in_job_sleep; uint32_t hbrt_spec_wakeup; /* primary only */ uint64_t save_l2_fir_action1; uint64_t current_token; -- 2.23.0 From npiggin at gmail.com Wed Aug 11 15:48:51 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 11 Aug 2021 15:48:51 +1000 Subject: [Skiboot] [PATCH v1 6/6] core/cpu: move sleep/wake synchronisation out from low level code In-Reply-To: <20210811054851.861482-1-npiggin@gmail.com> References: <20210811054851.861482-1-npiggin@gmail.com> Message-ID: <20210811054851.861482-7-npiggin@gmail.com> The sleep/wake synchronisation involves the waker setting a wake condition then testing if the target needs to be woken, versus the sleeper setting a wake-required flag then testing the wake condition. The low level sleep state call comes after that. This patch moves the synchronisation out from the low level sleep functions and consolidates both copies into one place.
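Schematically, the two sides of the handshake being consolidated look like this (a sketch of the protocol, not the patch text; queue_job() stands in for the real job queueing in queue_job_on_cpu()):

/* Waker: set the wake condition, then check if an IPI is needed */
	queue_job(cpu);			/* or: reconfigure_idle = true */
	sync();				/* order the store vs. the load below */
	if (cpu->in_job_sleep)		/* (in_sleep for reconfiguration) */
		cpu_send_ipi(cpu);

/* Sleeper: flag ourselves, then re-check the wake conditions */
	cpu->in_sleep = true;
	cpu->in_job_sleep = true;
	sync();				/* order the store vs. the loads below */
	if (cpu_check_jobs(cpu) || reconfigure_idle)
		goto skip_sleep;	/* a waker raced in; don't sleep */
	/* ...enter the low level sleep state... */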
Signed-off-by: Nicholas Piggin --- core/cpu.c | 138 ++++++++++++++++------------------------------------- 1 file changed, 42 insertions(+), 96 deletions(-) diff --git a/core/cpu.c b/core/cpu.c index 2cafa4301..98f5a7202 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -377,57 +377,21 @@ enum cpu_wake_cause { static unsigned int cpu_idle_p8(enum cpu_wake_cause wake_on) { uint64_t lpcr = mfspr(SPR_LPCR) & ~SPR_LPCR_P8_PECE; - struct cpu_thread *cpu = this_cpu(); - unsigned int vec = 0; - - if (!pm_enabled) { - prlog_once(PR_DEBUG, "cpu_idle_p8 called pm disabled\n"); - return vec; - } + unsigned int vec; /* Clean up ICP, be ready for IPIs */ icp_prep_for_pm(); - /* Synchronize with wakers */ - if (wake_on == cpu_wake_on_job) { - /* Mark ourselves in idle so other CPUs know to send an IPI */ - cpu->in_sleep = true; - cpu->in_job_sleep = true; - sync(); - - /* Check for jobs again */ - if (cpu_check_jobs(cpu) || reconfigure_idle) - goto skip_sleep; - - /* Setup wakup cause in LPCR: EE (for IPI) */ - lpcr |= SPR_LPCR_P8_PECE2; - mtspr(SPR_LPCR, lpcr); - - } else { - /* Mark outselves sleeping so cpu_set_pm_enable knows to - * send an IPI - */ - cpu->in_sleep = true; - sync(); - - /* Check if PM got disabled */ - if (reconfigure_idle) - goto skip_sleep; - - /* EE and DEC */ - lpcr |= SPR_LPCR_P8_PECE2 | SPR_LPCR_P8_PECE3; - mtspr(SPR_LPCR, lpcr); - } + /* Setup wakup cause in LPCR: EE (for IPI) */ + lpcr |= SPR_LPCR_P8_PECE2; + if (wake_on == cpu_wake_on_dec) + lpcr |= SPR_LPCR_P8_PECE3; /* DEC */ + mtspr(SPR_LPCR, lpcr); isync(); /* Enter nap */ vec = enter_p8_pm_state(false); -skip_sleep: - /* Restore */ - sync(); - cpu->in_sleep = false; - cpu->in_job_sleep = false; reset_cpu_icp(); return vec; @@ -437,42 +401,11 @@ static unsigned int cpu_idle_p9(enum cpu_wake_cause wake_on) { uint64_t lpcr = mfspr(SPR_LPCR) & ~SPR_LPCR_P9_PECE; uint64_t psscr; - struct cpu_thread *cpu = this_cpu(); - unsigned int vec = 0; - - if (!pm_enabled) { - prlog(PR_DEBUG, "cpu_idle_p9 called on cpu 0x%04x with pm disabled\n", cpu->pir); - return vec; - } - - /* Synchronize with wakers */ - if (wake_on == cpu_wake_on_job) { - /* Mark ourselves in idle so other CPUs know to send an IPI */ - cpu->in_sleep = true; - cpu->in_job_sleep = true; - sync(); - - /* Check for jobs again */ - if (cpu_check_jobs(cpu) || reconfigure_idle) - goto skip_sleep; - - /* HV DBELL for IPI */ - lpcr |= SPR_LPCR_P9_PECEL1; - } else { - /* Mark outselves sleeping so cpu_set_pm_enable knows to - * send an IPI - */ - cpu->in_sleep = true; - sync(); - - /* Check if PM got disabled */ - if (reconfigure_idle) - goto skip_sleep; - - /* HV DBELL and DEC */ - lpcr |= SPR_LPCR_P9_PECEL1 | SPR_LPCR_P9_PECEL3; - } + unsigned int vec; + lpcr |= SPR_LPCR_P9_PECEL1; /* HV DBELL for IPI */ + if (wake_on == cpu_wake_on_dec) + lpcr |= SPR_LPCR_P9_PECEL3; /* DEC */ mtspr(SPR_LPCR, lpcr); isync(); @@ -487,39 +420,46 @@ static unsigned int cpu_idle_p9(enum cpu_wake_cause wake_on) /* PSSCR SD=0 ESL=0 EC=0 PSSL=0 TR=3 MTL=0 RL=1 */ psscr = PPC_BITMASK(54, 55) | PPC_BIT(63); enter_p9_pm_lite_state(psscr); + vec = 0; } /* Clear doorbell */ p9_dbell_receive(); - skip_sleep: - /* Restore */ - sync(); - cpu->in_sleep = false; - cpu->in_job_sleep = false; - return vec; } static void cpu_idle_pm(enum cpu_wake_cause wake_on) { + struct cpu_thread *cpu = this_cpu(); unsigned int vec; - switch(proc_gen) { - case proc_gen_p8: + if (!pm_enabled) { + prlog_once(PR_DEBUG, "cpu_idle_pm called pm disabled\n"); + return; + } + + /* + * Mark ourselves in sleep so other CPUs know to send 
an IPI, + * then re-check the wake conditions. This is ordered against + * queue_job_on_cpu() and reconfigure_idle_start() which first + * set the wake conditions (either queue a job or set + * reconfigure_idle = true), issue a sync(), then test if the + * target is in_sleep / in_job_sleep. + */ + cpu->in_sleep = true; + if (wake_on == cpu_wake_on_job) + cpu->in_job_sleep = true; + sync(); + if (reconfigure_idle) + goto skip_sleep; + if (wake_on == cpu_wake_on_job && cpu_check_jobs(cpu)) + goto skip_sleep; + + if (proc_gen == proc_gen_p8) vec = cpu_idle_p8(wake_on); - break; - case proc_gen_p9: - vec = cpu_idle_p9(wake_on); - break; - case proc_gen_p10: + else vec = cpu_idle_p9(wake_on); - break; - default: - vec = 0; - prlog_once(PR_DEBUG, "cpu_idle_pm called with bad processor type\n"); - break; - } if (vec == 0x100) { unsigned long srr1 = mfspr(SPR_SRR1); @@ -538,6 +478,12 @@ static void cpu_idle_pm(enum cpu_wake_cause wake_on) enable_machine_check(); mtmsrd(MSR_RI, 1); } + +skip_sleep: + sync(); + cpu->in_sleep = false; + if (wake_on == cpu_wake_on_job) + cpu->in_job_sleep = false; } static struct lock idle_lock = LOCK_UNLOCKED; -- 2.23.0 From nnac123 at gmail.com Thu Aug 12 01:02:31 2021 From: nnac123 at gmail.com (Nick Child) Date: Wed, 11 Aug 2021 11:02:31 -0400 Subject: [Skiboot] [PATCH] secvar: Free md context on hash error Message-ID: <20210811150231.31690-1-nick.child@ibm.com> There were a few instances in `get_hash_to_verify` where NULL is returned without freeing the md context. This commit ensures that this memory is properly freed before returning. Signed-off-by: Nick Child --- libstb/secvar/backend/edk2-compat-process.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libstb/secvar/backend/edk2-compat-process.c b/libstb/secvar/backend/edk2-compat-process.c index bd7a0abb..770c3706 100644 --- a/libstb/secvar/backend/edk2-compat-process.c +++ b/libstb/secvar/backend/edk2-compat-process.c @@ -643,7 +643,7 @@ static char *get_hash_to_verify(const char *key, const char *new_data, || key_equals(key, "dbx")) guid = EFI_IMAGE_SECURITY_DATABASE_GUID; else - return NULL; + goto out; /* Expand char name to wide character width */ varlen = strlen(key) * 2; @@ -672,7 +672,7 @@ static char *get_hash_to_verify(const char *key, const char *new_data, hash = zalloc(32); if (!hash) - return NULL; + goto out; rc = mbedtls_md_finish(&ctx, hash); if (rc) { free(hash); -- 2.25.1 From npiggin at gmail.com Thu Aug 12 14:04:55 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Thu, 12 Aug 2021 14:04:55 +1000 Subject: [Skiboot] [PATCH v1] Virtual Memory for OPAL boot Message-ID: <20210812040455.929979-1-npiggin@gmail.com> I'd like to re-float this idea. This patch is rebased on the hwprobe and idle changes just posted, and has been tested on POWER9 witherspoon and P9/10 mambo. There are two reasons to do this. The first is to run most of the boot sequence with virtual memory on, which can help catch bugs. The second is that by doing this we build up a list of logical virtual memory extents. With more patches, that list can be used by the OS to build up a virtual memory environment to run the runtime OPAL services in, which improves performance and helps protect the firmware and the OS from one another.
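The conversion below is largely mechanical once the two usage patterns are clear: short-lived per-CPU mappings bracketing an access, and named global mappings that populate the logical extent list. Roughly (shapes taken from the hunks that follow, trimmed for illustration):

	/* Transient per-CPU mapping around a single access */
	t = vm_map((unsigned long)KERNEL_LOAD_BASE, KERNEL_LOAD_SIZE, true);
	memset(t, 0, KERNEL_LOAD_SIZE);
	vm_unmap((unsigned long)t, KERNEL_LOAD_SIZE);

	/* Named global mapping, recorded as a logical VM extent */
	vm_map_global("KERNEL", (unsigned long)KERNEL_LOAD_BASE,
		      KERNEL_LOAD_SIZE, true, false);
	/* ...load/verify the kernel... */
	vm_unmap_global((unsigned long)KERNEL_LOAD_BASE, KERNEL_LOAD_SIZE);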
Signed-off-by: Nicholas Piggin --- core/Makefile.inc | 2 +- core/cpu.c | 24 +- core/exceptions.c | 68 ++- core/fast-reboot.c | 14 +- core/hwprobe.c | 43 +- core/init.c | 165 ++++++-- core/mem_region.c | 145 +++++-- core/opal.c | 45 +- core/platform.c | 15 +- core/vm.c | 957 +++++++++++++++++++++++++++++++++++++++++++ hdata/spira.c | 35 +- hw/fake-nvram.c | 12 +- hw/homer.c | 15 +- hw/lpc-uart.c | 32 +- hw/lpc.c | 6 + hw/phb4.c | 9 +- hw/psi.c | 20 +- hw/xive.c | 10 + hw/xive2.c | 10 + hw/xscom.c | 4 + include/cmpxchg.h | 3 + include/cpu.h | 27 ++ include/elf-abi.h | 21 +- include/io.h | 119 ++++-- include/mem_region.h | 1 + include/platform.h | 4 +- include/processor.h | 13 +- include/skiboot.h | 31 +- libstb/container.c | 12 +- libstb/cvc.c | 3 + libstb/secureboot.c | 5 +- libstb/trustedboot.c | 6 +- skiboot.lds.S | 26 +- 33 files changed, 1720 insertions(+), 182 deletions(-) create mode 100644 core/vm.c diff --git a/core/Makefile.inc b/core/Makefile.inc index f80019b6a..efeb165c9 100644 --- a/core/Makefile.inc +++ b/core/Makefile.inc @@ -3,7 +3,7 @@ # -*-Makefile-*- SUBDIRS += core -CORE_OBJS = relocate.o console.o stack.o init.o chip.o mem_region.o +CORE_OBJS = relocate.o console.o stack.o init.o chip.o mem_region.o vm.o CORE_OBJS += malloc.o lock.o cpu.o utils.o fdt.o opal.o interrupts.o timebase.o CORE_OBJS += opal-msg.o pci.o pci-virt.o pci-slot.o pcie-slot.o CORE_OBJS += pci-opal.o fast-reboot.o device.o exceptions.o trace.o affinity.o diff --git a/core/cpu.c b/core/cpu.c index 98f5a7202..371c283da 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -376,6 +376,7 @@ enum cpu_wake_cause { static unsigned int cpu_idle_p8(enum cpu_wake_cause wake_on) { + struct cpu_thread *cpu = this_cpu(); uint64_t lpcr = mfspr(SPR_LPCR) & ~SPR_LPCR_P8_PECE; unsigned int vec; @@ -389,6 +390,10 @@ static unsigned int cpu_idle_p8(enum cpu_wake_cause wake_on) mtspr(SPR_LPCR, lpcr); isync(); + /* P8 must enter nap with VM disabled */ + if (cpu->vm_setup) + vm_exit(); + /* Enter nap */ vec = enter_p8_pm_state(false); @@ -399,6 +404,7 @@ static unsigned int cpu_idle_p8(enum cpu_wake_cause wake_on) static unsigned int cpu_idle_p9(enum cpu_wake_cause wake_on) { + struct cpu_thread *cpu = this_cpu(); uint64_t lpcr = mfspr(SPR_LPCR) & ~SPR_LPCR_P9_PECE; uint64_t psscr; unsigned int vec; @@ -414,11 +420,19 @@ static unsigned int cpu_idle_p9(enum cpu_wake_cause wake_on) /* PSSCR SD=0 ESL=1 EC=1 PSSL=0 TR=3 MTL=0 RL=1 */ psscr = PPC_BIT(42) | PPC_BIT(43) | PPC_BITMASK(54, 55) | PPC_BIT(63); + /* + * stop with EC=1 wakes with vm off. P9 can stop with vm + * enabled, but it's simpler to disable now and so it wakes + * in the proper state. + */ + if (cpu->vm_setup) + vm_exit(); vec = enter_p9_pm_state(psscr); } else { /* stop with EC=0 (resumes) which does not require sreset. 
*/ /* PSSCR SD=0 ESL=0 EC=0 PSSL=0 TR=3 MTL=0 RL=1 */ psscr = PPC_BITMASK(54, 55) | PPC_BIT(63); + /* Can run with VM enabled */ enter_p9_pm_lite_state(psscr); vec = 0; } @@ -433,6 +447,7 @@ static void cpu_idle_pm(enum cpu_wake_cause wake_on) { struct cpu_thread *cpu = this_cpu(); unsigned int vec; + bool was_vm_setup = this_cpu()->vm_setup; if (!pm_enabled) { prlog_once(PR_DEBUG, "cpu_idle_pm called pm disabled\n"); @@ -471,12 +486,17 @@ static void cpu_idle_pm(enum cpu_wake_cause wake_on) default: break; } - mtmsrd(MSR_RI, 1); } else if (vec == 0x200) { exception_entry_pm_mce(); enable_machine_check(); + } + + if (vec != 0) { + /* 0x100 or 0x200 */ mtmsrd(MSR_RI, 1); + if (was_vm_setup) + vm_enter(); } skip_sleep: @@ -1441,7 +1461,7 @@ static int64_t opal_return_cpu(void) printf("OPAL in_opal_call=%u\n", this_cpu()->in_opal_call); } - __secondary_cpu_entry(); + __return_cpu_entry(); return OPAL_HARDWARE; /* Should not happen */ } diff --git a/core/exceptions.c b/core/exceptions.c index 389548d16..35c14f8af 100644 --- a/core/exceptions.c +++ b/core/exceptions.c @@ -33,7 +33,7 @@ static void dump_regs(struct stack_frame *stack) #define EXCEPTION_MAX_STR 320 -static void handle_mce(struct stack_frame *stack, uint64_t nip, uint64_t msr, bool *fatal) +static void handle_mce(struct stack_frame *stack, uint64_t nip, uint64_t msr, bool *fatal, bool *vm_setup) { uint64_t mce_flags, mce_addr; const char *mce_err; @@ -44,12 +44,28 @@ static void handle_mce(struct stack_frame *stack, uint64_t nip, uint64_t msr, bo decode_mce(stack->srr0, stack->srr1, stack->dsisr, stack->dar, &mce_flags, &mce_err, &mce_addr); - /* Try to recover. */ - if (mce_flags & MCE_ERAT_ERROR) { - /* Real-mode still uses ERAT, flush transient bitflips */ + /* Try to recover */ + if ((mce_flags & (MCE_SLB_ERROR|MCE_TABLE_WALK)) && + (msr & (MSR_IR|MSR_DR)) && + !this_cpu()->vm_local_map_inuse) { + /* Try to turn off VM if non-linear map is not in use. */ + *vm_setup = false; + stack->srr1 &= ~(MSR_IR|MSR_DR); + mce_fix = "Disabling virtual memory"; + + } else if (mce_flags & MCE_ERAT_ERROR) { flush_erat(); mce_fix = "ERAT flush"; + } else if (mce_flags & MCE_TLB_ERROR) { + cleanup_global_tlb(); + mce_fix = "global TLB flush"; + + } else if (mce_flags & MCE_TLB_ERROR) { + cleanup_global_tlb(); + stack->srr0 += 4; + mce_fix = "global TLB flush and skip instruction"; + } else { *fatal = true; } @@ -83,6 +99,8 @@ static void handle_mce(struct stack_frame *stack, uint64_t nip, uint64_t msr, bo void exception_entry(struct stack_frame *stack) { + struct cpu_thread *c = this_cpu(); + bool vm_setup = c->vm_setup; bool fatal = false; bool hv; uint64_t nip; @@ -90,6 +108,8 @@ void exception_entry(struct stack_frame *stack) char buf[EXCEPTION_MAX_STR]; size_t l; + c->vm_setup = false; + switch (stack->type) { case 0x500: case 0x980: @@ -134,9 +154,44 @@ void exception_entry(struct stack_frame *stack) break; case 0x200: - handle_mce(stack, nip, msr, &fatal); + handle_mce(stack, nip, msr, &fatal, &vm_setup); goto no_symbol; + case 0x300: + if (vm_dsi(nip, stack->dar, stack->dsisr)) + goto out; + fatal = true; + l += snprintf(buf + l, EXCEPTION_MAX_STR - l, + "Fatal %s address "REG" at "REG" ", + (stack->dsisr & DSISR_ISSTORE) ? 
"store" : "load", + stack->dar, nip); + break; + + case 0x380: + if (vm_dslb(nip, stack->dar)) + goto out; + fatal = true; + l += snprintf(buf + l, EXCEPTION_MAX_STR - l, + "Fatal load/store address "REG" at "REG" ", + stack->dar, nip); + break; + + case 0x400: + if (vm_isi(nip)) + goto out; + fatal = true; + l += snprintf(buf + l, EXCEPTION_MAX_STR - l, + "Fatal ifetch at "REG" ", nip); + break; + + case 0x480: + if (vm_islb(nip)) + goto out; + fatal = true; + l += snprintf(buf + l, EXCEPTION_MAX_STR - l, + "Fatal ifetch at "REG" ", nip); + break; + case 0x700: { struct trap_table_entry *tte; @@ -185,11 +240,14 @@ no_symbol: for (;;) ; } +out: + assert(!fatal); if (hv) { /* Set up for SRR return */ stack->srr0 = nip; stack->srr1 = msr; } + c->vm_setup = vm_setup; } void exception_entry_pm_sreset(void) diff --git a/core/fast-reboot.c b/core/fast-reboot.c index 2696348af..42a768b53 100644 --- a/core/fast-reboot.c +++ b/core/fast-reboot.c @@ -395,6 +395,9 @@ void __noreturn fast_reboot_entry(void) cpu_set_sreset_enable(true); cpu_set_ipi_enable(true); + /* Enter virtual memory mode */ + vm_init(true); + prlog(PR_INFO, "RESET: Releasing secondaries...\n"); /* Release everybody */ @@ -415,6 +418,7 @@ void __noreturn fast_reboot_entry(void) fast_boot_release = false; if (!chip_quirk(QUIRK_MAMBO_CALLOUTS)) { + void *t; /* * mem_region_clear_unused avoids these preload regions * so it can run along side image preloading. Clear these @@ -424,8 +428,14 @@ void __noreturn fast_reboot_entry(void) * Mambo may have embedded payload here, so don't clear * it at all. */ - memset(KERNEL_LOAD_BASE, 0, KERNEL_LOAD_SIZE); - memset(INITRAMFS_LOAD_BASE, 0, INITRAMFS_LOAD_SIZE); + + t = vm_map((unsigned long)KERNEL_LOAD_BASE, KERNEL_LOAD_SIZE, true); + memset(t, 0, KERNEL_LOAD_SIZE); + vm_unmap((unsigned long)t, KERNEL_LOAD_SIZE); + + t = vm_map((unsigned long)INITRAMFS_LOAD_BASE, INITRAMFS_LOAD_SIZE, true); + memset(t, 0, INITRAMFS_LOAD_SIZE); + vm_unmap((unsigned long)t, INITRAMFS_LOAD_SIZE); } /* Start preloading kernel and ramdisk */ diff --git a/core/hwprobe.c b/core/hwprobe.c index 0a641ada5..6e62a42d8 100644 --- a/core/hwprobe.c +++ b/core/hwprobe.c @@ -7,9 +7,9 @@ static bool hwprobe_deps_satisfied(const struct hwprobe *hwp) { - struct hwprobe *hwprobe; + struct hwprobe *s = __hwprobes_start; + struct hwprobe *e = __hwprobes_end; const char **dep; - unsigned int i; dep = hwp->deps; if (dep == NULL) @@ -19,11 +19,12 @@ static bool hwprobe_deps_satisfied(const struct hwprobe *hwp) prlog(PR_TRACE, "Checking deps for %s\n", hwp->name); while (*dep != NULL) { + struct hwprobe *h; + prlog(PR_TRACE, "Checking %s dep %s\n", hwp->name, *dep); - hwprobe = &__hwprobes_start; - for (i = 0; &hwprobe[i] < &__hwprobes_end; i++) { - if(strcmp(hwprobe[i].name, *dep) == 0 && - !hwprobe[i].probed) + + for (h = s; h < e; h++) { + if (!strcmp(h->name, *dep) && !h->probed) return false; } dep++; @@ -36,28 +37,34 @@ static bool hwprobe_deps_satisfied(const struct hwprobe *hwp) void probe_hardware(void) { - struct hwprobe *hwprobe; - unsigned int i; + struct hwprobe *s = __hwprobes_start; + struct hwprobe *e = __hwprobes_end; + size_t size = (void *)e - (void *)s; bool work_todo = true; bool did_something = true; + vm_map_global("HWProbe table", (unsigned long)s, size, true, false); + while (work_todo) { + struct hwprobe *h; + work_todo = false; did_something = false; - hwprobe = &__hwprobes_start; + prlog(PR_DEBUG, "Begin loop\n"); - for (i = 0; &hwprobe[i] < &__hwprobes_end; i++) { - if (hwprobe[i].probed) + + for (h = s; h 
< e; h++) { + if (h->probed) continue; - if (hwprobe_deps_satisfied(&hwprobe[i])) { - prlog(PR_DEBUG, "Probing %s...\n", hwprobe[i].name); - if (hwprobe[i].probe) - hwprobe[i].probe(); + if (hwprobe_deps_satisfied(h)) { + prlog(PR_DEBUG, "Probing %s...\n", h->name); + if (h->probe) + h->probe(); did_something = true; - hwprobe[i].probed = true; + h->probed = true; } else { prlog(PR_DEBUG, "Dependencies for %s not yet satisfied, skipping\n", - hwprobe[i].name); + h->name); work_todo = true; } } @@ -67,4 +74,6 @@ void probe_hardware(void) break; } } + + vm_unmap_global((unsigned long)s, size); } diff --git a/core/init.c b/core/init.c index 0ec5d6ac3..5cc96853d 100644 --- a/core/init.c +++ b/core/init.c @@ -95,6 +95,7 @@ static bool try_load_elf64_le(struct elf_hdr *header) uint64_t load_base = (uint64_t)kh; struct elf64le_phdr *ph; unsigned int i; + bool ret = false; printf("INIT: 64-bit LE kernel discovered\n"); @@ -106,6 +107,9 @@ static bool try_load_elf64_le(struct elf_hdr *header) * but it will not work for any ELF binary. */ ph = (struct elf64le_phdr *)(load_base + le64_to_cpu(kh->e_phoff)); + vm_map_global("KERNEL ELF Program Headers", (unsigned long)ph, + le16_to_cpu(kh->e_phnum)*sizeof(struct elf64le_phdr), + false, false); for (i = 0; i < le16_to_cpu(kh->e_phnum); i++, ph++) { if (le32_to_cpu(ph->p_type) != ELF_PTYPE_LOAD) continue; @@ -122,7 +126,7 @@ static bool try_load_elf64_le(struct elf_hdr *header) if (!kernel_entry) { prerror("INIT: Failed to find kernel entry !\n"); - return false; + goto out_unmap; } kernel_entry += load_base; kernel_32bit = false; @@ -134,7 +138,12 @@ static bool try_load_elf64_le(struct elf_hdr *header) prlog(PR_DEBUG, "INIT: 64-bit kernel entry at 0x%llx, size 0x%lx\n", kernel_entry, kernel_size); - return true; + ret = true; + +out_unmap: + vm_unmap_global((unsigned long)ph, le16_to_cpu(kh->e_phnum)*sizeof(struct elf64le_phdr)); + + return ret; } static bool try_load_elf64(struct elf_hdr *header) @@ -145,12 +154,17 @@ static bool try_load_elf64(struct elf_hdr *header) struct elf64be_phdr *ph; struct elf64be_shdr *sh; unsigned int i; + bool ret = false; + + vm_map_global("KERNEL ELF64 Header", (unsigned long)header, + sizeof(struct elf64be_hdr), false, false); /* Check it's a ppc64 LE ELF */ if (khle->ei_ident == ELF_IDENT && khle->ei_data == ELF_DATA_LSB && le16_to_cpu(khle->e_machine) == ELF_MACH_PPC64) { - return try_load_elf64_le(header); + ret = try_load_elf64_le(header); + goto out_unmap1; } /* Check it's a ppc64 ELF */ @@ -158,7 +172,7 @@ static bool try_load_elf64(struct elf_hdr *header) kh->ei_data != ELF_DATA_MSB || be16_to_cpu(kh->e_machine) != ELF_MACH_PPC64) { prerror("INIT: Kernel doesn't look like an ppc64 ELF\n"); - return false; + goto out_unmap1; } /* Look for a loadable program header that has our entry in it @@ -169,6 +183,8 @@ static bool try_load_elf64(struct elf_hdr *header) * but it will not work for any ELF binary. 
*/ ph = (struct elf64be_phdr *)(load_base + be64_to_cpu(kh->e_phoff)); + vm_map_global("KERNEL ELF Program Headers", (unsigned long)ph, + be16_to_cpu(kh->e_phnum)*sizeof(struct elf64be_phdr), false, false); for (i = 0; i < be16_to_cpu(kh->e_phnum); i++, ph++) { if (be32_to_cpu(ph->p_type) != ELF_PTYPE_LOAD) continue; @@ -185,7 +201,7 @@ static bool try_load_elf64(struct elf_hdr *header) if (!kernel_entry) { prerror("INIT: Failed to find kernel entry !\n"); - return false; + goto out_unmap2; } /* For the normal big-endian ELF ABI, the kernel entry points @@ -195,6 +211,8 @@ static bool try_load_elf64(struct elf_hdr *header) * to assuming it obeys the ABI. */ sh = (struct elf64be_shdr *)(load_base + be64_to_cpu(kh->e_shoff)); + vm_map_global("KERNEL ELF Section Headers", (unsigned long)sh, + be16_to_cpu(kh->e_shnum)*sizeof(struct elf64be_shdr), false, false); for (i = 0; i < be16_to_cpu(kh->e_shnum); i++, sh++) { if (be64_to_cpu(sh->sh_addr) <= be64_to_cpu(kh->e_entry) && (be64_to_cpu(sh->sh_addr) + be64_to_cpu(sh->sh_size)) > @@ -219,7 +237,15 @@ static bool try_load_elf64(struct elf_hdr *header) printf("INIT: 64-bit kernel entry at 0x%llx, size 0x%lx\n", kernel_entry, kernel_size); - return true; + ret = true; + + vm_unmap_global((unsigned long)sh, be16_to_cpu(kh->e_shnum)*sizeof(struct elf64be_shdr)); +out_unmap2: + vm_unmap_global((unsigned long)ph, be16_to_cpu(kh->e_phnum)*sizeof(struct elf64be_phdr)); +out_unmap1: + vm_unmap_global((unsigned long)header, sizeof(struct elf64be_hdr)); + + return ret; } static bool try_load_elf32_le(struct elf_hdr *header) @@ -335,6 +361,7 @@ bool start_preload_kernel(void) int loaded; /* Try to load an external kernel payload through the platform hooks */ + vm_map_global("KERNEL", (unsigned long)KERNEL_LOAD_BASE, KERNEL_LOAD_SIZE, true, false); kernel_size = KERNEL_LOAD_SIZE; loaded = start_preload_resource(RESOURCE_ID_KERNEL, RESOURCE_SUBID_NONE, @@ -343,9 +370,11 @@ bool start_preload_kernel(void) if (loaded != OPAL_SUCCESS) { printf("INIT: platform start load kernel failed\n"); kernel_size = 0; + vm_unmap_global((unsigned long)KERNEL_LOAD_BASE, KERNEL_LOAD_SIZE); return false; } + vm_map_global("INITRAMFS", (unsigned long)INITRAMFS_LOAD_BASE, INITRAMFS_LOAD_SIZE, true, false); initramfs_size = INITRAMFS_LOAD_SIZE; loaded = start_preload_resource(RESOURCE_ID_INITRAMFS, RESOURCE_SUBID_NONE, @@ -353,6 +382,7 @@ bool start_preload_kernel(void) if (loaded != OPAL_SUCCESS) { printf("INIT: platform start load initramfs failed\n"); initramfs_size = 0; + vm_unmap_global((unsigned long)INITRAMFS_LOAD_BASE, INITRAMFS_LOAD_SIZE); return false; } @@ -362,13 +392,16 @@ bool start_preload_kernel(void) static bool load_kernel(void) { void *stb_container = NULL; - struct elf_hdr *kh; + struct elf_hdr *kh, *t; + uint32_t ei_ident; + uint8_t ei_class; int loaded; prlog(PR_NOTICE, "INIT: Waiting for kernel...\n"); loaded = wait_for_resource_loaded(RESOURCE_ID_KERNEL, RESOURCE_SUBID_NONE); + vm_unmap_global((unsigned long)KERNEL_LOAD_BASE, KERNEL_LOAD_SIZE); if (loaded != OPAL_SUCCESS) { printf("INIT: platform wait for kernel load failed\n"); @@ -384,8 +417,10 @@ static bool load_kernel(void) ((uint64_t)__builtin_kernel_start) - SKIBOOT_BASE + boot_offset; printf("Using built-in kernel\n"); + vm_map_global("KERNEL", (unsigned long)KERNEL_LOAD_BASE, kernel_size, true, false); memmove(KERNEL_LOAD_BASE, (void*)builtin_base, kernel_size); + vm_unmap_global((unsigned long)KERNEL_LOAD_BASE, kernel_size); } } @@ -401,7 +436,7 @@ static bool load_kernel(void) if (kernel_entry < 
EXCEPTION_VECTORS_END) { cpu_set_sreset_enable(false); memcpy_null(NULL, old_vectors, EXCEPTION_VECTORS_END); - sync_icache(); + sync_icache(0); } else { /* Hack for STB in Mambo, assume at least 4kb in mem */ if (!kernel_size) @@ -432,15 +467,20 @@ static bool load_kernel(void) "INIT: Kernel loaded, size: %zu bytes (0 = unknown preload)\n", kernel_size); - if (kh->ei_ident != ELF_IDENT) { + t = vm_map((unsigned long)kh, sizeof(*kh), false); + ei_ident = t->ei_ident; + ei_class = t->ei_class; + vm_unmap((unsigned long)t, sizeof(*kh)); + + if (ei_ident != ELF_IDENT) { prerror("INIT: ELF header not found. Assuming raw binary.\n"); return true; } - if (kh->ei_class == ELF_CLASS_64) { + if (ei_class == ELF_CLASS_64) { if (!try_load_elf64(kh)) return false; - } else if (kh->ei_class == ELF_CLASS_32) { + } else if (ei_class == ELF_CLASS_32) { if (!try_load_elf32(kh)) return false; } else { @@ -468,7 +508,7 @@ static void load_initramfs(void) loaded = wait_for_resource_loaded(RESOURCE_ID_INITRAMFS, RESOURCE_SUBID_NONE); - + vm_unmap_global((unsigned long)INITRAMFS_LOAD_BASE, INITRAMFS_LOAD_SIZE); if (loaded != OPAL_SUCCESS || !initramfs_size) return; @@ -540,6 +580,7 @@ void __noreturn load_and_boot_kernel(bool is_reboot) const struct dt_property *memprop; const char *cmdline, *stdoutp; uint64_t mem_top; + uint32_t *t; memprop = dt_find_property(dt_root, DT_PRIVATE "maxmem"); if (memprop) @@ -614,11 +655,13 @@ void __noreturn load_and_boot_kernel(bool is_reboot) fdt_set_boot_cpuid_phys(fdt, this_cpu()->pir); + t = vm_map(kernel_entry, 4, false); /* Check there is something there before we branch to it */ - if (*(uint32_t *)kernel_entry == 0) { + if (*t == 0) { prlog(PR_EMERG, "FATAL: Kernel is zeros, can't execute!\n"); assert(0); } + vm_unmap(kernel_entry, 4); if (platform.exit) platform.exit(); @@ -630,7 +673,10 @@ void __noreturn load_and_boot_kernel(bool is_reboot) printf("INIT: Starting kernel at 0x%llx, fdt at %p %u bytes\n", kernel_entry, fdt, fdt_totalsize(fdt)); - /* Disable machine checks on all */ + /* Go back to realmode and tear down our VM before booting kernel */ + vm_destroy(); + + /* Disable machine checks, RI on all */ cpu_disable_ME_RI_all(); patch_traps(false); @@ -840,37 +886,60 @@ static void setup_branch_null_catcher(void) void copy_sreset_vector(void) { + static char patch[0x100]; uint32_t *src, *dst; + uint32_t *t; + uint32_t len = (void *)&reset_patch_end - (void *)&reset_patch_start; /* Copy the reset code over the entry point. */ src = &reset_patch_start; + t = vm_map((unsigned long)src, len, false); + memcpy(patch, t, len); + vm_unmap((unsigned long)src, len); + dst = (uint32_t *)0x100; - while(src < &reset_patch_end) - *(dst++) = *(src++); - sync_icache(); + t = vm_map((unsigned long)dst, len, true); + memcpy(t, patch, len); + sync_icache((unsigned long)t); + vm_unmap((unsigned long)dst, len); } void copy_sreset_vector_fast_reboot(void) { + static char patch[0x100]; uint32_t *src, *dst; + uint32_t *t; + uint32_t len = (void *)&reset_fast_reboot_patch_end - + (void *)&reset_fast_reboot_patch_start; /* Copy the reset code over the entry point. 
*/ src = &reset_fast_reboot_patch_start; + t = vm_map((unsigned long)src, len, false); + memcpy(patch, t, len); + vm_unmap((unsigned long)src, len); + dst = (uint32_t *)0x100; - while(src < &reset_fast_reboot_patch_end) - *(dst++) = *(src++); - sync_icache(); + t = vm_map((unsigned long)dst, len, true); + memcpy(t, patch, len); + sync_icache((unsigned long)t); + vm_unmap((unsigned long)dst, len); } void copy_exception_vectors(void) { + void *t; + + t = vm_map(0x0, EXCEPTION_VECTORS_END, true); + /* Copy from 0x100 to EXCEPTION_VECTORS_END, avoid below 0x100 as * this is the boot flag used by CPUs still potentially entering * skiboot. */ - memcpy((void *)0x100, (void *)(SKIBOOT_BASE + 0x100), + memcpy(t + 0x100, (void *)(SKIBOOT_BASE + 0x100), EXCEPTION_VECTORS_END - 0x100); - sync_icache(); + + sync_icache((unsigned long)t); + vm_unmap(0x0, EXCEPTION_VECTORS_END); } /* @@ -884,15 +953,16 @@ void patch_traps(bool enable) for (tte = __trap_table_start; tte < __trap_table_end; tte++) { uint32_t *insn; - insn = (uint32_t *)tte->address; + insn = vm_map(tte->address, sizeof(uint32_t), true); if (enable) { *insn = PPC_INST_TRAP; } else { *insn = PPC_INST_NOP; } + sync_icache((unsigned long)insn); + vm_unmap(tte->address, sizeof(uint32_t)); } - sync_icache(); } static void per_thread_sanity_checks(void) @@ -942,19 +1012,22 @@ void pci_nvram_init(void) static uint32_t mem_csum(void *_p, void *_e) { size_t len = _e - _p; - uint32_t *p = _p; + uint32_t *t; uint32_t v1 = 0, v2 = 0; uint32_t csum; unsigned int i; + t = vm_map((unsigned long)_p, len, false); + for (i = 0; i < len; i += 4) { - uint32_t v = *p++; + uint32_t v = *t++; v1 += v; v2 += v1; } - csum = v1 ^ v2; + vm_unmap((unsigned long)_p, len); + return csum; } @@ -968,6 +1041,8 @@ static void checksum_romem(void) if (chip_quirk(QUIRK_SLOW_SIM)) return; + /* Called in real mode */ + csum = mem_csum(_start, _head_end); romem_csum ^= csum; @@ -1091,7 +1166,7 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) prlog(PR_DEBUG, "initial console log level: memory %d, driver %d\n", (debug_descriptor.console_log_levels >> 4), (debug_descriptor.console_log_levels & 0x0f)); - prlog(PR_TRACE, "OPAL is Powered By Linked-List Technology.\n"); + prlog(PR_TRACE, "OPAL is Powered By Linked-List Technology. Now with more indirection.\n"); #ifdef SKIBOOT_GCOV skiboot_gcov_done(); @@ -1103,6 +1178,9 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) /* Now locks can be used */ init_locks(); + /* Enter virtual memory mode */ + vm_init(false); + /* Create the OPAL call table early on, entries can be overridden * later on (FSP console code for example) */ @@ -1128,7 +1206,20 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) if (parse_hdat(false) < 0) abort(); } else { + void *t; + uint32_t size; + + t = vm_map((unsigned long)fdt, sizeof(struct fdt_header), false); + size = fdt_totalsize(t); + vm_unmap((unsigned long)fdt, sizeof(struct fdt_header)); + + /* + * Would be nice to make this a local map, but it seems + * to need to be expanded in place. 
+ */ + vm_map_global("fdt", (unsigned long)fdt, size, false, false); dt_expand(fdt); + vm_unmap_global((unsigned long)fdt, size); } dt_add_cpufeatures(dt_root); @@ -1179,6 +1270,8 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) */ init_cpu_max_pir(); + vm_init_stacks(); + /* * Now, we init our memory map from the device-tree, and immediately * reserve areas which we know might contain data coming from @@ -1415,7 +1508,7 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) load_and_boot_kernel(false); } -void __noreturn __secondary_cpu_entry(void) +static void __noreturn __cpu_entry(bool init_vm) { struct cpu_thread *cpu = this_cpu(); @@ -1425,12 +1518,16 @@ void __noreturn __secondary_cpu_entry(void) enable_machine_check(); mtmsrd(MSR_RI, 1); + if (init_vm) + vm_init_secondary(); + /* Some XIVE setup */ if (proc_gen == proc_gen_p9) xive_cpu_callin(cpu); else if (proc_gen == proc_gen_p10) xive2_cpu_callin(cpu); + /* Wait for work to do */ while(true) { if (cpu_check_jobs(cpu)) @@ -1440,6 +1537,16 @@ void __noreturn __secondary_cpu_entry(void) } } +void __noreturn __secondary_cpu_entry(void) +{ + __cpu_entry(true); +} + +void __noreturn __return_cpu_entry(void) +{ + __cpu_entry(false); +} + /* Called from head.S, thus no prototype. */ void __noreturn __nomcount secondary_cpu_entry(void); diff --git a/core/mem_region.c b/core/mem_region.c index 36de2d094..69f24d630 100644 --- a/core/mem_region.c +++ b/core/mem_region.c @@ -25,7 +25,7 @@ #define POISON_MEM_REGION 0 #endif #define POISON_MEM_REGION_WITH 0x99 -#define POISON_MEM_REGION_LIMIT 1*1024*1024*1024 +#define POISON_MEM_REGION_LIMIT (128*1024*1024 - PAGE_SIZE) /* Locking: The mem_region_lock protects the regions list from concurrent * updates. Additions to, or removals from, the region list must be done @@ -57,24 +57,27 @@ static struct mem_region skiboot_os_reserve = { .type = REGION_OS, }; -struct mem_region skiboot_heap = { - .name = "ibm,firmware-heap", - .start = HEAP_BASE, - .len = HEAP_SIZE, - .type = REGION_SKIBOOT_HEAP, -}; - static struct mem_region skiboot_code_and_text = { .name = "ibm,firmware-code", .start = SKIBOOT_BASE, .len = HEAP_BASE - SKIBOOT_BASE, + .vm_mapped_len = HEAP_BASE - SKIBOOT_BASE, .type = REGION_SKIBOOT_FIRMWARE, }; +struct mem_region skiboot_heap = { + .name = "ibm,firmware-heap", + .start = HEAP_BASE, + .len = HEAP_SIZE, + .vm_mapped_len = HEAP_SIZE, + .type = REGION_SKIBOOT_HEAP, +}; + static struct mem_region skiboot_after_heap = { .name = "ibm,firmware-data", .start = HEAP_BASE + HEAP_SIZE, .len = SKIBOOT_BASE + SKIBOOT_SIZE - (HEAP_BASE + HEAP_SIZE), + .vm_mapped_len = SKIBOOT_BASE + SKIBOOT_SIZE - (HEAP_BASE + HEAP_SIZE), .type = REGION_SKIBOOT_FIRMWARE, }; @@ -141,17 +144,40 @@ static struct alloc_hdr *next_hdr(const struct mem_region *region, return next; } +static unsigned long vm_map_limit(const struct mem_region *region, + const struct alloc_hdr *hdr, + unsigned long size) +{ + unsigned long end = region->start + region->len; + unsigned long limit; + + assert((unsigned long)hdr >= region->start); + + limit = (unsigned long)hdr + size; + assert(limit <= end); + + if (limit + sizeof(struct free_hdr) <= end) + limit += sizeof(struct free_hdr); + + return limit - region->start; +} + #if POISON_MEM_REGION == 1 static void mem_poison(struct free_hdr *f) { - size_t poison_size = (void*)tailer(f) - (void*)(f+1); + unsigned long start = (unsigned long)(f + 1); + unsigned long *t = tailer(f); + size_t poison_size = (unsigned long)t - start; + void *mem; /* We only poison up to a 
limit, as otherwise boot is * kinda slow */ if (poison_size > POISON_MEM_REGION_LIMIT) poison_size = POISON_MEM_REGION_LIMIT; - memset(f+1, POISON_MEM_REGION_WITH, poison_size); + mem = vm_map(start, poison_size, true); + memset(mem, POISON_MEM_REGION_WITH, poison_size); + vm_unmap(start, poison_size); } #endif @@ -159,14 +185,36 @@ static void mem_poison(struct free_hdr *f) static void init_allocatable_region(struct mem_region *region) { struct free_hdr *f = region_start(region); + unsigned long num_longs; + unsigned long *t; + assert(region->type == REGION_SKIBOOT_HEAP || region->type == REGION_MEMORY); - f->hdr.num_longs = region->len / sizeof(long); + + num_longs = region->len / sizeof(long); + + assert(PAGE_SIZE >= sizeof(*f)); + assert(region->len >= PAGE_SIZE*2); + + list_head_init(®ion->free_list); + + if (!region->vm_mapped_len) { + /* SKIBOOT_BASE-SIZE regions already come mapped */ + vm_map_global(region->name, region->start, sizeof(struct free_hdr), true, false); + region->vm_mapped_len = sizeof(struct free_hdr); + } else { + assert(region == &skiboot_heap); + } + + f->hdr.num_longs = num_longs; f->hdr.free = true; f->hdr.prev_free = false; - *tailer(f) = f->hdr.num_longs; - list_head_init(®ion->free_list); list_add(®ion->free_list, &f->list); + + t = vm_map((unsigned long)tailer(f), sizeof(long), true); + *t = num_longs; + vm_unmap((unsigned long)tailer(f), sizeof(long)); + #if POISON_MEM_REGION == 1 mem_poison(f); #endif @@ -176,6 +224,9 @@ static void make_free(struct mem_region *region, struct free_hdr *f, const char *location, bool skip_poison) { struct alloc_hdr *next; + unsigned long *t; + unsigned long new_end; + unsigned long new_sz; #if POISON_MEM_REGION == 1 if (!skip_poison) @@ -202,20 +253,33 @@ static void make_free(struct mem_region *region, struct free_hdr *f, list_add(®ion->free_list, &f->list); } - /* Fix up tailer. */ - *tailer(f) = f->hdr.num_longs; - - /* If next is free, coalesce it */ + /* If next is free coalesce it, else mark us as free. */ next = next_hdr(region, &f->hdr); if (next) { - next->prev_free = true; if (next->free) { struct free_hdr *next_free = (void *)next; list_del_from(®ion->free_list, &next_free->list); - /* Maximum of one level of recursion */ - make_free(region, next_free, location, true); + f->hdr.num_longs += next_free->hdr.num_longs; + } else { + assert(!next->prev_free); + next->prev_free = true; + goto no_unmap; } } + + /* Freed to the end, may have to trim mapping */ + new_end = (unsigned long)f + sizeof(struct free_hdr); + new_sz = new_end - region->start; + if (region != &skiboot_heap && new_sz < region->vm_mapped_len) { + vm_unmap_global(new_end, region->vm_mapped_len - new_sz); + region->vm_mapped_len = new_sz; + } + +no_unmap: + /* Fix up tailer. */ + t = vm_map((unsigned long)tailer(f), sizeof(long), true); + *t = f->hdr.num_longs; + vm_unmap((unsigned long)tailer(f), sizeof(long)); } /* Can we fit this many longs with this alignment in this free block? */ @@ -253,11 +317,12 @@ static void discard_excess(struct mem_region *region, post->hdr.num_longs = hdr->num_longs - alloc_longs; post->hdr.prev_free = false; + /* No coalescing required. */ + make_free(region, post, location, skip_poison); + /* Trim our block. */ hdr->num_longs = alloc_longs; - /* This coalesces as required. 
*/ - make_free(region, post, location, skip_poison); } } @@ -445,6 +510,18 @@ found: if (next) { assert(next->prev_free); next->prev_free = false; + } else { + unsigned long new_sz; + + /* Took from the end, may have to expand mapping */ + new_sz = vm_map_limit(region, &f->hdr, (alloc_longs + offset) * sizeof(long)); + if (new_sz > region->vm_mapped_len) { + assert(region != &skiboot_heap); + vm_map_global(region->name, + region->start + region->vm_mapped_len, + new_sz - region->vm_mapped_len, true, false); + region->vm_mapped_len = new_sz; + } } if (offset != 0) { @@ -536,6 +613,7 @@ bool mem_resize(struct mem_region *region, void *mem, size_t len, { struct alloc_hdr *hdr, *next; struct free_hdr *f; + unsigned long new_sz; /* This should be a constant. */ assert(is_rodata(location)); @@ -566,6 +644,15 @@ bool mem_resize(struct mem_region *region, void *mem, size_t len, if (!next || !next->free || hdr->num_longs + next->num_longs < len) return false; + new_sz = vm_map_limit(region, hdr, len * sizeof(long)); + if (new_sz > region->vm_mapped_len) { + assert(region != &skiboot_heap); + vm_map_global(region->name, + region->start + region->vm_mapped_len, + new_sz - region->vm_mapped_len, true, false); + region->vm_mapped_len = new_sz; + } + /* OK, it's free and big enough, absorb it. */ f = (struct free_hdr *)next; list_del_from(®ion->free_list, &f->list); @@ -691,6 +778,7 @@ static struct mem_region *new_region(const char *name, region->name = name; region->start = start; region->len = len; + region->vm_mapped_len = 0; region->node = node; region->type = type; region->free_list.n.next = NULL; @@ -1199,6 +1287,7 @@ void mem_region_release_unused(void) continue; used_len = allocated_length(r); + assert(used_len <= r->vm_mapped_len); prlog(PR_INFO, " %s: %llu/%llu used\n", r->name, (long long)used_len, (long long)r->len); @@ -1227,6 +1316,10 @@ void mem_region_release_unused(void) } list_add(®ions, &for_linux->list); } + if (r->vm_mapped_len > used_len) { + vm_unmap_global(r->start + used_len, r->vm_mapped_len - used_len); + r->vm_mapped_len = used_len; + } } unlock(&mem_region_lock); } @@ -1271,9 +1364,13 @@ static void mem_clear_range(uint64_t s, uint64_t e) return; } - prlog(PR_DEBUG, "Clearing region %llx-%llx\n", - (long long)s, (long long)e); + /* + * Large clear thrashes the small hash table, with parallel clearing + * this can livelock. Clear in real mode. + */ + vm_exit(); memset((void *)s, 0, e - s); + vm_enter(); } struct mem_region_clear_job_args { diff --git a/core/opal.c b/core/opal.c index 2898a45ce..c976fcf33 100644 --- a/core/opal.c +++ b/core/opal.c @@ -44,19 +44,39 @@ static uint64_t opal_dynamic_events; extern uint32_t attn_trigger; extern uint32_t hir_trigger; +void __opal_register(uint64_t token, void *func, unsigned int nargs) +{ + uint64_t f; + uint64_t *t; + u8 *a; + + assert(token <= OPAL_LAST); + + f = function_entry_address(func); + + t = vm_map((unsigned long)&opal_branch_table[token], sizeof(*t), true); + *t = f; + vm_unmap((unsigned long)&opal_branch_table[token], sizeof(*t)); + + a = vm_map((unsigned long)&opal_num_args[token], sizeof(*a), true); + *a = nargs; + vm_unmap((unsigned long)&opal_num_args[token], sizeof(*a)); +} void opal_table_init(void) { struct opal_table_entry *s = __opal_table_start; struct opal_table_entry *e = __opal_table_end; + struct opal_table_entry *te; + size_t size = (void *)e - (void *)s; prlog(PR_DEBUG, "OPAL table: %p .. 
%p, branch table: %p\n", s, e, opal_branch_table); - while(s < e) { - ((uint64_t *)opal_branch_table)[s->token] = function_entry_address(s->func); - ((u8 *)opal_num_args)[s->token] = s->nargs; - s++; - } + + vm_map_global("OPAL table", (unsigned long)s, size, false, false); + for (te = s; te < e; te++) + __opal_register(te->token, te->func, te->nargs); + vm_unmap_global((unsigned long)s, size); } /* Called from head.S, thus no prototype */ @@ -317,14 +337,6 @@ int64_t opal_quiesce(uint32_t quiesce_type, int32_t cpu_target) } opal_call(OPAL_QUIESCE, opal_quiesce, 2); -void __opal_register(uint64_t token, void *func, unsigned int nargs) -{ - assert(token <= OPAL_LAST); - - ((uint64_t *)opal_branch_table)[token] = function_entry_address(func); - ((u8 *)opal_num_args)[token] = nargs; -} - /* * add_opal_firmware_exports_node: adds properties to the device-tree which * the OS will then change into sysfs nodes. @@ -617,12 +629,17 @@ void opal_run_pollers(void) static int64_t opal_poll_events(__be64 *outstanding_event_mask) { + uint32_t *t, a; if (!opal_addr_valid(outstanding_event_mask)) return OPAL_PARAMETER; /* Check if we need to trigger an attn for test use */ - if (attn_trigger == 0xdeadbeef) { + t = vm_map((unsigned long)&attn_trigger, sizeof(attn_trigger), false); + a = *t; + vm_unmap((unsigned long)&attn_trigger, sizeof(attn_trigger)); + + if (a == 0xdeadbeef) { prlog(PR_EMERG, "Triggering attn\n"); assert(false); } diff --git a/core/platform.c b/core/platform.c index 3f4c8bdd5..89a8bc944 100644 --- a/core/platform.c +++ b/core/platform.c @@ -242,8 +242,10 @@ void set_bmc_platform(const struct bmc_platform *bmc) void probe_platform(void) { - struct platform *platforms = &__platforms_start; - unsigned int i; + struct platform *s = __platforms_start; + struct platform *e = __platforms_end; + struct platform *p; + size_t size = (void *)e - (void *)s; /* Detect Manufacturing mode */ if (dt_find_property(dt_root, "ibm,manufacturing-mode")) { @@ -257,12 +259,15 @@ void probe_platform(void) manufacturing_mode = true; } - for (i = 0; &platforms[i] < &__platforms_end; i++) { - if (platforms[i].probe && platforms[i].probe()) { - platform = platforms[i]; + vm_map_global("Platform table", (unsigned long)s, size, false, false); + for (p = s; p < e; p++) { + if (p->probe && p->probe()) { + platform = *p; break; } } + vm_unmap_global((unsigned long)s, size); + if (!platform.name) { platform = generic_platform; if (platform.probe) diff --git a/core/vm.c b/core/vm.c new file mode 100644 index 000000000..e98cba78a --- /dev/null +++ b/core/vm.c @@ -0,0 +1,957 @@ +// SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later +/* + * Copyright 2018-2021 IBM Corp. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static bool using_vm = false; +static bool boot_vm_setup = false; +static bool vm_globals_allocated = false; + +#define SLB_SZ (256UL*1024*1024) +#define SLB_NR 32 +#define LOCAL_SLB_NR 2 +#define GLOBAL_SLB_NR (SLB_NR - LOCAL_SLB_NR) +#define LOCAL_SLB_BASE GLOBAL_SLB_NR + +#define LOCAL_EA_PERCPU (SLB_SZ) +#define LOCAL_EA_BEGIN 0x0008000000000000ULL +#define LOCAL_EA_END 0x0009000000000000ULL + +static void __nomcount slb_install(unsigned long esid, unsigned long vsid, unsigned int index) +{ + unsigned long rs; + unsigned long rb; + + rs = vsid << (63-51); /* 256MB VSID */ + rs |= 1UL << (63-53); /* Kp = 1 */ + if (PAGE_SIZE == 0x10000) { + rs |= 1UL << (63-55); /* L = 1 */ + rs |= 1UL << (63-59); /* LP = 01 */ + } + + rb = esid << (63-35); /* 256MB ESID */ + rb |= 1UL << (63-36); /* V = 1 */ + rb |= index; + + asm volatile("slbmte %0,%1" : : "r"(rs), "r"(rb) : "memory"); +} + +#if 0 +static void slb_remove(unsigned long esid) +{ + asm volatile("isync ; slbie %0 ; isync" : : "r"(esid << 28) : "memory"); +} +#endif + +static void slb_remove_all(void) +{ + asm volatile("isync ; slbmte %0,%0 ; slbia ; isync" : : "r"(0) : "memory"); +} + +static void __nomcount slb_add(unsigned long ea) +{ + struct cpu_thread *cpu = this_cpu(); + uint64_t esid = ea >> 28; + uint64_t vsid = ea >> 28; + + slb_install(esid, vsid, cpu->vm_slb_rr); + + cpu->vm_slb_rr++; + if (cpu->vm_slb_rr == GLOBAL_SLB_NR) + cpu->vm_slb_rr = 0; +} + +struct hpte { + beint64_t dword[2]; +}; + +struct hpteg { + struct hpte hpte[8]; +}; + +static struct hpteg *htab; +static unsigned long htab_shift; +static unsigned long htab_pteg_mask; + +static struct lock htab_lock; + +static void __nomcount htab_install(unsigned long va, unsigned long pa, int rw, int ex, int ci, bool local) +{ + unsigned long hash; + struct hpteg *hpteg; + struct hpte *hpte; + unsigned long ava = va >> 23; + unsigned long arpn = pa >> 12; + unsigned long dw0, dw1; + unsigned long _dw0; + unsigned long _ava; + unsigned int hstart, hend; + unsigned int i; + + if (PAGE_SIZE == 0x10000) + arpn >>= 4; + + dw0 = ava << (63-56); /* AVA = ava */ + dw0 |= 0x1; /* V = 1 */ + if (PAGE_SIZE == 0x10000) + dw0 |= 0x4; /* L = 1 */ + if (local) + dw0 |= 0x8; /* SW[0] = 1 */ + + if (PAGE_SIZE == 0x10000) { + dw1 = (arpn << (63-43 - 4)); /* ARPN||LP-4 = arpn */ + dw1 |= (0x1 << (63-43 - 8)); /* LP = 0001 */ + } else + dw1 = (arpn << (63-43 - 8)); /* ARPN||LP = arpn */ + if (!rw) + dw1 |= (1UL << (63 - 0)) | (1UL << (63 - 63 + 1)); /* pp = 110 */ + if (!ex) + dw1 |= (1UL << (63 - 61)); /* N = 1 */ + dw1 |= (1UL << (63 - 60 + 1)); /* WIMG = 0010 */ + if (ci) + dw1 |= (1UL << (63 - 60)) | (1UL << (63 - 60 + 2)); /* WIMG = 0111 */ + dw1 |= (1UL << (63 - 55)) | (1UL << (63 - 56)); /* R=C=1 */ + + if (PAGE_SIZE == 0x10000) + hash = ((va >> 16) & 0xfff) ^ ((va >> 28) & 0x7fffffffffUL); + else + hash = ((va >> 12) & 0xffff) ^ ((va >> 28) & 0x7fffffffffUL); + hpteg = &htab[hash & htab_pteg_mask]; + + lock(&htab_lock); + + hstart = 0; + hend = 7; + + for (i = hstart; i <= hend; i++) { + hpte = &hpteg->hpte[i]; + + _dw0 = be64_to_cpu(hpte->dword[0]); + if (_dw0 & 1) { + _ava = _dw0 >> (63 - 56); + if (_ava == ava) { + assert(!local); + /* This could happen with racing global fault */ + assert(dw0 == _dw0); + assert(dw1 == be64_to_cpu(hpte->dword[1])); + goto out; + } + + continue; + } + + assert(!_dw0); + goto install; + } + + i = mftb(); + i = (i ^ (i >> 4)) & 0x7; + hpte = 
&hpteg->hpte[i]; + +install: + hpte->dword[1] = cpu_to_be64(dw1); + eieio(); + hpte->dword[0] = cpu_to_be64(dw0); + asm volatile("ptesync" ::: "memory"); +out: + unlock(&htab_lock); +} + +static void htab_remove(unsigned long va, int local) +{ + struct cpu_thread *c = this_cpu(); + bool vm_setup = c->vm_setup; + unsigned long hash; + struct hpteg *hpteg; + unsigned long ava = va >> 23; + unsigned long dw0; + unsigned long rb; + unsigned int hstart, hend; + unsigned int i; + + dw0 = ava << (63-56); + dw0 |= 0x1; + if (PAGE_SIZE == 0x10000) + dw0 |= 0x4; + if (local) + dw0 |= 0x8; + + if (PAGE_SIZE == 0x10000) + hash = ((va >> 16) & 0xfff) ^ ((va >> 28) & 0x7fffffffffUL); + else + hash = ((va >> 12) & 0xffff) ^ ((va >> 28) & 0x7fffffffffUL); + hpteg = &htab[hash & htab_pteg_mask]; + + if (vm_setup) + vm_exit(); + lock(&htab_lock); + hstart = 0; + hend = 7; + + for (i = hstart; i <= hend; i++) { + struct hpte *hpte = &hpteg->hpte[i]; + beint64_t _raw_dw0; + uint64_t _dw0; + + _raw_dw0 = hpte->dword[0]; + _dw0 = be64_to_cpu(_raw_dw0); + + if (!(_dw0 & 1)) { + assert(!_raw_dw0); + continue; + } + + if (_dw0 != dw0) + continue; + + hpte->dword[0] = 0; + eieio(); + hpte->dword[1] = 0; + + break; + } + + if (PAGE_SIZE == 0x10000) { + rb = (va >> 16) << (63 - 47); /* AVA||LP-4 */ + rb |= 0x1 << (63 - 51); /* LP=0001 */ + rb |= 0x1; /* L=1 */ + } else { + rb = va & ~0xfffUL; + } + + unlock(&htab_lock); + + if (vm_setup) + vm_enter(); + + if (local) { + asm volatile("ptesync" ::: "memory"); + asm volatile("tlbiel %0" : : "r"(rb)); + asm volatile("ptesync" ::: "memory"); + } else { + asm volatile("ptesync" ::: "memory"); + asm volatile("tlbie %0,%1" : : "r"(rb), "r"(0)); + asm volatile("eieio ; tlbsync ; ptesync" ::: "memory"); + + } +} + +/* + * Try to fix problems in callers if !strict. 
+ */ +static bool vm_strict = false; + +static struct list_head vm_maps = LIST_HEAD_INIT(vm_maps); +static struct lock vm_maps_lock; +static unsigned long nr_vm_maps; + +static void __vm_map_global(const char *name, unsigned long addr, unsigned long len, unsigned long pa, bool r, bool w, bool x, bool ci) +{ + struct cpu_thread *c = this_cpu(); + bool vm_setup = c->vm_setup; + struct vm_map *new; + struct vm_map *vmm; + + assert(!cpu_in_os()); + + new = zalloc(sizeof(*new)); + assert(new); + + new->name = name; + new->address = addr; + new->length = len; + new->pa = pa; + new->readable = r; + new->writeable = w; + new->executable = x; + new->ci = ci; + + /* Can not take a d-side fault while holding this lock */ + if (vm_setup) + vm_exit(); + lock(&vm_maps_lock); + + list_for_each(&vm_maps, vmm, list) { + unsigned long ps = addr & ~(PAGE_SIZE - 1); + unsigned long pe = (addr + len + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1); + unsigned long vmm_ps = vmm->address & ~(PAGE_SIZE - 1); + unsigned long vmm_pe = (vmm->address + vmm->length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1); + bool mergeable = false; + bool samepage = false; + + /* Ensure no overlap */ + assert(addr + len <= vmm->address || addr >= vmm->address + vmm->length); + + if (ps > vmm_pe) + continue; /* Sort */ + if (pe < vmm_ps) { + /* Not same or adjacent page is easy */ + list_add_before(&vm_maps, &new->list, &vmm->list); + goto found; + } + if (pe > vmm_ps || ps < vmm_pe) + samepage = true; + + mergeable = /* XXX: check pa */ 1 && + (vmm->ci == ci) && + (vmm->readable == r) && + (vmm->writeable == w) && + (vmm->executable == x); + samepage = false; + + if (samepage && !mergeable) { + printf("VMM: %s (%lx-%lx) mismatched permissions with same page mapping %s (%llx-%llx)\n", name, addr, addr + len, vmm->name, vmm->address, vmm->address + vmm->length); + assert(vmm->pa == pa); + assert(vmm->ci == ci); + assert(vmm->readable == r); + assert(vmm->writeable == w); + assert(vmm->executable == x); + } + + if (!strcmp(name, vmm->name) && mergeable) { + if (addr == vmm->address + vmm->length) { + free(new); + vmm->length += len; + goto done; + } + + if (addr + len == vmm->address) { + free(new); + vmm->address = addr; + vmm->pa = pa; + vmm->length += len; + goto done; + } + } + + if (addr >= vmm->address + vmm->length) + continue; + if (addr + len <= vmm->address) { + list_add_before(&vm_maps, &new->list, &vmm->list); + goto found; + } + + assert(0); + } + list_add_tail(&vm_maps, &new->list); +found: + nr_vm_maps++; +done: + unlock(&vm_maps_lock); + if (vm_setup) + vm_enter(); +} + +static void __vm_unmap_global(unsigned long addr, unsigned long len) +{ + struct cpu_thread *c = this_cpu(); + bool vm_setup = c->vm_setup; + unsigned long end = addr + len; + struct vm_map *vmm, *to_free = NULL; + + assert(!cpu_in_os()); + + /* Can not take a d-side fault while holding this lock */ + if (vm_setup) + vm_exit(); + lock(&vm_maps_lock); + list_for_each(&vm_maps, vmm, list) { + struct vm_map *new; + + if (addr + len <= vmm->address) + continue; + if (addr >= vmm->address + vmm->length) + continue; + if (addr == vmm->address && len == vmm->length) { + to_free = vmm; + goto found; + } + + if (addr == vmm->address) { + vmm->address += len; + vmm->pa += len; + vmm->length -= len; + goto done; + } + + if (addr + len == vmm->address + vmm->length) { + vmm->length -= len; + goto done; + } + + /* Unmaps will never span multiple because they always apply to a previous map, so this is a split */ + new = zalloc(sizeof(*new)); + assert(new); + memcpy(new, vmm, 
sizeof(*new)); + list_add_before(&vm_maps, &new->list, &vmm->list); + nr_vm_maps++; + + new->length = addr - new->address; + vmm->address += new->length + len; + vmm->pa += new->length + len; + vmm->length -= new->length + len; + goto done; + } + vmm = NULL; + unlock(&vm_maps_lock); + assert(!vm_strict); + prerror("unmap didn't find anything\n"); + backtrace(); + goto out; + +found: + list_del(&vmm->list); + nr_vm_maps--; +done: + if (boot_vm_setup) { + while (addr < end) { + htab_remove(addr, false); + addr += PAGE_SIZE; + } + } + + unlock(&vm_maps_lock); +out: + if (vm_setup) + vm_enter(); + + if (to_free) + free(to_free); +} + + +void vm_map_global(const char *name, unsigned long addr, unsigned long len, bool rw, bool ci) +{ + __vm_map_global(name, addr, len, addr, true, rw, false, ci); +} + +void vm_map_global_text(const char *name, unsigned long addr, unsigned long len) +{ + __vm_map_global(name, addr, len, addr, true, false, true, false); +} + +void vm_unmap_global(unsigned long addr, unsigned long len) +{ + __vm_unmap_global(addr, len); +} + +void *vm_map(unsigned long addr, unsigned long len, bool rw) +{ + struct cpu_thread *c = this_cpu(); + unsigned long newaddr; + unsigned long end; + unsigned long offset = addr & (PAGE_SIZE - 1); + + end = (addr + len + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1); + addr &= ~(PAGE_SIZE - 1); + len = end - addr; + + assert(len <= LOCAL_EA_PERCPU); + + /* Can't do nested mappings */ + assert(!c->vm_local_map_inuse); + c->vm_local_map_inuse = true; + + if (c->vm_setup) { + struct vm_map *new = &c->vm_local_map; + + newaddr = LOCAL_EA_BEGIN + LOCAL_EA_PERCPU * c->pir; + + new->name = "local"; + new->address = newaddr; + new->length = len; + new->pa = addr; + new->readable = true; + new->writeable = rw; + new->executable = false; + new->ci = false; + + } else { + newaddr = addr; + } + + return (void *)newaddr + offset; +} + +void vm_unmap(unsigned long addr, unsigned long len) +{ + struct cpu_thread *c = this_cpu(); + unsigned long newaddr; + unsigned long end; + + end = (addr + len + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1); + addr &= ~(PAGE_SIZE - 1); + len = end - addr; + + assert(len <= LOCAL_EA_PERCPU); + + assert(c->vm_local_map_inuse); + c->vm_local_map_inuse = false; + + if (c->vm_setup) { + struct vm_map *vmm; + unsigned long ea; + + newaddr = LOCAL_EA_BEGIN + LOCAL_EA_PERCPU * c->pir; + + vmm = &c->vm_local_map; + assert(newaddr == vmm->address); + assert(len == vmm->length); + memset(vmm, 0, sizeof(struct vm_map)); + + ea = newaddr; + while (ea < newaddr + len) { + htab_remove(ea, true); + ea += PAGE_SIZE; + } + } +} + +struct prte { + beint64_t dword[2]; +}; + +static struct prte *prtab; +static unsigned long old_lpcr; +static unsigned long new_lpcr; + +static void vm_init_cpu(void) +{ + struct cpu_thread *c = this_cpu(); + unsigned long ea = LOCAL_EA_BEGIN + LOCAL_EA_PERCPU * c->pir; + unsigned long esid = ea >> 28; + unsigned long vsid = ea >> 28; + + mtspr(SPR_LPCR, new_lpcr); + + mtspr(SPR_LPID, 0); + mtspr(SPR_PID, 0); + mtspr(SPR_HRMOR, 0); + mtspr(SPR_PTCR, (unsigned long)prtab); + mtspr(SPR_AMR, 0); + mtspr(SPR_IAMR, 0); + mtspr(SPR_AMOR, 0); + mtspr(SPR_UAMOR, 0); + + slb_remove_all(); + slb_install(esid, vsid, LOCAL_SLB_BASE); +} + +void vm_init_secondary(void) +{ + vm_init_cpu(); + vm_enter(); +} + +bool vm_realmode(void) +{ + struct cpu_thread *c = this_cpu(); + + return !c->vm_setup; +} + +void vm_enter(void) +{ + struct cpu_thread *c = this_cpu(); + + if (!using_vm) + return; + + assert(!cpu_in_os()); + assert(boot_vm_setup); + if 
(c->vm_setup) { + prerror("CPU:%d vm_enter already entered\n", c->pir); + backtrace(); + } + if (c->vm_local_map_inuse) { + prerror("CPU:%d vm_enter local map inuse\n", c->pir); + backtrace(); + } + + c->vm_setup = true; + mtmsr(mfmsr() | (MSR_IR|MSR_DR)); +} + +void vm_exit(void) +{ + struct cpu_thread *c = this_cpu(); + + if (!using_vm) + return; + + assert(!cpu_in_os()); + assert(boot_vm_setup); + if (!c->vm_setup) { + prerror("CPU:%d vm_exit already exited\n", c->pir); + backtrace(); + } + if (c->vm_local_map_inuse) { + prerror("CPU:%d vm_enter local map inuse\n", c->pir); + backtrace(); + } + c->vm_setup = false; + mtmsr(mfmsr() & ~(MSR_IR|MSR_DR)); +} + +bool __nomcount vm_dslb(uint64_t nia, uint64_t dar) +{ + /* + * Per-cpu map ranges are bolted to per-cpu SLBs. + */ + assert((dar < LOCAL_EA_BEGIN) || + (dar >= LOCAL_EA_END)); + + (void)nia; + slb_add(dar); + + return true; +} + +bool __nomcount vm_islb(uint64_t nia) +{ + slb_add(nia); + + return true; +} + +bool __nomcount vm_dsi(uint64_t nia, uint64_t dar, uint32_t dsisr) +{ + struct cpu_thread *c = this_cpu(); + struct vm_map *vmm; + uint64_t pa; + bool store = !!(dsisr & DSISR_ISSTORE); + bool ret = true; + bool local; + + if (dsisr & 0xbdffffffU) { + printf("Page fault bad dsisr at 0x%016llx dar=0x%016llx dsisr=0x%08x\n", nia, dar, dsisr); + return false; + } + + if ((dar >= LOCAL_EA_BEGIN) && (dar < LOCAL_EA_END)) { + local = true; + vmm = &c->vm_local_map; + if (dar >= vmm->address && dar < vmm->address + vmm->length) + goto found; + /* !vm_strict can't fix up this case because it's a non-linear mapping */ + goto not_found; + } + + local = false; + + lock(&vm_maps_lock); + list_for_each(&vm_maps, vmm, list) { + assert(vmm->pa == vmm->address); + if (dar >= vmm->address && dar < vmm->address + vmm->length) + goto found; + } + if (!vm_strict) { + if (dar >= 0x0006000000000000 && dar < 0x0007000000000000) + /* MMIO */ + htab_install(dar, dar, 1, 0, 1, false); + else if (dar < LOCAL_EA_BEGIN) + htab_install(dar, dar, 1, 0, 0, false); + else + ret = false; + unlock(&vm_maps_lock); + prerror("Page fault with no VMM at NIA:0x%016llx DAR:0x%016llx, store:%d\n", nia, dar, store); + backtrace(); + list_for_each(&vm_maps, vmm, list) + prlog(PR_DEBUG, "%28s 0x%08llx-0x%08llx\n", vmm->name, + vmm->address, vmm->address + vmm->length); + goto out; + } + unlock(&vm_maps_lock); +not_found: + prerror(" vmm not found\n"); + ret = false; + assert(0); + goto out; + +found: + pa = vmm->pa + (dar & ~(PAGE_SIZE - 1)) - vmm->address; + if (!vmm->readable) { + if (!vm_strict) { + htab_install(dar, pa, store, 0, vmm->ci, local); + if (!local) + unlock(&vm_maps_lock); + prerror("Page fault to unreadable VMM:%s at NIA:0x%016llx DAR:0x%016llx\n", vmm->name, nia, dar); + backtrace(); + goto out; + } + prerror(" vmm not readable\n"); + ret = false; + assert(0); + goto out; + } + if (store && !vmm->writeable) { + if (!vm_strict) { + htab_install(dar, pa, store, 0, vmm->ci, local); + if (!local) + unlock(&vm_maps_lock); + prerror("Page fault store to RO VMM:%s at NIA:0x%016llx DAR:0x%016llx\n", vmm->name, nia, dar); + backtrace(); + goto out; + } + if (!local) + unlock(&vm_maps_lock); + prerror(" vmm not writeable\n"); + ret = false; + assert(0); + goto out; + } + + htab_install(dar, pa, vmm->writeable, vmm->executable, vmm->ci, local); + if (!local) + unlock(&vm_maps_lock); + +out: + return ret; +} + +bool __nomcount vm_isi(uint64_t nia) +{ + struct vm_map *vmm; + + lock(&vm_maps_lock); + list_for_each(&vm_maps, vmm, list) { + assert(vmm->pa == 
vmm->address);
+		if (nia >= vmm->address && nia < vmm->address + vmm->length) {
+			if (!vmm->executable) {
+				assert(!vm_strict);
+				prerror("Page fault at NIA:0x%016llx NX mapping!\n", nia);
+				backtrace();
+			}
+
+			goto found;
+		}
+	}
+
+	assert(!vm_strict);
+	prerror("Page fault, no mapping for NIA:0x%016llx !\n", nia);
+	backtrace();
+
+found:
+	unlock(&vm_maps_lock);
+	htab_install(nia, nia, 0, 1, 0, false);
+
+	return true;
+}
+
+static void cpu_stop_vm(void *arg __unused)
+{
+	vm_exit();
+}
+
+static void cpu_cleanup_vm(void *arg __unused)
+{
+	slb_remove_all();
+	mtspr(SPR_PTCR, 0);
+	mtspr(SPR_LPCR, old_lpcr);
+}
+
+static void cpu_all_destroy_vm(void)
+{
+	struct cpu_thread *cpu;
+	struct cpu_job **jobs;
+
+	jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1);
+	assert(jobs);
+
+	/* Stop all CPUs */
+	for_each_available_cpu(cpu) {
+		if (cpu == this_cpu())
+			continue;
+		jobs[cpu->pir] = cpu_queue_job(cpu, "cpu_stop_vm",
+						cpu_stop_vm, NULL);
+	}
+
+	/* this cpu */
+	cpu_stop_vm(NULL);
+
+	/* Cleanup after all stop */
+	for_each_available_cpu(cpu) {
+		if (jobs[cpu->pir])
+			cpu_wait_job(jobs[cpu->pir], true);
+	}
+
+	for_each_available_cpu(cpu) {
+		if (cpu == this_cpu())
+			continue;
+		jobs[cpu->pir] = cpu_queue_job(cpu, "cpu_cleanup_vm",
+						cpu_cleanup_vm, NULL);
+	}
+
+	/* this cpu */
+	cpu_cleanup_vm(NULL);
+
+	for_each_available_cpu(cpu) {
+		if (jobs[cpu->pir])
+			cpu_wait_job(jobs[cpu->pir], true);
+	}
+
+	free(jobs);
+
+	cleanup_global_tlb();
+}
+
+static void print_maps(void)
+{
+	struct vm_map *vmm;
+
+	prlog(PR_DEBUG, " %lu Global mappings\n", nr_vm_maps);
+	list_for_each(&vm_maps, vmm, list) {
+		prlog(PR_DEBUG, "%28s 0x%08llx-0x%08llx\n", vmm->name,
+			vmm->address, vmm->address + vmm->length);
+	}
+}
+
+void vm_init(bool fast_reboot)
+{
+	unsigned long stack_start = SKIBOOT_BASE + SKIBOOT_SIZE;
+	unsigned long stack_end = stack_start + (cpu_max_pir + 1)*STACK_SIZE;
+	unsigned long sym_start = (unsigned long)__sym_map_start;
+	unsigned long sym_size = (unsigned long)__sym_map_end - sym_start;
+	unsigned long htab_nr_bytes;
+	unsigned long htab_nr_ptegs;
+
+	/* Only POWER9 has had significant testing so far */
+	if (proc_gen != proc_gen_p9)
+		return;
+
+	using_vm = true;
+
+	assert(!boot_vm_setup);
+
+	old_lpcr = mfspr(SPR_LPCR);
+	new_lpcr = (old_lpcr & ~(PPC_BITMASK(0,3) | PPC_BIT(41) | PPC_BIT(43)))
+			| PPC_BIT(54);
+
+	prtab = memalign(64*1024, 64*1024);
+	assert(prtab);
+	memset(prtab, 0, 64*1024);
+
+	htab_shift = 18; /* 256kB table */
+	htab_nr_bytes = 1UL << htab_shift;
+	htab_nr_ptegs = htab_nr_bytes / sizeof(struct hpteg);
+	htab_pteg_mask = htab_nr_ptegs - 1;
+	htab = memalign(1UL << htab_shift, htab_nr_bytes);
+	assert(htab);
+	memset(htab, 0, htab_nr_bytes);
+
+	prtab[0].dword[0] = cpu_to_be64((unsigned long)htab | (htab_shift - 18));
+	prtab[0].dword[1] = 0;
+
+	eieio();
+
+	vm_init_cpu();
+
+	cleanup_global_tlb();
+
+	if (vm_globals_allocated) {
+		assert(fast_reboot);
+		goto done;
+	}
+
+	assert(!fast_reboot);
+	vm_globals_allocated = true;
+
+	vm_map_global_text("OPAL text", (unsigned long)_stext,
+			(unsigned long)_etext - (unsigned long)_stext);
+	vm_map_global("OPAL rodata", (unsigned long)__rodata_start,
+			(unsigned long)__vm_mapped_romem_end - (unsigned long)__rodata_start,
+			false, false);
+	vm_map_global("OPAL data", (unsigned long)_sdata,
+			(unsigned long)_edata - (unsigned long)_sdata,
+			true, false);
+	vm_map_global("OPAL symbols", sym_start, sym_size, false, false);
+	vm_map_global("OPAL bss", (unsigned long)_sbss,
+			(unsigned long)_ebss - (unsigned long)_sbss,
+			true, false);
+	vm_map_global("OPAL heap", HEAP_BASE, HEAP_SIZE, true, false);
+	vm_map_global("Memory console", INMEM_CON_START, INMEM_CON_LEN, true, false);
+	vm_map_global("Hostboot console", HBRT_CON_START, HBRT_CON_LEN, false, false);
+	vm_map_global("SPIRA heap", SPIRA_HEAP_BASE, SPIRA_HEAP_SIZE, false, false);
+	vm_map_global("PSI TCE table", PSI_TCE_TABLE_BASE, PSI_TCE_TABLE_SIZE, false, false);
+	vm_map_global("OPAL boot stacks", stack_start, stack_end - stack_start, true, false);
+
+done:
+	prlog(PR_DEBUG, "VMM: SETUP\n");
+	prlog(PR_DEBUG, " PRTAB:%p\n", prtab);
+	prlog(PR_DEBUG, " HTAB: %p\n", htab);
+	print_maps();
+
+	boot_vm_setup = true;
+
+	vm_enter();
+}
+
+void vm_init_stacks(void)
+{
+	unsigned long stack_start = SKIBOOT_BASE + SKIBOOT_SIZE;
+	unsigned long stack_end = stack_start + (cpu_max_pir + 1)*STACK_SIZE;
+	struct cpu_thread *c = this_cpu();
+	struct vm_map *vmm;
+
+	if (!using_vm)
+		return;
+
+	assert(boot_vm_setup);
+
+	/* Can not take a d-side fault while holding this lock */
+	if (c->vm_setup)
+		mtmsr(mfmsr() & ~MSR_DR);
+	lock(&vm_maps_lock);
+	list_for_each(&vm_maps, vmm, list) {
+		if (vmm->address >= stack_end)
+			continue;
+		if (vmm->address + vmm->length <= stack_start)
+			continue;
+		goto found;
+	}
+	unlock(&vm_maps_lock);
+	assert(0);
+
+found:
+	vmm->name = "OPAL stacks";
+	vmm->address = stack_start;
+	vmm->length = stack_end - stack_start;
+	unlock(&vm_maps_lock);
+	if (c->vm_setup)
+		mtmsr(mfmsr() | MSR_DR);
+}
+
+void vm_destroy(void)
+{
+	if (!using_vm)
+		return;
+
+	assert(boot_vm_setup);
+
+	prlog(PR_DEBUG, "VMM: TEARDOWN\n");
+	print_maps();
+
+	cpu_all_destroy_vm();
+
+	boot_vm_setup = false;
+
+	/*
+	 * Global vm_maps stay around for OS virtual memory and fast reboot.
+	 */
+
+	free(htab);
+	htab = NULL;
+	free(prtab);
+	prtab = NULL;
+}
diff --git a/hdata/spira.c b/hdata/spira.c
index baa23751d..af76faf13 100644
--- a/hdata/spira.c
+++ b/hdata/spira.c
@@ -1810,11 +1810,20 @@ static void fixup_spira(void)
 static void update_spirah_addr(void)
 {
 #if !defined(TEST)
+	beint64_t *spirah_offset;
+	beint64_t *spira_offset;
+
 	if (proc_gen < proc_gen_p9)
 		return;
 
-	naca.spirah_addr = CPU_TO_BE64(SPIRAH_OFF);
-	naca.spira_addr = CPU_TO_BE64(SPIRA_OFF);
+	spirah_offset = vm_map((u64)&naca, sizeof(u64), true);
+	*spirah_offset = CPU_TO_BE64(SPIRAH_OFF);
+	vm_unmap((unsigned long)spirah_offset, sizeof(u64));
+
+	spira_offset = vm_map((u64)&naca + 0x30, sizeof(u64), true);
+	*spira_offset = CPU_TO_BE64(SPIRA_OFF);
+	vm_unmap((unsigned long)spira_offset, sizeof(u64));
+
 	spirah.ntuples.hs_data_area.addr = CPU_TO_BE64(SPIRA_HEAP_BASE - SKIBOOT_BASE);
 	spirah.ntuples.mdump_res.addr = CPU_TO_BE64(MDRT_TABLE_BASE - SKIBOOT_BASE);
 #endif
@@ -1822,13 +1831,24 @@ static void update_spirah_addr(void)
 
 int parse_hdat(bool is_opal)
 {
+	int ret = 0;
+
 	cpu_type = PVR_TYPE(mfspr(SPR_PVR));
 
 	prlog(PR_DEBUG, "Parsing HDAT...\n");
 
+	vm_map_global("SPIRA", SKIBOOT_BASE + SPIRA_OFF, sizeof(spira), true, false);
 	fixup_spira();
+	vm_unmap_global(SKIBOOT_BASE + SPIRA_OFF, sizeof(spira));
 
+	vm_map_global("SPIRA-H", SKIBOOT_BASE + SPIRAH_OFF, sizeof(spirah), true, false);
 	update_spirah_addr();
+	vm_unmap_global(SKIBOOT_BASE + SPIRAH_OFF, sizeof(spirah));
+
+	/* Downgrade to read-only */
+
+	vm_map_global("SPIRA", SKIBOOT_BASE + SPIRA_OFF, sizeof(spira), false, false);
+	vm_map_global("SPIRA-H", SKIBOOT_BASE + SPIRAH_OFF, sizeof(spirah), false, false);
 
 	/*
 	 * Basic DT root stuff
@@ -1849,8 +1869,10 @@ int parse_hdat(bool is_opal)
 	dt_init_led_node();
 
 	/* Parse PCIA */
-	if
(!pcia_parse()) - return -1; + if (!pcia_parse()) { + ret = -1; + goto out; + } /* IPL params */ add_iplparams(); @@ -1896,6 +1918,9 @@ int parse_hdat(bool is_opal) node_stb_parse(); prlog(PR_DEBUG, "Parsing HDAT...done\n"); +out: + vm_unmap_global(SKIBOOT_BASE + SPIRA_OFF, sizeof(spira)); + vm_unmap_global(SKIBOOT_BASE + SPIRAH_OFF, sizeof(spirah)); - return 0; + return ret; } diff --git a/hw/fake-nvram.c b/hw/fake-nvram.c index 44adde4a3..d1ed62e9e 100644 --- a/hw/fake-nvram.c +++ b/hw/fake-nvram.c @@ -23,12 +23,16 @@ int fake_nvram_info(uint32_t *total_size) int fake_nvram_start_read(void *dst, uint32_t src, uint32_t len) { + void *t; + if (!nvram_region) return -ENODEV; + t = vm_map(nvram_region->start + src, len, false); lock(&fake_nvram_lock); - memcpy(dst, (void *) (nvram_region->start + src), len); + memcpy(dst, t, len); unlock(&fake_nvram_lock); + vm_unmap(nvram_region->start + src, len); nvram_read_complete(true); @@ -37,12 +41,16 @@ int fake_nvram_start_read(void *dst, uint32_t src, uint32_t len) int fake_nvram_write(uint32_t offset, void *src, uint32_t size) { + void *t; + if (!nvram_region) return OPAL_HARDWARE; + t = vm_map(nvram_region->start + offset, size, true); lock(&fake_nvram_lock); - memcpy((void *) (nvram_region->start + offset), src, size); + memcpy(t, src, size); unlock(&fake_nvram_lock); + vm_unmap(nvram_region->start + offset, size); return 0; } diff --git a/hw/homer.c b/hw/homer.c index 3ff6ed1ae..832c636a2 100644 --- a/hw/homer.c +++ b/hw/homer.c @@ -118,6 +118,9 @@ static void homer_init_chip(struct proc_chip *chip) chip->homer_base = hbase; chip->homer_size = hsize; + /* slw late init and xive late init want to write to HOMER */ + /* XXX: make it read only until then? */ + vm_map_global("HOMER Image", hbase, hsize, true, false); } /* @@ -144,13 +147,21 @@ static void homer_init_chip(struct proc_chip *chip) chip->slw_base = sbase; chip->slw_bar_size = ssize; chip->slw_image_size = ssize; /* will be adjusted later */ + /* XXX */ } if (read_pba_bar(chip, bar_occ_common, &obase, &osize)) { - prlog(PR_DEBUG, " OCC Common Area at 0x%llx size %lldMB\n", - obase, osize / 0x100000); + static uint64_t homer_obase = 0; + chip->occ_common_base = obase; chip->occ_common_size = osize; + + prlog(PR_DEBUG, " OCC Common Area at 0x%llx size %lldMB\n", + obase, osize / 0x100000); + if (obase != homer_obase) { + vm_map_global("OCC Common Area", obase, osize, false, false); + homer_obase = obase; + } } } diff --git a/hw/lpc-uart.c b/hw/lpc-uart.c index 834011b37..a52f0b8aa 100644 --- a/hw/lpc-uart.c +++ b/hw/lpc-uart.c @@ -60,7 +60,7 @@ static uint64_t uart_tx_full_time; static bool has_irq = false, irq_ok, rx_full, tx_full; static uint8_t tx_room; static uint8_t cached_ier; -static void *mmio_uart_base; +void *mmio_uart_base; static int uart_console_policy = UART_CONSOLE_OPAL; static int lpc_irq = -1; @@ -642,6 +642,8 @@ void early_uart_init(void) if (!mmio_uart_base) return; + vm_map_global("UART MMIO", (unsigned long)mmio_uart_base, 8, true, true); + clk = dt_prop_get_u32(uart_node, "clock-frequency"); baud = dt_prop_get_u32(uart_node, "current-speed"); @@ -650,6 +652,7 @@ void early_uart_init(void) prlog(PR_DEBUG, "UART: Using UART at %p\n", mmio_uart_base); } else { prerror("UART: Early init failed!"); + vm_unmap_global((unsigned long)mmio_uart_base, 8); mmio_uart_base = NULL; } } @@ -661,9 +664,6 @@ void uart_init(void) char *path __unused; const be32 *irqp; - /* Clean up after early_uart_init() */ - mmio_uart_base = NULL; - /* UART lock is in the console path and thus must 
block * printf re-entrancy */ @@ -681,13 +681,28 @@ void uart_init(void) * directly mapped UARTs in simulation environments */ if (n->parent == dt_root) { + void *base; + printf("UART: Found at root !\n"); - mmio_uart_base = (void *)dt_translate_address(n, 0, NULL); - if (!mmio_uart_base) { + + base = (void *)dt_translate_address(n, 0, NULL); + if (!base) { printf("UART: Failed to translate address !\n"); return; } + if (mmio_uart_base != base) { + void *old; + + vm_map_global("UART MMIO", (unsigned long)base, 8, true, true); + old = mmio_uart_base; + mmio_uart_base = base; + + /* Clean up after early_uart_init() */ + if (old) + vm_unmap_global((unsigned long)old, 8); + } + /* If it has an interrupt properly, we consider this to be * a direct XICS/XIVE interrupt */ @@ -716,6 +731,11 @@ void uart_init(void) lpc_irq = be32_to_cpu(*irqp); prlog(PR_DEBUG, "UART: Using LPC IRQ %d\n", lpc_irq); } + + if (mmio_uart_base) { +// vm_unmap_global((unsigned long)mmio_uart_base, 8); + mmio_uart_base = NULL; + } } diff --git a/hw/lpc.c b/hw/lpc.c index bf3ab1fae..60e247246 100644 --- a/hw/lpc.c +++ b/hw/lpc.c @@ -1241,6 +1241,7 @@ static void lpc_init_chip_p8(struct dt_node *xn) chip->lpc = lpc; } +extern void *mmio_uart_base; static void lpc_init_chip_p9(struct dt_node *opb_node) { uint32_t gcid = dt_get_chip_id(opb_node); @@ -1263,6 +1264,11 @@ static void lpc_init_chip_p9(struct dt_node *opb_node) if (!lpc_node) return; + + if (mmio_uart_base) + vm_unmap_global((unsigned long)mmio_uart_base, 8); + vm_map_global("LPC MMIO", addr, 0x100000000UL /* XXX: size? */, true, true); + lpc = zalloc(sizeof(struct lpcm)); assert(lpc); lpc->chip_id = gcid; diff --git a/hw/phb4.c b/hw/phb4.c index ec07fe2bb..52e19d319 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -6174,6 +6174,7 @@ static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, uint64_t val, phb_bar = 0, irq_bar = 0, bar_en; uint64_t mmio0_bar = 0, mmio0_bmask, mmio0_sz; uint64_t mmio1_bar = 0, mmio1_bmask, mmio1_sz; + uint64_t bar_sz; void *foo; __be64 mmio_win[4]; unsigned int mmio_win_sz; @@ -6217,7 +6218,8 @@ static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, bar_en = 0; /* Initialize PHB register BAR */ - phys_map_get(gcid, phys_reg_spc, phb_num, &phb_bar, NULL); + phys_map_get(gcid, phys_reg_spc, phb_num, &phb_bar, &bar_sz); + vm_map_global("PHB REGS", phb_bar, bar_sz, true, true); rc = xscom_write(gcid, nest_stack + XPEC_NEST_STK_PHB_REG_BAR, phb_bar << 8); @@ -6231,18 +6233,21 @@ static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, bar_en |= XPEC_NEST_STK_BAR_EN_PHB; /* Same with INT BAR (ESB) */ - phys_map_get(gcid, phys_xive_esb, phb_num, &irq_bar, NULL); + phys_map_get(gcid, phys_xive_esb, phb_num, &irq_bar, &bar_sz); + vm_map_global("PHB IRQ", irq_bar, bar_sz, true, true); xscom_write(gcid, nest_stack + XPEC_NEST_STK_IRQ_BAR, irq_bar << 8); bar_en |= XPEC_NEST_STK_BAR_EN_INT; /* Same with MMIO windows */ phys_map_get(gcid, phys_mmio64, phb_num, &mmio0_bar, &mmio0_sz); + vm_map_global("PHB MMIO0", mmio0_bar, mmio0_sz, true, true); mmio0_bmask = (~(mmio0_sz - 1)) & 0x00FFFFFFFFFFFFFFULL; xscom_write(gcid, nest_stack + XPEC_NEST_STK_MMIO_BAR0, mmio0_bar << 8); xscom_write(gcid, nest_stack + XPEC_NEST_STK_MMIO_BAR0_MASK, mmio0_bmask << 8); phys_map_get(gcid, phys_mmio32, phb_num, &mmio1_bar, &mmio1_sz); + vm_map_global("PHB MMIO1", mmio1_bar, mmio1_sz, true, true); mmio1_bmask = (~(mmio1_sz - 1)) & 0x00FFFFFFFFFFFFFFULL; xscom_write(gcid, nest_stack + XPEC_NEST_STK_MMIO_BAR1, mmio1_bar << 
8); xscom_write(gcid, nest_stack + XPEC_NEST_STK_MMIO_BAR1_MASK, mmio1_bmask << 8); diff --git a/hw/psi.c b/hw/psi.c index de074ce4a..12b5e5a73 100644 --- a/hw/psi.c +++ b/hw/psi.c @@ -707,7 +707,7 @@ static void psi_init_p8_interrupts(struct psi *psi) static void psi_init_p9_interrupts(struct psi *psi) { struct proc_chip *chip; - u64 val; + u64 val, size; /* Grab chip */ chip = get_chip(psi->chip_id); @@ -715,7 +715,8 @@ static void psi_init_p9_interrupts(struct psi *psi) return; /* Configure the CI BAR */ - phys_map_get(chip->id, PSIHB_ESB, 0, &val, NULL); + phys_map_get(chip->id, PSIHB_ESB, 0, &val, &size); + vm_map_global("PSIHB ESB", val, size, true, true); val |= PSIHB_ESB_CI_VALID; out_be64(psi->regs + PSIHB_ESB_CI_BASE, val); @@ -763,7 +764,7 @@ static const struct irq_source_ops psi_p10_irq_ops = { static void psi_init_p10_interrupts(struct psi *psi) { struct proc_chip *chip; - u64 val; + u64 val, size; uint32_t esb_shift = 16; uint32_t flags = XIVE_SRC_LSI; struct irq_source *is; @@ -775,7 +776,8 @@ static void psi_init_p10_interrupts(struct psi *psi) return; /* Configure the CI BAR */ - phys_map_get(chip->id, PSIHB_ESB, 0, &val, NULL); + phys_map_get(chip->id, PSIHB_ESB, 0, &val, &size); + vm_map_global("PSIHB ESB", val, size, true, true); val |= PSIHB_ESB_CI_VALID; if (esb_shift == 16) val |= PSIHB10_ESB_CI_64K; @@ -972,9 +974,10 @@ static struct psi *psi_probe_p8(struct proc_chip *chip, u64 base) static struct psi *psi_probe_p9(struct proc_chip *chip, u64 base) { struct psi *psi = NULL; - uint64_t addr; + uint64_t addr, size; - phys_map_get(chip->id, PSIHB_REG, 0, &addr, NULL); + phys_map_get(chip->id, PSIHB_REG, 0, &addr, &size); + vm_map_global("PSIHB REG", addr, size, true, true); xscom_write(chip->id, base + PSIHB_XSCOM_P9_BASE, addr | PSIHB_XSCOM_P9_HBBAR_EN); @@ -989,9 +992,10 @@ static struct psi *psi_probe_p9(struct proc_chip *chip, u64 base) static struct psi *psi_probe_p10(struct proc_chip *chip, u64 base) { struct psi *psi = NULL; - uint64_t addr; + uint64_t addr, size; - phys_map_get(chip->id, PSIHB_REG, 0, &addr, NULL); + phys_map_get(chip->id, PSIHB_REG, 0, &addr, &size); + vm_map_global("PSIHB REG", addr, size, true, true); xscom_write(chip->id, base + PSIHB_XSCOM_P9_BASE, addr | PSIHB_XSCOM_P9_HBBAR_EN); diff --git a/hw/xive.c b/hw/xive.c index 51b03549a..3cb266e90 100644 --- a/hw/xive.c +++ b/hw/xive.c @@ -1430,9 +1430,12 @@ static bool xive_configure_bars(struct xive *x) { uint64_t chip_id = x->chip_id; uint64_t val; + bool tm_mapped = false; /* IC BAR */ phys_map_get(chip_id, XIVE_IC, 0, (uint64_t *)&x->ic_base, &x->ic_size); + vm_map_global("XIVE IC", (unsigned long)x->ic_base, x->ic_size, true, true); + val = (uint64_t)x->ic_base | CQ_IC_BAR_VALID | CQ_IC_BAR_64K; x->ic_shift = 16; @@ -1445,6 +1448,11 @@ static bool xive_configure_bars(struct xive *x) * all phys_map_get(XIVE_TM) calls. */ phys_map_get(0, XIVE_TM, 0, (uint64_t *)&x->tm_base, &x->tm_size); + if (!tm_mapped) { + vm_map_global("XIVE TM", (unsigned long)x->tm_base, x->tm_size, true, true); + tm_mapped = true; + } + val = (uint64_t)x->tm_base | CQ_TM_BAR_VALID | CQ_TM_BAR_64K; x->tm_shift = 16; @@ -1457,6 +1465,7 @@ static bool xive_configure_bars(struct xive *x) /* PC BAR. 
Clear first, write mask, then write value */ phys_map_get(chip_id, XIVE_PC, 0, (uint64_t *)&x->pc_base, &x->pc_size); + vm_map_global("XIVE PC", (unsigned long)x->pc_base, x->pc_size, true, true); xive_regwx(x, CQ_PC_BAR, 0); if (x->last_reg_error) return false; @@ -1471,6 +1480,7 @@ static bool xive_configure_bars(struct xive *x) /* VC BAR. Clear first, write mask, then write value */ phys_map_get(chip_id, XIVE_VC, 0, (uint64_t *)&x->vc_base, &x->vc_size); + vm_map_global("XIVE VC", (unsigned long)x->vc_base, x->vc_size, true, true); xive_regwx(x, CQ_VC_BAR, 0); if (x->last_reg_error) return false; diff --git a/hw/xive2.c b/hw/xive2.c index d5814bcbf..d01e64053 100644 --- a/hw/xive2.c +++ b/hw/xive2.c @@ -1392,12 +1392,14 @@ static bool xive_configure_ic_bars(struct xive *x) { uint64_t chip_id = x->chip_id; uint64_t val; + static bool tm_mapped = false; /* Reset all bars to zero */ xive_regwx(x, CQ_RST_CTL, CQ_RST_PB_BAR_RESET); /* IC BAR */ phys_map_get(chip_id, XIVE_IC, 0, (uint64_t *)&x->ic_base, &x->ic_size); + vm_map_global("XIVE IC", (unsigned long)x->ic_base, x->ic_size, true, true); val = (uint64_t)x->ic_base | CQ_IC_BAR_VALID | CQ_IC_BAR_64K; x->ic_shift = 16; @@ -1410,6 +1412,11 @@ static bool xive_configure_ic_bars(struct xive *x) * chip 0 and use that for all phys_map_get(XIVE_TM) calls. */ phys_map_get(0, XIVE_TM, 0, (uint64_t *)&x->tm_base, &x->tm_size); + if (!tm_mapped) { + vm_map_global("XIVE TM", (unsigned long)x->tm_base, x->tm_size, true, true); + tm_mapped = true; + } + val = (uint64_t)x->tm_base | CQ_TM_BAR_VALID | CQ_TM_BAR_64K; x->tm_shift = 16; @@ -1452,6 +1459,7 @@ static bool xive_configure_bars(struct xive *x) "0x%012llx > 0x%012llx\n", x->nvp_size, nvp_size); return false; } + vm_map_global("XIVE NVPG", (unsigned long)x->nvp_base, nvp_size, true, true); val = (uint64_t)x->nvp_base | CQ_BAR_VALID | CQ_BAR_64K | SETFIELD(CQ_BAR_RANGE, 0ull, ilog2(x->nvp_size) - 24); @@ -1466,6 +1474,7 @@ static bool xive_configure_bars(struct xive *x) "0x%012llx > 0x%012llx\n", x->esb_size, esb_size); return false; } + vm_map_global("XIVE ESB", (unsigned long)x->esb_base, esb_size, true, true); val = (uint64_t)x->esb_base | CQ_BAR_VALID | CQ_BAR_64K | SETFIELD(CQ_BAR_RANGE, 0ull, ilog2(x->esb_size) - 24); @@ -1480,6 +1489,7 @@ static bool xive_configure_bars(struct xive *x) "0x%012llx > 0x%012llx\n", x->end_size, end_size); return false; } + vm_map_global("XIVE END", (unsigned long)x->end_base, end_size, true, true); val = (uint64_t)x->end_base | CQ_BAR_VALID | CQ_BAR_64K | SETFIELD(CQ_BAR_RANGE, 0ull, ilog2(x->end_size) - 24); diff --git a/hw/xscom.c b/hw/xscom.c index 347457242..1ecd88e8f 100644 --- a/hw/xscom.c +++ b/hw/xscom.c @@ -958,6 +958,7 @@ void xscom_init(void) const struct dt_property *reg; struct proc_chip *chip; const char *chip_name; + u64 size; static const char *chip_names[] = { "UNKNOWN", "P8E", "P8", "P8NVL", "P9N", "P9C", "P9P", "P10", @@ -973,6 +974,9 @@ void xscom_init(void) assert(reg); chip->xscom_base = dt_translate_address(xn, 0, NULL); + size = dt_property_get_u64(reg, 1); + + vm_map_global("XSCOM MMIO", chip->xscom_base, size, true, true); /* Grab processor type and EC level */ xscom_init_chip_info(chip); diff --git a/include/cmpxchg.h b/include/cmpxchg.h index 0304e9134..835743cf5 100644 --- a/include/cmpxchg.h +++ b/include/cmpxchg.h @@ -5,6 +5,9 @@ #define __CMPXCHG_H #ifndef __TEST__ +#include +#include + /* * Bare cmpxchg, no barriers. 
*/ diff --git a/include/cpu.h b/include/cpu.h index 1be5cb0d4..8e598795d 100644 --- a/include/cpu.h +++ b/include/cpu.h @@ -12,6 +12,19 @@ #include #include +struct vm_map { + struct list_node list; + + const char *name; + uint64_t address; + uint64_t pa; + uint64_t length; + bool readable; + bool writeable; + bool executable; + bool ci; +}; + /* * cpu_thread is our internal structure representing each * thread in the system @@ -74,10 +87,19 @@ struct cpu_thread { struct bt_entry stack_bot_bt[CPU_BACKTRACE_SIZE]; struct bt_metadata stack_bot_bt_metadata; #endif + /* + * Per-thread VM parameters + */ + struct vm_map vm_local_map; /* per-cpu map */ + bool vm_local_map_inuse; + uint8_t vm_slb_rr; /* RR allocator */ + bool vm_setup; /* virtual memory is up */ + struct lock job_lock; struct list_head job_queue; uint32_t job_count; bool job_has_no_return; + /* * Per-core mask tracking for threads in HMI handler and * a cleanup done bit. @@ -316,6 +338,11 @@ static inline void cpu_give_self_os(void) __this_cpu->state = cpu_state_os; } +static inline bool cpu_in_os(void) +{ + return __this_cpu->state == cpu_state_os; +} + extern unsigned long __attrconst cpu_stack_bottom(unsigned int pir); extern unsigned long __attrconst cpu_stack_top(unsigned int pir); extern unsigned long __attrconst cpu_emergency_stack_top(unsigned int pir); diff --git a/include/elf-abi.h b/include/elf-abi.h index 29c757642..34b95d337 100644 --- a/include/elf-abi.h +++ b/include/elf-abi.h @@ -21,7 +21,16 @@ static inline uint64_t function_entry_address(void *func) { #ifdef ELF_ABI_v2 - u32 *insn = func; + u32 *ret = func; + u32 *i; + u32 insn; + u32 insn2; + + i = vm_map((unsigned long)func, sizeof(insn)*2, false); + insn = *i; + insn2 = *(i+1); + vm_unmap((unsigned long)func, sizeof(insn)*2); + /* * A PPC64 ABIv2 function may have a local and a global entry * point. 
We use the local entry point for branch tables called @@ -38,12 +47,12 @@ static inline uint64_t function_entry_address(void *func) * lis r2,XXXX * addi r2,r2,XXXX */ - if ((((*insn & OP_RT_RA_MASK) == ADDIS_R2_R12) || - ((*insn & OP_RT_RA_MASK) == LIS_R2)) && - ((*(insn+1) & OP_RT_RA_MASK) == ADDI_R2_R2)) - return (uint64_t)(insn + 2); + if ((((insn & OP_RT_RA_MASK) == ADDIS_R2_R12) || + ((insn & OP_RT_RA_MASK) == LIS_R2)) && + ((insn2 & OP_RT_RA_MASK) == ADDI_R2_R2)) + return (uint64_t)(ret + 2); else - return (uint64_t)func; + return (uint64_t)ret; #else return *(uint64_t *)func; #endif diff --git a/include/io.h b/include/io.h index f00021dcd..5c1bd41b4 100644 --- a/include/io.h +++ b/include/io.h @@ -7,6 +7,7 @@ #ifndef __ASSEMBLY__ #include +#include #include #include #include @@ -23,8 +24,13 @@ static inline uint8_t __in_8(const volatile uint8_t *addr) { uint8_t val; - asm volatile("lbzcix %0,0,%1" : - "=r"(val) : "r"(addr), "m"(*addr) : "memory"); + + if (vm_realmode()) + asm volatile("lbzcix %0,0,%1" : + "=r"(val) : "r"(addr), "m"(*addr)); + else + val = *addr; + return val; } @@ -37,8 +43,13 @@ static inline uint8_t in_8(const volatile uint8_t *addr) static inline uint16_t __in_be16(const volatile beint16_t *addr) { __be16 val; - asm volatile("lhzcix %0,0,%1" : - "=r"(val) : "r"(addr), "m"(*addr) : "memory"); + + if (vm_realmode()) + asm volatile("lhzcix %0,0,%1" : + "=r"(val) : "r"(addr), "m"(*addr)); + else + val = *addr; + return be16_to_cpu(val); } @@ -51,8 +62,13 @@ static inline uint16_t in_be16(const volatile beint16_t *addr) static inline uint16_t __in_le16(const volatile leint16_t *addr) { __le16 val; - asm volatile("lhzcix %0,0,%1" : - "=r"(val) : "r"(addr), "m"(*addr) : "memory"); + + if (vm_realmode()) + asm volatile("lhzcix %0,0,%1" : + "=r"(val) : "r"(addr), "m"(*addr)); + else + val = *addr; + return le16_to_cpu(val); } @@ -65,8 +81,13 @@ static inline uint16_t in_le16(const volatile leint16_t *addr) static inline uint32_t __in_be32(const volatile beint32_t *addr) { __be32 val; - asm volatile("lwzcix %0,0,%1" : - "=r"(val) : "r"(addr), "m"(*addr) : "memory"); + + if (vm_realmode()) + asm volatile("lwzcix %0,0,%1" : + "=r"(val) : "r"(addr), "m"(*addr)); + else + val = *addr; + return be32_to_cpu(val); } @@ -79,8 +100,13 @@ static inline uint32_t in_be32(const volatile beint32_t *addr) static inline uint32_t __in_le32(const volatile leint32_t *addr) { __le32 val; - asm volatile("lwzcix %0,0,%1" : - "=r"(val) : "r"(addr), "m"(*addr) : "memory"); + + if (vm_realmode()) + asm volatile("lwzcix %0,0,%1" : + "=r"(val) : "r"(addr), "m"(*addr)); + else + val = *addr; + return le32_to_cpu(val); } @@ -93,8 +119,13 @@ static inline uint32_t in_le32(const volatile leint32_t *addr) static inline uint64_t __in_be64(const volatile beint64_t *addr) { __be64 val; - asm volatile("ldcix %0,0,%1" : - "=r"(val) : "r"(addr), "m"(*addr) : "memory"); + + if (vm_realmode()) + asm volatile("ldcix %0,0,%1" : + "=r"(val) : "r"(addr), "m"(*addr)); + else + val = *addr; + return be64_to_cpu(val); } @@ -107,8 +138,13 @@ static inline uint64_t in_be64(const volatile beint64_t *addr) static inline uint64_t __in_le64(const volatile leint64_t *addr) { __le64 val; - asm volatile("ldcix %0,0,%1" : - "=r"(val) : "r"(addr), "m"(*addr) : "memory"); + + if (vm_realmode()) + asm volatile("ldcix %0,0,%1" : + "=r"(val) : "r"(addr), "m"(*addr)); + else + val = *addr; + return le64_to_cpu(val); } @@ -120,8 +156,11 @@ static inline uint64_t in_le64(const volatile leint64_t *addr) static inline void 
__out_8(volatile uint8_t *addr, uint8_t val) { - asm volatile("stbcix %0,0,%1" - : : "r"(val), "r"(addr), "m"(*addr) : "memory"); + if (vm_realmode()) + asm volatile("stbcix %0,0,%1" + : : "r"(val), "r"(addr), "m"(*addr)); + else + *addr = val; } static inline void out_8(volatile uint8_t *addr, uint8_t val) @@ -132,8 +171,12 @@ static inline void out_8(volatile uint8_t *addr, uint8_t val) static inline void __out_be16(volatile beint16_t *addr, uint16_t val) { - asm volatile("sthcix %0,0,%1" - : : "r"(cpu_to_be16(val)), "r"(addr), "m"(*addr) : "memory"); + __be16 __val = cpu_to_be16(val); + if (vm_realmode()) + asm volatile("sthcix %0,0,%1" + : : "r"(__val), "r"(addr), "m"(*addr)); + else + *addr = __val; } static inline void out_be16(volatile beint16_t *addr, uint16_t val) @@ -144,8 +187,12 @@ static inline void out_be16(volatile beint16_t *addr, uint16_t val) static inline void __out_le16(volatile leint16_t *addr, uint16_t val) { - asm volatile("sthcix %0,0,%1" - : : "r"(cpu_to_le16(val)), "r"(addr), "m"(*addr) : "memory"); + __le16 __val = cpu_to_le16(val); + if (vm_realmode()) + asm volatile("sthcix %0,0,%1" + : : "r"(__val), "r"(addr), "m"(*addr)); + else + *addr = __val; } static inline void out_le16(volatile leint16_t *addr, uint16_t val) @@ -156,8 +203,12 @@ static inline void out_le16(volatile leint16_t *addr, uint16_t val) static inline void __out_be32(volatile beint32_t *addr, uint32_t val) { - asm volatile("stwcix %0,0,%1" - : : "r"(cpu_to_be32(val)), "r"(addr), "m"(*addr) : "memory"); + __be32 __val = cpu_to_be32(val); + if (vm_realmode()) + asm volatile("stwcix %0,0,%1" + : : "r"(__val), "r"(addr), "m"(*addr)); + else + *addr = __val; } static inline void out_be32(volatile beint32_t *addr, uint32_t val) @@ -168,8 +219,12 @@ static inline void out_be32(volatile beint32_t *addr, uint32_t val) static inline void __out_le32(volatile leint32_t *addr, uint32_t val) { - asm volatile("stwcix %0,0,%1" - : : "r"(cpu_to_le32(val)), "r"(addr), "m"(*addr) : "memory"); + __le32 __val = cpu_to_le32(val); + if (vm_realmode()) + asm volatile("stwcix %0,0,%1" + : : "r"(__val), "r"(addr), "m"(*addr)); + else + *addr = __val; } static inline void out_le32(volatile leint32_t *addr, uint32_t val) @@ -180,8 +235,12 @@ static inline void out_le32(volatile leint32_t *addr, uint32_t val) static inline void __out_be64(volatile beint64_t *addr, uint64_t val) { - asm volatile("stdcix %0,0,%1" - : : "r"(cpu_to_be64(val)), "r"(addr), "m"(*addr) : "memory"); + __be64 __val = cpu_to_be64(val); + if (vm_realmode()) + asm volatile("stdcix %0,0,%1" + : : "r"(__val), "r"(addr), "m"(*addr)); + else + *addr = __val; } static inline void out_be64(volatile beint64_t *addr, uint64_t val) @@ -192,8 +251,12 @@ static inline void out_be64(volatile beint64_t *addr, uint64_t val) static inline void __out_le64(volatile leint64_t *addr, uint64_t val) { - asm volatile("stdcix %0,0,%1" - : : "r"(cpu_to_le64(val)), "r"(addr), "m"(*addr) : "memory"); + __le64 __val = cpu_to_le64(val); + if (vm_realmode()) + asm volatile("stdcix %0,0,%1" + : : "r"(__val), "r"(addr), "m"(*addr)); + else + *addr = __val; } static inline void out_le64(volatile leint64_t *addr, uint64_t val) diff --git a/include/mem_region.h b/include/mem_region.h index 3e3818a66..47c3bd70c 100644 --- a/include/mem_region.h +++ b/include/mem_region.h @@ -33,6 +33,7 @@ struct mem_region { struct list_node list; const char *name; uint64_t start, len; + uint64_t vm_mapped_len; struct dt_node *node; enum mem_region_type type; struct list_head free_list; diff --git 
a/include/platform.h b/include/platform.h index d113e6eb9..34e8e296f 100644 --- a/include/platform.h +++ b/include/platform.h @@ -305,8 +305,8 @@ struct platform { void (*vpd_iohub_load)(struct dt_node *hub_node); }; -extern struct platform __platforms_start; -extern struct platform __platforms_end; +extern struct platform __platforms_start[]; +extern struct platform __platforms_end[]; extern struct platform platform; extern const struct bmc_platform *bmc_platform; diff --git a/include/processor.h b/include/processor.h index 973d7e77b..858fa9935 100644 --- a/include/processor.h +++ b/include/processor.h @@ -41,7 +41,9 @@ #define SPR_SRR1 0x01b /* RW: Exception save/restore reg 1 */ #define SPR_CFAR 0x01c /* RW: Come From Address Register */ #define SPR_AMR 0x01d /* RW: Authority Mask Register */ +#define SPR_PID 0x030 /* RW: PID register */ #define SPR_IAMR 0x03d /* RW: Instruction Authority Mask Register */ +#define SPR_UAMOR 0x09d #define SPR_RPR 0x0ba /* RW: Relative Priority Register */ #define SPR_TBRL 0x10c /* RO: Timebase low */ #define SPR_TBRU 0x10d /* RO: Timebase high */ @@ -63,10 +65,12 @@ #define SPR_HSRR1 0x13b /* RW: HV Exception save/restore reg 1 */ #define SPR_TFMR 0x13d #define SPR_LPCR 0x13e +#define SPR_LPID 0x13f /* RW: LPID register */ #define SPR_HMER 0x150 /* Hypervisor Maintenance Exception */ #define SPR_HMEER 0x151 /* HMER interrupt enable mask */ #define SPR_PCR 0x152 #define SPR_AMOR 0x15d +#define SPR_PTCR 0x1d0 /* RW: Partition table control register */ #define SPR_USRR0 0x1fa /* RW: Ultravisor Save/Restore Register 0 */ #define SPR_USRR1 0x1fb /* RW: Ultravisor Save/Restore Register 1 */ #define SPR_SMFCTRL 0x1ff /* RW: Secure Memory Facility Control */ @@ -85,6 +89,11 @@ #define SPR_SRR1_PM_WAKE_SRESET 0x100000 #define SPR_SRR1_PM_WAKE_MCE 0x3c0000 /* Use reserved value for MCE */ +/* Bits in DSISR */ + +#define DSISR_ISSTORE 0x02000000 + + /* Bits in LPCR */ /* Powersave Exit Cause Enable is different on each generation */ @@ -375,9 +384,9 @@ static inline void isync(void) /* * Cache sync */ -static inline void sync_icache(void) +static inline void sync_icache(unsigned long ptr) { - asm volatile("sync; icbi 0,%0; sync; isync" : : "r" (0) : "memory"); + asm volatile("sync; icbi 0,%0; sync; isync" : : "r" (ptr) : "memory"); } /* diff --git a/include/skiboot.h b/include/skiboot.h index f83fcbdf6..f92cb92e9 100644 --- a/include/skiboot.h +++ b/include/skiboot.h @@ -44,10 +44,16 @@ extern char _stext[]; extern char _etext[]; extern char __sym_map_end[]; extern char _romem_end[]; +extern char __vm_mapped_romem_end[]; #ifndef __TESTING__ +extern char _stext[], _etext[]; /* Readonly section start and end. 
*/ extern char __rodata_start[], __rodata_end[]; +extern char _sdata[], _edata[]; +extern char __sym_map_start[], __sym_map_end[]; +extern char _sbss[], _ebss[]; +extern char _end[]; static inline bool is_rodata(const void *p) { @@ -193,6 +199,7 @@ extern void disable_fast_reboot(const char *reason); extern void add_fast_reboot_dt_entries(void); extern void fast_reboot(void); extern void __noreturn __secondary_cpu_entry(void); +extern void __noreturn __return_cpu_entry(void); extern void __noreturn load_and_boot_kernel(bool is_reboot); extern void cleanup_local_tlb(void); extern void cleanup_global_tlb(void); @@ -338,6 +345,26 @@ extern uint32_t reset_patch_end; extern uint32_t reset_fast_reboot_patch_start; extern uint32_t reset_fast_reboot_patch_end; +/* core/vm.c */ +bool vm_realmode(void); +void vm_map_global(const char *name, unsigned long addr, unsigned long len, bool rw, bool ci); +void vm_map_global_text(const char *name, unsigned long addr, unsigned long len); +void vm_unmap_global(unsigned long addr, unsigned long len); +void *vm_map(unsigned long addr, unsigned long len, bool rw); +void vm_unmap(unsigned long addr, unsigned long len); +void vm_init(bool fast_reboot); +void vm_init_stacks(void); +void vm_destroy(void); +void vm_init_secondary(void); +void vm_enter(void); +void vm_exit(void); +void vm_exit_cleanup(void); +void vm_map_stacks(void); +bool vm_dslb(uint64_t nia, uint64_t dar); +bool vm_islb(uint64_t nia); +bool vm_dsi(uint64_t nia, uint64_t dar, uint32_t dsisr); +bool vm_isi(uint64_t nia); + /* Fallback fake NVRAM */ extern int fake_nvram_info(uint32_t *total_size); extern int fake_nvram_start_read(void *dst, uint32_t src, uint32_t len); @@ -373,8 +400,8 @@ static const struct hwprobe __used __section(".hwprobes") hwprobe_##__name = { \ .deps = (const char *[]){ __VA_ARGS__, NULL}, \ } -extern struct hwprobe __hwprobes_start; -extern struct hwprobe __hwprobes_end; +extern struct hwprobe __hwprobes_start[]; +extern struct hwprobe __hwprobes_end[]; extern void probe_hardware(void); diff --git a/libstb/container.c b/libstb/container.c index eca54cf63..2b8f22f70 100644 --- a/libstb/container.c +++ b/libstb/container.c @@ -6,14 +6,20 @@ bool stb_is_container(const void *buf, size_t size) { + beint32_t *t; ROM_container_raw *c; + bool ret = true;; c = (ROM_container_raw*) buf; if (!buf || size < SECURE_BOOT_HEADERS_SIZE) return false; - if (be32_to_cpu(c->magic_number) != ROM_MAGIC_NUMBER ) - return false; - return true; + + t = vm_map((unsigned long)&c->magic_number, sizeof(*t), false); + if (be32_to_cpu(*t) != ROM_MAGIC_NUMBER) + ret = false; + vm_unmap((unsigned long)&c->magic_number, sizeof(*t)); + + return ret; } uint32_t stb_payload_magic(const void *buf, size_t size) diff --git a/libstb/cvc.c b/libstb/cvc.c index 663e53953..08b2eea60 100644 --- a/libstb/cvc.c +++ b/libstb/cvc.c @@ -155,6 +155,9 @@ static int cvc_reserved_mem_init(struct dt_node *parent) { return -1; } addr = dt_get_address(cvc_resv_mem, 0, &size); + if (size == 0) // MAMBO HACK + size = 64*1024; + vm_map_global_text("STB-CVC", addr, size); cvc_register(addr, addr + size-1); exports = dt_find_by_path(dt_root, "/ibm,opal/firmware/exports"); diff --git a/libstb/secureboot.c b/libstb/secureboot.c index f8cce2851..3bd00b067 100644 --- a/libstb/secureboot.c +++ b/libstb/secureboot.c @@ -169,6 +169,7 @@ int secureboot_verify(enum resource_id id, void *buf, size_t len) { const char *name; __be64 log; + void *vbuf; int rc = -1; name = flash_map_resource_name(id); @@ -186,7 +187,9 @@ int secureboot_verify(enum 
resource_id id, void *buf, size_t len) return -1; } - rc = call_cvc_verify(buf, len, hw_key_hash, hw_key_hash_size, &log); + vbuf = vm_map((unsigned long)buf, len, false); + rc = call_cvc_verify(vbuf, len, hw_key_hash, hw_key_hash_size, &log); + vm_unmap((unsigned long)buf, len); if (rc == OPAL_SUCCESS) { prlog(PR_NOTICE, "%s verified\n", name); diff --git a/libstb/trustedboot.c b/libstb/trustedboot.c index 1be2f07e6..5b0f6f602 100644 --- a/libstb/trustedboot.c +++ b/libstb/trustedboot.c @@ -166,7 +166,7 @@ out_free: int trustedboot_measure(enum resource_id id, void *buf, size_t len) { uint8_t digest[SHA512_DIGEST_LENGTH]; - void *buf_aux; + void *buf_aux, *vbuf; size_t len_aux; const char *name; TPMI_DH_PCR pcr; @@ -224,7 +224,9 @@ int trustedboot_measure(enum resource_id id, void *buf, size_t len) len_aux = len; } - rc = call_cvc_sha512(buf_aux, len_aux, digest, SHA512_DIGEST_LENGTH); + vbuf = vm_map((unsigned long)buf_aux, len_aux, false); + rc = call_cvc_sha512(vbuf, len_aux, digest, SHA512_DIGEST_LENGTH); + vm_unmap((unsigned long)buf_aux, len_aux); if (rc == OPAL_SUCCESS) { prlog(PR_NOTICE, "%s hash calculated\n", name); diff --git a/skiboot.lds.S b/skiboot.lds.S index c8e6e747c..f3715706f 100644 --- a/skiboot.lds.S +++ b/skiboot.lds.S @@ -123,12 +123,26 @@ SECTIONS __rodata_end = .; } + . = ALIGN(0x100); + .got : { + __toc_start = . + 0x8000; + *(.got) + *(.toc) + } + + . = ALIGN(0x10); + .opd : { + *(.opd) + } + . = ALIGN(0x10); .trap_table : { __trap_table_start = .; KEEP(*(.trap_table)) __trap_table_end = .; } + __vm_mapped_romem_end = .; + . = ALIGN(PAGE_SIZE); . = ALIGN(0x10); .init : { @@ -139,18 +153,6 @@ SECTIONS __ctors_end = .; } - . = ALIGN(0x10); - .opd : { - *(.opd) - } - - . = ALIGN(0x100); - .got : { - __toc_start = . + 0x8000; - *(.got) - *(.toc) - } - . = ALIGN(0x10); .opal_table : { __opal_table_start = .; -- 2.23.0 From hegdevasant at linux.vnet.ibm.com Thu Aug 19 21:40:50 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Thu, 19 Aug 2021 17:10:50 +0530 Subject: [Skiboot] [PATCH 1/3] xive/p10: Fix xive_special_cache_check when DEBUG=1 In-Reply-To: <20210807073821.192901-1-clg@kaod.org> References: <20210807073821.192901-1-clg@kaod.org> Message-ID: <2ab0f7c7-ffcf-7ef6-578d-14d15362bf99@linux.vnet.ibm.com> On 8/7/21 1:08 PM, Cédric Le Goater wrote: > The special cache check done when skiboot is compiled with DEBUG is > incompatible with Automatic Context Save and Restore. > > Random data is written in the NVP to check that cache updates are > correct but this can lead to a checkstop raised by the XIVE interrupt > controller. When the NVP Valid (0) bit, the hardware controlled H (7) > bit, and the Checked Out bit (45) are all ones at the same time, the > HW thinks that the NVP entry is checked out by a thread and does not > allow the cache write to occur. > > Make sure that the valid bit is not set on the NVP. > > Signed-off-by: Cédric Le Goater Thanks! Merged series to master as of cd12ea6d. -Vasant From hegdevasant at linux.vnet.ibm.com Fri Aug 20 01:40:39 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Thu, 19 Aug 2021 21:10:39 +0530 Subject: [Skiboot] [PATCH v2] hello_world: Add p10 mambo tests Message-ID: <20210819154039.52851-1-hegdevasant@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde --- Changes in v2: I had missed adding run_mambo_p10_hello_world.sh in v1.
-Vasant test/hello_world/Makefile.check | 16 +++++ test/hello_world/run_mambo_p10_hello_world.sh | 64 +++++++++++++++++++ 2 files changed, 80 insertions(+) create mode 100755 test/hello_world/run_mambo_p10_hello_world.sh diff --git a/test/hello_world/Makefile.check b/test/hello_world/Makefile.check index 0390cf662..8cf15cb2f 100644 --- a/test/hello_world/Makefile.check +++ b/test/hello_world/Makefile.check @@ -4,14 +4,18 @@ HELLO_WORLD_STB_TEST := test/hello_world/hello_kernel/hello_kernel.stb .PHONY: hello_world-tests hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-smt-mambo) hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-smt-p9-mambo) +hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-smt-p10-mambo) hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-mambo) hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-p9-mambo) +hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-p10-mambo) hello_world-tests: $(HELLO_WORLD_TEST:%=%-check-qemu) hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-mambo) hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-p9-mambo) +hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-p10-mambo) hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-mambo) hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-p9-mambo) +hello_world-tests: $(HELLO_WORLD_STB_TEST:%=%-check-stb-p10-mambo) boot-tests: hello_world-tests check: hello_world-tests @@ -29,12 +33,18 @@ $(HELLO_WORLD_TEST:%=%-check-smt-mambo): %-check-smt-mambo: % skiboot.lid $(HELLO_WORLD_TEST:%=%-check-smt-p9-mambo): %-check-smt-p9-mambo: % skiboot.lid $(call Q , BOOT TEST , THREADS=2 ./test/hello_world/run_mambo_p9_hello_world.sh , $@) +$(HELLO_WORLD_TEST:%=%-check-smt-p10-mambo): %-check-smt-p10-mambo: % skiboot.lid + $(call Q , BOOT TEST , THREADS=2 ./test/hello_world/run_mambo_p10_hello_world.sh , $@) + $(HELLO_WORLD_TEST:%=%-check-mambo): %-check-mambo: % skiboot.lid $(call Q , BOOT TEST , ./test/hello_world/run_mambo_hello_world.sh, $@) $(HELLO_WORLD_TEST:%=%-check-p9-mambo): %-check-p9-mambo: % skiboot.lid $(call Q , BOOT TEST , ./test/hello_world/run_mambo_p9_hello_world.sh, $@) +$(HELLO_WORLD_TEST:%=%-check-p10-mambo): %-check-p10-mambo: % skiboot.lid + $(call Q , BOOT TEST , ./test/hello_world/run_mambo_p10_hello_world.sh, $@) + # and now, with secure and trusted boot: $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-mambo): %-check-stb-smt-mambo: % skiboot.lid.stb $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 THREADS=2 ./test/hello_world/run_mambo_hello_world.sh , $@) @@ -42,12 +52,18 @@ $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-mambo): %-check-stb-smt-mambo: % skiboo $(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-p9-mambo): %-check-stb-smt-p9-mambo: % skiboot.lid.stb $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 THREADS=2 ./test/hello_world/run_mambo_p9_hello_world.sh , $@) +$(HELLO_WORLD_STB_TEST:%=%-check-stb-smt-p10-mambo): %-check-stb-smt-p10-mambo: % skiboot.lid.stb + $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 THREADS=2 ./test/hello_world/run_mambo_p10_hello_world.sh , $@) + $(HELLO_WORLD_STB_TEST:%=%-check-stb-mambo): %-check-stb-mambo: % skiboot.lid.stb $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 ./test/hello_world/run_mambo_hello_world.sh, $@) $(HELLO_WORLD_STB_TEST:%=%-check-stb-p9-mambo): %-check-stb-p9-mambo: % skiboot.lid.stb $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 ./test/hello_world/run_mambo_p9_hello_world.sh, $@) +$(HELLO_WORLD_STB_TEST:%=%-check-stb-p10-mambo): %-check-stb-p10-mambo: % skiboot.lid.stb + $(call Q , BOOT TEST , SKIBOOT_ENABLE_MAMBO_STB=1 
./test/hello_world/run_mambo_p10_hello_world.sh, $@) + # qemu $(HELLO_WORLD_TEST:%=%-check-qemu): %-check-qemu: % skiboot.lid diff --git a/test/hello_world/run_mambo_p10_hello_world.sh b/test/hello_world/run_mambo_p10_hello_world.sh new file mode 100755 index 000000000..4ce54494d --- /dev/null +++ b/test/hello_world/run_mambo_p10_hello_world.sh @@ -0,0 +1,64 @@ +#!/bin/bash + +if [ -z "$P10MAMBO_PATH" ]; then + P10MAMBO_PATH=/opt/ibm/systemsim-p10-1.1-0 +fi + +if [ -z "$P10MAMBO_BINARY" ]; then + P10MAMBO_BINARY="run/p10/power10" +fi + +if [ ! -x "$P10MAMBO_PATH/$P10MAMBO_BINARY" ]; then + echo "Could not find executable P10MAMBO_BINARY ($P10MAMBO_PATH/$P10MAMBO_BINARY). Skipping hello_world test"; + exit 0; +fi + +if [ -n "$KERNEL" ]; then + echo 'Please rebuild skiboot without KERNEL set. Skipping hello_world test'; + exit 0; +fi + +if [ ! $(command -v expect) ]; then + echo 'Could not find expect binary. Skipping hello_world test'; + exit 0; +fi + +if [ -n "$SKIBOOT_ENABLE_MAMBO_STB" ]; then + export SKIBOOT_ZIMAGE=$(pwd)/test/hello_world/hello_kernel/hello_kernel.stb + export SKIBOOT_CVC_CODE=$(pwd)/external/mambo/cvc.bin +else + export SKIBOOT_ZIMAGE=$(pwd)/test/hello_world/hello_kernel/hello_kernel +fi + +# Currently getting some core dumps from mambo, so disable them! +ulimit -c 0 + +t=$(mktemp) || exit 1 + +trap "rm -f -- '$t'" EXIT + +( cd external/mambo; +cat <&1 > $t + +r=$? +if [ $r != 0 ]; then + cat $t + exit $r +fi + +if [ -n "$V" ] ; then cat "$t" ; fi +rm -f -- "$t" +trap - EXIT +exit 0; -- 2.31.1 From hegdevasant at linux.vnet.ibm.com Fri Aug 20 01:41:03 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Thu, 19 Aug 2021 21:11:03 +0530 Subject: [Skiboot] [PATCH] ci: Bump qemu version Message-ID: <20210819154103.53068-1-hegdevasant@linux.vnet.ibm.com> Move to qemu version powernv-6.1. Signed-off-by: Vasant Hegde --- opal-ci/build-qemu-powernv.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/opal-ci/build-qemu-powernv.sh b/opal-ci/build-qemu-powernv.sh index cf96048a8..725130309 100755 --- a/opal-ci/build-qemu-powernv.sh +++ b/opal-ci/build-qemu-powernv.sh @@ -2,7 +2,7 @@ set -e set -vx -git clone --depth=1 -b powernv-6.0 git://github.com/open-power/qemu.git +git clone --depth=1 -b powernv-6.1 git://github.com/open-power/qemu.git cd qemu git submodule update --init dtc export CC="ccache gcc" -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:41 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:41 +0200 Subject: [Skiboot] [PATCH 00/16] OpenCAPI 5.0 Support for P10 Message-ID: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> The Open Coherently Attached Processor Interface (OCAPI) is used to allow an Attached Functional Unit (AFU) to connect to the Processor Chip's system bus in a high speed and cache coherent manner. This series implements OpenCAPI support for P10. The series is divided as follows: - Patches 1-3: general refactoring Add various structs and fields we'll need later. - Patch 4: Detect devices on Rainier platform - Patches 5-9: setting up the PAU - Patch 10: create OpenCAPI PHBs - Patch 11: dumping hmi scoms - Patch 12-13: Link training and Phy configuration - Patch 14: OPAL API calls We define three new API calls for handling the Shared Process Area and setting OpenCAPI TL template capabilities. - Patch 15: mmio invalidates. Use MMIO registers to perform TLB operations. - Patch 16: lpc memory support. 
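The three OPAL calls from patch 14 reuse the tokens already routed through the common dispatch moved in patch 01: OPAL_NPU_SPA_SETUP, OPAL_NPU_SPA_CLEAR_CACHE and OPAL_NPU_TL_SET. A minimal sketch of how an OS-side driver might invoke them is below; the opal_npu_*() wrapper names are illustrative only, and the constraints in the comments mirror the skiboot-side checks:

	/* Sketch only: hypothetical OS-side wrappers around the OPAL calls */
	int64_t rc;

	/* The SPA base must be 4k aligned and PE_mask <= 15, else OPAL_PARAMETER */
	rc = opal_npu_spa_setup(phb_id, bdfn, spa_addr, pe_mask);
	if (rc != OPAL_SUCCESS)
		return rc;

	/* The template rate buffer must be exactly TL_RATE_BUF_SIZE bytes */
	rc = opal_npu_tl_set(phb_id, bdfn, tl_capabilities, rate_buf_phys,
			     TL_RATE_BUF_SIZE);

	/* On a fault, a PE handle (<= MAX_PE_HANDLE) can be evicted
	 * from the SPA cache: */
	rc = opal_npu_spa_clear_cache(phb_id, bdfn, pe_handle);
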
This series has been tested on a Rainier system with HMS/Bono and TresHombres cards. Christophe Lombard (16): opencapi5: move opal api opencapi5: update npu3 opencapi5: introduce support opencapi5: rainier detect device opencapi5: assign bars opencapi5: create phb opencapi5: enabling opencapi opencapi5: translation layer configuration opencapi5: enable interrupt on error opencapi5: complete phb ops opencapi5: hmi scom dump opencapi5: phy init opencapi5: link training opencapi5: add opal functions opencapi5: mmio invalidates opencapi5: Add support for OpenCAPI Persistent Memory devices. core/hmi.c | 263 +++-- core/init.c | 3 + core/pci-opal.c | 9 +- core/pci.c | 4 +- hdata/spira.c | 140 ++- hdata/spira.h | 2 +- hw/Makefile.inc | 2 +- hw/npu-opal.c | 113 ++ hw/npu2-common.c | 30 +- hw/npu2-opencapi.c | 53 +- hw/npu3.c | 4 +- hw/pau-hw-procedures.c | 310 ++++++ hw/pau.c | 2073 ++++++++++++++++++++++++++++++++++++ hw/phys-map.c | 49 +- include/npu2-regs.h | 5 + include/npu2.h | 12 +- include/pau-regs.h | 223 ++++ include/pau.h | 222 ++++ include/pci.h | 1 + include/phys-map.h | 4 + include/platform.h | 9 + include/skiboot.h | 1 + include/xscom-p10-regs.h | 49 + platforms/astbmc/rainier.c | 181 ++++ 24 files changed, 3530 insertions(+), 232 deletions(-) create mode 100644 hw/pau-hw-procedures.c create mode 100644 hw/pau.c create mode 100644 include/pau-regs.h create mode 100644 include/pau.h -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:43 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:43 +0200 Subject: [Skiboot] [PATCH 02/16] [PATCH 02/16] opencapi5: update npu3 In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-3-clombard@linux.vnet.ibm.com> Create the npu3 entry in the device tree only for the P9P chip, to avoid confusion with later chips. NPU3 is only available on Axone. Signed-off-by: Christophe Lombard --- hw/npu3.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/hw/npu3.c b/hw/npu3.c index 03461373..0a9bbce8 100644 --- a/hw/npu3.c +++ b/hw/npu3.c @@ -113,9 +113,7 @@ static bool npu3_dt_create(void) struct dt_node *xscom; /* npu3 chips only */ - if (proc_gen < proc_gen_p9 ||
Signed-off-by: Christophe Lombard --- hw/npu-opal.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++ hw/npu2-opencapi.c | 35 ++++-------------------- include/npu2.h | 7 +++++ 3 files changed, 79 insertions(+), 30 deletions(-) diff --git a/hw/npu-opal.c b/hw/npu-opal.c index 412ea460..64e36852 100644 --- a/hw/npu-opal.c +++ b/hw/npu-opal.c @@ -174,3 +174,70 @@ static int64_t opal_npu_get_relaxed_order(uint64_t phb_id, return phb4->ro_state; } opal_call(OPAL_NPU_GET_RELAXED_ORDER, opal_npu_get_relaxed_order, 2); + +#define MAX_PE_HANDLE ((1 << 15) - 1) + +static int64_t opal_npu_spa_setup(uint64_t phb_id, uint32_t bdfn, + uint64_t addr, uint64_t PE_mask) +{ + struct phb *phb = pci_get_phb(phb_id); + int64_t rc = OPAL_SUCCESS; + + if (!phb) + return OPAL_PARAMETER; + + /* 4k aligned */ + if (addr & 0xFFF) + return OPAL_PARAMETER; + + if (PE_mask > 15) + return OPAL_PARAMETER; + + if (phb->phb_type == phb_type_npu_v2_opencapi) + rc = npu2_opencapi_spa_setup(phb, bdfn, addr, PE_mask); + else + return OPAL_PARAMETER; + + return rc; +} +opal_call(OPAL_NPU_SPA_SETUP, opal_npu_spa_setup, 4); + +static int64_t opal_npu_spa_clear_cache(uint64_t phb_id, uint32_t bdfn, + uint64_t PE_handle) +{ + struct phb *phb = pci_get_phb(phb_id); + int64_t rc = OPAL_SUCCESS; + + if (!phb) + return OPAL_PARAMETER; + + if (PE_handle > MAX_PE_HANDLE) + return OPAL_PARAMETER; + + if (phb->phb_type == phb_type_npu_v2_opencapi) + rc = npu2_opencapi_spa_clear_cache(phb, bdfn, PE_handle); + else + return OPAL_PARAMETER; + + return rc; +} +opal_call(OPAL_NPU_SPA_CLEAR_CACHE, opal_npu_spa_clear_cache, 3); + +static int64_t opal_npu_tl_set(uint64_t phb_id, uint32_t bdfn, + long capabilities, uint64_t rate_phys, int rate_sz) +{ + struct phb *phb = pci_get_phb(phb_id); + int64_t rc = OPAL_SUCCESS; + + if (!phb) + return OPAL_PARAMETER; + + if (phb->phb_type == phb_type_npu_v2_opencapi) + rc = npu2_opencapi_tl_set(phb, bdfn, capabilities, + rate_phys, rate_sz); + else + return OPAL_PARAMETER; + + return rc; +} +opal_call(OPAL_NPU_TL_SET, opal_npu_tl_set, 5); diff --git a/hw/npu2-opencapi.c b/hw/npu2-opencapi.c index 035c6cdc..686f2e22 100644 --- a/hw/npu2-opencapi.c +++ b/hw/npu2-opencapi.c @@ -1957,24 +1957,13 @@ void npu2_opencapi_set_broken(struct npu2 *npu, int brick) } } -static int64_t opal_npu_spa_setup(uint64_t phb_id, uint32_t __unused bdfn, +int64_t npu2_opencapi_spa_setup(struct phb *phb, uint32_t __unused bdfn, uint64_t addr, uint64_t PE_mask) { uint64_t stack, block, offset, reg; - struct phb *phb = pci_get_phb(phb_id); struct npu2_dev *dev; int rc; - if (!phb || phb->phb_type != phb_type_npu_v2_opencapi) - return OPAL_PARAMETER; - - /* 4k aligned */ - if (addr & 0xFFF) - return OPAL_PARAMETER; - - if (PE_mask > 15) - return OPAL_PARAMETER; - dev = phb_to_npu2_dev_ocapi(phb); if (!dev) return OPAL_PARAMETER; @@ -1986,7 +1975,6 @@ static int64_t opal_npu_spa_setup(uint64_t phb_id, uint32_t __unused bdfn, else offset = NPU2_XSL_PSL_SPAP_A0; - lock(&dev->npu->lock); /* * set the SPAP used by the device @@ -2024,22 +2012,14 @@ out: unlock(&dev->npu->lock); return rc; } -opal_call(OPAL_NPU_SPA_SETUP, opal_npu_spa_setup, 4); -static int64_t opal_npu_spa_clear_cache(uint64_t phb_id, uint32_t __unused bdfn, - uint64_t PE_handle) +int64_t npu2_opencapi_spa_clear_cache(struct phb *phb, uint32_t __unused bdfn, + uint64_t PE_handle) { uint64_t cc_inv, stack, block, reg, rc; uint32_t retries = 5; - struct phb *phb = pci_get_phb(phb_id); struct npu2_dev *dev; - if (!phb || phb->phb_type != phb_type_npu_v2_opencapi) - return 
OPAL_PARAMETER; - - if (PE_handle > MAX_PE_HANDLE) - return OPAL_PARAMETER; - dev = phb_to_npu2_dev_ocapi(phb); if (!dev) return OPAL_PARAMETER; @@ -2077,7 +2057,6 @@ out: unlock(&dev->npu->lock); return rc; } -opal_call(OPAL_NPU_SPA_CLEAR_CACHE, opal_npu_spa_clear_cache, 3); static int get_template_rate(unsigned int templ, char *rate_buf) { @@ -2101,17 +2080,14 @@ static bool is_template_supported(unsigned int templ, long capabilities) return !!(capabilities & (1ull << templ)); } -static int64_t opal_npu_tl_set(uint64_t phb_id, uint32_t __unused bdfn, - long capabilities, uint64_t rate_phys, int rate_sz) +int64_t npu2_opencapi_tl_set(struct phb *phb, uint32_t __unused bdfn, + long capabilities, uint64_t rate_phys, int rate_sz) { - struct phb *phb = pci_get_phb(phb_id); struct npu2_dev *dev; uint64_t stack, block, reg, templ_rate; int i, rate_pos; char *rate = (char *) rate_phys; - if (!phb || phb->phb_type != phb_type_npu_v2_opencapi) - return OPAL_PARAMETER; if (!opal_addr_valid(rate) || rate_sz != TL_RATE_BUF_SIZE) return OPAL_PARAMETER; @@ -2157,7 +2133,6 @@ static int64_t opal_npu_tl_set(uint64_t phb_id, uint32_t __unused bdfn, OCAPIDBG(dev, "OTL configuration 1 register set to %llx\n", reg); return OPAL_SUCCESS; } -opal_call(OPAL_NPU_TL_SET, opal_npu_tl_set, 5); static void set_mem_bar(struct npu2_dev *dev, uint64_t base, uint64_t size) { diff --git a/include/npu2.h b/include/npu2.h index eb7c4558..f48a68b6 100644 --- a/include/npu2.h +++ b/include/npu2.h @@ -271,4 +271,11 @@ static inline int npu2_get_phb_index(unsigned int brick_index) return NPU2_PHB_INDEX_BASE + brick_index; } +int64_t npu2_opencapi_spa_setup(struct phb *phb, uint32_t __unused bdfn, + uint64_t addr, uint64_t PE_mask); +int64_t npu2_opencapi_spa_clear_cache(struct phb *phb, uint32_t __unused bdfn, + uint64_t PE_handle); +int64_t npu2_opencapi_tl_set(struct phb *phb, uint32_t __unused bdfn, + long capabilities, uint64_t rate_phys, int rate_sz); + #endif /* __NPU2_H */ -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:45 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:45 +0200 Subject: [Skiboot] [PATCH 04/16] [PATCH 04/16] opencapi5: rainier detect device In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-5-clombard@linux.vnet.ibm.com> Update the platform_ocapi structure to store Rainier platform-specific values for detecting and resetting OpenCAPI devices via the module's I2C device (PCA9553). The unique I2C bus ID associated with each OpenCAPI device is derived from the I2C engine and port. Asserting or deasserting a reset, and detecting whether an OpenCAPI device is present, are done through this I2C bus ID and the device address.
Signed-off-by: Christophe Lombard --- include/pau.h | 3 + include/platform.h | 5 + platforms/astbmc/rainier.c | 181 +++++++++++++++++++++++++++++++++++++ 3 files changed, 189 insertions(+) diff --git a/include/pau.h b/include/pau.h index 75e7b8d5..d91ffa6d 100644 --- a/include/pau.h +++ b/include/pau.h @@ -24,6 +24,9 @@ struct pau_dev { uint32_t index; struct dt_node *dn; + /* Associated I2C information */ + uint8_t i2c_bus_id; + /* Associated PHY information */ uint32_t pau_unit; /* 0,3,4,5,6,7 */ uint32_t odl_index; diff --git a/include/platform.h b/include/platform.h index d25e9ef4..26fb852a 100644 --- a/include/platform.h +++ b/include/platform.h @@ -70,6 +70,11 @@ struct platform_ocapi { uint8_t i2c_presence_brick5; /* I2C pin to read for presence on brick 5 */ bool odl_phy_swap; /* Swap ODL1 to use brick 2 rather than * brick 1 lanes */ + uint8_t i2c_dev_addr; /* I2C device address */ + uint8_t i2c_intreset_pin; /* I2C pin to write to reset */ + uint8_t i2c_predetect_pin; /* I2C pin to read for presence */ + int64_t (*i2c_assert_reset)(uint8_t i2c_bus_id); + int64_t (*i2c_deassert_reset)(uint8_t i2c_bus_id); const char *(*ocapi_slot_label)(uint32_t chip_id, uint32_t brick_index); const struct ocapi_phy_setup *phy_setup; }; diff --git a/platforms/astbmc/rainier.c b/platforms/astbmc/rainier.c index 17d9fe2b..b9fd53b8 100644 --- a/platforms/astbmc/rainier.c +++ b/platforms/astbmc/rainier.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -99,8 +100,178 @@ static void rainier_init_slot_power(void) } } +static int64_t rainier_i2c_assert_reset(uint8_t i2c_bus_id) +{ + uint8_t data; + int64_t rc = OPAL_SUCCESS; + + /* + * Set the i2c reset pin in output mode (9553 device) + * To write a register: + * puti2c pu 0 0|1 C4 1, + * with data being a 2-nibble hex value and offset being the + * register offset from the datasheet + * + * puti2c (-p1) 0 0|1 C4 51 5 1 0 : i2c engine + * 0|1 : i2c_port + * C4 (C4 > 1 = 62) : Address + * 51 : data + * 5 : register (offset) + * 1 : offset byte + * + * 7.3.6 LS0 - LED selector register: default value 0x55 + * bit 1:0 01* LED0 selected (OpenCapi card) + * + * offset 0x05, register name: LS0, Fct: LED selector + * see Table 4. Control register definition (PCA9553) + */ + data = 0x51; + rc = i2c_request_send(i2c_bus_id, + platform.ocapi->i2c_dev_addr, + SMBUS_WRITE, 0x5, 1, + &data, sizeof(data), 120); + + return rc; +} + +static int64_t rainier_i2c_deassert_reset(uint8_t i2c_bus_id) +{ + uint8_t data; + int64_t rc = OPAL_SUCCESS; + + /* puti2c (-p1) 0 0|1 C4 55 1 + * + * offset 0x05, register name: LS0, Fct: LED selector + * see Table 4. 
Control register definition (PCA9553) + */ + data = 0x55; + rc = i2c_request_send(i2c_bus_id, + platform.ocapi->i2c_dev_addr, + SMBUS_WRITE, 0x5, 1, + &data, sizeof(data), 120); + + return rc; +} + +static int get_i2c_info(struct pau_dev *dev, int *engine, int *port) +{ + uint32_t chip_id = dev->pau->chip_id; + uint32_t pau_index = dev->pau->index; + uint32_t link = dev->index; + + switch (chip_id) { + case 0: + case 4: + /* + * OP3: links 0 and 1 on chip 0 + * link 0 only on chip 4 + */ + if (pau_index == 1) { + if (link == 1 && chip_id == 4) + return -1; + *engine = 1; + *port = link; + return 0; + } + break; + case 2: + case 6: + /* + * OP0: links 0 and 1 on chip 2 + * link 1 only on chip 6 + */ + if (pau_index == 0) { + if (link == 0 && chip_id == 6) + return -1; + *engine = 1; + *port = link; + return 0; + } + break; + } + return -1; +} + +static void rainier_i2c_presence_init(struct pau_dev *dev) +{ + char port_name[17]; + struct dt_node *np; + int engine, port; + + /* Find I2C port */ + if (dev->i2c_bus_id) + return; + + if (get_i2c_info(dev, &engine, &port)) + return; + + snprintf(port_name, sizeof(port_name), "p8_%08x_e%dp%d", + dev->pau->chip_id, engine, port); + + dt_for_each_compatible(dt_root, np, "ibm,power10-i2c-port") { + if (streq(port_name, dt_prop_get(np, "ibm,port-name"))) { + dev->i2c_bus_id = dt_prop_get_u32(np, "ibm,opal-id"); + break; + } + } +} + +static int64_t rainier_i2c_dev_detect(struct pau_dev *dev, + uint8_t *detect) +{ + int64_t rc = OPAL_SUCCESS; + + /* Read the presence value + * geti2c (-p1) pu 0 0|1 C4 1 1 + * + * offset 0x00, register name: INPUT, Fct: input register + * see Table 4. Control register definition (PCA9553) + */ + *detect = 0x00; + rc = i2c_request_send(dev->i2c_bus_id, + platform.ocapi->i2c_dev_addr, + SMBUS_READ, 0x00, 1, + detect, 1, 120); + + return rc; +} + +static void rainier_pau_device_detect(struct pau *pau) +{ + struct pau_dev *dev; + uint8_t detect; + int64_t rc; + + /* OpenCapi devices are possibly connected on Optical link pair: + * OP0 or OP3 + * pau_index Interface Link - OPxA/B + * 0 OPT0 -- PAU0 + * OPT1 -- no PAU, SMP only + * OPT2 -- no PAU, SMP only + * 1 OPT3 -- PAU3 + * 2 OPT4 -- PAU4 by default, but can be muxed to use PAU5 - N/A on Rainier + * 3 OPT5 -- PAU5 by default, but can be muxed to use PAU4 - N/A on Rainier + * 4 OPT6 -- PAU6 by default, but can be muxed to use PAU7 - N/A on Rainier + * 5 OPT7 -- PAU7 by default, but can be muxed to use PAU6 - N/A on Rainier + */ + pau_for_each_dev(dev, pau) { + dev->type = PAU_DEV_TYPE_UNKNOWN; + + rainier_i2c_presence_init(dev); + if (dev->i2c_bus_id) { + rc = rainier_i2c_dev_detect(dev, &detect); + /* LED0 (bit 0): a high level no card is plugged */ + if (!rc && !(detect & platform.ocapi->i2c_predetect_pin)) + dev->type = PAU_DEV_TYPE_OPENCAPI; + } + + dt_add_property_u64(dev->dn, "ibm,link-speed", 25000000000ull); + } +} + static void rainier_init(void) { + astbmc_init(); rainier_init_slot_power(); } @@ -121,6 +292,14 @@ static bool rainier_probe(void) return true; } +static struct platform_ocapi rainier_ocapi = { + .i2c_dev_addr = 0x62, /* C4 >> 1 */ + .i2c_intreset_pin = 0x02, /* PIN 2 - LED1 - INT/RESET */ + .i2c_predetect_pin = 0x01, /* PIN 1 - LED0 - PRE-DETECT */ + .i2c_assert_reset = rainier_i2c_assert_reset, + .i2c_deassert_reset = rainier_i2c_deassert_reset, +}; + DECLARE_PLATFORM(rainier) = { .name = "Rainier", .probe = rainier_probe, @@ -131,6 +310,8 @@ DECLARE_PLATFORM(rainier) = { .cec_power_down = astbmc_ipmi_power_down, .cec_reboot = astbmc_ipmi_reboot, 
.elog_commit = ipmi_elog_commit, + .pau_device_detect = rainier_pau_device_detect, + .ocapi = &rainier_ocapi, .exit = astbmc_exit, .terminate = ipmi_terminate, }; -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:44 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:44 +0200 Subject: [Skiboot] [PATCH 03/16] [PATCH 03/16] opencapi5: introduce support In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-4-clombard@linux.vnet.ibm.com> OpenCAPI 5 is included in the P10 chip. This requires OCAPI-capable PHYs, Datalink Layer logic and Transaction Layer logic to be included. The PHYs are the physical connection to the OCAPI interconnect. The Datalink Layer provides link training. The Transaction Layer executes the cache-coherent and data movement commands on the P10 chip. The PAU provides the Transaction Layer functionality for the OCAPI link(s) on the P10 chip. The P10 PAU supports two OCAPI links. Six accelerator units (PAUs) are instantiated on the P10 chip, for a total of twelve OCAPI links. This patch adds the PAU OpenCAPI structures to support OpenCAPI 5. The hw/pau.c file contains the main PAU management functions. Signed-off-by: Christophe Lombard --- core/init.c | 3 + hdata/spira.c | 140 ++++++++++++++++++++++++++-- hdata/spira.h | 2 +- hw/Makefile.inc | 2 +- hw/pau.c | 225 +++++++++++++++++++++++++++++++++++++++++++++ include/pau-regs.h | 30 ++++++ include/pau.h | 87 ++++++++++++++++++ include/platform.h | 4 + include/skiboot.h | 1 + 9 files changed, 485 insertions(+), 9 deletions(-) create mode 100644 hw/pau.c create mode 100644 include/pau-regs.h create mode 100644 include/pau.h diff --git a/core/init.c b/core/init.c index a8bac28a..c72d36d2 100644 --- a/core/init.c +++ b/core/init.c @@ -1372,6 +1372,9 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) probe_npu2(); probe_npu3(); + /* Probe PAUs */ + probe_pau(); + /* Initialize PCI */ pci_init_slots(); diff --git a/hdata/spira.c b/hdata/spira.c index baa23751..81806652 100644 --- a/hdata/spira.c +++ b/hdata/spira.c @@ -966,21 +966,40 @@ static void add_nx(void) static void add_nmmu(void) { struct dt_node *xscom, *nmmu; - u32 scom; + u32 scom1, scom2; + u32 chip_id; /* Nest MMU only exists on POWER9 or later */ if (proc_gen < proc_gen_p9) return; - if (proc_gen == proc_gen_p9) - scom = 0x5012c40; - else - scom = 0x2010c40; + if (proc_gen == proc_gen_p9) { + scom1 = 0x5012c40; + } else if (proc_gen == proc_gen_p10) { + scom1 = 0x2010c40; + scom2 = 0x3010c40; + } else + scom1 = 0x2010c40; dt_for_each_compatible(dt_root, xscom, "ibm,xscom") { - nmmu = dt_new_addr(xscom, "nmmu", scom); + nmmu = dt_new_addr(xscom, "nmmu", scom1); + dt_add_property_strings(nmmu, "compatible", "ibm,power9-nest-mmu"); + dt_add_property_cells(nmmu, "reg", scom1, 0x20); + + /* + * P10 has a second nMMU, a.k.a "south" nMMU.
+ * It exists only on P1 and P3 + */ + if (proc_gen < proc_gen_p10) + return; + + chip_id = __dt_get_chip_id(xscom); + if (chip_id != 2 && chip_id != 6) + continue; + + nmmu = dt_new_addr(xscom, "nmmu", scom2); dt_add_property_strings(nmmu, "compatible", "ibm,power9-nest-mmu"); - dt_add_property_cells(nmmu, "reg", scom, 0x20); + dt_add_property_cells(nmmu, "reg", scom2, 0x20); } } @@ -1757,6 +1776,110 @@ static void add_npus(void) } } +static void add_pau(struct dt_node *xscom, + const struct HDIF_array_hdr *links) +{ + /* P10 contains 6 distinct accelerator units: PAU0, PAU3, + * PAU4, PAU5, PAU6, PAU7 + */ + const uint32_t pau_base[] = { 0x10010800, 0x11010800, + 0x12010800, 0x12011000, + 0x13010800, 0x13011000}; + + const struct sppcrd_smp_link *link; + struct dt_node *pau, *node; + uint32_t chip_id, link_id; + uint16_t pau_index, pau_unit, brick_id, opt_unit, lane_mask; + int i; + + chip_id = dt_get_chip_id(xscom); + + HDIF_iarray_for_each(links, i, link) { + /* Link Nr ? ID of the SMP Links. For every processor + * the SMP Link IDs range from 0..13 + * (PAU numbers X 2 + halflink/brick) + */ + link_id = be32_to_cpu(link->link_id); + + /* ID of brick from which the Link is originating. + * Split in half + * 0x0008 = PAU Unit Number = 0..7 + * 0x000A = Brick/halflink per PAU = 0..1 + */ + pau_unit = be16_to_cpu(link->brick_id); + brick_id = be16_to_cpu(link->brick_id >> 16); + if (pau_unit == 0) + pau_index = pau_unit; + else + pau_index = pau_unit - 2; + + /* Lanes used in this Brick & A 1 at a bit + * position indicates that the lane is used in + * this brick. + * 0xFF00 = brick0 + * 0x00FF = brick1 + */ + lane_mask = be16_to_cpu(link->lane_mask >> brick_id); + + /* IOHS/OPT link mapped to the PAU for this + * ocapi link = 0..7 + */ + opt_unit = be16_to_cpu(link->opt_id); + + if (be32_to_cpu(link->usage) != SMP_LINK_USE_OPENCAPI) + continue; + + prlog(PR_DEBUG, "PAU: chip: %04x, SMP link: %d, PAU: %d, " + "brick: %d, opt_unit: %d, lane mask %x\n", + chip_id, link_id, pau_unit, brick_id, + opt_unit, lane_mask); + + pau = dt_find_by_name_addr(xscom, "pau", pau_base[pau_index]); + if (!pau) { + pau = dt_new_addr(xscom, "pau", pau_base[pau_index]); + dt_add_property_cells(pau, "#size-cells", 0); + dt_add_property_cells(pau, "#address-cells", 1); + dt_add_property_cells(pau, "reg", pau_base[pau_index], 0x2c); + dt_add_property_string(pau, "compatible", "ibm,power10-pau"); + dt_add_property_cells(pau, "ibm,pau-index", pau_index); + dt_add_property_cells(pau, "ibm,pau-unit", pau_unit); + dt_add_property_cells(pau, "ibm,pau-chiplet", pau_base[pau_index] >> 24); + dt_add_property_cells(pau, "ibm,phb-index", 7 + pau_index); + } + + if (!(dt_find_by_name_addr(pau, "link", brick_id))) { + node = dt_new_addr(pau, "link", brick_id); + dt_add_property_string(node, "compatible", "ibm,pau-link"); + dt_add_property_cells(node, "reg", brick_id); + dt_add_property_cells(node, "ibm,pau-link-index", brick_id); + dt_add_property_cells(node, "ibm,odl-index", brick_id); + dt_add_property_cells(node, "ibm,op-unit", opt_unit); + dt_add_property_cells(node, "ibm,pau-lane-mask", lane_mask); + } + } +} + +static void add_paus(void) +{ + struct dt_node *xscom; + + /* P10 chips only */ + if (cpu_type != PVR_TYPE_P10) + return; + + dt_for_each_compatible(dt_root, xscom, "ibm,xscom") { + const struct HDIF_array_hdr *links; + + links = xscom_to_pcrd(xscom, SPPCRD_IDATA_SMP_LINK); + if (!links) { + prerror("PAU: Unable to find matching SPPCRD for %s\n", + xscom->name); + continue; + } + add_pau(xscom, links); + } +} + 
/* * Legacy SPIRA is being deprecated and we have new SPIRA-H/S structures. * But on older system (p7?) we will continue to get legacy SPIRA. @@ -1880,6 +2003,9 @@ int parse_hdat(bool is_opal) /* Add NPU nodes */ add_npus(); + /* Add PAU nodes */ + add_paus(); + /* Parse VPD */ vpd_parse(); diff --git a/hdata/spira.h b/hdata/spira.h index afdc9228..8def23bd 100644 --- a/hdata/spira.h +++ b/hdata/spira.h @@ -1152,7 +1152,7 @@ struct sppcrd_smp_link { __be16 pci_sideband_slot_idx; __be16 slca_idx; /* SLCA index of the *external* port */ - __be16 reserved; + __be16 opt_id; /* nvlink/ocapi detection devices */ __be32 i2c_link_cable; diff --git a/hw/Makefile.inc b/hw/Makefile.inc index 37256d3c..6e96318a 100644 --- a/hw/Makefile.inc +++ b/hw/Makefile.inc @@ -9,7 +9,7 @@ HW_OBJS += fake-nvram.o lpc-mbox.o npu2.o npu2-hw-procedures.o HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o -HW_OBJS += ocmb.o xive2.o +HW_OBJS += ocmb.o xive2.o pau.o HW=hw/built-in.a include $(SRC)/hw/fsp/Makefile.inc diff --git a/hw/pau.c b/hw/pau.c new file mode 100644 index 00000000..b02b0034 --- /dev/null +++ b/hw/pau.c @@ -0,0 +1,225 @@ +// SPDX-License-Identifier: Apache-2.0 +/* + * Copyright 2020 IBM Corp. + */ + +#include +#include +#include + +struct pau_dev *pau_next_dev(struct pau *pau, struct pau_dev *dev, + enum pau_dev_type type) +{ + uint32_t i = 0; + + if (dev) + i = dev->index + 1; + + for (; i < pau->links; i++) { + dev = &pau->devices[i]; + + if (dev->type == type || type == PAU_DEV_TYPE_ANY) + return dev; + } + + return NULL; +} + +static void pau_dt_create_link(struct dt_node *pau, uint32_t pau_index, + uint32_t dev_index) +{ + struct dt_node *link; + uint32_t phy_lane_mask = 0, pau_unit = 0; + uint32_t op_unit = 0, odl_index = 0; + + link = dt_new_addr(pau, "link", dev_index); + + dt_add_property_string(link, "compatible", "ibm,pau-link"); + dt_add_property_cells(link, "reg", dev_index); + dt_add_property_cells(link, "ibm,pau-link-index", dev_index); + + /* pau_index Interface Link - OPxA/B + * 0 OPT0 -- PAU0 + * OPT1 -- no PAU, SMP only + * OPT2 -- no PAU, SMP only + * 1 OPT3 -- PAU3 + * 2 OPT4 -- PAU4 by default, but can be muxed to use PAU5 + * 3 OPT5 -- PAU5 by default, but can be muxed to use PAU4 + * 4 OPT6 -- PAU6 by default, but can be muxed to use PAU7 + * 5 OPT7 -- PAU7 by default, but can be muxed to use PAU6 + */ + switch (pau_index) { + case 0: + /* OP0A - OP0B */ + pau_unit = 0; + op_unit = 0; + break; + case 1: + /* OP3A - OP3B */ + pau_unit = 3; + op_unit = 3; + break; + case 2: + /* OP4A - OP4B or OP5A - OP5B (TO DO) */ + pau_unit = 4; + op_unit = 4; + break; + case 3: + /* OP5A - OP5B or OP4A - OP4B (TO DO) */ + pau_unit = 5; + op_unit = 5; + break; + case 4: + /* OP6A - OP6B or OP7A - OP7B (TO DO) */ + pau_unit = 6; + op_unit = 6; + break; + case 5: + /* OP7A - OP7B or OP6A - OP6B (TO DO) */ + pau_unit = 7; + op_unit = 7; + break; + default: + return; + } + + /* ODL0 is hooked up to OTL0 */ + if (dev_index == 0) { + odl_index = 0; + phy_lane_mask = PPC_BITMASK32(0, 3); + phy_lane_mask |= PPC_BITMASK32(5, 8); + } else if (dev_index == 1) { + odl_index = 1; + phy_lane_mask = PPC_BITMASK32(9, 12); + phy_lane_mask |= PPC_BITMASK32(14, 17); + } + + dt_add_property_cells(link, "ibm,odl-index", odl_index); + dt_add_property_cells(link, "ibm,pau-unit", pau_unit); + dt_add_property_cells(link, "ibm,op-unit", op_unit); + 
dt_add_property_cells(link, "ibm,pau-lane-mask", phy_lane_mask); +} + +static void pau_dt_create_pau(struct dt_node *xscom, uint32_t pau_index) +{ + const uint32_t pau_base[] = { 0x10010800, 0x11010800, + 0x12010800, 0x12011000, + 0x13010800, 0x13011000}; + struct dt_node *pau; + uint32_t links; + + assert(pau_index < PAU_NBR); + pau = dt_new_addr(xscom, "pau", pau_base[pau_index]); + + dt_add_property_cells(pau, "#size-cells", 0); + dt_add_property_cells(pau, "#address-cells", 1); + dt_add_property_cells(pau, "reg", pau_base[pau_index], 0x2c); + dt_add_property_string(pau, "compatible", "ibm,power10-pau"); + dt_add_property_cells(pau, "ibm,pau-index", pau_index); + dt_add_property_cells(pau, "ibm,phb-index", 7 + pau_index); + + links = PAU_LINKS_OPENCAPI_PER_PAU; + for (uint32_t i = 0; i < links; i++) + pau_dt_create_link(pau, pau_index, i); +} + +static bool pau_dt_create(void) +{ + struct dt_node *xscom; + + /* P10 chips only */ + if (proc_gen < proc_gen_p10) + return false; + + dt_for_each_compatible(dt_root, xscom, "ibm,xscom") + for (uint32_t i = 0; i < PAU_NBR; i++) + pau_dt_create_pau(xscom, i); + + return true; +} + +static struct pau *pau_create(struct dt_node *dn) +{ + struct pau *pau; + struct dt_node *link; + struct pau_dev *dev; + char *path; + uint32_t i; + + pau = zalloc(sizeof(*pau)); + assert(pau); + + init_lock(&pau->lock); + + pau->dt_node = dn; + pau->index = dt_prop_get_u32(dn, "ibm,pau-index"); + pau->xscom_base = dt_get_address(dn, 0, NULL); + + pau->chip_id = dt_get_chip_id(dn); + assert(get_chip(pau->chip_id)); + + pau->links = PAU_LINKS_OPENCAPI_PER_PAU; + dt_for_each_compatible(dn, link, "ibm,pau-link") { + i = dt_prop_get_u32(link, "ibm,pau-link-index"); + assert(i < PAU_LINKS_OPENCAPI_PER_PAU); + + dev = &pau->devices[i]; + dev->index = i; + dev->pau = pau; + dev->dn = link; + dev->odl_index = dt_prop_get_u32(link, "ibm,odl-index"); + dev->op_unit = dt_prop_get_u32(link, "ibm,op-unit"); + dev->phy_lane_mask = dt_prop_get_u32(link, "ibm,pau-lane-mask"); + }; + + path = dt_get_path(dn); + PAUINF(pau, "Found %s\n", path); + PAUINF(pau, "SCOM base: 0x%llx\n", pau->xscom_base); + free(path); + + return pau; +} + +static void pau_device_detect_fixup(struct pau_dev *dev) +{ + struct dt_node *dn = dev->dn; + + if (dev->type == PAU_DEV_TYPE_OPENCAPI) { + PAUDEVDBG(dev, "Link type opencapi\n"); + dt_add_property_strings(dn, "ibm,pau-link-type", "opencapi"); + return; + } + + PAUDEVDBG(dev, "Link type unknown\n"); + dt_add_property_strings(dn, "ibm,pau-link-type", "unknown"); +} + +static void pau_init(struct pau *pau) +{ + struct pau_dev *dev; + + platform.pau_device_detect(pau); + pau_for_each_dev(dev, pau) + pau_device_detect_fixup(dev); + +} + +void probe_pau(void) +{ + struct dt_node *dn; + struct pau *pau; + + /* This can be removed when/if we decide to use HDAT instead */ + if (!pau_dt_create()) + return; + + if (!platform.pau_device_detect) { + prlog(PR_INFO, "PAU: Platform does not support PAU\n"); + return; + } + + dt_for_each_compatible(dt_root, dn, "ibm,power10-pau") { + pau = pau_create(dn); + pau_init(pau); + } +} diff --git a/include/pau-regs.h b/include/pau-regs.h new file mode 100644 index 00000000..a35668f1 --- /dev/null +++ b/include/pau-regs.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later + * Copyright 2021 IBM Corp. 
+ */ + +#ifndef __PAU_REGS_H +#define __PAU_REGS_H + +/* PAU FIR registers */ +#define PAU_FIR(n) (0x400 + (n) * 0x40) +#define PAU_FIR_MASK(n) (0x403 + (n) * 0x40) +#define PAU_FIR_ACTION0(n) (0x406 + (n) * 0x40) +#define PAU_FIR_ACTION1(n) (0x407 + (n) * 0x40) +#define PAU_FIR_MAX 3 + +/* PAU RING: Indirect address/data port */ +#define PAU_MISC_SCOM_IND_SCOM_ADDR 0x33e +#define PAU_MISC_DA_ADDR PPC_BITMASK(0, 23) +#define PAU_MISC_DA_LEN PPC_BITMASK(24, 25) +#define PAU_MISC_DA_LEN_4B 2 +#define PAU_MISC_DA_LEN_8B 3 +#define PAU_MISC_SCOM_IND_SCOM_DATA 0x33f + +/* PAU RING: Indirect register blocks */ +#define PAU_BLOCK(nib0, nib1) ((nib0) << 20 | (nib1) << 16) +#define PAU_REG_BLOCK(reg) ((reg) & 0xff0000) +#define PAU_REG_OFFSET(reg) ((reg) & 0xffff) + +#define PAU_BLOCK_CQ_SM(n) PAU_BLOCK(4, (n)) + +#endif /* __PAU_REGS_H */ diff --git a/include/pau.h b/include/pau.h new file mode 100644 index 00000000..75e7b8d5 --- /dev/null +++ b/include/pau.h @@ -0,0 +1,87 @@ +/* SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later + * Copyright 2021 IBM Corp. + */ + +#ifndef __PAU_H +#define __PAU_H + +#include +#include +#include +#include + +#define PAU_NBR 6 +#define PAU_LINKS_OPENCAPI_PER_PAU 2 + +enum pau_dev_type { + PAU_DEV_TYPE_UNKNOWN = 0, + PAU_DEV_TYPE_OPENCAPI, + PAU_DEV_TYPE_ANY = INT_MAX +}; + +struct pau_dev { + enum pau_dev_type type; + uint32_t index; + struct dt_node *dn; + + /* Associated PHY information */ + uint32_t pau_unit; /* 0,3,4,5,6,7 */ + uint32_t odl_index; + uint32_t op_unit; /* 0 -> 7 */ + uint32_t phy_lane_mask; + + struct pau *pau; +}; + +struct pau { + uint32_t index; + struct dt_node *dt_node; + uint32_t chip_id; + uint64_t xscom_base; + + /* Global MMIO window (all PAU regs) */ + uint64_t regs[2]; + + struct lock lock; + + uint32_t links; + struct pau_dev devices[PAU_LINKS_OPENCAPI_PER_PAU]; +}; + +#define PAUDBG(pau, fmt, a...) PAULOG(PR_DEBUG, pau, fmt, ##a) +#define PAUINF(pau, fmt, a...) PAULOG(PR_INFO, pau, fmt, ##a) +#define PAUERR(pau, fmt, a...) PAULOG(PR_ERR, pau, fmt, ##a) + +#define PAUDEVDBG(dev, fmt, a...) PAUDEVLOG(PR_DEBUG, dev, fmt, ##a) +#define PAUDEVINF(dev, fmt, a...) PAUDEVLOG(PR_INFO, dev, fmt, ##a) +#define PAUDEVERR(dev, fmt, a...) PAUDEVLOG(PR_ERR, dev, fmt, ##a) + +#define PAULOG(l, pau, fmt, a...) \ + prlog(l, "PAU[%d:%d]: " fmt, (pau)->chip_id, (pau)->index, ##a) + +#define PAUDEVLOG(l, dev, fmt, a...) 
\ + prlog(l, "PAU[%d:%d:%d]: " fmt, \ + (dev)->pau->chip_id, \ + (dev)->pau->index, \ + (dev)->index, ##a) + + +/* pau-scope index of the link */ +static inline uint32_t pau_dev_index(struct pau_dev *dev, int links) +{ + return dev->pau->index * links + dev->index; +} + +struct pau_dev *pau_next_dev(struct pau *pau, struct pau_dev *dev, + enum pau_dev_type type); + +#define pau_for_each_dev_type(dev, pau, type) \ + for (dev = NULL; (dev = pau_next_dev(pau, dev, type));) + +#define pau_for_each_opencapi_dev(dev, pau) \ + pau_for_each_dev_type(dev, pau, PAU_DEV_TYPE_OPENCAPI) + +#define pau_for_each_dev(dev, pau) \ + pau_for_each_dev_type(dev, pau, PAU_DEV_TYPE_ANY) + +#endif /* __PAU_H */ diff --git a/include/platform.h b/include/platform.h index d113e6eb..d25e9ef4 100644 --- a/include/platform.h +++ b/include/platform.h @@ -11,6 +11,7 @@ struct pci_slot; struct errorlog; struct npu2; struct npu3; +struct pau; enum resource_id { RESOURCE_ID_KERNEL, @@ -128,6 +129,9 @@ struct platform { void (*npu2_device_detect)(struct npu2 *npu); void (*npu3_device_detect)(struct npu3 *npu); + /* PAU device detection */ + void (*pau_device_detect)(struct pau *pau); + /* * Probe platform, return true on a match, called before * any allocation has been performed outside of the heap diff --git a/include/skiboot.h b/include/skiboot.h index f3378ec2..4afcb318 100644 --- a/include/skiboot.h +++ b/include/skiboot.h @@ -210,6 +210,7 @@ extern void preload_io_vpd(void); extern void probe_npu(void); extern void probe_npu2(void); extern void probe_npu3(void); +extern void probe_pau(void); extern void uart_init(void); extern void mbox_init(void); extern void early_uart_init(void); -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:46 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:46 +0200 Subject: [Skiboot] [PATCH 05/16] [PATCH 05/16] opencapi5: assign bars In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-6-clombard@linux.vnet.ibm.com> Configure early PAU Global MMIO BAR registers to allow PAU MMIO register accesses. This is done for each PAU. Enabling the Powerbus interface is mandatory for MMIO accesses. For each OpenCAPI device, configure the BAR registers to access the AFU MMIO and the AFU Config Addr/Data registers.
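To make that layout concrete, here is a minimal sketch of the address arithmetic (not part of the patch; the helper name is hypothetical, and the per-brick 256-byte slices come from the "create phb" patch later in this series):

static uint64_t pau_cfg_window_addr(uint64_t genid_addr, uint32_t brick)
{
	/* AFU Config Addr/Data registers start 320K into the GENID BAR */
	uint64_t cfg_base = genid_addr + 0x50000;

	/* each brick owns a 256-byte slice: brick 0 at +0, brick 1 at +256 */
	return cfg_base + (brick << 8);
}

The 320K constant is the one restated below: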
AFU Config/Data registers = GENID_ADDR (from phy_map file) + 320K (= 0x50000) Signed-off-by: Christophe Lombard --- hw/pau.c | 75 +++++++++++++++++++++++++++++++++++++++++++ hw/phys-map.c | 46 ++++++++++++++++++++++++--- include/pau-regs.h | 26 +++++++++++++++ include/pau.h | 79 ++++++++++++++++++++++++++++++++++++++++++++++ include/phys-map.h | 4 +++ 5 files changed, 225 insertions(+), 5 deletions(-) diff --git a/hw/pau.c b/hw/pau.c index b02b0034..fb6a175e 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -194,6 +194,80 @@ static void pau_device_detect_fixup(struct pau_dev *dev) dt_add_property_strings(dn, "ibm,pau-link-type", "unknown"); } +static void pau_opencapi_assign_bars(struct pau *pau) +{ + struct pau_dev *dev; + uint64_t addr, size, val; + + /* Global MMIO bar (per pau) + * 16M aligned address -> 0x1000000 (bit 24) + */ + phys_map_get(pau->chip_id, PAU_REGS, pau->index, &addr, &size); + val = SETFIELD(PAU_MMIO_BAR_ADDR, 0ull, addr >> 24); + val |= PAU_MMIO_BAR_ENABLE; + pau_write(pau, PAU_MMIO_BAR, val); + + PAUINF(pau, "MMIO base: 0x%016llx (%lldMB)\n", addr, size >> 20); + pau->regs[0] = addr; + pau->regs[1] = size; + + /* NTL bar (per device) + * 64K aligned address -> 0x10000 (bit 16) + */ + pau_for_each_dev(dev, pau) { + if (dev->type == PAU_DEV_TYPE_UNKNOWN) + continue; + + phys_map_get(pau->chip_id, PAU_OCAPI_MMIO, + pau_dev_index(dev, PAU_LINKS_OPENCAPI_PER_PAU), + &addr, &size); + + val = SETFIELD(PAU_NTL_BAR_ADDR, 0ull, addr >> 16); + val = SETFIELD(PAU_NTL_BAR_SIZE, val, ilog2(size >> 16)); + pau_write(pau, PAU_NTL_BAR(dev->index), val); + + val = SETFIELD(PAU_CTL_MISC_MMIOPA_CONFIG_BAR_ADDR, 0ull, addr >> 16); + val = SETFIELD(PAU_CTL_MISC_MMIOPA_CONFIG_BAR_SIZE, val, ilog2(size >> 16)); + pau_write(pau, PAU_CTL_MISC_MMIOPA_CONFIG(dev->index), val); + + dev->ntl_bar.addr = addr; + dev->ntl_bar.size = size; + } + + /* GENID bar (logically divided per device) + * 512K aligned address -> 0x80000 (bit 19) + */ + phys_map_get(pau->chip_id, PAU_GENID, pau->index, &addr, &size); + val = SETFIELD(PAU_GENID_BAR_ADDR, 0ull, addr >> 19); + pau_write(pau, PAU_GENID_BAR, val); + + pau_for_each_dev(dev, pau) { + if (dev->type == PAU_DEV_TYPE_UNKNOWN) + continue; + + dev->genid_bar.size = size; + /* +320K = Bricks 0-4 Config Addr/Data registers */ + dev->genid_bar.cfg = addr + 0x50000; + } +} + +static void pau_opencapi_init_hw(struct pau *pau) +{ + pau_opencapi_assign_bars(pau); +} + +static void pau_opencapi_init(struct pau *pau) +{ + if (!pau_next_dev(pau, NULL, PAU_DEV_TYPE_OPENCAPI)) + return; + + assert(platform.ocapi); + + pau_opencapi_init_hw(pau); + + disable_fast_reboot("OpenCAPI device enabled"); +} + static void pau_init(struct pau *pau) { struct pau_dev *dev; @@ -202,6 +276,7 @@ static void pau_init(struct pau *pau) pau_for_each_dev(dev, pau) pau_device_detect_fixup(dev); + pau_opencapi_init(pau); } void probe_pau(void) diff --git a/hw/phys-map.c b/hw/phys-map.c index d6ff99fd..7b44fc61 100644 --- a/hw/phys-map.c +++ b/hw/phys-map.c @@ -82,8 +82,44 @@ static const struct phys_map_entry phys_map_table_p10[] = { { VAS_HYP_WIN , 0, 0x00060302fe000000ull, 0x0000000002000000ull }, { VAS_USER_WIN , 0, 0x0006030300000000ull, 0x0000000100000000ull }, - /* TODO: MC, OCMB, PAU */ + /* TODO: MC, OCMB */ { RESV , 8, 0x0006030400000000ull, 0x000000f800000000ull }, + { PAU_OCAPI_MMIO, 0, 0x0006038800000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 1, 0x0006039000000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 2, 0x0006039800000000ull, 0x0000000800000000ull }, + { 
PAU_OCAPI_MMIO, 3, 0x000603a000000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 4, 0x000603a800000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 5, 0x000603b000000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 6, 0x000603b800000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 7, 0x000603c000000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 8, 0x000603c800000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 9, 0x000603d000000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 10, 0x000603d800000000ull, 0x0000000800000000ull }, + { PAU_OCAPI_MMIO, 11, 0x000603e000000000ull, 0x0000000800000000ull }, + { PAU_REGS, 0, 0x000603e800000000ull, 0x0000000001000000ull }, + { PAU_REGS, 1, 0x000603e801000000ull, 0x0000000001000000ull }, + { PAU_REGS, 2, 0x000603e802000000ull, 0x0000000001000000ull }, + { PAU_REGS, 3, 0x000603e803000000ull, 0x0000000001000000ull }, + { PAU_REGS, 4, 0x000603e804000000ull, 0x0000000001000000ull }, + { PAU_REGS, 5, 0x000603e805000000ull, 0x0000000001000000ull }, + { PAU_GENID, 0, 0x000603e806080000ull, 0x0000000000080000ull }, + { PAU_GENID, 1, 0x000603e806180000ull, 0x0000000000080000ull }, + { PAU_GENID, 2, 0x000603e806280000ull, 0x0000000000080000ull }, + { PAU_GENID, 3, 0x000603e806380000ull, 0x0000000000080000ull }, + { PAU_GENID, 4, 0x000603e806480000ull, 0x0000000000080000ull }, + { PAU_GENID, 5, 0x000603e806580000ull, 0x0000000000080000ull }, + { PAU_NTL, 0, 0x000603e806040000ull, 0x0000000000020000ull }, + { PAU_NTL, 1, 0x000603e806060000ull, 0x0000000000020000ull }, + { PAU_NTL, 2, 0x000603e806140000ull, 0x0000000000020000ull }, + { PAU_NTL, 3, 0x000603e806160000ull, 0x0000000000020000ull }, + { PAU_NTL, 4, 0x000603e806240000ull, 0x0000000000020000ull }, + { PAU_NTL, 5, 0x000603e806260000ull, 0x0000000000020000ull }, + { PAU_NTL, 6, 0x000603e806340000ull, 0x0000000000020000ull }, + { PAU_NTL, 7, 0x000603e806360000ull, 0x0000000000020000ull }, + { PAU_NTL, 8, 0x000603e806440000ull, 0x0000000000020000ull }, + { PAU_NTL, 9, 0x000603e806460000ull, 0x0000000000020000ull }, + { PAU_NTL, 10, 0x000603e806540000ull, 0x0000000000020000ull }, + { PAU_NTL, 11, 0x000603e806560000ull, 0x0000000000020000ull }, { XSCOM , 0, 0x000603fc00000000ull, 0x0000000400000000ull }, /* 4 TB offset */ @@ -130,10 +166,10 @@ static const struct phys_map_entry phys_map_table_nimbus[] = { * * We don't currently support >4TB ranges. */ - { OCAPI_MEM, 0, 0x0002000000000000ull, 0x0000040000000000ull }, - { OCAPI_MEM, 1, 0x0002800000000000ull, 0x0000040000000000ull }, - { OCAPI_MEM, 2, 0x0003000000000000ull, 0x0000040000000000ull }, - { OCAPI_MEM, 3, 0x0003800000000000ull, 0x0000040000000000ull }, + { OCAPI_MEM, 0, 0x0002000000000000ull, 0x0000040000000000ull }, + { OCAPI_MEM, 1, 0x0002800000000000ull, 0x0000040000000000ull }, + { OCAPI_MEM, 2, 0x0003000000000000ull, 0x0000040000000000ull }, + { OCAPI_MEM, 3, 0x0003800000000000ull, 0x0000040000000000ull }, /* 0 TB offset @ MMIO 0x0006000000000000ull */ { PHB4_64BIT_MMIO, 0, 0x0006000000000000ull, 0x0000004000000000ull }, diff --git a/include/pau-regs.h b/include/pau-regs.h index a35668f1..afe6f958 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -26,5 +26,31 @@ #define PAU_REG_OFFSET(reg) ((reg) & 0xffff) #define PAU_BLOCK_CQ_SM(n) PAU_BLOCK(4, (n)) +#define PAU_BLOCK_CQ_CTL PAU_BLOCK(4, 4) + +/* + * CQ_SM block registers + * + * Definitions here use PAU_BLOCK_CQ_SM(0), but when pau_write() is given + * one of these, it will do corresponding writes to every CQ_SM block. 
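+ * + * For instance, a pau_write() of PAU_MCP_MISC_CFG0 is replayed by + * software to PAU_BLOCK_CQ_SM(1..3) + PAU_REG_OFFSET(PAU_MCP_MISC_CFG0); + * see pau_write() in include/pau.h below.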
+ */ +#define PAU_MCP_MISC_CFG0 (PAU_BLOCK_CQ_SM(0) + 0x000) +#define PAU_MCP_MISC_CFG0_MA_MCRESP_OPT_WRP PPC_BIT(9) +#define PAU_MCP_MISC_CFG0_ENABLE_PBUS PPC_BIT(26) +#define PAU_SNP_MISC_CFG0 (PAU_BLOCK_CQ_SM(0) + 0x180) +#define PAU_SNP_MISC_CFG0_ENABLE_PBUS PPC_BIT(2) +#define PAU_NTL_BAR(brk) (PAU_BLOCK_CQ_SM(0) + 0x1b8 + (brk) * 8) +#define PAU_NTL_BAR_ADDR PPC_BITMASK(3, 35) +#define PAU_NTL_BAR_SIZE PPC_BITMASK(39, 43) +#define PAU_MMIO_BAR (PAU_BLOCK_CQ_SM(0) + 0x1e0) +#define PAU_MMIO_BAR_ENABLE PPC_BIT(0) +#define PAU_MMIO_BAR_ADDR PPC_BITMASK(3, 27) +#define PAU_GENID_BAR (PAU_BLOCK_CQ_SM(0) + 0x1e8) +#define PAU_GENID_BAR_ADDR PPC_BITMASK(3, 32) + +/* CQ_CTL block registers */ +#define PAU_CTL_MISC_MMIOPA_CONFIG(brk) (PAU_BLOCK_CQ_CTL + 0x098 + (brk) * 8) +#define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_ADDR PPC_BITMASK(1, 35) +#define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_SIZE PPC_BITMASK(39, 43) #endif /* __PAU_REGS_H */ diff --git a/include/pau.h b/include/pau.h index d91ffa6d..05b27196 100644 --- a/include/pau.h +++ b/include/pau.h @@ -19,11 +19,21 @@ enum pau_dev_type { PAU_DEV_TYPE_ANY = INT_MAX }; +/* Used to expose a hardware BAR (or logical slice of it) outside skiboot */ +struct pau_bar { + uint64_t addr; + uint64_t size; + uint64_t cfg; +}; + struct pau_dev { enum pau_dev_type type; uint32_t index; struct dt_node *dn; + struct pau_bar ntl_bar; + struct pau_bar genid_bar; + /* Associated I2C information */ uint8_t i2c_bus_id; @@ -44,6 +54,7 @@ struct pau { /* Global MMIO window (all PAU regs) */ uint64_t regs[2]; + bool mmio_access; struct lock lock; @@ -87,4 +98,72 @@ struct pau_dev *pau_next_dev(struct pau *pau, struct pau_dev *dev, #define pau_for_each_dev(dev, pau) \ pau_for_each_dev_type(dev, pau, PAU_DEV_TYPE_ANY) +/* + * We use the indirect method because it uses the same addresses as + * the MMIO offsets (PAU RING) + */ +static inline void pau_scom_sel(struct pau *pau, uint64_t reg, + uint64_t size) +{ + uint64_t val; + + val = SETFIELD(PAU_MISC_DA_ADDR, 0ull, reg); + val = SETFIELD(PAU_MISC_DA_LEN, val, size); + xscom_write(pau->chip_id, + pau->xscom_base + PAU_MISC_SCOM_IND_SCOM_ADDR, + val); +} + +static inline void pau_scom_write(struct pau *pau, uint64_t reg, + uint64_t size, + uint64_t val) +{ + pau_scom_sel(pau, reg, size); + xscom_write(pau->chip_id, + pau->xscom_base + PAU_MISC_SCOM_IND_SCOM_DATA, + val); +} + +static inline uint64_t pau_scom_read(struct pau *pau, uint64_t reg, + uint64_t size) +{ + uint64_t val; + + pau_scom_sel(pau, reg, size); + xscom_read(pau->chip_id, + pau->xscom_base + PAU_MISC_SCOM_IND_SCOM_DATA, + &val); + + return val; +} + +static inline void pau_write(struct pau *pau, uint64_t reg, + uint64_t val) +{ + void *mmio = (void *)pau->regs[0]; + + if (pau->mmio_access) + out_be64(mmio + reg, val); + else + pau_scom_write(pau, reg, PAU_MISC_DA_LEN_8B, val); + + /* CQ_SM writes should be mirrored in all four blocks */ + if (PAU_REG_BLOCK(reg) != PAU_BLOCK_CQ_SM(0)) + return; + + for (uint32_t i = 1; i < 4; i++) + pau_write(pau, PAU_BLOCK_CQ_SM(i) + PAU_REG_OFFSET(reg), + val); +} + +static inline uint64_t pau_read(struct pau *pau, uint64_t reg) +{ + void *mmio = (void *)pau->regs[0]; + + if (pau->mmio_access) + return in_be64(mmio + reg); + + return pau_scom_read(pau, reg, PAU_MISC_DA_LEN_8B); +} + #endif /* __PAU_H */ diff --git a/include/phys-map.h b/include/phys-map.h index 1dd337a5..a53bcd04 100644 --- a/include/phys-map.h +++ b/include/phys-map.h @@ -51,6 +51,10 @@ enum phys_map_type { XIVE_NVPG, XIVE_ESB, XIVE_END, + PAU_OCAPI_MMIO, + 
PAU_REGS, + PAU_GENID, + PAU_NTL, }; extern void phys_map_get(uint64_t gcid, enum phys_map_type type, -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:47 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:47 +0200 Subject: [Skiboot] [PATCH 06/16] [PATCH 06/16] opencapi5: create phb In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-7-clombard@linux.vnet.ibm.com> Implement the necessary operations for the OpenCAPI PHB type and inform the device-tree properties associated. The OpenCapi PCI config Addr/Data registers are reachable through the Generation-ID Registers MMIO BARS. The Config Address and Data registers are located at the following offsets from the AFU Config BAR plus 320 KB. ? Config Address for Brick 0 ? Offset 0 ? Config Data for Brick 0 ? Offsets: ? 128 ? 4-byte config register ? Config Address for Brick 1 ? Offset 256 ? Config Data for Brick 1 ? Offsets: ? 384 ? 4-byte config register Signed-off-by: Christophe Lombard --- core/pci-opal.c | 9 +- core/pci.c | 4 +- hw/pau.c | 234 ++++++++++++++++++++++++++++++++++++++++++++- include/pau-regs.h | 8 ++ include/pau.h | 20 ++++ include/pci.h | 1 + 6 files changed, 271 insertions(+), 5 deletions(-) diff --git a/core/pci-opal.c b/core/pci-opal.c index aa375c6a..acbcd2a5 100644 --- a/core/pci-opal.c +++ b/core/pci-opal.c @@ -748,7 +748,8 @@ static void rescan_slot_devices(struct pci_slot *slot) * prepare_link_change() is called (if needed) by the state * machine during the slot reset or link polling */ - if (phb->phb_type != phb_type_npu_v2_opencapi) { + if ((phb->phb_type != phb_type_npu_v2_opencapi) && + (phb->phb_type != phb_type_pau_opencapi)) { pci_scan_bus(phb, pd->secondary_bus, pd->subordinate_bus, &pd->children, pd, true); pci_add_device_nodes(phb, &pd->children, pd->dn, @@ -766,7 +767,8 @@ static void remove_slot_devices(struct pci_slot *slot) struct phb *phb = slot->phb; struct pci_device *pd = slot->pd; - if (phb->phb_type != phb_type_npu_v2_opencapi) + if ((phb->phb_type != phb_type_npu_v2_opencapi) && + (phb->phb_type != phb_type_pau_opencapi)) pci_remove_bus(phb, &pd->children); else pci_remove_bus(phb, &phb->devices); @@ -817,7 +819,8 @@ static bool training_needed(struct pci_slot *slot) struct pci_device *pd = slot->pd; /* only for opencapi slots for now */ - if (!pd && phb->phb_type == phb_type_npu_v2_opencapi) + if (!pd && ((phb->phb_type == phb_type_npu_v2_opencapi) || + (phb->phb_type == phb_type_pau_opencapi))) return true; return false; } diff --git a/core/pci.c b/core/pci.c index e195ecbf..0a146c83 100644 --- a/core/pci.c +++ b/core/pci.c @@ -1517,7 +1517,9 @@ static void __noinline pci_add_one_device_node(struct phb *phb, * device has a 4KB config space. It's got nothing to do with the * standard Type 0/1 config spaces defined by PCI. */ - if (is_pcie || phb->phb_type == phb_type_npu_v2_opencapi) { + if (is_pcie || + (phb->phb_type == phb_type_npu_v2_opencapi) || + (phb->phb_type == phb_type_pau_opencapi)) { snprintf(compat, MAX_NAME, "pciex%x,%x", PCI_VENDOR_ID(pd->vdid), PCI_DEVICE_ID(pd->vdid)); dt_add_property_cells(np, "ibm,pci-config-space-type", 1); diff --git a/hw/pau.c b/hw/pau.c index fb6a175e..5caafe6b 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -3,12 +3,18 @@ * Copyright 2020 IBM Corp. 
*/ +#include +#include #include #include #include +/* Number of PEs supported */ +#define PAU_MAX_PE_NUM 16 +#define PAU_RESERVED_PE_NUM 15 + struct pau_dev *pau_next_dev(struct pau *pau, struct pau_dev *dev, - enum pau_dev_type type) + enum pau_dev_type type) { uint32_t i = 0; @@ -251,9 +257,235 @@ static void pau_opencapi_assign_bars(struct pau *pau) } } +static void pau_opencapi_create_phb_slot(struct pau_dev *dev) +{ + struct pci_slot *slot; + + slot = pci_slot_alloc(&dev->phb, NULL); + if (!slot) { + /** + * @fwts-label OCAPICannotCreatePHBSlot + * @fwts-advice Firmware probably ran out of memory creating + * PAU slot. OpenCAPI functionality could be broken. + */ + PAUDEVERR(dev, "Cannot create PHB slot\n"); + } +} + +static int64_t pau_opencapi_pcicfg_check(struct pau_dev *dev, + uint32_t offset, + uint32_t size) +{ + if (!dev || offset > 0xfff || (offset & (size - 1))) + return OPAL_PARAMETER; + + return OPAL_SUCCESS; +} + +static int64_t pau_opencapi_pcicfg_read(struct phb *phb, uint32_t bdfn, + uint32_t offset, uint32_t size, + void *data) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + uint64_t cfg_addr, genid_base; + int64_t rc; + + rc = pau_opencapi_pcicfg_check(dev, offset, size); + if (rc) + return rc; + + /* Config Address for Brick 0 ? Offset 0 + * Config Address for Brick 1 ? Offset 256 + */ + genid_base = dev->genid_bar.cfg + (dev->index << 8); + + cfg_addr = PAU_CTL_MISC_CFG_ADDR_ENABLE; + cfg_addr = SETFIELD(PAU_CTL_MISC_CFG_ADDR_BUS_NBR | + PAU_CTL_MISC_CFG_ADDR_DEVICE_NBR | + PAU_CTL_MISC_CFG_ADDR_FUNCTION_NBR, + cfg_addr, bdfn); + cfg_addr = SETFIELD(PAU_CTL_MISC_CFG_ADDR_REGISTER_NBR, + cfg_addr, offset & ~3u); + + out_be64((uint64_t *)genid_base, cfg_addr); + sync(); + + switch (size) { + case 1: + *((uint8_t *)data) = + in_8((uint8_t *)(genid_base + 128 + (offset & 3))); + break; + case 2: + *((uint16_t *)data) = + in_le16((uint16_t *)(genid_base + 128 + (offset & 2))); + break; + case 4: + *((uint32_t *)data) = in_le32((uint32_t *)(genid_base + 128)); + break; + default: + return OPAL_PARAMETER; + } + + return OPAL_SUCCESS; +} + +#define PAU_OPENCAPI_PCI_CFG_READ(size, type) \ +static int64_t pau_opencapi_pcicfg_read##size(struct phb *phb, uint32_t bdfn, \ + uint32_t offset, type * data) \ +{ \ + /* Initialize data in case of error */ \ + *data = (type)0xffffffff; \ + return pau_opencapi_pcicfg_read(phb, bdfn, offset, sizeof(type), data); \ +} + +static int64_t pau_opencapi_pcicfg_write(struct phb *phb, uint32_t bdfn, + uint32_t offset, uint32_t size, + uint32_t data) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + uint64_t genid_base, cfg_addr; + int64_t rc; + + rc = pau_opencapi_pcicfg_check(dev, offset, size); + if (rc) + return rc; + + /* Config Address for Brick 0 ? Offset 0 + * Config Address for Brick 1 ? 
Offset 256 + */ + genid_base = dev->genid_bar.cfg + (dev->index << 8); + + cfg_addr = PAU_CTL_MISC_CFG_ADDR_ENABLE; + cfg_addr = SETFIELD(PAU_CTL_MISC_CFG_ADDR_BUS_NBR | + PAU_CTL_MISC_CFG_ADDR_DEVICE_NBR | + PAU_CTL_MISC_CFG_ADDR_FUNCTION_NBR, + cfg_addr, bdfn); + cfg_addr = SETFIELD(PAU_CTL_MISC_CFG_ADDR_REGISTER_NBR, + cfg_addr, offset & ~3u); + + out_be64((uint64_t *)genid_base, cfg_addr); + sync(); + + switch (size) { + case 1: + out_8((uint8_t *)(genid_base + 128 + (offset & 3)), data); + break; + case 2: + out_le16((uint16_t *)(genid_base + 128 + (offset & 2)), data); + break; + case 4: + out_le32((uint32_t *)(genid_base + 128), data); + break; + default: + return OPAL_PARAMETER; + } + + return OPAL_SUCCESS; +} + +#define PAU_OPENCAPI_PCI_CFG_WRITE(size, type) \ +static int64_t pau_opencapi_pcicfg_write##size(struct phb *phb, uint32_t bdfn, \ + uint32_t offset, type data) \ +{ \ + return pau_opencapi_pcicfg_write(phb, bdfn, offset, sizeof(type), data);\ +} + +PAU_OPENCAPI_PCI_CFG_READ(8, u8) +PAU_OPENCAPI_PCI_CFG_READ(16, u16) +PAU_OPENCAPI_PCI_CFG_READ(32, u32) +PAU_OPENCAPI_PCI_CFG_WRITE(8, u8) +PAU_OPENCAPI_PCI_CFG_WRITE(16, u16) +PAU_OPENCAPI_PCI_CFG_WRITE(32, u32) + +static const struct phb_ops pau_opencapi_ops = { + .cfg_read8 = pau_opencapi_pcicfg_read8, + .cfg_read16 = pau_opencapi_pcicfg_read16, + .cfg_read32 = pau_opencapi_pcicfg_read32, + .cfg_write8 = pau_opencapi_pcicfg_write8, + .cfg_write16 = pau_opencapi_pcicfg_write16, + .cfg_write32 = pau_opencapi_pcicfg_write32, +}; + +static void pau_opencapi_create_phb(struct pau_dev *dev) +{ + struct phb *phb = &dev->phb; + uint64_t mm_win[2]; + + mm_win[0] = dev->ntl_bar.addr; + mm_win[1] = dev->ntl_bar.size; + + phb->phb_type = phb_type_pau_opencapi; + phb->scan_map = 0; + + phb->ops = &pau_opencapi_ops; + phb->dt_node = dt_new_addr(dt_root, "pciex", mm_win[0]); + assert(phb->dt_node); + + pci_register_phb(phb, pau_get_opal_id(dev->pau->chip_id, + pau_get_phb_index(dev->pau->index, dev->index))); + pau_opencapi_create_phb_slot(dev); +} + +static void pau_opencapi_dt_add_mmio_window(struct pau_dev *dev) +{ + struct dt_node *dn = dev->phb.dt_node; + uint64_t mm_win[2]; + + mm_win[0] = dev->ntl_bar.addr; + mm_win[1] = dev->ntl_bar.size; + PAUDEVDBG(dev, "Setting AFU MMIO window to %016llx %016llx\n", + mm_win[0], mm_win[1]); + + dt_add_property(dn, "reg", mm_win, sizeof(mm_win)); + dt_add_property(dn, "ibm,mmio-window", mm_win, sizeof(mm_win)); + dt_add_property_cells(dn, "ranges", 0x02000000, + hi32(mm_win[0]), lo32(mm_win[0]), + hi32(mm_win[0]), lo32(mm_win[0]), + hi32(mm_win[1]), lo32(mm_win[1])); +} + +static void pau_opencapi_dt_add_props(struct pau_dev *dev) +{ + struct dt_node *dn = dev->phb.dt_node; + struct pau *pau = dev->pau; + + dt_add_property_strings(dn, + "compatible", + "ibm,power10-pau-opencapi-pciex", + "ibm,ioda3-pau-opencapi-phb", + "ibm,ioda2-npu2-opencapi-phb"); + + dt_add_property_cells(dn, "#address-cells", 3); + dt_add_property_cells(dn, "#size-cells", 2); + dt_add_property_cells(dn, "#interrupt-cells", 1); + dt_add_property_cells(dn, "bus-range", 0, 0xff); + dt_add_property_cells(dn, "clock-frequency", 0x200, 0); + dt_add_property_cells(dn, "interrupt-parent", get_ics_phandle()); + + dt_add_property_strings(dn, "device_type", "pciex"); + dt_add_property_cells(dn, "ibm,pau-index", pau->index); + dt_add_property_cells(dn, "ibm,chip-id", pau->chip_id); + dt_add_property_cells(dn, "ibm,xscom-base", pau->xscom_base); + dt_add_property_cells(dn, "ibm,npcq", pau->dt_node->phandle); + dt_add_property_cells(dn, 
"ibm,links", 1); + dt_add_property_cells(dn, "ibm,phb-diag-data-size", 0); + dt_add_property_cells(dn, "ibm,opal-num-pes", PAU_MAX_PE_NUM); + dt_add_property_cells(dn, "ibm,opal-reserved-pe", PAU_RESERVED_PE_NUM); + + pau_opencapi_dt_add_mmio_window(dev); +} + static void pau_opencapi_init_hw(struct pau *pau) { + struct pau_dev *dev = NULL; + pau_opencapi_assign_bars(pau); + + /* Create phb */ + pau_for_each_opencapi_dev(dev, pau) { + pau_opencapi_create_phb(dev); + pau_opencapi_dt_add_props(dev); + } } static void pau_opencapi_init(struct pau *pau) diff --git a/include/pau-regs.h b/include/pau-regs.h index afe6f958..57796920 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -52,5 +52,13 @@ #define PAU_CTL_MISC_MMIOPA_CONFIG(brk) (PAU_BLOCK_CQ_CTL + 0x098 + (brk) * 8) #define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_ADDR PPC_BITMASK(1, 35) #define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_SIZE PPC_BITMASK(39, 43) +#define PAU_CTL_MISC_CFG_ADDR(brk) (PAU_BLOCK_CQ_CTL + 0x250 + (brk) * 8) +#define PAU_CTL_MISC_CFG_ADDR_ENABLE PPC_BIT(0) +#define PAU_CTL_MISC_CFG_ADDR_STATUS PPC_BITMASK(1, 3) +#define PAU_CTL_MISC_CFG_ADDR_BUS_NBR PPC_BITMASK(4, 11) +#define PAU_CTL_MISC_CFG_ADDR_DEVICE_NBR PPC_BITMASK(12, 16) +#define PAU_CTL_MISC_CFG_ADDR_FUNCTION_NBR PPC_BITMASK(17, 19) +#define PAU_CTL_MISC_CFG_ADDR_REGISTER_NBR PPC_BITMASK(20, 31) +#define PAU_CTL_MISC_CFG_ADDR_TYPE PPC_BIT(32) #endif /* __PAU_REGS_H */ diff --git a/include/pau.h b/include/pau.h index 05b27196..6f2eef6e 100644 --- a/include/pau.h +++ b/include/pau.h @@ -8,6 +8,7 @@ #include #include #include +#include #include #define PAU_NBR 6 @@ -30,6 +31,7 @@ struct pau_dev { enum pau_dev_type type; uint32_t index; struct dt_node *dn; + struct phb phb; struct pau_bar ntl_bar; struct pau_bar genid_bar; @@ -86,6 +88,12 @@ static inline uint32_t pau_dev_index(struct pau_dev *dev, int links) return dev->pau->index * links + dev->index; } +static inline struct pau_dev *pau_phb_to_opencapi_dev(struct phb *phb) +{ + assert(phb->phb_type == phb_type_pau_opencapi); + return container_of(phb, struct pau_dev, phb); +} + struct pau_dev *pau_next_dev(struct pau *pau, struct pau_dev *dev, enum pau_dev_type type); @@ -98,6 +106,18 @@ struct pau_dev *pau_next_dev(struct pau *pau, struct pau_dev *dev, #define pau_for_each_dev(dev, pau) \ pau_for_each_dev_type(dev, pau, PAU_DEV_TYPE_ANY) +#define PAU_PHB_INDEX_BASE 6 /* immediately after real PHBs */ +static inline int pau_get_phb_index(unsigned int pau_index, + unsigned int link_index) +{ + return PAU_PHB_INDEX_BASE + pau_index * 2 + link_index; +} + +static inline int pau_get_opal_id(unsigned int chip_id, unsigned int index) +{ + return phb4_get_opal_id(chip_id, index); +} + /* * We use the indirect method because it uses the same addresses as * the MMIO offsets (PAU RING) diff --git a/include/pci.h b/include/pci.h index eb23a6d9..cb8e7741 100644 --- a/include/pci.h +++ b/include/pci.h @@ -353,6 +353,7 @@ enum phb_type { phb_type_npu_v2, phb_type_npu_v2_opencapi, phb_type_npu_v3, + phb_type_pau_opencapi, }; /* Generic PCI NVRAM flags */ -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:48 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:48 +0200 Subject: [Skiboot] [PATCH 07/16] [PATCH 07/16] opencapi5: enabling opencapi In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-8-clombard@linux.vnet.ibm.com> Enable OpenCAPI mode for each brick which 
is connected and used in this mode. This is done through 7 steps as described in the P10 OCAPI 5.0 Processing Unit Workbook document, section: 17.1.3.1 Enabling OpenCAPI. The following sequences must be performed: 1. Set Transport MUX controls to select OpenCAPI 2. Enable Clocks in XSL 3. Enable Clocks in MISC 4. Set NPCQ configuration 5. Enable XSL-XTS Interfaces 6. Enable State-machine allocation 7. Enable PowerBus Enabling the NTL/GENID BARs allows access to the MMIO registers. Signed-off-by: Christophe Lombard --- hw/pau.c | 222 +++++++++++++++++++++++++++++++++++++++++++++ include/pau-regs.h | 43 +++++++++ include/pau.h | 1 + 3 files changed, 266 insertions(+) diff --git a/hw/pau.c b/hw/pau.c index 5caafe6b..d7b51ee5 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -200,6 +200,42 @@ static void pau_device_detect_fixup(struct pau_dev *dev) dt_add_property_strings(dn, "ibm,pau-link-type", "unknown"); } +#define CQ_CTL_STATUS_TIMEOUT 10 /* milliseconds */ + +static int pau_opencapi_set_fence_control(struct pau_dev *dev, + uint8_t state_requested) +{ + uint64_t timeout = mftb() + msecs_to_tb(CQ_CTL_STATUS_TIMEOUT); + uint8_t status; + struct pau *pau = dev->pau; + uint64_t reg, val; + + reg = PAU_CTL_MISC_FENCE_CTRL(dev->index); + val = pau_read(pau, reg); + val = SETFIELD(PAU_CTL_MISC_FENCE_REQUEST, val, state_requested); + pau_write(pau, reg, val); + + /* Wait for fence status to update */ + do { + reg = PAU_CTL_MISC_STATUS(dev->index); + val = pau_read(pau, reg); + status = GETFIELD(PAU_CTL_MISC_STATUS_AM_FENCED(dev->index), val); + if (status == state_requested) + return OPAL_SUCCESS; + time_wait_ms(1); + } while (tb_compare(mftb(), timeout) == TB_ABEFOREB); + + /* + * @fwts-label OCAPIFenceStatusTimeout + * @fwts-advice The PAU fence status did not update as expected. This + * could be the result of a firmware or hardware bug. OpenCAPI + * functionality could be broken. + */ + PAUDEVERR(dev, "Bad fence status: expected 0x%x, got 0x%x\n", + state_requested, status); + return OPAL_HARDWARE; +} + static void pau_opencapi_assign_bars(struct pau *pau) { struct pau_dev *dev; @@ -257,6 +293,37 @@ static void pau_opencapi_assign_bars(struct pau *pau) } } +static void pau_opencapi_enable_bars(struct pau_dev *dev, bool enable) +{ + struct pau *pau = dev->pau; + uint64_t reg, val; + + if (dev->ntl_bar.enable == enable) /* No state change */ + return; + + dev->ntl_bar.enable = enable; + dev->genid_bar.enable = enable; + + reg = PAU_NTL_BAR(dev->index); + val = pau_read(pau, reg); + val = SETFIELD(PAU_NTL_BAR_ENABLE, val, enable); + pau_write(pau, reg, val); + + /* + * Generation IDs are a single space in the hardware but we split them + * per device. Only disable in hardware if every device has disabled.
+ */ + if (!enable) + pau_for_each_dev(dev, pau) + if (dev->genid_bar.enable) + return; + + reg = PAU_GENID_BAR; + val = pau_read(pau, reg); + val = SETFIELD(PAU_GENID_BAR_ENABLE, val, enable); + pau_write(pau, reg, val); +} + static void pau_opencapi_create_phb_slot(struct pau_dev *dev) { struct pci_slot *slot; @@ -475,6 +542,135 @@ static void pau_opencapi_dt_add_props(struct pau_dev *dev) pau_opencapi_dt_add_mmio_window(dev); } +static void pau_opencapi_set_transport_mux_controls(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint32_t typemap = 0; + uint64_t reg, val = 0; + + PAUDEVDBG(dev, "Setting transport mux controls\n"); + typemap = 0x2 >> dev->index; + + reg = PAU_MISC_OPTICAL_IO_CONFIG; + val = pau_read(pau, reg); + typemap |= GETFIELD(PAU_MISC_OPTICAL_IO_CONFIG_OTL, val); + val = SETFIELD(PAU_MISC_OPTICAL_IO_CONFIG_OTL, val, typemap); + pau_write(pau, reg, val); +} + +static void pau_opencapi_enable_xsl_clocks(struct pau *pau) +{ + uint64_t reg, val; + + PAUDBG(pau, "Enable clocks in XSL\n"); + + reg = PAU_XSL_WRAP_CFG; + val = pau_read(pau, reg); + val |= PAU_XSL_WRAP_CFG_CLOCK_ENABLE; + pau_write(pau, reg, val); +} + +static void pau_opencapi_enable_misc_clocks(struct pau *pau) +{ + uint64_t reg, val; + + PAUDBG(pau, "Enable clocks in MISC\n"); + + /* clear any spurious NDL stall or no_stall_c_err_rpts */ + reg = PAU_MISC_HOLD; + val = pau_read(pau, reg); + val = SETFIELD(PAU_MISC_HOLD_NDL_STALL, val, 0b0000); + pau_write(pau, reg, val); + + reg = PAU_MISC_CONFIG; + val = pau_read(pau, reg); + val |= PAU_MISC_CONFIG_OC_MODE; + pau_write(pau, reg, val); +} + +static void pau_opencapi_set_npcq_config(struct pau *pau) +{ + struct pau_dev *dev; + uint8_t oc_typemap = 0; + uint64_t reg, val; + + /* MCP_MISC_CFG0 + * SNP_MISC_CFG0 done in pau_opencapi_enable_pb + */ + pau_for_each_opencapi_dev(dev, pau) + oc_typemap |= 0x10 >> dev->index; + + PAUDBG(pau, "Set NPCQ Config\n"); + reg = PAU_CTL_MISC_CFG2; + val = pau_read(pau, reg); + val = SETFIELD(PAU_CTL_MISC_CFG2_OCAPI_MODE, val, oc_typemap); + val = SETFIELD(PAU_CTL_MISC_CFG2_OCAPI_4, val, oc_typemap); + val = SETFIELD(PAU_CTL_MISC_CFG2_OCAPI_C2, val, oc_typemap); + val = SETFIELD(PAU_CTL_MISC_CFG2_OCAPI_AMO, val, oc_typemap); + val = SETFIELD(PAU_CTL_MISC_CFG2_OCAPI_MEM_OS_BIT, val, oc_typemap); + pau_write(pau, reg, val); + + reg = PAU_DAT_MISC_CFG1; + val = pau_read(pau, reg); + val = SETFIELD(PAU_DAT_MISC_CFG1_OCAPI_MODE, val, oc_typemap); + pau_write(pau, reg, val); +} + +static void pau_opencapi_enable_xsl_xts_interfaces(struct pau *pau) +{ + uint64_t reg, val; + + PAUDBG(pau, "Enable XSL-XTS Interfaces\n"); + reg = PAU_XTS_CFG; + val = pau_read(pau, reg); + val |= PAU_XTS_CFG_OPENCAPI; + pau_write(pau, reg, val); + + reg = PAU_XTS_CFG2; + val = pau_read(pau, reg); + val |= PAU_XTS_CFG2_XSL2_ENA; + pau_write(pau, reg, val); +} + +static void pau_opencapi_enable_sm_allocation(struct pau *pau) +{ + uint64_t reg, val; + + PAUDBG(pau, "Enable State Machine Allocation\n"); + + reg = PAU_MISC_MACHINE_ALLOC; + val = pau_read(pau, reg); + val |= PAU_MISC_MACHINE_ALLOC_ENABLE; + pau_write(pau, reg, val); +} + +static void pau_opencapi_enable_powerbus(struct pau *pau) +{ + struct pau_dev *dev; + uint8_t oc_typemap = 0; + uint64_t reg, val; + + PAUDBG(pau, "Enable PowerBus\n"); + + pau_for_each_opencapi_dev(dev, pau) + oc_typemap |= 0x10 >> dev->index; + + /* PowerBus interfaces must be enabled prior to MMIO */ + reg = PAU_MCP_MISC_CFG0; + val = pau_read(pau, reg); + val |= PAU_MCP_MISC_CFG0_ENABLE_PBUS; + val |= 
PAU_MCP_MISC_CFG0_MA_MCRESP_OPT_WRP; + val = SETFIELD(PAU_MCP_MISC_CFG0_OCAPI_MODE, val, oc_typemap); + pau_write(pau, reg, val); + + reg = PAU_SNP_MISC_CFG0; + val = pau_read(pau, reg); + val |= PAU_SNP_MISC_CFG0_ENABLE_PBUS; + val = SETFIELD(PAU_SNP_MISC_CFG0_OCAPI_MODE, val, oc_typemap); + val = SETFIELD(PAU_SNP_MISC_CFG0_OCAPI_C2, val, oc_typemap); + pau_write(pau, reg, val); +} + static void pau_opencapi_init_hw(struct pau *pau) { struct pau_dev *dev = NULL; @@ -483,9 +679,35 @@ static void pau_opencapi_init_hw(struct pau *pau) /* Create phb */ pau_for_each_opencapi_dev(dev, pau) { + PAUDEVINF(dev, "Create phb\n"); pau_opencapi_create_phb(dev); + pau_opencapi_enable_bars(dev, true); pau_opencapi_dt_add_props(dev); } + + /* Procedure 17.1.3.1 - Enabling OpenCAPI */ + pau_for_each_opencapi_dev(dev, pau) { + PAUDEVINF(dev, "Configuring link ...\n"); + pau_opencapi_set_transport_mux_controls(dev); /* step 1 */ + } + pau_opencapi_enable_xsl_clocks(pau); /* step 2 */ + pau_opencapi_enable_misc_clocks(pau); /* step 3 */ + + /* OTL disabled */ + pau_for_each_opencapi_dev(dev, pau) + pau_opencapi_set_fence_control(dev, 0b01); + + pau_opencapi_set_npcq_config(pau); /* step 4 */ + pau_opencapi_enable_xsl_xts_interfaces(pau); /* step 5 */ + pau_opencapi_enable_sm_allocation(pau); /* step 6 */ + pau_opencapi_enable_powerbus(pau); /* step 7 */ + + /* + * access to the PAU registers through mmio requires setting + * up the PAU mmio BAR (in pau_opencapi_assign_bars() above) + * and machine state allocation + */ + pau->mmio_access = true; } static void pau_opencapi_init(struct pau *pau) diff --git a/include/pau-regs.h b/include/pau-regs.h index 57796920..6aeb7589 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -27,6 +27,10 @@ #define PAU_BLOCK_CQ_SM(n) PAU_BLOCK(4, (n)) #define PAU_BLOCK_CQ_CTL PAU_BLOCK(4, 4) +#define PAU_BLOCK_CQ_DAT PAU_BLOCK(4, 5) +#define PAU_BLOCK_XSL PAU_BLOCK(4, 0xE) +#define PAU_BLOCK_PAU_XTS PAU_BLOCK(7, 1) +#define PAU_BLOCK_PAU_MISC PAU_BLOCK(7, 2) /* * CQ_SM block registers @@ -37,21 +41,38 @@ #define PAU_MCP_MISC_CFG0 (PAU_BLOCK_CQ_SM(0) + 0x000) #define PAU_MCP_MISC_CFG0_MA_MCRESP_OPT_WRP PPC_BIT(9) #define PAU_MCP_MISC_CFG0_ENABLE_PBUS PPC_BIT(26) +#define PAU_MCP_MISC_CFG0_OCAPI_MODE PPC_BITMASK(44, 48) #define PAU_SNP_MISC_CFG0 (PAU_BLOCK_CQ_SM(0) + 0x180) #define PAU_SNP_MISC_CFG0_ENABLE_PBUS PPC_BIT(2) +#define PAU_SNP_MISC_CFG0_OCAPI_MODE PPC_BITMASK(32, 36) +#define PAU_SNP_MISC_CFG0_OCAPI_C2 PPC_BITMASK(45, 49) #define PAU_NTL_BAR(brk) (PAU_BLOCK_CQ_SM(0) + 0x1b8 + (brk) * 8) +#define PAU_NTL_BAR_ENABLE PPC_BIT(0) #define PAU_NTL_BAR_ADDR PPC_BITMASK(3, 35) #define PAU_NTL_BAR_SIZE PPC_BITMASK(39, 43) #define PAU_MMIO_BAR (PAU_BLOCK_CQ_SM(0) + 0x1e0) #define PAU_MMIO_BAR_ENABLE PPC_BIT(0) #define PAU_MMIO_BAR_ADDR PPC_BITMASK(3, 27) #define PAU_GENID_BAR (PAU_BLOCK_CQ_SM(0) + 0x1e8) +#define PAU_GENID_BAR_ENABLE PPC_BIT(0) #define PAU_GENID_BAR_ADDR PPC_BITMASK(3, 32) +#define PAU_MISC_MACHINE_ALLOC (PAU_BLOCK_CQ_SM(0) + 0x268) +#define PAU_MISC_MACHINE_ALLOC_ENABLE PPC_BIT(0) /* CQ_CTL block registers */ +#define PAU_CTL_MISC_CFG2 (PAU_BLOCK_CQ_CTL + 0x010) +#define PAU_CTL_MISC_CFG2_OCAPI_MODE PPC_BITMASK(0, 4) +#define PAU_CTL_MISC_CFG2_OCAPI_4 PPC_BITMASK(10, 14) +#define PAU_CTL_MISC_CFG2_OCAPI_C2 PPC_BITMASK(15, 19) +#define PAU_CTL_MISC_CFG2_OCAPI_AMO PPC_BITMASK(20, 24) +#define PAU_CTL_MISC_CFG2_OCAPI_MEM_OS_BIT PPC_BITMASK(25, 29) +#define PAU_CTL_MISC_STATUS(brk) (PAU_BLOCK_CQ_CTL + 0x060 + (brk) * 8) +#define 
PAU_CTL_MISC_STATUS_AM_FENCED(brk) (PPC_BITMASK(41, 42) << ((brk)*32)) #define PAU_CTL_MISC_MMIOPA_CONFIG(brk) (PAU_BLOCK_CQ_CTL + 0x098 + (brk) * 8) #define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_ADDR PPC_BITMASK(1, 35) #define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_SIZE PPC_BITMASK(39, 43) +#define PAU_CTL_MISC_FENCE_CTRL(brk) (PAU_BLOCK_CQ_CTL + 0x108 + (brk) * 8) +#define PAU_CTL_MISC_FENCE_REQUEST PPC_BITMASK(0, 1) #define PAU_CTL_MISC_CFG_ADDR(brk) (PAU_BLOCK_CQ_CTL + 0x250 + (brk) * 8) #define PAU_CTL_MISC_CFG_ADDR_ENABLE PPC_BIT(0) #define PAU_CTL_MISC_CFG_ADDR_STATUS PPC_BITMASK(1, 3) @@ -61,4 +82,26 @@ #define PAU_CTL_MISC_CFG_ADDR_REGISTER_NBR PPC_BITMASK(20, 31) #define PAU_CTL_MISC_CFG_ADDR_TYPE PPC_BIT(32) +/* CQ_DAT block registers */ +#define PAU_DAT_MISC_CFG1 (PAU_BLOCK_CQ_DAT + 0x008) +#define PAU_DAT_MISC_CFG1_OCAPI_MODE PPC_BITMASK(40, 44) + +/* XSL block registers */ +#define PAU_XSL_WRAP_CFG (PAU_BLOCK_XSL + 0x100) +#define PAU_XSL_WRAP_CFG_CLOCK_ENABLE PPC_BIT(0) + +/* XTS block registers */ +#define PAU_XTS_CFG (PAU_BLOCK_PAU_XTS + 0x020) +#define PAU_XTS_CFG_OPENCAPI PPC_BIT(15) +#define PAU_XTS_CFG2 (PAU_BLOCK_PAU_XTS + 0x028) +#define PAU_XTS_CFG2_XSL2_ENA PPC_BIT(55) + +/* MISC block registers */ +#define PAU_MISC_OPTICAL_IO_CONFIG (PAU_BLOCK_PAU_MISC + 0x018) +#define PAU_MISC_OPTICAL_IO_CONFIG_OTL PPC_BITMASK(2, 3) +#define PAU_MISC_HOLD (PAU_BLOCK_PAU_MISC + 0x020) +#define PAU_MISC_HOLD_NDL_STALL PPC_BITMASK(0, 3) +#define PAU_MISC_CONFIG (PAU_BLOCK_PAU_MISC + 0x030) +#define PAU_MISC_CONFIG_OC_MODE PPC_BIT(16) + #endif /* __PAU_REGS_H */ diff --git a/include/pau.h b/include/pau.h index 6f2eef6e..4d78cbb6 100644 --- a/include/pau.h +++ b/include/pau.h @@ -22,6 +22,7 @@ enum pau_dev_type { /* Used to expose a hardware BAR (or logical slice of it) outside skiboot */ struct pau_bar { + bool enable; uint64_t addr; uint64_t size; uint64_t cfg; -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:50 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:50 +0200 Subject: [Skiboot] [PATCH 09/16] [PATCH 09/16] opencapi5: enable interrupt on error In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-10-clombard@linux.vnet.ibm.com> The default action for the errors (unexpected errors on the opencapi link) reported in the PAU FIR2 register is mostly set to system checkstop. This patch changes the default action of those errors so that the PAU will raise an interrupt instead. Interrupt information is logged so that the error can be debugged and Linux can catch the event.
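For context on what "action" means here: skiboot's HMI handler (reworked in the "hmi scom dump" patch later in this series) treats a FIR bit as fatal only when it is raised, unmasked, and set in both ACTION registers. A hedged sketch of that classification, reusing the expression from core/hmi.c (the helper name is illustrative, not part of the patch):

static inline bool fir_bit_is_fatal(uint64_t fir, uint64_t mask,
				    uint64_t action0, uint64_t action1,
				    int bit)
{
	/* fatal = raised, unmasked, and flagged in both ACTION registers */
	return (fir & ~mask & action0 & action1 & PPC_BIT(bit)) != 0;
}

Changing the ACTION defaults for the OTL/XSL errors is what downgrades them from a checkstop to an interrupt, which this patch then routes to OPAL and logs.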
Signed-off-by: Christophe Lombard --- hw/pau.c | 157 +++++++++++++++++++++++++++++++++++++++++++++ include/pau-regs.h | 12 ++++ include/pau.h | 2 + 3 files changed, 171 insertions(+) diff --git a/hw/pau.c b/hw/pau.c index d45b5023..98abe704 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include @@ -236,6 +237,15 @@ static int pau_opencapi_set_fence_control(struct pau_dev *dev, return OPAL_HARDWARE; } +#define PAU_DEV_STATUS_BROKEN 0x1 + +static void pau_opencapi_set_broken(struct pau_dev *dev) +{ + PAUDEVDBG(dev, "Update status to broken\n"); + + dev->status = PAU_DEV_STATUS_BROKEN; +} + static void pau_opencapi_mask_firs(struct pau *pau) { uint64_t reg, val; @@ -309,6 +319,113 @@ static void pau_opencapi_assign_bars(struct pau *pau) } } +static uint64_t pau_opencapi_ipi_attributes(struct irq_source *is, + uint32_t isn) +{ + struct pau *pau = is->data; + uint32_t level = isn - pau->irq_base; + + if (level >= 37 && level <= 40) { + /* level 37-40: OTL/XSL interrupt */ + return IRQ_ATTR_TARGET_OPAL | + IRQ_ATTR_TARGET_RARE | + IRQ_ATTR_TYPE_MSI; + } + + return IRQ_ATTR_TARGET_LINUX; +} + +static void pau_opencapi_ipi_interrupt(struct irq_source *is, + uint32_t isn) +{ + struct pau *pau = is->data; + uint32_t level = isn - pau->irq_base; + struct pau_dev *dev; + + switch (level) { + case 37 ... 40: + pau_for_each_opencapi_dev(dev, pau) + pau_opencapi_set_broken(dev); + + opal_update_pending_evt(OPAL_EVENT_PCI_ERROR, + OPAL_EVENT_PCI_ERROR); + break; + default: + PAUERR(pau, "Received unknown interrupt %d\n", level); + return; + } +} + +#define PAU_IRQ_LEVELS 60 + +static char *pau_opencapi_ipi_name(struct irq_source *is, uint32_t isn) +{ + struct pau *pau = is->data; + uint32_t level = isn - pau->irq_base; + + switch (level) { + case 0 ... 19: + return strdup("Reserved"); + case 20: + return strdup("An error event related to PAU CQ functions"); + case 21: + return strdup("An error event related to PAU MISC functions"); + case 22 ... 34: + return strdup("Reserved"); + case 35: + return strdup("Translation failure for OCAPI link 0"); + case 36: + return strdup("Translation failure for OCAPI link 1"); + case 37: + return strdup("An error event related to OTL for link 0"); + case 38: + return strdup("An error event related to OTL for link 1"); + case 39: + return strdup("An error event related to XSL for link 0"); + case 40: + return strdup("An error event related to XSL for link 1"); + case 41 ... 
59: + return strdup("Reserved"); + } + + return strdup("Unknown"); +} + +static const struct irq_source_ops pau_opencapi_ipi_ops = { + .attributes = pau_opencapi_ipi_attributes, + .interrupt = pau_opencapi_ipi_interrupt, + .name = pau_opencapi_ipi_name, +}; + +static void pau_opencapi_setup_irqs(struct pau *pau) +{ + uint64_t reg, val; + uint32_t base; + + base = xive2_alloc_ipi_irqs(pau->chip_id, PAU_IRQ_LEVELS, 64); + if (base == XIVE_IRQ_ERROR) { + PAUERR(pau, "Failed to allocate interrupt sources\n"); + return; + } + + xive2_register_ipi_source(base, PAU_IRQ_LEVELS, pau, &pau_opencapi_ipi_ops); + + /* Set IPI configuration */ + reg = PAU_MISC_CONFIG; + val = pau_read(pau, reg); + val = SETFIELD(PAU_MISC_CONFIG_IPI_PS, val, PAU_MISC_CONFIG_IPI_PS_64K); + val = SETFIELD(PAU_MISC_CONFIG_IPI_OS, val, PAU_MISC_CONFIG_IPI_OS_AIX); + pau_write(pau, reg, val); + + /* Set IRQ base */ + reg = PAU_MISC_INT_BAR; + val = SETFIELD(PAU_MISC_INT_BAR_ADDR, 0ull, + (uint64_t)xive2_get_trigger_port(base) >> 12); + pau_write(pau, reg, val); + + pau->irq_base = base; +} + static void pau_opencapi_enable_bars(struct pau_dev *dev, bool enable) { struct pau *pau = dev->pau; @@ -768,12 +885,48 @@ static void pau_opencapi_address_translation_config(struct pau_dev *dev) /* XSL_GP - use defaults */ } +static void pau_opencapi_enable_interrupt_on_error(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t reg, val; + + PAUDEVDBG(dev, "Enable Interrupt-on-error\n"); + + /* translation fault */ + reg = PAU_MISC_INT_2_CONFIG; + val = pau_read(pau, reg); + val |= PAU_MISC_INT_2_CONFIG_XFAULT_2_5(dev->index); + pau_write(pau, reg, val); + + /* freeze disable */ + reg = PAU_MISC_FREEZE_1_CONFIG; + val = pau_read(pau, reg); + val &= ~PAU_FIR1_NDL_BRICKS_0_5; + val &= ~PAU_FIR1_NDL_BRICKS_6_11; + pau_write(pau, reg, val); + + /* fence disable */ + reg = PAU_MISC_FENCE_1_CONFIG; + val = pau_read(pau, reg); + val &= ~PAU_FIR1_NDL_BRICKS_0_5; + val &= ~PAU_FIR1_NDL_BRICKS_6_11; + pau_write(pau, reg, val); + + /* irq disable */ + reg = PAU_MISC_INT_1_CONFIG; + val = pau_read(pau, reg); + val &= ~PAU_FIR1_NDL_BRICKS_0_5; + val &= ~PAU_FIR1_NDL_BRICKS_6_11; + pau_write(pau, reg, val); +} + static void pau_opencapi_init_hw(struct pau *pau) { struct pau_dev *dev = NULL; pau_opencapi_mask_firs(pau); pau_opencapi_assign_bars(pau); + pau_opencapi_setup_irqs(pau); /* Create phb */ pau_for_each_opencapi_dev(dev, pau) { @@ -850,6 +1003,10 @@ static void pau_opencapi_init_hw(struct pau *pau) * AFU's memory */ + /* Procedure 17.1.3.11 - Interrupt Configuration */ + /* done in pau_opencapi_setup_irqs() */ + pau_opencapi_enable_interrupt_on_error(dev); + /* Reset disabled. 
Place OTLs into Run State */ pau_opencapi_set_fence_control(dev, 0b00); } diff --git a/include/pau-regs.h b/include/pau-regs.h index e4ff7cc0..d98f435b 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -139,6 +139,18 @@ #define PAU_MISC_HOLD (PAU_BLOCK_PAU_MISC + 0x020) #define PAU_MISC_HOLD_NDL_STALL PPC_BITMASK(0, 3) #define PAU_MISC_CONFIG (PAU_BLOCK_PAU_MISC + 0x030) +#define PAU_MISC_CONFIG_IPI_PS PPC_BIT(11) +#define PAU_MISC_CONFIG_IPI_PS_64K 1 +#define PAU_MISC_CONFIG_IPI_OS PPC_BIT(12) +#define PAU_MISC_CONFIG_IPI_OS_AIX 0 #define PAU_MISC_CONFIG_OC_MODE PPC_BIT(16) +#define PAU_MISC_FREEZE_1_CONFIG (PAU_BLOCK_PAU_MISC + 0x048) +#define PAU_MISC_FENCE_1_CONFIG (PAU_BLOCK_PAU_MISC + 0x058) +#define PAU_MISC_INT_1_CONFIG (PAU_BLOCK_PAU_MISC + 0x068) +#define PAU_MISC_INT_BAR (PAU_BLOCK_PAU_MISC + 0x098) +#define PAU_MISC_INT_BAR_ADDR PPC_BITMASK(0, 39) +#define PAU_MISC_INT_2_CONFIG (PAU_BLOCK_PAU_MISC + 0x408) +#define PAU_MISC_INT_2_CONFIG_XFAULT_2_5(n) PPC_BIT(0 + (n)) +#define PAU_MISC_INT_2_CONFIG_XFAULT_0_1(n) PPC_BIT(54 + (n)) #endif /* __PAU_REGS_H */ diff --git a/include/pau.h b/include/pau.h index 4d78cbb6..d6a08809 100644 --- a/include/pau.h +++ b/include/pau.h @@ -33,6 +33,7 @@ struct pau_dev { uint32_t index; struct dt_node *dn; struct phb phb; + uint32_t status; struct pau_bar ntl_bar; struct pau_bar genid_bar; @@ -59,6 +60,7 @@ struct pau { uint64_t regs[2]; bool mmio_access; + uint32_t irq_base; struct lock lock; uint32_t links; -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:49 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:49 +0200 Subject: [Skiboot] [PATCH 08/16] [PATCH 08/16] opencapi5: translation layer configuration In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-9-clombard@linux.vnet.ibm.com> Next main part of the hypervisor PAU initialization. The P10 PAU supports two OpenCAPI links. The PAU provides various configuration selections for both of the OCAPI Link Transaction Layer functions (OTLs). These include a link enable, behavior controls, debug modes, and virtual channel credits to send to the AFU. The OTL Configuration 0, OTL Configuration 1, OTL Configuration 2, and TLX Credit Configuration registers are used to control these functions. This patch completes the PAU configuration following the sections 17.1.3.4 to 17.1.3.10.2 of the workbook document. 
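One sequencing detail worth highlighting: the hunks below repeatedly move each brick through the OTL fence state machine via pau_opencapi_set_fence_control(). As an illustrative summary (the enum and its names are hypothetical; the 2-bit values and their meanings are taken from the code comments in this series):

enum pau_fence_request {
	PAU_FENCE_RUN		= 0b00,	/* reset disabled, OTLs in run state */
	PAU_FENCE_OTL_OFF	= 0b01,	/* OTL disabled (used during bring-up) */
	PAU_FENCE_PBUS		= 0b10,	/* PowerBus fenced, OTL operational */
	PAU_FENCE_FULL		= 0b11,	/* both OTL and PowerBus fenced */
};

So the reset flow below is 0b11 (fence everything), then 0b10 (release the OTL from reset), and finally 0b00 (run) once configuration is complete.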
Signed-off-by: Christophe Lombard --- hw/pau.c | 145 +++++++++++++++++++++++++++++++++++++++++++++ include/pau-regs.h | 37 ++++++++++++ 2 files changed, 182 insertions(+) diff --git a/hw/pau.c b/hw/pau.c index d7b51ee5..d45b5023 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -236,6 +236,22 @@ static int pau_opencapi_set_fence_control(struct pau_dev *dev, return OPAL_HARDWARE; } +static void pau_opencapi_mask_firs(struct pau *pau) +{ + uint64_t reg, val; + + reg = pau->xscom_base + PAU_FIR_MASK(1); + xscom_read(pau->chip_id, reg, &val); + val |= PAU_FIR1_NDL_BRICKS_0_5; + val |= PAU_FIR1_NDL_BRICKS_6_11; + xscom_write(pau->chip_id, reg, val); + + reg = pau->xscom_base + PAU_FIR_MASK(2); + xscom_read(pau->chip_id, reg, &val); + val |= PAU_FIR2_OTL_PERR; + xscom_write(pau->chip_id, reg, val); +} + static void pau_opencapi_assign_bars(struct pau *pau) { struct pau_dev *dev; @@ -671,10 +687,92 @@ static void pau_opencapi_enable_powerbus(struct pau *pau) pau_write(pau, reg, val); } +static void pau_opencapi_tl_config(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t val; + + PAUDEVDBG(dev, "TL Configuration\n"); + + /* OTL Config 0 */ + val = 0; + val |= PAU_OTL_MISC_CFG0_EN; + val |= PAU_OTL_MISC_CFG0_BLOCK_PE_HANDLE; + val = SETFIELD(PAU_OTL_MISC_CFG0_BRICKID, val, dev->index); + val |= PAU_OTL_MISC_CFG0_ENABLE_4_0; + val |= PAU_OTL_MISC_CFG0_XLATE_RELEASE; + val |= PAU_OTL_MISC_CFG0_ENABLE_5_0; + pau_write(pau, PAU_OTL_MISC_CFG0(dev->index), val); + + /* OTL Config 1 */ + val = 0; + val = SETFIELD(PAU_OTL_MISC_CFG_TX_DRDY_WAIT, val, 0b010); + val = SETFIELD(PAU_OTL_MISC_CFG_TX_TEMP0_RATE, val, 0b0000); + val = SETFIELD(PAU_OTL_MISC_CFG_TX_TEMP1_RATE, val, 0b0011); + val = SETFIELD(PAU_OTL_MISC_CFG_TX_TEMP2_RATE, val, 0b0111); + val = SETFIELD(PAU_OTL_MISC_CFG_TX_TEMP3_RATE, val, 0b0010); + val = SETFIELD(PAU_OTL_MISC_CFG_TX_CRET_FREQ, val, 0b001); + pau_write(pau, PAU_OTL_MISC_CFG_TX(dev->index), val); + + /* OTL Config 2 - Done after link training, in otl_tx_send_enable() */ + + /* TLX Credit Configuration */ + val = 0; + val = SETFIELD(PAU_OTL_MISC_CFG_TLX_CREDITS_VC0, val, 0x40); + val = SETFIELD(PAU_OTL_MISC_CFG_TLX_CREDITS_VC1, val, 0x40); + val = SETFIELD(PAU_OTL_MISC_CFG_TLX_CREDITS_VC2, val, 0x40); + val = SETFIELD(PAU_OTL_MISC_CFG_TLX_CREDITS_VC3, val, 0x40); + val = SETFIELD(PAU_OTL_MISC_CFG_TLX_CREDITS_DCP0, val, 0x80); + val = SETFIELD(PAU_OTL_MISC_CFG_TLX_CREDITS_SPARE, val, 0x80); + val = SETFIELD(PAU_OTL_MISC_CFG_TLX_CREDITS_DCP2, val, 0x80); + val = SETFIELD(PAU_OTL_MISC_CFG_TLX_CREDITS_DCP3, val, 0x80); + pau_write(pau, PAU_OTL_MISC_CFG_TLX_CREDITS(dev->index), val); +} + +static void pau_opencapi_enable_otlcq_interface(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint8_t typemap = 0; + uint64_t reg, val; + + PAUDEVDBG(dev, "Enabling OTL-CQ Interface\n"); + + typemap |= 0x10 >> dev->index; + reg = PAU_CTL_MISC_CFG0; + val = pau_read(pau, reg); + typemap |= GETFIELD(PAU_CTL_MISC_CFG0_OTL_ENABLE, val); + val = SETFIELD(PAU_CTL_MISC_CFG0_OTL_ENABLE, val, typemap); + pau_write(pau, reg, val); +} + +static void pau_opencapi_address_translation_config(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t reg, val; + + PAUDEVDBG(dev, "Address Translation Configuration\n"); + + /* OpenCAPI 4.0 Mode */ + reg = PAU_XSL_OSL_XLATE_CFG(dev->index); + val = pau_read(pau, reg); + val |= PAU_XSL_OSL_XLATE_CFG_AFU_DIAL; + val &= ~PAU_XSL_OSL_XLATE_CFG_OPENCAPI3; + pau_write(pau, reg, val); + + /* MMIO shootdowns (OpenCAPI 5.0) */ + reg = PAU_XTS_CFG3; + val = 
pau_read(pau, reg); + val |= PAU_XTS_CFG3_MMIOSD_OCAPI; + pau_write(pau, reg, val); + + /* XSL_GP - use defaults */ +} + static void pau_opencapi_init_hw(struct pau *pau) { struct pau_dev *dev = NULL; + pau_opencapi_mask_firs(pau); pau_opencapi_assign_bars(pau); /* Create phb */ @@ -708,6 +806,53 @@ static void pau_opencapi_init_hw(struct pau *pau) * and machine state allocation */ pau->mmio_access = true; + + pau_for_each_opencapi_dev(dev, pau) { + /* Procedure 17.1.3.4 - Transaction Layer Configuration + * OCAPI Link Transaction Layer functions + */ + pau_opencapi_tl_config(dev); + + /* Procedure 17.1.3.4.1 - Enabling OTL-CQ Interface */ + pau_opencapi_enable_otlcq_interface(dev); + + /* Procedure 17.1.3.4.2 - Place OTL into Reset State + * Reset (Fence) both OTL and the PowerBus for this + * Brick + */ + pau_opencapi_set_fence_control(dev, 0b11); + + /* Take PAU out of OTL Reset State + * Reset (Fence) only the PowerBus for this Brick, OTL + * will be operational + */ + pau_opencapi_set_fence_control(dev, 0b10); + + /* Procedure 17.1.3.5 - Address Translation Configuration */ + pau_opencapi_address_translation_config(dev); + + /* Procedure 17.1.3.6 - AFU Memory Range BARs */ + /* Will be done out of this process */ + + /* Procedure 17.1.3.8 - AFU MMIO Range BARs */ + /* done in pau_opencapi_assign_bars() */ + + /* Procedure 17.1.3.9 - AFU Config BARs */ + /* done in pau_opencapi_assign_bars() */ + + /* Precedure 17.1.3.10 - Relaxed Ordering Configuration */ + /* Procedure 17.1.3.10.1 - Generation-Id Registers MMIO Bars */ + /* done in pau_opencapi_assign_bars() */ + + /* Procedure 17.1.3.10.2 - Relaxed Ordering Source Configuration */ + /* For an OpenCAPI AFU that uses M2 Memory Mode, + * Relaxed Ordering can be used for accesses to the + * AFU's memory + */ + + /* Reset disabled. 
Place OTLs into Run State */ + pau_opencapi_set_fence_control(dev, 0b00); + } } static void pau_opencapi_init(struct pau *pau) diff --git a/include/pau-regs.h b/include/pau-regs.h index 6aeb7589..e4ff7cc0 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -12,6 +12,10 @@ #define PAU_FIR_ACTION1(n) (0x407 + (n) * 0x40) #define PAU_FIR_MAX 3 +#define PAU_FIR1_NDL_BRICKS_0_5 PPC_BITMASK(0, 11) +#define PAU_FIR1_NDL_BRICKS_6_11 PPC_BITMASK(47, 58) +#define PAU_FIR2_OTL_PERR PPC_BIT(18) + /* PAU RING: Indirect address/data port */ #define PAU_MISC_SCOM_IND_SCOM_ADDR 0x33e #define PAU_MISC_DA_ADDR PPC_BITMASK(0, 23) @@ -28,6 +32,7 @@ #define PAU_BLOCK_CQ_SM(n) PAU_BLOCK(4, (n)) #define PAU_BLOCK_CQ_CTL PAU_BLOCK(4, 4) #define PAU_BLOCK_CQ_DAT PAU_BLOCK(4, 5) +#define PAU_BLOCK_OTL(brk) PAU_BLOCK(4, 0xC + (brk)) #define PAU_BLOCK_XSL PAU_BLOCK(4, 0xE) #define PAU_BLOCK_PAU_XTS PAU_BLOCK(7, 1) #define PAU_BLOCK_PAU_MISC PAU_BLOCK(7, 2) @@ -60,6 +65,8 @@ #define PAU_MISC_MACHINE_ALLOC_ENABLE PPC_BIT(0) /* CQ_CTL block registers */ +#define PAU_CTL_MISC_CFG0 (PAU_BLOCK_CQ_CTL + 0x000) +#define PAU_CTL_MISC_CFG0_OTL_ENABLE PPC_BITMASK(52, 56) #define PAU_CTL_MISC_CFG2 (PAU_BLOCK_CQ_CTL + 0x010) #define PAU_CTL_MISC_CFG2_OCAPI_MODE PPC_BITMASK(0, 4) #define PAU_CTL_MISC_CFG2_OCAPI_4 PPC_BITMASK(10, 14) @@ -86,15 +93,45 @@ #define PAU_DAT_MISC_CFG1 (PAU_BLOCK_CQ_DAT + 0x008) #define PAU_DAT_MISC_CFG1_OCAPI_MODE PPC_BITMASK(40, 44) +/* OTL block registers */ +#define PAU_OTL_MISC_CFG0(brk) (PAU_BLOCK_OTL(brk) + 0x000) +#define PAU_OTL_MISC_CFG0_EN PPC_BIT(0) +#define PAU_OTL_MISC_CFG0_BLOCK_PE_HANDLE PPC_BIT(1) +#define PAU_OTL_MISC_CFG0_BRICKID PPC_BITMASK(2, 3) +#define PAU_OTL_MISC_CFG0_ENABLE_4_0 PPC_BIT(51) +#define PAU_OTL_MISC_CFG0_XLATE_RELEASE PPC_BIT(62) +#define PAU_OTL_MISC_CFG0_ENABLE_5_0 PPC_BIT(63) +#define PAU_OTL_MISC_CFG_TLX_CREDITS(brk) (PAU_BLOCK_OTL(brk) + 0x050) +#define PAU_OTL_MISC_CFG_TLX_CREDITS_VC0 PPC_BITMASK(0, 7) +#define PAU_OTL_MISC_CFG_TLX_CREDITS_VC1 PPC_BITMASK(8, 15) +#define PAU_OTL_MISC_CFG_TLX_CREDITS_VC2 PPC_BITMASK(16, 23) +#define PAU_OTL_MISC_CFG_TLX_CREDITS_VC3 PPC_BITMASK(24, 31) +#define PAU_OTL_MISC_CFG_TLX_CREDITS_DCP0 PPC_BITMASK(32, 39) +#define PAU_OTL_MISC_CFG_TLX_CREDITS_SPARE PPC_BITMASK(40, 47) +#define PAU_OTL_MISC_CFG_TLX_CREDITS_DCP2 PPC_BITMASK(48, 55) +#define PAU_OTL_MISC_CFG_TLX_CREDITS_DCP3 PPC_BITMASK(56, 63) +#define PAU_OTL_MISC_CFG_TX(brk) (PAU_BLOCK_OTL(brk) + 0x058) +#define PAU_OTL_MISC_CFG_TX_DRDY_WAIT PPC_BITMASK(5, 7) +#define PAU_OTL_MISC_CFG_TX_TEMP0_RATE PPC_BITMASK(8, 11) +#define PAU_OTL_MISC_CFG_TX_TEMP1_RATE PPC_BITMASK(12, 15) +#define PAU_OTL_MISC_CFG_TX_TEMP2_RATE PPC_BITMASK(16, 19) +#define PAU_OTL_MISC_CFG_TX_TEMP3_RATE PPC_BITMASK(20, 23) +#define PAU_OTL_MISC_CFG_TX_CRET_FREQ PPC_BITMASK(32, 34) + /* XSL block registers */ #define PAU_XSL_WRAP_CFG (PAU_BLOCK_XSL + 0x100) #define PAU_XSL_WRAP_CFG_CLOCK_ENABLE PPC_BIT(0) +#define PAU_XSL_OSL_XLATE_CFG(brk) (PAU_BLOCK_XSL + 0x040 + (brk) * 8) +#define PAU_XSL_OSL_XLATE_CFG_AFU_DIAL PPC_BIT(0) +#define PAU_XSL_OSL_XLATE_CFG_OPENCAPI3 PPC_BIT(32) /* XTS block registers */ #define PAU_XTS_CFG (PAU_BLOCK_PAU_XTS + 0x020) #define PAU_XTS_CFG_OPENCAPI PPC_BIT(15) #define PAU_XTS_CFG2 (PAU_BLOCK_PAU_XTS + 0x028) #define PAU_XTS_CFG2_XSL2_ENA PPC_BIT(55) +#define PAU_XTS_CFG3 (PAU_BLOCK_PAU_XTS + 0x068) +#define PAU_XTS_CFG3_MMIOSD_OCAPI PPC_BIT(5) /* MISC block registers */ #define PAU_MISC_OPTICAL_IO_CONFIG (PAU_BLOCK_PAU_MISC + 0x018) -- 2.31.1 From clombard at 
linux.vnet.ibm.com Fri Aug 20 19:45:52 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:52 +0200 Subject: [Skiboot] [PATCH 11/16] opencapi5: hmi scom dump In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-12-clombard@linux.vnet.ibm.com> This patch adds a new function to dump PAU registers when an HMI has been raised and an OpenCAPI link has been hit by an error. For each register, the scom address and the register value are printed. The hmi.c code has been redesigned to support the new PHB/PCIEX type (PAU OpenCAPI). The *npu* functions now support the NPU and PAU units of P8, P9 and P10 chips. Signed-off-by: Christophe Lombard --- core/hmi.c | 263 ++++++++++++++++++--------------------- hw/npu2-common.c | 30 ++--- hw/pau.c | 50 ++++++++ include/npu2-regs.h | 5 + include/npu2.h | 2 +- include/pau-regs.h | 24 ++++ include/pau.h | 2 + include/xscom-p10-regs.h | 3 + 8 files changed, 213 insertions(+), 166 deletions(-) diff --git a/core/hmi.c b/core/hmi.c index 9363cc5f..8d287cb3 100644 --- a/core/hmi.c +++ b/core/hmi.c @@ -19,8 +19,10 @@ #include #include #include +#include #include #include +#include #include #include #include @@ -717,13 +719,7 @@ static void find_nx_checkstop_reason(int flat_chip_id, queue_hmi_event(hmi_evt, 0, out_flags); } -static bool phb_is_npu2(struct dt_node *dn) -{ - return (dt_node_is_compatible(dn, "ibm,power9-npu-pciex") || - dt_node_is_compatible(dn, "ibm,power9-npu-opencapi-pciex")); } - -static void add_npu2_xstop_reason(uint32_t *xstop_reason, uint8_t reason) +static void add_npu_xstop_reason(uint32_t *xstop_reason, uint8_t reason) { int i, reason_count; uint8_t *ptr; @@ -739,8 +735,8 @@ static void add_npu2_xstop_reason(uint32_t *xstop_reason, uint8_t reason) } } -static void encode_npu2_xstop_reason(uint32_t *xstop_reason, - uint64_t fir, int fir_number) +static void encode_npu_xstop_reason(uint32_t *xstop_reason, + uint64_t fir, int fir_number) { int bit; uint8_t reason; @@ -758,114 +754,112 @@ static void encode_npu2_xstop_reason(uint32_t *xstop_reason, bit = ilog2(fir); reason = fir_number << 6; reason |= (63 - bit); // IBM numbering - add_npu2_xstop_reason(xstop_reason, reason); + add_npu_xstop_reason(xstop_reason, reason); fir ^= 1ULL << bit; } } -static void find_npu2_checkstop_reason(int flat_chip_id, - struct OpalHMIEvent *hmi_evt, - uint64_t *out_flags) +static bool npu_fir_errors(struct phb *phb, int flat_chip_id, + uint32_t *xstop_reason) { - struct phb *phb; - int i; - bool npu2_hmi_verbose = false, found = false; - uint64_t npu2_fir; - uint64_t npu2_fir_mask; - uint64_t npu2_fir_action0; - uint64_t npu2_fir_action1; - uint64_t npu2_fir_addr; - uint64_t npu2_fir_mask_addr; - uint64_t npu2_fir_action0_addr; - uint64_t npu2_fir_action1_addr; + uint64_t fir, fir_mask; + uint64_t fir_action0, fir_action1; + uint64_t fir_reg, fir_mask_reg; + uint64_t fir_action0_reg, fir_action1_reg; uint64_t fatal_errors; - uint32_t xstop_reason = 0; - int total_errors = 0; + uint64_t xscom_base; + bool fir_errors = false; + int fir_regs; const char *loc; - - /* NPU2 only */ - if (PVR_TYPE(mfspr(SPR_PVR)) != PVR_TYPE_P9) - return; - - /* Find the NPU on the chip associated with the HMI.
*/ - for_each_phb(phb) { - /* NOTE: if a chip ever has >1 NPU this will need adjusting */ - if (phb_is_npu2(phb->dt_node) && - (dt_get_chip_id(phb->dt_node) == flat_chip_id)) { - found = true; - break; + struct npu *npu; + + fir_regs = (phb->phb_type == phb_type_pcie_v3) ? 1 : 3; + + for (uint32_t i = 0; i < fir_regs; i++) { + switch (phb->phb_type) { + case phb_type_pcie_v3: + fir_reg = NX_FIR; + fir_mask_reg = NX_FIR_MASK; + fir_action0_reg = NX_FIR_ACTION0; + fir_action1_reg = NX_FIR_ACTION1; + + npu = phb_to_npu(phb); + if (!(npu == NULL)) + xscom_base = npu->at_xscom; + else + continue; + break; + case phb_type_npu_v2: + case phb_type_npu_v2_opencapi: + fir_reg = NPU2_FIR(i); + fir_mask_reg = NPU2_FIR_MASK(i); + fir_action0_reg = NPU2_FIR_ACTION0(i); + fir_action1_reg = NPU2_FIR_ACTION1(i); + xscom_base = dt_prop_get_u32(phb->dt_node, "ibm,xscom-base"); + break; + case phb_type_pau_opencapi: + fir_reg = PAU_FIR(i); + fir_mask_reg = PAU_FIR_MASK(i); + fir_action0_reg = PAU_FIR_ACTION0(i); + fir_action1_reg = PAU_FIR_ACTION1(i); + xscom_base = dt_prop_get_u32(phb->dt_node, "ibm,xscom-base"); + break; + default: + continue; } - } - - /* If we didn't find a NPU on the chip, it's not our checkstop. */ - if (!found) - return; - npu2_fir_addr = NPU2_FIR_REGISTER_0; - npu2_fir_mask_addr = NPU2_FIR_REGISTER_0 + NPU2_FIR_MASK_OFFSET; - npu2_fir_action0_addr = NPU2_FIR_REGISTER_0 + NPU2_FIR_ACTION0_OFFSET; - npu2_fir_action1_addr = NPU2_FIR_REGISTER_0 + NPU2_FIR_ACTION1_OFFSET; - - for (i = 0; i < NPU2_TOTAL_FIR_REGISTERS; i++) { - /* Read all the registers necessary to find a checkstop condition. */ - if (xscom_read(flat_chip_id, npu2_fir_addr, &npu2_fir) || - xscom_read(flat_chip_id, npu2_fir_mask_addr, &npu2_fir_mask) || - xscom_read(flat_chip_id, npu2_fir_action0_addr, &npu2_fir_action0) || - xscom_read(flat_chip_id, npu2_fir_action1_addr, &npu2_fir_action1)) { - prerror("HMI: Couldn't read NPU FIR register%d with XSCOM\n", i); + if (xscom_read(flat_chip_id, xscom_base + fir_reg, &fir) || + xscom_read(flat_chip_id, xscom_base + fir_mask_reg, &fir_mask) || + xscom_read(flat_chip_id, xscom_base + fir_action0_reg, &fir_action0) || + xscom_read(flat_chip_id, xscom_base + fir_action1_reg, &fir_action1)) { + prerror("HMI: Couldn't read NPU/PAU FIR register%d with XSCOM\n", i); continue; } - fatal_errors = npu2_fir & ~npu2_fir_mask & npu2_fir_action0 & npu2_fir_action1; + fatal_errors = fir & ~fir_mask & fir_action0 & fir_action1; if (fatal_errors) { loc = chip_loc_code(flat_chip_id); if (!loc) loc = "Not Available"; - prlog(PR_ERR, "NPU: [Loc: %s] P:%d FIR#%d FIR 0x%016llx mask 0x%016llx\n", - loc, flat_chip_id, i, npu2_fir, npu2_fir_mask); - prlog(PR_ERR, "NPU: [Loc: %s] P:%d ACTION0 0x%016llx, ACTION1 0x%016llx\n", - loc, flat_chip_id, npu2_fir_action0, npu2_fir_action1); - total_errors++; - - encode_npu2_xstop_reason(&xstop_reason, fatal_errors, i); + prlog(PR_ERR, "NPU/PAU: [Loc: %s] P:%d FIR#%d " + "FIR 0x%016llx mask 0x%016llx\n", + loc, flat_chip_id, i, fir, fir_mask); + prlog(PR_ERR, "NPU/PAU: [Loc: %s] P:%d ACTION0 " + "0x%016llx, ACTION1 0x%016llx\n", + loc, flat_chip_id, fir_action0, fir_action1); + if (phb->phb_type != phb_type_pcie_v3) + encode_npu_xstop_reason(xstop_reason, + fatal_errors, + i); + fir_errors = true; } - - /* Can't do a fence yet, we are just logging fir information for now */ - npu2_fir_addr += NPU2_FIR_OFFSET; - npu2_fir_mask_addr += NPU2_FIR_OFFSET; - npu2_fir_action0_addr += NPU2_FIR_OFFSET; - npu2_fir_action1_addr += NPU2_FIR_OFFSET; - } - if (!total_errors) - 
return; - - npu2_hmi_verbose = nvram_query_eq_safe("npu2-hmi-verbose", "true"); - /* Force this for now until we sort out something better */ - npu2_hmi_verbose = true; + /* dump registers */ + if (fir_errors) { + switch (phb->phb_type) { + case phb_type_npu_v2: + case phb_type_npu_v2_opencapi: + npu2_dump_scoms(phb, flat_chip_id); + break; + case phb_type_pau_opencapi: + pau_opencapi_dump_scoms(phb); + break; + default: + break; + } - if (npu2_hmi_verbose) { - npu2_dump_scoms(flat_chip_id); prlog(PR_ERR, " _________________________ \n"); - prlog(PR_ERR, "< It's Debug time! >\n"); + prlog(PR_ERR, "< It's Debug time! >\n"); prlog(PR_ERR, " ------------------------- \n"); - prlog(PR_ERR, " \\ ,__, \n"); - prlog(PR_ERR, " \\ (oo)____ \n"); - prlog(PR_ERR, " (__) )\\ \n"); + prlog(PR_ERR, " \\ ,__, \n"); + prlog(PR_ERR, " \\ (oo)____ \n"); + prlog(PR_ERR, " (__) )\\ \n"); prlog(PR_ERR, " ||--|| * \n"); } - /* Set up the HMI event */ - hmi_evt->severity = OpalHMI_SEV_WARNING; - hmi_evt->type = OpalHMI_ERROR_MALFUNC_ALERT; - hmi_evt->u.xstop_error.xstop_type = CHECKSTOP_TYPE_NPU; - hmi_evt->u.xstop_error.xstop_reason = cpu_to_be32(xstop_reason); - hmi_evt->u.xstop_error.u.chip_id = cpu_to_be32(flat_chip_id); - - /* Marking the event as recoverable so that we don't crash */ - queue_hmi_event(hmi_evt, 1, out_flags); + return fir_errors; } static void find_npu_checkstop_reason(int flat_chip_id, @@ -873,67 +867,47 @@ static void find_npu_checkstop_reason(int flat_chip_id, uint64_t *out_flags) { struct phb *phb; - struct npu *p = NULL; - - uint64_t npu_fir; - uint64_t npu_fir_mask; - uint64_t npu_fir_action0; - uint64_t npu_fir_action1; - uint64_t fatal_errors; - - /* Only check for NPU errors if the chip has a NPU */ - if (PVR_TYPE(mfspr(SPR_PVR)) != PVR_TYPE_P8NVL) - return find_npu2_checkstop_reason(flat_chip_id, hmi_evt, out_flags); - - /* Find the NPU on the chip associated with the HMI. */ - for_each_phb(phb) { - /* NOTE: if a chip ever has >1 NPU this will need adjusting */ - if (dt_node_is_compatible(phb->dt_node, "ibm,power8-npu-pciex") && - (dt_get_chip_id(phb->dt_node) == flat_chip_id)) { - p = phb_to_npu(phb); - break; - } - } + struct dt_node *dn; + uint32_t xstop_reason = 0; - /* If we didn't find a NPU on the chip, it's not our checkstop. */ - if (p == NULL) + /* Only check for NPU errors if the chip has a NPU/PAU */ + if ((PVR_TYPE(mfspr(SPR_PVR)) != PVR_TYPE_P8NVL) && + (PVR_TYPE(mfspr(SPR_PVR)) != PVR_TYPE_P9) && + (PVR_TYPE(mfspr(SPR_PVR)) != PVR_TYPE_P10)) return; - /* Read all the registers necessary to find a checkstop condition. */ - if (xscom_read(flat_chip_id, - p->at_xscom + NX_FIR, &npu_fir) || - xscom_read(flat_chip_id, - p->at_xscom + NX_FIR_MASK, &npu_fir_mask) || - xscom_read(flat_chip_id, - p->at_xscom + NX_FIR_ACTION0, &npu_fir_action0) || - xscom_read(flat_chip_id, - p->at_xscom + NX_FIR_ACTION1, &npu_fir_action1)) { - prerror("Couldn't read NPU registers with XSCOM\n"); - return; - } + /* Find the NPU/PAU on the chip associated with the HMI. */ + for_each_phb(phb) { + dn = phb->dt_node; - fatal_errors = npu_fir & ~npu_fir_mask & npu_fir_action0 & npu_fir_action1; + if (!(dt_node_is_compatible(dn, "ibm,power8-npu-pciex") || + dt_node_is_compatible(dn, "ibm,power9-npu-pciex") || + dt_node_is_compatible(dn, "ibm,power9-npu-opencapi-pciex") || + dt_node_is_compatible(dn, "ibm,power10-pau-opencapi-pciex"))) + continue; - /* If there's no errors, we don't need to do anything. 
*/ - if (!fatal_errors) - return; + if (dt_get_chip_id(dn) != flat_chip_id) + continue; - prlog(PR_DEBUG, "NPU: FIR 0x%016llx mask 0x%016llx\n", - npu_fir, npu_fir_mask); - prlog(PR_DEBUG, "NPU: ACTION0 0x%016llx, ACTION1 0x%016llx\n", - npu_fir_action0, npu_fir_action1); + /* Read all the registers necessary to find a checkstop condition. */ + if (!npu_fir_errors(phb, flat_chip_id, &xstop_reason)) + continue; - /* Set the NPU to fenced since it can't recover. */ - npu_set_fence_state(p, true); + if (phb->phb_type == phb_type_pcie_v3) { + /* Set the NPU to fenced since it can't recover. */ + npu_set_fence_state(phb_to_npu(phb), true); + } - /* Set up the HMI event */ - hmi_evt->severity = OpalHMI_SEV_WARNING; - hmi_evt->type = OpalHMI_ERROR_MALFUNC_ALERT; - hmi_evt->u.xstop_error.xstop_type = CHECKSTOP_TYPE_NPU; - hmi_evt->u.xstop_error.u.chip_id = cpu_to_be32(flat_chip_id); + /* Set up the HMI event */ + hmi_evt->severity = OpalHMI_SEV_WARNING; + hmi_evt->type = OpalHMI_ERROR_MALFUNC_ALERT; + hmi_evt->u.xstop_error.xstop_type = CHECKSTOP_TYPE_NPU; + hmi_evt->u.xstop_error.xstop_reason = xstop_reason; + hmi_evt->u.xstop_error.u.chip_id = cpu_to_be32(flat_chip_id); - /* The HMI is "recoverable" because it shouldn't crash the system */ - queue_hmi_event(hmi_evt, 1, out_flags); + /* Marking the event as recoverable so that we don't crash */ + queue_hmi_event(hmi_evt, 1, out_flags); + } } static void decode_malfunction(struct OpalHMIEvent *hmi_evt, uint64_t *out_flags) @@ -962,7 +936,8 @@ static void decode_malfunction(struct OpalHMIEvent *hmi_evt, uint64_t *out_flags xscom_write(this_cpu()->chip_id, malf_alert_scom, ~PPC_BIT(i)); find_capp_checkstop_reason(i, hmi_evt, &flags); - find_nx_checkstop_reason(i, hmi_evt, &flags); + if (proc_gen != proc_gen_p10) + find_nx_checkstop_reason(i, hmi_evt, &flags); find_npu_checkstop_reason(i, hmi_evt, &flags); } } diff --git a/hw/npu2-common.c b/hw/npu2-common.c index 3bc9bcee..7e88beac 100644 --- a/hw/npu2-common.c +++ b/hw/npu2-common.c @@ -296,31 +296,19 @@ static void show_all_regs(struct npu2 *npu, int brick_index) } } -void npu2_dump_scoms(int chip_id) +void npu2_dump_scoms(struct phb *phb, int chip_id) { - struct npu2 *npu; - struct phb *phb; + struct npu2 *npu = NULL; struct npu2_dev *dev; - /* - * Look for the npu2 structure for that chip ID. We can access it - * through the array of phbs, looking for a nvlink or opencapi - * phb. 
We can have several entries, but they all point - * to the same npu2 structure - */ - for_each_phb(phb) { - npu = NULL; - if (phb->phb_type == phb_type_npu_v2) { - npu = phb_to_npu2_nvlink(phb); - } else if (phb->phb_type == phb_type_npu_v2_opencapi) { - dev = phb_to_npu2_dev_ocapi(phb); - npu = dev->npu; - } - if (npu && npu->chip_id == chip_id) { - show_all_regs(npu, -1 /* all bricks */); - break; - } + if (phb->phb_type == phb_type_npu_v2) { + npu = phb_to_npu2_nvlink(phb); + } else if (phb->phb_type == phb_type_npu_v2_opencapi) { + dev = phb_to_npu2_dev_ocapi(phb); + npu = dev->npu; } + if (npu && npu->chip_id == chip_id) + show_all_regs(npu, -1 /* all bricks */); } static uint64_t npu2_ipi_attributes(struct irq_source *is __unused, uint32_t isn __unused) diff --git a/hw/pau.c b/hw/pau.c index 68195e48..132ef565 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -32,6 +32,56 @@ struct pau_dev *pau_next_dev(struct pau *pau, struct pau_dev *dev, return NULL; } +static void pau_opencapi_dump_scom_reg(struct pau *pau, uint64_t reg) +{ + PAUDBG(pau, "0x%llx = 0x%016llx\n", reg, pau_read(pau, reg)); +} + +void pau_opencapi_dump_scoms(struct phb *phb) +{ + struct pau *pau; + struct pau_dev *dev; + uint64_t cq_sm; + + if (phb->phb_type == phb_type_pau_opencapi) + pau = ((struct pau_dev *)(pau_phb_to_opencapi_dev(phb)))->pau; + else + return; + + for (uint32_t i = 1; i < 4; i++) { + cq_sm = PAU_BLOCK_CQ_SM(i); + + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_MESSAGE0)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_MESSAGE1)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_MESSAGE2)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_MESSAGE3)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_MESSAGE4)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_MESSAGE5)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_MESSAGE6)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_MESSAGE7)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_FIRST0)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_FIRST1)); + pau_opencapi_dump_scom_reg(pau, cq_sm + PAU_REG_OFFSET(PAU_MCP_MISC_CERR_FIRST2)); + } + + pau_opencapi_dump_scom_reg(pau, PAU_CTL_MISC_CERR_MESSAGE0); + pau_opencapi_dump_scom_reg(pau, PAU_CTL_MISC_CERR_MESSAGE1); + pau_opencapi_dump_scom_reg(pau, PAU_CTL_MISC_CERR_MESSAGE2); + pau_opencapi_dump_scom_reg(pau, PAU_CTL_MISC_CERR_FIRST0); + pau_opencapi_dump_scom_reg(pau, PAU_CTL_MISC_CERR_FIRST1); + pau_opencapi_dump_scom_reg(pau, PAU_DAT_MISC_CERR_ECC_HOLD); + pau_opencapi_dump_scom_reg(pau, PAU_DAT_MISC_CERR_ECC_MASK); + pau_opencapi_dump_scom_reg(pau, PAU_DAT_MISC_CERR_ECC_FIRST); + + pau_for_each_opencapi_dev(dev, pau) { + pau_opencapi_dump_scom_reg(pau, PAU_OTL_MISC_ERR_RPT_HOLD0(dev->index)); + pau_opencapi_dump_scom_reg(pau, PAU_OTL_MISC_OTL_REM0(dev->index)); + pau_opencapi_dump_scom_reg(pau, PAU_OTL_MISC_ERROR_SIG_RXI(dev->index)); + pau_opencapi_dump_scom_reg(pau, PAU_OTL_MISC_ERROR_SIG_RXO(dev->index)); + pau_opencapi_dump_scom_reg(pau, PAU_OTL_MISC_ERR_RPT_HOLD1(dev->index)); + } +} + static void pau_dt_create_link(struct dt_node *pau, uint32_t pau_index, uint32_t dev_index) { diff --git a/include/npu2-regs.h b/include/npu2-regs.h index 22f58a6a..cb1d3956 100644 --- a/include/npu2-regs.h +++ b/include/npu2-regs.h @@ -610,6 +610,11 @@ 
void npu2_scom_write(uint64_t gcid, uint64_t scom_base, #define NPU2_TOTAL_FIR_REGISTERS 3 +#define NPU2_FIR(n) (0x2c00 + (n) * 0x40) +#define NPU2_FIR_MASK(n) (0x2c03 + (n) * 0x40) +#define NPU2_FIR_ACTION0(n) (0x2c06 + (n) * 0x40) +#define NPU2_FIR_ACTION1(n) (0x2c07 + (n) * 0x40) + /* * Can't use enums for 64 bit values, use #defines */ diff --git a/include/npu2.h b/include/npu2.h index f48a68b6..abe88747 100644 --- a/include/npu2.h +++ b/include/npu2.h @@ -241,7 +241,7 @@ int64_t npu2_freeze_status(struct phb *phb __unused, uint8_t *freeze_state, uint16_t *pci_error_type __unused, uint16_t *severity __unused); -void npu2_dump_scoms(int chip_id); +void npu2_dump_scoms(struct phb *phb, int chip_id); int64_t npu2_init_context(struct phb *phb, uint64_t msr, uint64_t bdf); int64_t npu2_destroy_context(struct phb *phb, uint64_t bdf); diff --git a/include/pau-regs.h b/include/pau-regs.h index 19b0b7cd..b852a5b5 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -48,6 +48,17 @@ #define PAU_MCP_MISC_CFG0_MA_MCRESP_OPT_WRP PPC_BIT(9) #define PAU_MCP_MISC_CFG0_ENABLE_PBUS PPC_BIT(26) #define PAU_MCP_MISC_CFG0_OCAPI_MODE PPC_BITMASK(44, 48) +#define PAU_MCP_MISC_CERR_MESSAGE0 (PAU_BLOCK_CQ_SM(0) + 0x030) +#define PAU_MCP_MISC_CERR_MESSAGE1 (PAU_BLOCK_CQ_SM(0) + 0x038) +#define PAU_MCP_MISC_CERR_MESSAGE2 (PAU_BLOCK_CQ_SM(0) + 0x040) +#define PAU_MCP_MISC_CERR_MESSAGE3 (PAU_BLOCK_CQ_SM(0) + 0x048) +#define PAU_MCP_MISC_CERR_MESSAGE4 (PAU_BLOCK_CQ_SM(0) + 0x050) +#define PAU_MCP_MISC_CERR_MESSAGE5 (PAU_BLOCK_CQ_SM(0) + 0x058) +#define PAU_MCP_MISC_CERR_MESSAGE6 (PAU_BLOCK_CQ_SM(0) + 0x060) +#define PAU_MCP_MISC_CERR_MESSAGE7 (PAU_BLOCK_CQ_SM(0) + 0x068) +#define PAU_MCP_MISC_CERR_FIRST0 (PAU_BLOCK_CQ_SM(0) + 0x078) +#define PAU_MCP_MISC_CERR_FIRST1 (PAU_BLOCK_CQ_SM(0) + 0x080) +#define PAU_MCP_MISC_CERR_FIRST2 (PAU_BLOCK_CQ_SM(0) + 0x088) #define PAU_SNP_MISC_CFG0 (PAU_BLOCK_CQ_SM(0) + 0x180) #define PAU_SNP_MISC_CFG0_ENABLE_PBUS PPC_BIT(2) #define PAU_SNP_MISC_CFG0_OCAPI_MODE PPC_BITMASK(32, 36) @@ -79,6 +90,11 @@ #define PAU_CTL_MISC_MMIOPA_CONFIG(brk) (PAU_BLOCK_CQ_CTL + 0x098 + (brk) * 8) #define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_ADDR PPC_BITMASK(1, 35) #define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_SIZE PPC_BITMASK(39, 43) +#define PAU_CTL_MISC_CERR_MESSAGE0 (PAU_BLOCK_CQ_CTL + 0x0C0) +#define PAU_CTL_MISC_CERR_MESSAGE1 (PAU_BLOCK_CQ_CTL + 0x0C8) +#define PAU_CTL_MISC_CERR_MESSAGE2 (PAU_BLOCK_CQ_CTL + 0x0D0) +#define PAU_CTL_MISC_CERR_FIRST0 (PAU_BLOCK_CQ_CTL + 0x0D8) +#define PAU_CTL_MISC_CERR_FIRST1 (PAU_BLOCK_CQ_CTL + 0x0E0) #define PAU_CTL_MISC_FENCE_CTRL(brk) (PAU_BLOCK_CQ_CTL + 0x108 + (brk) * 8) #define PAU_CTL_MISC_FENCE_REQUEST PPC_BITMASK(0, 1) #define PAU_CTL_MISC_CFG_ADDR(brk) (PAU_BLOCK_CQ_CTL + 0x250 + (brk) * 8) @@ -93,6 +109,9 @@ /* CQ_DAT block registers */ #define PAU_DAT_MISC_CFG1 (PAU_BLOCK_CQ_DAT + 0x008) #define PAU_DAT_MISC_CFG1_OCAPI_MODE PPC_BITMASK(40, 44) +#define PAU_DAT_MISC_CERR_ECC_HOLD (PAU_BLOCK_CQ_DAT + 0x020) +#define PAU_DAT_MISC_CERR_ECC_MASK (PAU_BLOCK_CQ_DAT + 0x028) +#define PAU_DAT_MISC_CERR_ECC_FIRST (PAU_BLOCK_CQ_DAT + 0x030) /* OTL block registers */ #define PAU_OTL_MISC_CFG0(brk) (PAU_BLOCK_OTL(brk) + 0x000) @@ -102,6 +121,7 @@ #define PAU_OTL_MISC_CFG0_ENABLE_4_0 PPC_BIT(51) #define PAU_OTL_MISC_CFG0_XLATE_RELEASE PPC_BIT(62) #define PAU_OTL_MISC_CFG0_ENABLE_5_0 PPC_BIT(63) +#define PAU_OTL_MISC_ERR_RPT_HOLD0(brk) (PAU_BLOCK_OTL(brk) + 0x030) #define PAU_OTL_MISC_CFG_TLX_CREDITS(brk) (PAU_BLOCK_OTL(brk) + 0x050) #define PAU_OTL_MISC_CFG_TLX_CREDITS_VC0 
PPC_BITMASK(0, 7) #define PAU_OTL_MISC_CFG_TLX_CREDITS_VC1 PPC_BITMASK(8, 15) @@ -118,6 +138,10 @@ #define PAU_OTL_MISC_CFG_TX_TEMP2_RATE PPC_BITMASK(16, 19) #define PAU_OTL_MISC_CFG_TX_TEMP3_RATE PPC_BITMASK(20, 23) #define PAU_OTL_MISC_CFG_TX_CRET_FREQ PPC_BITMASK(32, 34) +#define PAU_OTL_MISC_OTL_REM0(brk) (PAU_BLOCK_OTL(brk) + 0x068) +#define PAU_OTL_MISC_ERROR_SIG_RXI(brk) (PAU_BLOCK_OTL(brk) + 0x070) +#define PAU_OTL_MISC_ERROR_SIG_RXO(brk) (PAU_BLOCK_OTL(brk) + 0x078) +#define PAU_OTL_MISC_ERR_RPT_HOLD1(brk) (PAU_BLOCK_OTL(brk) + 0x0B0) #define PAU_OTL_MISC_PSL_DSISR_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x000) #define PAU_OTL_MISC_PSL_DAR_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x008) #define PAU_OTL_MISC_PSL_TFC_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x010) diff --git a/include/pau.h b/include/pau.h index d6a08809..ea15f0b8 100644 --- a/include/pau.h +++ b/include/pau.h @@ -189,4 +189,6 @@ static inline uint64_t pau_read(struct pau *pau, uint64_t reg) return pau_scom_read(pau, reg, PAU_MISC_DA_LEN_8B); } +void pau_opencapi_dump_scoms(struct phb *phb); + #endif /* __PAU_H */ diff --git a/include/xscom-p10-regs.h b/include/xscom-p10-regs.h index 6045152d..36c348bc 100644 --- a/include/xscom-p10-regs.h +++ b/include/xscom-p10-regs.h @@ -15,6 +15,9 @@ #define P10_NX_DMA_ENGINE_FIR 0x02011100 /* DMA & Engine FIR Data Register */ #define P10_NX_PBI_FIR 0x02011080 /* PowerBus Interface FIR Register */ +/* pMisc Receive Malfunction Alert Register */ +#define P10_MALFUNC_ALERT 0x00090022 + #define P10_EC_CORE_THREAD_STATE 0x412 /* XXX P10 is this right? */ #define P10_THREAD_STOPPED(t) PPC_BIT(56 + (t)) -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:55 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:55 +0200 Subject: [Skiboot] [PATCH 14/16] opencapi5: add opal functions In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-15-clombard@linux.vnet.ibm.com> Add three OPAL API calls that are required by the ocxl driver. - OPAL_PAU_SPA_SETUP The Shared Process Area (SPA) is a table containing one entry (a "Process Element") per memory context which can be accessed by the OpenCAPI device. - OPAL_PAU_SPA_CLEAR_CACHE The PAU keeps a cache of recently accessed memory contexts. When a Process Element is removed from the SPA, the cache for the link must be cleared. - OPAL_PAU_TL_SET The Transaction Layer specification defines several templates for messages to be exchanged on the link. During link setup, the host and device must negotiate what templates are supported on both sides and at what rates those messages can be sent.
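To make the rate encoding concrete, here is a minimal stand-alone sketch of how such a rate buffer can be decoded (the names and the main() driver are hypothetical, not part of this patch; the nibble layout follows the pau_opencapi_get_templ_rate() helper added below): rates are 4 bits each, template 63 sits in the high nibble of byte 0 and template 0 in the low nibble of the last byte.

#include <stdint.h>
#include <stdio.h>

#define TL_MAX_TEMPLATE 63	/* templates 0..63, one 4-bit rate each */
#define TL_RATE_BUF_SIZE 32	/* 64 templates x 4 bits = 32 bytes */

/* Return the 4-bit rate for 'templ'; 0 is the fastest rate, 15 the
 * slowest. The & 0xF keeps only the addressed nibble. */
static unsigned int tl_templ_rate(const uint8_t *rate_buf, unsigned int templ)
{
	unsigned int idx = (TL_MAX_TEMPLATE - templ) / 2;
	unsigned int shift = 4 * (1 - ((TL_MAX_TEMPLATE - templ) % 2));

	return (rate_buf[idx] >> shift) & 0xF;
}

int main(void)
{
	uint8_t buf[TL_RATE_BUF_SIZE] = { 0 };

	buf[TL_RATE_BUF_SIZE - 1] = 0x02;	/* template 0 -> rate 2 */
	buf[0] = 0xF0;				/* template 63 -> rate 15 */

	printf("template 0 rate: %u\n", tl_templ_rate(buf, 0));
	printf("template 63 rate: %u\n", tl_templ_rate(buf, 63));
	return 0;
}

Decoding template 0 from the example buffer yields rate 2 and template 63 yields rate 15, matching the "first nibble is template 63, last nibble is template 0" layout described in the patch.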
Signed-off-by: Christophe Lombard --- hw/npu-opal.c | 8 +++ hw/pau.c | 159 +++++++++++++++++++++++++++++++++++++++++++++ include/pau-regs.h | 13 ++++ include/pau.h | 9 +++ 4 files changed, 189 insertions(+) diff --git a/hw/npu-opal.c b/hw/npu-opal.c index 64e36852..4fc4c662 100644 --- a/hw/npu-opal.c +++ b/hw/npu-opal.c @@ -8,6 +8,7 @@ #include #include #include +#include static int64_t opal_npu_init_context(uint64_t phb_id, int pid __unused, uint64_t msr, uint64_t bdf) @@ -195,6 +196,8 @@ static int64_t opal_npu_spa_setup(uint64_t phb_id, uint32_t bdfn, if (phb->phb_type == phb_type_npu_v2_opencapi) rc = npu2_opencapi_spa_setup(phb, bdfn, addr, PE_mask); + else if (phb->phb_type == phb_type_pau_opencapi) + rc = pau_opencapi_spa_setup(phb, bdfn, addr, PE_mask); else return OPAL_PARAMETER; @@ -216,6 +219,8 @@ static int64_t opal_npu_spa_clear_cache(uint64_t phb_id, uint32_t bdfn, if (phb->phb_type == phb_type_npu_v2_opencapi) rc = npu2_opencapi_spa_clear_cache(phb, bdfn, PE_handle); + else if (phb->phb_type == phb_type_pau_opencapi) + rc = pau_opencapi_spa_clear_cache(phb, bdfn, PE_handle); else return OPAL_PARAMETER; @@ -235,6 +240,9 @@ static int64_t opal_npu_tl_set(uint64_t phb_id, uint32_t bdfn, if (phb->phb_type == phb_type_npu_v2_opencapi) rc = npu2_opencapi_tl_set(phb, bdfn, capabilities, rate_phys, rate_sz); + else if (phb->phb_type == phb_type_pau_opencapi) + rc = pau_opencapi_tl_set(phb, bdfn, capabilities, + rate_phys, rate_sz); else return OPAL_PARAMETER; diff --git a/hw/pau.c b/hw/pau.c index 63655118..33d33c65 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -15,6 +15,9 @@ #define PAU_MAX_PE_NUM 16 #define PAU_RESERVED_PE_NUM 15 +#define PAU_TL_MAX_TEMPLATE 63 +#define PAU_TL_RATE_BUF_SIZE 32 + #define PAU_SLOT_NORMAL PCI_SLOT_STATE_NORMAL #define PAU_SLOT_LINK PCI_SLOT_STATE_LINK #define PAU_SLOT_LINK_START (PAU_SLOT_LINK + 1) @@ -271,6 +274,162 @@ static void pau_device_detect_fixup(struct pau_dev *dev) dt_add_property_strings(dn, "ibm,pau-link-type", "unknown"); } +int64_t pau_opencapi_spa_setup(struct phb *phb, uint32_t __unused bdfn, + uint64_t addr, uint64_t PE_mask) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + struct pau *pau = dev->pau; + uint64_t reg, val; + int64_t rc; + + lock(&pau->lock); + + reg = PAU_XSL_OSL_SPAP_AN(dev->index); + val = pau_read(pau, reg); + if ((addr && (val & PAU_XSL_OSL_SPAP_AN_EN)) || + (!addr && !(val & PAU_XSL_OSL_SPAP_AN_EN))) { + rc = OPAL_BUSY; + goto out; + } + + /* SPA is disabled by passing a NULL address */ + val = addr; + if (addr) + val = addr | PAU_XSL_OSL_SPAP_AN_EN; + pau_write(pau, reg, val); + + /* + * set the PE mask that the OS uses for PASID -> PE handle + * conversion + */ + reg = PAU_OTL_MISC_CFG0(dev->index); + val = pau_read(pau, reg); + val = SETFIELD(PAU_OTL_MISC_CFG0_PE_MASK, val, PE_mask); + pau_write(pau, reg, val); + rc = OPAL_SUCCESS; +out: + unlock(&pau->lock); + return rc; +} + +int64_t pau_opencapi_spa_clear_cache(struct phb *phb, + uint32_t __unused bdfn, + uint64_t PE_handle) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + struct pau *pau = dev->pau; + uint64_t reg, val; + int64_t rc, retries = 5; + + lock(&pau->lock); + + reg = PAU_XSL_OSL_CCINV; + val = pau_read(pau, reg); + if (val & PAU_XSL_OSL_CCINV_PENDING) { + rc = OPAL_BUSY; + goto out; + } + + val = PAU_XSL_OSL_CCINV_REMOVE; + val |= SETFIELD(PAU_XSL_OSL_CCINV_PE_HANDLE, val, PE_handle); + if (dev->index) + val |= PAU_XSL_OSL_CCINV_BRICK; + pau_write(pau, reg, val); + + rc = OPAL_HARDWARE; + while (retries--) { + val = pau_read(pau, 
reg); + if (!(val & PAU_XSL_OSL_CCINV_PENDING)) { + rc = OPAL_SUCCESS; + break; + } + /* the bit is expected to flip in less than 200us */ + time_wait_us(200); + } +out: + unlock(&pau->lock); + return rc; +} + +static int pau_opencapi_get_templ_rate(unsigned int templ, + char *rate_buf) +{ + int shift, idx, val; + + /* + * Each rate is encoded over 4 bits (0->15), with 15 being the + * slowest. The buffer is a succession of rates for all the + * templates. The first 4 bits are for template 63, followed + * by 4 bits for template 62, ... etc. So the rate for + * template 0 is at the very end of the buffer. + */ + idx = (PAU_TL_MAX_TEMPLATE - templ) / 2; + shift = 4 * (1 - ((PAU_TL_MAX_TEMPLATE - templ) % 2)); + val = (rate_buf[idx] >> shift) & 0xf; + return val; +} + +static bool pau_opencapi_is_templ_supported(unsigned int templ, + long capabilities) +{ + return !!(capabilities & (1ull << templ)); +} + +int64_t pau_opencapi_tl_set(struct phb *phb, uint32_t __unused bdfn, + long capabilities, uint64_t rate_phys, + int rate_sz) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + struct pau *pau; + char *rate = (char *) rate_phys; + uint64_t reg, val, templ_rate; + int i, rate_pos; + + if (!dev) + return OPAL_PARAMETER; + pau = dev->pau; + + if (!opal_addr_valid(rate) || rate_sz != PAU_TL_RATE_BUF_SIZE) + return OPAL_PARAMETER; + + /* The 'capabilities' argument defines what TL templates the + * device can receive. OpenCAPI 5.0 defines 64 templates, so + * that's one bit per template. + * + * For each template, the device processing time may vary, so + * the device advertises at what rate a message of a given + * template can be sent. That's encoded in the 'rate' buffer. + * + * On P10, PAU only knows about TL templates 0 -> 3. + * Per the spec, template 0 must be supported. + */ + if (!pau_opencapi_is_templ_supported(0, capabilities)) + return OPAL_PARAMETER; + + reg = PAU_OTL_MISC_CFG_TX(dev->index); + val = pau_read(pau, reg); + val &= ~PAU_OTL_MISC_CFG_TX_TEMP1_EN; + val &= ~PAU_OTL_MISC_CFG_TX_TEMP2_EN; + val &= ~PAU_OTL_MISC_CFG_TX_TEMP3_EN; + + for (i = 0; i < 4; i++) { + /* Skip template 0 as it is implicitly enabled. + * Enable other templates if supported by AFU + */ + if (i && pau_opencapi_is_templ_supported(i, capabilities)) + val |= PAU_OTL_MISC_CFG_TX_TEMP_EN(i); + /* The tx rate should still be set for template 0 */ + templ_rate = pau_opencapi_get_templ_rate(i, rate); + rate_pos = 8 + i * 4; + val = SETFIELD(PAU_OTL_MISC_CFG_TX_TEMP_RATE(rate_pos, rate_pos + 3), + val, templ_rate); + } + pau_write(pau, reg, val); + PAUDEVDBG(dev, "OTL configuration register set to %llx\n", val); + + return OPAL_SUCCESS; +} + #define CQ_CTL_STATUS_TIMEOUT 10 /* milliseconds */ static int pau_opencapi_set_fence_control(struct pau_dev *dev, diff --git a/include/pau-regs.h b/include/pau-regs.h index 7a5aaa5f..57c2d723 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -118,6 +118,7 @@ #define PAU_OTL_MISC_CFG0_EN PPC_BIT(0) #define PAU_OTL_MISC_CFG0_BLOCK_PE_HANDLE PPC_BIT(1) #define PAU_OTL_MISC_CFG0_BRICKID PPC_BITMASK(2, 3) +#define PAU_OTL_MISC_CFG0_PE_MASK PPC_BITMASK(4, 7) #define PAU_OTL_MISC_CFG0_ENABLE_4_0 PPC_BIT(51) #define PAU_OTL_MISC_CFG0_XLATE_RELEASE PPC_BIT(62) #define PAU_OTL_MISC_CFG0_ENABLE_5_0 PPC_BIT(63) @@ -132,11 +133,16 @@ #define PAU_OTL_MISC_CFG_TLX_CREDITS_DCP2 PPC_BITMASK(48, 55) #define PAU_OTL_MISC_CFG_TLX_CREDITS_DCP3 PPC_BITMASK(56, 63) #define PAU_OTL_MISC_CFG_TX(brk) (PAU_BLOCK_OTL(brk) + 0x058) +#define PAU_OTL_MISC_CFG_TX_TEMP1_EN PPC_BIT(1) +#define PAU_OTL_MISC_CFG_TX_TEMP2_EN PPC_BIT(2) +#define PAU_OTL_MISC_CFG_TX_TEMP3_EN PPC_BIT(3) +#define PAU_OTL_MISC_CFG_TX_TEMP_EN(n) PPC_BIT(n) #define PAU_OTL_MISC_CFG_TX_DRDY_WAIT PPC_BITMASK(5, 7) #define PAU_OTL_MISC_CFG_TX_TEMP0_RATE PPC_BITMASK(8, 11) #define PAU_OTL_MISC_CFG_TX_TEMP1_RATE PPC_BITMASK(12, 15) #define PAU_OTL_MISC_CFG_TX_TEMP2_RATE PPC_BITMASK(16, 19) #define PAU_OTL_MISC_CFG_TX_TEMP3_RATE PPC_BITMASK(20, 23) +#define PAU_OTL_MISC_CFG_TX_TEMP_RATE(nib0, nib1) PPC_BITMASK(nib0, nib1) #define PAU_OTL_MISC_CFG_TX_CRET_FREQ PPC_BITMASK(32, 34) #define PAU_OTL_MISC_OTL_REM0(brk) (PAU_BLOCK_OTL(brk) + 0x068) #define PAU_OTL_MISC_ERROR_SIG_RXI(brk) (PAU_BLOCK_OTL(brk) + 0x070) #define PAU_OTL_MISC_ERROR_SIG_RXO(brk) (PAU_BLOCK_OTL(brk) + 0x078) @@ -150,11 +156,18 @@ #define PAU_OTL_MISC_PSL_PEHANDLE_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x018) /* XSL block registers */ +#define PAU_XSL_OSL_SPAP_AN(brk) (PAU_BLOCK_XSL + 0x000 + (brk) * 8) +#define PAU_XSL_OSL_SPAP_AN_EN PPC_BIT(63) #define PAU_XSL_WRAP_CFG (PAU_BLOCK_XSL + 0x100) #define PAU_XSL_WRAP_CFG_CLOCK_ENABLE PPC_BIT(0) #define PAU_XSL_OSL_XLATE_CFG(brk) (PAU_BLOCK_XSL + 0x040 + (brk) * 8) #define PAU_XSL_OSL_XLATE_CFG_AFU_DIAL PPC_BIT(0) #define PAU_XSL_OSL_XLATE_CFG_OPENCAPI3 PPC_BIT(32) +#define PAU_XSL_OSL_CCINV (PAU_BLOCK_XSL + 0x070) +#define PAU_XSL_OSL_CCINV_REMOVE PPC_BIT(15) +#define PAU_XSL_OSL_CCINV_PENDING PPC_BIT(16) +#define PAU_XSL_OSL_CCINV_BRICK PPC_BIT(47) +#define PAU_XSL_OSL_CCINV_PE_HANDLE PPC_BITMASK(48, 62) /* XTS block registers */ #define PAU_XTS_CFG (PAU_BLOCK_PAU_XTS + 0x020) diff --git a/include/pau.h b/include/pau.h index 8b978bd6..61b17925 100644 --- a/include/pau.h +++ b/include/pau.h @@ -200,6 +200,15 @@ static inline uint64_t pau_read(struct pau *pau, uint64_t reg) } void pau_opencapi_dump_scoms(struct phb *phb); +int64_t pau_opencapi_spa_setup(struct phb *phb, uint32_t __unused bdfn, + uint64_t addr, uint64_t PE_mask); +int64_t pau_opencapi_spa_clear_cache(struct phb *phb, + uint32_t __unused bdfn, + uint64_t PE_handle); +int64_t pau_opencapi_tl_set(struct phb *phb, uint32_t __unused bdfn, + long capabilities, uint64_t rate_phys, + int rate_sz); + /* PHY */ int pau_dev_phy_reset(struct pau_dev
*dev); -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:54 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:54 +0200 Subject: [Skiboot] [PATCH 13/16] opencapi5: link training In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-14-clombard@linux.vnet.ibm.com> Add elementary functions to handle PHB complete, fundamental and hot resets. For the time being, specific creset and hreset are not supported. A complete fundamental reset is based on the following steps, in this order: - Place all bricks into Fence state - Disable BARs - Reset ODL to Power-on Values - Set the i2c reset pin in output mode - Initialize PHY Lanes - Deassert ODL reset - Clear the i2c reset pin - Unfence bricks - Enable BARs - Enable ODL training mode Link training is also set up. Signed-off-by: Christophe Lombard --- hw/pau.c | 536 +++++++++++++++++++++++++++++++++++++++ include/pau-regs.h | 5 + include/pau.h | 2 + include/xscom-p10-regs.h | 46 ++++ 4 files changed, 589 insertions(+) diff --git a/hw/pau.c b/hw/pau.c index 132ef565..63655118 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -9,11 +9,28 @@ #include #include #include +#include /* Number of PEs supported */ #define PAU_MAX_PE_NUM 16 #define PAU_RESERVED_PE_NUM 15 +#define PAU_SLOT_NORMAL PCI_SLOT_STATE_NORMAL +#define PAU_SLOT_LINK PCI_SLOT_STATE_LINK +#define PAU_SLOT_LINK_START (PAU_SLOT_LINK + 1) +#define PAU_SLOT_LINK_WAIT (PAU_SLOT_LINK + 2) +#define PAU_SLOT_LINK_TRAINED (PAU_SLOT_LINK + 3) +#define PAU_SLOT_FRESET PCI_SLOT_STATE_FRESET +#define PAU_SLOT_FRESET_START (PAU_SLOT_FRESET + 1) +#define PAU_SLOT_FRESET_INIT (PAU_SLOT_FRESET + 2) +#define PAU_SLOT_FRESET_ASSERT_DELAY (PAU_SLOT_FRESET + 3) +#define PAU_SLOT_FRESET_DEASSERT_DELAY (PAU_SLOT_FRESET + 4) +#define PAU_SLOT_FRESET_INIT_DELAY (PAU_SLOT_FRESET + 5) + +#define PAU_LINK_TRAINING_RETRIES 2 +#define PAU_LINK_TRAINING_TIMEOUT 15000 /* ms */ +#define PAU_LINK_STATE_TRAINED 0x7 + struct pau_dev *pau_next_dev(struct pau *pau, struct pau_dev *dev, enum pau_dev_type type) { @@ -173,6 +190,7 @@ static void pau_dt_create_pau(struct dt_node *xscom, uint32_t pau_index) dt_add_property_cells(pau, "reg", pau_base[pau_index], 0x2c); dt_add_property_string(pau, "compatible", "ibm,power10-pau"); dt_add_property_cells(pau, "ibm,pau-index", pau_index); + dt_add_property_cells(pau, "ibm,pau-chiplet", pau_base[pau_index] >> 24); dt_add_property_cells(pau, "ibm,phb-index", 7 + pau_index); links = PAU_LINKS_OPENCAPI_PER_PAU; @@ -207,12 +225,14 @@ static struct pau *pau_create(struct dt_node *dn) assert(pau); init_lock(&pau->lock); + init_lock(&pau->procedure_state.lock); pau->dt_node = dn; pau->index = dt_prop_get_u32(dn, "ibm,pau-index"); pau->xscom_base = dt_get_address(dn, 0, NULL); pau->chip_id = dt_get_chip_id(dn); + pau->op_chiplet = dt_prop_get_u32(dn, "ibm,pau-chiplet"); assert(get_chip(pau->chip_id)); pau->links = PAU_LINKS_OPENCAPI_PER_PAU; @@ -507,6 +527,458 @@ static void pau_opencapi_enable_bars(struct pau_dev *dev, bool enable) pau_write(pau, reg, val); } +static int64_t pau_opencapi_creset(struct pci_slot *slot) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(slot->phb); + + PAUDEVERR(dev, "creset not supported\n"); + return OPAL_UNSUPPORTED; +} + +static int64_t pau_opencapi_hreset(struct pci_slot *slot) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(slot->phb); + + PAUDEVERR(dev, "hreset not supported\n"); +
return OPAL_UNSUPPORTED; +} + +static void pau_opencapi_assert_odl_reset(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t reg, val; + + reg = P10_OB_ODL_CONFIG(dev->op_unit, dev->odl_index); + val = P10_OB_ODL_CONFIG_RESET; + val = SETFIELD(P10_OB_ODL_CONFIG_VERSION, val, 0b000100); // OCAPI 4 + val = SETFIELD(P10_OB_ODL_CONFIG_TRAIN_MODE, val, 0b0101); // ts2 + val = SETFIELD(P10_OB_ODL_CONFIG_SUPPORTED_MODES, val, 0b0010); + val |= P10_OB_ODL_CONFIG_X4_BACKOFF_ENABLE; + val = SETFIELD(P10_OB_ODL_CONFIG_PHY_CNTR_LIMIT, val, 0b1111); + val |= P10_OB_ODL_CONFIG_DEBUG_ENABLE; + val = SETFIELD(P10_OB_ODL_CONFIG_FWD_PROGRESS_TIMER, val, 0b0110); + xscom_write(pau->chip_id, reg, val); +} + +static void pau_opencapi_deassert_odl_reset(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t reg, val; + + reg = P10_OB_ODL_CONFIG(dev->op_unit, dev->odl_index); + xscom_read(pau->chip_id, reg, &val); + val &= ~P10_OB_ODL_CONFIG_RESET; + xscom_write(pau->chip_id, reg, val); +} + +static void pau_opencapi_training_mode(struct pau_dev *dev, + uint8_t pattern) +{ + struct pau *pau = dev->pau; + uint64_t reg, val; + + reg = P10_OB_ODL_CONFIG(dev->op_unit, dev->odl_index); + xscom_read(pau->chip_id, reg, &val); + val = SETFIELD(P10_OB_ODL_CONFIG_TRAIN_MODE, val, pattern); + xscom_write(pau->chip_id, reg, val); +} + +static int64_t pau_opencapi_assert_adapter_reset(struct pau_dev *dev) +{ + int64_t rc = OPAL_SUCCESS; + + if (platform.ocapi->i2c_assert_reset) + rc = platform.ocapi->i2c_assert_reset(dev->i2c_bus_id); + else + rc = OPAL_PARAMETER; + + if (rc) + PAUDEVERR(dev, "Error writing I2C reset signal: %lld\n", rc); + return rc; +} + +static int64_t pau_opencapi_deassert_adapter_reset(struct pau_dev *dev) +{ + int64_t rc = OPAL_SUCCESS; + + if (platform.ocapi->i2c_deassert_reset) + rc = platform.ocapi->i2c_deassert_reset(dev->i2c_bus_id); + else + rc = OPAL_PARAMETER; + + if (rc) + PAUDEVERR(dev, "Error writing I2C reset signal: %lld\n", rc); + return rc; +} + +static void pau_opencapi_fence_brick(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + + PAUDEVDBG(dev, "Fencing brick\n"); + pau_opencapi_set_fence_control(dev, 0b11); + + /* Place all bricks into Fence state */ + pau_write(pau, PAU_MISC_FENCE_STATE, + PAU_MISC_FENCE_STATE_SET(pau_dev_index(dev, PAU_LINKS_OPENCAPI_PER_PAU))); +} + +static void pau_opencapi_unfence_brick(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + + PAUDEVDBG(dev, "Unfencing brick\n"); + pau_write(pau, PAU_MISC_FENCE_STATE, + PAU_MISC_FENCE_STATE_CLEAR(pau_dev_index(dev, PAU_LINKS_OPENCAPI_PER_PAU))); + + pau_opencapi_set_fence_control(dev, 0b10); + pau_opencapi_set_fence_control(dev, 0b00); +} + +static int64_t pau_opencapi_freset(struct pci_slot *slot) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(slot->phb); + uint8_t presence = 1; + int64_t rc = OPAL_SUCCESS; + + switch (slot->state) { + case PAU_SLOT_NORMAL: + case PAU_SLOT_FRESET_START: + PAUDEVDBG(dev, "FRESET: Starts\n"); + + if (slot->ops.get_presence_state) + slot->ops.get_presence_state(slot, &presence); + if (!presence) { + /* + * FIXME: if there's no card on the link, we + * should consider powering off the unused + * lanes to save energy + */ + PAUDEVINF(dev, "no card detected\n"); + return OPAL_SUCCESS; + } + slot->link_retries = PAU_LINK_TRAINING_RETRIES; + /* fall-through */ + case PAU_SLOT_FRESET_INIT: + pau_opencapi_fence_brick(dev); + pau_opencapi_enable_bars(dev, false); + pau_opencapi_assert_odl_reset(dev); + pau_opencapi_assert_adapter_reset(dev); + 
pci_slot_set_state(slot, PAU_SLOT_FRESET_ASSERT_DELAY); + /* assert for 5ms */ + return pci_slot_set_sm_timeout(slot, msecs_to_tb(5)); + + case PAU_SLOT_FRESET_ASSERT_DELAY: + rc = pau_dev_phy_reset(dev); + if (rc) { + PAUDEVERR(dev, "FRESET: PHY reset error\n"); + return OPAL_HARDWARE; + } + pau_opencapi_deassert_odl_reset(dev); + pau_opencapi_deassert_adapter_reset(dev); + pci_slot_set_state(slot, PAU_SLOT_FRESET_DEASSERT_DELAY); + /* give 250ms to device to be ready */ + return pci_slot_set_sm_timeout(slot, msecs_to_tb(250)); + + case PAU_SLOT_FRESET_DEASSERT_DELAY: + pau_opencapi_unfence_brick(dev); + pau_opencapi_enable_bars(dev, true); + pau_opencapi_training_mode(dev, 0b0001); /* send pattern A */ + pci_slot_set_state(slot, PAU_SLOT_FRESET_INIT_DELAY); + return pci_slot_set_sm_timeout(slot, msecs_to_tb(5)); + + case PAU_SLOT_FRESET_INIT_DELAY: + pau_opencapi_training_mode(dev, 0b1000); /* enable training */ + dev->train_start = mftb(); + dev->train_timeout = dev->train_start + + msecs_to_tb(PAU_LINK_TRAINING_TIMEOUT); + pci_slot_set_state(slot, PAU_SLOT_LINK_START); + return slot->ops.poll_link(slot); + + default: + PAUDEVERR(dev, "FRESET: unexpected slot state %08x\n", + slot->state); + } + pci_slot_set_state(slot, PAU_SLOT_NORMAL); + return OPAL_HARDWARE; +} + +static uint64_t pau_opencapi_get_odl_endpoint_info(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t val; + + xscom_read(pau->chip_id, + P10_OB_ODL_DLX_INFO(dev->op_unit, dev->odl_index), + &val); + return val; +} + +static uint64_t pau_opencapi_get_odl_training_status(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t val; + + xscom_read(pau->chip_id, + P10_OB_ODL_TRAIN_STAT(dev->op_unit, dev->odl_index), + &val); + return val; +} + +static uint64_t pau_opencapi_get_odl_status(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t val; + + xscom_read(pau->chip_id, + P10_OB_ODL_STATUS(dev->op_unit, dev->odl_index), + &val); + return val; +} + +static uint64_t pau_opencapi_get_odl_link_speed_status(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t val; + + xscom_read(pau->chip_id, + P10_OB_ODL_LINK_SPEED_STATUS(dev->op_unit, dev->odl_index), + &val); + return val; +} + +static enum OpalShpcLinkState pau_opencapi_get_link_width(uint64_t status) +{ + uint64_t tx_lanes, rx_lanes, state; + + state = GETFIELD(P10_OB_ODL_STATUS_TRAINING_STATE, status); + if (state != PAU_LINK_STATE_TRAINED) + return OPAL_SHPC_LINK_DOWN; + + rx_lanes = GETFIELD(P10_OB_ODL_STATUS_RX_TRAINED_LANES, status); + tx_lanes = GETFIELD(P10_OB_ODL_STATUS_TX_TRAINED_LANES, status); + if ((rx_lanes != 0xFF) || (tx_lanes != 0xFF)) + return OPAL_SHPC_LINK_UP_x4; + else + return OPAL_SHPC_LINK_UP_x8; + + /* OpenCapi link widths x16 ? */ +} + +static int64_t pau_opencapi_get_link_state(struct pci_slot *slot, + uint8_t *val) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(slot->phb); + uint64_t status; + + status = pau_opencapi_get_odl_status(dev); + *val = pau_opencapi_get_link_width(status); + + return OPAL_SUCCESS; + +} + +static int64_t pau_opencapi_get_power_state(struct pci_slot *slot, + uint8_t *val) +{ + *val = slot->power_state; + return OPAL_SUCCESS; +} + +static int64_t pau_opencapi_get_presence_state(struct pci_slot __unused * slot, + uint8_t *val) +{ + /* + * Presence detection for OpenCAPI is currently done at the start of + * PAU initialisation, and we only create slots if a device is present. + * As such we will never be asked to get the presence of a slot that's + * empty. 
+ * + * This may change if we ever support hotplug down the track. + */ + *val = OPAL_PCI_SLOT_PRESENT; + return OPAL_SUCCESS; +} + +static void pau_opencapi_check_trained_link(struct pau_dev *dev, + uint64_t status) +{ + if (pau_opencapi_get_link_width(status) != OPAL_SHPC_LINK_UP_x8) { + PAUDEVERR(dev, "Link trained in degraded mode (%016llx)\n", + status); + PAUDEVDBG(dev, "Link endpoint info: %016llx\n", + pau_opencapi_get_odl_endpoint_info(dev)); + } +} + +static int64_t pau_opencapi_retry_state(struct pci_slot *slot, + uint64_t status) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(slot->phb); + + if (!slot->link_retries--) { + /** + * @fwts-label OCAPILinkTrainingFailed + * @fwts-advice The OpenCAPI link training procedure failed. + * This indicates a hardware or firmware bug. OpenCAPI + * functionality will not be available on this link. + */ + PAUDEVERR(dev, + "Link failed to train, final link status: %016llx\n", + status); + PAUDEVDBG(dev, "Final link training status: %016llx (Link Speed Status: %016llx)\n", + pau_opencapi_get_odl_training_status(dev), + pau_opencapi_get_odl_link_speed_status(dev)); + return OPAL_HARDWARE; + } + + PAUDEVERR(dev, "Link failed to train, retrying\n"); + PAUDEVERR(dev, "Link status: %016llx, training status: %016llx " + "(Link Speed Status: %016llx)\n", + status, + pau_opencapi_get_odl_training_status(dev), + pau_opencapi_get_odl_link_speed_status(dev)); + + pci_slot_set_state(slot, PAU_SLOT_FRESET_INIT); + return pci_slot_set_sm_timeout(slot, msecs_to_tb(1)); +} + +static void pau_opencapi_otl_tx_send_enable(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t reg, val; + + /* Allows OTL TX to send out packets to AFU */ + PAUDEVDBG(dev, "OTL TX Send Enable\n"); + + reg = PAU_OTL_MISC_CFG_TX2(dev->index); + val = pau_read(pau, reg); + val |= PAU_OTL_MISC_CFG_TX2_SEND_EN; + pau_write(pau, reg, val); +} + +static void pau_opencapi_setup_perf_counters(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t reg, val; + + PAUDEVDBG(dev, "Setup perf counter\n"); + + reg = P10_OB_ODL_PERF_MON_CONFIG(dev->op_unit); + xscom_read(pau->chip_id, reg, &val); + val = SETFIELD(P10_OB_ODL_PERF_MON_CONFIG_ENABLE, val, + P10_OB_ODL_PERF_MON_CONFIG_LINK0 >> dev->index); + val = SETFIELD(P10_OB_ODL_PERF_MON_CONFIG_SIZE, val, + P10_OB_ODL_PERF_MON_CONFIG_SIZE16); + xscom_write(pau->chip_id, reg, val); + PAUDEVDBG(dev, "perf counter config %llx = %llx\n", reg, val); + + reg = P10_OB_ODL_PERF_MON_SELECT(dev->op_unit); + xscom_read(pau->chip_id, reg, &val); + val = SETFIELD(P10_OB_ODL_PERF_MON_SELECT_COUNTER >> (dev->index * 16), + val, P10_OB_ODL_PERF_MON_SELECT_CRC_ODL); + val = SETFIELD(P10_OB_ODL_PERF_MON_SELECT_COUNTER >> ((dev->index * 16) + 8), + val, P10_OB_ODL_PERF_MON_SELECT_CRC_DLX); + xscom_write(pau->chip_id, reg, val); + PAUDEVDBG(dev, "perf counter select %llx = %llx\n", reg, val); +} + +static void pau_opencapi_check_perf_counters(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint64_t reg, val; + + reg = P10_OB_PERF_COUNTER0(dev->op_unit); + xscom_read(pau->chip_id, reg, &val); + + if (val) + PAUDEVERR(dev, "CRC error count perf_counter0..3=0%#llx\n", + val); +} + +static int64_t pau_opencapi_poll_link(struct pci_slot *slot) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(slot->phb); + uint64_t status; + + switch (slot->state) { + case PAU_SLOT_NORMAL: + case PAU_SLOT_LINK_START: + PAUDEVDBG(dev, "Start polling\n"); + pci_slot_set_state(slot, PAU_SLOT_LINK_WAIT); + /* fall-through */ + case PAU_SLOT_LINK_WAIT: + status = 
pau_opencapi_get_odl_status(dev); + if (GETFIELD(P10_OB_ODL_STATUS_TRAINING_STATE, status) == + PAU_LINK_STATE_TRAINED) { + PAUDEVINF(dev, "link trained in %ld ms (Link Speed Status: %016llx)\n", + tb_to_msecs(mftb() - dev->train_start), + pau_opencapi_get_odl_link_speed_status(dev)); + pau_opencapi_check_trained_link(dev, status); + + pci_slot_set_state(slot, PAU_SLOT_LINK_TRAINED); + return pci_slot_set_sm_timeout(slot, msecs_to_tb(1)); + } + if (tb_compare(mftb(), dev->train_timeout) == TB_AAFTERB) + return pau_opencapi_retry_state(slot, status); + + return pci_slot_set_sm_timeout(slot, msecs_to_tb(1)); + + case PAU_SLOT_LINK_TRAINED: + pau_opencapi_otl_tx_send_enable(dev); + pci_slot_set_state(slot, PAU_SLOT_NORMAL); + if (dev->status & PAU_DEV_STATUS_BROKEN) { + PAUDEVERR(dev, "Resetting a device which hit a " + "previous error. Device recovery " + "is not supported, so future behavior is undefined\n"); + dev->status &= ~PAU_DEV_STATUS_BROKEN; + } + pau_opencapi_check_perf_counters(dev); + dev->phb.scan_map = 1; + return OPAL_SUCCESS; + + default: + PAUDEVERR(dev, "unexpected slot state %08x\n", slot->state); + + } + pci_slot_set_state(slot, PAU_SLOT_NORMAL); + return OPAL_HARDWARE; +} + +static void pau_opencapi_prepare_link_change(struct pci_slot *slot __unused, + bool up __unused) +{ + /* + * PCI hotplug wants it defined, but we don't need to do anything + */ +} + +static int64_t pau_opencapi_set_power_state(struct pci_slot *slot, + uint8_t val) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(slot->phb); + + switch (val) { + case PCI_SLOT_POWER_OFF: + PAUDEVDBG(dev, "Fake power off\n"); + pau_opencapi_fence_brick(dev); + pau_opencapi_assert_adapter_reset(dev); + slot->power_state = PCI_SLOT_POWER_OFF; + return OPAL_SUCCESS; + + case PCI_SLOT_POWER_ON: + if (slot->power_state != PCI_SLOT_POWER_OFF) + return OPAL_SUCCESS; + PAUDEVDBG(dev, "Fake power on\n"); + slot->power_state = PCI_SLOT_POWER_ON; + slot->state = PAU_SLOT_NORMAL; + return OPAL_SUCCESS; + + default: + return OPAL_UNSUPPORTED; + } +} + static void pau_opencapi_create_phb_slot(struct pau_dev *dev) { struct pci_slot *slot; @@ -520,6 +992,21 @@ static void pau_opencapi_create_phb_slot(struct pau_dev *dev) */ PAUDEVERR(dev, "Cannot create PHB slot\n"); } + + /* Elementary functions */ + slot->ops.creset = pau_opencapi_creset; + slot->ops.hreset = pau_opencapi_hreset; + slot->ops.freset = pau_opencapi_freset; + slot->ops.get_link_state = pau_opencapi_get_link_state; + slot->ops.get_power_state = pau_opencapi_get_power_state; + slot->ops.get_presence_state = pau_opencapi_get_presence_state; + slot->ops.poll_link = pau_opencapi_poll_link; + slot->ops.prepare_link_change = pau_opencapi_prepare_link_change; + slot->ops.set_power_state = pau_opencapi_set_power_state; + + /* hotplug capability */ + slot->pluggable = 1; + } static int64_t pau_opencapi_pcicfg_check(struct pau_dev *dev, @@ -837,6 +1324,26 @@ static void pau_opencapi_dt_add_mmio_window(struct pau_dev *dev) hi32(mm_win[1]), lo32(mm_win[1])); } +static void pau_opencapi_dt_add_hotpluggable(struct pau_dev *dev) +{ + struct pci_slot *slot = dev->phb.slot; + struct dt_node *dn = dev->phb.dt_node; + char label[40]; + + /* + * Add a few definitions to the DT so that the linux PCI + * hotplug framework can find the slot and identify it as + * hot-pluggable. 
+ * + * The "ibm,slot-label" property is used by linux as the slot name + */ + pci_slot_add_dt_properties(slot, dn); + + snprintf(label, sizeof(label), "OPENCAPI-%04x", + (int)PCI_SLOT_PHB_INDEX(slot->id)); + dt_add_property_string(dn, "ibm,slot-label", label); +} + static void pau_opencapi_dt_add_props(struct pau_dev *dev) { struct dt_node *dn = dev->phb.dt_node; @@ -866,6 +1373,7 @@ static void pau_opencapi_dt_add_props(struct pau_dev *dev) dt_add_property_cells(dn, "ibm,opal-reserved-pe", PAU_RESERVED_PE_NUM); pau_opencapi_dt_add_mmio_window(dev); + pau_opencapi_dt_add_hotpluggable(dev); } static void pau_opencapi_set_transport_mux_controls(struct pau_dev *dev) @@ -884,6 +1392,30 @@ static void pau_opencapi_set_transport_mux_controls(struct pau_dev *dev) pau_write(pau, reg, val); } +static void pau_opencapi_odl_config_phy(struct pau_dev *dev) +{ + struct pau *pau = dev->pau; + uint8_t typemap = 0; + uint64_t reg, val; + + PAUDEVDBG(dev, "Configure ODL\n"); + + /* ODL must be in reset when enabling. + * It stays in reset until the link is trained + */ + pau_opencapi_assert_odl_reset(dev); + + /* DLO (Open CAPI links) */ + typemap = 0x2 >> dev->odl_index; + + reg = P10_OB_ODL_PHY_CONFIG(dev->op_unit); + xscom_read(pau->chip_id, reg, &val); + typemap |= GETFIELD(P10_OB_ODL_PHY_CONFIG_LINK_SELECT, val); + val = SETFIELD(P10_OB_ODL_PHY_CONFIG_LINK_SELECT, val, typemap); + val = SETFIELD(P10_OB_ODL_PHY_CONFIG_DL_SELECT, val, 0b10); + xscom_write(pau->chip_id, reg, val); +} + static void pau_opencapi_enable_xsl_clocks(struct pau *pau) { uint64_t reg, val; @@ -1133,6 +1665,7 @@ static void pau_opencapi_init_hw(struct pau *pau) pau_for_each_opencapi_dev(dev, pau) { PAUDEVINF(dev, "Configuring link ...\n"); pau_opencapi_set_transport_mux_controls(dev); /* step 1 */ + pau_opencapi_odl_config_phy(dev); } pau_opencapi_enable_xsl_clocks(pau); /* step 2 */ pau_opencapi_enable_misc_clocks(pau); /* step 3 */ @@ -1200,6 +1733,9 @@ static void pau_opencapi_init_hw(struct pau *pau) /* done in pau_opencapi_setup_irqs() */ pau_opencapi_enable_interrupt_on_error(dev); + /* enable performance monitor */ + pau_opencapi_setup_perf_counters(dev); + /* Reset disabled. 
Place OTLs into Run State */ pau_opencapi_set_fence_control(dev, 0b00); } diff --git a/include/pau-regs.h b/include/pau-regs.h index b852a5b5..7a5aaa5f 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -142,6 +142,8 @@ #define PAU_OTL_MISC_ERROR_SIG_RXI(brk) (PAU_BLOCK_OTL(brk) + 0x070) #define PAU_OTL_MISC_ERROR_SIG_RXO(brk) (PAU_BLOCK_OTL(brk) + 0x078) #define PAU_OTL_MISC_ERR_RPT_HOLD1(brk) (PAU_BLOCK_OTL(brk) + 0x0B0) +#define PAU_OTL_MISC_CFG_TX2(brk) (PAU_BLOCK_OTL(brk) + 0x0C0) +#define PAU_OTL_MISC_CFG_TX2_SEND_EN PPC_BIT(0) #define PAU_OTL_MISC_PSL_DSISR_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x000) #define PAU_OTL_MISC_PSL_DAR_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x008) #define PAU_OTL_MISC_PSL_TFC_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x010) @@ -178,6 +180,9 @@ #define PAU_MISC_INT_1_CONFIG (PAU_BLOCK_PAU_MISC + 0x068) #define PAU_MISC_INT_BAR (PAU_BLOCK_PAU_MISC + 0x098) #define PAU_MISC_INT_BAR_ADDR PPC_BITMASK(0, 39) +#define PAU_MISC_FENCE_STATE (PAU_BLOCK_PAU_MISC + 0x0B0) +#define PAU_MISC_FENCE_STATE_CLEAR(brk) PPC_BIT(0 + (brk)) +#define PAU_MISC_FENCE_STATE_SET(brk) PPC_BIT(12 + (brk)) #define PAU_MISC_BDF2PE_CFG(n) (PAU_BLOCK_PAU_MISC + 0x100 + (n) * 8) #define PAU_MISC_BDF2PE_CFG_ENABLE PPC_BIT(0) #define PAU_MISC_BDF2PE_CFG_PE PPC_BITMASK(4, 7) diff --git a/include/pau.h b/include/pau.h index 660931c0..8b978bd6 100644 --- a/include/pau.h +++ b/include/pau.h @@ -40,6 +40,8 @@ struct pau_dev { struct dt_node *dn; struct phb phb; uint32_t status; + unsigned long train_start; + unsigned long train_timeout; struct pau_bar ntl_bar; struct pau_bar genid_bar; diff --git a/include/xscom-p10-regs.h b/include/xscom-p10-regs.h index 36c348bc..9d991916 100644 --- a/include/xscom-p10-regs.h +++ b/include/xscom-p10-regs.h @@ -56,4 +56,50 @@ #define P10_NCU_DARN_BAR_EN PPC_BIT(0) #define P10_NCU_DARN_BAR_ADDRMSK 0x000ffffffffff000ull /* 4k aligned */ +/* PB DLL Configuration Registers */ +#define P10_OB_ODL(ob) (0x18011000 + (ob) * 0x1000000) + +#define P10_OB_ODL_PHY_CONFIG(ob) (P10_OB_ODL(ob) + 0x0C) +#define P10_OB_ODL_PHY_CONFIG_LINK_SELECT PPC_BITMASK(56, 57) +#define P10_OB_ODL_PHY_CONFIG_DL_SELECT PPC_BITMASK(62, 63) + +#define P10_OB_ODL_PERF_MON_CONFIG(ob) (P10_OB_ODL(ob) + 0x1C) +#define P10_OB_ODL_PERF_MON_CONFIG_ENABLE PPC_BITMASK(0, 1) +#define P10_OB_ODL_PERF_MON_CONFIG_LINK0 0b10 +#define P10_OB_ODL_PERF_MON_CONFIG_LINK1 0b01 +#define P10_OB_ODL_PERF_MON_CONFIG_SIZE PPC_BITMASK(16, 23) +#define P10_OB_ODL_PERF_MON_CONFIG_SIZE16 0xFF + +#define P10_OB_ODL_PERF_MON_SELECT(ob) (P10_OB_ODL(ob) + 0x1D) +#define P10_OB_ODL_PERF_MON_SELECT_COUNTER PPC_BITMASK(0, 7) +#define P10_OB_ODL_PERF_MON_SELECT_CRC_ODL 0x44 +#define P10_OB_ODL_PERF_MON_SELECT_CRC_DLX 0x45 + +#define P10_OB_PERF_COUNTER0(ob) (P10_OB_ODL(ob) + 0x1E) +#define P10_OB_PERF_COUNTER0_LOW PPC_BITMASK(0, 31) +#define P10_OB_PERF_COUNTER0_HIGH PPC_BITMASK(32, 63) + +#define P10_OB_ODL_CONFIG(ob, brk) (P10_OB_ODL(ob) + 0x2A + brk) +#define P10_OB_ODL_CONFIG_RESET PPC_BIT(0) +#define P10_OB_ODL_CONFIG_VERSION PPC_BITMASK(2, 7) +#define P10_OB_ODL_CONFIG_TRAIN_MODE PPC_BITMASK(8, 11) +#define P10_OB_ODL_CONFIG_SUPPORTED_MODES PPC_BITMASK(12, 15) +#define P10_OB_ODL_CONFIG_X4_BACKOFF_ENABLE PPC_BIT(16) +#define P10_OB_ODL_CONFIG_PHY_CNTR_LIMIT PPC_BITMASK(20, 23) +#define P10_OB_ODL_CONFIG_DEBUG_ENABLE PPC_BIT(33) +#define P10_OB_ODL_CONFIG_FWD_PROGRESS_TIMER PPC_BITMASK(40, 43) + +#define P10_OB_ODL_STATUS(ob, brk) (P10_OB_ODL(ob) + 0x2C + brk) +#define P10_OB_ODL_STATUS_TRAINED_MODE PPC_BITMASK(0, 3) +#define 
P10_OB_ODL_STATUS_RX_TRAINED_LANES PPC_BITMASK(16, 23) +#define P10_OB_ODL_STATUS_TX_TRAINED_LANES PPC_BITMASK(24, 31) +#define P10_OB_ODL_STATUS_TRAINING_STATE PPC_BITMASK(49, 51) + +#define P10_OB_ODL_TRAIN_STAT(ob, brk) (P10_OB_ODL(ob) + 0x2E + brk) +#define P10_OB_ODL_TRAIN_STAT_PATTERN_B PPC_BITMASK(8, 15) + +#define P10_OB_ODL_DLX_INFO(ob, brk) (P10_OB_ODL(ob) + 0x32 + brk) + +#define P10_OB_ODL_LINK_SPEED_STATUS(ob, brk) (P10_OB_ODL(ob) + 0x34 + brk) + #endif /* __XSCOM_P10_REGS_H__ */ -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:53 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:53 +0200 Subject: [Skiboot] [PATCH 12/16] opencapi5: phy init In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-13-clombard@linux.vnet.ibm.com> Follow the Procedure IO_INIT_RESET_PON as described in the P10 OPHY workbook document to reset and initialize the PHY lanes. The memory-mapped SRAM (64-bit aligned) has to be used to configure the PHY; it is reachable through a pair of linked registers: address and data. The different links can be configured at the same time, which implies using a global lock to avoid conflicts. Signed-off-by: Christophe Lombard --- hw/Makefile.inc | 2 +- hw/pau-hw-procedures.c | 310 +++++++++++++++++++++++++++++++++++++++++ include/pau.h | 11 ++ 3 files changed, 322 insertions(+), 1 deletion(-) create mode 100644 hw/pau-hw-procedures.c diff --git a/hw/Makefile.inc b/hw/Makefile.inc index 6e96318a..5ede16ee 100644 --- a/hw/Makefile.inc +++ b/hw/Makefile.inc @@ -9,7 +9,7 @@ HW_OBJS += fake-nvram.o lpc-mbox.o npu2.o npu2-hw-procedures.o HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o -HW_OBJS += ocmb.o xive2.o pau.o +HW_OBJS += ocmb.o xive2.o pau.o pau-hw-procedures.o HW=hw/built-in.a include $(SRC)/hw/fsp/Makefile.inc diff --git a/hw/pau-hw-procedures.c b/hw/pau-hw-procedures.c new file mode 100644 index 00000000..bd299b3f --- /dev/null +++ b/hw/pau-hw-procedures.c @@ -0,0 +1,310 @@ +// SPDX-License-Identifier: Apache-2.0 +/* + * Copyright 2020 IBM Corp. + */ +#include +#include + +#define PAU_PHY_INIT_TIMEOUT 8000 /* ms */ + +#define PAU_PHY_ADDR_REG 0x10012C0D +#define PAU_PHY_ADDR_CHIPLET PPC_BITMASK(32, 39) +#define PAU_PHY_ADDR_SRAM_ADDR PPC_BITMASK(15, 31) +#define PAU_PHY_DATA_REG 0x10012C0E +#define PAU_PHY_DATA_CHIPLET PPC_BITMASK(32, 39) + +#define PAU_MAX_PHY_LANE 18 + +/* + * We configure the PHY using the memory mapped SRAM, which is + * accessible through a pair of (addr, data) registers. The caveat is + * that accesses to the SRAM must be 64-bit aligned, yet the PHY + * registers are 16-bit, so special care is needed. + * + * A PAU chiplet may control up to 2 OP units = 4 links and each link + * has its own virtual PHB in skiboot. They can be initialized or + * reset concurrently so we need a lock when accessing the SRAM. + + * See section "5.2.5 PPE SRAM" of the workbook for the layout of the + * SRAM registers.
Here is the subset of the table which is meaningful + * for us, since we're only touching a few registers: + * + * Address Bytes Linker Symbol Description + * FFFF_11B0 16 _fw_regs0_start fw_regs for thread 0 + * FFFF_11C0 16 _fw_regs1_start fw_regs for thread 1 + * + * FFFF_2800 1024 _mem_regs0_start mem_regs for thread 0 + * FFFF_2C00 1024 _mem_regs1_start mem_regs for thread 1 + * + * In each PAU, per-group registers are replicated for every OP (each + * OP unit is called a 'thread' in the workbook). + * Per-lane registers have an offset < 0x10 and are replicated for + * each lane. Their offset in their section is: + * 0byyyyyxxxx (y = 5-bit lane number, x = 4-bit per-lane register offset) + */ + +struct PPE_sram_section { + uint32_t offset; + uint32_t size; +}; + +static struct PPE_sram_section PPE_FIRMWARE = { 0x111B0, 0x10 }; +static struct PPE_sram_section PPE_MEMORY = { 0x12800, 0x400 }; + +struct PPE_sram_reg { + struct PPE_sram_section *section; + uint32_t offset; +}; + +/* PPE firmware */ +static struct PPE_sram_reg PAU_PHY_EXT_CMD_LANES_00_15 = { &PPE_FIRMWARE, 0x000 }; +static struct PPE_sram_reg PAU_PHY_EXT_CMD_LANES_16_31 = { &PPE_FIRMWARE, 0x001 }; +static struct PPE_sram_reg PAU_PHY_EXT_CMD_REQ = { &PPE_FIRMWARE, 0x002 }; +#define PAU_PHY_EXT_CMD_REQ_IO_RESET PPC_BIT16(1) +#define PAU_PHY_EXT_CMD_REQ_DCCAL PPC_BIT16(3) +#define PAU_PHY_EXT_CMD_REQ_TX_ZCAL PPC_BIT16(4) +#define PAU_PHY_EXT_CMD_REQ_TX_FFE PPC_BIT16(5) +#define PAU_PHY_EXT_CMD_REQ_POWER_ON PPC_BIT16(7) +static struct PPE_sram_reg PAU_PHY_EXT_CMD_DONE = { &PPE_FIRMWARE, 0x005 }; + +/* PPE memory */ +static struct PPE_sram_reg PAU_PHY_RX_PPE_CNTL1 = { &PPE_MEMORY, 0x000 }; +#define PAU_PHY_RX_ENABLE_AUTO_RECAL PPC_BIT16(1) + +enum pau_phy_status { + PAU_PROC_INPROGRESS, + PAU_PROC_COMPLETE, + PAU_PROC_NEXT, + PAU_PROC_FAILED +}; + +struct procedure { + const char *name; + uint32_t (*steps[])(struct pau_dev *); +}; + +#define DEFINE_PROCEDURE(NAME, STEPS...) \ + static struct procedure procedure_##NAME = { \ + .name = #NAME, \ + .steps = { STEPS } \ + } + +/* + * We could/should have one phy_sram_lock per PAU chiplet. Each PAU + * chiplet drives 2 OP units.
Since we don't have a PAU chiplet + * structure to host the lock and don't anticipate much contention, we + * go with a global lock for now + */ +static struct lock phy_sram_lock = LOCK_UNLOCKED; + +static int get_thread_id(uint32_t op_unit) +{ + int ppe_thread[8] = { 0, 1, 1, 0, 1, 0, 1, 0 }; + + /* static mapping between OP unit and PPE thread ID */ + if (op_unit >= ARRAY_SIZE(ppe_thread)) + return -1; + return ppe_thread[op_unit]; +} + +/* + * Compute the address in the memory mapped SRAM of a 16-bit PHY register + */ +static uint32_t pau_phy_sram_addr(struct pau_dev *dev, + struct PPE_sram_reg *reg, + int lane) +{ + uint32_t base, addr; + + base = reg->section->offset + + reg->section->size * get_thread_id(dev->op_unit); + addr = reg->offset; + if (lane >= 0) { + assert(reg->offset < 0x10); + addr += lane << 4; + } + addr <<= 1; // each register is 16-bit + return base + addr; +} + +static void pau_phy_set_access(struct pau_dev *dev, + struct PPE_sram_reg *reg, int lane, + uint64_t *data_addr, uint64_t *mask) +{ + struct pau *pau = dev->pau; + uint64_t scom_addr, sram_addr, addr, bit_start; + + scom_addr = SETFIELD(PAU_PHY_ADDR_CHIPLET, PAU_PHY_ADDR_REG, + pau->op_chiplet); + sram_addr = pau_phy_sram_addr(dev, reg, lane); + bit_start = 8 * (sram_addr & 7); + + addr = SETFIELD(PAU_PHY_ADDR_SRAM_ADDR, 0ull, sram_addr & 0xFFFFFFF8); + xscom_write(pau->chip_id, scom_addr, addr); + + *data_addr = SETFIELD(PAU_PHY_DATA_CHIPLET, PAU_PHY_DATA_REG, + pau->op_chiplet); + *mask = PPC_BITMASK(bit_start, bit_start + 15); +} + +static void pau_phy_write_lane(struct pau_dev *dev, + struct PPE_sram_reg *reg, int lane, + uint16_t val) +{ + struct pau *pau = dev->pau; + uint64_t data_addr, scom_val, mask; + + lock(&phy_sram_lock); + pau_phy_set_access(dev, reg, lane, &data_addr, &mask); + xscom_read(pau->chip_id, data_addr, &scom_val); + scom_val = SETFIELD(mask, scom_val, val); + xscom_write(pau->chip_id, data_addr, scom_val); + unlock(&phy_sram_lock); +} + +static uint16_t pau_phy_read_lane(struct pau_dev *dev, + struct PPE_sram_reg *reg, int lane) +{ + struct pau *pau = dev->pau; + uint64_t data_addr, scom_val, mask; + uint16_t res; + + lock(&phy_sram_lock); + pau_phy_set_access(dev, reg, lane, &data_addr, &mask); + xscom_read(pau->chip_id, data_addr, &scom_val); + res = GETFIELD(mask, scom_val); + unlock(&phy_sram_lock); + return res; +} + +static void pau_phy_write(struct pau_dev *dev, struct PPE_sram_reg *reg, + uint16_t val) +{ + pau_phy_write_lane(dev, reg, -1, val); +} + +static uint16_t pau_phy_read(struct pau_dev *dev, struct PPE_sram_reg *reg) +{ + return pau_phy_read_lane(dev, reg, -1); +} + +static uint16_t get_reset_request_val(void) +{ + return PAU_PHY_EXT_CMD_REQ_IO_RESET | + PAU_PHY_EXT_CMD_REQ_DCCAL | + PAU_PHY_EXT_CMD_REQ_TX_ZCAL | + PAU_PHY_EXT_CMD_REQ_TX_FFE | + PAU_PHY_EXT_CMD_REQ_POWER_ON; +} + +static uint32_t reset_start(struct pau_dev *dev) +{ + uint16_t val16; + + // Procedure IO_INIT_RESET_PON + + // Clear external command request / done registers + val16 = 0; + pau_phy_write(dev, &PAU_PHY_EXT_CMD_REQ, val16); + pau_phy_write(dev, &PAU_PHY_EXT_CMD_DONE, val16); + + // Write the external command lanes to target + val16 = dev->phy_lane_mask >> 16; + pau_phy_write(dev, &PAU_PHY_EXT_CMD_LANES_00_15, val16); + val16 = dev->phy_lane_mask & 0xFFFF; + pau_phy_write(dev, &PAU_PHY_EXT_CMD_LANES_16_31, val16); + + // Initialize PHY Lanes + val16 = get_reset_request_val(); + pau_phy_write(dev, &PAU_PHY_EXT_CMD_REQ, val16); + return PAU_PROC_NEXT; +} + +static uint32_t reset_check(struct
pau_dev *dev) +{ + uint16_t val16, done; + + val16 = get_reset_request_val(); + done = pau_phy_read(dev, &PAU_PHY_EXT_CMD_DONE); + + if (val16 == done) + return PAU_PROC_NEXT; + else + return PAU_PROC_INPROGRESS; +} + +static uint32_t enable_recal(struct pau_dev *dev) +{ + uint32_t lane; + + // Enable auto-recalibration + for (lane = 0; lane <= PAU_MAX_PHY_LANE; lane++) + if (!(dev->phy_lane_mask & (1 << (31 - lane)))) + continue; + else + pau_phy_write_lane(dev, &PAU_PHY_RX_PPE_CNTL1, + lane, PAU_PHY_RX_ENABLE_AUTO_RECAL); + + return PAU_PROC_COMPLETE; +} + +DEFINE_PROCEDURE(phy_reset, reset_start, reset_check, enable_recal); + +static enum pau_phy_status run_steps(struct pau_dev *dev) +{ + struct procedure *p = &procedure_phy_reset; + struct phy_proc_state *procedure_state = &dev->pau->procedure_state; + enum pau_phy_status rc; + + do { + rc = p->steps[procedure_state->step](dev); + if (rc == PAU_PROC_NEXT) { + procedure_state->step++; + PAUDEVDBG(dev, "Running procedure %s step %d\n", + p->name, procedure_state->step); + } + } while (rc == PAU_PROC_NEXT); + return rc; +} + +static enum pau_phy_status run_procedure(struct pau_dev *dev) +{ + struct procedure *p = &procedure_phy_reset; + struct phy_proc_state *procedure_state = &dev->pau->procedure_state; + enum pau_phy_status rc; + + do { + rc = run_steps(dev); + if (rc == PAU_PROC_INPROGRESS) { + if (tb_compare(mftb(), procedure_state->timeout) == TB_AAFTERB) { + PAUDEVERR(dev, "Procedure %s timed out\n", p->name); + rc = PAU_PROC_FAILED; + } else { + time_wait_ms(1); + } + } + } while (rc == PAU_PROC_INPROGRESS); + return rc; +} + +int pau_dev_phy_reset(struct pau_dev *dev) +{ + struct procedure *p = &procedure_phy_reset; + struct phy_proc_state *procedure_state = &dev->pau->procedure_state; + enum pau_phy_status rc; + + lock(&procedure_state->lock); + procedure_state->step = 0; + procedure_state->timeout = mftb() + msecs_to_tb(PAU_PHY_INIT_TIMEOUT); + PAUDEVDBG(dev, "Running procedure %s step %d\n", + p->name, procedure_state->step); + rc = run_procedure(dev); + unlock(&procedure_state->lock); + + if (rc == PAU_PROC_COMPLETE) { + PAUDEVDBG(dev, "Procedure %s complete\n", p->name); + return OPAL_SUCCESS; + } + PAUDEVDBG(dev, "Procedure %s failed\n", p->name); + return OPAL_HARDWARE; +} diff --git a/include/pau.h b/include/pau.h index ea15f0b8..660931c0 100644 --- a/include/pau.h +++ b/include/pau.h @@ -28,6 +28,12 @@ struct pau_bar { uint64_t cfg; }; +struct phy_proc_state { + struct lock lock; /* protect any change to this structure */ + unsigned long timeout; + uint16_t step; +}; + struct pau_dev { enum pau_dev_type type; uint32_t index; @@ -54,6 +60,7 @@ struct pau { uint32_t index; struct dt_node *dt_node; uint32_t chip_id; + uint32_t op_chiplet; /* from pervasive: 0x10 -> 0x13 */ uint64_t xscom_base; /* Global MMIO window (all PAU regs) */ @@ -65,6 +72,7 @@ struct pau { uint32_t links; struct pau_dev devices[PAU_LINKS_OPENCAPI_PER_PAU]; + struct phy_proc_state procedure_state; }; #define PAUDBG(pau, fmt, a...) 
PAULOG(PR_DEBUG, pau, fmt, ##a) @@ -191,4 +199,7 @@ static inline uint64_t pau_read(struct pau *pau, uint64_t reg) void pau_opencapi_dump_scoms(struct phb *phb); +/* PHY */ +int pau_dev_phy_reset(struct pau_dev *dev); + #endif /* __PAU_H */ -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:51 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:51 +0200 Subject: [Skiboot] [PATCH 10/16] [PATCH 10/16] opencapi5: complete phb ops In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-11-clombard@linux.vnet.ibm.com> Add more PHB interfaces: - to control pci error type in case of freeze. - add the addresses of the registers needed by the OS to handle translation failures. - to detect the fence state of a specific brick - to configure BDF (Bus Device Function) and PE (Partitionable Endpoint) for context identification. Signed-off-by: Christophe Lombard --- hw/pau.c | 143 +++++++++++++++++++++++++++++++++++++++++++++ include/pau-regs.h | 9 +++ 2 files changed, 152 insertions(+) diff --git a/hw/pau.c b/hw/pau.c index 98abe704..68195e48 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -597,6 +597,144 @@ PAU_OPENCAPI_PCI_CFG_WRITE(8, u8) PAU_OPENCAPI_PCI_CFG_WRITE(16, u16) PAU_OPENCAPI_PCI_CFG_WRITE(32, u32) +static int64_t pau_opencapi_eeh_freeze_status(struct phb *phb __unused, + uint64_t pe_num __unused, + uint8_t *freeze_state, + uint16_t *pci_error_type, + uint16_t *severity) +{ + /* + * FIXME: When it's called by skiboot PCI config accessor, + * the PE number is fixed to 0, which is incorrect. We need + * introduce another PHB callback to translate it. For now, + * it keeps the skiboot PCI enumeration going. + */ + *freeze_state = OPAL_EEH_STOPPED_NOT_FROZEN; + *pci_error_type = OPAL_EEH_NO_ERROR; + + if (severity) + *severity = OPAL_EEH_SEV_NO_ERROR; + + return OPAL_SUCCESS; +} + +static int64_t pau_opencapi_ioda_reset(struct phb __unused * phb, + bool __unused purge) +{ + /* Not relevant to OpenCAPI - we do this just to silence the error */ + return OPAL_SUCCESS; +} + +static int64_t pau_opencapi_next_error(struct phb *phb, + uint64_t *first_frozen_pe, + uint16_t *pci_error_type, + uint16_t *severity) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + struct pau *pau = dev->pau; + uint32_t pe_num; + uint64_t val; + + if (!first_frozen_pe || !pci_error_type || !severity) + return OPAL_PARAMETER; + + if (dev->status & PAU_DEV_STATUS_BROKEN) { + val = pau_read(pau, PAU_MISC_BDF2PE_CFG(dev->index)); + pe_num = GETFIELD(PAU_MISC_BDF2PE_CFG_PE, val); + + PAUDEVDBG(dev, "Reporting device as broken\n"); + PAUDEVDBG(dev, "Brick %d fenced! 
(pe_num: %08x\n", + pau_dev_index(dev, PAU_LINKS_OPENCAPI_PER_PAU), + pe_num); + *first_frozen_pe = pe_num; + *pci_error_type = OPAL_EEH_PHB_ERROR; + *severity = OPAL_EEH_SEV_PHB_DEAD; + } else { + *first_frozen_pe = -1; + *pci_error_type = OPAL_EEH_NO_ERROR; + *severity = OPAL_EEH_SEV_NO_ERROR; + } + return OPAL_SUCCESS; +} + +static uint32_t pau_opencapi_dev_interrupt_level(struct pau_dev *dev) +{ + /* Interrupt Levels + * 35: Translation failure for OCAPI link 0 + * 36: Translation failure for OCAPI link 1 + */ + const uint32_t level[2] = {35, 36}; + + return level[dev->index]; +} + +static int pau_opencapi_dt_add_interrupts(struct phb *phb, + struct pci_device *pd, + void *data __unused) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + struct pau *pau = dev->pau; + uint64_t dsisr, dar, tfc, handle; + uint32_t irq; + + irq = pau->irq_base + pau_opencapi_dev_interrupt_level(dev); + + /* When an address translation fail causes the PAU to send an + * interrupt, information is stored in three registers for use + * by the interrupt handler. The OS accesses them by mmio. + */ + dsisr = pau->regs[0] + PAU_OTL_MISC_PSL_DSISR_AN(dev->index); + dar = pau->regs[0] + PAU_OTL_MISC_PSL_DAR_AN(dev->index); + tfc = pau->regs[0] + PAU_OTL_MISC_PSL_TFC_AN(dev->index); + handle = pau->regs[0] + PAU_OTL_MISC_PSL_PEHANDLE_AN(dev->index); + dt_add_property_cells(pd->dn, "ibm,opal-xsl-irq", irq); + dt_add_property_cells(pd->dn, "ibm,opal-xsl-mmio", + hi32(dsisr), lo32(dsisr), + hi32(dar), lo32(dar), + hi32(tfc), lo32(tfc), + hi32(handle), lo32(handle)); + return 0; +} + +static void pau_opencapi_phb_final_fixup(struct phb *phb) +{ + pci_walk_dev(phb, NULL, pau_opencapi_dt_add_interrupts, NULL); +} + +static int64_t pau_opencapi_set_pe(struct phb *phb, + uint64_t pe_num, + uint64_t bdfn, + uint8_t bcompare, + uint8_t dcompare, + uint8_t fcompare, + uint8_t action) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + struct pau *pau = dev->pau; + uint64_t val; + + PAUDEVDBG(dev, "Set partitionable endpoint = %08llx, bdfn = %08llx\n", + pe_num, bdfn); + + if (action != OPAL_MAP_PE && action != OPAL_UNMAP_PE) + return OPAL_PARAMETER; + + if (pe_num >= PAU_MAX_PE_NUM) + return OPAL_PARAMETER; + + if (bcompare != OpalPciBusAll || + dcompare != OPAL_COMPARE_RID_DEVICE_NUMBER || + fcompare != OPAL_COMPARE_RID_FUNCTION_NUMBER) + return OPAL_UNSUPPORTED; + + val = PAU_MISC_BDF2PE_CFG_ENABLE; + val = SETFIELD(PAU_MISC_BDF2PE_CFG_PE, val, pe_num); + val = SETFIELD(PAU_MISC_BDF2PE_CFG_BDF, val, 0); + pau_write(pau, PAU_MISC_BDF2PE_CFG(dev->index), val); + + return OPAL_SUCCESS; +} + static const struct phb_ops pau_opencapi_ops = { .cfg_read8 = pau_opencapi_pcicfg_read8, .cfg_read16 = pau_opencapi_pcicfg_read16, @@ -604,6 +742,11 @@ static const struct phb_ops pau_opencapi_ops = { .cfg_write8 = pau_opencapi_pcicfg_write8, .cfg_write16 = pau_opencapi_pcicfg_write16, .cfg_write32 = pau_opencapi_pcicfg_write32, + .eeh_freeze_status = pau_opencapi_eeh_freeze_status, + .next_error = pau_opencapi_next_error, + .ioda_reset = pau_opencapi_ioda_reset, + .phb_final_fixup = pau_opencapi_phb_final_fixup, + .set_pe = pau_opencapi_set_pe, }; static void pau_opencapi_create_phb(struct pau_dev *dev) diff --git a/include/pau-regs.h b/include/pau-regs.h index d98f435b..19b0b7cd 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -33,6 +33,7 @@ #define PAU_BLOCK_CQ_CTL PAU_BLOCK(4, 4) #define PAU_BLOCK_CQ_DAT PAU_BLOCK(4, 5) #define PAU_BLOCK_OTL(brk) PAU_BLOCK(4, 0xC + (brk)) +#define PAU_BLOCK_OTL_PSL(brk) 
PAU_BLOCK(0, 0xC + (brk)) #define PAU_BLOCK_XSL PAU_BLOCK(4, 0xE) #define PAU_BLOCK_PAU_XTS PAU_BLOCK(7, 1) #define PAU_BLOCK_PAU_MISC PAU_BLOCK(7, 2) @@ -117,6 +118,10 @@ #define PAU_OTL_MISC_CFG_TX_TEMP2_RATE PPC_BITMASK(16, 19) #define PAU_OTL_MISC_CFG_TX_TEMP3_RATE PPC_BITMASK(20, 23) #define PAU_OTL_MISC_CFG_TX_CRET_FREQ PPC_BITMASK(32, 34) +#define PAU_OTL_MISC_PSL_DSISR_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x000) +#define PAU_OTL_MISC_PSL_DAR_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x008) +#define PAU_OTL_MISC_PSL_TFC_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x010) +#define PAU_OTL_MISC_PSL_PEHANDLE_AN(brk) (PAU_BLOCK_OTL_PSL(brk) + 0x018) /* XSL block registers */ #define PAU_XSL_WRAP_CFG (PAU_BLOCK_XSL + 0x100) @@ -149,6 +154,10 @@ #define PAU_MISC_INT_1_CONFIG (PAU_BLOCK_PAU_MISC + 0x068) #define PAU_MISC_INT_BAR (PAU_BLOCK_PAU_MISC + 0x098) #define PAU_MISC_INT_BAR_ADDR PPC_BITMASK(0, 39) +#define PAU_MISC_BDF2PE_CFG(n) (PAU_BLOCK_PAU_MISC + 0x100 + (n) * 8) +#define PAU_MISC_BDF2PE_CFG_ENABLE PPC_BIT(0) +#define PAU_MISC_BDF2PE_CFG_PE PPC_BITMASK(4, 7) +#define PAU_MISC_BDF2PE_CFG_BDF PPC_BITMASK(8, 23) #define PAU_MISC_INT_2_CONFIG (PAU_BLOCK_PAU_MISC + 0x408) #define PAU_MISC_INT_2_CONFIG_XFAULT_2_5(n) PPC_BIT(0 + (n)) #define PAU_MISC_INT_2_CONFIG_XFAULT_0_1(n) PPC_BIT(54 + (n)) -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:56 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:56 +0200 Subject: [Skiboot] [PATCH 15/16] [PATCH 15/16] opencapi5: mmio invalidates In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-16-clombard@linux.vnet.ibm.com> The remaining translation mode: OpenCAPI 5.0 with TLBI/SLBI Snooping, is not used due to performance problems caused by the mismatch between the ERAT and Bloom Filter sizes. When the Address Translation Mode requires TLB and SLB Invalidate operations to be initiated using MMIO registers, a set of registers like the following is used: - XTS MMIO ATSD0 LPARID register - XTS MMIO ATSD0 AVA register - XTS MMIO ATSD0 launch register, write access initiates a shoot down -
XTS MMIO ATSD0 status register Signed-off-by: Christophe Lombard --- hw/npu-opal.c | 3 +++ hw/pau.c | 36 ++++++++++++++++++++++++++++++++++++ include/pau-regs.h | 8 ++++++++ include/pau.h | 2 ++ 4 files changed, 49 insertions(+) diff --git a/hw/npu-opal.c b/hw/npu-opal.c index 4fc4c662..50aa8675 100644 --- a/hw/npu-opal.c +++ b/hw/npu-opal.c @@ -60,6 +60,9 @@ static int64_t opal_npu_map_lpar(uint64_t phb_id, uint64_t bdf, uint64_t lparid, if (phb->phb_type == phb_type_npu_v3) return npu3_map_lpar(phb, bdf, lparid, lpcr); + if (phb->phb_type == phb_type_pau_opencapi) + return pau_opencapi_map_atsd_lpar(phb, bdf, lparid, lpcr); + return OPAL_PARAMETER; } opal_call(OPAL_NPU_MAP_LPAR, opal_npu_map_lpar, 4); diff --git a/hw/pau.c b/hw/pau.c index 33d33c65..1d11aeac 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -274,6 +274,29 @@ static void pau_device_detect_fixup(struct pau_dev *dev) dt_add_property_strings(dn, "ibm,pau-link-type", "unknown"); } +int64_t pau_opencapi_map_atsd_lpar(struct phb *phb, uint64_t __unused bdf, + uint64_t lparid, uint64_t __unused lpcr) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + struct pau *pau = dev->pau; + uint64_t val; + + if (lparid >= PAU_XTS_ATSD_MAX) + return OPAL_PARAMETER; + + lock(&pau->lock); + + /* We need to allocate an ATSD per link */ + val = SETFIELD(PAU_XTS_ATSD_HYP_LPARID, 0ull, lparid); + if (!lparid) + val |= PAU_XTS_ATSD_HYP_MSR_HV; + + pau_write(pau, PAU_XTS_ATSD_HYP(lparid), val); + + unlock(&pau->lock); + return OPAL_SUCCESS; +} + int64_t pau_opencapi_spa_setup(struct phb *phb, uint32_t __unused bdfn, uint64_t addr, uint64_t PE_mask) { @@ -1465,6 +1488,18 @@ static void pau_opencapi_create_phb(struct pau_dev *dev) pau_opencapi_create_phb_slot(dev); } +static void pau_dt_add_mmio_atsd(struct pau_dev *dev) +{ + struct dt_node *dn = dev->phb.dt_node; + struct pau *pau = dev->pau; + uint64_t mmio_atsd[PAU_XTS_ATSD_MAX]; + + for (uint32_t i = 0; i < PAU_XTS_ATSD_MAX; i++) + mmio_atsd[i] = pau->regs[0] + PAU_XTS_ATSD_LAUNCH(i); + + dt_add_property(dn, "ibm,mmio-atsd", mmio_atsd, sizeof(mmio_atsd)); +} + static void pau_opencapi_dt_add_mmio_window(struct pau_dev *dev) { struct dt_node *dn = dev->phb.dt_node; @@ -1531,6 +1566,7 @@ static void pau_opencapi_dt_add_props(struct pau_dev *dev) dt_add_property_cells(dn, "ibm,opal-num-pes", PAU_MAX_PE_NUM); dt_add_property_cells(dn, "ibm,opal-reserved-pe", PAU_RESERVED_PE_NUM); + pau_dt_add_mmio_atsd(dev); pau_opencapi_dt_add_mmio_window(dev); pau_opencapi_dt_add_hotpluggable(dev); } diff --git a/include/pau-regs.h b/include/pau-regs.h index 57c2d723..da83ad44 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -37,6 +37,7 @@ #define PAU_BLOCK_XSL PAU_BLOCK(4, 0xE) #define PAU_BLOCK_PAU_XTS PAU_BLOCK(7, 1) #define PAU_BLOCK_PAU_MISC PAU_BLOCK(7, 2) +#define PAU_BLOCK_PAU_XTS_ATSD(n) PAU_BLOCK(8, (n)) /* * CQ_SM block registers @@ -176,6 +177,9 @@ #define PAU_XTS_CFG2_XSL2_ENA PPC_BIT(55) #define PAU_XTS_CFG3 (PAU_BLOCK_PAU_XTS + 0x068) #define PAU_XTS_CFG3_MMIOSD_OCAPI PPC_BIT(5) +#define PAU_XTS_ATSD_HYP(n) (PAU_BLOCK_PAU_XTS + 0x100 + (n) * 8) +#define PAU_XTS_ATSD_HYP_MSR_HV PPC_BIT(51) +#define PAU_XTS_ATSD_HYP_LPARID PPC_BITMASK(52, 63) /* MISC block registers */ #define PAU_MISC_OPTICAL_IO_CONFIG (PAU_BLOCK_PAU_MISC + 0x018) @@ -204,4 +208,8 @@ #define PAU_MISC_INT_2_CONFIG_XFAULT_2_5(n) PPC_BIT(0 + (n)) #define PAU_MISC_INT_2_CONFIG_XFAULT_0_1(n) PPC_BIT(54 + (n)) +/* PAU_XTS_ATSD block registers */ +#define PAU_XTS_ATSD_LAUNCH(n) (PAU_BLOCK_PAU_XTS_ATSD(n) + 0x000) +#define 
PAU_XTS_ATSD_MAX 16 + #endif /* __PAU_REGS_H */ diff --git a/include/pau.h b/include/pau.h index 61b17925..9b612fc2 100644 --- a/include/pau.h +++ b/include/pau.h @@ -200,6 +200,8 @@ static inline uint64_t pau_read(struct pau *pau, uint64_t reg) } void pau_opencapi_dump_scoms(struct phb *phb); +int64_t pau_opencapi_map_atsd_lpar(struct phb *phb, uint64_t __unused bdf, + uint64_t lparid, uint64_t __unused lpcr); int64_t pau_opencapi_spa_setup(struct phb *phb, uint32_t __unused bdfn, uint64_t addr, uint64_t PE_mask); int64_t pau_opencapi_spa_clear_cache(struct phb *phb, -- 2.31.1 From clombard at linux.vnet.ibm.com Fri Aug 20 19:45:57 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Fri, 20 Aug 2021 11:45:57 +0200 Subject: [Skiboot] [PATCH 16/16] [PATCH 16/16] opencapi5: Add support for OpenCAPI Persistent Memory devices. In-Reply-To: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> Message-ID: <20210820094557.29743-17-clombard@linux.vnet.ibm.com> Lowest Point of Coherency (LPC) memory allows the host to access memory on an OpenCAPI device. When the P10 chip accesses memory addresses on the AFU, the Real Address on the PowerBus must hit a BAR in the PAU such as GPU-Memory BAR. The BAR defines the range of Real Addresses that represent AFU memory. The two existing OPAL calls, OPAL_NPU_MEM_ALLOC and OPAL_NPU_MEM_RELEASE, are used to manage the AFU memory. Signed-off-by: Christophe Lombard --- hw/npu-opal.c | 35 +++++++++++++++++ hw/npu2-opencapi.c | 18 ++------- hw/pau.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++ hw/phys-map.c | 3 ++ include/npu2.h | 3 ++ include/pau-regs.h | 8 ++++ include/pau.h | 4 ++ 7 files changed, 149 insertions(+), 15 deletions(-) diff --git a/hw/npu-opal.c b/hw/npu-opal.c index 50aa8675..0f0b7bbe 100644 --- a/hw/npu-opal.c +++ b/hw/npu-opal.c @@ -252,3 +252,38 @@ static int64_t opal_npu_tl_set(uint64_t phb_id, uint32_t bdfn, return rc; } opal_call(OPAL_NPU_TL_SET, opal_npu_tl_set, 5); + +static int64_t opal_npu_mem_alloc(uint64_t phb_id, uint32_t bdfn, + uint64_t size, uint64_t *bar) +{ + struct phb *phb = pci_get_phb(phb_id); + int64_t rc = OPAL_SUCCESS; + + if (!phb) + return OPAL_PARAMETER; + + if (phb->phb_type == phb_type_npu_v2_opencapi) + rc = npu2_opencapi_mem_alloc(phb, bdfn, size, bar); + else if (phb->phb_type == phb_type_pau_opencapi) + rc = pau_opencapi_mem_alloc(phb, bdfn, size, bar); + + return rc; +} +opal_call(OPAL_NPU_MEM_ALLOC, opal_npu_mem_alloc, 4); + +static int64_t opal_npu_mem_release(uint64_t phb_id, uint32_t bdfn) +{ + struct phb *phb = pci_get_phb(phb_id); + int64_t rc = OPAL_SUCCESS; + + if (!phb) + return OPAL_PARAMETER; + + if (phb->phb_type == phb_type_npu_v2_opencapi) + rc = npu2_opencapi_mem_release(phb, bdfn); + else if (phb->phb_type == phb_type_pau_opencapi) + rc = pau_opencapi_mem_release(phb, bdfn); + + return rc; +} +opal_call(OPAL_NPU_MEM_RELEASE, opal_npu_mem_release, 2); diff --git a/hw/npu2-opencapi.c b/hw/npu2-opencapi.c index 686f2e22..5a0d060e 100644 --- a/hw/npu2-opencapi.c +++ b/hw/npu2-opencapi.c @@ -2300,18 +2300,13 @@ out: return rc; } -static int64_t opal_npu_mem_alloc(uint64_t phb_id, uint32_t __unused bdfn, - uint64_t size, __be64 *__bar) +int64_t npu2_opencapi_mem_alloc(struct phb *phb, uint32_t __unused bdfn, + uint64_t size, uint64_t *__bar) { - struct phb *phb = pci_get_phb(phb_id); struct npu2_dev *dev; uint64_t bar; int64_t rc; -
dev = phb_to_npu2_dev_ocapi(phb); if (!dev) return OPAL_PARAMETER; @@ -2325,21 +2320,14 @@ static int64_t opal_npu_mem_alloc(uint64_t phb_id, uint32_t __unused bdfn, return rc; } -opal_call(OPAL_NPU_MEM_ALLOC, opal_npu_mem_alloc, 4); -static int64_t opal_npu_mem_release(uint64_t phb_id, uint32_t __unused bdfn) +int64_t npu2_opencapi_mem_release(struct phb *phb, uint32_t __unused bdfn) { - struct phb *phb = pci_get_phb(phb_id); struct npu2_dev *dev; - - if (!phb || phb->phb_type != phb_type_npu_v2_opencapi) - return OPAL_PARAMETER; - dev = phb_to_npu2_dev_ocapi(phb); if (!dev) return OPAL_PARAMETER; return release_mem_bar(dev); } -opal_call(OPAL_NPU_MEM_RELEASE, opal_npu_mem_release, 2); diff --git a/hw/pau.c b/hw/pau.c index 1d11aeac..874cf85a 100644 --- a/hw/pau.c +++ b/hw/pau.c @@ -453,6 +453,99 @@ int64_t pau_opencapi_tl_set(struct phb *phb, uint32_t __unused bdfn, return OPAL_SUCCESS; } +static int64_t pau_opencapi_afu_memory_bars(struct pau_dev *dev, + uint64_t size, + uint64_t *bar) +{ + struct pau *pau = dev->pau; + uint64_t addr, psize; + uint64_t reg, val; + + PAUDEVDBG(dev, "Setup AFU Memory BARs\n"); + + if (dev->memory_bar.enable) { + PAUDEVERR(dev, "AFU memory allocation failed - BAR already in use\n"); + return OPAL_RESOURCE; + } + + phys_map_get(pau->chip_id, OCAPI_MEM, + dev->index, + &addr, &psize); + + if (size > psize) { + PAUDEVERR(dev, "Invalid AFU memory BAR allocation size " + "requested: 0x%llx bytes (limit 0x%llx)\n", + size, psize); + return OPAL_PARAMETER; + } + + if (size < (1 << 30)) + size = 1 << 30; + + dev->memory_bar.enable = true; + dev->memory_bar.addr = addr; + dev->memory_bar.size = size; + + reg = PAU_GPU_MEM_BAR(dev->index); + val = PAU_GPU_MEM_BAR_ENABLE | + PAU_GPU_MEM_BAR_POISON; + val = SETFIELD(PAU_GPU_MEM_BAR_ADDR, val, addr >> 30); + if (!is_pow2(size)) + size = 1ull << (ilog2(size) + 1); + + size = (size >> 30) - 1; + val = SETFIELD(PAU_GPU_MEM_BAR_SIZE, val, size); + pau_write(pau, reg, val); + + reg = PAU_CTL_MISC_GPU_MEM_BAR(dev->index); + pau_write(pau, reg, val); + + reg = PAU_XSL_GPU_MEM_BAR(dev->index); + pau_write(pau, reg, val); + + *bar = addr; + return OPAL_SUCCESS; +} + +int64_t pau_opencapi_mem_alloc(struct phb *phb, uint32_t __unused bdfn, + uint64_t size, uint64_t *bar) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + int64_t rc; + + if (!dev) + return OPAL_PARAMETER; + + if (!opal_addr_valid(bar)) + return OPAL_PARAMETER; + + lock(&dev->pau->lock); + rc = pau_opencapi_afu_memory_bars(dev, size, bar); + + unlock(&dev->pau->lock); + return rc; +} + +int64_t pau_opencapi_mem_release(struct phb *phb, uint32_t __unused bdfn) +{ + struct pau_dev *dev = pau_phb_to_opencapi_dev(phb); + + if (!dev) + return OPAL_PARAMETER; + + lock(&dev->pau->lock); + pau_write(dev->pau, PAU_GPU_MEM_BAR(dev->index), 0ull); + pau_write(dev->pau, PAU_CTL_MISC_GPU_MEM_BAR(dev->index), 0ull); + pau_write(dev->pau, PAU_XSL_GPU_MEM_BAR(dev->index), 0ull); + + dev->memory_bar.enable = false; + dev->memory_bar.addr = 0ull; + dev->memory_bar.size = 0ull; + unlock(&dev->pau->lock); + + return OPAL_SUCCESS; +} + #define CQ_CTL_STATUS_TIMEOUT 10 /* milliseconds */ static int pau_opencapi_set_fence_control(struct pau_dev *dev, diff --git a/hw/phys-map.c b/hw/phys-map.c index 7b44fc61..68d7cd0d 100644 --- a/hw/phys-map.c +++ b/hw/phys-map.c @@ -32,6 +32,9 @@ static const struct phys_map_entry phys_map_table_p10[] = { /* TODO: Figure out GPU memory */ + { OCAPI_MEM, 0, 0x0002000000000000ull, 0x0000040000000000ull }, + { OCAPI_MEM, 1, 0x0002004000000000ull, 
0x0000040000000000ull }, + /* 0 TB offset @ MMIO 0x0006000000000000ull */ { PHB5_64BIT_MMIO, 0, 0x0006000000000000ull, 0x0000004000000000ull }, { PHB5_64BIT_MMIO, 1, 0x0006004000000000ull, 0x0000004000000000ull }, diff --git a/include/npu2.h b/include/npu2.h index abe88747..c24861ab 100644 --- a/include/npu2.h +++ b/include/npu2.h @@ -277,5 +277,8 @@ int64_t npu2_opencapi_spa_clear_cache(struct phb *phb, uint32_t __unused bdfn, uint64_t PE_handle); int64_t npu2_opencapi_tl_set(struct phb *phb, uint32_t __unused bdfn, long capabilities, uint64_t rate_phys, int rate_sz); +int64_t npu2_opencapi_mem_alloc(struct phb *phb, uint32_t __unused bdfn, + uint64_t size, uint64_t *bar); +int64_t npu2_opencapi_mem_release(struct phb *phb, uint32_t __unused bdfn); #endif /* __NPU2_H */ diff --git a/include/pau-regs.h b/include/pau-regs.h index da83ad44..45c36037 100644 --- a/include/pau-regs.h +++ b/include/pau-regs.h @@ -64,6 +64,12 @@ #define PAU_SNP_MISC_CFG0_ENABLE_PBUS PPC_BIT(2) #define PAU_SNP_MISC_CFG0_OCAPI_MODE PPC_BITMASK(32, 36) #define PAU_SNP_MISC_CFG0_OCAPI_C2 PPC_BITMASK(45, 49) +#define PAU_GPU_MEM_BAR(brk) (PAU_BLOCK_CQ_SM(0) + 0x190 + (brk) * 8) +#define PAU_GPU_MEM_BAR_ENABLE PPC_BIT(0) +#define PAU_GPU_MEM_BAR_ADDR_MASK PPC_BITMASK(1, 35) +#define PAU_GPU_MEM_BAR_ADDR PPC_BITMASK(1, 21) +#define PAU_GPU_MEM_BAR_SIZE PPC_BITMASK(22, 35) +#define PAU_GPU_MEM_BAR_POISON PPC_BIT(45) #define PAU_NTL_BAR(brk) (PAU_BLOCK_CQ_SM(0) + 0x1b8 + (brk) * 8) #define PAU_NTL_BAR_ENABLE PPC_BIT(0) #define PAU_NTL_BAR_ADDR PPC_BITMASK(3, 35) @@ -88,6 +94,7 @@ #define PAU_CTL_MISC_CFG2_OCAPI_MEM_OS_BIT PPC_BITMASK(25, 29) #define PAU_CTL_MISC_STATUS(brk) (PAU_BLOCK_CQ_CTL + 0x060 + (brk) * 8) #define PAU_CTL_MISC_STATUS_AM_FENCED(brk) (PPC_BITMASK(41, 42) << ((brk)*32)) +#define PAU_CTL_MISC_GPU_MEM_BAR(brk) (PAU_BLOCK_CQ_CTL + 0x070 + (brk) * 8) #define PAU_CTL_MISC_MMIOPA_CONFIG(brk) (PAU_BLOCK_CQ_CTL + 0x098 + (brk) * 8) #define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_ADDR PPC_BITMASK(1, 35) #define PAU_CTL_MISC_MMIOPA_CONFIG_BAR_SIZE PPC_BITMASK(39, 43) @@ -159,6 +166,7 @@ /* XSL block registers */ #define PAU_XSL_OSL_SPAP_AN(brk) (PAU_BLOCK_XSL + 0x000 + (brk) * 8) #define PAU_XSL_OSL_SPAP_AN_EN PPC_BIT(63) +#define PAU_XSL_GPU_MEM_BAR(brk) (PAU_BLOCK_XSL + 0x0D0 + (brk) * 8) #define PAU_XSL_WRAP_CFG (PAU_BLOCK_XSL + 0x100) #define PAU_XSL_WRAP_CFG_CLOCK_ENABLE PPC_BIT(0) #define PAU_XSL_OSL_XLATE_CFG(brk) (PAU_BLOCK_XSL + 0x040 + (brk) * 8) diff --git a/include/pau.h b/include/pau.h index 9b612fc2..c601f7ea 100644 --- a/include/pau.h +++ b/include/pau.h @@ -45,6 +45,7 @@ struct pau_dev { struct pau_bar ntl_bar; struct pau_bar genid_bar; + struct pau_bar memory_bar; /* Associated I2C information */ uint8_t i2c_bus_id; @@ -210,6 +211,9 @@ int64_t pau_opencapi_spa_clear_cache(struct phb *phb, int64_t pau_opencapi_tl_set(struct phb *phb, uint32_t __unused bdfn, long capabilities, uint64_t rate_phys, int rate_sz); +int64_t pau_opencapi_mem_alloc(struct phb *phb, uint32_t __unused bdfn, + uint64_t size, uint64_t *bar); +int64_t pau_opencapi_mem_release(struct phb *phb, uint32_t __unused bdfn); /* PHY */ -- 2.31.1 From stewart at flamingspork.com Tue Aug 24 12:25:08 2021 From: stewart at flamingspork.com (Stewart Smith) Date: Mon, 23 Aug 2021 19:25:08 -0700 Subject: [Skiboot] [PATCH 14/16] [PATCH 14/16] opencapi5: add opal functions In-Reply-To: <20210820094557.29743-15-clombard@linux.vnet.ibm.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com>
<20210820094557.29743-15-clombard@linux.vnet.ibm.com> Message-ID: <711d8946-a024-4e38-80b1-7093a81fdbb6@www.fastmail.com> On Fri, Aug 20, 2021, at 2:45 AM, Christophe Lombard wrote: > Add three OPAL API calls that are required by the ocxl driver. > > - OPAL_PAU_SPA_SETUP > > The Shared Process Area (SPA) is a table containing one entry (a > "Process Element") per memory context which can be accessed by the > OpenCAPI device. > > - OPAL_PAU_SPA_CLEAR_CACHE > > The PAU keeps a cache of recently accessed memory contexts. When a > Process Element is removed from the SPA, the cache for the link must > be cleared. > > - OPAL_PAU_TL_SET > > The Transaction Layer specification defines several templates for > messages to be exchanged on the link. During link setup, the host > and device must negotiate what templates are supported on both sides > and at what rates those messages can be sent. > > Signed-off-by: Christophe Lombard > --- > hw/npu-opal.c | 8 +++ > hw/pau.c | 159 +++++++++++++++++++++++++++++++++++++++++++++ > include/pau-regs.h | 13 ++++ > include/pau.h | 9 +++ > 4 files changed, 189 insertions(+) Probably want to add some documentation on the OPAL calls in doc/opal-api/ From npiggin at gmail.com Wed Aug 25 15:47:56 2021 From: npiggin at gmail.com (Nicholas Piggin) Date: Wed, 25 Aug 2021 15:47:56 +1000 Subject: [Skiboot] [PATCH v1] external/mambo: Updates for POWER9 and POWER10 configuration Message-ID: <20210825054756.585767-1-npiggin@gmail.com> Update SIM_CTRL1 bits: - Set the LPAR-per-core mode bit. This is required for SMT KVM to work. - Disable hardware atomic RC updates. This matches hardware. - Disable LM on POWER10 (already disabled on POWER9). - Disable TM nesting, TCHECK mask, mttfhar under suspend, and DISABLE_INST_FORCE_CIT_BIT on POWER10. - Enable DEXCR, HILE, ROP, BHRB disable, block BHRB writes in PR=0, BFLOAT, RFC02628 on POWER10. Update PVR and mambo f000f bits: - Set POWER10 to DD2.0 Signed-off-by: Nicholas Piggin --- external/mambo/skiboot.tcl | 11 ++++++----- hw/xscom.c | 2 +- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/external/mambo/skiboot.tcl b/external/mambo/skiboot.tcl index 0ecb55a77..48039ba80 100644 --- a/external/mambo/skiboot.tcl +++ b/external/mambo/skiboot.tcl @@ -134,9 +134,9 @@ if { $default_config == "PEGASUS" } { } if { $default_config == "P9" } { - # PVR configured for POWER9 DD2.3 Scale out 24 Core (ie SMT4) + # PVR configured for POWER9 DD2.3 Scale out 24 Core (ie SMT4), LPAR-per-thread myconf config processor/initial/PVR 0x4e1203 - myconf config processor/initial/SIM_CTRL1 0x4228301710000000 + myconf config processor/initial/SIM_CTRL1 0x42683c1710000000 if { $mconf(numa) } { myconf config memory_region_id_shift 45 @@ -144,9 +144,10 @@ if { $default_config == "P9" } { } if { $default_config == "P10" } { - # PVR configured for POWER10 DD1.0 - myconf config processor/initial/PVR 0x800100 - myconf config processor/initial/SIM_CTRL1 0xc228100400000000 + # PVR configured for POWER10 DD2.0, big-core, LPAR-per-thread + # Small-core has bit 0x1000 set. 
+ myconf config processor/initial/PVR 0x800200 + myconf config processor/initial/SIM_CTRL1 0x0c1cdc1400000000 if { $mconf(numa) } { myconf config memory_region_id_shift 44 diff --git a/hw/xscom.c b/hw/xscom.c index 347457242..298fe0c90 100644 --- a/hw/xscom.c +++ b/hw/xscom.c @@ -826,7 +826,7 @@ int64_t xscom_read_cfam_chipid(uint32_t partid, uint32_t *chip_id) */ if (chip_quirk(QUIRK_NO_F000F)) { if (proc_gen == proc_gen_p10) - val = 0x120DA04980000000UL; /* P10 DD1.0 */ + val = 0x220DA04980000000UL; /* P10 DD2.0 */ else if (proc_gen == proc_gen_p9) val = 0x203D104980000000UL; /* P9 Nimbus DD2.3 */ else -- 2.23.0 From clombard at linux.vnet.ibm.com Wed Aug 25 19:14:38 2021 From: clombard at linux.vnet.ibm.com (Christophe Lombard) Date: Wed, 25 Aug 2021 11:14:38 +0200 Subject: [Skiboot] [PATCH 14/16] [PATCH 14/16] opencapi5: add opal functions In-Reply-To: <711d8946-a024-4e38-80b1-7093a81fdbb6@www.fastmail.com> References: <20210820094557.29743-1-clombard@linux.vnet.ibm.com> <20210820094557.29743-15-clombard@linux.vnet.ibm.com> <711d8946-a024-4e38-80b1-7093a81fdbb6@www.fastmail.com> Message-ID: On 24/08/2021 at 04:25, Stewart Smith wrote: > On Fri, Aug 20, 2021, at 2:45 AM, Christophe Lombard wrote: >> Add three OPAL API calls that are required by the ocxl driver. >> >> - OPAL_PAU_SPA_SETUP >> >> The Shared Process Area (SPA) is a table containing one entry (a >> "Process Element") per memory context which can be accessed by the >> OpenCAPI device. >> >> - OPAL_PAU_SPA_CLEAR_CACHE >> >> The PAU keeps a cache of recently accessed memory contexts. When a >> Process Element is removed from the SPA, the cache for the link must >> be cleared. >> >> - OPAL_PAU_TL_SET >> >> The Transaction Layer specification defines several templates for >> messages to be exchanged on the link. During link setup, the host >> and device must negotiate what templates are supported on both sides >> and at what rates those messages can be sent. >> >> Signed-off-by: Christophe Lombard >> --- >> hw/npu-opal.c | 8 +++ >> hw/pau.c | 159 +++++++++++++++++++++++++++++++++++++++++++++ >> include/pau-regs.h | 13 ++++ >> include/pau.h | 9 +++ >> 4 files changed, 189 insertions(+) > Probably want to add some documentation on the OPAL calls in doc/opal-api/ > _______________________________________________ > Skiboot mailing list > Skiboot at lists.ozlabs.org > https://lists.ozlabs.org/listinfo/skiboot right. Thanks. From fbarrat at linux.ibm.com Thu Aug 26 01:04:08 2021 From: fbarrat at linux.ibm.com (Frederic Barrat) Date: Wed, 25 Aug 2021 17:04:08 +0200 Subject: [Skiboot] [PATCH] phb4/5: Escalate page-level TCE kills Message-ID: <20210825150408.54367-1-fbarrat@linux.ibm.com> An hw issue was found on P10 (HW560152) where a page-level TCE kill can be dropped if there are enough TCE kill requests already being processed. The net effect is that data integrity is not guaranteed. The circumvention is to stay away from page-level kills and escalate those to PE kills. Which hurts performance. It also affects P9.
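The escalation is transparent for the OS: a request to invalidate a few pages now behaves as if the whole PE had been invalidated. Roughly, the caller-visible effect is (illustrative sketch only, reusing the existing opal-api kill types):

    /* OS asks to invalidate the TCEs covering a DMA range... */
    opal_pci_tce_kill(phb_id, OPAL_PCI_TCE_KILL_PAGES, pe_number,
                      tce_size, dma_addr, npages);
    /* ...but the PHB now processes it as if it were: */
    opal_pci_tce_kill(phb_id, OPAL_PCI_TCE_KILL_PE, pe_number, 0, 0, 0);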
Signed-off-by: Frederic Barrat --- hw/phb4.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/hw/phb4.c b/hw/phb4.c index 79083d4a..ddaa18f8 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -1051,6 +1051,14 @@ static int64_t phb4_tce_kill(struct phb *phb, uint32_t kill_type, uint64_t val; int64_t rc; + /* + * HW560152: a page-level kill can be dropped if the + * processing queue is backed-up, which can cause data + * integrity issues + */ + if (kill_type == OPAL_PCI_TCE_KILL_PAGES) + kill_type = OPAL_PCI_TCE_KILL_PE; + sync(); switch(kill_type) { case OPAL_PCI_TCE_KILL_PAGES: -- 2.31.1 From stewart at flamingspork.com Thu Aug 26 02:37:27 2021 From: stewart at flamingspork.com (Stewart Smith) Date: Wed, 25 Aug 2021 09:37:27 -0700 Subject: [Skiboot] [PATCH] phb4/5: Escalate page-level TCE kills In-Reply-To: <20210825150408.54367-1-fbarrat@linux.ibm.com> References: <20210825150408.54367-1-fbarrat@linux.ibm.com> Message-ID: Sounds like also for stable? Sent from my iPhone > On Aug 25, 2021, at 8:09 AM, Frederic Barrat wrote: > > An hw issue was found on P10 (HW560152) where a page-level TCE kill > can be dropped if there are enough TCE kill requests already being > processed. The net effect is that data integrity is not > guaranteed. The circumvention is to stay away from page-level kills > and escalate those to PE kills. Which hurts performance. > It also affects P9.
> > Signed-off-by: Frederic Barrat > --- > hw/phb4.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/hw/phb4.c b/hw/phb4.c > index 79083d4a..ddaa18f8 100644 > --- a/hw/phb4.c > +++ b/hw/phb4.c > @@ -1051,6 +1051,14 @@ static int64_t phb4_tce_kill(struct phb *phb, uint32_t kill_type, > uint64_t val; > int64_t rc; > > + /* > + * HW560152: a page-level kill can be dropped if the > + * processing queue is backed-up, which can cause data > + * integrity issues > + */ > + if (kill_type == OPAL_PCI_TCE_KILL_PAGES) > + kill_type = OPAL_PCI_TCE_KILL_PE; > + > sync(); > switch(kill_type) { > case OPAL_PCI_TCE_KILL_PAGES: > -- > 2.31.1 > > _______________________________________________ > Skiboot mailing list > Skiboot at lists.ozlabs.org > https://lists.ozlabs.org/listinfo/skiboot From fbarrat at linux.ibm.com Thu Aug 26 16:15:07 2021 From: fbarrat at linux.ibm.com (Frederic Barrat) Date: Thu, 26 Aug 2021 08:15:07 +0200 Subject: [Skiboot] [PATCH] phb4/5: Escalate page-level TCE kills In-Reply-To: References: <20210825150408.54367-1-fbarrat@linux.ibm.com> Message-ID: <6a67fcee-2bc7-08cf-f815-8899a5a05a08@linux.ibm.com> On 25/08/2021 18:37, Stewart Smith wrote: > Sounds like also for stable? I sent it to the stable list but didn't explicitly add the "cc" field in the commit log. Vasant: could you add it or should I resend? Fred > Sent from my iPhone >> On Aug 25, 2021, at 8:09 AM, Frederic Barrat wrote: >> >> An hw issue was found on P10 (HW560152) where a page-level TCE kill >> can be dropped if there are enough TCE kill requests already being >> processed. The net effect is that data integrity is not >> guaranteed. The circumvention is to stay away from page-level kills >> and escalate those to PE kills. Which hurts performance. >> It also affects P9.
And if there's the right sequence of those in the queue, then a page level kill is dropped. Fred >> The circumvention is to stay away from page-level kills >> and escalate those to PE kills. Which hurts performance. > > understatement > >> It also affects P9. > > lol > > >> >> Signed-off-by: Frederic Barrat >> --- >> hw/phb4.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/hw/phb4.c b/hw/phb4.c >> index 79083d4a..ddaa18f8 100644 >> --- a/hw/phb4.c >> +++ b/hw/phb4.c >> @@ -1051,6 +1051,14 @@ static int64_t phb4_tce_kill(struct phb *phb, uint32_t kill_type, >> uint64_t val; >> int64_t rc; >> >> + /* >> + * HW560152: a page-level kill can be dropped if the >> + * processing queue is backed-up, which can cause data >> + * integrity issues >> + */ >> + if (kill_type == OPAL_PCI_TCE_KILL_PAGES) >> + kill_type = OPAL_PCI_TCE_KILL_PE; >> + >> sync(); >> switch(kill_type) { >> case OPAL_PCI_TCE_KILL_PAGES: >> -- >> 2.31.1 >> >> -- >> Skiboot-stable mailing list >> Skiboot-stable at lists.ozlabs.org >> https://lists.ozlabs.org/listinfo/skiboot-stable From hegdevasant at linux.vnet.ibm.com Fri Aug 27 16:50:24 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Fri, 27 Aug 2021 12:20:24 +0530 Subject: [Skiboot] [Skiboot-stable] [PATCH] phb4/5: Escalate page-level TCE kills In-Reply-To: <20210825150408.54367-1-fbarrat@linux.ibm.com> References: <20210825150408.54367-1-fbarrat@linux.ibm.com> Message-ID: <48a80825-2597-5563-60a7-71d7d0838582@linux.vnet.ibm.com> On 8/25/21 8:34 PM, Frederic Barrat wrote: > An hw issue was found on P10 (HW560152) where a page-level TCE kill > can be dropped if there are enough TCE kill requests already being > processed. The net effect is that data integrity is not > guaranteed. The circumvention is to stay away from page-level kills > and escalate those to PE kills. Which hurts performance. > It also affects P9. Thanks! Merged to master as 2b0b6a1a. -Vasant From hegdevasant at linux.vnet.ibm.com Fri Aug 27 16:50:57 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Fri, 27 Aug 2021 12:20:57 +0530 Subject: [Skiboot] [PATCH] ci: Bump qemu version In-Reply-To: <20210819154103.53068-1-hegdevasant@linux.vnet.ibm.com> References: <20210819154103.53068-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <57f362e3-02a9-cee0-e2fd-1e8b4e2c9f6a@linux.vnet.ibm.com> On 8/19/21 9:11 PM, Vasant Hegde wrote: > Move to qemu version powernv-6.1. > > Signed-off-by: Vasant Hegde Merged to master as 39f12eb0. -Vasant From hegdevasant at linux.vnet.ibm.com Fri Aug 27 16:51:40 2021 From: hegdevasant at linux.vnet.ibm.com (Vasant Hegde) Date: Fri, 27 Aug 2021 12:21:40 +0530 Subject: [Skiboot] [PATCH v2] hello_world: Add p10 mambo tests In-Reply-To: <20210819154039.52851-1-hegdevasant@linux.vnet.ibm.com> References: <20210819154039.52851-1-hegdevasant@linux.vnet.ibm.com> Message-ID: <0cf1ac8c-5504-ee01-339d-aca849cb05ec@linux.vnet.ibm.com> On 8/19/21 9:10 PM, Vasant Hegde wrote: > Signed-off-by: Vasant Hegde > --- > Changes in v2: > I had missed to add run_mambo_p10_hello_world.sh in v1. Merged to master as 6352fa85. -Vasant From fbarrat at linux.ibm.com Mon Aug 30 22:06:33 2021 From: fbarrat at linux.ibm.com (Frederic Barrat) Date: Mon, 30 Aug 2021 14:06:33 +0200 Subject: [Skiboot] [PATCH] npu3: Remove GPU support on Swift Message-ID: <20210830120633.41712-1-fbarrat@linux.ibm.com> npu3 was only used on the Swift platform to add support for GPUs (nvlink). The Swift platform has never left the lab and support for GPUs on it is pretty much dead. 
So let's remove it. The patch removes all related code. Device tree entries are no longer created and in the very unlikely case that someone is still trying to boot it, the linux nvlink discovery code should be quiet. Tested by booting on Swift with no GPU. Signed-off-by: Frederic Barrat --- core/init.c | 1 - hw/Makefile.inc | 3 +- hw/npu-opal.c | 17 +- hw/npu3-hw-procedures.c | 792 ----------------- hw/npu3-nvlink.c | 1828 -------------------------------------- hw/npu3.c | 549 ------------ include/npu3-regs.h | 253 ------ include/npu3.h | 192 ---- include/pci.h | 1 - include/platform.h | 2 - include/skiboot.h | 1 - platforms/astbmc/swift.c | 89 +- 12 files changed, 4 insertions(+), 3724 deletions(-) delete mode 100644 hw/npu3-hw-procedures.c delete mode 100644 hw/npu3-nvlink.c delete mode 100644 hw/npu3.c delete mode 100644 include/npu3-regs.h delete mode 100644 include/npu3.h diff --git a/core/init.c b/core/init.c index a8bac28a..235f9055 100644 --- a/core/init.c +++ b/core/init.c @@ -1370,7 +1370,6 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt) /* Probe NPUs */ probe_npu(); probe_npu2(); - probe_npu3(); /* Initialize PCI */ pci_init_slots(); diff --git a/hw/Makefile.inc b/hw/Makefile.inc index 37256d3c..c254fcbd 100644 --- a/hw/Makefile.inc +++ b/hw/Makefile.inc @@ -8,8 +8,7 @@ HW_OBJS += dts.o lpc-rtc.o npu.o npu-hw-procedures.o xive.o phb4.o HW_OBJS += fake-nvram.o lpc-mbox.o npu2.o npu2-hw-procedures.o HW_OBJS += npu2-common.o npu2-opencapi.o phys-map.o sbe-p9.o capp.o HW_OBJS += occ-sensor.o vas.o sbe-p8.o dio-p9.o lpc-port80h.o cache-p9.o -HW_OBJS += npu-opal.o npu3.o npu3-nvlink.o npu3-hw-procedures.o -HW_OBJS += ocmb.o xive2.o +HW_OBJS += npu-opal.o ocmb.o xive2.o HW=hw/built-in.a include $(SRC)/hw/fsp/Makefile.inc diff --git a/hw/npu-opal.c b/hw/npu-opal.c index 412ea460..c7f5f9f4 100644 --- a/hw/npu-opal.c +++ b/hw/npu-opal.c @@ -7,7 +7,6 @@ #include #include #include -#include static int64_t opal_npu_init_context(uint64_t phb_id, int pid __unused, uint64_t msr, uint64_t bdf) @@ -20,9 +19,6 @@ static int64_t opal_npu_init_context(uint64_t phb_id, int pid __unused, if (phb->phb_type == phb_type_npu_v2) return npu2_init_context(phb, msr, bdf); - if (phb->phb_type == phb_type_npu_v3) - return npu3_init_context(phb, msr, bdf); - return OPAL_PARAMETER; } opal_call(OPAL_NPU_INIT_CONTEXT, opal_npu_init_context, 4); @@ -38,9 +34,6 @@ static int64_t opal_npu_destroy_context(uint64_t phb_id, uint64_t pid __unused, if (phb->phb_type == phb_type_npu_v2) return npu2_destroy_context(phb, bdf); - if (phb->phb_type == phb_type_npu_v3) - return npu3_destroy_context(phb, bdf); - return OPAL_PARAMETER; } opal_call(OPAL_NPU_DESTROY_CONTEXT, opal_npu_destroy_context, 3); @@ -56,9 +49,6 @@ static int64_t opal_npu_map_lpar(uint64_t phb_id, uint64_t bdf, uint64_t lparid, if (phb->phb_type == phb_type_npu_v2) return npu2_map_lpar(phb, bdf, lparid, lpcr); - if (phb->phb_type == phb_type_npu_v3) - return npu3_map_lpar(phb, bdf, lparid, lpcr); - return OPAL_PARAMETER; } opal_call(OPAL_NPU_MAP_LPAR, opal_npu_map_lpar, 4); @@ -89,13 +79,10 @@ static int64_t npu_set_relaxed_order(uint32_t gcid, int pec, bool enable) int64_t rc; for_each_phb(phb) { - if (phb->phb_type == phb_type_npu_v2) - rc = npu2_set_relaxed_order(phb, gcid, pec, enable); - else if (phb->phb_type == phb_type_npu_v3) - rc = npu3_set_relaxed_order(phb, gcid, pec, enable); - else + if (phb->phb_type != phb_type_npu_v2) continue; + rc = npu2_set_relaxed_order(phb, gcid, pec, enable); if (rc) return rc; } diff --git 
a/hw/npu3-hw-procedures.c b/hw/npu3-hw-procedures.c deleted file mode 100644 index 098e6e46..00000000 --- a/hw/npu3-hw-procedures.c +++ /dev/null @@ -1,792 +0,0 @@ -// SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later -/* - * Copyright 2019 IBM Corp. - */ - -#include -#include -#include -#include -#include -#include - -#define NPU3DEVLOG(l, dev, fmt, a...) \ - prlog(l, "NPU[%d:%d:%d]: " fmt, \ - (dev)->npu->chip_id, \ - (dev)->npu->index, \ - (dev)->index, ##a) -#define NPU3DEVDBG(dev, fmt, a...) NPU3DEVLOG(PR_DEBUG, dev, fmt, ##a) -#define NPU3DEVINF(dev, fmt, a...) NPU3DEVLOG(PR_INFO, dev, fmt, ##a) -#define NPU3DEVERR(dev, fmt, a...) NPU3DEVLOG(PR_ERR, dev, fmt, ##a) - -/* - * The documentation for the PHY training is written in terms of bits within an - * actual register so we use that representation here. - */ -struct npu3_phy_reg { - uint64_t offset; - uint64_t mask; -}; - -static struct npu3_phy_reg -NPU3_PHY_RX_RUN_LANE = { 0x0c8, PPC_BIT(48) }, -NPU3_PHY_RX_IORESET = { 0x096, PPC_BIT(63) }, -NPU3_PHY_TX_IORESET = { 0x113, PPC_BIT(48) }, -NPU3_PHY_RX_PR_RESET = { 0x096, PPC_BIT(62) }, -NPU3_PHY_RX_LANE_ANA_PDWN = { 0x002, PPC_BIT(54) }, -NPU3_PHY_RX_LANE_DIG_PDWN = { 0x088, PPC_BIT(48) }, -NPU3_PHY_RX_PR_PHASE_STEP = { 0x08a, PPC_BITMASK(60, 63) }, -NPU3_PHY_TX_LANE_PDWN = { 0x101, PPC_BIT(48) }, -NPU3_PHY_RX_RUN_DCCAL = { 0x0c8, PPC_BIT(49) }, -NPU3_PHY_RX_DCCAL_DONE = { 0x0ca, PPC_BIT(49) }, -NPU3_PHY_RX_LANE_BUSY = { 0x0ca, PPC_BIT(50) }, -NPU3_PHY_RX_B_BANK_CONTROLS = { 0x002, PPC_BITMASK(58, 63) }, -NPU3_PHY_TX_UNLOAD_CLK_DISABLE = { 0x103, PPC_BIT(56) }, -NPU3_PHY_TX_FIFO_INIT = { 0x105, PPC_BIT(53) }, -NPU3_PHY_TX_RXCAL = { 0x103, PPC_BIT(57) }, -NPU3_PHY_RX_INIT_DONE = { 0x0ca, PPC_BIT(48) }, -NPU3_PHY_RX_PR_EDGE_TRACK_CNTL = { 0x092, PPC_BITMASK(48, 49) }, -NPU3_PHY_RX_PR_FW_OFF = { 0x08a, PPC_BIT(56) }, -NPU3_PHY_RX_PR_FW_INERTIA_AMT = { 0x08a, PPC_BITMASK(57, 59) }, -NPU3_PHY_RX_CFG_LTE_MC = { 0x000, PPC_BITMASK(60, 63) }, -NPU3_PHY_RX_A_INTEG_COARSE_GAIN = { 0x00a, PPC_BITMASK(48, 51) }, -NPU3_PHY_RX_B_INTEG_COARSE_GAIN = { 0x026, PPC_BITMASK(48, 51) }, -NPU3_PHY_RX_E_INTEG_COARSE_GAIN = { 0x030, PPC_BITMASK(48, 51) }, - -/* These registers are per-PHY, not per lane */ -NPU3_PHY_TX_ZCAL_SWO_EN = { 0x3c9, PPC_BIT(48) }, -NPU3_PHY_TX_ZCAL_REQ = { 0x3c1, PPC_BIT(49) }, -NPU3_PHY_TX_ZCAL_DONE = { 0x3c1, PPC_BIT(50) }, -NPU3_PHY_TX_ZCAL_ERROR = { 0x3c1, PPC_BIT(51) }, -NPU3_PHY_TX_ZCAL_N = { 0x3c3, PPC_BITMASK(48, 56) }, -NPU3_PHY_TX_ZCAL_P = { 0x3c5, PPC_BITMASK(48, 56) }, -NPU3_PHY_TX_PSEG_PRE_EN = { 0x34d, PPC_BITMASK(51, 55) }, -NPU3_PHY_TX_PSEG_PRE_SELECT = { 0x34d, PPC_BITMASK(56, 60) }, -NPU3_PHY_TX_NSEG_PRE_EN = { 0x34f, PPC_BITMASK(51, 55) }, -NPU3_PHY_TX_NSEG_PRE_SELECT = { 0x34f, PPC_BITMASK(56, 60) }, -NPU3_PHY_TX_PSEG_POST_EN = { 0x361, PPC_BITMASK(49, 55) }, -NPU3_PHY_TX_PSEG_POST_SELECT = { 0x361, PPC_BITMASK(56, 62) }, -NPU3_PHY_TX_NSEG_POST_EN = { 0x363, PPC_BITMASK(49, 55) }, -NPU3_PHY_TX_NSEG_POST_SELECT = { 0x363, PPC_BITMASK(56, 62) }, -NPU3_PHY_TX_PSEG_MARGINPU_EN = { 0x351, PPC_BITMASK(48, 55) }, -NPU3_PHY_TX_NSEG_MARGINPU_EN = { 0x353, PPC_BITMASK(48, 55) }, -NPU3_PHY_TX_PSEG_MARGINPD_EN = { 0x351, PPC_BITMASK(56, 63) }, -NPU3_PHY_TX_NSEG_MARGINPD_EN = { 0x353, PPC_BITMASK(56, 63) }, -NPU3_PHY_TX_MARGINPU_SELECT = { 0x355, PPC_BITMASK(48, 55) }, -NPU3_PHY_TX_MARGINPD_SELECT = { 0x355, PPC_BITMASK(56, 63) }, -NPU3_PHY_TX_PSEG_MAIN_EN = { 0x357, PPC_BITMASK(51, 57) }, -NPU3_PHY_TX_NSEG_MAIN_EN = { 0x359, PPC_BITMASK(51, 57) }, 
-NPU3_PHY_RX_CLKDIST_PDWN = { 0x204, PPC_BITMASK(48, 50) }, -NPU3_PHY_RX_IREF_PDWN = { 0x230, PPC_BIT(54) }, -NPU3_PHY_TX_CLKDIST_PDWN = { 0x305, PPC_BITMASK(48, 50) }, -NPU3_PHY_RX_CTL_DATASM_CLKDIST_PDWN = { 0x2e0, PPC_BIT(60) }; - -static uint64_t npu3_phy_scom(struct npu3_dev *dev, struct npu3_phy_reg *reg, - int lane) -{ - uint64_t scom; - - /* Don't specify a lane for a non-per-lane register */ - if (lane >= 0) - assert(reg->offset < 0x200); - else - assert(reg->offset >= 0x200); - - scom = OB_INDIRECT(dev->ob_chiplet); - scom = SETFIELD(PPC_BITMASK(12, 21), scom, reg->offset); - - if (lane > 0) - scom = SETFIELD(PPC_BITMASK(27, 31), scom, lane); - - return scom; -} - -static void npu3_phy_write_lane(struct npu3_dev *dev, struct npu3_phy_reg *reg, - int lane, uint64_t val) -{ - struct npu3 *npu = dev->npu; - uint64_t scom, scom_val; - - scom = npu3_phy_scom(dev, reg, lane); - - xscom_read(npu->chip_id, scom, &scom_val); - scom_val = SETFIELD(reg->mask, scom_val, val); - xscom_write(npu->chip_id, scom, scom_val); -} - -static uint64_t npu3_phy_read_lane(struct npu3_dev *dev, - struct npu3_phy_reg *reg, - int lane) -{ - struct npu3 *npu = dev->npu; - uint64_t scom, scom_val; - - scom = npu3_phy_scom(dev, reg, lane); - xscom_read(npu->chip_id, scom, &scom_val); - - return GETFIELD(reg->mask, scom_val); -} - -static inline void npu3_phy_write(struct npu3_dev *dev, - struct npu3_phy_reg *reg, - uint64_t val) -{ - npu3_phy_write_lane(dev, reg, -1, val); -} - -static inline uint64_t npu3_phy_read(struct npu3_dev *dev, - struct npu3_phy_reg *reg) -{ - return npu3_phy_read_lane(dev, reg, -1); -} - -struct procedure { - const char *name; - uint32_t (*steps[])(struct npu3_dev *); -}; - -#define DEFINE_PROCEDURE(NAME, STEPS...) \ -static struct procedure procedure_##NAME = { \ - .name = #NAME, \ - .steps = { NAME, ##STEPS } \ -} - -static uint32_t stop(struct npu3_dev *npu_dev __unused) -{ - return NPU3_PROC_COMPLETE | NPU3_PROC_ABORTED; -} - -DEFINE_PROCEDURE(stop); - -static uint32_t nop(struct npu3_dev *npu_dev __unused) -{ - return NPU3_PROC_COMPLETE; -} - -DEFINE_PROCEDURE(nop); - -static void set_iovalid(struct npu3_dev *dev, bool raise) -{ - struct npu3 *npu = dev->npu; - uint64_t reg, val; - - reg = OB_CPLT_CONF1(dev->ob_chiplet); - - xscom_read(npu->chip_id, reg, &val); - val = SETFIELD(OB_CPLT_CONF1_NV_IOVALID(dev->index), val, raise); - xscom_write(npu->chip_id, reg, val); -} - -#define NPU3_PHY_LANES 24 - -#define npu3_for_each_lane(lane, dev) \ - for (lane = 0; lane < NPU3_PHY_LANES; lane++) \ - if (dev->phy_lane_mask & PPC_BIT32(lane)) \ - -static uint32_t phy_reset(struct npu3_dev *dev) -{ - uint32_t lane; - - set_iovalid(dev, false); - - npu3_for_each_lane(lane, dev) - npu3_phy_write_lane(dev, &NPU3_PHY_RX_RUN_LANE, lane, 0); - - return NPU3_PROC_NEXT; -} - -static uint32_t phy_reset_wait(struct npu3_dev *dev) -{ - int lane; - - /* Wait for all lanes to become inactive */ - npu3_for_each_lane(lane, dev) - if (npu3_phy_read_lane(dev, &NPU3_PHY_RX_LANE_BUSY, lane)) - return NPU3_PROC_INPROGRESS; - - npu3_for_each_lane(lane, dev) { - /* Set lane in reset */ - npu3_phy_write_lane(dev, &NPU3_PHY_RX_IORESET, lane, 1); - npu3_phy_write_lane(dev, &NPU3_PHY_TX_IORESET, lane, 1); - - /* Release lane from reset */ - npu3_phy_write_lane(dev, &NPU3_PHY_RX_IORESET, lane, 0); - npu3_phy_write_lane(dev, &NPU3_PHY_TX_IORESET, lane, 0); - - /* Reset the phase rotator */ - npu3_phy_write_lane(dev, &NPU3_PHY_RX_PR_RESET, lane, 1); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_PR_RESET, lane, 0); - } - - 
return NPU3_PROC_NEXT; -} - -/* Procedure 1.2.3 - Initialise I/O PHY Registers */ -static uint32_t phy_reset_complete(struct npu3_dev *dev) -{ - int lane; - - npu3_for_each_lane(lane, dev) { - npu3_phy_write_lane(dev, &NPU3_PHY_RX_LANE_ANA_PDWN, lane, 0); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_LANE_DIG_PDWN, lane, 0); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_PR_PHASE_STEP, lane, 0xc); - npu3_phy_write_lane(dev, &NPU3_PHY_TX_LANE_PDWN, lane, 0); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_PR_FW_INERTIA_AMT, lane, 4); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_CFG_LTE_MC, lane, 3); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_A_INTEG_COARSE_GAIN, lane, 11); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_B_INTEG_COARSE_GAIN, lane, 11); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_E_INTEG_COARSE_GAIN, lane, 11); - } - - set_iovalid(dev, true); - - return NPU3_PROC_COMPLETE; -} - -DEFINE_PROCEDURE(phy_reset, phy_reset_wait, phy_reset_complete); - -/* Procedure 1.2.6 - I/O PHY Tx Impedance Calibration */ -static uint32_t phy_tx_zcal(struct npu3_dev *dev) -{ - if (dev->npu->tx_zcal_complete) - return NPU3_PROC_COMPLETE; - - /* Turn off SW enable and enable zcal state machine */ - npu3_phy_write(dev, &NPU3_PHY_TX_ZCAL_SWO_EN, 0); - - /* Start impedance calibration state machine */ - npu3_phy_write(dev, &NPU3_PHY_TX_ZCAL_REQ, 1); - - return NPU3_PROC_NEXT; -} - -static uint32_t phy_tx_zcal_wait(struct npu3_dev *dev) -{ - if (npu3_phy_read(dev, &NPU3_PHY_TX_ZCAL_ERROR)) - return NPU3_PROC_COMPLETE | NPU3_PROC_FAILED; - - if (!npu3_phy_read(dev, &NPU3_PHY_TX_ZCAL_DONE)) - return NPU3_PROC_INPROGRESS; - - return NPU3_PROC_NEXT; -} - -#define MARGIN_RATIO 0 -#define FFE_PRE_COEFF 0 -#define FFE_POST_COEFF 0 - -#define PRE_WIDTH 5 -#define POST_WIDTH 7 -#define MAIN_WIDTH 7 -#define ZCAL_MIN (16 * 2) -#define ZCAL_MAX (33 * 2) -#define PRECURSOR_X2_MAX (4 * 2 + 1) -#define POSTCURSOR_X2_MAX (6 * 2 + 1) -#define MARGIN_X2_MAX (8 * 2) -#define MAIN_X2_MAX (6 * 2 + 1) -#define TOTAL_X2_MAX (PRECURSOR_X2_MAX + POSTCURSOR_X2_MAX + \ - 2 * MARGIN_X2_MAX + MAIN_X2_MAX) - -static uint32_t therm(uint32_t dec) -{ - return (0x1 << dec) - 1; -} - -static uint32_t therm_with_half(uint32_t dec, uint8_t width) -{ - /* If the LSB of the 2r equivalent is on, then we need to set the 2r bit (MSB) */ - uint32_t half_on = (dec & 0x1) << (width - 1); - - /* Shift the 2r equivalent to a 1r value and convert to a thermometer code. */ - uint32_t x1_equiv = ((1 << (dec >> 1)) - 1); - - /* Combine 1r equivalent thermometer code + the 2r MSB value. */ - return half_on | x1_equiv; -} - -static uint32_t phy_tx_zcal_calculate(struct npu3_dev *dev) -{ - int p_value, n_value; - uint32_t zcal_n; - uint32_t zcal_p; - uint32_t p_main_enable = MAIN_X2_MAX; - uint32_t p_margin_pu_enable = MARGIN_X2_MAX; - uint32_t p_margin_pd_enable = MARGIN_X2_MAX; - uint32_t p_precursor_select; - uint32_t p_postcursor_select; - uint32_t margin_pu_select; - uint32_t n_main_enable = MAIN_X2_MAX; - uint32_t n_margin_pu_enable = MARGIN_X2_MAX; - uint32_t n_margin_pd_enable = MARGIN_X2_MAX; - uint32_t n_precursor_select; - uint32_t n_postcursor_select; - uint32_t margin_pd_select; - uint32_t margin_select; - - /* Convert the value from 8R to 2R by / 4 */ - zcal_n = npu3_phy_read(dev, &NPU3_PHY_TX_ZCAL_N) / 4; - zcal_p = npu3_phy_read(dev, &NPU3_PHY_TX_ZCAL_P) / 4; - - /* - * Again, if the hardware detects an unexpected condition it's - * better just to fail loudly. 
- */ - if (zcal_n < ZCAL_MIN || zcal_n > ZCAL_MAX || - zcal_p < ZCAL_MIN || zcal_p > ZCAL_MAX) - return NPU3_PROC_COMPLETE | NPU3_PROC_FAILED; - - p_value = zcal_p - TOTAL_X2_MAX; - p_precursor_select = p_value * FFE_PRE_COEFF / 128; - p_postcursor_select = p_value * FFE_POST_COEFF / 128; - margin_pu_select = p_value * MARGIN_RATIO / 256; - - if (p_value % 2) { - p_main_enable--; - p_value++; - } - - while (p_value < 0) { - if (p_main_enable > 1) { - p_main_enable -= 2; - } else if (p_margin_pu_enable + p_margin_pd_enable > 0) { - if (p_margin_pu_enable == p_margin_pd_enable) - p_margin_pd_enable -= 2; - else - p_margin_pu_enable -= 2; - } - p_value += 2; - } - - n_value = zcal_n - TOTAL_X2_MAX; - n_precursor_select = n_value * FFE_PRE_COEFF / 128; - n_postcursor_select = n_value * FFE_POST_COEFF / 128; - margin_pd_select = p_value * MARGIN_RATIO / 256; - - if (n_value % 2) { - n_main_enable--; - n_value++; - } - - while (n_value < 0) { - if (n_main_enable > 1) { - n_main_enable -= 2; - } else if (n_margin_pu_enable + n_margin_pd_enable > 0) { - if (n_margin_pu_enable == n_margin_pd_enable) - n_margin_pd_enable -= 2; - else - n_margin_pu_enable -= 2; - } - n_value += 2; - } - - margin_select = therm((margin_pu_select + 1) / 2) & - therm((margin_pd_select + 1) / 2) & - therm((p_margin_pu_enable + 1) / 2) & - therm((p_margin_pd_enable + 1) / 2) & - therm((n_margin_pu_enable + 1) / 2) & - therm((n_margin_pd_enable + 1) / 2); - - npu3_phy_write(dev, &NPU3_PHY_TX_PSEG_PRE_EN, therm_with_half(PRECURSOR_X2_MAX, PRE_WIDTH)); - npu3_phy_write(dev, &NPU3_PHY_TX_PSEG_PRE_SELECT, therm_with_half(p_precursor_select, PRE_WIDTH)); - npu3_phy_write(dev, &NPU3_PHY_TX_PSEG_POST_EN, therm_with_half(POSTCURSOR_X2_MAX, POST_WIDTH)); - npu3_phy_write(dev, &NPU3_PHY_TX_PSEG_POST_SELECT, therm_with_half(p_postcursor_select, POST_WIDTH)); - npu3_phy_write(dev, &NPU3_PHY_TX_PSEG_MARGINPU_EN, therm((p_margin_pu_enable + 1) / 2)); - npu3_phy_write(dev, &NPU3_PHY_TX_PSEG_MARGINPD_EN, therm((p_margin_pd_enable + 1) / 2)); - npu3_phy_write(dev, &NPU3_PHY_TX_PSEG_MAIN_EN, therm_with_half(p_main_enable, MAIN_WIDTH)); - - npu3_phy_write(dev, &NPU3_PHY_TX_NSEG_PRE_EN, therm_with_half(PRECURSOR_X2_MAX, PRE_WIDTH)); - npu3_phy_write(dev, &NPU3_PHY_TX_NSEG_PRE_SELECT, therm_with_half(n_precursor_select, PRE_WIDTH)); - npu3_phy_write(dev, &NPU3_PHY_TX_NSEG_POST_EN, therm_with_half(POSTCURSOR_X2_MAX, POST_WIDTH)); - npu3_phy_write(dev, &NPU3_PHY_TX_NSEG_POST_SELECT, therm_with_half(n_postcursor_select, POST_WIDTH)); - npu3_phy_write(dev, &NPU3_PHY_TX_NSEG_MARGINPU_EN, therm((n_margin_pu_enable + 1) / 2)); - npu3_phy_write(dev, &NPU3_PHY_TX_NSEG_MARGINPD_EN, therm((n_margin_pd_enable + 1) / 2)); - npu3_phy_write(dev, &NPU3_PHY_TX_NSEG_MAIN_EN, therm_with_half(n_main_enable, MAIN_WIDTH)); - - npu3_phy_write(dev, &NPU3_PHY_TX_MARGINPU_SELECT, therm(margin_select + 1) / 2); - npu3_phy_write(dev, &NPU3_PHY_TX_MARGINPD_SELECT, therm(margin_select + 1) / 2); - - dev->npu->tx_zcal_complete = true; - - return NPU3_PROC_COMPLETE; -} - -DEFINE_PROCEDURE(phy_tx_zcal, phy_tx_zcal_wait, phy_tx_zcal_calculate); - -/* Procedure 1.2.4 - I/O PHY DC Calibration */ -static uint32_t phy_rx_dccal(struct npu3_dev *dev) -{ - int lane; - - set_iovalid(dev, false); - - npu3_for_each_lane(lane, dev) - npu3_phy_write_lane(dev, &NPU3_PHY_RX_PR_FW_OFF, lane, 1); - - npu3_for_each_lane(lane, dev) - npu3_phy_write_lane(dev, &NPU3_PHY_RX_RUN_DCCAL, lane, 1); - - return NPU3_PROC_NEXT; -} - -static uint32_t phy_rx_dccal_complete(struct npu3_dev *dev) -{ - int 
lane; - - npu3_for_each_lane(lane, dev) - if (!npu3_phy_read_lane(dev, &NPU3_PHY_RX_DCCAL_DONE, lane)) - return NPU3_PROC_INPROGRESS; - - npu3_for_each_lane(lane, dev) - npu3_phy_write_lane(dev, &NPU3_PHY_RX_RUN_DCCAL, lane, 0); - - npu3_for_each_lane(lane, dev) { - npu3_phy_write_lane(dev, &NPU3_PHY_RX_B_BANK_CONTROLS, lane, 0); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_PR_EDGE_TRACK_CNTL, lane, 0); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_PR_FW_OFF, lane, 0); - } - - return NPU3_PROC_NEXT; -} - -/* Procedure 1.2.5 - IO PHY Tx FIFO Init */ -static uint32_t phy_tx_fifo_init(struct npu3_dev *dev) -{ - int lane; - - npu3_for_each_lane(lane, dev) { - npu3_phy_write_lane(dev, &NPU3_PHY_TX_UNLOAD_CLK_DISABLE, lane, 0); - npu3_phy_write_lane(dev, &NPU3_PHY_TX_FIFO_INIT, lane, 1); - npu3_phy_write_lane(dev, &NPU3_PHY_TX_UNLOAD_CLK_DISABLE, lane, 1); - } - - set_iovalid(dev, true); - - return NPU3_PROC_COMPLETE; -} - -DEFINE_PROCEDURE(phy_rx_dccal, phy_rx_dccal_complete, phy_tx_fifo_init); - -/* Procedure 1.2.8 - Enable Downstream Link Training */ -static uint32_t phy_enable_tx_rxcal(struct npu3_dev *dev) -{ - int lane; - - npu3_for_each_lane(lane, dev) - npu3_phy_write_lane(dev, &NPU3_PHY_TX_RXCAL, lane, 1); - - return NPU3_PROC_COMPLETE; -} -DEFINE_PROCEDURE(phy_enable_tx_rxcal); - -/* Procedure 1.2.9 - Disable Downstream Link Training */ -static uint32_t phy_disable_tx_rxcal(struct npu3_dev *dev) -{ - int lane; - - npu3_for_each_lane(lane, dev) - npu3_phy_write_lane(dev, &NPU3_PHY_TX_RXCAL, lane, 0); - - return NPU3_PROC_COMPLETE; -} -DEFINE_PROCEDURE(phy_disable_tx_rxcal); - -/* Procedure 1.2.7 - I/O PHY Upstream Link Training */ -static uint32_t phy_rx_training(struct npu3_dev *dev) -{ - int lane; - - npu3_for_each_lane(lane, dev) - npu3_phy_write_lane(dev, &NPU3_PHY_RX_RUN_LANE, lane, 1); - - return NPU3_PROC_NEXT; -} - -static uint32_t phy_rx_training_wait(struct npu3_dev *dev) -{ - int lane; - - npu3_for_each_lane(lane, dev) - if (!npu3_phy_read_lane(dev, &NPU3_PHY_RX_INIT_DONE, lane)) - return NPU3_PROC_INPROGRESS; - - return NPU3_PROC_COMPLETE; -} - -DEFINE_PROCEDURE(phy_rx_training, phy_rx_training_wait); - -static void npu3_dev_fence_set(struct npu3_dev *dev, uint8_t state) -{ - struct npu3 *npu = dev->npu; - uint64_t val; - - val = npu3_read(npu, NPU3_NTL_MISC_CFG1(dev->index)); - val = SETFIELD(NPU3_NTL_MISC_CFG1_NTL_RESET, val, state); - npu3_write(npu, NPU3_NTL_MISC_CFG1(dev->index), val); -} - -static uint8_t npu3_dev_fence_get(struct npu3_dev *dev) -{ - uint64_t val; - - val = npu3_read(dev->npu, NPU3_NTL_CQ_FENCE_STATUS(dev->index)); - return GETFIELD(NPU3_NTL_CQ_FENCE_STATUS_FIELD, val); -} - -/* Procedure 1.2.1 - Reset NPU/NDL */ -static uint32_t reset_ntl(struct npu3_dev *dev) -{ - struct npu3 *npu = dev->npu; - uint64_t val; - int lane; - - set_iovalid(dev, true); - - /* Power on clocks */ - npu3_phy_write(dev, &NPU3_PHY_RX_CLKDIST_PDWN, 0); - npu3_phy_write(dev, &NPU3_PHY_RX_IREF_PDWN, 1); - npu3_phy_write(dev, &NPU3_PHY_TX_CLKDIST_PDWN, 0); - npu3_phy_write(dev, &NPU3_PHY_RX_CTL_DATASM_CLKDIST_PDWN, 0); - - npu3_for_each_lane(lane, dev) { - npu3_phy_write_lane(dev, &NPU3_PHY_RX_LANE_ANA_PDWN, lane, 0); - npu3_phy_write_lane(dev, &NPU3_PHY_RX_LANE_DIG_PDWN, lane, 0); - npu3_phy_write_lane(dev, &NPU3_PHY_TX_LANE_PDWN, lane, 0); - } - - /* Write PRI */ - val = SETFIELD(NPU3_NTL_PRI_CFG_NDL, 0ull, dev->index); - npu3_write(npu, NPU3_NTL_PRI_CFG(dev->index), val); - - /* Disable parity checking */ - val = npu3_read(npu, NPU3_NTL_MISC_CFG2(dev->index)); - val &= 
~(NPU3_NTL_MISC_CFG2_NDL_RX_PARITY_ENA | - NPU3_NTL_MISC_CFG2_NDL_TX_PARITY_ENA | - NPU3_NTL_MISC_CFG2_NDL_PRI_PARITY_ENA); - npu3_write(npu, NPU3_NTL_MISC_CFG2(dev->index), val); - - if (dev->type == NPU3_DEV_TYPE_NVLINK) - npu3_pvd_flag_clear(dev, NPU3_DEV_DL_RESET); - - npu3_dev_fence_set(dev, NPU3_NTL_CQ_FENCE_STATUS_FULL); - - return NPU3_PROC_NEXT; -} - -static uint32_t reset_ndl(struct npu3_dev *dev) -{ - struct npu3 *npu = dev->npu; - uint64_t reg; - uint32_t val32; - - if (npu3_dev_fence_get(dev) != NPU3_NTL_CQ_FENCE_STATUS_FULL) - return NPU3_PROC_INPROGRESS; - - reg = NPU3_DLPL_CTL(dev->index); - val32 = npu3_read_4b(npu, reg); - val32 |= NPU3_DLPL_CTL_RESET_RX | NPU3_DLPL_CTL_RESET_MISC; - npu3_write_4b(npu, reg, val32); - - val32 = npu3_read_4b(npu, reg); - val32 &= ~(NPU3_DLPL_CTL_RESET_RX | NPU3_DLPL_CTL_RESET_MISC); - npu3_write_4b(npu, reg, val32); - - reg = NPU3_DLPL_CFG(dev->index); - val32 = NPU3_DLPL_CFG_PRI_BYTESWAP; - npu3_write_4b(npu, reg, val32); - - /* Clear FIR bits */ - for (uint32_t i = 0; i < NPU3_FIR_MAX; i++) - xscom_write(npu->chip_id, npu->xscom_base + NPU3_FIR(i), 0ull); - - npu3_dev_fence_set(dev, NPU3_NTL_CQ_FENCE_STATUS_HALF); - - return NPU3_PROC_NEXT; -} - -static uint32_t reset_ntl_release(struct npu3_dev *dev) -{ - struct npu3 *npu = dev->npu; - uint32_t i = dev->index; - - if (npu3_dev_fence_get(dev) != NPU3_NTL_CQ_FENCE_STATUS_HALF) - return NPU3_PROC_INPROGRESS; - - /* Credit setup */ - npu3_write(npu, NPU3_NTL_CREQ_HDR_CRED_SND(i), 0x0200000000000000); - npu3_write(npu, NPU3_NTL_PRB_HDR_CRED_SND(i), 0x0200000000000000); - npu3_write(npu, NPU3_NTL_ATR_HDR_CRED_SND(i), 0x0200000000000000); - npu3_write(npu, NPU3_NTL_RSP_HDR_CRED_SND(i), 0x0200000000000000); - npu3_write(npu, NPU3_NTL_CREQ_DAT_CRED_SND(i), 0x1000000000000000); - npu3_write(npu, NPU3_NTL_RSP_DAT_CRED_SND(i), 0x1000000000000000); - - npu3_write(npu, NPU3_NTL_CREQ_HDR_CRED_RCV(i), 0x0000be0000000000); - npu3_write(npu, NPU3_NTL_DGD_HDR_CRED_RCV(i), 0x0000640000000000); - npu3_write(npu, NPU3_NTL_ATSD_HDR_CRED_RCV(i), 0x0000200000000000); - npu3_write(npu, NPU3_NTL_RSP_HDR_CRED_RCV(i), 0x0000be0000000000); - npu3_write(npu, NPU3_NTL_CREQ_DAT_CRED_RCV(i), 0x0001000000000000); - npu3_write(npu, NPU3_NTL_RSP_DAT_CRED_RCV(i), 0x0001000000000000); - - npu3_dev_fence_set(dev, NPU3_NTL_CQ_FENCE_STATUS_NONE); - - return NPU3_PROC_NEXT; -} - -static uint32_t reset_ntl_finish(struct npu3_dev *dev) { - struct npu3 *npu = dev->npu; - uint64_t val; - - if (npu3_dev_fence_get(dev) != NPU3_NTL_CQ_FENCE_STATUS_NONE) - return NPU3_PROC_INPROGRESS; - - /* Enable parity checking */ - val = npu3_read(npu, NPU3_NTL_MISC_CFG2(dev->index)); - val |= NPU3_NTL_MISC_CFG2_NDL_RX_PARITY_ENA | - NPU3_NTL_MISC_CFG2_NDL_TX_PARITY_ENA | - NPU3_NTL_MISC_CFG2_NDL_PRI_PARITY_ENA; - npu3_write(npu, NPU3_NTL_MISC_CFG2(dev->index), val); - - if (dev->type == NPU3_DEV_TYPE_NVLINK) - npu3_pvd_flag_set(dev, NPU3_DEV_DL_RESET); - - return NPU3_PROC_COMPLETE; -} - -DEFINE_PROCEDURE(reset_ntl, reset_ndl, reset_ntl_release, reset_ntl_finish); - -static int npu3_dev_regcmp(struct npu3_dev *dev, uint64_t reg, - const char *reg_name, uint64_t expected) -{ - uint64_t val; - - val = npu3_read(dev->npu, reg); - if (val == expected) - return 0; - - NPU3DEVERR(dev, "%s: expected 0x%llx, read 0x%llx\n", - reg_name, expected, val); - - return 1; -} - -#define REGCMP(reg, expected) \ - npu3_dev_regcmp(dev, reg(dev->index), #reg, expected) - -static uint32_t check_credits(struct npu3_dev *dev) -{ - /* Use bitwise OR to prevent 
short-circuit evaluation */ - if (REGCMP(NPU3_NTL_CREQ_HDR_CRED_RCV, 0x0be0be0000000000ull) | - REGCMP(NPU3_NTL_DGD_HDR_CRED_RCV, 0x0640640000000000ull) | - REGCMP(NPU3_NTL_ATSD_HDR_CRED_RCV, 0x0200200000000000ull) | - REGCMP(NPU3_NTL_RSP_HDR_CRED_RCV, 0x0be0be0000000000ull) | - REGCMP(NPU3_NTL_CREQ_DAT_CRED_RCV, 0x1001000000000000ull) | - REGCMP(NPU3_NTL_RSP_DAT_CRED_RCV, 0x1001000000000000ull)) - return NPU3_PROC_COMPLETE | NPU3_PROC_FAILED; - - return NPU3_PROC_COMPLETE; -} - -DEFINE_PROCEDURE(check_credits); - -static struct procedure *procedures[] = { - [0] = &procedure_stop, - [1] = &procedure_nop, - [4] = &procedure_phy_reset, - [5] = &procedure_phy_tx_zcal, - [6] = &procedure_phy_rx_dccal, - [7] = &procedure_phy_enable_tx_rxcal, - [8] = &procedure_phy_disable_tx_rxcal, - [9] = &procedure_phy_rx_training, - [10] = &procedure_reset_ntl, - [11] = &procedure_nop, /* Placeholder for pre-terminate */ - [12] = &procedure_nop, /* Placeholder for terminate */ - [13] = &procedure_check_credits, -}; - -void npu3_dev_procedure_init(struct npu3_dev *dev, uint32_t pnum) -{ - struct npu3_procedure *proc = &dev->proc; - const char *name; - - if (pnum >= ARRAY_SIZE(procedures) || !procedures[pnum]) { - NPU3DEVERR(dev, "Unsupported procedure number %d\n", pnum); - proc->status = NPU3_PROC_COMPLETE | NPU3_PROC_UNSUPPORTED; - return; - } - - name = procedures[pnum]->name; - - if (proc->number == pnum && !(proc->status & NPU3_PROC_COMPLETE)) - NPU3DEVINF(dev, "Restarting procedure %s\n", name); - else - NPU3DEVINF(dev, "Starting procedure %s\n", name); - - proc->status = NPU3_PROC_INPROGRESS; - proc->number = pnum; - proc->step = 0; - proc->timeout = mftb() + msecs_to_tb(1000); -} - -static uint32_t npu3_dev_procedure_run_step(struct npu3_dev *dev) -{ - struct npu3_procedure *proc = &dev->proc; - uint32_t result; - - result = procedures[proc->number]->steps[proc->step](dev); - if (result & NPU3_PROC_NEXT) { - proc->step++; - - NPU3DEVINF(dev, "Running procedure %s step %d\n", - procedures[proc->number]->name, proc->step); - } - - return result; -} - -static void npu3_dev_procedure_run(struct npu3_dev *dev) -{ - struct npu3_procedure *proc = &dev->proc; - const char *name; - uint32_t result; - - do { - result = npu3_dev_procedure_run_step(dev); - } while (result & NPU3_PROC_NEXT); - - name = procedures[proc->number]->name; - - if (result & NPU3_PROC_COMPLETE) { - NPU3DEVINF(dev, "Procedure %s complete\n", name); - } else if (tb_compare(mftb(), proc->timeout) == TB_AAFTERB) { - NPU3DEVINF(dev, "Procedure %s timed out\n", name); - result = NPU3_PROC_COMPLETE | NPU3_PROC_FAILED; - } - - /* Mask off internal state bits */ - proc->status = result & NPU3_PROC_STATUS_MASK; -} - -uint32_t npu3_dev_procedure_status(struct npu3_dev *dev) -{ - /* Run the procedure if not already complete */ - if (!(dev->proc.status & NPU3_PROC_COMPLETE)) - npu3_dev_procedure_run(dev); - - return dev->proc.status; -} - -int64_t npu3_dev_reset(struct npu3_dev *dev) -{ - unsigned long timeout; - - reset_ntl(dev); - timeout = mftb() + msecs_to_tb(1000); - - while (npu3_dev_fence_get(dev) != NPU3_NTL_CQ_FENCE_STATUS_FULL) { - if (tb_compare(mftb(), timeout) == TB_AAFTERB) { - NPU3DEVINF(dev, "Device reset timed out\n"); - return OPAL_BUSY; - } - } - - return OPAL_SUCCESS; -} diff --git a/hw/npu3-nvlink.c b/hw/npu3-nvlink.c deleted file mode 100644 index 920864b3..00000000 --- a/hw/npu3-nvlink.c +++ /dev/null @@ -1,1828 +0,0 @@ -// SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later -/* - * Copyright 2019 IBM Corp. 
- */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#define NPU3LOG(l, npu, fmt, a...) \ - prlog(l, "NPU#%04x[%d:%d]: " fmt, \ - (npu)->nvlink.phb.opal_id, \ - (npu)->chip_id, \ - (npu)->index, ##a) -#define NPU3DBG(npu, fmt, a...) NPU3LOG(PR_DEBUG, npu, fmt, ##a) -#define NPU3INF(npu, fmt, a...) NPU3LOG(PR_INFO, npu, fmt, ##a) -#define NPU3ERR(npu, fmt, a...) NPU3LOG(PR_ERR, npu, fmt, ##a) - -#define NPU3DEVLOG(l, dev, fmt, a...) \ - prlog(l, "NPU#%04x:%02x:%02x.%x " fmt, \ - (dev)->npu->nvlink.phb.opal_id, \ - PCI_BUS_NUM((dev)->nvlink.pvd->bdfn), \ - PCI_DEV((dev)->nvlink.pvd->bdfn), \ - PCI_FUNC((dev)->nvlink.pvd->bdfn), ##a) -#define NPU3DEVDBG(dev, fmt, a...) NPU3DEVLOG(PR_DEBUG, dev, fmt, ##a) -#define NPU3DEVINF(dev, fmt, a...) NPU3DEVLOG(PR_INFO, dev, fmt, ##a) -#define NPU3DEVERR(dev, fmt, a...) NPU3DEVLOG(PR_ERR, dev, fmt, ##a) - -#define NPU3_CFG_READ(size, type) \ -static int64_t npu3_cfg_read##size(struct phb *phb, uint32_t bdfn, \ - uint32_t offset, type *data) \ -{ \ - uint32_t val; \ - int64_t ret; \ - \ - ret = pci_virt_cfg_read(phb, bdfn, offset, \ - sizeof(*data), &val); \ - *data = (type)val; \ - return ret; \ -} - -#define NPU3_CFG_WRITE(size, type) \ -static int64_t npu3_cfg_write##size(struct phb *phb, uint32_t bdfn, \ - uint32_t offset, type data) \ -{ \ - uint32_t val = data; \ - int64_t ret; \ - \ - ret = pci_virt_cfg_write(phb, bdfn, offset, \ - sizeof(data), val); \ - return ret; \ -} - -NPU3_CFG_READ(8, u8); -NPU3_CFG_READ(16, u16); -NPU3_CFG_READ(32, u32); -NPU3_CFG_WRITE(8, u8); -NPU3_CFG_WRITE(16, u16); -NPU3_CFG_WRITE(32, u32); - -static int64_t npu3_eeh_freeze_status(struct phb *phb __unused, - uint64_t pe_num __unused, - uint8_t *freeze_state, - uint16_t *pci_error_type, - uint16_t *severity) -{ - /* - * FIXME: When it's called by the skiboot PCI config accessor, - * the PE number is fixed to 0, which is incorrect. We need to - * introduce another PHB callback to translate it. For now, - * it keeps the skiboot PCI enumeration going.
- */ - *freeze_state = OPAL_EEH_STOPPED_NOT_FROZEN; - *pci_error_type = OPAL_EEH_NO_ERROR; - - if (severity) - *severity = OPAL_EEH_SEV_NO_ERROR; - - return OPAL_SUCCESS; -} - -/* Number of PEs supported */ -#define NPU3_MAX_PE_NUM 16 -#define NPU3_RESERVED_PE_NUM 15 - -static int64_t npu3_ioda_reset(struct phb *phb, bool purge __unused) -{ - struct npu3 *npu = npu3_phb_to_npu(phb); - uint64_t val; - - val = NPU3_ATS_IODA_ADDR_AUTO_INC; - val = SETFIELD(NPU3_ATS_IODA_ADDR_TBL_SEL, val, - NPU3_ATS_IODA_ADDR_TBL_TVT); - npu3_write(npu, NPU3_ATS_IODA_ADDR, val); - - for (uint32_t i = 0; i < NPU3_MAX_PE_NUM; i++) - npu3_write(npu, NPU3_ATS_IODA_DATA, 0ull); - - return OPAL_SUCCESS; -} - -static inline void npu3_ioda_sel(struct npu3 *npu, uint32_t table, - uint32_t index) -{ - uint64_t val; - - val = SETFIELD(NPU3_ATS_IODA_ADDR_TBL_SEL, 0ull, table); - val = SETFIELD(NPU3_ATS_IODA_ADDR_TBL_ADDR, val, index); - npu3_write(npu, NPU3_ATS_IODA_ADDR, val); -} - -static int64_t npu3_map_pe_dma_window(struct phb *phb, - uint64_t pe_num, - uint16_t window_id, - uint16_t tce_levels, - uint64_t tce_table_addr, - uint64_t tce_table_size, - uint64_t tce_page_size) -{ - struct npu3 *npu = npu3_phb_to_npu(phb); - uint64_t tts_encoded, val; - uint32_t page_size; - - /* Each PE has one corresponding TVE */ - if (window_id != pe_num || pe_num >= NPU3_MAX_PE_NUM) - return OPAL_PARAMETER; - - npu3_ioda_sel(npu, NPU3_ATS_IODA_ADDR_TBL_TVT, pe_num); - - /* TCE table size zero is used to disable the TVE */ - if (!tce_table_size) { - npu3_write(npu, NPU3_ATS_IODA_DATA, 0ull); - return OPAL_SUCCESS; - } - - /* TCE table size */ - if (!is_pow2(tce_table_size) || tce_table_size < 0x1000) - return OPAL_PARAMETER; - - tts_encoded = ilog2(tce_table_size) - 11; - if (tts_encoded > 39) - return OPAL_PARAMETER; - - val = SETFIELD(NPU3_ATS_IODA_TVT_TABLE_SIZE, 0ull, tts_encoded); - - /* Number of levels */ - if (tce_levels < 1 || tce_levels > 4) - return OPAL_PARAMETER; - - val = SETFIELD(NPU3_ATS_IODA_TVT_TABLE_LEVEL, val, tce_levels - 1); - - /* TCE page size */ - switch (tce_page_size) { - case 256 << 20: - page_size = 17; - break; - case 16 << 20: - page_size = 13; - break; - case 64 << 10: - page_size = 5; - break; - default: - page_size = 1; - } - - val = SETFIELD(NPU3_ATS_IODA_TVT_PAGE_SIZE, val, page_size); - val = SETFIELD(NPU3_ATS_IODA_TVT_XLAT_ADDR, val, tce_table_addr >> 12); - npu3_write(npu, NPU3_ATS_IODA_DATA, val); - - return OPAL_SUCCESS; -} - -static int64_t npu3_map_pe_dma_window_real(struct phb *phb, - uint64_t pe_num, - uint16_t window_id, - uint64_t pci_start_addr __unused, - uint64_t pci_mem_size __unused) -{ - struct npu3 *npu = npu3_phb_to_npu(phb); - uint64_t val; - - /* Each PE has one corresponding TVE */ - if (window_id != pe_num || pe_num >= NPU3_MAX_PE_NUM) - return OPAL_PARAMETER; - - if (pci_mem_size) { - /* - * GPUs need to be able to access the MMIO memory space as well. - * On POWER9 this is above the top of RAM, so disable the TVT - * range check, allowing access to all memory addresses. 
- */ - val = 0; - } else { - /* Disable */ - val = PPC_BIT(51); - } - - npu3_ioda_sel(npu, NPU3_ATS_IODA_ADDR_TBL_TVT, pe_num); - npu3_write(npu, NPU3_ATS_IODA_DATA, val); - - return OPAL_SUCCESS; -} - -static int64_t npu3_next_error(struct phb *phb, - uint64_t *first_frozen_pe, - uint16_t *pci_error_type, - uint16_t *severity) -{ - struct npu3 *npu = npu3_phb_to_npu(phb); - uint64_t val; - uint32_t pe_num; - - if (!first_frozen_pe || !pci_error_type || !severity) - return OPAL_PARAMETER; - - *first_frozen_pe = -1; - *pci_error_type = OPAL_EEH_NO_ERROR; - *severity = OPAL_EEH_SEV_NO_ERROR; - - for (pe_num = 0; pe_num < NPU3_MAX_PE_NUM; pe_num++) { - val = npu3_read(npu, NPU3_MISC_PESTB_DATA(pe_num)); - if (!GETFIELD(NPU3_MISC_PESTB_DATA_DMA_STOPPED_STATE, val)) - continue; - - *first_frozen_pe = pe_num; - *pci_error_type = OPAL_EEH_PE_ERROR; - *severity = OPAL_EEH_SEV_PE_ER; - break; - } - - return OPAL_SUCCESS; -} - -static struct npu3_dev *npu3_bdfn_to_dev(struct npu3 *npu, uint32_t bdfn) -{ - struct pci_virt_device *pvd; - - /* All emulated devices are attached to root bus */ - if (bdfn & ~0xff) - return NULL; - - pvd = pci_virt_find_device(&npu->nvlink.phb, bdfn); - if (pvd) - return pvd->data; - - return NULL; -} - -static int npu3_match_gpu(struct phb *phb __unused, struct pci_device *pd, - void *data) -{ - const char *slot = data; - struct dt_node *dn; - char *loc_code; - - /* Ignore non-NVIDIA devices */ - if (PCI_VENDOR_ID(pd->vdid) != 0x10de) - return 0; - - /* Find the PCI device's slot location */ - for (dn = pd->dn; - dn && !dt_find_property(dn, "ibm,loc-code"); - dn = dn->parent); - - if (!dn) - return 0; - - loc_code = (char *)dt_prop_get(dn, "ibm,loc-code"); - if (streq(loc_code, slot)) - return 1; - - return 0; -} - -static void npu3_dev_find_gpu(struct npu3_dev *dev) -{ - const char *slot = dev->nvlink.loc_code; - struct phb *phb; - struct pci_device *gpu; - - if (!slot) - return; - - for_each_phb(phb) { - gpu = pci_walk_dev(phb, NULL, npu3_match_gpu, (void *)slot); - if (!gpu) - continue; - - dev->nvlink.gpu = gpu; - return; - } - - NPU3DEVINF(dev, "No PCI device found for slot '%s'\n", slot); -} - -#define VENDOR_CAP_START 0x80 -#define VENDOR_CAP_LINK_FLAG_OFFSET 0x0d - -void npu3_pvd_flag_set(struct npu3_dev *dev, uint8_t flag) -{ - uint32_t offset = VENDOR_CAP_START + VENDOR_CAP_LINK_FLAG_OFFSET; - uint32_t flags; - - PCI_VIRT_CFG_RDONLY_RD(dev->nvlink.pvd, offset, 1, &flags); - flags |= flag; - PCI_VIRT_CFG_INIT_RO(dev->nvlink.pvd, offset, 1, flags); -} - -void npu3_pvd_flag_clear(struct npu3_dev *dev, uint8_t flag) -{ - uint32_t offset = VENDOR_CAP_START + VENDOR_CAP_LINK_FLAG_OFFSET; - uint32_t flags; - - PCI_VIRT_CFG_RDONLY_RD(dev->nvlink.pvd, offset, 1, &flags); - flags &= ~flag; - PCI_VIRT_CFG_INIT_RO(dev->nvlink.pvd, offset, 1, flags); -} - -static struct lock npu3_phandle_lock = LOCK_UNLOCKED; - -static void npu3_append_phandle(struct dt_node *dn, const char *name, - uint32_t phandle) -{ - struct dt_property *prop; - uint32_t *phandles; - size_t len; - - prop = __dt_find_property(dn, name); - if (!prop) { - dt_add_property_cells(dn, name, phandle); - return; - } - - /* - * Make sure no one else has a reference to the property. Assume - * this is the only function that holds a reference to it. 
- */ - lock(&npu3_phandle_lock); - - /* Need to append to the property */ - len = prop->len + sizeof(*phandles); - dt_resize_property(&prop, len); - - phandles = (uint32_t *)prop->prop; - phandles[len / sizeof(*phandles) - 1] = phandle; - - unlock(&npu3_phandle_lock); -} - -static void npu3_dev_fixup_dt(struct npu3_dev *dev) -{ - struct pci_device *pd = dev->nvlink.pd; - struct pci_device *gpu = dev->nvlink.gpu; - - dt_add_property_cells(pd->dn, "ibm,nvlink", dev->dn->phandle); - dt_add_property_string(pd->dn, "ibm,loc-code", dev->nvlink.loc_code); - if (dev->link_speed != 0xff) - dt_add_property_cells(pd->dn, "ibm,nvlink-speed", - lo32(dev->link_speed)); - - if (!gpu) - return; - - npu3_append_phandle(gpu->dn, "ibm,npu", pd->dn->phandle); - dt_add_property_cells(pd->dn, "ibm,gpu", gpu->dn->phandle); -} - -static int64_t npu3_gpu_bridge_sec_bus_reset(void *pdev, - struct pci_cfg_reg_filter *pcrf __unused, - uint32_t offset, uint32_t len, - uint32_t *data, bool write) -{ - struct pci_device *pd = pdev; - struct pci_device *gpu; - struct npu3 *npu; - struct npu3_dev *dev; - bool purge = false; - - if (!write) - return OPAL_PARAMETER; - - if (len != 2 || offset & 1) { - PCIERR(pd->phb, pd->bdfn, - "Unsupported write to bridge control register\n"); - return OPAL_PARAMETER; - } - - if (!(*data & PCI_CFG_BRCTL_SECONDARY_RESET)) - return OPAL_PARTIAL; - - gpu = list_top(&pd->children, struct pci_device, link); - if (!gpu) - return OPAL_PARTIAL; - - npu3_for_each_nvlink_npu(npu) - npu3_for_each_nvlink_dev(dev, npu) - if (dev->nvlink.gpu == gpu) - if (!npu3_dev_reset(dev)) - purge = true; - - if (purge) - purge_l2_l3_caches(); - - return OPAL_PARTIAL; -} - -static int npu3_dev_bind(struct phb *phb, struct pci_device *pd, - void *data __unused) -{ - struct npu3 *npu = npu3_phb_to_npu(phb); - struct npu3_dev *dev = npu3_bdfn_to_dev(npu, pd->bdfn); - struct pci_device *gpu; - - dev->nvlink.pd = pd; - - /* The slot label indicates which GPU this link is connected to */ - dev->nvlink.loc_code = dt_prop_get_def(dev->dn, "ibm,slot-label", NULL); - if (!dev->nvlink.loc_code) { - /** - * @fwts-label NPUNoPHBSlotLabel - * @fwts-advice No GPU/NPU slot information was found. - * NVLink3 functionality will not work. 
- */ - NPU3DEVERR(dev, "Cannot find GPU slot information\n"); - } - - npu3_dev_find_gpu(dev); - npu3_dev_fixup_dt(dev); - - gpu = dev->nvlink.gpu; - if (!gpu) - return 0; - - /* When a GPU is reset, ensure all of its links are reset too */ - if (gpu->parent && gpu->parent->slot) - pci_add_cfg_reg_filter(gpu->parent, PCI_CFG_BRCTL, 2, - PCI_REG_FLAG_WRITE, - npu3_gpu_bridge_sec_bus_reset); - - npu3_pvd_flag_set(dev, NPU3_DEV_PCI_LINKED); - - return 0; -} - -struct npu3 *npu3_next_nvlink_npu(struct npu3 *npu, uint32_t chip_id) -{ - uint64_t phb_id = 0; - struct phb *phb; - - if (npu) - phb_id = npu->nvlink.phb.opal_id + 1; - - for (; (phb = __pci_next_phb_idx(&phb_id));) { - if (phb->phb_type != phb_type_npu_v3) - continue; - - npu = npu3_phb_to_npu(phb); - if (npu->chip_id == chip_id || chip_id == NPU3_ANY_CHIP) - return npu; - } - - return NULL; -} - -static struct npu3 *npu3_last_npu(void) -{ - static struct npu3 *last = NULL; - struct npu3 *npu; - - if (last) - return last; - - npu3_for_each_nvlink_npu(npu) - last = npu; - - return last; -} - -static uint32_t npu3_gpu_links(struct pci_device *gpu) -{ - const struct dt_property *prop; - - if (!gpu) - return 0; - - /* The link count is the number of phandles in "ibm,npu" */ - prop = dt_find_property(gpu->dn, "ibm,npu"); - if (!prop) - return 0; - - return prop->len / sizeof(uint32_t); -} - -static uint32_t npu3_links_per_gpu(void) -{ - struct npu3 *npu; - struct npu3_dev *dev; - uint32_t links = 0; - - /* Use the first GPU we find to figure this out */ - npu3_for_each_nvlink_npu(npu) { - npu3_for_each_nvlink_dev(dev, npu) { - links = npu3_gpu_links(dev->nvlink.gpu); - if (links) - goto out; - } - } - -out: - prlog(PR_DEBUG, "NPU: %s: %d\n", __func__, links); - - return links; -} - -int32_t npu3_dev_gpu_index(struct npu3_dev *dev) -{ - const char *slot; - char *p = NULL; - int ret; - - slot = dev->nvlink.loc_code; - if (!slot) - return -1; - - if (memcmp(slot, "GPU", 3)) - return -1; - - ret = strtol(slot + 3, &p, 10); - if (*p || p == slot + 3) - return -1; - - return ret; -} - -static uint32_t npu3_chip_possible_gpu_links(void) -{ - struct proc_chip *chip; - struct npu3 *npu; - struct npu3_dev *dev; - uint32_t possible = 0; - - for_each_chip(chip) { - npu3_for_each_chip_nvlink_npu(npu, chip->id) - npu3_for_each_nvlink_dev(dev, npu) - if (npu3_dev_gpu_index(dev) != -1) - possible++; - - if (possible) - break; - } - - prlog(PR_DEBUG, "NPU: %s: %d\n", __func__, possible); - - return possible; -} - -uint32_t npu3_chip_possible_gpus(void) -{ - static uint32_t possible = -1; - uint32_t links_per_gpu; - - /* Static value, same for all chips; only do this once */ - if (possible != -1) - return possible; - - possible = 0; - - links_per_gpu = npu3_links_per_gpu(); - if (links_per_gpu) - possible = npu3_chip_possible_gpu_links() / links_per_gpu; - - prlog(PR_DEBUG, "NPU: %s: %d\n", __func__, possible); - - return possible; -} - -static void npu3_dev_assign_gmb(struct npu3_dev *dev, uint64_t addr, - uint64_t size) -{ - uint32_t mode; - uint64_t val; - - switch (npu3_gpu_links(dev->nvlink.gpu)) { - case 0: - return; - case 1: - mode = 0; - break; - case 2: - mode = 1; - break; - case 3: - mode = 3; - break; - case 4: - mode = 6; - break; - case 6: - mode = 10; - break; - default: - /* Hardware does not support this configuration */ - assert(0); - } - - mode += PCI_FUNC(dev->nvlink.pvd->bdfn); - - val = NPU3_GPU_MEM_BAR_ENABLE | - NPU3_GPU_MEM_BAR_POISON; - val = SETFIELD(NPU3_GPU_MEM_BAR_ADDR, val, addr >> 30); - val = SETFIELD(NPU3_GPU_MEM_BAR_SIZE, 
val, size >> 30); - val = SETFIELD(NPU3_GPU_MEM_BAR_MODE, val, mode); - - npu3_write(dev->npu, NPU3_GPU_MEM_BAR(dev->index), val); -} - -static struct dt_node *npu3_create_memory_dn(struct npu3_dev *dev, - uint32_t gpu_index, uint64_t addr, - uint64_t size) -{ - uint32_t nid = 255 - gpu_index; - struct dt_node *mem; - - mem = dt_find_by_name_addr(dt_root, "memory", addr); - if (mem) - return mem; - - mem = dt_new_addr(dt_root, "memory", addr); - assert(mem); - - dt_add_property_string(mem, "device_type", "memory"); - dt_add_property_string(mem, "compatible", "ibm,coherent-device-memory"); - dt_add_property_u64s(mem, "reg", addr, size); - dt_add_property_u64s(mem, "linux,usable-memory", addr, 0); - dt_add_property_cells(mem, "ibm,chip-id", nid); - dt_add_property_cells(mem, "ibm,associativity", 4, nid, nid, nid, nid); - - NPU3INF(dev->npu, "%s mem: 0x%016llx (nid %d)\n", dev->nvlink.loc_code, - addr, nid); - - return mem; -} - -static void npu3_dev_init_gpu_mem(struct npu3_dev *dev) -{ - struct pci_device *pd = dev->nvlink.pd; - struct npu3 *npu = dev->npu; - struct dt_node *mem; - uint64_t addr, size, gta; - uint32_t gpu_index; - - if (!dev->nvlink.gpu) - return; - - gpu_index = npu3_dev_gpu_index(dev) % npu3_chip_possible_gpus(); - phys_map_get(npu->chip_id, GPU_MEM_4T_DOWN, gpu_index, &addr, &size); - - npu3_dev_assign_gmb(dev, addr, size); - mem = npu3_create_memory_dn(dev, gpu_index, addr, size); - - /* - * Coral mode address compression. This is documented in Figure 3.5 of - * the NPU workbook; "P9->GPU RA Compression (Coral)". - */ - gta = (addr >> 42 & 0x1) << 42; - gta |= (addr >> 45 & 0x3) << 43; - gta |= (addr >> 49 & 0x3) << 45; - gta |= addr & ((1ul << 43) - 1); - - dt_add_property_cells(pd->dn, "memory-region", mem->phandle); - dt_add_property_u64s(pd->dn, "ibm,device-tgt-addr", gta); -} - -static void npu3_final_fixup(void) -{ - struct npu3 *npu; - struct npu3_dev *dev; - - npu3_for_each_nvlink_npu(npu) - npu3_for_each_nvlink_dev(dev, npu) - npu3_dev_init_gpu_mem(dev); -} - -static void npu3_phb_final_fixup(struct phb *phb) -{ - struct npu3 *npu = npu3_phb_to_npu(phb); - - pci_walk_dev(phb, NULL, npu3_dev_bind, NULL); - - /* - * After every npu's devices are bound, do gpu-related fixup. This - * counts on npu3_last_npu() walking the phbs in the same order as - * the PHB final fixup loop in __pci_init_slots(). 
- */ - if (npu == npu3_last_npu()) - npu3_final_fixup(); -} - -static int64_t npu3_set_pe(struct phb *phb, - uint64_t pe_num, - uint64_t bdfn, - uint8_t bcompare, - uint8_t dcompare, - uint8_t fcompare, - uint8_t action) -{ - struct npu3 *npu = npu3_phb_to_npu(phb); - struct npu3_dev *dev; - uint64_t val; - - dev = npu3_bdfn_to_dev(npu, bdfn); - if (!dev) - return OPAL_PARAMETER; - - if (action != OPAL_MAP_PE && action != OPAL_UNMAP_PE) - return OPAL_PARAMETER; - - if (pe_num >= NPU3_MAX_PE_NUM) - return OPAL_PARAMETER; - - if (bcompare != OpalPciBusAll || - dcompare != OPAL_COMPARE_RID_DEVICE_NUMBER || - fcompare != OPAL_COMPARE_RID_FUNCTION_NUMBER) - return OPAL_UNSUPPORTED; - - if (!dev->nvlink.gpu) - return OPAL_SUCCESS; - - val = NPU3_CTL_BDF2PE_CFG_ENABLE; - val = SETFIELD(NPU3_CTL_BDF2PE_CFG_PE, val, pe_num); - val = SETFIELD(NPU3_CTL_BDF2PE_CFG_BDF, val, dev->nvlink.gpu->bdfn); - npu3_write(npu, NPU3_CTL_BDF2PE_CFG(pe_num), val); - - val = NPU3_MISC_BDF2PE_CFG_ENABLE; - val = SETFIELD(NPU3_MISC_BDF2PE_CFG_PE, val, pe_num); - val = SETFIELD(NPU3_MISC_BDF2PE_CFG_BDF, val, dev->nvlink.gpu->bdfn); - npu3_write(npu, NPU3_MISC_BDF2PE_CFG(pe_num), val); - - return OPAL_SUCCESS; -} - -static int64_t npu3_tce_kill_pages(struct npu3 *npu, - uint64_t pe_num, - uint32_t tce_size, - uint64_t dma_addr, - uint32_t npages) -{ - uint32_t check_tce_size; - uint64_t val; - - if (pe_num >= NPU3_MAX_PE_NUM) - return OPAL_PARAMETER; - - npu3_ioda_sel(npu, NPU3_ATS_IODA_ADDR_TBL_TVT, pe_num); - val = npu3_read(npu, NPU3_ATS_IODA_DATA); - - check_tce_size = 0x800 << GETFIELD(NPU3_ATS_IODA_TVT_PAGE_SIZE, val); - if (check_tce_size != tce_size) { - NPU3ERR(npu, "%s: Unexpected TCE size (got 0x%x, expected 0x%x)\n", - __func__, tce_size, check_tce_size); - - return OPAL_PARAMETER; - } - - val = NPU3_ATS_TCE_KILL_ONE; - val = SETFIELD(NPU3_ATS_TCE_KILL_PE_NUMBER, val, pe_num); - - while (npages--) { - val = SETFIELD(NPU3_ATS_TCE_KILL_ADDRESS, val, dma_addr >> 12); - npu3_write(npu, NPU3_ATS_TCE_KILL, val); - - dma_addr += tce_size; - } - - return OPAL_SUCCESS; -} - -static int64_t npu3_tce_kill(struct phb *phb, - uint32_t kill_type, - uint64_t pe_num, - uint32_t tce_size, - uint64_t dma_addr, - uint32_t npages) -{ - struct npu3 *npu = npu3_phb_to_npu(phb); - - sync(); - - switch(kill_type) { - case OPAL_PCI_TCE_KILL_PAGES: - return npu3_tce_kill_pages(npu, pe_num, tce_size, - dma_addr, npages); - case OPAL_PCI_TCE_KILL_PE: - /* - * NPU doesn't support killing a PE so fall through - * and do a kill all instead. 
- */ - case OPAL_PCI_TCE_KILL_ALL: - npu3_write(npu, NPU3_ATS_TCE_KILL, NPU3_ATS_TCE_KILL_ALL); - return OPAL_SUCCESS; - } - - return OPAL_PARAMETER; -} - -static const struct phb_ops npu_ops = { - .cfg_read8 = npu3_cfg_read8, - .cfg_read16 = npu3_cfg_read16, - .cfg_read32 = npu3_cfg_read32, - .cfg_write8 = npu3_cfg_write8, - .cfg_write16 = npu3_cfg_write16, - .cfg_write32 = npu3_cfg_write32, - .eeh_freeze_status = npu3_eeh_freeze_status, - .ioda_reset = npu3_ioda_reset, - .map_pe_dma_window = npu3_map_pe_dma_window, - .map_pe_dma_window_real = npu3_map_pe_dma_window_real, - .next_error = npu3_next_error, - .phb_final_fixup = npu3_phb_final_fixup, - .set_pe = npu3_set_pe, - .tce_kill = npu3_tce_kill, -}; - -static int64_t npu3_reset(struct pci_slot *slot) -{ - struct npu3 *npu = npu3_phb_to_npu(slot->phb); - struct npu3_dev *dev; - int64_t rc = OPAL_SUCCESS; - bool purge = false; - - npu3_for_each_nvlink_dev(dev, npu) { - rc = npu3_dev_reset(dev); - if (rc) - break; - - purge = true; - } - - /* No devices reset; don't purge, just return */ - if (!purge) - return rc; - - /* All devices reset */ - if (!rc) - return purge_l2_l3_caches(); - - /* Some devices successfully reset; purge, but still return error */ - purge_l2_l3_caches(); - return rc; -} - -static int64_t npu3_freset(struct pci_slot *slot __unused) -{ - return OPAL_SUCCESS; -} - -static int64_t npu3_get_link_state(struct pci_slot *slot __unused, - uint8_t *val) -{ - *val = OPAL_SHPC_LINK_UP_x1; - return OPAL_SUCCESS; -} - -static int64_t npu3_get_power_state(struct pci_slot *slot __unused, - uint8_t *val) -{ - *val = PCI_SLOT_POWER_ON; - return OPAL_SUCCESS; -} - -static void npu3_create_phb_slot(struct npu3 *npu) -{ - struct pci_slot *slot; - - slot = pci_slot_alloc(&npu->nvlink.phb, NULL); - if (!slot) - return; - - /* Elementary functions */ - slot->ops.creset = npu3_reset; - slot->ops.freset = npu3_freset; - slot->ops.hreset = npu3_reset; - slot->ops.get_link_state = npu3_get_link_state; - slot->ops.get_power_state = npu3_get_power_state; -} - -static void npu3_create_phb(struct npu3 *npu) -{ - struct phb *phb = &npu->nvlink.phb; - - phb->phb_type = phb_type_npu_v3; - phb->ops = &npu_ops; - phb->dt_node = dt_new_addr(dt_root, "pciex", npu->regs[0]); - assert(phb->dt_node); - - list_head_init(&phb->virt_devices); - pci_register_phb(phb, npu3_get_opal_id(npu->chip_id, - npu3_get_phb_index(npu->index))); - npu3_create_phb_slot(npu); - npu3_ioda_reset(phb, true); -} - -static void npu3_dev_init_hw(struct npu3_dev *dev) -{ - struct npu3 *npu = dev->npu; - uint64_t reg, val; - - reg = NPU3_RELAXED_CFG2(dev->index); - val = npu3_read(npu, reg); - val |= NPU3_RELAXED_CFG2_CMD_CL_DMA_W | - NPU3_RELAXED_CFG2_CMD_CL_DMA_W_HP | - NPU3_RELAXED_CFG2_CMD_CL_DMA_INJ | - NPU3_RELAXED_CFG2_CMD_PR_DMA_INJ | - NPU3_RELAXED_CFG2_CMD_DMA_PR_W | - NPU3_RELAXED_CFG2_CMD_CL_RD_NC_F0 | - NPU3_RELAXED_CFG2_SRC_RDENA(0); - npu3_write(npu, reg, val); - - reg = NPU3_NTL_MISC_CFG2(dev->index); - val = npu3_read(npu, reg); - val |= NPU3_NTL_MISC_CFG2_BRICK_ENABLE | - NPU3_NTL_MISC_CFG2_RCV_CREDIT_OVERFLOW_ENA; - npu3_write(npu, reg, val); -} - -static void npu3_init_hw(struct npu3 *npu) -{ - struct npu3_dev *dev; - uint64_t reg, val; - - reg = NPU3_XTS_CFG; - val = npu3_read(npu, reg); - val |= NPU3_XTS_CFG_MMIOSD | NPU3_XTS_CFG_TRY_ATR_RO; - npu3_write(npu, reg, val); - - reg = NPU3_XTS_CFG2; - val = npu3_read(npu, reg); - val |= NPU3_XTS_CFG2_NO_FLUSH_ENA; - npu3_write(npu, reg, val); - - reg = NPU3_RELAXED_SRC(0); - val = NPU3_RELAXED_SRC_MASK_NPU; - 
npu3_write(npu, reg, val); - - npu3_for_each_nvlink_dev(dev, npu) - npu3_dev_init_hw(dev); -} - -/* PCI command register (BAR enable/disable) */ -static int64_t npu3_cfg_cmd(void *pvd, - struct pci_cfg_reg_filter *pcrf __unused, - uint32_t offset, uint32_t size, - uint32_t *data, bool write) -{ - struct npu3_dev *dev = ((struct pci_virt_device *)pvd)->data; - - if (!write) - return OPAL_PARTIAL; - - if (offset != PCI_CFG_CMD) - return OPAL_PARAMETER; - - if (size != 1 && size != 2 && size != 4) - return OPAL_PARAMETER; - - npu3_dev_enable_bars(dev, !!(*data & PCI_CFG_CMD_MEM_EN)); - - return OPAL_PARTIAL; -} - -static int64_t npu3_cfg_bar_write(struct npu3_bar *bar, uint64_t mask, - uint32_t data) -{ - if (data != 0xffffffff) - return OPAL_HARDWARE; - - /* Return BAR size on next read */ - bar->trap |= mask; - - return OPAL_SUCCESS; -} - -static int64_t npu3_cfg_bar_read(struct npu3_bar *bar, uint64_t mask, - uint32_t *data) -{ - if (!(bar->trap & mask)) - return OPAL_PARTIAL; - - *data = GETFIELD(mask, bar->size); - bar->trap &= ~mask; - - return OPAL_SUCCESS; -} - -/* PCI BAR registers (NTL/GENID) */ -static int64_t npu3_cfg_bar(void *pvd __unused, - struct pci_cfg_reg_filter *pcrf, - uint32_t offset, uint32_t size, uint32_t *data, - bool write) -{ - struct npu3_bar *bar = (struct npu3_bar *)pcrf->data; - uint64_t mask; - - if (size != 4) - return OPAL_PARAMETER; - - if (offset == pcrf->start) - mask = 0xffffffff; - else if (offset == pcrf->start + 4) - mask = 0xffffffffull << 32; - else - return OPAL_PARAMETER; - - if (write) - return npu3_cfg_bar_write(bar, mask, *data); - - return npu3_cfg_bar_read(bar, mask, data); -} - -/* PCI control register */ -static int64_t npu3_cfg_devctl(void *pvd, - struct pci_cfg_reg_filter *pcrf __unused, - uint32_t offset, uint32_t size, - uint32_t *data, bool write) -{ - struct npu3_dev *dev = ((struct pci_virt_device *)pvd)->data; - - if (!write) - return OPAL_HARDWARE; - - if (size != 2 || offset & 1) { - NPU3DEVERR(dev, "Unsupported write to pcie control register\n"); - return OPAL_PARAMETER; - } - - if (*data & PCICAP_EXP_DEVCTL_FUNC_RESET) - if (!npu3_dev_reset(dev)) - purge_l2_l3_caches(); - - return OPAL_PARTIAL; -} - -static uint32_t npu3_cfg_populate_pcie_cap(struct npu3_dev *dev, uint32_t start, - uint32_t prev_cap) -{ - struct pci_virt_device *pvd = dev->nvlink.pvd; - uint32_t val; - - /* Add capability list */ - PCI_VIRT_CFG_INIT_RO(pvd, prev_cap, 1, start); - PCI_VIRT_CFG_INIT_RO(pvd, start, 1, PCI_CFG_CAP_ID_EXP); - - /* 0x00 - ID/PCIE capability */ - val = PCI_CFG_CAP_ID_EXP; - val |= 0x2 << 16 | PCIE_TYPE_ENDPOINT << 20; - PCI_VIRT_CFG_INIT_RO(pvd, start, 4, val); - - /* 0x04 - Device capability */ - val = PCIE_MPSS_128 | - PCIE_PHANTOM_NONE << 3 | - PCIE_L0SL_MAX_NO_LIMIT << 6 | - PCIE_L1L_MAX_NO_LIMIT << 9 | - PCICAP_EXP_DEVCAP_FUNC_RESET; - PCI_VIRT_CFG_INIT_RO(pvd, start + PCICAP_EXP_DEVCAP, 4, val); - - pci_virt_add_filter(pvd, start + PCICAP_EXP_DEVCTL, 2, - PCI_REG_FLAG_WRITE, - npu3_cfg_devctl, NULL); - - /* 0x08 - Device control and status */ - PCI_VIRT_CFG_INIT(pvd, start + PCICAP_EXP_DEVCTL, 4, 0x00002810, - 0xffff0000, 0x000f0000); - - /* 0x0c - Link capability */ - val = PCIE_LSPEED_VECBIT_2 | PCIE_LWIDTH_1X << 4; - PCI_VIRT_CFG_INIT_RO(pvd, start + PCICAP_EXP_LCAP, 4, val); - - /* 0x10 - Link control and status */ - PCI_VIRT_CFG_INIT(pvd, start + PCICAP_EXP_LCTL, 4, 0x00130000, - 0xfffff000, 0xc0000000); - - /* 0x14 - Slot capability */ - PCI_VIRT_CFG_INIT_RO(pvd, start + PCICAP_EXP_SLOTCAP, 4, 0x00000000); - - /* 0x18 - 
Slot control and status */ - PCI_VIRT_CFG_INIT_RO(pvd, start + PCICAP_EXP_SLOTCTL, 4, 0x00000000); - - /* 0x1c - Root control and capability */ - PCI_VIRT_CFG_INIT(pvd, start + PCICAP_EXP_RC, 4, 0x00000000, - 0xffffffe0, 0x00000000); - - /* 0x20 - Root status */ - PCI_VIRT_CFG_INIT(pvd, start + PCICAP_EXP_RSTAT, 4, 0x00000000, - 0xffffffff, 0x00010000); - - /* 0x24 - Device capability 2 */ - PCI_VIRT_CFG_INIT_RO(pvd, start + PCIECAP_EXP_DCAP2, 4, 0x00000000); - - /* 0x28 - Device Control and status 2 */ - PCI_VIRT_CFG_INIT(pvd, start + PCICAP_EXP_DCTL2, 4, 0x00070000, - 0xffff0000, 0x00000000); - - /* 0x2c - Link capability 2 */ - PCI_VIRT_CFG_INIT_RO(pvd, start + PCICAP_EXP_LCAP2, 4, 0x00000007); - - /* 0x30 - Link control and status 2 */ - PCI_VIRT_CFG_INIT(pvd, start + PCICAP_EXP_LCTL2, 4, 0x00000003, - 0xffff0000, 0x00200000); - - /* 0x34 - Slot capability 2 */ - PCI_VIRT_CFG_INIT_RO(pvd, start + PCICAP_EXP_SCAP2, 4, 0x00000000); - - /* 0x38 - Slot control and status 2 */ - PCI_VIRT_CFG_INIT_RO(pvd, start + PCICAP_EXP_SCTL2, 4, 0x00000000); - - return start + PCICAP_EXP_SCTL2 + 8; -} - -static int64_t npu3_dev_procedure_write(struct npu3_dev *dev, uint32_t offset, - uint32_t data) -{ - switch (offset) { - case 0: - NPU3DEVINF(dev, "Ignoring write to status register\n"); - break; - case 4: - npu3_dev_procedure_init(dev, data); - break; - default: - return OPAL_PARAMETER; - } - - return OPAL_SUCCESS; -} - -static int64_t npu3_dev_procedure_read(struct npu3_dev *dev, uint32_t offset, - uint32_t *data) -{ - switch (offset) { - case 0: - *data = npu3_dev_procedure_status(dev); - break; - case 4: - *data = dev->proc.number; - break; - default: - *data = 0; - return OPAL_PARAMETER; - } - - return OPAL_SUCCESS; -} - -/* Hardware procedure control/status registers */ -static int64_t npu3_dev_procedure(void *pvd, struct pci_cfg_reg_filter *pcrf, - uint32_t offset, uint32_t size, - uint32_t *data, bool write) -{ - struct npu3_dev *dev = ((struct pci_virt_device *)pvd)->data; - - if (size != 4) - return OPAL_PARAMETER; - - offset -= pcrf->start; - - if (write) - return npu3_dev_procedure_write(dev, offset, *data); - - return npu3_dev_procedure_read(dev, offset, data); -} - -/* PPE SRAM access is indirect via CSAR/CSDR */ -static void npu3_dev_ppe_sram_sel(struct npu3_dev *dev, uint32_t reg) -{ - uint64_t val; - - val = SETFIELD(OB_PPE_CSAR_SRAM_ADDR, 0ull, reg); - xscom_write(dev->npu->chip_id, OB_PPE_CSAR(dev->ob_chiplet), val); -} - -static void npu3_dev_ppe_sram_write(struct npu3_dev *dev, uint32_t reg, - uint64_t val) -{ - npu3_dev_ppe_sram_sel(dev, reg); - xscom_write(dev->npu->chip_id, OB_PPE_CSDR(dev->ob_chiplet), val); -} - -static uint64_t npu3_dev_ppe_sram_read(struct npu3_dev *dev, uint32_t reg) -{ - uint64_t val; - - npu3_dev_ppe_sram_sel(dev, reg); - xscom_read(dev->npu->chip_id, OB_PPE_CSDR(dev->ob_chiplet), &val); - - return val; -} - -/* Software-implemented autonomous link training (SALT) */ -static int64_t npu3_dev_salt(void *pvd, struct pci_cfg_reg_filter *pcrf, - uint32_t offset, uint32_t size, uint32_t *data, - bool write) -{ - struct npu3_dev *dev = ((struct pci_virt_device *)pvd)->data; - unsigned long timeout; - uint32_t cmd_reg; - uint64_t val; - - if (size != 4 || offset != pcrf->start) - return OPAL_PARAMETER; - - /* The config register before this one holds CMD_REG */ - PCI_VIRT_CFG_NORMAL_RD(pvd, pcrf->start - 4, 4, &cmd_reg); - if (cmd_reg == 0xffffffff) - return OPAL_PARAMETER; - - /* Check for another command in progress */ - val = npu3_dev_ppe_sram_read(dev, 
OB_PPE_SALT_CMD); - if (GETFIELD(OB_PPE_SALT_CMD_READY, val)) { - NPU3DEVINF(dev, "SALT_CMD 0x%x: Not ready\n", cmd_reg); - return OPAL_BUSY; - } - - val = OB_PPE_SALT_CMD_READY; - val = SETFIELD(OB_PPE_SALT_CMD_RW, val, write); - val = SETFIELD(OB_PPE_SALT_CMD_LINKNUM, val, npu3_chip_dev_index(dev)); - val = SETFIELD(OB_PPE_SALT_CMD_REG, val, cmd_reg); - if (write) - val = SETFIELD(OB_PPE_SALT_CMD_DATA, val, *data); - - npu3_dev_ppe_sram_write(dev, OB_PPE_SALT_CMD, val); - - /* Wait for the go bit to clear */ - timeout = mftb() + msecs_to_tb(1000); - - while (GETFIELD(OB_PPE_SALT_CMD_READY, val)) { - if (tb_compare(mftb(), timeout) == TB_AAFTERB) { - NPU3DEVINF(dev, "SALT_CMD 0x%x: Timeout\n", cmd_reg); - return OPAL_BUSY; - } - - val = npu3_dev_ppe_sram_read(dev, OB_PPE_SALT_CMD); - } - - if (GETFIELD(OB_PPE_SALT_CMD_ERR, val)) - NPU3DEVINF(dev, "SALT_CMD 0x%x: Error\n", cmd_reg); - - if (!write) - *data = GETFIELD(OB_PPE_SALT_CMD_DATA, val); - - return OPAL_SUCCESS; -} - -#define VENDOR_CAP_LEN 0x1c -#define VENDOR_CAP_VERSION 0x02 - -static uint32_t npu3_cfg_populate_vendor_cap(struct npu3_dev *dev, - uint32_t start, uint32_t prev_cap) -{ - struct pci_virt_device *pvd = dev->nvlink.pvd; - - /* Capabilities list */ - PCI_VIRT_CFG_INIT_RO(pvd, prev_cap, 1, start); - PCI_VIRT_CFG_INIT_RO(pvd, start, 1, PCI_CFG_CAP_ID_VENDOR); - - /* Length and version */ - PCI_VIRT_CFG_INIT_RO(pvd, start + 2, 1, VENDOR_CAP_LEN); - PCI_VIRT_CFG_INIT_RO(pvd, start + 3, 1, VENDOR_CAP_VERSION); - - /* - * Defaults when the trap can't handle the read/write (eg. due to - * reading/writing less than 4 bytes). - */ - PCI_VIRT_CFG_INIT_RO(pvd, start + 4, 4, 0); - PCI_VIRT_CFG_INIT_RO(pvd, start + 8, 4, 0); - - /* PHY procedure trap */ - pci_virt_add_filter(pvd, start + 4, 8, - PCI_REG_FLAG_READ | PCI_REG_FLAG_WRITE, - npu3_dev_procedure, NULL); - - /* Link index */ - PCI_VIRT_CFG_INIT_RO(pvd, start + 0xc, 1, npu3_chip_dev_index(dev)); - - /* SALT registers */ - PCI_VIRT_CFG_INIT(pvd, start + 0x10, 4, 0xffffffff, 0, 0); - PCI_VIRT_CFG_INIT_RO(pvd, start + 0x14, 4, 0); - - pci_virt_add_filter(pvd, start + 0x14, 4, - PCI_REG_FLAG_READ | PCI_REG_FLAG_WRITE, - npu3_dev_salt, NULL); - - return start + VENDOR_CAP_LEN; -} - -static void npu3_cfg_populate(struct npu3_dev *dev) -{ - struct pci_virt_device *pvd = dev->nvlink.pvd; - uint64_t addr; - uint32_t pos; - - /* 0x00 - Vendor/Device ID */ - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_VENDOR_ID, 4, 0x04ea1014); - - /* 0x04 - Command/Status */ - PCI_VIRT_CFG_INIT(pvd, PCI_CFG_CMD, 4, 0x00100000, 0xffb802b8, - 0xf9000000); - - pci_virt_add_filter(pvd, PCI_CFG_CMD, 1, PCI_REG_FLAG_WRITE, - npu3_cfg_cmd, NULL); - - /* 0x08 - Rev/Class/Cache */ - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_REV_ID, 4, 0x06800102); - - /* 0x0c - CLS/Latency Timer/Header/BIST */ - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_CACHE_LINE_SIZE, 4, 0x00800000); - - /* 0x10/14 - NTL BAR */ - addr = SETFIELD(0xf, dev->ntl_bar.addr, - PCI_CFG_BAR_TYPE_MEM | PCI_CFG_BAR_MEM64); - PCI_VIRT_CFG_INIT(pvd, PCI_CFG_BAR0, 4, lo32(addr), 0xf, 0); - PCI_VIRT_CFG_INIT(pvd, PCI_CFG_BAR1, 4, hi32(addr), 0, 0); - - pci_virt_add_filter(pvd, PCI_CFG_BAR0, 8, - PCI_REG_FLAG_READ | PCI_REG_FLAG_WRITE, - npu3_cfg_bar, &dev->ntl_bar); - - /* 0x18/1c - GENID BAR */ - addr = SETFIELD(0xf, dev->genid_bar.addr, - PCI_CFG_BAR_TYPE_MEM | PCI_CFG_BAR_MEM64); - PCI_VIRT_CFG_INIT(pvd, PCI_CFG_BAR2, 4, lo32(addr), 0xf, 0); - PCI_VIRT_CFG_INIT(pvd, PCI_CFG_BAR3, 4, hi32(addr), 0, 0); - - pci_virt_add_filter(pvd, PCI_CFG_BAR2, 8, - PCI_REG_FLAG_READ | 
PCI_REG_FLAG_WRITE, - npu3_cfg_bar, &dev->genid_bar); - - /* 0x20/0x24 - BARs, disabled */ - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_BAR4, 4, 0x00000000); - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_BAR5, 4, 0x00000000); - - /* 0x28 - Cardbus CIS pointer */ - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_CARDBUS_CIS, 4, 0x00000000); - - /* 0x2c - Subsystem ID */ - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_SUBSYS_VENDOR_ID, 4, 0x00000000); - - /* 0x30 - ROM BAR, zero sized */ - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_ROMBAR, 4, 0xffffffff); - - /* 0x34 - PCI Capability */ - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_CAP, 4, 0x00000000); - - /* 0x38 - Reserved */ - PCI_VIRT_CFG_INIT_RO(pvd, 0x38, 4, 0x00000000); - - /* 0x3c - INT line/pin/Minimal grant/Maximal latency */ - PCI_VIRT_CFG_INIT_RO(pvd, PCI_CFG_INT_LINE, 4, 0x00000100); /* INT A */ - - /* PCIE and vendor specific capability */ - pos = npu3_cfg_populate_pcie_cap(dev, 0x40, PCI_CFG_CAP); - pos = npu3_cfg_populate_vendor_cap(dev, pos, 0x41); - PCI_VIRT_CFG_INIT_RO(pvd, pos + 1, 1, 0); -} - -static void npu3_dev_create_pvd(struct npu3_dev *dev) -{ - struct npu3 *npu = dev->npu; - struct phb *phb = &npu->nvlink.phb; - - dev->nvlink.pvd = pci_virt_add_device(phb, dev->index, 0x100, dev); - if (!dev->nvlink.pvd) - return; - - phb->scan_map |= 0x1 << GETFIELD(0xf8, dev->nvlink.pvd->bdfn); - npu3_cfg_populate(dev); -} - -static void npu3_dt_add_mmio_atsd(struct npu3 *npu) -{ - struct dt_node *dn = npu->nvlink.phb.dt_node; - uint64_t mmio_atsd[NPU3_XTS_ATSD_MAX]; - - for (uint32_t i = 0; i < NPU3_XTS_ATSD_MAX; i++) - mmio_atsd[i] = npu->regs[0] + NPU3_XTS_ATSD_LAUNCH(i); - - dt_add_property(dn, "ibm,mmio-atsd", mmio_atsd, sizeof(mmio_atsd)); -} - -static void npu3_dt_add_mmio_window(struct npu3 *npu) -{ - struct dt_node *dn = npu->nvlink.phb.dt_node; - uint32_t ntl0_index = npu->index * NPU3_LINKS_PER_NPU; - uint64_t addr, size, win[2]; - - /* Device MMIO window (NTL/GENID regs only) */ - phys_map_get(npu->chip_id, NPU_NTL, ntl0_index, &win[0], NULL); - phys_map_get(npu->chip_id, NPU_GENID, npu->index, &addr, &size); - win[1] = addr + size - win[0]; - - dt_add_property(dn, "ibm,mmio-window", win, sizeof(win)); - dt_add_property_cells(dn, "ranges", 0x02000000, - hi32(win[0]), lo32(win[0]), - hi32(win[0]), lo32(win[0]), - hi32(win[1]), lo32(win[1])); -} - -/* NDL No-Stall Event level */ -static uint32_t npu3_dev_interrupt_level(struct npu3_dev *dev) -{ - const uint32_t level[12] = { 1, 3, 5, 7, 9, 11, - 43, 45, 47, 49, 51, 53 }; - - return level[npu3_chip_dev_index(dev)]; -} - -static void npu3_dt_add_interrupts(struct npu3 *npu) -{ - struct dt_node *dn = npu->nvlink.phb.dt_node; - uint32_t *map, icsp, i = 0; - struct npu3_dev *dev; - size_t map_size = 0; - - npu3_for_each_nvlink_dev(dev, npu) - map_size += sizeof(*map) * 7; - - if (!map_size) - return; - - icsp = get_ics_phandle(); - map = zalloc(map_size); - assert(map); - - npu3_for_each_nvlink_dev(dev, npu) { - map[i] = dev->nvlink.pvd->bdfn << 8; - map[i + 3] = 1; /* INT A */ - map[i + 4] = icsp; /* interrupt-parent */ - map[i + 5] = npu->irq_base + npu3_dev_interrupt_level(dev); - map[i + 6] = 0; /* 0 = EDGE, 1 = LEVEL */ - i += 7; - } - - dt_add_property_cells(dn, "interrupt-parent", icsp); - dt_add_property(dn, "interrupt-map", map, map_size); - dt_add_property_cells(dn, "interrupt-map-mask", 0xff00, 0x0, 0x0, 0x7); - - free(map); -} - -/* Populate PCI root device node */ -static void npu3_dt_add_props(struct npu3 *npu) -{ - struct dt_node *dn = npu->nvlink.phb.dt_node; - - dt_add_property_cells(dn, "#address-cells", 3); - 
dt_add_property_cells(dn, "#size-cells", 2); - dt_add_property_cells(dn, "#interrupt-cells", 1); - dt_add_property_cells(dn, "bus-range", 0, 0xff); - dt_add_property_cells(dn, "clock-frequency", 0x200, 0); - - dt_add_property_strings(dn, "device_type", "pciex"); - - /* - * To the OS, npu2 and npu3 are both ibm,ioda2-npu2-phb. The added - * ibm,ioda3-npu3-phb allows for possible quirks. - */ - dt_add_property_strings(dn, "compatible", - "ibm,power9-npu-pciex", - "ibm,ioda2-npu2-phb", - "ibm,ioda2-npu3-phb"); - - dt_add_property_cells(dn, "ibm,phb-index", - npu3_get_phb_index(npu->index)); - dt_add_property_cells(dn, "ibm,phb-diag-data-size", 0); - dt_add_property_cells(dn, "ibm,opal-num-pes", NPU3_MAX_PE_NUM); - dt_add_property_cells(dn, "ibm,opal-reserved-pe", NPU3_RESERVED_PE_NUM); - dt_add_property_cells(dn, "ibm,supported-tce-sizes", - 12, /* 4K */ - 16, /* 64K */ - 24, /* 16M */ - 28); /* 256M */ - - dt_add_property_cells(dn, "ibm,chip-id", npu->chip_id); - dt_add_property_cells(dn, "ibm,npu-index", npu->index); - dt_add_property_cells(dn, "ibm,npcq", npu->dt_node->phandle); - dt_add_property_cells(dn, "ibm,xscom-base", npu->xscom_base); - dt_add_property_cells(dn, "ibm,links", NPU3_LINKS_PER_NPU); - - dt_add_property(dn, "reg", npu->regs, sizeof(npu->regs)); - - npu3_dt_add_mmio_atsd(npu); - npu3_dt_add_mmio_window(npu); - npu3_dt_add_interrupts(npu); -} - -void npu3_init_nvlink(struct npu3 *npu) -{ - struct npu3_dev *dev; - - if (!npu3_next_dev(npu, NULL, NPU3_DEV_TYPE_NVLINK)) - return; - - npu3_init_hw(npu); - npu3_create_phb(npu); - - npu3_for_each_nvlink_dev(dev, npu) - npu3_dev_create_pvd(dev); - - npu3_dt_add_props(npu); - - /* TODO: Sort out if/why we still can't enable this */ - disable_fast_reboot("NVLink device enabled"); -} - -static int64_t npu3_init_context_pid(struct npu3 *npu, uint32_t index, - uint64_t msr) -{ - uint64_t map, old_map; - - /* Unfiltered XTS mode; index is lparshort */ - map = SETFIELD(NPU3_XTS_PID_MAP_LPARSHORT, 0ull, index); - - /* Enable this mapping for both real and virtual addresses */ - map |= NPU3_XTS_PID_MAP_VALID_ATRGPA0 | NPU3_XTS_PID_MAP_VALID_ATRGPA1; - - /* Enable TLBIE/MMIOSD forwarding for this entry */ - map |= NPU3_XTS_PID_MAP_VALID_ATSD; - - /* Set the relevant MSR bits */ - if (msr & MSR_DR) - map |= NPU3_XTS_PID_MAP_MSR_DR; - - if (msr & MSR_HV) - map |= NPU3_XTS_PID_MAP_MSR_HV; - - if (msr & MSR_PR) - map |= NPU3_XTS_PID_MAP_MSR_PR; - - /* We don't support anything other than 64-bit so hardcode it here */ - map |= NPU3_XTS_PID_MAP_MSR_SF; - - old_map = npu3_read(npu, NPU3_XTS_PID_MAP(index)); - - /* Error out if this entry is already set with different msr bits */ - if (old_map && GETFIELD(NPU3_XTS_PID_MAP_MSR, old_map) != - GETFIELD(NPU3_XTS_PID_MAP_MSR, map)) { - NPU3ERR(npu, "%s: Unexpected MSR value\n", __func__); - return OPAL_PARAMETER; - } - - if (!old_map) { - NPU3DBG(npu, "XTS_PID_MAP[%03d] = 0x%08llx\n", index, map); - npu3_write(npu, NPU3_XTS_PID_MAP(index), map); - } - - npu->nvlink.ctx_ref[index]++; - - return OPAL_SUCCESS; -} - -#define NPU3_VALID_ATS_MSR_BITS (MSR_DR | MSR_HV | MSR_PR | MSR_SF) - -/* - * Allocate a context ID and initialize the tables with the relevant - * information. Returns the ID or error if one couldn't be allocated. - */ -int64_t npu3_init_context(struct phb *phb, uint64_t msr, uint64_t bdf) -{ - struct npu3 *npu = npu3_phb_to_npu(phb); - uint32_t lparshort, i; - uint64_t map; - int64_t rc; - - /* - * MSR bits should be masked by the caller to allow for future - * expansion if required. 
-	 */
-	if (msr & ~NPU3_VALID_ATS_MSR_BITS)
-		return OPAL_UNSUPPORTED;
-
-	lock(&npu->lock);
-
-	for (i = 0; i < NPU3_XTS_BDF_MAP_MAX; i++) {
-		map = npu3_read(npu, NPU3_XTS_BDF_MAP(i));
-
-		if (map && GETFIELD(NPU3_XTS_BDF_MAP_BDF, map) == bdf)
-			break;
-	}
-
-	if (i == NPU3_XTS_BDF_MAP_MAX) {
-		NPU3ERR(npu, "LPARID not associated with any GPU\n");
-		rc = OPAL_PARAMETER;
-		goto out;
-	}
-
-	lparshort = GETFIELD(NPU3_XTS_BDF_MAP_LPARSHORT, map);
-	NPU3DBG(npu, "Found LPARSHORT 0x%x for bdf %02llx:%02llx.%llx\n",
-		lparshort, PCI_BUS_NUM(bdf), PCI_DEV(bdf), PCI_FUNC(bdf));
-
-	rc = npu3_init_context_pid(npu, lparshort, msr);
-	if (rc)
-		goto out;
-
-	if (!(map & NPU3_XTS_BDF_MAP_VALID)) {
-		map |= NPU3_XTS_BDF_MAP_VALID;
-		npu3_write(npu, NPU3_XTS_BDF_MAP(i), map);
-	}
-
-	rc = lparshort;
-
-out:
-	unlock(&npu->lock);
-	return rc;
-}
-
-static int64_t npu3_destroy_context_pid(struct npu3 *npu, uint32_t index)
-{
-	if (!npu->nvlink.ctx_ref[index])
-		return OPAL_PARAMETER;
-
-	/* Only destroy when refcount hits 0 */
-	if (--npu->nvlink.ctx_ref[index])
-		return OPAL_PARTIAL;
-
-	NPU3DBG(npu, "XTS_PID_MAP[%03d] = 0 (destroy)\n", index);
-	npu3_write(npu, NPU3_XTS_PID_MAP(index), 0ull);
-
-	return OPAL_SUCCESS;
-}
-
-int64_t npu3_destroy_context(struct phb *phb, uint64_t bdf)
-{
-	struct npu3 *npu = npu3_phb_to_npu(phb);
-	uint32_t lparshort, i;
-	int64_t map, rc;
-
-	lock(&npu->lock);
-
-	for (i = 0; i < NPU3_XTS_BDF_MAP_MAX; i++) {
-		map = npu3_read(npu, NPU3_XTS_BDF_MAP(i));
-
-		if (map && GETFIELD(NPU3_XTS_BDF_MAP_BDF, map) == bdf)
-			break;
-	}
-
-	if (i == NPU3_XTS_BDF_MAP_MAX) {
-		NPU3ERR(npu, "LPARID not associated with any GPU\n");
-		rc = OPAL_PARAMETER;
-		goto out;
-	}
-
-	lparshort = GETFIELD(NPU3_XTS_BDF_MAP_LPARSHORT, map);
-	rc = npu3_destroy_context_pid(npu, lparshort);
-
-out:
-	unlock(&npu->lock);
-	return rc;
-}
-
-/* Map the given virtual bdf to lparid with given lpcr */
-int64_t npu3_map_lpar(struct phb *phb, uint64_t bdf, uint64_t lparid,
-		      uint64_t lpcr)
-{
-	struct npu3 *npu = npu3_phb_to_npu(phb);
-	struct npu3_dev *dev;
-	int64_t rc = OPAL_SUCCESS;
-	uint64_t map, val;
-	uint32_t i;
-
-	/*
-	 * The LPCR bits are only required for hash based ATS, which we don't
-	 * currently support, but may need to in the future.
-	 */
-	if (lpcr)
-		return OPAL_UNSUPPORTED;
-
-	lock(&npu->lock);
-
-	/* Update the entry if it already exists */
-	for (i = 0; i < NPU3_XTS_BDF_MAP_MAX; i++) {
-		map = npu3_read(npu, NPU3_XTS_BDF_MAP(i));
-
-		if (map && GETFIELD(NPU3_XTS_BDF_MAP_BDF, map) == bdf)
-			break;
-	}
-
-	if (i == NPU3_XTS_BDF_MAP_MAX) {
-		/* No existing mapping found, find space for a new one */
-		for (i = 0; i < NPU3_XTS_BDF_MAP_MAX; i++)
-			if (!npu3_read(npu, NPU3_XTS_BDF_MAP(i)))
-				break;
-	}
-
-	if (i == NPU3_XTS_BDF_MAP_MAX) {
-		NPU3ERR(npu, "No free XTS_BDF[] entry\n");
-		rc = OPAL_RESOURCE;
-		goto out;
-	}
-
-	map = NPU3_XTS_BDF_MAP_UNFILT;
-	map = SETFIELD(NPU3_XTS_BDF_MAP_BDF, map, bdf);
-	map = SETFIELD(NPU3_XTS_BDF_MAP_LPARID, map, lparid);
-	map = SETFIELD(NPU3_XTS_BDF_MAP_LPARSHORT, map, i);
-
-	/* We only support radix at the moment */
-	map = SETFIELD(NPU3_XTS_BDF_MAP_XLAT, map, 0x3);
-
-	/* Find a link on which to send ATSDs for this device */
-	npu3_for_each_nvlink_dev(dev, npu)
-		if (dev->nvlink.gpu->bdfn == bdf)
-			break;
-
-	if (!dev || dev->nvlink.gpu->bdfn != bdf) {
-		NPU3ERR(npu, "Can't find a link for bdf %02llx:%02llx.%llx\n",
-			PCI_BUS_NUM(bdf), PCI_DEV(bdf), PCI_FUNC(bdf));
-		rc = OPAL_PARAMETER;
-		goto out;
-	}
-
-	map = SETFIELD(NPU3_XTS_BDF_MAP_BRICK, map, dev->index);
-
-	NPU3DBG(npu, "XTS_BDF_MAP[%03d] = 0x%08llx\n", i, map);
-	npu3_write(npu, NPU3_XTS_BDF_MAP(i), map);
-
-	/* We need to allocate an ATSD per link */
-	val = SETFIELD(NPU3_XTS_ATSD_HYP_LPARID, 0ull, lparid);
-	if (!lparid)
-		val |= NPU3_XTS_ATSD_HYP_MSR_HV;
-
-	npu3_write(npu, NPU3_XTS_ATSD_HYP(dev->index), val);
-
-out:
-	unlock(&npu->lock);
-	return rc;
-}
-
-static int64_t npu3_relaxed_order_enable(struct npu3 *npu, uint64_t src)
-{
-	struct npu3_dev *dev;
-	uint32_t i;
-
-	for (i = 0; i < NPU3_RELAXED_SRC_MAX; i++)
-		if (npu3_read(npu, NPU3_RELAXED_SRC(i)) == src)
-			return OPAL_SUCCESS; /* Already enabled */
-
-	/* Find somewhere to write this source */
-	for (i = 0; i < NPU3_RELAXED_SRC_MAX; i++)
-		if (!npu3_read(npu, NPU3_RELAXED_SRC(i)))
-			break;
-
-	if (i == NPU3_RELAXED_SRC_MAX) {
-		NPU3ERR(npu, "Insufficient resources to activate relaxed ordering mode\n");
-		return OPAL_RESOURCE;
-	}
-
-	npu3_write(npu, NPU3_RELAXED_SRC(i), src);
-
-	npu3_for_each_nvlink_dev(dev, npu) {
-		uint64_t val = npu3_read(npu, NPU3_RELAXED_CFG2(dev->index));
-
-		val |= NPU3_RELAXED_CFG2_SRC_WRENA(i) |
-		       NPU3_RELAXED_CFG2_SRC_RDENA(i);
-		npu3_write(npu, NPU3_RELAXED_CFG2(dev->index), val);
-	}
-
-	return OPAL_SUCCESS;
-}
-
-static void npu3_relaxed_order_disable(struct npu3 *npu, uint64_t src)
-{
-	struct npu3_dev *dev;
-	uint32_t i;
-
-	for (i = 0; i < NPU3_RELAXED_SRC_MAX; i++)
-		if (npu3_read(npu, NPU3_RELAXED_SRC(i)) == src)
-			break;
-
-	if (i == NPU3_RELAXED_SRC_MAX)
-		return; /* Already disabled */
-
-	npu3_for_each_nvlink_dev(dev, npu) {
-		uint64_t val = npu3_read(npu, NPU3_RELAXED_CFG2(dev->index));
-
-		val &= ~NPU3_RELAXED_CFG2_SRC_WRENA(i);
-		val &= ~NPU3_RELAXED_CFG2_SRC_RDENA(i);
-		npu3_write(npu, NPU3_RELAXED_CFG2(dev->index), val);
-	}
-
-	npu3_write(npu, NPU3_RELAXED_SRC(i), 0ull);
-}
-
-/* Enable or disable relaxed ordering on all nvlinks for a given PEC. */
-int64_t npu3_set_relaxed_order(struct phb *phb, uint32_t gcid, int pec,
-			       bool enable)
-{
-	struct npu3 *npu = npu3_phb_to_npu(phb);
-	int64_t rc = OPAL_SUCCESS;
-	uint64_t src;
-
-	NPU3INF(npu, "%s relaxed ordering for PEC %d on chip %d\n",
-		enable ? "Enabling" : "Disabling",
"Enabling" : "Disabling", - pec, gcid); - - lock(&npu->lock); - - src = SETFIELD(NPU3_RELAXED_SRC_GRPCHP, 0ull, gcid); - src = SETFIELD(NPU3_RELAXED_SRC_PEC, src, pec); - src = SETFIELD(NPU3_RELAXED_SRC_RDSTART, src, 0); - src = SETFIELD(NPU3_RELAXED_SRC_RDEND, src, 47); - src = SETFIELD(NPU3_RELAXED_SRC_WRSTART, src, 0); - src = SETFIELD(NPU3_RELAXED_SRC_WREND, src, 23); - - if (enable) - rc = npu3_relaxed_order_enable(npu, src); - else - npu3_relaxed_order_disable(npu, src); - - unlock(&npu->lock); - return rc; -} diff --git a/hw/npu3.c b/hw/npu3.c deleted file mode 100644 index 03461373..00000000 --- a/hw/npu3.c +++ /dev/null @@ -1,549 +0,0 @@ -// SPDX-License-Identifier: Apache-2.0 OR GPL-2.0-or-later -/* - * Copyright 2019 IBM Corp. - */ - -#include -#include -#include -#include -#include -#include -#include - -#define NPU3LOG(l, npu, fmt, a...) \ - prlog(l, "NPU[%d:%d]: " fmt, (npu)->chip_id, (npu)->index, ##a) -#define NPU3DBG(npu, fmt, a...) NPU3LOG(PR_DEBUG, npu, fmt, ##a) -#define NPU3INF(npu, fmt, a...) NPU3LOG(PR_INFO, npu, fmt, ##a) -#define NPU3ERR(npu, fmt, a...) NPU3LOG(PR_ERR, npu, fmt, ##a) - -#define NPU3DEVLOG(l, dev, fmt, a...) \ - prlog(l, "NPU[%d:%d:%d]: " fmt, \ - (dev)->npu->chip_id, \ - (dev)->npu->index, \ - (dev)->index, ##a) -#define NPU3DEVDBG(dev, fmt, a...) NPU3DEVLOG(PR_DEBUG, dev, fmt, ##a) -#define NPU3DEVINF(dev, fmt, a...) NPU3DEVLOG(PR_INFO, dev, fmt, ##a) -#define NPU3DEVERR(dev, fmt, a...) NPU3DEVLOG(PR_ERR, dev, fmt, ##a) - -static void npu3_dt_create_link(struct dt_node *npu, uint32_t npu_index, - uint32_t dev_index) -{ - struct dt_node *link; - uint32_t phy_lane_mask, ob_chiplet; - - link = dt_new_addr(npu, "link", dev_index); - - dt_add_property_string(link, "compatible", "ibm,npu-link"); - dt_add_property_cells(link, "reg", dev_index); - dt_add_property_cells(link, "ibm,npu-link-index", dev_index); - - switch (npu_index) { - case 0: - /* fall through */ - case 2: - ob_chiplet = npu_index ? 
-
-		switch (dev_index) {
-		case 0:
-			phy_lane_mask = PPC_BITMASK32(0, 3);
-			break;
-		case 1:
-			phy_lane_mask = PPC_BITMASK32(13, 16);
-			break;
-		case 2:
-			phy_lane_mask = PPC_BITMASK32(7, 10);
-			break;
-		case 3:
-			phy_lane_mask = PPC_BITMASK32(20, 23);
-			break;
-		}
-
-		break;
-	case 1:
-		switch (dev_index) {
-		case 0:
-			ob_chiplet = 1;
-			phy_lane_mask = PPC_BITMASK32(0, 3);
-			break;
-		case 1:
-			ob_chiplet = 2;
-			phy_lane_mask = PPC_BITMASK32(0, 3);
-			break;
-		case 2:
-			ob_chiplet = 1;
-			phy_lane_mask = PPC_BITMASK32(7, 10);
-			break;
-		case 3:
-			ob_chiplet = 2;
-			phy_lane_mask = PPC_BITMASK32(7, 10);
-			break;
-		}
-
-		break;
-	default:
-		return;
-	}
-
-	dt_add_property_cells(link, "ibm,npu-phy", ob_chiplet);
-	dt_add_property_cells(link, "ibm,npu-lane-mask", phy_lane_mask);
-}
-
-static void npu3_dt_create_npu(struct dt_node *xscom, uint32_t npu_index)
-{
-	const uint32_t npu_base[] = { 0x5011000, 0x5011400, 0x3011c00 };
-	struct dt_node *npu;
-
-	npu = dt_new_addr(xscom, "npu", npu_base[npu_index]);
-
-	dt_add_property_cells(npu, "#size-cells", 0);
-	dt_add_property_cells(npu, "#address-cells", 1);
-	dt_add_property_cells(npu, "reg", npu_base[npu_index], 0x2c);
-	dt_add_property_string(npu, "compatible", "ibm,power9-npu3");
-	dt_add_property_cells(npu, "ibm,npu-index", npu_index);
-
-	for (uint32_t i = 0; i < NPU3_LINKS_PER_NPU; i++)
-		npu3_dt_create_link(npu, npu_index, i);
-}
-
-/* This can be removed when/if we decide to use HDAT instead */
-static bool npu3_dt_create(void)
-{
-	struct proc_chip *chip = next_chip(NULL);
-	struct dt_node *xscom;
-
-	/* npu3 chips only */
-	if (proc_gen < proc_gen_p9 ||
-	    chip->type == PROC_CHIP_P9_NIMBUS ||
-	    chip->type == PROC_CHIP_P9_CUMULUS)
-		return false;
-
-	dt_for_each_compatible(dt_root, xscom, "ibm,xscom")
-		for (uint32_t i = 0; i < 3; i++)
-			npu3_dt_create_npu(xscom, i);
-
-	return true;
-}
-
-static struct npu3 *npu3_create(struct dt_node *dn)
-{
-	struct npu3 *npu;
-	struct dt_node *link;
-	struct npu3_dev *dev;
-	char *path;
-	uint32_t i;
-
-	npu = zalloc(sizeof(*npu));
-	assert(npu);
-
-	init_lock(&npu->lock);
-
-	npu->dt_node = dn;
-	npu->index = dt_prop_get_u32(dn, "ibm,npu-index");
-	npu->xscom_base = dt_get_address(dn, 0, NULL);
-
-	npu->chip_id = dt_get_chip_id(dn);
-	assert(get_chip(npu->chip_id));
-
-	dt_for_each_compatible(dn, link, "ibm,npu-link") {
-		i = dt_prop_get_u32(link, "ibm,npu-link-index");
-		assert(i < NPU3_LINKS_PER_NPU);
-
-		dev = &npu->devices[i];
-		dev->index = i;
-		dev->npu = npu;
-		dev->dn = link;
-		dev->ob_chiplet = dt_prop_get_u32(link, "ibm,npu-phy");
-		dev->phy_lane_mask = dt_prop_get_u32(link, "ibm,npu-lane-mask");
-		dev->proc.status = NPU3_PROC_COMPLETE;
-	};
-
-	path = dt_get_path(dn);
-	NPU3INF(npu, "Found %s\n", path);
-	NPU3INF(npu, "SCOM base: 0x%llx\n", npu->xscom_base);
-	free(path);
-
-	return npu;
-}
-
-struct npu3_dev *npu3_next_dev(struct npu3 *npu, struct npu3_dev *dev,
-			       enum npu3_dev_type type)
-{
-	uint32_t i = 0;
-
-	if (dev)
-		i = dev->index + 1;
-
-	for (; i < NPU3_LINKS_PER_NPU; i++) {
-		dev = &npu->devices[i];
-
-		if (dev->type == type || type == NPU3_DEV_TYPE_ANY)
-			return dev;
-	}
-
-	return NULL;
-}
-
-static void npu3_device_detect_fixup(struct npu3_dev *dev)
-{
-	struct dt_node *dn = dev->dn;
-
-	if (dev->type == NPU3_DEV_TYPE_NVLINK) {
-		dt_add_property_strings(dn, "ibm,npu-link-type", "nvlink");
-		dev->link_speed = dt_prop_get_u32_def(
-			dn, "nvidia,link-speed", 0xff);
-		return;
-	}
-
-	NPU3DEVDBG(dev, "Link type unknown\n");
-	dt_add_property_strings(dn, "ibm,npu-link-type", "unknown");
"unknown"); -} - -/* - * We use the indirect method because it uses the same addresses as - * the MMIO offsets (NPU RING) - */ -static void npu3_scom_sel(struct npu3 *npu, uint64_t reg, uint64_t size) -{ - uint64_t val; - - val = SETFIELD(NPU3_MISC_DA_ADDR, 0ull, reg); - val = SETFIELD(NPU3_MISC_DA_LEN, val, size); - xscom_write(npu->chip_id, - npu->xscom_base + NPU3_MISC_SCOM_IND_SCOM_ADDR, - val); -} - -static void npu3_scom_write(struct npu3 *npu, uint64_t reg, uint64_t size, - uint64_t val) -{ - npu3_scom_sel(npu, reg, size); - xscom_write(npu->chip_id, - npu->xscom_base + NPU3_MISC_SCOM_IND_SCOM_DATA, - val); -} - -static uint64_t npu3_scom_read(struct npu3 *npu, uint64_t reg, uint64_t size) -{ - uint64_t val; - - npu3_scom_sel(npu, reg, size); - xscom_read(npu->chip_id, - npu->xscom_base + NPU3_MISC_SCOM_IND_SCOM_DATA, - &val); - - return val; -} - -void npu3_write(struct npu3 *npu, uint64_t reg, uint64_t val) -{ - void *mmio = (void *)npu->regs[0]; - - if (mmio) - out_be64(mmio + reg, val); - else - npu3_scom_write(npu, reg, NPU3_MISC_DA_LEN_8B, val); - - /* CQ_SM writes should be mirrored in all four blocks */ - if (NPU3_REG_BLOCK(reg) != NPU3_BLOCK_CQ_SM(0)) - return; - - for (uint32_t i = 1; i < 4; i++) - npu3_write(npu, NPU3_BLOCK_CQ_SM(i) + NPU3_REG_OFFSET(reg), - val); -} - -uint64_t npu3_read(struct npu3 *npu, uint64_t reg) -{ - void *mmio = (void *)npu->regs[0]; - - if (mmio) - return in_be64(mmio + reg); - - return npu3_scom_read(npu, reg, NPU3_MISC_DA_LEN_8B); -} - -void npu3_write_4b(struct npu3 *npu, uint64_t reg, uint32_t val) -{ - void *mmio = (void *)npu->regs[0]; - - if (mmio) - out_be32(mmio + reg, val); - else - npu3_scom_write(npu, reg, NPU3_MISC_DA_LEN_4B, - (uint64_t)val << 32); - - if (NPU3_REG_BLOCK(reg) != NPU3_BLOCK_CQ_SM(0)) - return; - - for (uint32_t i = 1; i < 4; i++) - npu3_write_4b(npu, NPU3_BLOCK_CQ_SM(i) + NPU3_REG_OFFSET(reg), - val); -} - -uint32_t npu3_read_4b(struct npu3 *npu, uint64_t reg) -{ - void *mmio = (void *)npu->regs[0]; - - if (mmio) - return in_be32(mmio + reg); - - return npu3_scom_read(npu, reg, NPU3_MISC_DA_LEN_4B) >> 32; -} - -static void npu3_misc_config(struct npu3 *npu) -{ - struct npu3_dev *dev; - uint32_t typemap = 0; - uint64_t reg, val; - - npu3_for_each_nvlink_dev(dev, npu) - typemap |= 0x10 >> dev->index; - - reg = NPU3_MCP_MISC_CFG0; - val = npu3_read(npu, reg); - val |= NPU3_MCP_MISC_CFG0_ENABLE_PBUS; - val &= ~NPU3_MCP_MISC_CFG0_ENABLE_SNARF_CPM; - val = SETFIELD(NPU3_MCP_MISC_CFG0_NVLINK_MODE, val, typemap); - val = SETFIELD(NPU3_MCP_MISC_CFG0_OCAPI_MODE, val, ~typemap); - npu3_write(npu, reg, val); - - reg = NPU3_SNP_MISC_CFG0; - val = npu3_read(npu, reg); - val |= NPU3_SNP_MISC_CFG0_ENABLE_PBUS; - val = SETFIELD(NPU3_SNP_MISC_CFG0_NVLINK_MODE, val, typemap); - val = SETFIELD(NPU3_SNP_MISC_CFG0_OCAPI_MODE, val, ~typemap); - npu3_write(npu, reg, val); - - reg = NPU3_CTL_MISC_CFG2; - val = npu3_read(npu, reg); - val = SETFIELD(NPU3_CTL_MISC_CFG2_NVLINK_MODE, val, typemap); - val = SETFIELD(NPU3_CTL_MISC_CFG2_OCAPI_MODE, val, ~typemap); - npu3_write(npu, reg, val); - - reg = NPU3_DAT_MISC_CFG1; - val = npu3_read(npu, reg); - val = SETFIELD(NPU3_DAT_MISC_CFG1_NVLINK_MODE, val, typemap); - val = SETFIELD(NPU3_DAT_MISC_CFG1_OCAPI_MODE, val, ~typemap); - npu3_write(npu, reg, val); -} - -static void npu3_assign_bars(struct npu3 *npu) -{ - struct npu3_dev *dev; - uint64_t addr, size, val; - - /* Global MMIO bar (per npu) */ - phys_map_get(npu->chip_id, NPU_REGS, npu->index, &addr, &size); - val = SETFIELD(NPU3_MMIO_BAR_ADDR, 
-	val |= NPU3_MMIO_BAR_ENABLE;
-	npu3_write(npu, NPU3_MMIO_BAR, val);
-
-	NPU3INF(npu, "MMIO base: 0x%016llx (%lldMB)\n", addr, size >> 20);
-	npu->regs[0] = addr;
-	npu->regs[1] = size;
-
-	/* NTL bar (per device) */
-	npu3_for_each_dev(dev, npu) {
-		phys_map_get(npu->chip_id, NPU_NTL, npu3_chip_dev_index(dev),
-			     &addr, &size);
-		val = SETFIELD(NPU3_NTL_BAR_ADDR, 0ull, addr >> 16);
-		val = SETFIELD(NPU3_NTL_BAR_SIZE, val, ilog2(size >> 16));
-		npu3_write(npu, NPU3_NTL_BAR(dev->index), val);
-
-		dev->ntl_bar.addr = addr;
-		dev->ntl_bar.size = size;
-	}
-
-	/* GENID bar (logically divided per device) */
-	phys_map_get(npu->chip_id, NPU_GENID, npu->index, &addr, NULL);
-	val = SETFIELD(NPU3_GENID_BAR_ADDR, 0ull, addr >> 19);
-	npu3_write(npu, NPU3_GENID_BAR, val);
-
-	npu3_for_each_dev(dev, npu) {
-		dev->genid_bar.addr = addr + (dev->index << 16);
-		dev->genid_bar.size = 64 << 10;
-	}
-}
-
-void npu3_dev_enable_bars(struct npu3_dev *dev, bool enable)
-{
-	struct npu3 *npu = dev->npu;
-	uint64_t reg, val;
-
-	if (dev->ntl_bar.enable == enable) /* No state change */
-		return;
-
-	dev->ntl_bar.enable = enable;
-	dev->genid_bar.enable = enable;
-
-	reg = NPU3_NTL_BAR(dev->index);
-	val = npu3_read(npu, reg);
-	val = SETFIELD(NPU3_NTL_BAR_ENABLE, val, enable);
-	npu3_write(npu, reg, val);
-
-	/*
-	 * Generation IDs are a single space in the hardware but we split them
-	 * per device. Only disable in hardware if every device has disabled.
-	 */
-	if (!enable)
-		npu3_for_each_dev(dev, npu)
-			if (dev->genid_bar.enable)
-				return;
-
-	reg = NPU3_GENID_BAR;
-	val = npu3_read(npu, reg);
-	val = SETFIELD(NPU3_GENID_BAR_ENABLE, val, enable);
-	npu3_write(npu, reg, val);
-}
-
-static uint64_t npu3_ipi_attributes(struct irq_source *is, uint32_t isn)
-{
-	struct npu3 *npu = is->data;
-	uint32_t level = isn - npu->irq_base;
-
-	/* TCE interrupt is used to detect a frozen PE */
-	if (level == 18)
-		return IRQ_ATTR_TARGET_OPAL |
-		       IRQ_ATTR_TARGET_RARE |
-		       IRQ_ATTR_TYPE_MSI;
-
-	return IRQ_ATTR_TARGET_LINUX;
-}
-
-static void npu3_ipi_interrupt(struct irq_source *is, uint32_t isn)
-{
-	struct npu3 *npu = is->data;
-	uint32_t level = isn - npu->irq_base;
-
-	if (level != 18) {
-		NPU3ERR(npu, "Received unknown interrupt %d\n", level);
-		return;
-	}
-
-	opal_update_pending_evt(OPAL_EVENT_PCI_ERROR, OPAL_EVENT_PCI_ERROR);
-}
-
-#define NPU3_IRQ_LEVELS 60
-
-static char *npu3_ipi_name(struct irq_source *is, uint32_t isn)
-{
-	struct npu3 *npu = is->data;
-	uint32_t level = isn - npu->irq_base;
-	static const char *names[NPU3_IRQ_LEVELS] = {
-		[0] = "NDL 0 Stall Event (brick 0)",
-		[1] = "NDL 0 No-Stall Event (brick 0)",
-		[2] = "NDL 1 Stall Event (brick 1)",
-		[3] = "NDL 1 No-Stall Event (brick 1)",
-		[4] = "NDL 2 Stall Event (brick 2)",
-		[5] = "NDL 2 No-Stall Event (brick 2)",
-		[6] = "NDL 3 Stall Event (brick 3)",
-		[7] = "NDL 3 No-Stall Event (brick 3)",
-		[8] = "NDL 4 Stall Event (brick 4)",
-		[9] = "NDL 4 No-Stall Event (brick 4)",
-		[10] = "NDL 5 Stall Event (brick 5)",
-		[11] = "NDL 5 No-Stall Event (brick 5)",
-		[12] = "NTL 0 Event",
-		[13] = "NTL 1 Event",
-		[14] = "NTL 2 Event",
-		[15] = "NTL 3 Event",
-		[16] = "NTL 4 Event",
-		[17] = "NTL 5 Event",
-		[18] = "TCE Event",
-		[19] = "ATS Event",
-		[20] = "CQ Event",
-		[21] = "MISC Event",
-		[41] = "Memory Controller Event",
-		[42] = "NDL 6 Stall Event (brick 6)",
-		[43] = "NDL 6 No-Stall Event (brick 6)",
-		[44] = "NDL 7 Stall Event (brick 7)",
-		[45] = "NDL 7 No-Stall Event (brick 7)",
-		[46] = "NDL 8 Stall Event (brick 8)",
-		[47] = "NDL 8 No-Stall Event (brick 8)",
No-Stall Event (brick 8)", - [48] = "NDL 9 Stall Event (brick 9)", - [49] = "NDL 9 No-Stall Event (brick 9)", - [50] = "NDL 10 Stall Event (brick 10)", - [51] = "NDL 10 No-Stall Event (brick 10)", - [52] = "NDL 11 Stall Event (brick 11)", - [53] = "NDL 11 No-Stall Event (brick 11)", - [54] = "NTL 6 Event", - [55] = "NTL 7 Event", - [56] = "NTL 8 Event", - [57] = "NTL 9 Event", - [58] = "NTL 10 Event", - [59] = "NTL 11 Event", - }; - - if (level >= NPU3_IRQ_LEVELS || !names[level]) - return strdup("Unknown"); - - return strdup(names[level]); -} - -static const struct irq_source_ops npu3_ipi_ops = { - .attributes = npu3_ipi_attributes, - .interrupt = npu3_ipi_interrupt, - .name = npu3_ipi_name, -}; - -static void npu3_setup_irqs(struct npu3 *npu) -{ - uint64_t reg, val; - uint32_t base; - - base = xive_alloc_ipi_irqs(npu->chip_id, NPU3_IRQ_LEVELS, 64); - if (base == XIVE_IRQ_ERROR) { - NPU3ERR(npu, "Failed to allocate interrupt sources\n"); - return; - } - - xive_register_ipi_source(base, NPU3_IRQ_LEVELS, npu, &npu3_ipi_ops); - - /* Set IPI configuration */ - reg = NPU3_MISC_CFG; - val = npu3_read(npu, reg); - val = SETFIELD(NPU3_MISC_CFG_IPI_PS, val, NPU3_MISC_CFG_IPI_PS_64K); - val = SETFIELD(NPU3_MISC_CFG_IPI_OS, val, NPU3_MISC_CFG_IPI_OS_AIX); - npu3_write(npu, reg, val); - - /* Set IRQ base */ - reg = NPU3_MISC_INT_BAR; - val = SETFIELD(NPU3_MISC_INT_BAR_ADDR, 0ull, - (uint64_t)xive_get_trigger_port(base) >> 12); - npu3_write(npu, reg, val); - - npu->irq_base = base; -} - -static void npu3_init(struct npu3 *npu) -{ - struct npu3_dev *dev; - - platform.npu3_device_detect(npu); - npu3_for_each_dev(dev, npu) - npu3_device_detect_fixup(dev); - - npu3_misc_config(npu); - npu3_assign_bars(npu); - npu3_setup_irqs(npu); - npu3_init_nvlink(npu); -} - -void probe_npu3(void) -{ - struct dt_node *dn; - struct npu3 *npu; - - if (!npu3_dt_create()) - return; - - if (!platform.npu3_device_detect) { - prlog(PR_INFO, "NPU: Platform does not support NPU\n"); - return; - } - - dt_for_each_compatible(dt_root, dn, "ibm,power9-npu3") { - npu = npu3_create(dn); - npu3_init(npu); - } -} diff --git a/include/npu3-regs.h b/include/npu3-regs.h deleted file mode 100644 index 380fb549..00000000 --- a/include/npu3-regs.h +++ /dev/null @@ -1,253 +0,0 @@ -/* Copyright 2019 IBM Corp. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or - * implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */
-
-#ifndef __NPU3_REGS_H
-#define __NPU3_REGS_H
-
-#define NPU3_FIR(n)				(0x2c00 + (n) * 0x40)
-#define NPU3_FIR_MASK(n)			(0x2c03 + (n) * 0x40)
-#define NPU3_FIR_ACTION0(n)			(0x2c06 + (n) * 0x40)
-#define NPU3_FIR_ACTION1(n)			(0x2c07 + (n) * 0x40)
-#define NPU3_FIR_MAX				3
-
-/* NPU RING: Indirect address/data port */
-#define NPU3_MISC_SCOM_IND_SCOM_ADDR		0x33e
-#define NPU3_MISC_DA_ADDR			PPC_BITMASK(0, 23)
-#define NPU3_MISC_DA_LEN			PPC_BITMASK(24, 25)
-#define NPU3_MISC_DA_LEN_4B			2
-#define NPU3_MISC_DA_LEN_8B			3
-#define NPU3_MISC_SCOM_IND_SCOM_DATA		0x33f
-
-/* NPU RING: Indirect register blocks */
-#define NPU3_BLOCK(nib0, nib1)			((nib0) << 20 | (nib1) << 16)
-#define NPU3_REG_BLOCK(reg)			((reg) & 0xff0000)
-#define NPU3_REG_OFFSET(reg)			((reg) & 0xffff)
-
-#define NPU3_BLOCK_NDL_U(brk)			NPU3_BLOCK(0 + (brk) / 2,\
-							   8 + (brk) % 2 * 2)
-#define NPU3_BLOCK_NTL_U(brk)			NPU3_BLOCK(0 + (brk) / 2,\
-							   9 + (brk) % 2 * 2)
-#define NPU3_BLOCK_CQ_SM(n)			NPU3_BLOCK(4, (n))
-#define NPU3_BLOCK_CQ_CTL			NPU3_BLOCK(4, 4)
-#define NPU3_BLOCK_CQ_DAT			NPU3_BLOCK(4, 5)
-#define NPU3_BLOCK_NDL(brk)			NPU3_BLOCK(4 + (brk) / 2,\
-							   8 + (brk) % 2 * 2)
-#define NPU3_BLOCK_NTL(brk)			NPU3_BLOCK(4 + (brk) / 2,\
-							   9 + (brk) % 2 * 2)
-#define NPU3_BLOCK_NPU_ATS			NPU3_BLOCK(7, 0)
-#define NPU3_BLOCK_NPU_XTS			NPU3_BLOCK(7, 1)
-#define NPU3_BLOCK_NPU_MISC			NPU3_BLOCK(7, 2)
-#define NPU3_BLOCK_NPU_XTS_ATSD(n)		NPU3_BLOCK(8, (n))
-
-/* NDL_U block registers */
-#define NPU3_DLPL_CTL(brk)			(NPU3_BLOCK_NDL_U(brk) + 0xfff4)
-#define NPU3_DLPL_CTL_RESET_RX			PPC_BIT32(0)
-#define NPU3_DLPL_CTL_RESET_MISC		PPC_BIT32(1)
-#define NPU3_DLPL_CFG(brk)			(NPU3_BLOCK_NDL_U(brk) + 0xfff8)
-#define NPU3_DLPL_CFG_PRI_BYTESWAP		PPC_BIT32(0)
-
-/* NTL_U block registers */
-#define NPU3_NTL_MISC_CFG1(brk)			(NPU3_BLOCK_NTL_U(brk) + 0x0c0)
-#define NPU3_NTL_MISC_CFG1_NTL_RESET		PPC_BITMASK(8, 9)
-#define NPU3_NTL_CREQ_HDR_CRED_SND(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x400)
-#define NPU3_NTL_PRB_HDR_CRED_SND(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x410)
-#define NPU3_NTL_ATR_HDR_CRED_SND(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x418)
-#define NPU3_NTL_RSP_HDR_CRED_SND(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x428)
-#define NPU3_NTL_CREQ_DAT_CRED_SND(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x430)
-#define NPU3_NTL_RSP_DAT_CRED_SND(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x438)
-#define NPU3_NTL_CREQ_HDR_CRED_RCV(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x440)
-#define NPU3_NTL_DGD_HDR_CRED_RCV(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x448)
-#define NPU3_NTL_ATSD_HDR_CRED_RCV(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x460)
-#define NPU3_NTL_RSP_HDR_CRED_RCV(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x468)
-#define NPU3_NTL_CREQ_DAT_CRED_RCV(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x470)
-#define NPU3_NTL_RSP_DAT_CRED_RCV(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x478)
-#define NPU3_NTL_CQ_FENCE_STATUS(brk)		(NPU3_BLOCK_NTL_U(brk) + 0x500)
-#define NPU3_NTL_CQ_FENCE_STATUS_FIELD		PPC_BITMASK(0, 1)
-#define NPU3_NTL_CQ_FENCE_STATUS_FULL		3
-#define NPU3_NTL_CQ_FENCE_STATUS_HALF		2
-#define NPU3_NTL_CQ_FENCE_STATUS_NONE		0
-
-/*
- * CQ_SM block registers
- *
- * Definitions here use NPU3_BLOCK_CQ_SM(0), but when npu3_write() is given
- * one of these, it will do corresponding writes to every CQ_SM block.
- */
-#define NPU3_MCP_MISC_CFG0			(NPU3_BLOCK_CQ_SM(0) + 0x000)
-#define NPU3_MCP_MISC_CFG0_ENABLE_PBUS		PPC_BIT(26)
-#define NPU3_MCP_MISC_CFG0_ENABLE_SNARF_CPM	PPC_BIT(27)
-#define NPU3_MCP_MISC_CFG0_OCAPI_MODE		PPC_BITMASK(44, 48)
-#define NPU3_MCP_MISC_CFG0_NVLINK_MODE		PPC_BITMASK(49, 53)
-#define NPU3_MCP_MISC_CFG1			(NPU3_BLOCK_CQ_SM(0) + 0x008)
-#define NPU3_MCP_MISC_CFG2			(NPU3_BLOCK_CQ_SM(0) + 0x0f0)
-#define NPU3_SNP_MISC_CFG0			(NPU3_BLOCK_CQ_SM(0) + 0x180)
-#define NPU3_SNP_MISC_CFG0_ENABLE_PBUS		PPC_BIT(2)
-#define NPU3_SNP_MISC_CFG0_OCAPI_MODE		PPC_BITMASK(32, 36)
-#define NPU3_SNP_MISC_CFG0_NVLINK_MODE		PPC_BITMASK(37, 41)
-#define NPU3_GPU_MEM_BAR(brk)			(NPU3_BLOCK_CQ_SM(0) + 0x190 + (brk) * 8)
-#define NPU3_GPU_MEM_BAR_ENABLE			PPC_BIT(0)
-#define NPU3_GPU_MEM_BAR_ADDR_MASK		PPC_BITMASK(1, 35)
-#define NPU3_GPU_MEM_BAR_ADDR			PPC_BITMASK(1, 21)
-#define NPU3_GPU_MEM_BAR_SIZE			PPC_BITMASK(22, 35)
-#define NPU3_GPU_MEM_BAR_SL_MODE		PPC_BIT(36)
-#define NPU3_GPU_MEM_BAR_4T_LIMIT		PPC_BIT(37)
-#define NPU3_GPU_MEM_BAR_4T_SELECT		PPC_BITMASK(38, 39)
-#define NPU3_GPU_MEM_BAR_MODE			PPC_BITMASK(40, 43)
-#define NPU3_GPU_MEM_BAR_POISON			PPC_BIT(45)
-#define NPU3_GPU_MEM_BAR_CHIP_EQ_GROUP		PPC_BIT(49)
-#define NPU3_NTL_BAR(brk)			(NPU3_BLOCK_CQ_SM(0) + 0x1b8 + (brk) * 8)
-#define NPU3_NTL_BAR_ENABLE			PPC_BIT(0)
-#define NPU3_NTL_BAR_ADDR			PPC_BITMASK(3, 35)
-#define NPU3_NTL_BAR_SIZE			PPC_BITMASK(39, 43)
-#define NPU3_NTL_BAR_SIZE_128K			1
-#define NPU3_MMIO_BAR				(NPU3_BLOCK_CQ_SM(0) + 0x1e0)
-#define NPU3_MMIO_BAR_ENABLE			PPC_BIT(0)
-#define NPU3_MMIO_BAR_ADDR			PPC_BITMASK(3, 27)
-#define NPU3_GENID_BAR				(NPU3_BLOCK_CQ_SM(0) + 0x1e8)
-#define NPU3_GENID_BAR_ENABLE			PPC_BIT(0)
-#define NPU3_GENID_BAR_ADDR			PPC_BITMASK(3, 32)
-#define NPU3_RELAXED_SRC(n)			(NPU3_BLOCK_CQ_SM(0) + 0x1f0 + (n) * 8)
-#define NPU3_RELAXED_SRC_MAX			4
-#define NPU3_RELAXED_SRC_TAG			PPC_BITMASK(0, 13)
-#define NPU3_RELAXED_SRC_GRPCHP			PPC_BITMASK(0, 6)
-#define NPU3_RELAXED_SRC_PEC			PPC_BITMASK(12, 13)
-#define NPU3_RELAXED_SRC_TAGMASK		PPC_BITMASK(14, 27)
-#define NPU3_RELAXED_SRC_MASK_NPU		PPC_BIT(28)
-#define NPU3_RELAXED_SRC_MASK_PCIE		PPC_BIT(29)
-#define NPU3_RELAXED_SRC_MASK_L2L3		PPC_BIT(30)
-#define NPU3_RELAXED_SRC_RDSTART		PPC_BITMASK(32, 39)
-#define NPU3_RELAXED_SRC_RDEND			PPC_BITMASK(40, 47)
-#define NPU3_RELAXED_SRC_WRSTART		PPC_BITMASK(48, 55)
-#define NPU3_RELAXED_SRC_WREND			PPC_BITMASK(56, 63)
-#define NPU3_RELAXED_CFG2(brk)			(NPU3_BLOCK_CQ_SM(0) + 0x230 + (brk) * 8)
-#define NPU3_RELAXED_CFG2_CMD_CL_DMA_W		PPC_BIT(0)
-#define NPU3_RELAXED_CFG2_CMD_CL_DMA_W_HP	PPC_BIT(1)
-#define NPU3_RELAXED_CFG2_CMD_CL_DMA_INJ	PPC_BIT(2)
-#define NPU3_RELAXED_CFG2_CMD_PR_DMA_INJ	PPC_BIT(3)
-#define NPU3_RELAXED_CFG2_CMD_DMA_PR_W		PPC_BIT(4)
-#define NPU3_RELAXED_CFG2_CMD_CL_RD_NC_F0	PPC_BIT(5)
-#define NPU3_RELAXED_CFG2_SRC_WRENA(src)	PPC_BIT(32 + (src) * 4)
-#define NPU3_RELAXED_CFG2_SRC_RDENA(src)	PPC_BIT(33 + (src) * 4)
-#define NPU3_RELAXED_CFG2_SRC_AWENA(src)	PPC_BIT(34 + (src) * 4)
-#define NPU3_RELAXED_CFG2_SRC_ARENA(src)	PPC_BIT(35 + (src) * 4)
-
-/* CQ_CTL block registers */
-#define NPU3_CTL_MISC_CFG0			(NPU3_BLOCK_CQ_CTL + 0x000)
-#define NPU3_CTL_MISC_CFG1			(NPU3_BLOCK_CQ_CTL + 0x008)
-#define NPU3_CTL_MISC_CFG2			(NPU3_BLOCK_CQ_CTL + 0x010)
-#define NPU3_CTL_MISC_CFG2_OCAPI_MODE		PPC_BITMASK(0, 4)
-#define NPU3_CTL_MISC_CFG2_NVLINK_MODE		PPC_BITMASK(5, 9)
-#define NPU3_CTL_MISC_CFG3			(NPU3_BLOCK_CQ_CTL + 0x018)
-#define NPU3_CTL_BDF2PE_CFG(n)			(NPU3_BLOCK_CQ_CTL + 0x180 + (n) * 8)
-#define NPU3_CTL_BDF2PE_CFG_ENABLE		PPC_BIT(0)
-#define NPU3_CTL_BDF2PE_CFG_PE			PPC_BITMASK(4, 7)
-#define NPU3_CTL_BDF2PE_CFG_BDF			PPC_BITMASK(8, 23)
-
-/* CQ_DAT block registers */
-#define NPU3_DAT_MISC_CFG1			(NPU3_BLOCK_CQ_DAT + 0x008)
-#define NPU3_DAT_MISC_CFG1_OCAPI_MODE		PPC_BITMASK(40, 44)
-#define NPU3_DAT_MISC_CFG1_NVLINK_MODE		PPC_BITMASK(45, 49)
-
-/* NTL block registers */
-#define NPU3_NTL_MISC_CFG2(brk)			(NPU3_BLOCK_NTL(brk) + 0x000)
-#define NPU3_NTL_MISC_CFG2_BRICK_ENABLE		PPC_BIT(0)
-#define NPU3_NTL_MISC_CFG2_NDL_RX_PARITY_ENA	PPC_BIT(16)
-#define NPU3_NTL_MISC_CFG2_NDL_TX_PARITY_ENA	PPC_BIT(17)
-#define NPU3_NTL_MISC_CFG2_NDL_PRI_PARITY_ENA	PPC_BIT(18)
-#define NPU3_NTL_MISC_CFG2_RCV_CREDIT_OVERFLOW_ENA	PPC_BIT(19)
-#define NPU3_NTL_PRI_CFG(brk)			(NPU3_BLOCK_NTL(brk) + 0x0b0)
-#define NPU3_NTL_PRI_CFG_NDL			PPC_BITMASK(1, 2)
-
-/* NPU_ATS block registers */
-#define NPU3_ATS_IODA_ADDR			(NPU3_BLOCK_NPU_ATS + 0x108)
-#define NPU3_ATS_IODA_ADDR_AUTO_INC		PPC_BIT(0)
-#define NPU3_ATS_IODA_ADDR_TBL_SEL		PPC_BITMASK(11, 15)
-#define NPU3_ATS_IODA_ADDR_TBL_TVT		9
-#define NPU3_ATS_IODA_ADDR_TBL_ADDR		PPC_BITMASK(54, 63)
-#define NPU3_ATS_IODA_DATA			(NPU3_BLOCK_NPU_ATS + 0x110)
-#define NPU3_ATS_IODA_TVT_XLAT_ADDR		PPC_BITMASK(0, 47)
-#define NPU3_ATS_IODA_TVT_TABLE_LEVEL		PPC_BITMASK(48, 50)
-#define NPU3_ATS_IODA_TVT_TABLE_SIZE		PPC_BITMASK(51, 55)
-#define NPU3_ATS_IODA_TVT_PAGE_SIZE		PPC_BITMASK(59, 63)
-#define NPU3_ATS_TCE_KILL			(NPU3_BLOCK_NPU_ATS + 0x120)
-#define NPU3_ATS_TCE_KILL_ALL			PPC_BIT(0)
-#define NPU3_ATS_TCE_KILL_ONE			PPC_BIT(2)
-#define NPU3_ATS_TCE_KILL_PE_NUMBER		PPC_BITMASK(4, 7)
-#define NPU3_ATS_TCE_KILL_ADDRESS		PPC_BITMASK(15, 51)
-
-/* NPU_XTS block registers */
-#define NPU3_XTS_CFG				(NPU3_BLOCK_NPU_XTS + 0x020)
-#define NPU3_XTS_CFG_MMIOSD			PPC_BIT(1)
-#define NPU3_XTS_CFG_TRY_ATR_RO			PPC_BIT(6)
-#define NPU3_XTS_CFG_OPENCAPI			PPC_BIT(15)
-#define NPU3_XTS_CFG2				(NPU3_BLOCK_NPU_XTS + 0x028)
-#define NPU3_XTS_CFG2_NO_FLUSH_ENA		PPC_BIT(49)
-#define NPU3_XTS_CFG2_XSL2_ENA			PPC_BIT(55)
-#define NPU3_XTS_CFG3				(NPU3_BLOCK_NPU_XTS + 0x068)
-#define NPU3_XTS_ATSD_HYP(n)			(NPU3_BLOCK_NPU_XTS + 0x100 + (n) * 8)
-#define NPU3_XTS_ATSD_HYP_MSR_HV		PPC_BIT(51)
-#define NPU3_XTS_ATSD_HYP_LPARID		PPC_BITMASK(52, 63)
-#define NPU3_XTS_BDF_MAP(n)			(NPU3_BLOCK_NPU_XTS + 0x4000 + (n) * 8)
-#define NPU3_XTS_BDF_MAP_MAX			16
-#define NPU3_XTS_BDF_MAP_VALID			PPC_BIT(0)
-#define NPU3_XTS_BDF_MAP_UNFILT			PPC_BIT(1)
-#define NPU3_XTS_BDF_MAP_STACK			PPC_BITMASK(4, 6)
-#define NPU3_XTS_BDF_MAP_BRICK			PPC_BITMASK(7, 9)
-#define NPU3_XTS_BDF_MAP_BDF			PPC_BITMASK(16, 31)
-#define NPU3_XTS_BDF_MAP_XLAT			PPC_BITMASK(39, 40)
-#define NPU3_XTS_BDF_MAP_LPCR_PS		PPC_BITMASK(41, 43)
-#define NPU3_XTS_BDF_MAP_LPCR_ISL		PPC_BIT(44)
-#define NPU3_XTS_BDF_MAP_LPCR_TC		PPC_BIT(45)
-#define NPU3_XTS_BDF_MAP_LPCR_SC		PPC_BIT(46)
-#define NPU3_XTS_BDF_MAP_LPCR_BOT		PPC_BIT(47)
-#define NPU3_XTS_BDF_MAP_LPARSHORT		PPC_BITMASK(48, 51)
-#define NPU3_XTS_BDF_MAP_LPARID			PPC_BITMASK(52, 63)
-#define NPU3_XTS_PID_MAP(n)			(NPU3_BLOCK_NPU_XTS + 0x8000 + (n) * 32)
-#define NPU3_XTS_PID_MAP_VALID_ATRGPA0		PPC_BIT(0)
-#define NPU3_XTS_PID_MAP_VALID_ATRGPA1		PPC_BIT(1)
-#define NPU3_XTS_PID_MAP_VALID_ATSD		PPC_BIT(2)
-#define NPU3_XTS_PID_MAP_MSR			PPC_BITMASK(25, 31)
-#define NPU3_XTS_PID_MAP_MSR_DR			PPC_BIT(25)
-#define NPU3_XTS_PID_MAP_MSR_TA			PPC_BIT(26)
-#define NPU3_XTS_PID_MAP_MSR_HV			PPC_BIT(27)
-#define NPU3_XTS_PID_MAP_MSR_PR			PPC_BIT(28)
-#define NPU3_XTS_PID_MAP_MSR_US			PPC_BIT(29)
-#define NPU3_XTS_PID_MAP_MSR_SF			PPC_BIT(30)
-#define NPU3_XTS_PID_MAP_MSR_UV			PPC_BIT(31)
-#define NPU3_XTS_PID_MAP_LPARSHORT		PPC_BITMASK(40, 43)
-#define NPU3_XTS_PID_MAP_PID			PPC_BITMASK(44, 63)
-
-/* NPU_MISC block registers */
-#define NPU3_MISC_CFG				(NPU3_BLOCK_NPU_MISC + 0x030)
-#define NPU3_MISC_CFG_IPI_PS			PPC_BIT(11)
-#define NPU3_MISC_CFG_IPI_PS_64K		1
-#define NPU3_MISC_CFG_IPI_OS			PPC_BIT(12)
-#define NPU3_MISC_CFG_IPI_OS_AIX		0
-#define NPU3_MISC_CFG_IPI_OS_LINUX		1
-#define NPU3_MISC_INT_BAR			(NPU3_BLOCK_NPU_MISC + 0x098)
-#define NPU3_MISC_INT_BAR_ADDR			PPC_BITMASK(0, 39)
-#define NPU3_MISC_BDF2PE_CFG(n)			(NPU3_BLOCK_NPU_MISC + 0x100 + (n) * 8)
-#define NPU3_MISC_BDF2PE_CFG_ENABLE		PPC_BIT(0)
-#define NPU3_MISC_BDF2PE_CFG_PE			PPC_BITMASK(4, 7)
-#define NPU3_MISC_BDF2PE_CFG_BDF		PPC_BITMASK(8, 23)
-#define NPU3_MISC_PESTB_DATA(pe)		(NPU3_BLOCK_NPU_MISC + 0x200 + (pe) * 8)
-#define NPU3_MISC_PESTB_DATA_DMA_STOPPED_STATE	PPC_BIT(0)
-
-/* NPU_XTS_ATSD block registers */
-#define NPU3_XTS_ATSD_LAUNCH(n)			(NPU3_BLOCK_NPU_XTS_ATSD(n) + 0x000)
-#define NPU3_XTS_ATSD_MAX			16
-
-#endif /* __NPU3_REGS_H */
diff --git a/include/npu3.h b/include/npu3.h
deleted file mode 100644
index dda60ae1..00000000
--- a/include/npu3.h
+++ /dev/null
@@ -1,192 +0,0 @@
-/* Copyright 2019 IBM Corp.
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
- *
- *	http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- * implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-#ifndef __NPU3_H
-#define __NPU3_H
-
-#include
-#include
-#include
-#include
-
-enum npu3_dev_type {
-	NPU3_DEV_TYPE_UNKNOWN = 0,
-	NPU3_DEV_TYPE_NVLINK,
-	NPU3_DEV_TYPE_ANY = INT_MAX
-};
-
-/* Information about a currently running hw procedure */
-struct npu3_procedure {
-	uint16_t number;
-	uint16_t step;
-	uint32_t status;
-	unsigned long timeout;
-};
-
-/* Used to expose a hardware BAR (or logical slice of it) outside skiboot */
-struct npu3_bar {
-	bool enable;
-	uint64_t addr;
-	uint64_t size;
-	uint64_t trap;
-};
-
-struct npu3_dev_nvlink {
-	/*
-	 * PCI virtual device. BDFN is allocated based on NPU association.
-	 * Links connected to the same NPU will be exposed as different
-	 * functions of the same bus/device.
-	 */
-	struct pci_virt_device *pvd;
-
-	/* The PCI device created from pvd */
-	const char *loc_code;
-	struct pci_device *pd;
-
-	/* The associated GPU device */
-	struct pci_device *gpu;
-};
-
-struct npu3_dev {
-	enum npu3_dev_type type;
-	uint32_t index;
-	struct dt_node *dn;
-	struct npu3 *npu;
-	struct npu3_procedure proc;
-	uint64_t link_speed;
-
-	struct npu3_bar ntl_bar;
-	struct npu3_bar genid_bar;
-
-	/* Associated PHY information */
-	uint32_t ob_chiplet;
-	uint32_t phy_lane_mask;
-
-	/* For NPU3_DEV_TYPE_NVLINK */
-	struct npu3_dev_nvlink nvlink;
-};
-
-struct npu3_nvlink {
-	struct phb phb;
-	uint32_t ctx_ref[NPU3_XTS_BDF_MAP_MAX];
-};
-
-#define NPU3_LINKS_PER_NPU 4
-
-struct npu3 {
-	uint32_t index;
-	struct dt_node *dt_node;
-	uint32_t chip_id;
-	uint64_t xscom_base;
-
-	/* Global MMIO window (all NPU regs) */
-	uint64_t regs[2];
-
-	uint32_t irq_base;
-	struct lock lock;
-	bool tx_zcal_complete;
-
-	struct npu3_dev devices[NPU3_LINKS_PER_NPU];
-
-	/* Shared by any NPU3_DEV_TYPE_NVLINK devices */
-	struct npu3_nvlink nvlink;
-};
-
-static inline struct npu3 *npu3_phb_to_npu(struct phb *phb)
-{
-	assert(phb->phb_type == phb_type_npu_v3);
-	return container_of(phb, struct npu3, nvlink.phb);
-}
-
-/* Chip-scope index of the link */
-static inline uint32_t npu3_chip_dev_index(struct npu3_dev *dev)
-{
-	return dev->npu->index * NPU3_LINKS_PER_NPU + dev->index;
-}
-
-struct npu3_dev *npu3_next_dev(struct npu3 *npu, struct npu3_dev *dev,
-			       enum npu3_dev_type type);
-
-#define npu3_for_each_dev_type(dev, npu, type) \
-	for (dev = NULL; (dev = npu3_next_dev(npu, dev, type));)
-
-#define npu3_for_each_nvlink_dev(dev, npu) \
-	npu3_for_each_dev_type(dev, npu, NPU3_DEV_TYPE_NVLINK)
-
-#define npu3_for_each_dev(dev, npu) \
-	npu3_for_each_dev_type(dev, npu, NPU3_DEV_TYPE_ANY)
-
-struct npu3 *npu3_next_nvlink_npu(struct npu3 *npu, uint32_t chip_id);
-
-#define npu3_for_each_chip_nvlink_npu(npu, chip_id) \
-	for (npu = NULL; (npu = npu3_next_nvlink_npu(npu, chip_id));)
-
-#define NPU3_ANY_CHIP INT_MAX
-#define npu3_for_each_nvlink_npu(npu) \
-	npu3_for_each_chip_nvlink_npu(npu, NPU3_ANY_CHIP)
-
-void npu3_init_nvlink(struct npu3 *npu);
-void npu3_dev_enable_bars(struct npu3_dev *dev, bool enable);
-int64_t npu3_dev_reset(struct npu3_dev *dev);
-
-uint32_t npu3_chip_possible_gpus(void);
-int32_t npu3_dev_gpu_index(struct npu3_dev *dev);
-
-/* NPU RING register access */
-void npu3_write(struct npu3 *npu, uint64_t reg, uint64_t val);
-uint64_t npu3_read(struct npu3 *npu, uint64_t reg);
-void npu3_write_4b(struct npu3 *npu, uint64_t reg, uint32_t val);
-uint32_t npu3_read_4b(struct npu3 *npu, uint64_t reg);
-
-/* Link flags */
-#define NPU3_DEV_PCI_LINKED	0x1
-#define NPU3_DEV_DL_RESET	0x2
-
-void npu3_pvd_flag_set(struct npu3_dev *dev, uint8_t flag);
-void npu3_pvd_flag_clear(struct npu3_dev *dev, uint8_t flag);
-
-/* PHY procedures */
-#define NPU3_PROC_STATUS_MASK	0xc000000f
-#define NPU3_PROC_INPROGRESS	(1 << 31)
-#define NPU3_PROC_COMPLETE	(1 << 30)
-#define NPU3_PROC_NEXT		(1 << 29)
-#define NPU3_PROC_FAILED	2
-#define NPU3_PROC_ABORTED	3
-#define NPU3_PROC_UNSUPPORTED	4
-
-void npu3_dev_procedure_init(struct npu3_dev *dev, uint32_t pnum);
-uint32_t npu3_dev_procedure_status(struct npu3_dev *dev);
-
-/* OPAL entry points */
-int64_t npu3_init_context(struct phb *phb, uint64_t msr, uint64_t bdf);
-int64_t npu3_destroy_context(struct phb *phb, uint64_t bdf);
-int64_t npu3_map_lpar(struct phb *phb, uint64_t bdf, uint64_t lparid,
-		      uint64_t lpcr);
-int64_t npu3_set_relaxed_order(struct phb *phb, uint32_t gcid, int pec,
-			       bool enable);
-
-#define NPU3_PHB_INDEX_BASE 6 /* immediately after real PHBs */
-static inline int npu3_get_phb_index(unsigned int npu_index)
-{
-	return NPU3_PHB_INDEX_BASE + npu_index;
-}
-
-static inline int npu3_get_opal_id(unsigned int chip_id, unsigned int index)
-{
-	return phb4_get_opal_id(chip_id, index);
-}
-
-#endif /* __NPU3_H */
diff --git a/include/pci.h b/include/pci.h
index eb23a6d9..8d467213 100644
--- a/include/pci.h
+++ b/include/pci.h
@@ -352,7 +352,6 @@ enum phb_type {
 	phb_type_pcie_v4,
 	phb_type_npu_v2,
 	phb_type_npu_v2_opencapi,
-	phb_type_npu_v3,
 };
 
 /* Generic PCI NVRAM flags */
diff --git a/include/platform.h b/include/platform.h
index d113e6eb..27a3afa0 100644
--- a/include/platform.h
+++ b/include/platform.h
@@ -10,7 +10,6 @@ struct pci_device;
 struct pci_slot;
 struct errorlog;
 struct npu2;
-struct npu3;
 
 enum resource_id {
 	RESOURCE_ID_KERNEL,
@@ -126,7 +125,6 @@ struct platform {
 
 	/* NPU device detection */
 	void (*npu2_device_detect)(struct npu2 *npu);
-	void (*npu3_device_detect)(struct npu3 *npu);
 
 	/*
 	 * Probe platform, return true on a match, called before
diff --git a/include/skiboot.h b/include/skiboot.h
index f3378ec2..df11934f 100644
--- a/include/skiboot.h
+++ b/include/skiboot.h
@@ -209,7 +209,6 @@ extern int preload_capp_ucode(void);
 extern void preload_io_vpd(void);
 extern void probe_npu(void);
 extern void probe_npu2(void);
-extern void probe_npu3(void);
 extern void uart_init(void);
 extern void mbox_init(void);
 extern void early_uart_init(void);
diff --git a/platforms/astbmc/swift.c b/platforms/astbmc/swift.c
index 991a79d4..401aa6b2 100644
--- a/platforms/astbmc/swift.c
+++ b/platforms/astbmc/swift.c
@@ -5,93 +5,8 @@
 #include
 #include
-#include
 
 #include "astbmc.h"
-
-/* nvidia,link-speed uses a magic driver value */
-#define NVIDIA_LINK_SPEED_20000000000_BPS 3
-#define NVIDIA_LINK_SPEED_25781250000_BPS 8
-#define NVIDIA_LINK_SPEED_25000000000_BPS 9
-
-static void swift_npu3_device_detect(struct npu3 *npu)
-{
-	struct npu3_dev *dev;
-	uint32_t node, gpu_index;
-	char slot[6];
-
-	node = P9_GCID2NODEID(npu->chip_id);
-
-	switch (npu->index) {
-	case 0:
-		gpu_index = node * 2 + 1;
-		break;
-	case 2:
-		gpu_index = node * 2;
-		break;
-	default:
-		return;
-	}
-
-	snprintf(slot, sizeof(slot), "GPU%d", gpu_index);
-
-	npu3_for_each_dev(dev, npu) {
-		dev->type = NPU3_DEV_TYPE_NVLINK;
-		dt_add_property_string(dev->dn, "ibm,slot-label", slot);
-		dt_add_property_u64(dev->dn, "ibm,link-speed", 25000000000ull);
-		dt_add_property_cells(dev->dn, "nvidia,link-speed",
-				      NVIDIA_LINK_SPEED_25000000000_BPS);
-	}
-}
-
-#define SWIFT_POSSIBLE_GPUS 4
-
-#define G(g) (devs[g] ? devs[g]->nvlink.gpu->dn->phandle : 0)
-#define N(g) (devs[g] ? devs[g]->npu->nvlink.phb.dt_node->phandle : 0)
-
-#define add_peers_prop(g, p...) \
-	if (devs[g]) \
-		dt_add_property_cells(devs[g]->nvlink.gpu->dn, \
-				      "ibm,nvlink-peers", ##p)
-
-static void swift_finalise_dt(bool is_reboot)
-{
-	struct npu3 *npu;
-	struct npu3_dev *dev;
-	struct npu3_dev *devs[SWIFT_POSSIBLE_GPUS] = {};
-	int32_t index;
-
-	if (is_reboot)
-		return;
-
-	/* Collect the first link we find for each GPU */
-	npu3_for_each_nvlink_npu(npu) {
-		npu3_for_each_nvlink_dev(dev, npu) {
-			index = npu3_dev_gpu_index(dev);
-			if (index == -1 || index >= ARRAY_SIZE(devs))
-				continue;
-
-			if (dev->nvlink.gpu && !devs[index])
-				devs[index] = dev;
-		}
-	}
-
-	/* Add GPU interconnect properties */
-	add_peers_prop(0, G(3), G(2), G(2), G(2),
-			  G(3), G(1), G(1), G(1),
-			  N(0), N(0), N(0), N(0));
-
-	add_peers_prop(1, G(2), G(3), G(3), G(3),
-			  G(0), G(0), G(0), G(2),
-			  N(1), N(1), N(1), N(1));
-
-	add_peers_prop(2, G(1), G(3), G(3), G(3),
-			  G(0), G(0), G(0), G(1),
-			  N(2), N(2), N(2), N(2));
-
-	add_peers_prop(3, G(2), G(2), G(2), G(0),
-			  G(1), G(1), G(0), G(1),
-			  N(3), N(3), N(3), N(3));
-}
+#include
 
 static bool swift_probe(void)
 {
@@ -113,10 +28,8 @@ DECLARE_PLATFORM(swift) = {
 	.cec_reboot = astbmc_ipmi_reboot,
 	.elog_commit = ipmi_elog_commit,
 	.exit = astbmc_exit,
-	.finalise_dt = swift_finalise_dt,
 	.init = astbmc_init,
 	.name = "Swift",
-	.npu3_device_detect = swift_npu3_device_detect,
 	.pci_get_slot_info = dt_slot_get_slot_info,
 	.probe = swift_probe,
 	.resource_loaded = flash_resource_loaded,
-- 
2.31.1

From joel at jms.id.au  Tue Aug 31 14:53:44 2021
From: joel at jms.id.au (Joel Stanley)
Date: Tue, 31 Aug 2021 14:23:44 +0930
Subject: [Skiboot] [PATCH] doc: Add Swift, Mowgli and Rainier
Message-ID: <20210831045344.4079201-1-joel@jms.id.au>

Signed-off-by: Joel Stanley
---
 doc/platforms-and-cpus.rst | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/doc/platforms-and-cpus.rst b/doc/platforms-and-cpus.rst
index 2f5e9436f433..9d4b985163eb 100644
--- a/doc/platforms-and-cpus.rst
+++ b/doc/platforms-and-cpus.rst
@@ -51,8 +51,15 @@ astbmc   zaius        Power9      Ingrasys (Foxconn) "ingrasys,zaius"
 astbmc   mihawk       Power9                         "{wistron,ibm},mihawk"     Mihawk, IC922
 astbmc   nicole       Power9      Yadro              "YADRO,nicole"             Nicole
 ibm-fsp  zz           Power9                         "ibm,zz-(1|2)s(2|4)u"
+astbmc   swift        Power9                         "ibm,swift"                Power9P
+astbmc   mowgli       Power9      Wistron            "ibm,mowgli"
 ======== ============ =========== ================== ========================== ============================= =======
+======== ============ =========== ================== ========================== ============================= =======
+Platform Sub platform Host CPU(s) Manufacturer       compatible                 Other names/Notes             Link(s)
+======== ============ =========== ================== ========================== ============================= =======
+astbmc   rainier      Power10                        "ibm,rainier"
+======== ============ =========== ================== ========================== ============================= =======
 
 Dropped Platforms
 -----------------
-- 
2.33.0