[Skiboot] [PATCH] npu2/hw-procedures: Fence bricks via NTL instead of MISC
Andrew Donnellan
andrew.donnellan at au1.ibm.com
Wed Jul 11 13:56:10 AEST 2018
On 11/07/18 12:32, Reza Arbab wrote:
> There are a couple of places we can set/unset fence for a brick:
>
> 1. MISC register: NPU2_MISC_FENCE_STATE
> 2. NTL register for the brick: NPU2_NTL_MISC_CFG1(ndev)
>
> Recent testing of ATS in combination with GPU reset has exposed a side
> effect of using (1); if fence is set for all six bricks, it triggers a
> sticky nmmu latch which prevents the NPU from getting ATR responses.
> This manifests as a hang in the tests.
>
> We have npu2_dev_fence_brick() which uses (1), and only two calls to it.
> Replace the call which sets fence with a write to (2). Remove the
> corresponding unset call entirely. It's unneeded because the procedures
> already do a progression from full fence to half to idle using (2).
>
> Signed-off-by: Reza Arbab <arbab at linux.ibm.com>
Hmm, I wonder whether there's any side effects from how we use
NPU2_MISC_FENCE_STATE in the opencapi code... anything we should be
aware of?
This looks good to me.
Reviewed-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>
> ---
> hw/npu2-hw-procedures.c | 31 +++++++------------------------
> 1 file changed, 7 insertions(+), 24 deletions(-)
>
> diff --git a/hw/npu2-hw-procedures.c b/hw/npu2-hw-procedures.c
> index 7ab08f0..8de6b4d 100644
> --- a/hw/npu2-hw-procedures.c
> +++ b/hw/npu2-hw-procedures.c
> @@ -236,26 +236,6 @@ static bool poll_fence_status(struct npu2_dev *ndev, uint64_t val)
> return false;
> }
>
> -static int64_t npu2_dev_fence_brick(struct npu2_dev *ndev, bool set)
> -{
> - /*
> - * Add support for queisce/fence the brick at
> - * procedure reset time.
> - */
> - uint32_t brick;
> - uint64_t val;
> -
> - brick = ndev->index;
> - if (set)
> - brick += 6;
> -
> - val = PPC_BIT(brick);
> - NPU2DEVINF(ndev, "%s fence brick %d, val %llx\n", set ? "set" : "clear",
> - ndev->index, val);
> - npu2_write(ndev->npu, NPU2_MISC_FENCE_STATE, val);
> - return 0;
> -}
> -
> /* Procedure 1.2.1 - Reset NPU/NDL */
> uint32_t reset_ntl(struct npu2_dev *ndev)
> {
> @@ -326,9 +306,6 @@ static uint32_t reset_ntl_release(struct npu2_dev *ndev)
>
> }
>
> - /* Release the fence */
> - npu2_dev_fence_brick(ndev, false);
> -
> val = npu2_read(ndev->npu, NPU2_NTL_MISC_CFG1(ndev));
> val &= 0xFFBFFFFFFFFFFFFF;
> npu2_write(ndev->npu, NPU2_NTL_MISC_CFG1(ndev), val);
> @@ -952,7 +929,13 @@ int64_t npu2_dev_procedure(void *dev, struct pci_cfg_reg_filter *pcrf,
>
> void npu2_dev_procedure_reset(struct npu2_dev *dev)
> {
> - npu2_dev_fence_brick(dev, true);
> + uint64_t val;
> +
> + /* Fence the brick */
> + val = npu2_read(dev->npu, NPU2_NTL_MISC_CFG1(dev));
> + val |= PPC_BIT(8) | PPC_BIT(9);
> + npu2_write(dev->npu, NPU2_NTL_MISC_CFG1(dev), val);
> +
> npu2_clear_link_flag(dev, NPU2_DEV_DL_RESET);
> }
>
>
--
Andrew Donnellan OzLabs, ADL Canberra
andrew.donnellan at au1.ibm.com IBM Australia Limited
More information about the Skiboot
mailing list