[Skiboot] [PATCH] npu2/hw-procedures: Fence bricks via NTL instead of MISC

Andrew Donnellan andrew.donnellan at au1.ibm.com
Wed Jul 11 13:56:10 AEST 2018


On 11/07/18 12:32, Reza Arbab wrote:
> There are a couple of places we can set/unset fence for a brick:
> 
> 1. MISC register: NPU2_MISC_FENCE_STATE
> 2. NTL register for the brick: NPU2_NTL_MISC_CFG1(ndev)
> 
> Recent testing of ATS in combination with GPU reset has exposed a side
> effect of using (1); if fence is set for all six bricks, it triggers a
> sticky nmmu latch which prevents the NPU from getting ATR responses.
> This manifests as a hang in the tests.
> 
> We have npu2_dev_fence_brick() which uses (1), and only two calls to it.
> Replace the call which sets fence with a write to (2). Remove the
> corresponding unset call entirely. It's unneeded because the procedures
> already do a progression from full fence to half to idle using (2).
> 
> Signed-off-by: Reza Arbab <arbab at linux.ibm.com>

Hmm, I wonder whether there's any side effects from how we use 
NPU2_MISC_FENCE_STATE in the opencapi code... anything we should be 
aware of?

This looks good to me.

Reviewed-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>

> ---
>   hw/npu2-hw-procedures.c | 31 +++++++------------------------
>   1 file changed, 7 insertions(+), 24 deletions(-)
> 
> diff --git a/hw/npu2-hw-procedures.c b/hw/npu2-hw-procedures.c
> index 7ab08f0..8de6b4d 100644
> --- a/hw/npu2-hw-procedures.c
> +++ b/hw/npu2-hw-procedures.c
> @@ -236,26 +236,6 @@ static bool poll_fence_status(struct npu2_dev *ndev, uint64_t val)
>   	return false;
>   }
>   
> -static int64_t npu2_dev_fence_brick(struct npu2_dev *ndev, bool set)
> -{
> -	/*
> -	 * Add support for queisce/fence the brick at
> -	 * procedure reset time.
> -	 */
> -	uint32_t brick;
> -	uint64_t val;
> -
> -	brick = ndev->index;
> -	if (set)
> -		brick += 6;
> -
> -	val = PPC_BIT(brick);
> -	NPU2DEVINF(ndev, "%s fence brick %d, val %llx\n", set ? "set" : "clear",
> -			ndev->index, val);
> -	npu2_write(ndev->npu, NPU2_MISC_FENCE_STATE, val);
> -	return 0;
> -}
> -
>   /* Procedure 1.2.1 - Reset NPU/NDL */
>   uint32_t reset_ntl(struct npu2_dev *ndev)
>   {
> @@ -326,9 +306,6 @@ static uint32_t reset_ntl_release(struct npu2_dev *ndev)
>   
>   	}
>   
> -	/* Release the fence */
> -	npu2_dev_fence_brick(ndev, false);
> -
>   	val = npu2_read(ndev->npu, NPU2_NTL_MISC_CFG1(ndev));
>   	val &= 0xFFBFFFFFFFFFFFFF;
>   	npu2_write(ndev->npu, NPU2_NTL_MISC_CFG1(ndev), val);
> @@ -952,7 +929,13 @@ int64_t npu2_dev_procedure(void *dev, struct pci_cfg_reg_filter *pcrf,
>   
>   void npu2_dev_procedure_reset(struct npu2_dev *dev)
>   {
> -	npu2_dev_fence_brick(dev, true);
> +	uint64_t val;
> +
> +	/* Fence the brick */
> +	val = npu2_read(dev->npu, NPU2_NTL_MISC_CFG1(dev));
> +	val |= PPC_BIT(8) | PPC_BIT(9);
> +	npu2_write(dev->npu, NPU2_NTL_MISC_CFG1(dev), val);
> +
>   	npu2_clear_link_flag(dev, NPU2_DEV_DL_RESET);
>   }
>   
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan at au1.ibm.com  IBM Australia Limited



More information about the Skiboot mailing list