[Skiboot] [PATCH skiboot] npu2: Clear fence state for a brick being reset
Alexey Kardashevskiy
aik at ozlabs.ru
Thu May 30 13:35:00 AEST 2019
On 30/05/2019 00:52, Reza Arbab wrote:
> Hi Alexey,
>
> On Wed, May 29, 2019 at 04:58:59PM +1000, Alexey Kardashevskiy wrote:
>> --- a/hw/npu2-hw-procedures.c
>> +++ b/hw/npu2-hw-procedures.c
>> @@ -283,6 +283,14 @@ uint32_t reset_ntl(struct npu2_dev *ndev)
>> 		phy_write_lane(ndev, &NPU2_PHY_TX_LANE_PDWN, lane, 0);
>> 	}
>>
>> +	/* Clear fence state for the brick */
>> +	val = npu2_read(ndev->npu, NPU2_MISC_FENCE_STATE);
>> +	if (val & PPC_BIT(ndev->brick_index)) {
>> +		NPU2DEVINF(ndev, "Clearing brick fence\n");
>> +		val = PPC_BIT(ndev->brick_index);
>> +		npu2_write(ndev->npu, NPU2_MISC_FENCE_STATE, val);
>> +	}
>> +
>
> I think you might need a poll_fence_status() call after this, to make
> sure you don't proceed until the fence is actually cleared.
>
>> 	/* Write PRI */
>> 	val = SETFIELD(PPC_BITMASK(0,1), 0ull, obus_brick_index(ndev));
>> 	npu2_write_mask(ndev->npu, NPU2_NTL_PRI_CFG(ndev), val, -1ULL);
>
> I'm not fully across the bug provoking this change, but it seems strange
> to me to clear the fence here when the code raises it again (via
> NPU2_NTL_MISC_CFG1) just a few lines later:
>
> 	/* NTL Reset */
> 	val = npu2_read(ndev->npu, NPU2_NTL_MISC_CFG1(ndev));
> 	val |= PPC_BIT(8) | PPC_BIT(9);
> 	npu2_write(ndev->npu, NPU2_NTL_MISC_CFG1(ndev), val);
>
> 	if (!poll_fence_status(ndev, 0xc000000000000000UL))
> 		return PROCEDURE_COMPLETE | PROCEDURE_FAILED;
>
> Just thought I'd mention that.
The NPU spec says that NPU2_NTL_MISC_CFG1 requires polling, but it does
not mention polling for NPU2_MISC_FENCE_STATE. hw/npu2-opencapi.c does
seem to poll, although it does so in a huge set_fence_control() rather
than in a small poll_fence_status(). I am confused.
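For illustration only, the clear-then-poll sequence suggested above could
look roughly like the sketch below. It runs against a simulated register,
not real hardware. struct fake_npu, fake_read(), fake_write() and
clear_brick_fence() are hypothetical names made up for this example, and
the write-1-to-clear behaviour with a delayed clear is an assumption, not
skiboot code or documented NPU2 semantics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit word. */
#define PPC_BIT(bit)	(0x8000000000000000ULL >> (bit))

/* Hypothetical stand-in for the NPU and its fence state register. */
struct fake_npu {
	uint64_t fence_state;	/* one bit per fenced brick */
	int clear_delay;	/* reads before a requested clear lands */
	uint64_t pending_clear;	/* bits written but not yet cleared */
};

/* Simulated register read; applies a pending clear once the delay expires. */
static uint64_t fake_read(struct fake_npu *npu)
{
	if (npu->pending_clear && npu->clear_delay-- <= 0) {
		npu->fence_state &= ~npu->pending_clear;
		npu->pending_clear = 0;
	}
	return npu->fence_state;
}

/* Simulated register write; ASSUMES write-1-to-clear semantics. */
static void fake_write(struct fake_npu *npu, uint64_t val)
{
	npu->pending_clear |= val & npu->fence_state;
}

/*
 * Clear the fence bit for one brick, then poll until it reads back as
 * zero. Returns false if the bit is still set after max_polls reads.
 */
static bool clear_brick_fence(struct fake_npu *npu, unsigned int brick,
			      int max_polls)
{
	uint64_t bit;
	int i;

	assert(brick < 64);
	bit = PPC_BIT(brick);

	if (!(fake_read(npu) & bit))
		return true;	/* not fenced, nothing to do */

	fake_write(npu, bit);

	for (i = 0; i < max_polls; i++) {
		if (!(fake_read(npu) & bit))
			return true;
	}
	return false;
}
```

A caller would treat a false return the same way reset_ntl() treats a
failed poll_fence_status(), i.e. as a failed procedure. Whether the real
register needs such a poll at all is exactly the open question here.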
> Regardless, if it fixes your bug that's
> proof enough for me that it's worthwhile.
Yup, it does successfully recover from HMIs.
--
Alexey