[Skiboot] [PATCH 7/7] npu2-opencapi: Fix adapter reset when using 2 adapters

Frederic Barrat fbarrat at linux.ibm.com
Mon Mar 11 20:02:38 AEDT 2019



Le 06/03/2019 à 03:43, Andrew Donnellan a écrit :
> On 2/3/19 12:52 am, Frederic Barrat wrote:
>> If two opencapi adapters are on the same obus, we may try to train the
>> two links in parallel at boot time, when all the PCI links are being
>> trained. Both links use the same i2c controller to handle the reset
>> signal, so some care is needed to make sure resetting one doesn't
>> interfere with the reset of the other. We need to keep track of the
>> current state of the i2c controller (and use locking).
>>
>> This went mostly unnoticed as you need to have 2 opencapi cards on the
>> same socket and links tended to train anyway because of the retries.
>>
>> Signed-off-by: Frederic Barrat <fbarrat at linux.ibm.com>
> 
> Is this fix important enough that it should go to stable?

Yeah, probably. I'll cc stable in the v2.

> Minor comment below
> 

>> diff --git a/hw/npu2-opencapi.c b/hw/npu2-opencapi.c
>> index 786fb655..ccd3cd4b 100644
>> --- a/hw/npu2-opencapi.c
>> +++ b/hw/npu2-opencapi.c

>> +    return pin;
>> +}
>> +
>> +static void assert_adapter_reset(struct npu2_dev *dev)
>> +{
>> +    uint8_t pin, data;
> 
> Any reason to keep data around instead of just referencing dev->npu->* 
> directly?


No reason, I just thought the code would be a bit more readable by 
spelling out what we pass to the underlying i2c layer. Let me know how 
strongly you object.

   Fred



> 
>> +    int rc;
>> +    pin = get_reset_pin(dev);
>>       /*
>>        * set the i2c reset pin in output mode
>>        *
>>        * On the 9554 device, register 3 is the configuration
>>        * register and a pin is in output mode if its value is 0
>>        */
>> -    data = ~pin;
>> +    lock(&dev->npu->i2c_lock);
>> +    dev->npu->i2c_pin_mode &= ~pin;
>> +    data = dev->npu->i2c_pin_mode;
>> +
>>       rc = i2c_request_send(dev->npu->i2c_port_id_ocapi,
>>               platform.ocapi->i2c_reset_addr, SMBUS_WRITE,
>>               0x3, 1,
>> @@ -885,16 +895,20 @@ static void assert_adapter_reset(struct npu2_dev 
>> *dev)
>>           goto err;
>>       /* register 1 controls the signal, reset is active low */
>> -    data = ~pin;
>> +    dev->npu->i2c_pin_wr_state &= ~pin;
>> +    data = dev->npu->i2c_pin_wr_state;
>> +
>>       rc = i2c_request_send(dev->npu->i2c_port_id_ocapi,
>>               platform.ocapi->i2c_reset_addr, SMBUS_WRITE,
>>               0x1, 1,
>>               &data, sizeof(data), 120);
>>       if (rc)
>>           goto err;
>> +    unlock(&dev->npu->i2c_lock);
>>       return;
>>   err:
>> +    unlock(&dev->npu->i2c_lock);
>>       /**
>>        * @fwts-label OCAPIDeviceResetFailed
>>        * @fwts-advice There was an error attempting to send
>> @@ -905,14 +919,20 @@ err:
>>   static void deassert_adapter_reset(struct npu2_dev *dev)
>>   {
>> -    uint8_t data;
>> +    uint8_t pin, data;
>>       int rc;
>> -    data = 0xFF;
>> +    pin = get_reset_pin(dev);
>> +
>> +    lock(&dev->npu->i2c_lock);
>> +    dev->npu->i2c_pin_wr_state |= pin;
>> +    data = dev->npu->i2c_pin_wr_state;
>> +
>>       rc = i2c_request_send(dev->npu->i2c_port_id_ocapi,
>>               platform.ocapi->i2c_reset_addr, SMBUS_WRITE,
>>               0x1, 1,
>>               &data, sizeof(data), 120);
>> +    unlock(&dev->npu->i2c_lock);
>>       if (rc) {
>>           /**
>>            * @fwts-label OCAPIDeviceResetFailed
>> diff --git a/include/npu2.h b/include/npu2.h
>> index cfa3b1f8..b62d1748 100644
>> --- a/include/npu2.h
>> +++ b/include/npu2.h
>> @@ -173,7 +173,11 @@ struct npu2 {
>>       struct phb    phb_nvlink;
>>       uint32_t    phb_index;
>> +    /* OCAPI */
>>       uint64_t    i2c_port_id_ocapi;
>> +    struct lock    i2c_lock;
>> +    uint8_t        i2c_pin_mode;
>> +    uint8_t        i2c_pin_wr_state;
>>   };
>>   static inline struct npu2 *phb_to_npu2_nvlink(struct phb *phb)
>>
> 



More information about the Skiboot mailing list