[Skiboot] [PATCH v2 3/6] hw/npu2: Common NPU2 init routine between NVLink and OpenCAPI

Thu Aug 30 14:37:04 AEST 2018

On 30/08/18 02:26, Frederic Barrat wrote:
>> +    dt_for_each_compatible(dn, np, "ibm,npu-link") {
>> +        dev = &npu->devices[i];
>> +        dev->link_index = dt_prop_get_u32(np, "ibm,npu-link-index");
>> +        /* May be overridden by platform presence detection */
>> +        dev->brick_index = dev->link_index;
>> +        dev->type = npu2_dt_link_dev_type(np);
>> +        dev->npu = npu;
>> +        dev->dt_node = np;
>> +        dev->pl_xscom_base = dt_prop_get_u64(np, "ibm,npu-phy");
>> +        dev->lane_mask = dt_prop_get_u32(np, "ibm,npu-lane-mask");
>> +        dev->link_speed = dt_prop_get_u64(np, "ibm,link-speed");
>> +        i++;
> 
> I think we should add a check to make sure we're not overflowing our 
> array of devices if somehow the device tree data is not consistent and 
> we have more "ibm,npu-link" than advertised in "ibm,npu-links".

Good idea.

>> +int npu2_opencapi_init_npu(struct npu2 *npu)
>> +{
>> +    struct npu2_dev *dev;
>> +    uint64_t reg[2];
>> +    int rc;
>>
>>       assert(platform.ocapi);
>> +    read_nvram_training_state();
>>
>>       /* TODO: Test OpenCAPI with fast reboot and make it work */
>>       disable_fast_reboot("OpenCAPI device enabled");
>>
>> -    scom_base = dt_get_address(dn, 0, NULL);
>> -    prlog(PR_INFO, "OCAPI:     SCOM Base:  %08x\n", scom_base);
>> +    setup_global_mmio_bar(npu->chip_id, npu->xscom_base, reg);
>>
>> -    setup_global_mmio_bar(gcid, scom_base, reg);
>> +    npu->ocapi_global_mmio_base = (void *)reg[0];
> 
> How is this different from npu->regs that npu2.c is setting up? Do we 
> have 2 different variables for the same thing, nvlink using one and 
> opencapi using the other? I'm probably missing something...

I know there was a good reason I did this. It was a really good reason. 
But you're right, they seem to be set identically from everything I can 
see...

Let me see if I can figure out what the really good reason was...

> 
> More generally, we have fields in the struct npu2 which are either used 
> for nvlink or opencapi. And it's getting confusing to know which ones 
> are defined. At the very minimum, we should probably make it clear 
> through comments or ordering which is defined where. A lot of them 
> already were, I think we've just adding a few, maybe less obvious ones: 
> regs, mm_base, mm_size. I could have missed some.
> A per-type substructure would work, but it's more cumbersome to use, so 
> I don't have strong feelings there.

Agreed but nothing stands out as a particularly elegant solution here.

> 
> 
>>
>> -    n = zalloc(sizeof(struct npu2) + links * sizeof(struct npu2_dev));
>> -    n->devices = (struct npu2_dev *)(n + 1);
>> -    n->chip_id = gcid;
>> -    n->xscom_base = scom_base;
>> -    n->regs = (void *)reg[0];
>> -    n->dt_node = dn;
>> +    for (int i = 0; i < npu->total_devices; i++) {
>> +        dev = &npu->devices[i];
>> +        if (dev->type != NPU2_DEV_TYPE_OPENCAPI)
>> +            continue;
>>
>> -    dt_for_each_compatible(dn, link, "ibm,npu-link") {
>> -        dev_index = dt_prop_get_u32(link, "ibm,npu-link-index");
>> -        prlog(PR_INFO, "OCAPI: Configuring link index %lld\n",
>> -              dev_index);
>> +        prlog(PR_INFO, "OCAPI: Configuring link index %d, brick %d\n",
>> +              dev->link_index, dev->brick_index);
>>
>>           /* Procedure 13.1.3.1 - Select OCAPI vs NVLink */
>> -        brick_config(gcid, scom_base, dev_index);
>> +        brick_config(npu->chip_id, npu->xscom_base, dev->brick_index);
>>
>>           /* Procedure 13.1.3.5 - Transaction Layer Configuration */
>> -        tl_config(gcid, scom_base, dev_index);
>> +        tl_config(npu->chip_id, npu->xscom_base, dev->brick_index);
>>
>>           /* Procedure 13.1.3.6 - Address Translation Configuration */
>> -        address_translation_config(gcid, scom_base, dev_index);
>> +        address_translation_config(npu->chip_id, npu->xscom_base, 
>> dev->brick_index);
>>       }
>>
>>       /* Procedure 13.1.3.10 - Interrupt Configuration */
>> -    rc = setup_irq(n);
>> +    rc = setup_irq(npu);
>>       if (rc)
>>           goto failed;
>>
>> -    dt_for_each_compatible(dn, link, "ibm,npu-link") {
>> -        npu2_opencapi_setup_device(link, n, &n->devices[i]);
>> -        i++;
>> +    for (int i = 0; i < npu->total_devices; i++) {
> 
> 
> Just to make sure we are on the same page: the only reason why we broke 
> that loop into 2 parts, with setup_irq() in the middle, is because we do 
> all the hw setup first, and we only want to call the 2nd part, which 
> modifies the device tree and will give visibility of the hw to the OS, 
> when we are sure that all the hw setup was successful for all the links. 
> Is that right?

Well, the real reason is that it kind of matched how things were being 
done in npu2.c the first time I wrote it... but yes, in the event that 
the NPU-wide component of setup fails we bail out before setting up the 
PHB. Of course the only part of that sequence where we error out is if 
IRQ allocation fails...

Looking at setup_device() I should restructure that a bit, but that 
should be a separate patch.

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan at au1.ibm.com  IBM Australia Limited