[Skiboot] [PATCH v2 4/5] hw/phb3: add host sync notifier to trigger creset/CAPP disable on kexec
Andrew Donnellan
andrew.donnellan at au1.ibm.com
Mon Jan 16 19:30:21 AEDT 2017
On 16/01/17 10:51, Gavin Shan wrote:
>> + switch (slot->state) {
>> + case PHB3_SLOT_NORMAL:
>> + lock(&capi_lock);
>> + rc = (chip->capp_phb3_attached_mask & (1 << p->index)) ?
>> + OPAL_PHB_CAPI_MODE_CAPI :
>> + OPAL_PHB_CAPI_MODE_PCIE;
>> + unlock(&capi_lock);
>> +
>> + if (rc == OPAL_PHB_CAPI_MODE_PCIE)
>> + return true;
>> +
>> + PHBINF(p, "PHB in CAPI mode, resetting\n");
>> + p->flags &= ~PHB3_CAPP_RECOVERY;
>> + phb3_creset(slot);
>> + return false;
>> + default:
>> + rc = slot->ops.poll(slot);
>> + return rc == OPAL_SUCCESS;
>
> The code seems incorrect. In opal_sync_host_reboot(), OPAL_BUSY_EVENT
> is returned to Linux if any notifier returns false. Linux delays 10ms
> and call opal_sync_host_reboot() again, meaning all notifiers will be
> triggered (including phb3_host_sync_reset()) again.
>
> When error is returned from slot->ops.poll(slot), the PHB might be put
> to non_PHB3_SLOT_NORMAL state. false is returned on the error. This
> function is called again and jump to default case. It's obviously not
> what we want. So the code would be:
>
> return rc <= OPAL_SUCCESS;
Good catch, will investigate further tomorrow.
>
> Anyway, I think opal_sync_host_reboot() needs enhancement to avoid
> triggering all notifier on failure from any of them. It's something
> out of scope though.
opal_sync_host_reboot() wasn't exactly designed with "failure" in mind -
the reason I describe this as an abuse of the host sync notifier system
is that the other notifiers just flush buffers rather than doing
anything complex like this.
--
Andrew Donnellan OzLabs, ADL Canberra
andrew.donnellan at au1.ibm.com IBM Australia Limited
More information about the Skiboot
mailing list