[Skiboot] [PATCH v2 2/5] fast-reboot: creset PHBs on fast reboot
Andrew Donnellan
andrew.donnellan at au1.ibm.com
Mon Jan 16 19:32:16 AEDT 2017
On 16/01/17 10:20, Gavin Shan wrote:
> On Fri, Jan 13, 2017 at 04:09:38PM +1100, Andrew Donnellan wrote:
>> On fast reboot, perform a creset of all PHBs. This ensures that any PHBs
>> that are fenced will be working after the reboot.
>>
>> A later patch will disable CAPI mode during cresets - as such, PHBs in CAPI
>> mode will return to regular PCIe mode during a fast reboot.
>>
>> Signed-off-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>
>>
>> ---
>>
>> This slows down fast reboot a bit. Could we do this in parallel? Should we
>> limit it only to fenced/CAPI mode PHBs? (Resetting all PHBs might make
>> things slightly more reliable, but probably only slightly...)
>> ---
>> core/pci.c | 20 ++++++++++++++++++++
>> 1 file changed, 20 insertions(+)
>>
>> diff --git a/core/pci.c b/core/pci.c
>> index 05fd80f..bef42b4 100644
>> --- a/core/pci.c
>> +++ b/core/pci.c
>> @@ -1497,6 +1497,7 @@ static void __pci_reset(struct list_head *list)
>> void pci_reset(void)
>> {
>> unsigned int i;
>> + struct pci_slot *slot;
>>
>> prlog(PR_NOTICE, "PCI: Clearing all devices...\n");
>>
>> @@ -1508,6 +1509,25 @@ void pci_reset(void)
>> if (!phb)
>> continue;
>> __pci_reset(&phb->devices);
>
> PCINOTICE(phb, 0, ...) indicates the message applies to root port,
> not PHB.
OK, will fix
>
>> +
>> + slot = phb->slot;
>> + if (!slot || !slot->ops.creset) {
>> + PCINOTICE(phb, 0, "Can't do complete reset\n");
>> + } else {
>> + int rc = slot->ops.creset(slot);
>
> slot->ops.create() returns error or delay whose data type is int64_t.
Good catch, will fix
>
>> + while (rc > 0) {
>> + time_wait(rc);
>> + rc = slot->ops.poll(slot);
>> + }
>> + if (rc < 0) {
>> + PCIERR(phb, 0, "Complete reset failed, aborting"
>> + "fast reboot\n");
>
> It might be worthy to print @rc.
Will add.
>
>> + if (platform.cec_reboot)
>> + platform.cec_reboot();
>> + while (true) {}
>
> Rebooting the system in failing case seems not good idea. To
> remove the failing PHB to avoid probing on it afterwards would
> be better.
Hmm, I'm not entirely sure here - I see your point, though on the other
hand, if a PHB has been left in a bad state somehow, and a user attempts
to reboot, they're probably going to want a full IPL. Stewart, any thoughts?
Thanks for the review!
--
Andrew Donnellan OzLabs, ADL Canberra
andrew.donnellan at au1.ibm.com IBM Australia Limited
More information about the Skiboot
mailing list