[Skiboot] [PATCH] OPAL:PCI should throw error on platform PCI devices not being detected

Mamatha Inamdar mamatha4 at linux.vnet.ibm.com
Thu Jun 9 15:50:41 AEST 2016



On 06/08/2016 09:48 PM, Mukesh Ojha wrote:
> Hi Mamatha,
>
> A few observations.
>
> On Wednesday 08 June 2016 09:00 PM, Mamatha Inamdar wrote:
>> Problem Description: Some times system boots to petitboot and get 
>> into a state where
>> only the PHBs were detected and *no* other PCI devices.
>
> s/Some times/Sometimes

Thanks, will update.

>
>>
>> Fix: This patch is to check the detected PCI devices against the PCI 
>> slot table in the platform
>> definition and display an error if they don't match and commit an 
>> errorlog.
>>
>> Test Results:
>> After testing the patch, we see the following traces on the SOL console.
>> [8212824503,5] PCI: Check for a present device...
>> [8212921065,3] Slot3 PCI: No device found
>> [8212987391,5] Device Found in SLOT= Backplane PLX
>> [8213085726,3] Slot4 PCI: No device found
>>
>> From: Mamatha Inamdar <mamatha4 at linux.vnet.ibm.com>
>>
>> Signed-off-by: Mamatha Inamdar <mamatha4 at linux.vnet.ibm.com>
>> ---
>>   core/pci.c         |   33 +++++++++++++++++++++++++++++++++
>>   include/errorlog.h |    3 ++-
>>   2 files changed, 35 insertions(+), 1 deletion(-)
>>
>> diff --git a/core/pci.c b/core/pci.c
>> index 9b238d0..987f69d 100644
>> --- a/core/pci.c
>> +++ b/core/pci.c
>> @@ -20,6 +20,7 @@
>>   #include <pci-cfg.h>
>>   #include <timebase.h>
>>   #include <device.h>
>> +#include <errorlog.h>
>>   #include <fsp.h>
>>
>>   #define MAX_PHB_ID    256
>> @@ -47,6 +48,9 @@ int last_phb_id = 0;
>>             ((_bdfn) >> 8) & 0xff,            \
>>             ((_bdfn) >> 3) & 0x1f, (_bdfn) & 0x7, ## a)
>>
>> +DEFINE_LOG_ENTRY(OPAL_RC_PCI_SLOT, OPAL_PLATFORM_ERR_EVT, OPAL_PCI,
>> +        OPAL_MISC_SUBSYSTEM,OPAL_PREDICTIVE_ERR_GENERAL,
>> +        OPAL_NA);
>
> Is this critical enough to be logged into BMC?
> For logging into BMC the severity should be greater than
> 'OPAL_PREDICTIVE_ERR_FAULT_RECTIFY_REBOOT'.
>>   /*
>>    * Generic PCI utilities
>>    */
>> @@ -1510,6 +1514,28 @@ static void pci_do_jobs(void (*fn)(void *))
>>       free(jobs);
>>   }
>>
>> +static void scan_present_device(struct phb *phb)
>> +{
>> +    int64_t rc;
>> +    struct pci_device *pd;
>> +
>> +    /*
>> +    for PCI/PCI-X, we get the slot info and check
>> +    if the PHB has anything connected to it
>> +    */
>> +    while ((pd = list_pop(&phb->devices, struct pci_device, link)) != NULL) {
>> +        if (platform.pci_get_slot_info)
>> +            platform.pci_get_slot_info(phb, pd);
>> +
>> +        rc = phb->ops->presence_detect(phb);
>> +        if (rc != OPAL_SHPC_DEV_PRESENT)
>> +            log_simple_error(&e_info(OPAL_RC_PCI_SLOT), "%s "
>> +            "PCI: No device found\n", pd->slot_info->label);
>> +        else
>> +            prlog(PR_NOTICE, "Device Found in SLOT= %s\n", pd->slot_info->label);
>> +    }
>> +}
>> +
>>   void pci_init_slots(void)
>>   {
>>       unsigned int i;
>> @@ -1538,6 +1564,13 @@ void pci_init_slots(void)
>>             phbs[i]->ops->phb_final_fixup(phbs[i]);
>>       }
>> +
>> +    prlog(PR_NOTICE, "PCI: Check for a present device...\n");
>> +    for (i = 0; i < ARRAY_SIZE(phbs); i++) {
>> +        if (!phbs[i])
>> +            continue;
>> +        scan_present_device(phbs[i]);
>> +    }
>>   }
>>
>>   /*
>> diff --git a/include/errorlog.h b/include/errorlog.h
>> index b8fca7d..5b754f7 100644
>> --- a/include/errorlog.h
>> +++ b/include/errorlog.h
>> @@ -280,7 +280,8 @@ enum opal_reasoncode {
>>       OPAL_RC_PCI_INIT_SLOT   = OPAL_PC | 0x10,
>>       OPAL_RC_PCI_ADD_SLOT    = OPAL_PC | 0x11,
>>       OPAL_RC_PCI_SCAN        = OPAL_PC | 0x12,
>> -    OPAL_RC_PCI_RESET_PHB   = OPAL_PC | 0x10,
>> +    OPAL_RC_PCI_RESET_PHB   = OPAL_PC | 0x13,
>> +    OPAL_RC_PCI_SLOT    = OPAL_PC | 0x14,
>
> Can't we use 'OPAL_RC_PCI_INIT_SLOT' here?

In this patch we are not initializing the slot, so the above reason code
does not fit. We are checking whether devices are detected in the available
slots.
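
To illustrate the distinction, here is a minimal standalone sketch of the
kind of check intended: walk a table of expected slots and report each one
that has no device. It is not skiboot code. The struct slot_entry type and
the count_missing_devices() helper are hypothetical names made up for this
example, and the table is only read, never modified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a platform slot table entry (not a skiboot type). */
struct slot_entry {
	const char *label;	/* slot name, e.g. "Slot3"; may be NULL */
	bool present;		/* result of a presence check for this slot */
};

/*
 * Walk the table without modifying it and report every slot that has
 * no device. Returns the number of empty slots.
 */
static size_t count_missing_devices(const struct slot_entry *slots, size_t n)
{
	size_t missing = 0;
	size_t i;

	/* A NULL table is only acceptable when it is also empty. */
	assert(slots != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* Guard against entries that carry no label. */
		const char *label = slots[i].label ? slots[i].label : "(unlabelled)";

		if (!slots[i].present) {
			fprintf(stderr, "%s PCI: No device found\n", label);
			missing++;
		} else {
			printf("Device Found in SLOT= %s\n", label);
		}
	}
	return missing;
}
```

A caller would fill the table from the platform definition plus the
presence results and treat a non-zero return value as the condition to
log. Which reason code and severity to use for that log is the open
question in this thread.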

>
> Cheers,
> -Mukesh
>
>>   /* ATTN */
>>       OPAL_RC_ATTN        = OPAL_AT | 0x10,
>>   /* MEM_ERR */
>>
>> _______________________________________________
>> Skiboot mailing list
>> Skiboot at lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/skiboot
>


