[Skiboot] [PATCH] OPAL:PCI should throw error on platform PCI devices not being detected

Mukesh Ojha mukesh02 at linux.vnet.ibm.com
Thu Jun 9 02:18:55 AEST 2016


Hi Mamatha,

A few observations.

On Wednesday 08 June 2016 09:00 PM, Mamatha Inamdar wrote:
> Problem Description: Some times system boots to petitboot and get into a state where
> only the PHBs were detected and *no* other PCI devices.

s/Some times/Sometimes/

>
> Fix: This patch is to check the detected PCI devices against the PCI slot table in the platform
> definition and display an error if they don't match and commit an errorlog.
>
> Test Results:
> After testing the patch, we see following traces on the SOL console.
> [8212824503,5] PCI: Check for a present device...
> [8212921065,3] Slot3 PCI: No device found
> [8212987391,5] Device Found in SLOT= Backplane PLX
> [8213085726,3] Slot4 PCI: No device found
>
> From: Mamatha Inamdar <mamatha4 at linux.vnet.ibm.com>
>
> Signed-off-by: Mamatha Inamdar <mamatha4 at linux.vnet.ibm.com>
> ---
>   core/pci.c         |   33 +++++++++++++++++++++++++++++++++
>   include/errorlog.h |    3 ++-
>   2 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/core/pci.c b/core/pci.c
> index 9b238d0..987f69d 100644
> --- a/core/pci.c
> +++ b/core/pci.c
> @@ -20,6 +20,7 @@
>   #include <pci-cfg.h>
>   #include <timebase.h>
>   #include <device.h>
> +#include <errorlog.h>
>   #include <fsp.h>
>   
>   #define MAX_PHB_ID	256
> @@ -47,6 +48,9 @@ int last_phb_id = 0;
>   	      ((_bdfn) >> 8) & 0xff,			\
>   	      ((_bdfn) >> 3) & 0x1f, (_bdfn) & 0x7, ## a)
>   
> +DEFINE_LOG_ENTRY(OPAL_RC_PCI_SLOT, OPAL_PLATFORM_ERR_EVT, OPAL_PCI,
> +		OPAL_MISC_SUBSYSTEM,OPAL_PREDICTIVE_ERR_GENERAL,
> +		OPAL_NA);

Is this critical enough to be logged to the BMC?
For an entry to be logged to the BMC, its severity should be greater than
'OPAL_PREDICTIVE_ERR_FAULT_RECTIFY_REBOOT', and this entry is defined with
'OPAL_PREDICTIVE_ERR_GENERAL'.
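
To make the concern concrete, here is a rough standalone sketch of the
kind of severity filter I have in mind. It is not skiboot code: the enum
names only mirror the severities discussed here, the numeric values are
placeholders chosen to preserve the ordering, and would_reach_bmc() is a
hypothetical helper. Whether the real threshold is strict or inclusive
should be checked against the tree; the sketch follows the "greater than"
wording above.

```c
#include <stdbool.h>

/*
 * Placeholder severities. Names echo the ones under discussion; the
 * values are illustrative assumptions, not copied from errorlog.h.
 */
enum sev {
	SEV_INFO                                = 0x00,
	SEV_PREDICTIVE_ERR_GENERAL              = 0x20,
	SEV_PREDICTIVE_ERR_FAULT_RECTIFY_REBOOT = 0x22,
	SEV_UNRECOVERABLE_ERR_GENERAL           = 0x40,
};

/*
 * Hypothetical filter: only entries whose severity is strictly above
 * the threshold are forwarded to the BMC.
 */
static bool would_reach_bmc(enum sev s)
{
	return s > SEV_PREDICTIVE_ERR_FAULT_RECTIFY_REBOOT;
}
```

Under these assumptions, would_reach_bmc(SEV_PREDICTIVE_ERR_GENERAL) is
false, so an entry defined with the severity used in this patch would be
filtered out before it reaches the BMC.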

>   /*
>    * Generic PCI utilities
>    */
> @@ -1510,6 +1514,28 @@ static void pci_do_jobs(void (*fn)(void *))
>   	free(jobs);
>   }
>   
> +static void scan_present_device(struct phb *phb)
> +{
> +	int64_t rc;
> +	struct pci_device *pd;
> +
> +	/*
> +	for PCI/PCI-X, we get the slot info and heck
> +	if the PHB has anything connected to it
> +	*/
> +	while ((pd = list_pop(&phb->devices, struct pci_device, link)) != NULL) {
> +		if (platform.pci_get_slot_info)
> +			platform.pci_get_slot_info(phb, pd);
> +
> +		rc = phb->ops->presence_detect(phb);
> +		if (rc != OPAL_SHPC_DEV_PRESENT)
> +			log_simple_error(&e_info(OPAL_RC_PCI_SLOT), "%s "
> +			"PCI: No device found\n", pd->slot_info->label);
> +		else
> +			prlog(PR_NOTICE, "Device Found in SLOT= %s\n", pd->slot_info->label);
> +	}
> +}
> +
>   void pci_init_slots(void)
>   {
>   	unsigned int i;
> @@ -1538,6 +1564,13 @@ void pci_init_slots(void)
>   
>   		phbs[i]->ops->phb_final_fixup(phbs[i]);
>   	}
> +
> +	prlog(PR_NOTICE, "PCI: Check for a present device...\n");
> +	for (i = 0; i < ARRAY_SIZE(phbs); i++) {
> +		if (!phbs[i])
> +			continue;
> +		scan_present_device(phbs[i]);
> +	}
>   }
>   
>   /*
> diff --git a/include/errorlog.h b/include/errorlog.h
> index b8fca7d..5b754f7 100644
> --- a/include/errorlog.h
> +++ b/include/errorlog.h
> @@ -280,7 +280,8 @@ enum opal_reasoncode {
>   	OPAL_RC_PCI_INIT_SLOT   = OPAL_PC | 0x10,
>   	OPAL_RC_PCI_ADD_SLOT    = OPAL_PC | 0x11,
>   	OPAL_RC_PCI_SCAN        = OPAL_PC | 0x12,
> -	OPAL_RC_PCI_RESET_PHB   = OPAL_PC | 0x10,
> +	OPAL_RC_PCI_RESET_PHB   = OPAL_PC | 0x13,
> +	OPAL_RC_PCI_SLOT	= OPAL_PC | 0x14,

Can't we reuse the existing 'OPAL_RC_PCI_INIT_SLOT' here instead of adding
a new reason code?

Cheers,
-Mukesh

>   /* ATTN */
>   	OPAL_RC_ATTN		= OPAL_AT | 0x10,
>   /* MEM_ERR */
>


