[Skiboot] [PATCH V2] PCI: pci should throw error on platform PCI devices not being detected
Gavin Shan
gwshan at linux.vnet.ibm.com
Sat Aug 6 17:11:01 AEST 2016
On Fri, Aug 05, 2016 at 02:48:02PM +0530, Mamatha Inamdar wrote:
>Problem Description: Sometimes system boots to petitboot and gets into a state where
>only the PHBs are detected and *no* other PCI devices.
>
We have a Root Complex (RC) behind the PHB. It should be detected
successfully at least, meaning the changelog isn't precise enough.
Could you please correct it?
>Fix: This patch is to look up the platform slot table and to check whether devices are detected,
>if devices are not detected then commit an error log.
>
Each line in the chagelog cannot exceed 75 characters in Linux kernel
patches. I assume skiboot patch is following it as well. Also, the
statement isn't correct enough: I think you just need to grab the PHB's
slot and call into its ops.get_presence_state() to check if there is an
adapter plugged behind RC. Alternatively, you can check if any PCI devices
attached to RC's children list (struct pci_device::children). In this
way, you needn't call into ops.get_presence_state() and some CPU cycles
are saved.
>Signed-off-by: Mamatha Inamdar <mamatha4 at linux.vnet.ibm.com>
>---
> 0 files changed
>
>diff --git a/core/pci.c b/core/pci.c
>index cbaea35..224800e 100644
>--- a/core/pci.c
>+++ b/core/pci.c
>@@ -21,6 +21,7 @@
> #include <pci-slot.h>
> #include <timebase.h>
> #include <device.h>
>+#include <errorlog.h>
> #include <fsp.h>
>
> #define MAX_PHB_ID 256
>@@ -48,6 +49,9 @@ int last_phb_id = 0;
> ((_bdfn) >> 8) & 0xff, \
> ((_bdfn) >> 3) & 0x1f, (_bdfn) & 0x7, ## a)
>
>+DEFINE_LOG_ENTRY(OPAL_RC_PCI_DETECT_DEV, OPAL_PLATFORM_ERR_EVT, OPAL_PCI,
>+ OPAL_MISC_SUBSYSTEM,OPAL_PREDICTIVE_ERR_GENERAL,
>+ OPAL_NA);
> /*
> * Generic PCI utilities
> */
>@@ -1510,6 +1514,26 @@ static void pci_do_jobs(void (*fn)(void *))
> free(jobs);
> }
>
>+static void scan_present_device(struct phb *phb)
>+{
>+ uint8_t presence = 1;
>+ struct pci_device *pd = NULL;
>+
>+ /*
>+ * For PCI/PCI-X, we get the slot info and we also
>+ * check if the PHB has anything connected to it
>+ */
>+ if (platform.pci_get_slot_info)
>+ platform.pci_get_slot_info(phb, pd);
It needn't call into platform.pci_get_slot_info() to get the PHB slot
as it should have been created and attached to the PHB.
>+ if(phb->slot->ops.get_presence_state)
>+ phb->slot->ops.get_presence_state(phb->slot, &presence);
Please use the alternative way as I suggested if you think it's reasonable.
>+ if (!presence)
>+ log_simple_error(&e_info(OPAL_RC_PCI_DETECT_DEV), "%s "
>+ "Slot empty\n", (char *)(phb->slot->data));
>+ else
>+ prlog(PR_NOTICE, "Device Found in SLOT= %s\n", (char *)(phb->slot->data));
What information is expected to be conveyed by log_simple_error()?
phb->slot->data can be NULL. Also, I don't think the prlog() is
needed because we can check the situation (empty slot) from other
logs.
>+}
>+
> void pci_init_slots(void)
> {
> unsigned int i;
>@@ -1540,6 +1564,12 @@ void pci_init_slots(void)
>
> phbs[i]->ops->phb_final_fixup(phbs[i]);
> }
>+
>+ for (i = 0; i < ARRAY_SIZE(phbs); i++) {
>+ if (!phbs[i])
>+ continue;
>+ scan_present_device(phbs[i]);
>+ }
Plus this iteration, there will have 3 iterations. I don't think
the previous two can be merged, but this newly added one can be
merged to the last one.
> }
>
> /*
>diff --git a/include/errorlog.h b/include/errorlog.h
>index f89eac9..0325546 100644
>--- a/include/errorlog.h
>+++ b/include/errorlog.h
>@@ -281,7 +281,8 @@ enum opal_reasoncode {
> OPAL_RC_PCI_INIT_SLOT = OPAL_SRC_COMPONENT_PCI | 0x10,
> OPAL_RC_PCI_ADD_SLOT = OPAL_SRC_COMPONENT_PCI | 0x11,
> OPAL_RC_PCI_SCAN = OPAL_SRC_COMPONENT_PCI | 0x12,
>- OPAL_RC_PCI_RESET_PHB = OPAL_SRC_COMPONENT_PCI | 0x10,
>+ OPAL_RC_PCI_RESET_PHB = OPAL_SRC_COMPONENT_PCI | 0x13,
>+ OPAL_RC_PCI_DETECT_DEV = OPAL_SRC_COMPONENT_PCI | 0x14,
> /* ATTN */
> OPAL_RC_ATTN = OPAL_SRC_COMPONENT_ATTN | 0x10,
> /* MEM_ERR */
>
Thanks,
Gavin
More information about the Skiboot
mailing list