[PATCH kernel] powerpc/powernv/eeh/npu: Fix uninitialized variables in opal_pci_eeh_freeze_status

Michael Ellerman mpe at ellerman.id.au
Tue Nov 20 13:51:06 AEDT 2018


Alexey Kardashevskiy <aik at ozlabs.ru> writes:

> The current implementation of the OPAL_PCI_EEH_FREEZE_STATUS call in
> skiboot's NPU driver does not touch the pci_error_type parameter, so
> it might contain garbage, but the powernv code analyzes it nevertheless.
>
> This initializes pcierr and fstate to zero in all call sites.
>
> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
> ---

Can we tag this with a Fixes: tag? It also seems like it should probably
go to stable, or can this path not be triggered on older kernels?

cheers

> Without this, this happens:
>
> pnv_eeh_get_phb_diag: Failure -7 getting PHB#6 diag-data
> EEH: PHB#6 failure detected, location: N/A
> CPU: 23 PID: 5939 Comm: qemu-system-ppc Not tainted 4.19.0-le_f5a7bb7_aikATfstn1-p1 #106
> Call Trace:
> [c000003fea9df9c0] [c000000000a990ec] dump_stack+0xb0/0xf4 (unreliable)
> [c000003fea9dfa00] [c000000000038480] eeh_dev_check_failure+0x1f0/0x5f0
> [c000003fea9dfaa0] [c0000000000a2768] pnv_pci_read_config+0x128/0x160
> [c000003fea9dfae0] [c0000000005d2b0c] pci_bus_read_config_dword+0x9c/0xf0
> [c000003fea9dfb40] [c0000000005df3d4] pci_save_state+0x64/0x250
> [c000003fea9dfbc0] [c0000000005e0730] pci_dev_save_and_disable+0x70/0xa0
> [c000003fea9dfbf0] [c0000000005e4078] pci_try_reset_function+0x48/0xc0
> [c000003fea9dfc20] [c00800001cbc1b1c] vfio_pci_ioctl+0x334/0xea0 [vfio_pci]
> [c000003fea9dfcf0] [c00800001ca9046c] vfio_device_fops_unl_ioctl+0x44/0x70 [vfio]
> [c000003fea9dfd10] [c00000000039fd84] do_vfs_ioctl+0xd4/0xa00
> [c000003fea9dfdb0] [c0000000003a07b4] ksys_ioctl+0x104/0x120
> [c000003fea9dfe00] [c0000000003a07f8] sys_ioctl+0x28/0x80
> [c000003fea9dfe20] [c00000000000b3a4] system_call+0x5c/0x70
> EEH: Detected error on PHB#6
> EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5 failures.
> EEH: Notify device drivers to shutdown
> EEH: Beginning: 'error_detected(IO frozen)'
> EEH: PE#d (PCI 0006:00:00.0): not actionable (1,1,0)
> EEH: PE#d (PCI 0006:00:00.1): not actionable (1,1,0)
> EEH: PE#c (PCI 0006:00:01.0): Invoking vfio-pci->error_detected(IO frozen)
> EEH: PE#c (PCI 0006:00:01.0): vfio-pci driver reports: 'can recover'
> EEH: PE#c (PCI 0006:00:01.1): Invoking vfio-pci->error_detected(IO frozen)
> EEH: PE#c (PCI 0006:00:01.1): vfio-pci driver reports: 'can recover'
> EEH: PE#b (PCI 0006:00:02.0): Invoking vfio-pci->error_detected(IO frozen)
> EEH: PE#b (PCI 0006:00:02.0): vfio-pci driver reports: 'can recover'
> EEH: PE#b (PCI 0006:00:02.1): Invoking vfio-pci->error_detected(IO frozen)
> EEH: PE#b (PCI 0006:00:02.1): vfio-pci driver reports: 'can recover'
> EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'can recover'
> EEH: Collect temporary log
> pnv_pci_dump_phb_diag_data: Unrecognized ioType 0
> EEH: Reset without hotplug activity
> iommu: Removing device 0006:00:01.0 from group 4
> iommu: Removing device 0006:00:01.1 from group 4
> iommu: Removing device 0006:00:02.0 from group 4
> iommu: Removing device 0006:00:02.1 from group 4
> pnv_ioda_freeze_pe: Failure -7 freezing PHB#6-PE#0
> pnv_eeh_restore_config: Can't reinit PCI dev 0x0 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x1 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x8 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x9 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x10 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x11 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x0 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x1 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x8 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x9 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x10 (-7)
> pnv_eeh_restore_config: Can't reinit PCI dev 0x11 (-7)
> EEH: Sleep 5s ahead of partial hotplug
> pci 0004:04     : [PE# 00] Setting up window#0 0..3fffffff pg=1000
> pci 0004:05     : [PE# 18] Setting up window#0 0..3fffffff pg=1000
> pci 0004:06     : [PE# 30] Setting up window#0 0..3fffffff pg=1000
> pci 0006:00:00.0: [PE# 0d] Setting up window 0..3fffffff pg=1000
> pci 0006:00:01.0: [PE# 0c] Setting up window 0..3fffffff pg=1000
> pci 0006:00:02.0: [PE# 0b] Setting up window 0..3fffffff pg=1000
> EEH: Beginning: 'slot_reset'
> EEH: PE#d (PCI 0006:00:00.0): not actionable (1,1,0)
> EEH: PE#d (PCI 0006:00:00.1): not actionable (1,1,0)
> EEH: Finished:'slot_reset' with aggregate recovery state:'none'
> EEH: Notify device driver to resume
> EEH: Beginning: 'resume'
> EEH: PE#d (PCI 0006:00:00.0): not actionable (1,1,0)
> EEH: PE#d (PCI 0006:00:00.1): not actionable (1,1,0)
> EEH: Finished:'resume'
> EEH: Recovery successful.
> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 8 ++++----
>  arch/powerpc/platforms/powernv/pci-ioda.c    | 4 ++--
>  arch/powerpc/platforms/powernv/pci.c         | 4 ++--
>  3 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index abc0be7..f380789 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -564,8 +564,8 @@ static void pnv_eeh_get_phb_diag(struct eeh_pe *pe)
>  static int pnv_eeh_get_phb_state(struct eeh_pe *pe)
>  {
>  	struct pnv_phb *phb = pe->phb->private_data;
> -	u8 fstate;
> -	__be16 pcierr;
> +	u8 fstate = 0;
> +	__be16 pcierr = 0;
>  	s64 rc;
>  	int result = 0;
>  
> @@ -603,8 +603,8 @@ static int pnv_eeh_get_phb_state(struct eeh_pe *pe)
>  static int pnv_eeh_get_pe_state(struct eeh_pe *pe)
>  {
>  	struct pnv_phb *phb = pe->phb->private_data;
> -	u8 fstate;
> -	__be16 pcierr;
> +	u8 fstate = 0;
> +	__be16 pcierr = 0;
>  	s64 rc;
>  	int result;
>  
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index dd80744..72b5cc0 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -604,8 +604,8 @@ static int pnv_ioda_unfreeze_pe(struct pnv_phb *phb, int pe_no, int opt)
>  static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no)
>  {
>  	struct pnv_ioda_pe *slave, *pe;
> -	u8 fstate, state;
> -	__be16 pcierr;
> +	u8 fstate = 0, state;
> +	__be16 pcierr = 0;
>  	s64 rc;
>  
>  	/* Sanity check on PE number */
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 13aef23..db230a35 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -602,8 +602,8 @@ static void pnv_pci_handle_eeh_config(struct pnv_phb *phb, u32 pe_no)
>  static void pnv_pci_config_check_eeh(struct pci_dn *pdn)
>  {
>  	struct pnv_phb *phb = pdn->phb->private_data;
> -	u8	fstate;
> -	__be16	pcierr;
> +	u8	fstate = 0;
> +	__be16	pcierr = 0;
>  	unsigned int pe_no;
>  	s64	rc;
>  
> -- 
> 2.17.1
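
For anyone following along, the failure mode described in the commit
message can be sketched in plain userspace C. This is only an
illustration under stated assumptions, not the kernel or skiboot code:
fake_freeze_status() and caller_sees_error() are hypothetical names,
uint16_t stands in for __be16, and the "garbage" is passed in explicitly
so the example stays deterministic instead of reading an uninitialized
variable.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a firmware call that reports success but
 * never writes its out-parameters, modelling the behaviour the commit
 * message attributes to the NPU implementation.
 */
static int64_t fake_freeze_status(uint8_t *fstate, uint16_t *pcierr)
{
	(void)fstate;
	(void)pcierr;
	return 0; /* success, outputs left untouched */
}

/*
 * Models a call site. The *_init arguments represent whatever the
 * locals held before the call: stale stack contents without the patch,
 * zero with it. Returns 1 if the caller would see an error state.
 */
int caller_sees_error(uint8_t fstate_init, uint16_t pcierr_init)
{
	uint8_t fstate = fstate_init;
	uint16_t pcierr = pcierr_init;

	if (fake_freeze_status(&fstate, &pcierr) != 0)
		return 1;

	/* The caller trusts both outputs once the call succeeds. */
	return fstate != 0 || pcierr != 0;
}
```

With stale values such as caller_sees_error(0xab, 0xcdef) the function
returns 1, a spurious error; with caller_sees_error(0, 0), which is the
situation after the patch, it returns 0.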

