[PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

Alistair Popple alistair at popple.id.au
Thu Oct 18 12:05:52 AEDT 2018


Hi Alexey,

> > wouldn't you also need to do that somewhere? Unless the driver
> > does it at startup?
> 
> VFIO performs a GPU reset, so I'd expect the GPUs to flush their caches
> without any software interaction. Am I hoping for too much here?

Sadly you are. It's not the GPU caches that need flushing, it's the CPU caches. 
This needs to happen as part of the reset sequence, so I guess you would need 
to add it to the VFIO driver.

- Alistair

> >>>>>>> On Monday, 15 October 2018 6:17:51 PM AEDT Alexey Kardashevskiy wrote:
> >>>>>>>> Ping?
> >>>>>>>> 
> >>>>>>>> On 02/10/2018 13:20, Alexey Kardashevskiy wrote:
> >>>>>>>>> The skiboot firmware has a hot reset handler which fences the
> >>>>>>>>> NVIDIA V100 GPU RAM on Witherspoons and makes accesses no-op
> >>>>>>>>> instead of throwing HMIs:
> >>>>>>>>> https://github.com/open-power/skiboot/commit/fca2b2b839a67
> >>>>>>>>> 
> >>>>>>>>> Now we are going to pass the V100 through via VFIO, which most
> >>>>>>>>> certainly involves KVM guests. These are often terminated without
> >>>>>>>>> getting a chance to offline GPU RAM, so we end up with a running
> >>>>>>>>> machine with misconfigured memory. Accessing this memory produces
> >>>>>>>>> hardware management interrupts (HMIs), which bring the host down.
> >>>>>>>>> 
> >>>>>>>>> To suppress HMIs, this wires up the hot reset hook to
> >>>>>>>>> vfio_pci_disable() via pci_disable_device(), which switches the
> >>>>>>>>> NPU2 to a safe mode and prevents HMIs.
> >>>>>>>>> 
> >>>>>>>>> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
> >>>>>>>>> ---
> >>>>>>>>> Changes:
> >>>>>>>>> v2:
> >>>>>>>>> * updated the commit log
> >>>>>>>>> ---
> >>>>>>>>> 
> >>>>>>>>>  arch/powerpc/platforms/powernv/pci-ioda.c | 10 ++++++++++
> >>>>>>>>>  1 file changed, 10 insertions(+)
> >>>>>>>>> 
> >>>>>>>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> >>>>>>>>> index cde7102..e37b9cc 100644
> >>>>>>>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> >>>>>>>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> >>>>>>>>> @@ -3688,6 +3688,15 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
> >>>>>>>>>  		pnv_ioda_release_pe(pe);
> >>>>>>>>>  }
> >>>>>>>>>  
> >>>>>>>>> +static void pnv_npu_disable_device(struct pci_dev *pdev)
> >>>>>>>>> +{
> >>>>>>>>> +	struct eeh_dev *edev = pci_dev_to_eeh_dev(pdev);
> >>>>>>>>> +	struct eeh_pe *eehpe = edev ? edev->pe : NULL;
> >>>>>>>>> +
> >>>>>>>>> +	if (eehpe && eeh_ops && eeh_ops->reset)
> >>>>>>>>> +		eeh_ops->reset(eehpe, EEH_RESET_HOT);
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>>  static void pnv_pci_ioda_shutdown(struct pci_controller *hose)
> >>>>>>>>>  {
> >>>>>>>>>  	struct pnv_phb *phb = hose->private_data;
> >>>>>>>>>  
> >>>>>>>>> @@ -3732,6 +3741,7 @@ static const struct pci_controller_ops pnv_npu_ioda_controller_ops = {
> >>>>>>>>>  	.reset_secondary_bus	= pnv_pci_reset_secondary_bus,
> >>>>>>>>>  	.dma_set_mask		= pnv_npu_dma_set_mask,
> >>>>>>>>>  	.shutdown		= pnv_pci_ioda_shutdown,
> >>>>>>>>> +	.disable_device		= pnv_npu_disable_device,
> >>>>>>>>>  };
> >>>>>>>>>  
> >>>>>>>>>  static const struct pci_controller_ops pnv_npu_ocapi_ioda_controller_ops = {



