[Skiboot] [PATCH skiboot] npu2: Allow disabling Probe.I.MO snarfing

Michael Neuling mikey at neuling.org
Sat Apr 27 10:10:03 AEST 2019


On Fri, 2019-04-26 at 20:32 +0000, Brian J King wrote:
> Mikey,
>  
> The reason I suggested an nvram configuration option is due simply to time. We
> need a solution for the KVM case as soon as possible. We've done some
> performance measurements with snarfing disabled in bare metal and it has not
> shown there to be any significant performance delta. However, there is some
> concern that there might be workloads we've not run that might be affected. To
> mitigate that, the proposal was a chicken switch. That way if anyone running
> bare metal reported a performance regression, we could ask them to re-enable
> snarfing and we'd easily know if this was the source of the regression. Long
> term, I certainly would like to see this go away. 

I think that makes some sense. It's basically a regression test that people
should never use, but the patch says snarfing is enabled by default. Your
comments says we should run with it disabled, and enable it in the odd case.

I think the option is the wrong way around in the patch.

> As to the bare metal scenario, bare metal users should not encounter this due
> to a change made in the Nvidia driver. However, an NVidia driver change for a
> host checkstop is not a solution in a KVM environment.
>  
> Let me know if you still want Alexey to pull out the config option and we can
> just disable snarfing everywhere. We'd just be doing that sooner than we
> planned.
>  
> Thanks,
>  
> Brian
> 
> Brian King
> STSM, Linux on Power I/O Chief Architect
> Linux Technology Center
>  
>  
> > ----- Original message -----
> > From: "Michael Neuling" <mikey at neuling.org>
> > To: "Alexey Kardashevskiy" <aik at ozlabs.ru>, skiboot at lists.ozlabs.org
> > Cc: Brian J King/Rochester/IBM at IBMUS, "Jose Ricardo Ziviani" <
> > joserz at linux.ibm.com>, "Alistair Popple" <alistair at popple.id.au>, "Daniel
> > Henrique Barboza" <danielhb413 at gmail.com>, "Piotr Jaroszynski" <
> > pjaroszynski at nvidia.com>, "Leonardo Augusto Guimarães Garcia" <
> > lagarcia at br.ibm.com>, "Reza Arbab" <arbab at linux.ibm.com>
> > Subject: Re: [Skiboot] [PATCH skiboot] npu2: Allow disabling Probe.I.MO
> > snarfing
> > Date: Fri, Apr 26, 2019 2:31 AM
> >  
> > > Re: [Skiboot] [PATCH skiboot] npu2: Allow disabling Probe.I.MO snarfing
> > 
> > "I.MO" ???
> > 
> > 
> > On Wed, 2019-04-24 at 17:21 +1000, Alexey Kardashevskiy wrote:
> > > V100 GPUs are known to violate NVLink2 protocol in some cases (one is when
> > > memory was accessed by the CPU and they by GPU using so called block
> > > linear mapping) and issue double probes to NPU which can still handle this
> > > but only if CONFIG_ENABLE_SNARF_CPM is not set in the CQ_SM Misc Config
> > > register #0. If the bit is set (which is the case today), NPU issues
> > > the machine check stop.
> > >
> > > The snarfing feature is designed to detect 2 probes in flight and combine
> > > them into one.
> > >
> > > This adds a new "opal-npu2-snarf-cpm" nvram variable which controls
> > > CONFIG_ENABLE_SNARF_CPM for all NVLinks to prevent the machine check
> > > stop from happening. By default snarfing is not disabled.
> > 
> > change "not disabled" to "enabled".
> > 
> > > In order to
> > > disable it, the user has to run:
> > >
> > > sudo nvram -p ibm,skiboot --update-config opal-npu2-snarf-cpm=disable
> > 
> > I'm against adding this as an nvram option.
> > 
> > We need to make a decision as to which way to go and stick with that. Users
> > shouldn't be controlling this sort of thing unless in a very very very
> > special
> > case. In that case, they can recompile skiboot.
> > 
> > If it's a platform decision, then make it per platform for the user.
> > 
> > If it's causing a checkstop, we need to disable it.
> > 
> > If we start adding nvram options for all these types of things we are going
> > to
> > end up with a billion different options that users will never know what to
> > set
> > to.
> > 
> > > and reboot the host system.
> > >
> > > While at this, define macros for register names as well to avoid touching
> > > same lines over and over again.
> > >
> > > Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
> > > ---
> > >  include/npu2-regs.h | 14 ++++++++++++++
> > >  hw/npu2.c           | 45 ++++++++++++++++++++++++++++++++-------------
> > >  2 files changed, 46 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/include/npu2-regs.h b/include/npu2-regs.h
> > > index ba10b8eaf88d..61e8ea8615f8 100644
> > > --- a/include/npu2-regs.h
> > > +++ b/include/npu2-regs.h
> > > @@ -791,4 +791,18 @@ void npu2_scom_write(uint64_t gcid, uint64_t
> > scom_base,
> > >  #define L3_PRD_PURGE_TTYPE_MASK PPC_BIT(1) | PPC_BIT(2) |
> > > PPC_BIT(3) | PPC_BIT(4)
> > >  #define L3_FULL_PURGE 0x0
> > >  
> > > +/* Config registers for NPU2 */
> > > +#define NPU_STCK0_CS_SM0_MISC_CONFIG0 0x5011000
> > > +#define NPU_STCK0_CS_SM1_MISC_CONFIG0 0x5011030
> > > +#define NPU_STCK0_CS_SM2_MISC_CONFIG0 0x5011060
> > > +#define NPU_STCK0_CS_SM3_MISC_CONFIG0 0x5011090
> > > +#define NPU_STCK1_CS_SM0_MISC_CONFIG0 0x5011200
> > > +#define NPU_STCK1_CS_SM1_MISC_CONFIG0 0x5011230
> > > +#define NPU_STCK1_CS_SM2_MISC_CONFIG0 0x5011260
> > > +#define NPU_STCK1_CS_SM3_MISC_CONFIG0 0x5011290
> > > +#define NPU_STCK2_CS_SM0_MISC_CONFIG0 0x5011400
> > > +#define NPU_STCK2_CS_SM1_MISC_CONFIG0 0x5011430
> > > +#define NPU_STCK2_CS_SM2_MISC_CONFIG0 0x5011460
> > > +#define NPU_STCK2_CS_SM3_MISC_CONFIG0 0x5011490
> > > +
> > >  #endif /* __NPU2_REGS_H */
> > > diff --git a/hw/npu2.c b/hw/npu2.c
> > > index d532c4da3532..c7b7b071f3e0 100644
> > > --- a/hw/npu2.c
> > > +++ b/hw/npu2.c
> > > @@ -1452,7 +1452,7 @@ static void assign_mmio_bars(uint64_t gcid, uint32_t
> > > scom, uint64_t reg[2], uint
> > >  int npu2_nvlink_init_npu(struct npu2 *npu)
> > >  {
> > >   struct dt_node *np;
> > > - uint64_t reg[2], mm_win[2], val;
> > > + uint64_t reg[2], mm_win[2], val, mask;
> > >  
> > >   /* TODO: Clean this up with register names, etc. when we get
> > >   * time. This just turns NVLink mode on in each brick and should
> > > @@ -1461,18 +1461,37 @@ int npu2_nvlink_init_npu(struct npu2 *npu)
> > >   *
> > >   * Obviously if the year is now 2020 that didn't happen and you
> > >   * should fix this :-) */
> > > - xscom_write_mask(npu->chip_id, 0x5011000, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011030, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011060, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011090, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011200, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011230, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011260, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011290, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011400, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011430, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011460, PPC_BIT(58), PPC_BIT(58));
> > > - xscom_write_mask(npu->chip_id, 0x5011490, PPC_BIT(58), PPC_BIT(58));
> > > +
> > > + val = PPC_BIT(58);
> > > + mask = PPC_BIT(58); /* CONFIG_NVLINK_MODE */
> > > +
> > > + if (nvram_query_eq("opal-npu2-snarf-cpm", "disable"))
> > > + mask |= PPC_BIT(40); /* CONFIG_ENABLE_SNARF_CPM */
> > 
> > Need a big print here so we can debug this in the field.
> > 
> > 
> > 
> > > +
> > > + xscom_write_mask(npu->chip_id, NPU_STCK0_CS_SM0_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK0_CS_SM1_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK0_CS_SM2_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK0_CS_SM3_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK1_CS_SM0_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK1_CS_SM1_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK1_CS_SM2_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK1_CS_SM3_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK2_CS_SM0_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK2_CS_SM1_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK2_CS_SM2_MISC_CONFIG0,
> > > + val, mask);
> > > + xscom_write_mask(npu->chip_id, NPU_STCK2_CS_SM3_MISC_CONFIG0,
> > > + val, mask);
> > >  
> > >   xscom_write_mask(npu->chip_id, 0x50110c0, PPC_BIT(53), PPC_BIT(53));
> > >   xscom_write_mask(npu->chip_id, 0x50112c0, PPC_BIT(53), PPC_BIT(53));
> >  
> 
>  
> 



More information about the Skiboot mailing list