[Skiboot] [PATCH v2 1/2] astbmc: Add NPU slot type and use it in Garrison

Alistair Popple alistair at popple.id.au
Wed Jun 22 14:23:47 AEST 2016


On Wed, 22 Jun 2016 12:24:17 Russell Currey wrote:
> On Wed, 2016-06-22 at 12:12 +1000, Alistair Popple wrote:
> > On Tue, 21 Jun 2016 18:04:47 Russell Currey wrote:
> > > NPU links are grouped into pairs, signifying that they both correlate to
> > > the same physical GPU.  These pairs are presented in the Garrison slot
> > > table as typical PCI devices, creating what are essentially redundant
> > > entries.  Add a new slot type specifically for NPUs, and subsequently
> > > perform bdfn correlation using it.
> > > 
> > > Signed-off-by: Russell Currey <ruscur at russell.cc>
> > > ---
> > > V2: New change to facilitate bdfn allocation without using pbcq
> > > ---
> > >  platforms/astbmc/astbmc.h   |  2 ++
> > >  platforms/astbmc/garrison.c | 36 ++++++++----------------------------
> > >  platforms/astbmc/slots.c    |  7 +++++++
> > >  3 files changed, 17 insertions(+), 28 deletions(-)
> > 
> > <snip>
> >  
> > > diff --git a/platforms/astbmc/slots.c b/platforms/astbmc/slots.c
> > > index 36547e1..992f68b 100644
> > > --- a/platforms/astbmc/slots.c
> > > +++ b/platforms/astbmc/slots.c
> > > @@ -70,6 +70,13 @@ static const struct slot_table_entry 
> > *match_slot_dev_entry(struct phb *phb,
> > >  			prerror("SLOT: Bad PHB entry type in table !\n");
> > >  			continue;
> > >  		}
> > > +
> > > +		if (ent->etype == st_npu_slot) {
> > > +			/* NPU groups are at device level, so ignore
> > > function 
> > */
> > > +			if (ent->location == ((pd->bdfn & 0xff) >> 3))
> > 
> > Is this reliable? ent->location is a ST_LOC_NPU_GROUP code but what
> > guarantees 
> > that the pd->bdfn device number matches the NPU_GROUP number?
> 
> It depends.  It works for Garrison since the groups (0, 1) correlate to the
> device numbers, which are (0, 1, 8, 9 -- so 0, 0, 1, 1).  There's probably a
> more robust way to do things, though.

My concern is that this is more by dumb luck than design as doesn't it depend 
on the order in which NPUs are allocated bdfn's in npu_allocate_bdfn?

For example if the links in ST_LOC_NPU_GROUP(1) are assigned bdfn's before 
those in ST_LOC_NPU_GROUP(0) then won't the slot and therefore the GPU 
associations be reversed because those in GROUP(1) will have device number 0? 
Incorrect GPU associations will cause issues in Linux and we don't make any 
guarantees about the order in which links are probed/bdfn's allocated so we 
would need to fix that for this to work (or derive bdfn from group id).

> > Do we need something like "if (ent->location == pd->npu_group)" instead?
> 
> I'm hoping to avoid adding NPU-specific stuff to any generic PCI structs, if
> possible.

Yeah, it's generally best to avoid bloating generic structs with device 
specific fields if possible but in this case it would be justified imho 
(especially as it's only an int).

- Alistair

> > 
> > - Alistair
> > 
> > > +				return ent;
> > > +		}
> > > +
> > >  		if (ent->location == (pd->bdfn & 0xff))
> > >  			return ent;
> > >  	}
> > > 
> 



More information about the Skiboot mailing list