[PATCH] powerpc/npu-dma.c: Fix deadlock in mmio_invalidate

Alistair Popple alistair at popple.id.au
Wed Feb 14 14:23:56 AEDT 2018


> > +struct mmio_atsd_reg {
> > +	struct npu *npu;
> > +	int reg;
> > +};
> > +
> 
> Is it just easier to move reg to inside of struct npu?

I don't think so. struct npu is global to all NPU contexts, whereas this is
specific to the given invalidation. We don't have enough registers to assign
each NPU context its own dedicated register, so I'm not sure it makes sense to
put it there either.
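
To illustrate the split: the register pool stays in the shared struct npu and
each invalidation just carries a small array recording which register it
claimed on which NPU. Here is a rough user-space model of that pairing (the
pool size, bitmap and helper bodies below are my own stand-ins, not the kernel
code):

/*
 * User-space model only: the register bitmap lives in the shared "npu"
 * object, while each invalidation carries its own array of {npu, reg}
 * pairs for the registers it is currently using.
 */
#include <stdatomic.h>
#include <stdio.h>

#define NV_MAX_NPUS	4
#define MMIO_ATSD_REGS	8	/* assumed pool size per NPU */

struct npu {
	atomic_ulong mmio_atsd_usage;	/* one bit per busy register */
};

struct mmio_atsd_reg {
	struct npu *npu;
	int reg;
};

/* Claim a free register on this NPU, or return -1 if all are busy. */
static int get_mmio_atsd_reg(struct npu *npu)
{
	int i;

	for (i = 0; i < MMIO_ATSD_REGS; i++) {
		unsigned long mask = 1UL << i;

		if (!(atomic_fetch_or(&npu->mmio_atsd_usage, mask) & mask))
			return i;
	}
	return -1;
}

static void put_mmio_atsd_reg(struct npu *npu, int reg)
{
	atomic_fetch_and(&npu->mmio_atsd_usage, ~(1UL << reg));
}

int main(void)
{
	static struct npu npus[NV_MAX_NPUS];
	struct mmio_atsd_reg regs[NV_MAX_NPUS];
	int i;

	/* One invalidation: claim a register on every NPU... */
	for (i = 0; i < NV_MAX_NPUS; i++) {
		regs[i].npu = &npus[i];
		regs[i].reg = get_mmio_atsd_reg(&npus[i]);
		printf("npu %d -> reg %d\n", i, regs[i].reg);
	}

	/* ...launch/wait for the flushes here, then release them all. */
	for (i = 0; i < NV_MAX_NPUS; i++)
		put_mmio_atsd_reg(regs[i].npu, regs[i].reg);

	return 0;
}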

> > +static void acquire_atsd_reg(struct npu_context *npu_context,
> > +			struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS])
> > +{
> > +	int i, j;
> > +	struct npu *npu;
> > +	struct pci_dev *npdev;
> > +	struct pnv_phb *nphb;
> >  
> > -		/*
> > -		 * The GPU requires two flush ATSDs to ensure all entries have
> > -		 * been flushed. We use PID 0 as it will never be used for a
> > -		 * process on the GPU.
> > -		 */
> > -		if (flush)
> > -			mmio_invalidate_pid(npu, 0, true);
> > +	for (i = 0; i <= max_npu2_index; i++) {
> > +		mmio_atsd_reg[i].reg = -1;
> > +		for (j = 0; j < NV_MAX_LINKS; j++) {
> 
> Is it safe to assume that npu_context->npdev will not change in this
> loop? I guess it would need to be stronger than just this loop.

It is not safe to assume that npu_context->npdev won't change during this loop.
However, I don't think it is a problem if it does, as we only read each element
once during the invalidation.

There are two possibilities for how this could change. pnv_npu2_init_context()
will add an nvlink to npdev, which will result in the TLB invalidation also
being sent to that GPU; that should not be a problem.

pnv_npu2_destroy_context() will remove the nvlink from npdev. If that happens
prior to this loop it should not be a problem, as the destruction will have
already invalidated the GPU TLB. If it happens after this loop it shouldn't be
a problem either; it will just result in an extra TLB invalidate being sent to
this GPU.
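
To sketch what I mean by reading each element once, here is a user-space model
(the atomic slot array and pci_dev below are stand-ins for the real structures,
not the driver code): the invalidation loads each slot exactly once, so a
concurrent init/destroy just means that particular GPU either gets this
invalidate or it doesn't, and both cases are covered by the argument above.

#include <stdatomic.h>
#include <stdio.h>

#define NV_MAX_NPUS	2
#define NV_MAX_LINKS	6

struct pci_dev { int id; };

/* Written by the init/destroy paths, read by the invalidation loop. */
static _Atomic(struct pci_dev *) npdev[NV_MAX_NPUS][NV_MAX_LINKS];

static void invalidate_all(void)
{
	int i, j;

	for (i = 0; i < NV_MAX_NPUS; i++) {
		for (j = 0; j < NV_MAX_LINKS; j++) {
			/* Single load per slot; never re-read later. */
			struct pci_dev *dev = atomic_load(&npdev[i][j]);

			if (!dev)
				continue;
			printf("flush TLB on device %d\n", dev->id);
		}
	}
}

int main(void)
{
	static struct pci_dev gpu = { .id = 42 };

	atomic_store(&npdev[0][0], &gpu);	/* as in pnv_npu2_init_context() */
	invalidate_all();			/* flushes device 42 */

	atomic_store(&npdev[0][0], NULL);	/* as in pnv_npu2_destroy_context() */
	invalidate_all();			/* slot read once, seen empty, skipped */

	return 0;
}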

> > +			npdev = npu_context->npdev[i][j];
> > +			if (!npdev)
> > +				continue;
> > +
> > +			nphb = pci_bus_to_host(npdev->bus)->private_data;
> > +			npu = &nphb->npu;
> > +			mmio_atsd_reg[i].npu = npu;
> > +			mmio_atsd_reg[i].reg = get_mmio_atsd_reg(npu);
> > +			while (mmio_atsd_reg[i].reg < 0) {
> > +				mmio_atsd_reg[i].reg = get_mmio_atsd_reg(npu);
> > +				cpu_relax();
> 
> A cond_resched() as well if we have too many tries?

I don't think we can, as invalidate_range() is called under the ptl spin-lock
and is not allowed to sleep (at least according to
include/linux/mmu_notifier.h).
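
To make the constraint concrete, here is a tiny user-space model of the
spin-only wait (compile with -pthread; the pause builtin stands in for
cpu_relax() and the single flag stands in for the register bitmap, so none of
this is the actual driver code):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int reg_busy = 1;	/* pretend another invalidation holds it */

static void *holder(void *arg)
{
	(void)arg;
	usleep(1000);			/* the other ATSD completes... */
	atomic_store(&reg_busy, 0);	/* ...and the register is released */
	return NULL;
}

int main(void)
{
	pthread_t t;
	unsigned long spins = 0;

	pthread_create(&t, NULL, holder, NULL);

	/* Equivalent of: while (get_mmio_atsd_reg(npu) < 0) cpu_relax(); */
	while (atomic_exchange(&reg_busy, 1) == 1) {
		spins++;
#if defined(__x86_64__) || defined(__i386__)
		__builtin_ia32_pause();	/* CPU hint only, never a sleep */
#endif
	}

	printf("claimed the register after %lu spins\n", spins);
	pthread_join(t, NULL);
	return 0;
}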

- Alistair

> Balbir
> 
