[PATCH 2/3] powerpc/powernv/npu: Use size-based ATSD invalidates

Mark Hairgrove mhairgrove at nvidia.com
Wed Oct 3 11:10:31 AEST 2018


Thanks for the review. Comments below.

On Tue, 2 Oct 2018, Alistair Popple wrote:

> Thanks Mark,
> 
> Looks like some worthwhile improvments to be had. I've added a couple of
> comments inline below.
> 
> > +#define PAGE_64K (64UL * 1024) +#define PAGE_2M (2UL * 1024 * 1024) +#define
> > PAGE_1G (1UL * 1024 * 1024 * 1024)
> 
> include/linux/sizes.h includes definitions for SZ_64K, SZ_2M, SZ_1G, etc. so
> unless they're redefined here for some reason I personally think it's cleaner to
> use those.

Agreed, will fix. Thanks for the pointer.


> 
> >  /*
> > - * Invalidate either a single address or an entire PID depending on
> > - * the value of va.
> > + * Invalidate a virtual address range
> >   */
> > -static void mmio_invalidate(struct npu_context *npu_context, int va,
> > -			unsigned long address, bool flush)
> > +static void mmio_invalidate(struct npu_context *npu_context,
> > +			unsigned long start, unsigned long size, bool flush)
> 
> With this optimisation every caller of mmio_invalidate() sets flush == true so
> it no longer appears to be used. We should drop it as a parameter unless you
> think there might be some reason to use it in future?
> 
> Therefore we could also drop it as a parameter to get_atsd_launch_val(),
> mmio_invalidate_pid() and mmio_invalidate_range() as well as I couldn't find any
> callers of those that set it to anything other than true.

Yeah, good catch. I'll simplify all of those.


> 
> >  	struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS];
> >  	unsigned long pid = npu_context->mm->context.id;
> > +	unsigned long atsd_start = 0;
> > +	unsigned long end = start + size - 1;
> > +	int atsd_psize = MMU_PAGE_COUNT;
> > +
> > +	/*
> > +	 * Convert the input range into one of the supported sizes. If the range
> > +	 * doesn't fit, use the next larger supported size. Invalidation latency
> > +	 * is high, so over-invalidation is preferred to issuing multiple
> > +	 * invalidates.
> > +	 */
> > +	if (size == PAGE_64K) {
> 
> We also support 4K page sizes on PPC. If I am not mistaken this means every ATSD
> would invalidate the entire GPU TLB for a the given PID on those systems. Could
> we change the above check to `if (size <= PAGE_64K)` to avoid this?

PPC supports 4K pages but the GPU ATS implementation does not. For that
reason I didn't bother handling invalidates smaller than 64K. I'll add a
comment on that.

I don't know that this requirement is enforced anywhere though. I could
add a PAGE_SIZE == 64K check to pnv_npu2_init_context if you think it
would be useful.


> 
> > +		atsd_start = start;
> 
> Which would also require:
> 
>       	    atsd_start = ALIGN_DOWN(start, PAGE_64K);
> 
> > +		atsd_psize = MMU_PAGE_64K;
> > +	} else if (ALIGN_DOWN(start, PAGE_2M) == ALIGN_DOWN(end, PAGE_2M)) {
> 
> Wouldn't this lead to under invalidation in ranges which happen to cross a 2M
> boundary? For example invalidating a 128K (ie. 2x64K pages) range with start ==
> 0x1f0000 and end == 0x210000 would result in an invalidation of the range 0x0 -
> 0x200000 incorrectly leaving 0x200000 - 0x210000 in the GPU TLB.

In this example:
start                         0x1f0000
size                          0x020000
end (start + size - 1)        0x20ffff
ALIGN_DOWN(start, PAGE_2M)    0x000000
ALIGN_DOWN(end, PAGE_2M)      0x200000

Since ALIGN_DOWN(start, PAGE_2M) != ALIGN_DOWN(end, PAGE_2M), the
condition fails and we move to the 1G clause. Then
ALIGN_DOWN(start, PAGE_1G) == ALIGN_DOWN(end, PAGE_1G) == 0, so we
invalidate the range [0, 1G).


More information about the Linuxppc-dev mailing list