[PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range
Alistair Popple
alistair at popple.id.au
Fri Apr 20 13:51:58 AEST 2018
Sorry, forgot to include:
Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2")
Thanks
On Tuesday, 17 April 2018 7:11:28 PM AEST Alistair Popple wrote:
> The NPU has a limited number of address translation shootdown (ATSD)
> registers and the GPU has limited bandwidth to process ATSDs. This can
> result in contention of ATSD registers leading to soft lockups on some
> threads, particularly when invalidating a large address range in
> pnv_npu2_mn_invalidate_range().
>
> At some threshold it becomes more efficient to flush the entire GPU TLB for
> the given MM context (PID) than individually flushing each address in the
> range. This patch will result in ranges greater than 2MB being converted
> from 32+ ATSDs into a single ATSD which will flush the TLB for the given
> PID on each GPU.
>
> Signed-off-by: Alistair Popple <alistair at popple.id.au>
> ---
> arch/powerpc/platforms/powernv/npu-dma.c | 23 +++++++++++++++++++----
> 1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index 94801d8e7894..dc34662e9df9 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -40,6 +40,13 @@
> DEFINE_SPINLOCK(npu_context_lock);
>
> /*
> + * When an address shootdown range exceeds this threshold we invalidate the
> + * entire TLB on the GPU for the given PID rather than each specific address in
> + * the range.
> + */
> +#define ATSD_THRESHOLD (2*1024*1024)
> +
> +/*
> * Other types of TCE cache invalidation are not functional in the
> * hardware.
> */
> @@ -675,11 +682,19 @@ static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn,
> struct npu_context *npu_context = mn_to_npu_context(mn);
> unsigned long address;
>
> - for (address = start; address < end; address += PAGE_SIZE)
> - mmio_invalidate(npu_context, 1, address, false);
> + if (end - start > ATSD_THRESHOLD) {
> + /*
> + * Just invalidate the entire PID if the address range is too
> + * large.
> + */
> + mmio_invalidate(npu_context, 0, 0, true);
> + } else {
> + for (address = start; address < end; address += PAGE_SIZE)
> + mmio_invalidate(npu_context, 1, address, false);
>
> - /* Do the flush only on the final addess == end */
> - mmio_invalidate(npu_context, 1, address, true);
> + /* Do the flush only on the final address == end */
> + mmio_invalidate(npu_context, 1, address, true);
> + }
> }
>
> static const struct mmu_notifier_ops nv_nmmu_notifier_ops = {
>
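For readers who want to see where the "32+ ATSDs" figure in the commit message comes from, here is a minimal userspace sketch of the patched control flow. It is not the kernel code: mock_mmio_invalidate() is a hypothetical stand-in that only counts calls and drops the npu_context argument, and SKETCH_PAGE_SIZE assumes 64K pages, which the patch itself does not state. Only ATSD_THRESHOLD and the branch structure are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 64K pages. The patch itself only refers to PAGE_SIZE. */
#define SKETCH_PAGE_SIZE (64UL * 1024)
/* Threshold value taken from the patch. */
#define ATSD_THRESHOLD   (2UL * 1024 * 1024)

/* Number of ATSDs issued by the most recent sketch call. */
static unsigned long atsd_count;

/*
 * Hypothetical stand-in for mmio_invalidate(): it keeps the argument
 * order used in the patch (minus npu_context) but only counts calls.
 */
static void mock_mmio_invalidate(int va, unsigned long address, bool flush)
{
	(void)va;
	(void)address;
	(void)flush;
	atsd_count++;
}

/* Mirrors the patched branch structure; returns the ATSDs issued. */
static unsigned long invalidate_range_sketch(unsigned long start,
					     unsigned long end)
{
	unsigned long address;

	assert(end >= start);
	atsd_count = 0;

	if (end - start > ATSD_THRESHOLD) {
		/* Range too large: one ATSD flushes the whole PID. */
		mock_mmio_invalidate(0, 0, true);
	} else {
		/* One ATSD per page, then a final one that also flushes. */
		for (address = start; address < end; address += SKETCH_PAGE_SIZE)
			mock_mmio_invalidate(1, address, false);
		mock_mmio_invalidate(1, address, true);
	}

	return atsd_count;
}
```

Under these assumptions a range of exactly 2MB still takes the per-page path and issues 32 per-page ATSDs plus the final flushing one, while anything larger collapses to a single PID-wide ATSD.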