[Skiboot-stable] [PATCH v3] hw/phb4: Tune GPU direct performance on witherspoon in PCI mode
Oliver O'Halloran
oohall at gmail.com
Wed Apr 1 15:36:28 AEDT 2020
On Wed, Mar 25, 2020 at 9:41 PM Frederic Barrat <fbarrat at linux.ibm.com> wrote:
>
> Good GPU direct performance on witherspoon, with a Mellanox adapter
> on the shared slot, requires reallocating some DMA engines within
> PEC2, "stealing" some from PHB4&5 and giving extras to PHB3. This is
> currently done only when using CAPI mode, but the same is needed if
> the adapter stays in PCI mode.
>
> In preparation for upcoming versions of MOFED, which may not use CAPI
> mode, this patch reallocates dma engines even in PCI mode for a series
> of Mellanox adapters that can be used with GPU direct, on witherspoon
> and on the shared slot only.
>
> The loss of DMA engines for PHB4&5 on witherspoon has not shown
> problems in testing, nor in current deployments where CAPI mode
> is used.
>
> Here is a comparison of the bandwidth numbers seen with the PHB in PCI
> mode (no CAPI) with and without this patch. Variations on smaller
> packet sizes can be attributed to jitter and are not that meaningful.
>
> # OSU MPI-CUDA Bi-Directional Bandwidth Test v5.6.1
> # Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
> # Size      Bandwidth (MB/s)    Bandwidth (MB/s)
> #                 with patch       without patch
> 1                       1.29                1.48
> 2                       2.66                3.04
> 4                       5.34                5.93
> 8                      10.68               11.86
> 16                     21.39               23.71
> 32                     42.78               49.15
> 64                     85.43               97.67
> 128                   170.82              196.64
> 256                   385.47              383.02
> 512                   774.68              755.54
> 1024                 1535.14             1495.30
> 2048                 2599.31             2561.60
> 4096                 5192.31             5092.47
> 8192                 9930.30             9566.90
> 16384               18189.81            16803.42
> 32768               24671.48            21383.57
> 65536               28977.71            24104.50
> 131072              31110.55            25858.95
> 262144              32180.64            26470.61
> 524288              32842.23            26961.93
> 1048576             33184.87            27217.38
> 2097152             33342.67            27338.08
>
> Signed-off-by: Frederic Barrat <fbarrat at linux.ibm.com>
> Cc: skiboot-stable at lists.ozlabs.org # skiboot-op940.x
Thanks, merged as e876514b3773
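
For anyone wanting to put a number on the difference in the quoted table, here is a minimal sketch that computes the relative gain for a few of the rows. It is not part of the patch; the figures are copied from the table above, and the helper name `gain_percent` is made up for illustration.

```python
# Bandwidth figures (MB/s) copied from the quoted OSU table:
# message size in bytes -> (with patch, without patch)
results = {
    8192: (9930.30, 9566.90),
    65536: (28977.71, 24104.50),
    2097152: (33342.67, 27338.08),
}


def gain_percent(with_patch, without_patch):
    """Relative bandwidth gain of the patched run over the unpatched one, in percent."""
    return (with_patch - without_patch) / without_patch * 100.0


if __name__ == "__main__":
    for size, (w, wo) in results.items():
        print(f"{size:>8} bytes: {gain_percent(w, wo):+.1f}%")
```

On these rows the gain is small at 8 KiB and grows to roughly a fifth at the largest message sizes, which is where the extra DMA engines on PHB3 would be expected to matter.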