[Linuxppc-users] xsadddp throughput on Power9

Bill Schmidt wschmidt at linux.ibm.com
Wed Mar 6 23:55:52 AEDT 2019


On 3/6/19 4:35 AM, Nicolas Koenig wrote:
> Hello world,
>
> After asking this question on another mailing list, I was redirected
> to this list. I hope someone on here will be able to help me :)
>
> While running a few benchmarks, I noticed that the following code
> (with SMT disabled) only manages about 2.25 xsadddp instr/clk
> (measured via pmc6) instead of the expected 4:
>
> loop:
>     .rept 12
>         xsadddp %vs2, %vs1, %vs1
>     .endr
>     bdnz loop
>
> From what I can gather, the bottleneck shouldn't be the history
> buffers. Since there are no long latency operations, FIN->COMP
> shouldn't take more than 12 cycles (the size of the secondary HB for
> FPSCR, the smallest relevant one). The primary HB and the issue queue
> shouldn't overflow either, since xsadddp takes 7 cycles from issue to
> finish and they can accommodate 20 and 13 entries respectively with one
> instruction only using one of each. It doesn't stall on writeback
> ports either, because there are only 4 results in any one clock and 4
> writeback ports (the decrement of the bdnz instruction is handled in
> the branch slice without involving the writeback network).
>
> Has anyone here any idea where the bottleneck might be?
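
For reference, the figures quoted above can be turned into cycles per
loop iteration and an average number of instructions in flight. This is
only a rough sketch in Python. Every constant is copied from the
message, nothing is newly measured, and the helper names are made up
for illustration. The in-flight estimate uses Little's law (rate times
latency) for the whole core. The message does not say how instructions
are split across slices, so the sketch does not compare the result
against the 20-entry and 13-entry capacities.

```python
# All constants are taken from the quoted message; none are new measurements.
INSTRS_PER_ITER = 12   # xsadddp instructions emitted by ".rept 12"
MEASURED_IPC = 2.25    # xsadddp per clock as reported via pmc6
EXPECTED_IPC = 4.0     # rate the poster expected
LATENCY = 7            # cycles from issue to finish, as stated in the message


def cycles_per_iteration(ipc):
    """Cycles one loop iteration takes at the given xsadddp rate."""
    return INSTRS_PER_ITER / ipc


def in_flight(ipc, latency=LATENCY):
    """Little's law: average instructions between issue and finish."""
    return ipc * latency


measured_cycles = cycles_per_iteration(MEASURED_IPC)
expected_cycles = cycles_per_iteration(EXPECTED_IPC)

print(f"cycles/iteration: measured {measured_cycles:.2f}, expected {expected_cycles:.2f}")
print(f"in flight: measured {in_flight(MEASURED_IPC):.2f}, expected {in_flight(EXPECTED_IPC):.2f}")
```

Put this way, the loop takes a little over five cycles per iteration
where three were expected.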

Hi Nicolas,

I'm going to pass this along to some folks to investigate, but first I need to confirm:
I believe this is a POWER9 measurement, and specifically on a Talos workstation, correct?

Thanks,
Bill

>
> Thanks in advance
>     Nicolas
