[Linuxppc-users] Discrepancies between Performance Simulator and Silicon

Nicolas Koenig koenigni at student.ethz.ch
Tue Jun 11 08:42:55 AEST 2019


Hello Pat,

On 10/06/2019 19:26, Pat Haugen wrote:
> On 6/8/19 6:06 PM, Nicolas Koenig wrote:
>> Hello everyone,
>>
>> while trying to solve the riddle surrounding xsadddp's throughput, I recently came across the power9 performance simulator, which is supposed to be cycle-accurate. When trying it, I noticed that there appears to be a discrepancy for the following code:
>>
>> loop:
>>    .rept 16
>>      mtvsrd %vs1, %r3
>>    .endr
>>    bdnz loop
>>
>> When executing it in the performance simulator, it yields a stable 4 mtvsrd instructions per cycle (excluding branches), while the actual silicon can only sustain 3 mtvsrd instructions per cycle (again, excluding branches). What might be the reason for this difference?
>>
> How did you determine the hardware can only sustain 3? Is the loop at least quadword aligned to eliminate any variability between the two wrt fetching behavior?

A bit more of the code (NUM_INSTR = 16, ctr = 0x8000):

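   mfspr %r5, 776        # start count (SPR 776)
.align 4                 # 2^4 = 16-byte (quadword) alignment for the loop
mtvsr_loop:
.rept NUM_INSTR
   mtvsrd %vs9, %r3      # NUM_INSTR copies of the instruction under test
.endr
   bdnz mtvsr_loop       # decrement ctr, loop while it is non-zero
.align 4
   mfspr %r6, 776        # end count (SPR 776)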

I determined the throughput of mtvsrd instructions as
(num_iterations*NUM_INSTR)/(%r6-%r5), which came out to 2.961 i/c
in the test I just ran. Increasing NUM_INSTR brings the result
closer to 3 (NUM_INSTR=32 yields 2.995 i/c).
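
As a rough sketch of that calculation (this is not the code from the
repository; the function name and the counter readings are made up for
illustration, and the readings were chosen only so that the result lands
near the figure quoted above):

```python
def throughput(num_iterations, num_instr, start_count, end_count):
    """Instructions per cycle for the unrolled loop body (branches excluded).

    start_count and end_count stand in for the two mfspr readings
    (%r5 and %r6 in the assembly above).
    """
    cycles = end_count - start_count
    if cycles <= 0:
        raise ValueError("end_count must be greater than start_count")
    return (num_iterations * num_instr) / cycles


# Parameters taken from the post: ctr = 0x8000, NUM_INSTR = 16.
NUM_ITERATIONS = 0x8000
NUM_INSTR = 16

# Hypothetical counter readings, NOT measured values.
start, end = 1_000, 1_000 + 177_064

print(f"{throughput(NUM_ITERATIONS, NUM_INSTR, start, end):.3f} i/c")
```

Only the mtvsrd instructions are counted in the numerator, so the bdnz
at the end of each iteration shows up purely as extra cycles in the
denominator.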

The whole test code is at 
https://github.com/Dichloromethane/pwr9/blob/master/bench/maxthroughput/

The processor is a P9 Sforza D2.2 (pvr 004e 1202) and the performance
simulator says "Version: p9 v1662 built on Fri Jan 19 08:15:59 201", if
that is of any help.

Thank you for looking into this
   Nicolas

> 
>> Thanks in advance
>>    Nicolas
>>
>> P.S.: It also seems like scrollpv can't disassemble the mtvsrd instruction, it just shows ?????? and the instruction in hex (it is the right instruction though, I double checked).
> 
> Sounds like an old version or missing flag for whatever scrollpv uses for disassembling.
> 
> -Pat
> 

