KVM P9 optimisation series

Nicholas Piggin npiggin at gmail.com
Mon May 31 15:10:47 AEST 2021


I have put my current series here

https://github.com/npiggin/linux/tree/kvm-in-c-new

It contains the existing Cify series plus about 50 patches, and it's
getting fairly stable with both L0 and L1 hypervisors. The aim of the
series is to speed up the P9 entry/exit code and also simplify things
where possible.

It does this in several main ways:

- Rearrange code to optimise SPR accesses. Mainly, avoid scoreboard
  stalls.

- Test SPR values to avoid mtSPRs where possible. mtSPRs are expensive.

- Reduce mftb. mftb is expensive.

- Demand fault certain facilities to avoid saving and/or restoring them
  (at the cost of a fault when they are used, but this is amortised over
  a number of entries, as is done for facilities when context switching
  processes). PM, TM, and EBB so far.

- Defer some sequences that are made just in case a guest is interrupted
  in the middle of a critical section to the case where the guest is
  scheduled on a different CPU, rather than every time (at the cost of
  an extra IPI in this case). Namely the tlbsync sequence for radix with
  GTSE, which is very expensive.

- Reduce barriers and atomics, and start shedding some of the vcore
  complexity to reduce path length, locking, etc.
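
As a rough illustration of the "test SPR values to avoid mtSPRs" point
above (this is not code from the series), the compare-before-write idea
can be sketched in plain userspace C. The SPR indices, sim_mtspr() and
struct spr_ctx are all made up for the example; in the kernel the write
would be an mtspr() and the values would come from the host and vcpu
state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical SPR indices, for illustration only. */
enum { SPR_A, SPR_B, SPR_C, NR_SPRS };

/* Stand-in for the hardware registers; the kernel would use mtspr(). */
static uint64_t hw_spr[NR_SPRS];
static unsigned long nr_writes;	/* counts simulated mtSPRs */

static void sim_mtspr(int spr, uint64_t val)
{
	assert(spr >= 0 && spr < NR_SPRS);
	hw_spr[spr] = val;
	nr_writes++;
}

/* Hypothetical per-context copy of the SPR values. */
struct spr_ctx {
	uint64_t val[NR_SPRS];
};

/* Switch from the 'cur' context to 'next', writing only SPRs that differ. */
static void switch_sprs(const struct spr_ctx *cur, const struct spr_ctx *next)
{
	for (int i = 0; i < NR_SPRS; i++) {
		if (next->val[i] != cur->val[i])
			sim_mtspr(i, next->val[i]);
	}
}
```

With two contexts that differ in only one register, each switch costs
one simulated write rather than three. The comparison is only worth it
when the compare is much cheaper than the write and the values often
match.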

So far this speeds up the full entry/exit cycle (measured by a guest
spinning in 'sc 1' to cause exits, with a host hack to make it exit
rather than SIGILL) by about 2x on P9 and more on a P10.

There is some more that can be done (xive optimisation, more complexity
reduction, removing another mftb) but there are not many easy gains left
here. The big thing which is not yet addressed is a lightweight exit
that does not switch all context each time. That will take a bit more
design to get working really well, so I prefer to do that over a longer
period, perhaps with the help of some realistic workloads. It's very
simple to hack something up to work fast with a few TCE or HPT hcalls
for example, but really we may prefer on balance to do something which
is slightly slower for those but works for other host interrupts like 
timers, device irqs, IPIs, partition scope page faults, etc.

I will submit this after the first Cify series is accepted into the
powerpc/kvm tree.

Thanks,
Nick

