[PATCH] powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C
Stewart Smith
stewart at linux.ibm.com
Wed Feb 27 13:22:27 AEDT 2019
Nicholas Piggin <npiggin at gmail.com> writes:
> The OPAL call wrapper gets interrupt disabling wrong. It disables
> interrupts just by clearing MSR[EE], which has two problems:
>
> - It doesn't call into the IRQ tracing subsystem, which means tracing
> across OPAL calls does not always notice IRQs have been disabled.
>
> - It doesn't go through the IRQ soft-mask code, which causes a minor
> bug. MSR[EE] can not be restored by saving the MSR then clearing
> MSR[EE], because a racing interrupt while soft-masked could clear
> MSR[EE] between the two steps. This can cause MSR[EE] to be
> incorrectly enabled when the OPAL call returns. Fortunately that
> should only result in another masked interrupt being taken to
> disable MSR[EE] again, but it's a bit sloppy.
>
> The existing code also saves MSR to PACA, which is not re-entrant if
> there is a nested OPAL call from different MSR contexts, which can
> happen these days with SRESET interrupts on bare metal.
>
> To fix these issues, move the tracing and IRQ handling code to C, and
> call into asm just for the low level call when everything is ready to
> go. Save the MSR on stack rather than PACA.
>
> Performance cost is kept to a minimum with a few optimisations:
>
> - The endian switch upon return is combined with the MSR restore,
> which avoids an expensive context synchronizing operation for LE
> kernels. This makes up for the additional mtmsrd to enable
> interrupts with local_irq_enable().
>
> - blr is now used to return from the opal_* functions that are called
> as C functions, to avoid link stack corruption. This requires a
> skiboot fix as well to keep the call stack balanced.
>
> A NULL call is more costly after this, (410ns->430ns on POWER9), but
> OPAL calls are generally not performance critical at this scale.
At least with this patch we can start to better measure things, and
that's a big plus. Also, When I get to start worrying about <1ms OPAL
calls, I think we can revisit optimising this path :)
--
Stewart Smith
OPAL Architect, IBM.
More information about the Linuxppc-dev
mailing list