[PATCH v6 28/29] x86/tsc: Restart NMI watchdog after refining tsc_khz

Nicholas Piggin npiggin at gmail.com
Tue May 10 21:16:21 AEST 2022


Excerpts from Ricardo Neri's message of May 6, 2022 10:00 am:
> The HPET hardlockup detector relies on tsc_khz to estimate the value of
> that the TSC will have when its HPET channel fires. A refined tsc_khz
> helps to estimate better the expected TSC value.
> 
> Using the early value of tsc_khz may lead to a large error in the expected
> TSC value. Restarting the NMI watchdog detector has the effect of kicking
> its HPET channel and make use of the refined tsc_khz.
> 
> When the HPET hardlockup is not in use, restarting the NMI watchdog is
> a noop.
> 
> Cc: Andi Kleen <ak at linux.intel.com>
> Cc: Stephane Eranian <eranian at google.com>
> Cc: "Ravi V. Shankar" <ravi.v.shankar at intel.com>
> Cc: iommu at lists.linux-foundation.org
> Cc: linuxppc-dev at lists.ozlabs.org
> Cc: x86 at kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon at linux.intel.com>
> ---
> Changes since v5:
>  * Introduced this patch
> 
> Changes since v4
>  * N/A
> 
> Changes since v3
>  * N/A
> 
> Changes since v2:
>  * N/A
> 
> Changes since v1:
>  * N/A
> ---
>  arch/x86/kernel/tsc.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index cafacb2e58cc..cc1843044d88 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1386,6 +1386,12 @@ static void tsc_refine_calibration_work(struct work_struct *work)
>  	/* Inform the TSC deadline clockevent devices about the recalibration */
>  	lapic_update_tsc_freq();
>  
> +	/*
> +	 * If in use, the HPET hardlockup detector relies on tsc_khz.
> +	 * Reconfigure it to make use of the refined tsc_khz.
> +	 */
> +	lockup_detector_reconfigure();

I don't know if the API is conceptually good.

You change something that the lockup detector is currently using, 
*while* the detector is running asynchronously, and then reconfigure
it. What happens in the window? If this code is only used for small
adjustments maybe it does not really matter but in principle it's
a bad API to export.

lockup_detector_reconfigure as an internal API is okay because it
reconfigures things while the watchdog is stopped [actually that
looks untrue for soft dog which uses watchdog_thresh in
is_softlockup(), but that should be fixed].

You're the arch so you're allowed to stop the watchdog and configure
it, e.g., hardlockup_detector_perf_stop() is called in arch/.

So you want to disable HPET watchdog if it was enabled, then update
wherever you're using tsc_khz, then re-enable.

Thanks,
Nick


More information about the Linuxppc-dev mailing list