[PATCH v6 22/29] x86/watchdog/hardlockup: Add an HPET-based hardlockup detector

Thomas Gleixner tglx at linutronix.de
Tue May 10 00:03:39 AEST 2022


On Thu, May 05 2022 at 17:00, Ricardo Neri wrote:
> +	if (is_hpet_hld_interrupt(hdata)) {
> +		/*
> +		 * Kick the timer first. If the HPET channel is periodic, it
> +		 * helps to reduce the delta between the expected TSC value and
> +		 * its actual value the next time the HPET channel fires.
> +		 */
> +		kick_timer(hdata, !(hdata->has_periodic));
> +
> +		if (cpumask_weight(hld_data->monitored_cpumask) > 1) {
> +			/*
> +			 * Since we cannot know the source of an NMI, the best
> +			 * we can do is to use a flag to indicate to all online
> +			 * CPUs that they will get an NMI and that the source of
> +			 * that NMI is the hardlockup detector. Offline CPUs
> +			 * also receive the NMI but they ignore it.
> +			 *
> +			 * Even though we are in NMI context, we have concluded
> +			 * that the NMI came from the HPET channel assigned to
> +			 * the detector, an event that is infrequent and only
> +			 * occurs in the handling CPU. There should not be races
> +			 * with other NMIs.
> +			 */
> +			cpumask_copy(hld_data->inspect_cpumask,
> +				     cpu_online_mask);
> +
> +			/* If we are here, IPI shorthands are enabled. */
> +			apic->send_IPI_allbutself(NMI_VECTOR);

So if the monitored cpumask is a subset of online CPUs, which is the
case when isolation features are enabled, then you still send NMIs to
those isolated CPUs. I'm sure the isolation folks will be enthused.
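
To make the concern concrete, here is a minimal userspace model of the two
target sets. It is a sketch, not kernel code: cpumask_model_t and the three
helpers are hypothetical names invented for illustration, and a 64-bit word
stands in for a real struct cpumask. It only shows which CPUs would receive
the NMI under an all-but-self shorthand versus a send restricted to the
monitored mask (in the kernel that would presumably mean a mask-based send
such as apic->send_IPI_mask_allbutself() instead of the shorthand, which is
an assumption about the fix, not something stated in the patch).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the set. */
typedef uint64_t cpumask_model_t;

/*
 * Model of an all-but-self shorthand: every online CPU except the sender
 * receives the NMI, whether or not the detector monitors it.
 */
static cpumask_model_t targets_allbutself(cpumask_model_t online,
					  unsigned int self)
{
	assert(self < 64);
	return online & ~(UINT64_C(1) << self);
}

/*
 * Model of a mask-based send: only CPUs that are both online and monitored
 * (again excluding the sender) receive the NMI.
 */
static cpumask_model_t targets_monitored_only(cpumask_model_t online,
					      cpumask_model_t monitored,
					      unsigned int self)
{
	assert(self < 64);
	return online & monitored & ~(UINT64_C(1) << self);
}

/* CPUs that get an NMI although the detector does not monitor them. */
static cpumask_model_t disturbed_cpus(cpumask_model_t targets,
				      cpumask_model_t monitored)
{
	return targets & ~monitored;
}
```

With CPUs 0-7 online, CPUs 0-3 monitored and CPU 0 handling the HPET NMI,
the shorthand model targets CPUs 1-7, so CPUs 4-7 (the isolated ones in this
made-up configuration) are disturbed. The mask-based model targets only
CPUs 1-3 and disturbs none.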

Thanks,

        tglx

