[PATCH] powerpc: irq work racing with timer interrupt can result in timer interrupt hang

Preeti U Murthy preeti at linux.vnet.ibm.com
Sun May 11 01:36:31 EST 2014


On 05/10/2014 09:56 AM, Benjamin Herrenschmidt wrote:
> On Fri, 2014-05-09 at 15:22 +0530, Preeti U Murthy wrote:
>> in __timer_interrupt() outside the _else_ loop? This will ensure that no
>> matter what, before exiting timer interrupt handler we check for pending
>> irq work.
> 
> We still need to make sure that set_next_event() doesn't move the
> dec beyond the next tick if there is a pending timer... maybe we

Sorry, but didn't get this. s/if there is pending timer/if there is
pending irq work ?

> can fix it like this:

We can call set_next_event() from events like hrtimer_cancel() or
hrtimer_forward() as well. In that case we don't come to
decrementer_set_next_event() from __timer_interrupt(). Then, if we race
with irq work, we *do not do* a set_dec(1) ( I am referring to the patch
below ), we might never set the decrementer to fire immediately right?

Or does this scenario never arise?

Regards
Preeti U Murthy
> 
> static int decrementer_set_next_event(unsigned long evt,
> 				      struct clock_event_device *dev)
> {
> 	__get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
> 
> 	/* Don't adjust the decrementer if some irq work is pending */
> 	if (!test_irq_work_pending())
> 		set_dec(evt);
> 
> 	return 0;
> }
> 
> Along with a single occurrence of:
> 
> 	if (test_irq_work_pending())
> 		set_dec(1);
> 
> At the end of __timer_interrupt(), outside if the current else {}
> case, this should work, don't you think ?
> 
> What about this completely untested patch ?
> 
> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
> index 122a580..ba7e83b 100644
> --- a/arch/powerpc/kernel/time.c
> +++ b/arch/powerpc/kernel/time.c
> @@ -503,12 +503,13 @@ void __timer_interrupt(void)
>                 now = *next_tb - now;
>                 if (now <= DECREMENTER_MAX)
>                         set_dec((int)now);
> -               /* We may have raced with new irq work */
> -               if (test_irq_work_pending())
> -                       set_dec(1);
>                 __get_cpu_var(irq_stat).timer_irqs_others++;
>         }
> 
> +       /* We may have raced with new irq work */
> +       if (test_irq_work_pending())
> +               set_dec(1);
> +
>  #ifdef CONFIG_PPC64
>         /* collect purr register values often, for accurate calculations */
>         if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
> @@ -813,15 +814,11 @@ static void __init clocksource_init(void)
>  static int decrementer_set_next_event(unsigned long evt,
>                                       struct clock_event_device *dev)
>  {
> -       /* Don't adjust the decrementer if some irq work is pending */
> -       if (test_irq_work_pending())
> -               return 0;
>         __get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
> -       set_dec(evt);
> 
> -       /* We may have raced with new irq work */
> -       if (test_irq_work_pending())
> -               set_dec(1);
> +       /* Don't adjust the decrementer if some irq work is pending */
> +       if (!test_irq_work_pending())
> +               set_dec(evt);
> 
>         return 0;
>  }
> 
> 
> 
> 



More information about the Linuxppc-dev mailing list