[RFC V2 PATCH 2/6] powerpc: Implement broadcast timer interrupt as an IPI message

Preeti U Murthy preeti at linux.vnet.ibm.com
Thu Aug 22 15:49:42 EST 2013


Hi Ben

On 08/22/2013 08:40 AM, Benjamin Herrenschmidt wrote:
> On Wed, 2013-08-14 at 17:26 +0530, Preeti U Murthy wrote:
>> -static irqreturn_t unused_action(int irq, void *data)
>> +static irqreturn_t timer_action(int irq, void *data)
>>  {
>> -       /* This slot is unused and hence available for use, if needed
>> */
>> +       timer_interrupt();
>>         return IRQ_HANDLED;
>>  }
>>  
> 
> That means we'll do irq_enter/irq_exit twice no ? And things like
> may_hard_irq_enable() are also already done by do_IRQ so you
> don't need timer_interrupt() to do it again.
> 
> We probably are better off breaking timer_interrupt in two:
> 
> void __timer_interrupt(struct pt_regs * regs)
> 
> Does the current stuff between irq_enter and irq_exit, timer_interrupt
> does the remaining around it and calls __timer_interrupt.
> 
> Then from timer_action, you call __timer_interrupt()

We actually tried out this approach. The implementation was have a
set_dec(0) in the timer_action(). This would ensure that we actually do
get a timer interrupt.

But the problem with either this approach or the one that you
suggest,i.e. calling __timer_interrupt is in the following flow.

do_IRQ() -> irq_exit() -> tick_irq_exit() -> tick_nohz_irq_exit()
-> tick_nohz_stop_sched_tick()

The problem lies in the function tick_nohz_stop_sched_tick(). This
function checks for the next timer interrupt pending on this cpu, and
programs the decrementer_next_event to the time of the next event, which
is of course > now.

As a result when in the timer_action() above, we do call
__timer_interrupt() or try to trigger a timer interrupt through
set_dec(0), the condition  if(now >= *next_tb) in timer_interrupt() or
__timer_interrupt() will fail, and will not call the timer interrupt
event handler.

---> if (now >= *next_tb) {
             *next_tb = ~(u64)0;
             if (evt->event_handler)
                evt->event_handler(evt);
      } else {

The broadcast IPI , is meant to make the target CPU of this IPI believe
that it woke up from a timer interrupt, and not from an IPI. (The reason
for this I will explain in the reply to the next patch). The target CPU
should then ideally do what it would have done had it received a real
timer interrupt, call the timer interrupt event handler.

But due to the above code flow this does  not happen.
Hence as the next patch PATCH 3/6 shows, we simply call the event
handler of a timer interrupt without this explicit now >= *next_tb check.

This problem arises only in the implementation of this patchset, because
a timer interrupt is pseudo triggered from an IPI. So the effects of the
IPI handler will be felt on the timer interrupt handler triggered from
this IPI, like above.

Regards
Preeti U Murthy



More information about the Linuxppc-dev mailing list