[Skiboot] [PATCH RESEND v6 2/2] SBE: Add timer support

Tue Apr 17 13:40:18 AEST 2018

On Mon, 2018-04-16 at 15:26 +0530, Vasant Hegde wrote:
> 
> > >   	/* SBE came back from reset */
> > >   	if (data & SBE_HOST_RESET) {
> > >   		prlog(PR_NOTICE, "Back from reset [chip id = %x]\n", chip_id);
> > > @@ -547,6 +586,9 @@ clr_interrupt:
> > >   
> > >   	p9_sbe_process_queue(sbe);
> > >   	unlock(&sbe->lock);
> > > +	/* Drop lock and call timers */
> > > +	if (timer_interrupt)
> > > +		check_timers(true);
> > 
> > If for some reason the timer fired early, are you confident it will be
> > re-armed ?
> 
> Our understanding is SBE will never get back to us early. They will always come 
> after timeout.
> So we are good here.

No, you need to ensure you are robust. It could be something as simple
as the clock drifting between the SBE and the CPU for example.

> > 
> > >   }
> > >   
> > >   static void p9_sbe_timeout_poll(void *user_data __unused)
> > > @@ -598,6 +640,120 @@ static void p9_sbe_timeout_poll(void *user_data __unused)
> > >   		p9_sbe_process_queue(sbe);
> > >   		unlock(&sbe->lock);
> > >   	}
> > > +
> > > +	/*
> > > +	 * Check if the timer is working. If at least 1ms elapsed since
> > > +	 * last scheduled timer expiry.
> > > +	 */
> > > +	if (sbe_has_timer && sbe_timer_in_progress) {
> > > +		chip = get_chip(sbe_default_chip_id);
> > > +		sbe = chip->sbe;
> > > +		if (tb_compare(mftb(), sbe_last_gen_stamp + msecs_to_tb(1))
> > > +		    != TB_AAFTERB)
> > > +			return;
> > 
> > So I would have put that inside the above lock section, basically, if
> > the chip is the default chip, add the timeout logic (and move it to a
> > separate function).
> 
> Yeah. I will have separate p9_sbe_timer_poll() to timer timeout checking. Also I 
> will move above block  inside lock.
> 
> > 
> > > +		/*
> > > +		 * In some cases there will be a delay in calling OPAL interrupt
> > > +		 * handler routine (opal_handle_interrupt). In such cases its
> > > +		 * possible that SBE has responded, but OPAL didn't act on that.
> > > +		 * Hence check for SBE response before disabling timer.
> > > +		 */
> > > +		p9_sbe_interrupt(sbe_default_chip_id);
> > 
> > 
> > Otherwise you call the above twice in the same function and the
> > function becomes a mess.
> > 
> > Also you keep taking & releasing the lock, it's inefficient. Follow my
> > advice in the other patch, have the lock taken once in a per-chip thing
> > and everything else underneath is done with that one lock instance.
> 
> Agree. Moved to separate function.
> 
> 
> > 
> > > +
> > > +		lock(&sbe->lock);
> > > +		if (!sbe_timer_in_progress) {
> > > +			unlock(&sbe->lock);
> > > +			return;
> > > +		}
> > > +
> > > +		if (tb_compare(mftb(), sbe_last_gen_stamp + msecs_to_tb(1))
> > > +		    != TB_AAFTERB) {
> > > +			unlock(&sbe->lock);
> > > +			return;
> > > +		}
> > > +
> > > +		prlog(PR_ERR, "Timer stuck, falling back to OPAL pollers.\n");
> > > +		prlog(PR_ERR, "You will likely have slower I2C and may have "
> > > +		      "experienced increased jitter.\n");
> > > +		p9_sbe_reg_dump(sbe_default_chip_id);
> > > +		sbe_has_timer = false;
> > > +		sbe_timer_in_progress = false;
> > > +		unlock(&sbe->lock);
> > > +	}
> > > +}
> > > +
> > > +/*
> > > + * This is called with the timer lock held, so there is no
> > > + * issue with re-entrancy or concurrence
> > > + */
> > > +void p9_sbe_update_timer_expiry(uint64_t new_target)
> > > +{
> > > +	int rc;
> > > +	u32 tick_us = SBE_TIMER_DEFAULT_US;
> > > +	u64 tb_cnt, now = mftb();
> > > +
> > > +	if (!sbe_has_timer || new_target == sbe_timer_target)
> > > +		return;
> > > +
> > > +	sbe_timer_target = new_target;
> > > +
> > > +	/* Modify timer */
> > > +	if (sbe_timer_in_progress) {
> > > +		if (sbe_timer_target < now)
> > > +			return;
> > > +
> > > +		/* remaining time <= sbe_timer_def_tb */
> > > +		if ((sbe_last_gen_stamp - now) <= sbe_timer_def_tb)
> > > +			return;
> > > +
> > > +		if (sbe_timer_target > sbe_last_gen_stamp)
> > > +			return;
> > > +
> > > +		sbe_timer_in_progress = false;
> > 
> > So aren't you potentially re-using or modifying a message that is in
> > the middle of being sent ?
> > 
> > IE a previous call to p9_sbe_update_timer_expiry has queued the
> > message, it's now in the process of being sent, while this gets called,
> > cleares sbe_timer_in_progress, ad move on with hacking the "active"
> > message ... that isn't right.
> > 
> > You need to flag that a new target is needed and prep a new message
> > from the completion of the previous one.
> 
> Done.
> 
> I'm moving from sync msg to async message. Also I've "has_new_target" note down 
> new timer value while we are processing current timer.
> 
> Thanks
> -Vasant