[Skiboot] [PATCH v7 2/2] SBE: Add timer support

Tue Apr 17 17:01:39 AEST 2018

On 04/17/2018 09:48 AM, Benjamin Herrenschmidt wrote:
> On Mon, 2018-04-16 at 17:18 +0530, Vasant Hegde wrote:
>> SBE on P9 provides one shot programmable timer facility. We can use this
>> to implement OPAL timers and hence limit the reliance on the Linux
>> heartbeat (similar to HW timer facility provided by SLW on P8).
>>

.../...

>>   
>> +/*
>> + * Check if the timer is working. If at least 10ms elapsed since
>> + * last scheduled timer expiry.
>> + */
>> +static void p9_sbe_timer_poll(struct p9_sbe *sbe)
>> +{
>> +	if (!sbe_has_timer || !sbe_timer_in_progress)
>> +		return;
>> +
>> +	lock(&sbe->lock);
>> +	if (tb_compare(mftb(), sbe_last_gen_stamp + msecs_to_tb(10))
>> +	    != TB_AAFTERB)
>> +		goto out;
> 
> Do you need the lock for the above ?

Not required.

> 
>> +	/*
>> +	 * In some cases there will be a delay in calling OPAL interrupt
>> +	 * handler routine (opal_handle_interrupt). In such cases its
>> +	 * possible that SBE has responded, but OPAL didn't act on that.
>> +	 * Hence check for SBE response before disabling timer.
>> +	 */
>> +	__p9_sbe_interrupt(sbe);
>> +	if (got_timer_interrupt)
>> +		goto call_timer;
> 
> Again, why do you call that here ?
> 
> You just called p9_sbe_timeout_poll_one() for that same SBE, which
> already took the lock *and* called __p9_sbe_interrupt.
> 
> I would instead of a call-out from p9_sbe_timeout_poll_one() to check
> timers.

Yeah. We can move p9_sbe_timeout_poll inside p9_sbe_timeout_poll_one(). That way 
we can avoid lock-unlock. But we still need to call __p9_sbe_interrupt() twice.. 
as timer interrupt will come after removing msg from sbe->msg_list.

> 
>> +	if (!sbe_timer_in_progress)
>> +		goto out;
>> +
>> +	if (tb_compare(mftb(), sbe_last_gen_stamp + msecs_to_tb(10))
>> +	    != TB_AAFTERB)
>> +		goto out;
>> +
>> +	prlog(PR_ERR, "Timer stuck, falling back to OPAL pollers.\n");
>> +	prlog(PR_ERR, "You will likely have slower I2C and may have "
>> +	      "experienced increased jitter.\n");
>> +	p9_sbe_reg_dump(sbe_default_chip_id);
>> +	sbe_has_timer = false;
>> +	sbe_timer_in_progress = false;
>> +
>> +out:
>> +	unlock(&sbe->lock);
>> +	return;
>> +
>> +call_timer:
>> +	got_timer_interrupt = false;
>> +	/* Drop lock and call timers */
>> +	unlock(&sbe->lock);
>> +	check_timers(true);
>> +}
>> +
>>   static void p9_sbe_timeout_poll(void *user_data __unused)
>>   {
>>   	struct p9_sbe *sbe;
>> @@ -644,9 +736,119 @@ static void p9_sbe_timeout_poll(void *user_data __unused)
>>   			continue;
>>   		sbe = chip->sbe;
>>   		p9_sbe_timeout_poll_one(sbe);
>> +		if (sbe->chip_id == sbe_default_chip_id)
>> +			p9_sbe_timer_poll(sbe);
>>   	}
>>   }
>>   
>> +static void p9_sbe_timer_resp(struct p9_sbe_msg *msg)
>> +{
>> +	if (msg->state != sbe_msg_done) {
>> +		prlog(PR_DEBUG, "Failed to schedule timer [chip id %x]\n",
>> +		      sbe_default_chip_id);
>> +
>> +		if (!has_new_target)
>> +			return;
>> +	} else {
>> +		/* Update last scheduled timer value */
>> +		sbe_last_gen_stamp = mftb() +
>> +			usecs_to_tb(timer_ctrl_msg->reg[1]);
>> +		sbe_timer_in_progress = true;
>> +	}
> 
> The common path should be that we don't have a new target,
> I would suggest movign the !has_new_target down here, before
> the lock. To be race-free, we need to order it with the msg state, so
> you do need a sync.

Oh yes. I should have moved has_new_target to down.

> 
> What about the lockless updates to sbe_last_gen_stamp and
> sbe_timer_in_progress ? Is that safe?

I think its safe.

> 
>> +	lock(&sbe_timer_lock);
>> +	if (has_new_target) {
>> +		has_new_target = false;
>> +		p9_sbe_timer_schedule();
>> +	}
>> +	unlock(&sbe_timer_lock);
>> +}
>> +
>> +static void p9_sbe_timer_schedule(void)
>> +{
>> +	int rc;
>> +	u32 tick_us = SBE_TIMER_DEFAULT_US;
>> +	u64 tb_cnt, now = mftb();
>> +
>> +	if (sbe_timer_in_progress) {
>> +		/* Remaining time of inflight timer <= sbe_timer_def_tb */
>> +		if ((sbe_last_gen_stamp - now) <= sbe_timer_def_tb)
>> +			return;
>> +	}
>> +
>> +	if (now < sbe_timer_target) {
>> +		/* Calculate how many microseconds from now, rounded up */
>> +		if ((sbe_timer_target - now) > sbe_timer_def_tb) {
>> +			tb_cnt = sbe_timer_target - now + usecs_to_tb(1) - 1;
>> +			tick_us = tb_to_usecs(tb_cnt);
>> +		}
>> +	}
>> +
>> +	/* Clear sequence number. p9_sbe_queue_msg will add new sequene ID */
>> +	timer_ctrl_msg->reg[0] &= ~(PPC_BITMASK(32, 47));
>> +	/* Update timeout value */
>> +	timer_ctrl_msg->reg[1] = tick_us;
>> +	rc = p9_sbe_queue_msg(sbe_default_chip_id, timer_ctrl_msg,
>> +			      p9_sbe_timer_resp);
>> +	if (rc != OPAL_SUCCESS) {
>> +		prlog(PR_ERR, "Failed to start timer [chip id = %x]\n",
>> +		      sbe_default_chip_id);
>> +		return;
>> +	}
>> +}
>> +
>> +/*
>> + * This is called with the timer lock held, so there is no
>> + * issue with re-entrancy or concurrence
>> + */
>> +void p9_sbe_update_timer_expiry(uint64_t new_target)
>> +{
>> +	u64 now = mftb();
>> +
>> +	if (!sbe_has_timer || new_target == sbe_timer_target)
>> +		return;
>> +
>> +	lock(&sbe_timer_lock);
>> +	/* Timer message is in flight. Record new timer and schedule later */
>> +	if (p9_sbe_msg_busy(timer_ctrl_msg) || has_new_target) {
>> +		if (new_target < now)
>> +			goto out;
> 
> Will the caller handle the case where we just passed the target ? Will
> it re-check timers after calling us ?

Sorry. I'm not sure I understood your question.

So we enter this block only when new_target is < inflight timer. So we are good 
in source.
Now on callback path if SBE triggers early we have problem (..as check_timer 
doesn't re-arm).

May be fix check_timer to re-arm timer? -OR- do you want me to take care in 
driver itself?
We can take care here by doing something like below in p9_sbe_interrupt () function.
         if (got_timer_interrupt) {
                 got_timer_interrupt = false;
                 /* Drop lock and call timers */
                 unlock(&sbe->lock);
                 check_timers(true);

		/* Early interrupt , re-arm timer */
		if (sbe_last_gen_timestamp > mftb())
			p9_sbe_update_timer_expiry(sbe_last_gen_timestamp);

	}

-Vasant