[Skiboot] [PATCH RESEND v6 2/2] SBE: Add timer support
Vasant Hegde
hegdevasant at linux.vnet.ibm.com
Mon Apr 16 19:56:58 AEST 2018
On 04/13/2018 03:05 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2018-04-03 at 17:27 +0530, Vasant Hegde wrote:
>> SBE on P9 provides one shot programmable timer facility. We can use this
>> to implement OPAL timers and hence limit the reliance on the Linux
>> heartbeat (similar to HW timer facility provided by SLW on P8).
.../...
>>
>> + /* Timer expired */
>> + if (data & SBE_HOST_TIMER_EXPIRY) {
>> + if (!p9_sbe_timer_got_ack()) {
>> + data &= ~(SBE_HOST_TIMER_EXPIRY);
>> + } else {
>> + sbe_timer_in_progress = false;
>> + timer_interrupt = true;
>> + }
>> + }
>
> So not sure I understand the above. It's the same bit in the doorbell
> that is the "ack" and the timer expiry ? That means you basically need
> "2" interrupts, one for the ack, one for the expiry, when you set the
> timer ?
Its same doorbell register, but different bits. In some cases there will be
delay in processing interrupt and we endup getting both bits set simultaneously.
So I added get_ack() to make sure we process ack before processing timer
interrupt. In such cases we endup having two interrupt to process timer.
I'm changing from sync_msg() to async_msg(). With that we don't need about stuff
anyway.
>
> That's racy no ? IE what if you poll a bit slowly bcs you're busy
> elsewhere and get both ack and timer interrupt ? They coalesce into a
> single event and your timer is gone, no ?
>
>> /* SBE came back from reset */
>> if (data & SBE_HOST_RESET) {
>> prlog(PR_NOTICE, "Back from reset [chip id = %x]\n", chip_id);
>> @@ -547,6 +586,9 @@ clr_interrupt:
>>
>> p9_sbe_process_queue(sbe);
>> unlock(&sbe->lock);
>> + /* Drop lock and call timers */
>> + if (timer_interrupt)
>> + check_timers(true);
>
> If for some reason the timer fired early, are you confident it will be
> re-armed ?
Our understanding is SBE will never get back to us early. They will always come
after timeout.
So we are good here.
>
>> }
>>
>> static void p9_sbe_timeout_poll(void *user_data __unused)
>> @@ -598,6 +640,120 @@ static void p9_sbe_timeout_poll(void *user_data __unused)
>> p9_sbe_process_queue(sbe);
>> unlock(&sbe->lock);
>> }
>> +
>> + /*
>> + * Check if the timer is working. If at least 1ms elapsed since
>> + * last scheduled timer expiry.
>> + */
>> + if (sbe_has_timer && sbe_timer_in_progress) {
>> + chip = get_chip(sbe_default_chip_id);
>> + sbe = chip->sbe;
>> + if (tb_compare(mftb(), sbe_last_gen_stamp + msecs_to_tb(1))
>> + != TB_AAFTERB)
>> + return;
>
> So I would have put that inside the above lock section, basically, if
> the chip is the default chip, add the timeout logic (and move it to a
> separate function).
Yeah. I will have separate p9_sbe_timer_poll() to timer timeout checking. Also I
will move above block inside lock.
>
>> + /*
>> + * In some cases there will be a delay in calling OPAL interrupt
>> + * handler routine (opal_handle_interrupt). In such cases its
>> + * possible that SBE has responded, but OPAL didn't act on that.
>> + * Hence check for SBE response before disabling timer.
>> + */
>> + p9_sbe_interrupt(sbe_default_chip_id);
>
>
> Otherwise you call the above twice in the same function and the
> function becomes a mess.
>
> Also you keep taking & releasing the lock, it's inefficient. Follow my
> advice in the other patch, have the lock taken once in a per-chip thing
> and everything else underneath is done with that one lock instance.
Agree. Moved to separate function.
>
>> +
>> + lock(&sbe->lock);
>> + if (!sbe_timer_in_progress) {
>> + unlock(&sbe->lock);
>> + return;
>> + }
>> +
>> + if (tb_compare(mftb(), sbe_last_gen_stamp + msecs_to_tb(1))
>> + != TB_AAFTERB) {
>> + unlock(&sbe->lock);
>> + return;
>> + }
>> +
>> + prlog(PR_ERR, "Timer stuck, falling back to OPAL pollers.\n");
>> + prlog(PR_ERR, "You will likely have slower I2C and may have "
>> + "experienced increased jitter.\n");
>> + p9_sbe_reg_dump(sbe_default_chip_id);
>> + sbe_has_timer = false;
>> + sbe_timer_in_progress = false;
>> + unlock(&sbe->lock);
>> + }
>> +}
>> +
>> +/*
>> + * This is called with the timer lock held, so there is no
>> + * issue with re-entrancy or concurrence
>> + */
>> +void p9_sbe_update_timer_expiry(uint64_t new_target)
>> +{
>> + int rc;
>> + u32 tick_us = SBE_TIMER_DEFAULT_US;
>> + u64 tb_cnt, now = mftb();
>> +
>> + if (!sbe_has_timer || new_target == sbe_timer_target)
>> + return;
>> +
>> + sbe_timer_target = new_target;
>> +
>> + /* Modify timer */
>> + if (sbe_timer_in_progress) {
>> + if (sbe_timer_target < now)
>> + return;
>> +
>> + /* remaining time <= sbe_timer_def_tb */
>> + if ((sbe_last_gen_stamp - now) <= sbe_timer_def_tb)
>> + return;
>> +
>> + if (sbe_timer_target > sbe_last_gen_stamp)
>> + return;
>> +
>> + sbe_timer_in_progress = false;
>
> So aren't you potentially re-using or modifying a message that is in
> the middle of being sent ?
>
> IE a previous call to p9_sbe_update_timer_expiry has queued the
> message, it's now in the process of being sent, while this gets called,
> cleares sbe_timer_in_progress, ad move on with hacking the "active"
> message ... that isn't right.
>
> You need to flag that a new target is needed and prep a new message
> from the completion of the previous one.
Done.
I'm moving from sync msg to async message. Also I've "has_new_target" note down
new timer value while we are processing current timer.
Thanks
-Vasant
More information about the Skiboot
mailing list