[Skiboot] question about sbe-p9 timer

Thiago Jung Bauermann bauerman at linux.ibm.com
Fri Feb 14 11:32:50 AEDT 2020


Vasant Hegde <hegdevasant at linux.vnet.ibm.com> writes:

> On 2/13/20 7:16 AM, Oliver O'Halloran wrote:
>> On Thu, Feb 13, 2020 at 3:32 AM Thiago Jung Bauermann
>> <bauerman at linux.ibm.com> wrote:
>>>
>>>
>>> Hello,
>>>
>>> I've been studying the timer code in hw/sbe-p9.c and I have a question
>>> regarding concurrency: I noticed that in almost all cases,
>>> sbe_last_gen_stamp is accessed with sbe->lock held (where sbe is the
>>> default chip).
>>>
>>> Except in one case: p9_sbe_update_timer_expiry() calls
>>> p9_sbe_timer_schedule() - which reads sbe_last_gen_stamp - without
>>> acquiring sbe->lock. Is this a problem that needs to be fixed?
>>
>> The SBE timer state is protected by a separate lock (sbe_timer_lock)
>> since it's global state and not tied to a specific SBE. When starting
>> the timer we take the per-SBE lock for the default SBE since we need
>> to modify its message queue. Seems ok to me.
>
>
> Oliver is correct. sbe->lock is per SBE lock used for modifying message queue, etc.
>
> Timer is one specific functionality of SBE.. which is protected by `sbe_timer_lock`.
>
> I don't see any issue in the code.

I still see sbe_last_gen_stamp (and also sbe_timer_in_progress) being
modified without protection of the sbe_timer_lock, or perhaps I'm still
confused.

For instance, this could happen:

CPU 1                                   CPU 2

p9_sbe_interrupt()
  lock(&sbe->lock)
  __p9_sbe_interrupt()
    p9_sbe_send_complete()
      p9_sbe_msg_complete()
        p9_sbe_timer_resp()
          sbe_last_gen_stamp = ...
          sbe_timer_in_progress = ...

                                        p9_sbe_update_timer_expiry()
                                          lock(&sbe_timer_lock)
                                          p9_sbe_timer_schedule()
                                            if (sbe_timer_in_progress)
                                              if (now >= sbe_last_gen_stamp)

Notice that CPU 1 writes to sbe_last_gen_stamp and sbe_timer_in_progress
without holding the sbe_timer_lock, so CPU 2 could be reading them at
the same time. Could this cause problems?

My first reaction is that it could, and something like this would be
needed (untested):

diff --git a/hw/sbe-p9.c b/hw/sbe-p9.c
index 53f378e18..ffc121828 100644
--- a/hw/sbe-p9.c
+++ b/hw/sbe-p9.c
@@ -753,6 +753,7 @@ static void p9_sbe_timeout_poll(void *user_data __unused)
 
 static void p9_sbe_timer_resp(struct p9_sbe_msg *msg)
 {
+	lock(&sbe_timer_lock);
 	if (msg->state != sbe_msg_done) {
 		prlog(PR_DEBUG, "Failed to schedule timer [chip id %x]\n",
 		      sbe_default_chip_id);
@@ -763,10 +764,6 @@ static void p9_sbe_timer_resp(struct p9_sbe_msg *msg)
 		sbe_timer_in_progress = true;
 	}
 
-	if (!has_new_target)
-		return;
-
-	lock(&sbe_timer_lock);
 	if (has_new_target) {
 		has_new_target = false;
 		p9_sbe_timer_schedule();

--
Thiago Jung Bauermann
IBM Linux Technology Center


More information about the Skiboot mailing list