[Skiboot] [PATCH] Fix lock error when BT IRQ preempts BT timer
lixg
lixgemail at gmail.com
Wed Jan 6 19:33:18 AEDT 2021
The BT IRQ may preempt the BT timer if the BMC responds to the host just as a BT message times out.
When the BT IRQ preempts the BT timer, inflight_bt_msg is not adequately protected by bt.lock.
In that case the following log appears:
[29006114.163785853,3] BT: seq 0x81 netfn 0x0a cmd 0x23: Timeout sending message
[29006114.288029290,3] BT: seq 0x81 netfn 0x0b cmd 0x23: Timeout sending message
[29006114.288917798,3] IPMI: Incorrect netfn 0x0b in response
This may cause a CPU hard lockup, memory being freed twice, a kernel crash, or other failures.
Signed-off-by: lixg <867314078 at qq.com>
---
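Note for reviewers: the window this patch is meant to close can be modelled in a few lines of standalone C. This is only a sketch under assumptions, not skiboot code: struct msg, struct queue, q_lock(), q_unlock(), q_del() and expire_inflight() are hypothetical stand-ins, the lock is a toy flag, and re-taking the lock at the end of q_del() is assumed rather than shown in the hunks below. The "IRQ" is simulated by inspecting the in-flight pointer while the lock is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for skiboot's bt_msg and bt state (not the real types). */
struct msg {
	struct msg *next;
	int seq;
};

struct queue {
	struct msg *head;     /* queued messages */
	struct msg *inflight; /* message currently awaiting a response */
	int len;
	int locked;           /* toy lock flag; real code uses lock()/unlock() */
};

static int stale_seen; /* set when the unlocked window sees a dangling pointer */

static void q_lock(struct queue *q)   { assert(!q->locked); q->locked = 1; }
static void q_unlock(struct queue *q) { assert(q->locked);  q->locked = 0; }

/* Unlink m with the lock held, then drop the lock for the completion callback. */
static void q_del(struct queue *q, struct msg *m, int clear_in_del)
{
	struct msg **pp;

	assert(q->locked);
	for (pp = &q->head; *pp; pp = &(*pp)->next) {
		if (*pp == m) {
			*pp = m->next;
			q->len--;
			break;
		}
	}

	/* Patched behaviour: clear the in-flight pointer before unlocking. */
	if (clear_in_del && q->inflight == m)
		q->inflight = NULL;

	q_unlock(q);
	/* Unlocked window: an interrupt handler could take the lock here. */
	if (q->inflight == m)
		stale_seen = 1;
	q_lock(q); /* assumption: the lock is re-taken before returning */
}

/* Returns 1 if a removed message was still visible as in-flight. */
static int expire_inflight(int clear_in_del)
{
	struct msg m = { NULL, 0x81 };
	struct queue q = { &m, &m, 1, 0 };

	stale_seen = 0;
	q_lock(&q);
	q_del(&q, &m, clear_in_del);
	q.inflight = NULL; /* unpatched code cleared the pointer only here */
	q_unlock(&q);
	return stale_seen;
}
```

In this model expire_inflight(0), which clears the pointer only after q_del() returns, reports a dangling in-flight pointer during the unlocked window, while expire_inflight(1), which clears it before the unlock as this patch does, reports none.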
hw/bt.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/hw/bt.c b/hw/bt.c
index cf967f89..24e6ef7f 100644
--- a/hw/bt.c
+++ b/hw/bt.c
@@ -111,7 +111,7 @@ struct bt {
};
static struct bt bt;
-static struct bt_msg *inflight_bt_msg; /* Holds in flight message */
+static struct bt_msg * volatile inflight_bt_msg; /* Holds in flight message */
static int ipmi_seq;
@@ -211,6 +211,11 @@ static void bt_msg_del(struct bt_msg *bt_msg)
{
list_del(&bt_msg->link);
bt.queue_len--;
+
+ /* Once inflight_bt_msg is removed from the list, it must be cleared */
+ if (bt_msg == inflight_bt_msg)
+ inflight_bt_msg = NULL;
+
unlock(&bt.lock);
ipmi_cmd_done(bt_msg->ipmi_msg.cmd,
IPMI_NETFN_RETURN_CODE(bt_msg->ipmi_msg.netfn),
@@ -394,7 +399,7 @@ static void bt_expire_old_msg(uint64_t tb)
bt_msg_del(bt_msg);
/* Ready to send next message */
- inflight_bt_msg = NULL;
+ //inflight_bt_msg = NULL;
/*
* Timing out a message is inherently racy as the BMC
--
2.17.1