[Skiboot] [PATCH] fix lock error when BT IRQ preempt BT timer
Vasant Hegde
hegdevasant at linux.vnet.ibm.com
Thu Jan 7 22:36:04 AEDT 2021
On 1/6/21 2:03 PM, lixg wrote:
> BT IRQ may preempt BT timer if BMC response host when bt msg timeout.
> When BT IRQ preempt BT timer, the infight_bt_msg did not protected by bt.lock very well.
>
> And we will see the following log:
> [29006114.163785853,3] BT: seq 0x81 netfn 0x0a cmd 0x23: Timeout sending message
> [29006114.288029290,3] BT: seq 0x81 netfn 0x0b cmd 0x23: Timeout sending message
> [29006114.288917798,3] IPMI: Incorrect netfn 0x0b in response
>
> It may cause 'CPU Hardlock UP', 'memory refree', 'kernel crash' or something else...
>
> Signed-off-by: lixg <867314078 at qq.com>
> ---
> hw/bt.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/hw/bt.c b/hw/bt.c
> index cf967f89..24e6ef7f 100644
> --- a/hw/bt.c
> +++ b/hw/bt.c
> @@ -111,7 +111,7 @@ struct bt {
> };
>
> static struct bt bt;
> -static struct bt_msg *inflight_bt_msg; /* Holds in flight message */
> +static struct bt_msg * volatile inflight_bt_msg; /* Holds in flight message */
Why do we need volatile here?
>
> static int ipmi_seq;
>
> @@ -211,6 +211,11 @@ static void bt_msg_del(struct bt_msg *bt_msg)
> {
> list_del(&bt_msg->link);
> bt.queue_len--;
> +
> + /* once inflight_bt_msg out of list, it should be emptyed */
> + if (bt_msg == inflight_bt_msg)
> + inflight_bt_msg = NULL;
> +
> unlock(&bt.lock);
> ipmi_cmd_done(bt_msg->ipmi_msg.cmd,
> IPMI_NETFN_RETURN_CODE(bt_msg->ipmi_msg.netfn),
> @@ -394,7 +399,7 @@ static void bt_expire_old_msg(uint64_t tb)
> bt_msg_del(bt_msg);
>
> /* Ready to send next message */
> - inflight_bt_msg = NULL;
> + //inflight_bt_msg = NULL;
We don't want to keep commented code. Please remove this.
-Vasant
More information about the Skiboot
mailing list