[PATCH] net: mctp: Fix tx queue stall
Andrew Jeffery
andrew at codeconstruct.com.au
Mon Nov 24 16:35:59 AEDT 2025
Hi Marc,
On Fri, 2025-11-21 at 12:29 -0800, Marc Olberding wrote:
> From: Jinliang Wang <jinliangw at google.com>
>
> The tx queue can become permanently stuck in a stopped state due to a
> race condition between the URB submission path and its completion
> callback.
>
> The URB completion callback can run immediately after usb_submit_urb()
> returns, before the submitting function calls netif_stop_queue(). If
> this occurs, the queue state management becomes desynchronized, leading
> to a stall where the queue is never woken.
>
> Fix this by moving the netif_stop_queue() call to before submitting the
> URB. This closes the race window by ensuring the network stack is aware
> the queue is stopped before the URB completion can possibly run.
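For anyone following along, the ordering problem described above can be sketched with a toy user-space model. This is not the mctp-usb driver code: `ToyQueue`, `submit_urb()` and `on_complete()` are hypothetical stand-ins, and the model assumes the completion callback is what wakes the queue and that it wins the race every time.

```python
# Toy user-space model of the race; NOT the mctp-usb driver code.
# All names here are hypothetical stand-ins for illustration.

class ToyQueue:
    def __init__(self):
        self.stopped = False

    def stop(self):
        # Stands in for netif_stop_queue()
        self.stopped = True

    def wake(self):
        # Stands in for the wake-up assumed to happen in the completion callback
        self.stopped = False


def on_complete(queue):
    # Completion callback: the transfer is done, so let the stack send again.
    queue.wake()


def submit_urb(queue):
    # Worst case for the race: the completion runs before the submitter
    # gets to execute its next statement.
    on_complete(queue)


def xmit_stop_after_submit(queue):
    # Old ordering: submit first, then stop the queue.
    submit_urb(queue)
    queue.stop()  # the wake already happened, so nothing undoes this stop
    return queue.stopped


def xmit_stop_before_submit(queue):
    # Fixed ordering: stop the queue first, then submit.
    queue.stop()
    submit_urb(queue)  # completion wakes the queue after the stop
    return queue.stopped
```

In this model `xmit_stop_after_submit(ToyQueue())` leaves the queue stopped with no completion left to wake it, while `xmit_stop_before_submit(ToyQueue())` leaves it awake.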
>
> (cherry picked from commit da2522df3fcc6f57068470cbdcd6516d9eb76b37)
Interesting that this hasn't yet come in via stable, but oh well.
>
> Fixes: 0791c0327a6e ("net: mctp: Add MCTP USB transport driver")
> Signed-off-by: Jinliang Wang <jinliangw at google.com>
> Acked-by: Jeremy Kerr <jk at codeconstruct.com.au>
> Link: https://patch.msgid.link/20251027065530.2045724-1-jinliangw@google.com
> Signed-off-by: Jakub Kicinski <kuba at kernel.org>
> ---
> Backports a fix from net-next to openbmc 6.12 for a race condition
> in the mctp-usb driver that results in an effective deadlock.
> This was seen to fix issues on the nvl32-obmc model with pldm
> firmware update
>
> Signed-off-by: Marc Olberding <molberding at nvidia.com>
Just a quick note: because you've put your Signed-off-by below the ---
marker, git drops it when the patch is applied. You need to put your tag
in the trailer section above, under Jakub's S-o-b tag.
See [1] for a bit of a formalisation of it all.
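As a rough illustration of why the placement matters, here is a simplified model of how the commit message gets cut from a mailed patch. It assumes only a bare `---` line acts as the separator; git's real patch-break detection handles more cases, and `kept_commit_message()` is a hypothetical helper, not a git API.

```python
# Simplified model of how a commit message is cut from a mailed patch.
# Assumption: only a bare "---" line is treated as the separator here;
# git's actual detection recognises more cases than this.

def kept_commit_message(patch_body):
    kept = []
    for line in patch_body.splitlines():
        if line.rstrip() == "---":
            break  # everything from here on is not part of the commit message
        kept.append(line)
    return "\n".join(kept)


# Tag below the separator, as in the posted patch (abbreviated).
AS_POSTED = """\
Signed-off-by: Jakub Kicinski <kuba at kernel.org>
---
Backports a fix from net-next to openbmc 6.12

Signed-off-by: Marc Olberding <molberding at nvidia.com>
"""

# Tag moved into the trailer section, under Jakub's S-o-b.
AS_SUGGESTED = """\
Signed-off-by: Jakub Kicinski <kuba at kernel.org>
Signed-off-by: Marc Olberding <molberding at nvidia.com>
---
Backports a fix from net-next to openbmc 6.12
"""
```

With the tag moved above the separator it survives in the applied commit, and the backport note below the separator is still dropped as intended.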
Andrew
[1]: https://git-scm.com/docs/git-interpret-trailers