softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)

Ronny Meeus ronny.meeus at gmail.com
Tue Jul 12 22:03:04 EST 2011


On Tue, Jul 12, 2011 at 12:10 PM, Eric Dumazet <eric.dumazet at gmail.com> wrote:
> Le mardi 12 juillet 2011 à 11:23 +0200, Thomas De Schampheleire a
> écrit :
>> Hi,
>>
>> I'm adding the linuxppc-dev mailing list since this may be pointing to
>> an irq/softirq problem in the powerpc architecture-specific code...
>
>>
>> Note that the reason we are seeing this problem, may be because the
>> kernel we are using contains some patches from Freescale.
>> Specifically, in dev_queue_xmit(), support is added for hardware queue
>> handling, just before entering the rcu_read_lock_bh():
>>
>
> Oh well, what a mess.
>
>>         if (dev->features & NETIF_F_HW_QDISC) {
>>                 txq = dev_pick_tx(dev, skb);
>
>
>
>>                 return dev_hard_start_xmit(skb, dev, txq);
>        This need to be :
>                local_bh_disable();
>                rc = dev_hard_start_xmit(skb, dev, txq);
>                local_bh_enable();
>                return rc;
>
>
>>         }
>>
>>         /* Disable soft irqs for various locks below. Also
>>          * stops preemption for RCU.
>>          */
>>         rcu_read_lock_bh();
>>
>> We just tried moving the escaping to dev_hard_start_xmit() after
>> taking the lock, but this gives a large number of other problems, e.g.
>>
>> [   78.662428] BUG: sleeping function called from invalid context at
>> mm/slab.c:3101
>> [   78.751004] in_atomic(): 1, irqs_disabled(): 0, pid: 1908, name:
>> send_eth_socket
>> [   78.839582] Call Trace:
>> [   78.868784] [ec537b70] [c000789c] show_stack+0x78/0x18c (unreliable)
>> [   78.944905] [ec537bb0] [c0022900] __might_sleep+0x100/0x118
>> [   79.011636] [ec537bc0] [c00facc4] kmem_cache_alloc+0x48/0x118
>> [   79.080446] [ec537be0] [c02cd0e8] __alloc_skb+0x50/0x130
>> [   79.144047] [ec537c00] [c02cdf5c] skb_copy+0x44/0xc8
>> [   79.203478] [ec537c20] [c029f904] dpa_tx+0x154/0x758
>
> doing GFP_KERNEL allocations in dpa_tx() is wrong, for sure.
>
>
>> [   79.262907] [ec537c80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
>> [   79.335878] [ec537cc0] [c02d7aac] dev_queue_xmit+0x5c/0x3a4
>> [   79.402602] [ec537cf0] [c0338d4c] packet_sendmsg+0x8c4/0x988
>> [   79.470363] [ec537d70] [c02c3838] sock_sendmsg+0x90/0xb4
>> [   79.533960] [ec537e40] [c02c4420] sys_sendto+0xdc/0x120
>> [   79.596514] [ec537f10] [c02c57d0] sys_socketcall+0x148/0x210
>> [   79.664287] [ec537f40] [c001084c] ret_from_syscall+0x0/0x3c
>> [   79.731015] --- Exception: c01 at 0x48051f00
>> [   79.731019]     LR = 0x4808e030
>>
>>
>> Note that this may just be the cause for us seeing this problem. If
>> indeed the main problem is irq_exit() invoking softirqs in a locked
>> context, then this patch adding hardware queue support is not really
>> relevant.
>
> irq_exit() is fine. This is because BH are not masked because of the
> Freescale patches.
>
> Really, suggesting an af_packet patch to solve a problem introduced in
> an out of tree patch is insane.
>
> You guys hould have clearly stated you were using an alien kernel.
>
>
>
>

Sorry for not mentioning we were using a patched kernel.
I was not aware that the code involved was patched by the FreeScale
patches we applied. The code found in the stack dumps is not
implemented in FSL specific files.

While reading the code of af_packet I saw that the spin_lock_bh is
used in several places while this is not the case in the tpacket_rcv
function. Since we had a locking issue in that code, I thought that my
patch would be OK.
I was not aware that for that specific function (tpacket_rcv) a
different lock primitive must be used. A suggestion for improvement:
it would be better to document this pre-condition in the code.

After doing the change you proposed our code now looks like:

>---if (dev->features & NETIF_F_HW_QDISC) {
>--->---txq = dev_pick_tx(dev, skb);
>--->---local_bh_disable();
>--->---rc = dev_hard_start_xmit(skb, dev, txq);
>--->---local_bh_enable();
>--->---return rc;
>---}

>---/* Disable soft irqs for various locks below. Also
>--- * stops preemption for RCU.
>--- */
>---rcu_read_lock_bh();

but we still see the issue "BUG: sleeping function called from invalid context":

[   91.015989] BUG: sleeping function called from invalid context at
include/linux/skbuff.h:786
[   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
[   91.200461] Call Trace:
[   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
[   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
[   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758
[   91.431957] [ec58bc80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
[   91.504952] [ec58bcc0] [c02d7ab0] dev_queue_xmit+0x60/0x3ac
[   91.571692] [ec58bcf0] [c0338d54] packet_sendmsg+0x8c4/0x988
[   91.639457] [ec58bd70] [c02c3838] sock_sendmsg+0x90/0xb4
[   91.703066] [ec58be40] [c02c4420] sys_sendto+0xdc/0x120
[   91.765646] [ec58bf10] [c02c57d0] sys_socketcall+0x148/0x210
[   91.833420] [ec58bf40] [c001084c] ret_from_syscall+0x0/0x3c
[   91.900153] --- Exception: c01 at 0x4824df00
[   91.900157]     LR = 0x4828a030


The FreeScale patch that introduced this code was created by Andy
Fleming <afleming at freescale.com> (Put in CC).

The purpose of the patch is:
"
Subject: [PATCH] net: Add support for handling queueing in hardware

The QDisc code does a bunch of locking which is unnecessary if
you have hardware which handles all of the queueing.  Add
support for this, and skip over all of the queueing code if
the feature is enabled on a given device.
"

Ronny


More information about the Linuxppc-dev mailing list