On 05/07, Cosmin Ratiu wrote:
On Wed, 2025-05-07 at 15:13 +0000, Cosmin Ratiu wrote:
In any case, please hold off with picking this patch up, it seems there's a possibility of a real deadlock. Here's the scenario:
============================================ WARNING: possible recursive locking detected
ethtool/44150 is trying to acquire lock: ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: __netdev_update_features+0x31e/0xe20
but task is already holding lock: ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: ethnl_set_features+0xbc/0x4b0 and the lock comparison function returns 0:
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(&dev_instance_lock_key#7); lock(&dev_instance_lock_key#7);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by ethtool/44150: #0: ffffffff830e5a50 (cb_lock){++++}-{4:4}, at: genl_rcv+0x15/0x40 #1: ffffffff830cf708 (rtnl_mutex){+.+.}-{4:4}, at: ethnl_set_features+0x88/0x4b0 #2: ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: ethnl_set_features+0xbc/0x4b0
stack backtrace: Call Trace: <TASK> dump_stack_lvl+0x69/0xa0 print_deadlock_bug.cold+0xbd/0xca __lock_acquire+0x163c/0x2f00 lock_acquire+0xd3/0x2e0 __mutex_lock+0x98/0xf10 __netdev_update_features+0x31e/0xe20 netdev_update_features+0x1f/0x60 vlan_device_event+0x57d/0x930 [8021q] notifier_call_chain+0x3d/0x100 netdev_features_change+0x32/0x50 ethnl_set_features+0x17e/0x4b0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x188/0x290 [...]
I'm not sure how to solve this yet... Cosmin.
If it's not clear, the problem is that:
- the lower device is already ops locked
- netdev_feature_change gets called.
- __netdev_update_features gets called for the vlan (upper) dev.
- It tries to acquire the same lock instance as 1 (this patch).
- Deadlock.
One solution I can think of would be to run device notifiers for changing features outside the lock, it doesn't seem like the netdev lock has anything to do with what other devices might do with this information.
This can be triggered from many scenarios, I have another similar stack involving bonding.
What do you think?
All I could think of was to drop the lock during the netdev_features_changed notifier calls, like in the following hunk. I'm running this through regressions, let's see if it's a good idea or not.
diff --git a/net/core/dev.c b/net/core/dev.c index 1be7cb73a602..817fd5bc21b1 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1514,7 +1514,12 @@ int dev_get_alias(const struct net_device *dev, char *name, size_t len) */ void netdev_features_change(struct net_device *dev) {
- /* Drop the lock to avoid potential deadlocks from e.g. upper
dev
* notifiers altering features of 'dev' and acquiring dev lock
again.
*/
- netdev_unlock_ops(dev); call_netdevice_notifiers(NETDEV_FEAT_CHANGE, dev);
- netdev_lock_ops(dev);
} EXPORT_SYMBOL(netdev_features_change);
Hmm, are you sure you're calling __netdev_update_features on the upper? I don't see how the lower would be locked in that case. From my POW, this is what happens:
1. your dev (lower) has a vlan on it (upper) 2. you call lro=off on the _lower_ 3. this triggers FEAT_CHANGE notifier and vlan_device_event catches it 4. since the lower has a vlan device (dev->vlan_info != NULL), it goes over every other vlan in the group and triggers netdev_update_features for the upper (netdev_update_features vlandev) 5. the upper tries to sync the features into the lower (including the one that triggered FEAT_CHANGE) and that's where the deadlock happens
But I think (5) should be largely a no-op for the device triggering the notification, because the features have been already applied in ethnl_set_features. I'd move the lock into netdev_sync_lower_features, and only after checking the features (and making sure that we are going to change them). The feature check might be racy, but I think it should still work?
Can you also share the bonding stacktrace as well to confirm it's the same issue?
diff --git a/net/core/dev.c b/net/core/dev.c index bcb266ab2912..b5fc8a740e8b 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -10312,6 +10312,7 @@ static void netdev_sync_lower_features(struct net_device *upper, for_each_netdev_feature(upper_disables, feature_bit) { feature = __NETIF_F_BIT(feature_bit); if (!(features & feature) && (lower->features & feature)) { + netdev_lock_ops(lower); netdev_dbg(upper, "Disabling feature %pNF on lower dev %s.\n", &feature, lower->name); lower->wanted_features &= ~feature; @@ -10322,6 +10323,7 @@ static void netdev_sync_lower_features(struct net_device *upper, &feature, lower->name); else netdev_features_change(lower); + netdev_unlock_ops(lower); } } }