Hello Eric,
On Wed, Feb 12, 2025 at 07:55:32PM +0100, Eric Dumazet wrote:
> On Wed, Feb 12, 2025 at 7:34 PM Breno Leitao <leitao@debian.org> wrote:
> > --- a/drivers/net/netdevsim/netdev.c
> > +++ b/drivers/net/netdevsim/netdev.c
> > @@ -87,7 +87,9 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	if (unlikely(nsim_forward_skb(peer_dev, skb, rq) == NET_RX_DROP))
> >  		goto out_drop_cnt;
> >
> > +	local_bh_disable();
> >  	napi_schedule(&rq->napi);
> > +	local_bh_enable();
>
> I thought all ndo_start_xmit() were done under local_bh_disable()

I think it depends on the path?

> Could you give more details ?

There are several paths to ndo_start_xmit(); please correct me if I am misreading the code here.
Common path:

	__dev_direct_xmit()
		local_bh_disable();
		netdev_start_xmit()
			__netdev_start_xmit()
				ops->ndo_start_xmit(skb, dev);

But, in some other cases, I see:

	netpoll_start_xmit()
		netdev_start_xmit()
			....

My reading is that not all paths call local_bh_disable() before calling ndo_start_xmit().
Question: Must BH be disabled before calling ndo_start_xmit()? If so, the problem might be in the netpoll code!? Also, is it worth adding a DEBUG_NET_WARN_ON_ONCE()?
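
To make the question concrete, here is a rough userspace sketch of the two call chains above together with a warn-once style check at the driver entry point. Everything in it is hypothetical: the sim_* functions, bh_disable_depth and bh_violations are stand-ins invented for illustration, not kernel APIs, and the model only mirrors the chains as written in this email. It does not model IRQ state or any other context the real callers may run in, so it says nothing definitive about whether netpoll is actually wrong.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the BH-disable nesting depth. */
static int bh_disable_depth;

/* How many times the modelled driver hook was entered with BH enabled. */
static int bh_violations;

static void sim_local_bh_disable(void)
{
	bh_disable_depth++;
}

static void sim_local_bh_enable(void)
{
	assert(bh_disable_depth > 0);	/* catch an unbalanced enable */
	bh_disable_depth--;
}

/*
 * Modelled ndo_start_xmit(): returns true if BH was disabled on entry.
 * Counts every violation, but prints the warning only once, in the
 * spirit of a *_WARN_ON_ONCE() check.
 */
static bool sim_ndo_start_xmit(void)
{
	static bool warned;
	bool bh_off = bh_disable_depth > 0;

	if (!bh_off) {
		bh_violations++;
		if (!warned) {
			warned = true;
			fprintf(stderr, "WARN: ndo_start_xmit() entered with BH enabled\n");
		}
	}
	return bh_off;
}

/* Mirrors the "common path" chain: BH is disabled around the call. */
static bool sim_dev_direct_xmit(void)
{
	bool bh_off;

	sim_local_bh_disable();
	bh_off = sim_ndo_start_xmit();
	sim_local_bh_enable();
	return bh_off;
}

/* Mirrors the netpoll chain as sketched above: no BH disable in this model. */
static bool sim_netpoll_start_xmit(void)
{
	return sim_ndo_start_xmit();
}
```

In this model, sim_dev_direct_xmit() leaves bh_violations untouched, while each call to sim_netpoll_start_xmit() increments it and only the first one prints the warning.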
Note: Jakub gave another suggestion on how to fix this, so, I send a v2 with a different approach:
https://lore.kernel.org/all/20250213071426.01490615@kernel.org/
Thanks for the review!
--breno