From: Aditya Garg gargaditya@linux.microsoft.com
[ Upstream commit 45120304e84171fd215c1b57b15b285446d15106 ]
Drop TX packets when posting the work request fails and ensure DMA mappings are always cleaned up.
Signed-off-by: Aditya Garg gargaditya@linux.microsoft.com Reviewed-by: Haiyang Zhang haiyangz@microsoft.com Link: https://patch.msgid.link/1763464269-10431-3-git-send-email-gargaditya@linux.... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
Looking at this commit, I need to analyze it for stable backporting suitability.
## Commit Analysis
### What the commit does:
1. **In `mana_en.c`**: When `mana_gd_post_work_request()` fails during TX: - Previously: returned `NETDEV_TX_BUSY` without cleaning up DMA mappings - Now: calls `mana_unmap_skb()` to clean up DMA mappings, then drops the packet properly
2. **In `gdma_main.c`**: Removes a noisy `dev_err` message for flow control failure
3. **In `mana.h`**: Exports `mana_unmap_skb()` function for use across files
### Bug Being Fixed
Looking at the original error path: ```c if (err) { (void)skb_dequeue_tail(&txq->pending_skbs); netdev_warn(ndev, "Failed to post TX OOB: %d\n", err); err = NETDEV_TX_BUSY; goto tx_busy; // DMA mappings NOT unmapped! } ```
The original code has **two bugs**: 1. **DMA resource leak**: When `mana_gd_post_work_request()` fails, the DMA mappings set up earlier (via `mana_map_skb()`) are never freed 2. **Improper error handling**: Returns `NETDEV_TX_BUSY` which tells the network stack to retry, but the skb was already dequeued from `pending_skbs`, leading to inconsistent state
### Code Change Assessment
The fix is straightforward and surgical: ```c if (err) { (void)skb_dequeue_tail(&txq->pending_skbs); mana_unmap_skb(skb, apc); // NEW: Clean up DMA mappings netdev_warn(ndev, "Failed to post TX OOB: %d\n", err); goto free_sgl_ptr; // Drop packet properly } ```
### Stable Criteria Evaluation
| Criterion | Assessment | |-----------|------------| | Fixes real bug | ✅ Yes - DMA resource leak and improper error handling | | Obviously correct | ✅ Yes - unmapping DMA on error is clearly correct | | Small and contained | ✅ Yes - only affects error path in MANA driver | | No new features | ✅ Yes - pure bug fix | | User impact | ✅ Medium-high - affects Azure VM users; leaks can cause long-term instability |
### Risk Assessment
- **Low risk**: Changes only affect the TX error path - **Self-contained**: The `mana_unmap_skb()` function already exists, just made non-static - **No dependencies**: This commit is standalone - **Well-tested**: Has "Reviewed-by" tag from Microsoft maintainer
### Missing Signals
- No `Cc: stable@vger.kernel.org` tag - No `Fixes:` tag
However, this is clearly fixing a resource leak bug that has existed since the error path was written. Resource leaks in network drivers are exactly the kind of bugs stable trees should fix.
### User Impact
The MANA driver is used in Microsoft Azure VMs. DMA mapping leaks: - Cause memory pressure over time - Can lead to allocation failures under sustained TX errors - May cause VM instability in production cloud environments
This is a production-relevant bug affecting real cloud users.
### Conclusion
This commit fixes a genuine resource leak bug in the MANA network driver. The fix is: - Small (adds one function call + declaration) - Surgical (only affects error path) - Obviously correct (unmapping DMA on error is mandatory) - Low risk (doesn't touch normal TX path) - Important for production Azure VM users
The lack of stable/Fixes tags is unfortunate but doesn't diminish the clear bug-fix nature of this change.
**YES**
drivers/net/ethernet/microsoft/mana/gdma_main.c | 6 +----- drivers/net/ethernet/microsoft/mana/mana_en.c | 7 +++---- include/net/mana/mana.h | 1 + 3 files changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c index 43f034e180c41..7d232ac250958 100644 --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c @@ -1299,7 +1299,6 @@ int mana_gd_post_work_request(struct gdma_queue *wq, struct gdma_posted_wqe_info *wqe_info) { u32 client_oob_size = wqe_req->inline_oob_size; - struct gdma_context *gc; u32 sgl_data_size; u32 max_wqe_size; u32 wqe_size; @@ -1329,11 +1328,8 @@ int mana_gd_post_work_request(struct gdma_queue *wq, if (wqe_size > max_wqe_size) return -EINVAL;
- if (wq->monitor_avl_buf && wqe_size > mana_gd_wq_avail_space(wq)) { - gc = wq->gdma_dev->gdma_context; - dev_err(gc->dev, "unsuccessful flow control!\n"); + if (wq->monitor_avl_buf && wqe_size > mana_gd_wq_avail_space(wq)) return -ENOSPC; - }
if (wqe_info) wqe_info->wqe_size_in_bu = wqe_size / GDMA_WQE_BU_SIZE; diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index 0142fd98392c2..6d37f39930453 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -438,9 +438,9 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
if (err) { (void)skb_dequeue_tail(&txq->pending_skbs); + mana_unmap_skb(skb, apc); netdev_warn(ndev, "Failed to post TX OOB: %d\n", err); - err = NETDEV_TX_BUSY; - goto tx_busy; + goto free_sgl_ptr; }
err = NETDEV_TX_OK; @@ -460,7 +460,6 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) tx_stats->bytes += len + ((num_gso_seg - 1) * gso_hs); u64_stats_update_end(&tx_stats->syncp);
-tx_busy: if (netif_tx_queue_stopped(net_txq) && mana_can_tx(gdma_sq)) { netif_tx_wake_queue(net_txq); apc->eth_stats.wake_queue++; @@ -1606,7 +1605,7 @@ static int mana_move_wq_tail(struct gdma_queue *wq, u32 num_units) return 0; }
-static void mana_unmap_skb(struct sk_buff *skb, struct mana_port_context *apc) +void mana_unmap_skb(struct sk_buff *skb, struct mana_port_context *apc) { struct mana_skb_head *ash = (struct mana_skb_head *)skb->head; struct gdma_context *gc = apc->ac->gdma_dev->gdma_context; diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h index 0921485565c05..330e1bb088bb9 100644 --- a/include/net/mana/mana.h +++ b/include/net/mana/mana.h @@ -580,6 +580,7 @@ int mana_set_bw_clamp(struct mana_port_context *apc, u32 speed, void mana_query_phy_stats(struct mana_port_context *apc); int mana_pre_alloc_rxbufs(struct mana_port_context *apc, int mtu, int num_queues); void mana_pre_dealloc_rxbufs(struct mana_port_context *apc); +void mana_unmap_skb(struct sk_buff *skb, struct mana_port_context *apc);
extern const struct ethtool_ops mana_ethtool_ops; extern struct dentry *mana_debugfs_root;