-----Original Message-----
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Sent: September 20, 2024 22:25
To: Wei Fang <wei.fang@nxp.com>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>; davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu Manoil <claudiu.manoil@nxp.com>; ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org; imx@lists.linux.dev
Subject: Re: [PATCH net 3/3] net: enetc: reset xdp_tx_in_flight when updating bpf program
On Fri, Sep 20, 2024 at 05:05:14PM +0300, Wei Fang wrote:
zero init is good but shouldn't you be draining these buffers when removing XDP resources at least? what happens with DMA mappings that are related to these cached buffers?
All the buffers will be freed and the DMA mappings unmapped when the XDP program is installed.
There is still a problem with the patch you proposed here, which is that enetc_reconfigure() has one more call site, from enetc_hwtstamp_set().

If enetc_free_rxtx_rings() is the one that gets rid of the stale buffers, it should also be the one that resets xdp_tx_in_flight. Otherwise you will still leave the problem unsolved where XDP_TX can be interrupted by a change in hwtstamping state, and the software "in flight" counter gets out of sync with the ring state.
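(For concreteness, a minimal sketch of the placement being suggested -- the enetc_free_rx_ring()/enetc_free_tx_ring() helpers and the rx_ring->xdp.xdp_tx_in_flight field location are assumed from the upstream driver, they are not quoted in this thread:)

static void enetc_free_rxtx_rings(struct enetc_ndev_priv *priv)
{
	int i;

	for (i = 0; i < priv->num_rx_rings; i++) {
		struct enetc_bdr *rx_ring = priv->rx_ring[i];

		enetc_free_rx_ring(rx_ring);
		/* The stale XDP_TX buffers are gone at this point, so the
		 * software counter must not claim otherwise, regardless of
		 * whether we got here from XDP setup or from
		 * enetc_hwtstamp_set().
		 */
		rx_ring->xdp.xdp_tx_in_flight = 0;
	}

	for (i = 0; i < priv->num_tx_rings; i++)
		enetc_free_tx_ring(priv->tx_ring[i]);
}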
Yes, you are right. It's a potential issue if RX_TSTAMP is set when XDP is also enabled.
Also, I suspect that the blamed commit is wrong. The normal netdev close path should also be susceptible to this issue, not just enetc_reconfigure(). Maybe something like ff58fda09096 ("net: enetc: prioritize ability to go down over packet processing").
Thanks for the reminder, I will change the blamed commit in the next version.
That's when we started rushing the NAPI poll routine to finish. I don't think it was possible, before that, to close the netdev while there were XDP_TX frames pending to be recycled.
I am thinking that another solution, mentioned in another thread replying to Vladimir, may be better: with it, xdp_tx_in_flight will naturally drop to 0 and the TX-related statistics will be more accurate.
Please give me some more time to analyze the flow after just your patch 2/3. I have a draft reply, but I would still like to test some things.
Okay, I have tested this solution (see the changes below), and from what I observed, xdp_tx_in_flight naturally drops to 0 in every test, presumably because the TX BD rings are now drained and disabled while NAPI can still run, so the pending XDP_TX frames get recycled. If there are no other risks, the next version will use this solution.
@@ -2467,10 +2469,6 @@ void enetc_start(struct net_device *ndev)
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	int i;
 
-	enetc_setup_interrupts(priv);
-
-	enetc_enable_tx_bdrs(priv);
-
 	for (i = 0; i < priv->bdr_int_num; i++) {
 		int irq = pci_irq_vector(priv->si->pdev,
 					 ENETC_BDR_INT_BASE_IDX + i);
@@ -2479,6 +2477,10 @@ void enetc_start(struct net_device *ndev)
 		enable_irq(irq);
 	}
 
+	enetc_setup_interrupts(priv);
+
+	enetc_enable_tx_bdrs(priv);
+
 	enetc_enable_rx_bdrs(priv);
 
 	netif_tx_start_all_queues(ndev);
@@ -2547,6 +2549,12 @@ void enetc_stop(struct net_device *ndev)
 
 	enetc_disable_rx_bdrs(priv);
 
+	enetc_wait_bdrs(priv);
+
+	enetc_disable_tx_bdrs(priv);
+
+	enetc_clear_interrupts(priv);
+
 	for (i = 0; i < priv->bdr_int_num; i++) {
 		int irq = pci_irq_vector(priv->si->pdev,
 					 ENETC_BDR_INT_BASE_IDX + i);
@@ -2555,12 +2563,6 @@ void enetc_stop(struct net_device *ndev)
 		napi_synchronize(&priv->int_vector[i]->napi);
 		napi_disable(&priv->int_vector[i]->napi);
 	}
-
-	enetc_wait_bdrs(priv);
-
-	enetc_disable_tx_bdrs(priv);
-
-	enetc_clear_interrupts(priv);
 }
 EXPORT_SYMBOL_GPL(enetc_stop);