Hi reviewers,
I suggest backporting commit "c739b17a715c net: stmmac: don't attach interface until resume finishes" to the linux-5.4 stable tree.
This patch fixes a resume issue by deferring netif_device_attach().
However, the patch cannot be cherry-picked directly onto stable-5.4. A slight change to the original patch is required. I can provide the modified patch for stable-5.4 if needed.
commit: c739b17a715c6a850477189fb7c5f9a6af74f4bb
subject: net: stmmac: don't attach interface until resume finishes
kernel version to apply to: Linux-5.4
Thanks.
Macpaul Lin
On Mon, Sep 27, 2021 at 06:45:00PM +0800, Macpaul Lin wrote:
Hi reviewers,
I suggest backporting commit "c739b17a715c net: stmmac: don't attach interface until resume finishes" to the linux-5.4 stable tree.
I see no such commit id in Linus's kernel tree.
Are you sure you got the correct id?
thanks,
greg k-h
Hi reviewers,
I suggest backporting commit "31096c3e8b11 net: stmmac: don't attach interface until resume finishes" to the linux-5.4 stable tree.
This patch fixes a resume issue by deferring netif_device_attach().
However, the patch cannot be cherry-picked directly onto stable-5.4. A slight change to the original patch is required. I can provide the modified patch for stable-5.4 if needed.
commit: 31096c3e8b1163c6e966bf4d1f36d8b699008f84
subject: net: stmmac: don't attach interface until resume finishes
kernel version to apply to: Linux-5.4
Sorry, I previously sent a wrong commit hash, which came from my working tree.
Thanks.
Macpaul Lin
On Tue, Sep 28, 2021 at 03:43:49PM +0800, Macpaul Lin wrote:
Hi reviewers,
I suggest backporting commit "31096c3e8b11 net: stmmac: don't attach interface until resume finishes" to the linux-5.4 stable tree.
This patch fixes a resume issue by deferring netif_device_attach().
However, the patch cannot be cherry-picked directly onto stable-5.4. A slight change to the original patch is required. I can provide the modified patch for stable-5.4 if needed.
Ok, can you please send a properly backported patch so that we can apply it?
thanks,
greg k-h
From: Leon Yu <leoyu@nvidia.com>
commit 31096c3e8b1163c6e966bf4d1f36d8b699008f84 upstream.
Commit 14b41a2959fb ("net: stmmac: Delete txtimer in suspend()") was the first attempt to fix a race between mod_timer() and setup_timer() during stmmac_resume(). However, the issue still exists because that commit addressed only half of it.
The same race can still happen because stmmac_resume() re-attaches the interface far too early, even before the hardware is fully initialized. Worse, doing so allows network traffic to restart and stmmac_tx_timer_arm() to be called in the middle of stmmac_resume(), which re-initializes the tx timers in stmmac_init_coalesce(). The timer_list is then corrupted and the system crashes as a result of the race between mod_timer() and setup_timer().
  systemd--1995   2.... 552950018us : stmmac_suspend: 4994
  ksoftirq-9      0..s2 553123133us : stmmac_tx_timer_arm: 2276
  systemd--1995   0.... 553127896us : stmmac_resume: 5101
  systemd--320    7...2 553132752us : stmmac_tx_timer_arm: 2276
  (sd-exec-1999   5...2 553135204us : stmmac_tx_timer_arm: 2276
  ---------------------------------
  pc : run_timer_softirq+0x468/0x5e0
  lr : run_timer_softirq+0x570/0x5e0
  Call trace:
   run_timer_softirq+0x468/0x5e0
   __do_softirq+0x124/0x398
   irq_exit+0xd8/0xe0
   __handle_domain_irq+0x6c/0xc0
   gic_handle_irq+0x60/0xb0
   el1_irq+0xb8/0x180
   arch_cpu_idle+0x38/0x230
   default_idle_call+0x24/0x3c
   do_idle+0x1e0/0x2b8
   cpu_startup_entry+0x28/0x48
   secondary_start_kernel+0x1b4/0x208
Fix this by deferring netif_device_attach() to the end of stmmac_resume().
Signed-off-by: Leon Yu <leoyu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 10d28be73f45..56d227b31dbd 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4853,8 +4853,6 @@ int stmmac_resume(struct device *dev)
 			stmmac_mdio_reset(priv->mii);
 	}

-	netif_device_attach(ndev);
-
 	mutex_lock(&priv->lock);

 	stmmac_reset_queues_param(priv);
@@ -4878,6 +4876,8 @@ int stmmac_resume(struct device *dev)

 	phylink_mac_change(priv->phylink, true);

+	netif_device_attach(ndev);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(stmmac_resume);
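For readers following the thread, the ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct toy_dev, toy_tx() and toy_resume() are hypothetical names invented for illustration, not part of the stmmac driver, and the single-threaded model only marks the point where traffic could interleave with resume; it does not reproduce the real timer race.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for driver state; not a real kernel type. */
struct toy_dev {
	bool attached;     /* models the netif_device_attach() state */
	bool timer_ready;  /* false while tx timers are being re-initialized */
	int unsafe_arms;   /* timer arms that hit a half-initialized timer */
};

/* Models the tx path: traffic only flows while the device is attached. */
static void toy_tx(struct toy_dev *d)
{
	if (!d->attached)
		return;
	if (!d->timer_ready)
		d->unsafe_arms++; /* in the real driver this would race with timer setup */
}

/*
 * Models resume. attach_early selects the old ordering (attach first)
 * or the patched ordering (attach last). Returns the number of unsafe arms.
 */
static int toy_resume(struct toy_dev *d, bool attach_early)
{
	assert(d != NULL);

	/* Start from the suspended state. */
	d->attached = false;
	d->timer_ready = false;
	d->unsafe_arms = 0;

	if (attach_early)
		d->attached = true;

	toy_tx(d);              /* traffic arriving in the middle of resume */

	d->timer_ready = true;  /* timer re-initialization has finished */

	if (!attach_early)
		d->attached = true;

	toy_tx(d);              /* traffic after resume has completed */

	return d->unsafe_arms;
}
```

In this model, calling toy_resume(&d, true) counts one unsafe arm because the tx path runs while the timers are still being set up, whereas toy_resume(&d, false) counts none because the tx path is gated until the very end of resume.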
On Tue, Sep 28, 2021 at 04:36:20PM +0800, Macpaul Lin wrote:
From: Leon Yu <leoyu@nvidia.com>
commit 31096c3e8b1163c6e966bf4d1f36d8b699008f84 upstream.
Commit 14b41a2959fb ("net: stmmac: Delete txtimer in suspend()") was the first attempt to fix a race between mod_timer() and setup_timer() during stmmac_resume(). However, the issue still exists because that commit addressed only half of it.
The same race can still happen because stmmac_resume() re-attaches the interface far too early, even before the hardware is fully initialized. Worse, doing so allows network traffic to restart and stmmac_tx_timer_arm() to be called in the middle of stmmac_resume(), which re-initializes the tx timers in stmmac_init_coalesce(). The timer_list is then corrupted and the system crashes as a result of the race between mod_timer() and setup_timer().
  systemd--1995   2.... 552950018us : stmmac_suspend: 4994
  ksoftirq-9      0..s2 553123133us : stmmac_tx_timer_arm: 2276
  systemd--1995   0.... 553127896us : stmmac_resume: 5101
  systemd--320    7...2 553132752us : stmmac_tx_timer_arm: 2276
  (sd-exec-1999   5...2 553135204us : stmmac_tx_timer_arm: 2276

  pc : run_timer_softirq+0x468/0x5e0
  lr : run_timer_softirq+0x570/0x5e0
  Call trace:
   run_timer_softirq+0x468/0x5e0
   __do_softirq+0x124/0x398
   irq_exit+0xd8/0xe0
   __handle_domain_irq+0x6c/0xc0
   gic_handle_irq+0x60/0xb0
   el1_irq+0xb8/0x180
   arch_cpu_idle+0x38/0x230
   default_idle_call+0x24/0x3c
   do_idle+0x1e0/0x2b8
   cpu_startup_entry+0x28/0x48
   secondary_start_kernel+0x1b4/0x208
Fix this by deferring netif_device_attach() to the end of stmmac_resume().
Signed-off-by: Leon Yu <leoyu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever you forward on a patch, you should add yourself to the signed-off-by chain.
I'll just add you to the cc: to let us know who asked for this patch.
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org