When dwc3_gadget_soft_disconnect() fails, dwc3_suspend_common() keeps going with the suspend, resulting in a period where the power domain is off, but the gadget driver remains connected. Within this time frame, invoking vbus_event_work() will cause an error as it attempts to access DWC3 registers for endpoint disabling after the power domain has been completely shut down.
Abort the suspend sequence when dwc3_gadget_suspend() cannot halt the controller and proceeds with a soft connect.
Fixes: 9f8a67b65a49 ("usb: dwc3: gadget: fix gadget suspend/resume")
CC: stable@vger.kernel.org
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
---
Kernel panic - not syncing: Asynchronous SError Interrupt
Workqueue: events vbus_event_work
Call trace:
 dump_backtrace+0xf4/0x118
 show_stack+0x18/0x24
 dump_stack_lvl+0x60/0x7c
 dump_stack+0x18/0x3c
 panic+0x16c/0x390
 nmi_panic+0xa4/0xa8
 arm64_serror_panic+0x6c/0x94
 do_serror+0xc4/0xd0
 el1h_64_error_handler+0x34/0x48
 el1h_64_error+0x68/0x6c
 readl+0x4c/0x8c
 __dwc3_gadget_ep_disable+0x48/0x230
 dwc3_gadget_ep_disable+0x50/0xc0
 usb_ep_disable+0x44/0xe4
 ffs_func_eps_disable+0x64/0xc8
 ffs_func_set_alt+0x74/0x368
 ffs_func_disable+0x18/0x28
 composite_disconnect+0x90/0xec
 configfs_composite_disconnect+0x64/0x88
 usb_gadget_disconnect_locked+0xc0/0x168
 vbus_event_work+0x3c/0x58
 process_one_work+0x1e4/0x43c
 worker_thread+0x25c/0x430
 kthread+0x104/0x1d4
 ret_from_fork+0x10/0x20
---
Changelog:

v4:
- correct the mistake where semicolon was forgotten
- return -EAGAIN upon dwc3_gadget_suspend() failure

v3:
- change the Fixes tag

v2:
- move declarations in separate lines
- add the Fixes tag
---
 drivers/usb/dwc3/core.c   |  9 +++++++--
 drivers/usb/dwc3/gadget.c | 22 +++++++++-------------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 66a08b527165..f36bc933c55b 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2388,6 +2388,7 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
 {
 	u32 reg;
 	int i;
+	int ret;
 
 	if (!pm_runtime_suspended(dwc->dev) && !PMSG_IS_AUTO(msg)) {
 		dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) &
@@ -2406,7 +2407,9 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
 	case DWC3_GCTL_PRTCAP_DEVICE:
 		if (pm_runtime_suspended(dwc->dev))
 			break;
-		dwc3_gadget_suspend(dwc);
+		ret = dwc3_gadget_suspend(dwc);
+		if (ret)
+			return ret;
 		synchronize_irq(dwc->irq_gadget);
 		dwc3_core_exit(dwc);
 		break;
@@ -2441,7 +2444,9 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
 			break;
 
 		if (dwc->current_otg_role == DWC3_OTG_ROLE_DEVICE) {
-			dwc3_gadget_suspend(dwc);
+			ret = dwc3_gadget_suspend(dwc);
+			if (ret)
+				return ret;
 			synchronize_irq(dwc->irq_gadget);
 		}
 
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 89a4dc8ebf94..630fd5f0ce97 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -4776,26 +4776,22 @@ int dwc3_gadget_suspend(struct dwc3 *dwc)
 	int ret;
 
 	ret = dwc3_gadget_soft_disconnect(dwc);
-	if (ret)
-		goto err;
-
-	spin_lock_irqsave(&dwc->lock, flags);
-	if (dwc->gadget_driver)
-		dwc3_disconnect_gadget(dwc);
-	spin_unlock_irqrestore(&dwc->lock, flags);
-
-	return 0;
-
-err:
 	/*
 	 * Attempt to reset the controller's state. Likely no
 	 * communication can be established until the host
 	 * performs a port reset.
 	 */
-	if (dwc->softconnect)
+	if (ret && dwc->softconnect) {
 		dwc3_gadget_soft_connect(dwc);
+		return -EAGAIN;
+	}
 
-	return ret;
+	spin_lock_irqsave(&dwc->lock, flags);
+	if (dwc->gadget_driver)
+		dwc3_disconnect_gadget(dwc);
+	spin_unlock_irqrestore(&dwc->lock, flags);
+
+	return 0;
 }
 
 int dwc3_gadget_resume(struct dwc3 *dwc)
--
2.49.0.604.gff1f9ca942-goog
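For readers following the control flow rather than the hunks, the behaviour described in the commit message can be approximated in user space. The following is only an illustrative sketch, not kernel code: `struct model_dwc` and every `model_*` function are hypothetical stand-ins for the dwc3 structures and functions named in the patch, and `-ETIMEDOUT` is assumed as the soft-disconnect failure code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct dwc3; not the kernel type. */
struct model_dwc {
	bool softconnect;	/* gadget wants to stay connected */
	bool halt_fails;	/* simulate a controller halt failure */
	bool connected;		/* pull-up currently asserted */
	bool powered;		/* core not yet shut down */
};

/* Models dwc3_gadget_soft_disconnect(): fails if the controller cannot halt. */
static int model_soft_disconnect(struct model_dwc *dwc)
{
	if (dwc->halt_fails)
		return -ETIMEDOUT;	/* assumed error code for this sketch */
	dwc->connected = false;
	return 0;
}

/* Models the patched dwc3_gadget_suspend(). */
static int model_gadget_suspend(struct model_dwc *dwc)
{
	int ret = model_soft_disconnect(dwc);

	if (ret && dwc->softconnect) {
		dwc->connected = true;	/* stands in for dwc3_gadget_soft_connect() */
		return -EAGAIN;
	}
	return 0;
}

/* Models the device-mode branch of the patched dwc3_suspend_common(). */
static int model_suspend_common(struct model_dwc *dwc)
{
	int ret = model_gadget_suspend(dwc);

	if (ret)
		return ret;	/* abort: the core stays powered */
	dwc->powered = false;	/* stands in for dwc3_core_exit() */
	return 0;
}
```

In this model a failed halt leaves `powered` set, which corresponds to the window the patch closes: later work such as vbus_event_work() would still find the registers accessible.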
On Wed, Apr 16, 2025, Kuen-Han Tsai wrote:
<snip>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 89a4dc8ebf94..630fd5f0ce97 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -4776,26 +4776,22 @@ int dwc3_gadget_suspend(struct dwc3 *dwc)
>  	int ret;
>
>  	ret = dwc3_gadget_soft_disconnect(dwc);
> -	if (ret)
> -		goto err;
> -
> -	spin_lock_irqsave(&dwc->lock, flags);
> -	if (dwc->gadget_driver)
> -		dwc3_disconnect_gadget(dwc);
> -	spin_unlock_irqrestore(&dwc->lock, flags);
> -
> -	return 0;
> -
> -err:
>  	/*
>  	 * Attempt to reset the controller's state. Likely no
>  	 * communication can be established until the host
>  	 * performs a port reset.
>  	 */
> -	if (dwc->softconnect)
> +	if (ret && dwc->softconnect) {
>  		dwc3_gadget_soft_connect(dwc);
> +		return -EAGAIN;
It may make sense to return -EAGAIN for runtime suspend. I suppose this should be fine for system suspend, since it doesn't do anything special for this error code.
When you tested runtime suspend, did you observe the device successfully going into suspend on retry?
In any case, I think this should be good. Thanks for the fix:
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Thanks, Thinh
> +	}
>
> -	return ret;
> +	spin_lock_irqsave(&dwc->lock, flags);
> +	if (dwc->gadget_driver)
> +		dwc3_disconnect_gadget(dwc);
> +	spin_unlock_irqrestore(&dwc->lock, flags);
> +
> +	return 0;
>  }
>
>  int dwc3_gadget_resume(struct dwc3 *dwc)
> --
> 2.49.0.604.gff1f9ca942-goog
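The remark about -EAGAIN and runtime suspend matches how the kernel's runtime PM documentation describes suspend-callback errors: -EBUSY and -EAGAIN leave the device active so a later attempt can succeed, while any other error is treated as fatal until the status is reset. A rough user-space model of that rule follows. It is a sketch under that reading of the documentation, and `struct model_rpm` and `model_rpm_suspend()` are hypothetical names, not kernel APIs.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified model of a device's runtime PM state. */
struct model_rpm {
	bool active;		/* device is in the "active" runtime PM state */
	int runtime_error;	/* sticky error that blocks further suspend attempts */
};

/*
 * Models how the runtime PM core reacts to the result of a
 * ->runtime_suspend() callback, per Documentation/power/runtime_pm.rst.
 */
static int model_rpm_suspend(struct model_rpm *rpm, int callback_ret)
{
	if (rpm->runtime_error)
		return -EINVAL;		/* earlier fatal error not yet cleared */

	if (callback_ret == 0) {
		rpm->active = false;
		return 0;
	}

	rpm->active = true;		/* device must remain fully operational */
	if (callback_ret != -EAGAIN && callback_ret != -EBUSY)
		rpm->runtime_error = callback_ret;	/* treated as fatal */
	return callback_ret;
}
```

Under this model, returning -EAGAIN from the suspend path keeps a retry possible, whereas propagating a raw timeout error would block further runtime suspend attempts until the error is cleared.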
On Sat, Apr 19, 2025 at 9:24 AM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
<snip>
> It may make sense to return -EAGAIN for runtime suspend. I suppose this should be fine for system suspend, since it doesn't do anything special for this error code.
>
> When you tested runtime suspend, did you observe the device successfully going into suspend on retry?
Hi Thinh,
Yes, the dwc3 device can be suspended using either pm_runtime_suspend() or dwc3_gadget_pullup(), the latter of which ultimately invokes pm_runtime_put().
One question: Do you know how to naturally cause a run stop failure? Based on the spec, the controller cannot halt until the event buffer becomes empty. If the driver doesn't acknowledge the events, this should lead to the run stop failure. However, since I cannot naturally reproduce this problem, I am simulating this scenario by modifying dwc3_gadget_run_stop() to return a timeout error directly.
Regards, Kuen-Han
On Mon, Apr 21, 2025, Kuen-Han Tsai wrote:
<snip>
> One question: Do you know how to naturally cause a run stop failure? Based on the spec, the controller cannot halt until the event buffer becomes empty. If the driver doesn't acknowledge the events, this should lead to the run stop failure. However, since I cannot naturally reproduce this problem, I am simulating this scenario by modifying dwc3_gadget_run_stop() to return a timeout error directly.
I'm not clear what you meant by "naturally" here. The driver is implemented in such a way that this should not happen. If it does, we need to take a look to see what we missed.
However, to force the driver to hit the controller halt timeout, just wait/generate some events and don't clear the GEVNTCOUNT of event bytes before clearing the run_stop bit.
BR, Thinh
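The condition described in this exchange (the controller cannot halt while event bytes remain unacknowledged) can be pictured with a small user-space model. This is a sketch only: `struct model_ctrl` and the `model_*` helpers are hypothetical, the fields merely stand in for GEVNTCOUNT, the run/stop bit and the halted status, and the bounded poll loop is an assumption about how a driver might wait for the halt.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the controller state relevant to halting. */
struct model_ctrl {
	uint32_t pending_event_bytes;	/* stands in for GEVNTCOUNT */
	bool run;			/* stands in for the run/stop bit */
	bool halted;			/* stands in for the halted status */
};

/* The driver acknowledges events by reporting how many bytes it consumed. */
static void model_ack_events(struct model_ctrl *c, uint32_t bytes)
{
	if (bytes > c->pending_event_bytes)
		bytes = c->pending_event_bytes;
	c->pending_event_bytes -= bytes;
}

/* One hardware step: halt only once stopped and no events remain pending. */
static void model_hw_tick(struct model_ctrl *c)
{
	if (!c->run && c->pending_event_bytes == 0)
		c->halted = true;
}

/* Clear run/stop, then poll for the halted status a bounded number of times. */
static int model_run_stop_clear(struct model_ctrl *c, int max_polls)
{
	c->run = false;
	for (int i = 0; i < max_polls; i++) {
		model_hw_tick(c);
		if (c->halted)
			return 0;
	}
	return -ETIMEDOUT;
}
```

Calling `model_run_stop_clear()` with a non-zero `pending_event_bytes` times out, which mirrors the forced reproduction suggested above; acknowledging the events first with `model_ack_events()` lets the halt complete.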
On Tue, Apr 22, 2025 at 7:20 AM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
<snip>
> I'm not clear what you meant by "naturally" here. The driver is implemented in such a way that this should not happen. If it does, we need to take a look to see what we missed.
>
> However, to force the driver to hit the controller halt timeout, just wait/generate some events and don't clear the GEVNTCOUNT of event bytes before clearing the run_stop bit.
>
> BR, Thinh
Hi Thinh,
Thank you for getting back to me and the method to force the timeout!
By "naturally," I meant reproducing the issue without artificial steps designed solely to trigger it. You're right; since the driver is designed to prevent this state, reproducing it "naturally" is difficult.
I really appreciate your patience, and thank you once more for the helpful clarification.
Regards, Kuen-Han
On Tue, Apr 22, 2025, Kuen-Han Tsai wrote:
<snip>
> Hi Thinh,
>
> Thank you for getting back to me and the method to force the timeout!
>
> By "naturally," I meant reproducing the issue without artificial steps

Ok.

> designed solely to trigger it. You're right; since the driver is designed to prevent this state, reproducing it "naturally" is difficult.
>
> I really appreciate your patience, and thank you once more for the helpful clarification.

You're welcome.

BR, Thinh
linux-stable-mirror@lists.linaro.org