We have met a deadlock issue on our device which use 5.15.y when resuming. After applying this patch which is picked from mainline, issue solved. Backport to 6.6.y also.
Rafael J. Wysocki (1): PM: sleep: Restore asynchronous device resume optimization
drivers/base/power/main.c | 117 +++++++++++++++++++++----------------- include/linux/pm.h | 1 + 2 files changed, 65 insertions(+), 53 deletions(-)
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
commit 3e999770ac1c7c31a70685dd5b88e89473509e9c upstream.
Before commit 7839d0078e0d ("PM: sleep: Fix possible deadlocks in core system-wide PM code"), the resume of devices that were allowed to resume asynchronously was scheduled before starting the resume of the other devices, so the former did not have to wait for the latter unless functional dependencies were present.
Commit 7839d0078e0d removed that optimization in order to address a correctness issue, but it can be restored with the help of a new device power management flag, so do that now.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Stanislaw Gruszka stanislaw.gruszka@linux.intel.com Signed-off-by: Yenchia Chen yenchia.chen@mediatek.com --- drivers/base/power/main.c | 117 +++++++++++++++++++++----------------- include/linux/pm.h | 1 + 2 files changed, 65 insertions(+), 53 deletions(-)
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index 9c5a5f4dba5a..fadcd0379dc2 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -579,7 +579,7 @@ bool dev_pm_skip_resume(struct device *dev) }
/** - * __device_resume_noirq - Execute a "noirq resume" callback for given device. + * device_resume_noirq - Execute a "noirq resume" callback for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. * @async: If true, the device is being resumed asynchronously. @@ -587,7 +587,7 @@ bool dev_pm_skip_resume(struct device *dev) * The driver of @dev will not receive interrupts while this function is being * executed. */ -static void __device_resume_noirq(struct device *dev, pm_message_t state, bool async) +static void device_resume_noirq(struct device *dev, pm_message_t state, bool async) { pm_callback_t callback = NULL; const char *info = NULL; @@ -674,16 +674,22 @@ static bool dpm_async_fn(struct device *dev, async_func_t func) { reinit_completion(&dev->power.completion);
- if (!is_async(dev)) - return false; - - get_device(dev); + if (is_async(dev)) { + dev->power.async_in_progress = true;
- if (async_schedule_dev_nocall(func, dev)) - return true; + get_device(dev);
- put_device(dev); + if (async_schedule_dev_nocall(func, dev)) + return true;
+ put_device(dev); + } + /* + * Because async_schedule_dev_nocall() above has returned false or it + * has not been called at all, func() is not running and it is safe to + * update the async_in_progress flag without extra synchronization. + */ + dev->power.async_in_progress = false; return false; }
@@ -691,18 +697,10 @@ static void async_resume_noirq(void *data, async_cookie_t cookie) { struct device *dev = data;
- __device_resume_noirq(dev, pm_transition, true); + device_resume_noirq(dev, pm_transition, true); put_device(dev); }
-static void device_resume_noirq(struct device *dev) -{ - if (dpm_async_fn(dev, async_resume_noirq)) - return; - - __device_resume_noirq(dev, pm_transition, false); -} - static void dpm_noirq_resume_devices(pm_message_t state) { struct device *dev; @@ -712,18 +710,28 @@ static void dpm_noirq_resume_devices(pm_message_t state) mutex_lock(&dpm_list_mtx); pm_transition = state;
+ /* + * Trigger the resume of "async" devices upfront so they don't have to + * wait for the "non-async" ones they don't depend on. + */ + list_for_each_entry(dev, &dpm_noirq_list, power.entry) + dpm_async_fn(dev, async_resume_noirq); + while (!list_empty(&dpm_noirq_list)) { dev = to_device(dpm_noirq_list.next); - get_device(dev); list_move_tail(&dev->power.entry, &dpm_late_early_list);
- mutex_unlock(&dpm_list_mtx); + if (!dev->power.async_in_progress) { + get_device(dev);
- device_resume_noirq(dev); + mutex_unlock(&dpm_list_mtx);
- put_device(dev); + device_resume_noirq(dev, state, false);
- mutex_lock(&dpm_list_mtx); + put_device(dev); + + mutex_lock(&dpm_list_mtx); + } } mutex_unlock(&dpm_list_mtx); async_synchronize_full(); @@ -747,14 +755,14 @@ void dpm_resume_noirq(pm_message_t state) }
/** - * __device_resume_early - Execute an "early resume" callback for given device. + * device_resume_early - Execute an "early resume" callback for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. * @async: If true, the device is being resumed asynchronously. * * Runtime PM is disabled for @dev while this function is being executed. */ -static void __device_resume_early(struct device *dev, pm_message_t state, bool async) +static void device_resume_early(struct device *dev, pm_message_t state, bool async) { pm_callback_t callback = NULL; const char *info = NULL; @@ -820,18 +828,10 @@ static void async_resume_early(void *data, async_cookie_t cookie) { struct device *dev = data;
- __device_resume_early(dev, pm_transition, true); + device_resume_early(dev, pm_transition, true); put_device(dev); }
-static void device_resume_early(struct device *dev) -{ - if (dpm_async_fn(dev, async_resume_early)) - return; - - __device_resume_early(dev, pm_transition, false); -} - /** * dpm_resume_early - Execute "early resume" callbacks for all devices. * @state: PM transition of the system being carried out. @@ -845,18 +845,28 @@ void dpm_resume_early(pm_message_t state) mutex_lock(&dpm_list_mtx); pm_transition = state;
+ /* + * Trigger the resume of "async" devices upfront so they don't have to + * wait for the "non-async" ones they don't depend on. + */ + list_for_each_entry(dev, &dpm_late_early_list, power.entry) + dpm_async_fn(dev, async_resume_early); + while (!list_empty(&dpm_late_early_list)) { dev = to_device(dpm_late_early_list.next); - get_device(dev); list_move_tail(&dev->power.entry, &dpm_suspended_list);
- mutex_unlock(&dpm_list_mtx); + if (!dev->power.async_in_progress) { + get_device(dev);
- device_resume_early(dev); + mutex_unlock(&dpm_list_mtx);
- put_device(dev); + device_resume_early(dev, state, false);
- mutex_lock(&dpm_list_mtx); + put_device(dev); + + mutex_lock(&dpm_list_mtx); + } } mutex_unlock(&dpm_list_mtx); async_synchronize_full(); @@ -876,12 +886,12 @@ void dpm_resume_start(pm_message_t state) EXPORT_SYMBOL_GPL(dpm_resume_start);
/** - * __device_resume - Execute "resume" callbacks for given device. + * device_resume - Execute "resume" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. * @async: If true, the device is being resumed asynchronously. */ -static void __device_resume(struct device *dev, pm_message_t state, bool async) +static void device_resume(struct device *dev, pm_message_t state, bool async) { pm_callback_t callback = NULL; const char *info = NULL; @@ -975,18 +985,10 @@ static void async_resume(void *data, async_cookie_t cookie) { struct device *dev = data;
- __device_resume(dev, pm_transition, true); + device_resume(dev, pm_transition, true); put_device(dev); }
-static void device_resume(struct device *dev) -{ - if (dpm_async_fn(dev, async_resume)) - return; - - __device_resume(dev, pm_transition, false); -} - /** * dpm_resume - Execute "resume" callbacks for non-sysdev devices. * @state: PM transition of the system being carried out. @@ -1006,16 +1008,25 @@ void dpm_resume(pm_message_t state) pm_transition = state; async_error = 0;
+ /* + * Trigger the resume of "async" devices upfront so they don't have to + * wait for the "non-async" ones they don't depend on. + */ + list_for_each_entry(dev, &dpm_suspended_list, power.entry) + dpm_async_fn(dev, async_resume); + while (!list_empty(&dpm_suspended_list)) { dev = to_device(dpm_suspended_list.next);
get_device(dev);
- mutex_unlock(&dpm_list_mtx); + if (!dev->power.async_in_progress) { + mutex_unlock(&dpm_list_mtx);
- device_resume(dev); + device_resume(dev, state, false);
- mutex_lock(&dpm_list_mtx); + mutex_lock(&dpm_list_mtx); + }
if (!list_empty(&dev->power.entry)) list_move_tail(&dev->power.entry, &dpm_prepared_list); diff --git a/include/linux/pm.h b/include/linux/pm.h index 629c1633bbd0..943b553720f8 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -681,6 +681,7 @@ struct dev_pm_info { bool wakeup_path:1; bool syscore:1; bool no_pm_callbacks:1; /* Owned by the PM core */ + bool async_in_progress:1; /* Owned by the PM core */ unsigned int must_resume:1; /* Owned by the PM core */ unsigned int may_skip_resume:1; /* Set by subsystems */ #else
On Mon, Sep 02, 2024 at 05:32:48PM +0800, Yenchia Chen wrote:
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
commit 3e999770ac1c7c31a70685dd5b88e89473509e9c upstream.
Before commit 7839d0078e0d ("PM: sleep: Fix possible deadlocks in core system-wide PM code"), the resume of devices that were allowed to resume asynchronously was scheduled before starting the resume of the other devices, so the former did not have to wait for the latter unless functional dependencies were present.
Commit 7839d0078e0d removed that optimization in order to address a correctness issue, but it can be restored with the help of a new device power management flag, so do that now.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Stanislaw Gruszka stanislaw.gruszka@linux.intel.com Signed-off-by: Yenchia Chen yenchia.chen@mediatek.com
drivers/base/power/main.c | 117 +++++++++++++++++++++----------------- include/linux/pm.h | 1 + 2 files changed, 65 insertions(+), 53 deletions(-)
Why does this need to be backported? What bug is it fixing?
confused,
greg k-h
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
commit 3e999770ac1c7c31a70685dd5b88e89473509e9c upstream.
Before commit 7839d0078e0d ("PM: sleep: Fix possible deadlocks in core system-wide PM code"), the resume of devices that were allowed to resume asynchronously was scheduled before starting the resume of the other devices, so the former did not have to wait for the latter unless functional dependencies were present.
Commit 7839d0078e0d removed that optimization in order to address a correctness issue, but it can be restored with the help of a new device power management flag, so do that now.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Stanislaw Gruszka stanislaw.gruszka@linux.intel.com Signed-off-by: Yenchia Chen yenchia.chen@mediatek.com
drivers/base/power/main.c | 117 +++++++++++++++++++++----------------- include/linux/pm.h | 1 + 2 files changed, 65 insertions(+), 53 deletions(-)
Why does this need to be backported? What bug is it fixing?
confused,
greg k-h
Below is the scenario we met the issue: 1) use command 'echo 3 > /proc/sys/vm/drop_caches' and enter suspending stage immediately. 2) power on device, our driver try to read mmc after leaving resume callback and got stucked.
We found if we did not drop caches, mmc_blk_resume will be called and system works fine.
If we drop caches before suspending, there is a high possibility that mmc_blk_resume not be called and our driver stucked at filp_open.
We still try to find the root casue is but with this patch, it works.
Since it has been merged in mainline, we'd like to know it is ok to merge to stable.
On Thu, Sep 05, 2024 at 05:34:33PM +0800, Yenchia Chen wrote:
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
commit 3e999770ac1c7c31a70685dd5b88e89473509e9c upstream.
Before commit 7839d0078e0d ("PM: sleep: Fix possible deadlocks in core system-wide PM code"), the resume of devices that were allowed to resume asynchronously was scheduled before starting the resume of the other devices, so the former did not have to wait for the latter unless functional dependencies were present.
Commit 7839d0078e0d removed that optimization in order to address a correctness issue, but it can be restored with the help of a new device power management flag, so do that now.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Stanislaw Gruszka stanislaw.gruszka@linux.intel.com Signed-off-by: Yenchia Chen yenchia.chen@mediatek.com
drivers/base/power/main.c | 117 +++++++++++++++++++++----------------- include/linux/pm.h | 1 + 2 files changed, 65 insertions(+), 53 deletions(-)
Why does this need to be backported? What bug is it fixing?
confused,
greg k-h
Below is the scenario we met the issue:
- use command 'echo 3 > /proc/sys/vm/drop_caches' and enter suspending stage immediately.
- power on device, our driver try to read mmc after leaving resume callback and got stucked.
We found if we did not drop caches, mmc_blk_resume will be called and system works fine.
If we drop caches before suspending, there is a high possibility that mmc_blk_resume not be called and our driver stucked at filp_open.
We still try to find the root casue is but with this patch, it works.
I think you are getting lucky as this is just changing the order in which things are suspending. Please find and fix the root problem.
Since it has been merged in mainline, we'd like to know it is ok to merge to stable.
It changes the behavior of the system overall, and doesn't really fix a bug on its own, so I don't want to, sorry.
Please find the real problem in your driver, or the mmc subsystem.
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org