In 08810a4119aaebf6318f209ec5dd9828e969cba4 setting dev->power.direct_complete was made conditional on pm_runtime_suspended().
The justification was: While at it, make the core check pm_runtime_suspended() when setting power.direct_complete so that it doesn't need to be checked by ->prepare callbacks.
However, this breaks resuming from suspend on those newer HP laptops if the amdgpu driver is used (due to hybrid intel+radeon graphics). Given the justification for the change, undoing it seems best as it appears to have unintended side effects.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199693 References: https://bugs.freedesktop.org/show_bug.cgi?id=106447 Signed-off-by: Thomas Martitz kugel@rockbox.org Cc: Pavel Machek pavel@ucw.cz Cc: Len Brown len.brown@intel.com Cc: linux-pm@vger.kernel.org Cc: stable@vger.kernel.org [4.15+] Signed-off-by: Thomas Martitz kugel@rockbox.org --- drivers/base/power/main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index 02a497e7c785..b2fb0974f832 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1960,8 +1960,7 @@ static int device_prepare(struct device *dev, pm_message_t state) */ spin_lock_irq(&dev->power.lock); dev->power.direct_complete = state.event == PM_EVENT_SUSPEND && - pm_runtime_suspended(dev) && ret > 0 && - !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP); + ret > 0 && !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP); spin_unlock_irq(&dev->power.lock); return 0; }
On Tue, May 15, 2018 at 3:35 PM, Thomas Martitz kugel@rockbox.org wrote:
In 08810a4119aaebf6318f209ec5dd9828e969cba4 setting dev->power.direct_complete was made conditional on pm_runtime_suspended().
The justification was: While at it, make the core check pm_runtime_suspended() when setting power.direct_complete so that it doesn't need to be checked by ->prepare callbacks.
However, this breaks resuming from suspend on those newer HP laptops if the amdgpu driver is used (due to hybrid intel+radeon graphics). Given the justification for the change, undoing it seems best as it appears to have unintended side effects.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199693 References: https://bugs.freedesktop.org/show_bug.cgi?id=106447 Signed-off-by: Thomas Martitz kugel@rockbox.org Cc: Pavel Machek pavel@ucw.cz Cc: Len Brown len.brown@intel.com Cc: linux-pm@vger.kernel.org Cc: stable@vger.kernel.org [4.15+] Signed-off-by: Thomas Martitz kugel@rockbox.org
drivers/base/power/main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index 02a497e7c785..b2fb0974f832 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1960,8 +1960,7 @@ static int device_prepare(struct device *dev, pm_message_t state) */ spin_lock_irq(&dev->power.lock); dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
pm_runtime_suspended(dev) && ret > 0 &&
!dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
ret > 0 && !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP); spin_unlock_irq(&dev->power.lock); return 0;
}
Before applying anything like this I need to understand the failure in the first place.
Since direct_complete is optional anyway and it should be safe to call pm_runtime_suspended() at any time, I'm suspecting a bug in a driver that is exposed by the commit turned up by bisection.
Where's the code I need to look at?
Am 15.05.2018 um 17:10 schrieb Rafael J. Wysocki:
Before applying anything like this I need to understand the failure in the first place.
Since direct_complete is optional anyway and it should be safe to call pm_runtime_suspended() at any time, I'm suspecting a bug in a driver that is exposed by the commit turned up by bisection.
Where's the code I need to look at?
Hello,
thanks for looking into this. To answer the question I need your expertise. I couldn't find out which device causes it. Since the freeze only happens of amdgpu is loaded I naturally thought the problematic device(s) would be bound to amdgpu. So I assumed the amdgpu-bound devices would be affected by your change, i.e. not have direct_complete set anymore. But I could not confirm this evidence during testing (I basically restricted my patch to devices which pass !strcmp(dev->driver->name, "amdgpu", and the system was still frozen upon resume).
Then I printed a list which pass "dev->direct_complete && !pm_runtime_suspended(dev)" (with this patch applied), and the list was *very* long (maybe 100+). I can perform this again if you find the list useful.
The sheer number of devices which pass "dev->direct_complete && !pm_runtime_suspended(dev)" made me think that your change should be re-considered, hence my patch.
So, please give me advice on how I can point you to the code that you'd like to look at. I assume key information which device(s) pass "dev->direct_complete && !pm_runtime_suspended(dev)" AND cause the freeze on !dev->direct_complete? I'm afraid that will take a very long time to find out.
Best regards.
On Tue, May 15, 2018 at 9:39 PM, Thomas Martitz kugel@rockbox.org wrote:
Am 15.05.2018 um 17:10 schrieb Rafael J. Wysocki:
Before applying anything like this I need to understand the failure in the first place.
Since direct_complete is optional anyway and it should be safe to call pm_runtime_suspended() at any time, I'm suspecting a bug in a driver that is exposed by the commit turned up by bisection.
Where's the code I need to look at?
Hello,
thanks for looking into this. To answer the question I need your expertise. I couldn't find out which device causes it. Since the freeze only happens of amdgpu is loaded I naturally thought the problematic device(s) would be bound to amdgpu. So I assumed the amdgpu-bound devices would be affected by your change, i.e. not have direct_complete set anymore. But I could not confirm this evidence during testing (I basically restricted my patch to devices which pass !strcmp(dev->driver->name, "amdgpu", and the system was still frozen upon resume).
Then I printed a list which pass "dev->direct_complete && !pm_runtime_suspended(dev)" (with this patch applied), and the list was *very* long (maybe 100+). I can perform this again if you find the list useful.
The sheer number of devices which pass "dev->direct_complete && !pm_runtime_suspended(dev)" made me think that your change should be re-considered, hence my patch.
So, please give me advice on how I can point you to the code that you'd like to look at. I assume key information which device(s) pass "dev->direct_complete && !pm_runtime_suspended(dev)" AND cause the freeze on !dev->direct_complete? I'm afraid that will take a very long time to find out.
Let's continue this in the Bugzilla entry at
https://bugzilla.kernel.org/show_bug.cgi?id=199693
Thanks, Rafael
linux-stable-mirror@lists.linaro.org