commit 59348401ebed ("platform/x86: amd-pmc: Add special handling for timer based S0i3 wakeup") adds support for using another platform timer in lieu of the RTC which doesn't work properly on some systems. This path was validated and worked well before submission. During the 5.16-rc1 merge window other patches were merged that caused this to stop working properly.
When this feature was used with 5.16-rc1 or later some OEM laptops with the matching firmware requirements from that commit would shutdown instead of program a timer based wakeup.
This was bisected to commit 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path"). This wasn't supposed to cause any negative impacts and also tested well on both Intel and ARM platforms. However this changed the semantics of when CPUs are allowed to be in the deepest state. For the AMD systems in question it appears this causes a firmware crash for timer based wakeup.
It's hypothesized to be caused by the `amd-pmc` driver sending `OS_HINT` and all the CPUs going into a deep state while the timer is still being programmed. It's likely a firmware bug, but to avoid it don't allow setting CPUs into the deepest state while using CZN timer wakeup path.
If later it's discovered that this also occurs from "regular" suspends without a timer as well or on other silicon, this may be later expanded to run in the suspend path for more scenarios.
Cc: stable@vger.kernel.org # 5.16+ Suggested-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Link: https://lore.kernel.org/linux-acpi/BL1PR12MB51570F5BD05980A0DCA1F3F4E23A9@BL... Fixes: 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path") Fixes: 23f62d7ab25b ("PM: sleep: Pause cpuidle later and resume it earlier during system transitions") Fixes: 59348401ebed ("platform/x86: amd-pmc: Add special handling for timer based S0i3 wakeup" Reviewed-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20220223175237.6209-1-mario.limonciello@amd.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com (cherry picked from commit 68af28426b3ca1bf9ba21c7d8bdd0ff639e5134c) --- This didn't apply cleanly to 5.16.y because 5.16.y doesn't contain the STB feature. Manually fixed up the commit for this. This is *only* intended for 5.16.
drivers/platform/x86/amd-pmc.c | 34 ++++++++++++++++++++++++++++++---- 1 file changed, 30 insertions(+), 4 deletions(-)
diff --git a/drivers/platform/x86/amd-pmc.c b/drivers/platform/x86/amd-pmc.c index 8c74733530e3..11d0f829302b 100644 --- a/drivers/platform/x86/amd-pmc.c +++ b/drivers/platform/x86/amd-pmc.c @@ -21,6 +21,7 @@ #include <linux/module.h> #include <linux/pci.h> #include <linux/platform_device.h> +#include <linux/pm_qos.h> #include <linux/rtc.h> #include <linux/suspend.h> #include <linux/seq_file.h> @@ -79,6 +80,9 @@ #define PMC_MSG_DELAY_MIN_US 50 #define RESPONSE_REGISTER_LOOP_MAX 20000
+/* QoS request for letting CPUs in idle states, but not the deepest */ +#define AMD_PMC_MAX_IDLE_STATE_LATENCY 3 + #define SOC_SUBSYSTEM_IP_MAX 12 #define DELAY_MIN_US 2000 #define DELAY_MAX_US 3000 @@ -123,6 +127,7 @@ struct amd_pmc_dev { u8 rev; struct device *dev; struct mutex lock; /* generic mutex lock */ + struct pm_qos_request amd_pmc_pm_qos_req; #if IS_ENABLED(CONFIG_DEBUG_FS) struct dentry *dbgfs_dir; #endif /* CONFIG_DEBUG_FS */ @@ -459,6 +464,14 @@ static int amd_pmc_verify_czn_rtc(struct amd_pmc_dev *pdev, u32 *arg) rc = rtc_alarm_irq_enable(rtc_device, 0); dev_dbg(pdev->dev, "wakeup timer programmed for %lld seconds\n", duration);
+ /* + * Prevent CPUs from getting into deep idle states while sending OS_HINT + * which is otherwise generally safe to send when at least one of the CPUs + * is not in deep idle states. + */ + cpu_latency_qos_update_request(&pdev->amd_pmc_pm_qos_req, AMD_PMC_MAX_IDLE_STATE_LATENCY); + wake_up_all_idle_cpus(); + return rc; }
@@ -476,17 +489,24 @@ static int __maybe_unused amd_pmc_suspend(struct device *dev) /* Activate CZN specific RTC functionality */ if (pdev->cpu_id == AMD_CPU_ID_CZN) { rc = amd_pmc_verify_czn_rtc(pdev, &arg); - if (rc < 0) - return rc; + if (rc) + goto fail; }
/* Dump the IdleMask before we send hint to SMU */ amd_pmc_idlemask_read(pdev, dev, NULL); msg = amd_pmc_get_os_hint(pdev); rc = amd_pmc_send_cmd(pdev, arg, NULL, msg, 0); - if (rc) + if (rc) { dev_err(pdev->dev, "suspend failed\n"); + goto fail; + }
+ return 0; +fail: + if (pdev->cpu_id == AMD_CPU_ID_CZN) + cpu_latency_qos_update_request(&pdev->amd_pmc_pm_qos_req, + PM_QOS_DEFAULT_VALUE); return rc; }
@@ -507,7 +527,12 @@ static int __maybe_unused amd_pmc_resume(struct device *dev) /* Dump the IdleMask to see the blockers */ amd_pmc_idlemask_read(pdev, dev, NULL);
- return 0; + /* Restore the QoS request back to defaults if it was set */ + if (pdev->cpu_id == AMD_CPU_ID_CZN) + cpu_latency_qos_update_request(&pdev->amd_pmc_pm_qos_req, + PM_QOS_DEFAULT_VALUE); + + return rc; }
static const struct dev_pm_ops amd_pmc_pm_ops = { @@ -597,6 +622,7 @@ static int amd_pmc_probe(struct platform_device *pdev) amd_pmc_get_smu_version(dev); platform_set_drvdata(pdev, dev); amd_pmc_dbgfs_register(dev); + cpu_latency_qos_add_request(&dev->amd_pmc_pm_qos_req, PM_QOS_DEFAULT_VALUE); return 0; }
On Mon, Feb 28, 2022 at 10:15:10PM -0600, Mario Limonciello wrote:
commit 59348401ebed ("platform/x86: amd-pmc: Add special handling for timer based S0i3 wakeup") adds support for using another platform timer in lieu of the RTC which doesn't work properly on some systems. This path was validated and worked well before submission. During the 5.16-rc1 merge window other patches were merged that caused this to stop working properly.
When this feature was used with 5.16-rc1 or later some OEM laptops with the matching firmware requirements from that commit would shutdown instead of program a timer based wakeup.
This was bisected to commit 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path"). This wasn't supposed to cause any negative impacts and also tested well on both Intel and ARM platforms. However this changed the semantics of when CPUs are allowed to be in the deepest state. For the AMD systems in question it appears this causes a firmware crash for timer based wakeup.
It's hypothesized to be caused by the `amd-pmc` driver sending `OS_HINT` and all the CPUs going into a deep state while the timer is still being programmed. It's likely a firmware bug, but to avoid it don't allow setting CPUs into the deepest state while using CZN timer wakeup path.
If later it's discovered that this also occurs from "regular" suspends without a timer as well or on other silicon, this may be later expanded to run in the suspend path for more scenarios.
Cc: stable@vger.kernel.org # 5.16+ Suggested-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Link: https://lore.kernel.org/linux-acpi/BL1PR12MB51570F5BD05980A0DCA1F3F4E23A9@BL... Fixes: 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path") Fixes: 23f62d7ab25b ("PM: sleep: Pause cpuidle later and resume it earlier during system transitions") Fixes: 59348401ebed ("platform/x86: amd-pmc: Add special handling for timer based S0i3 wakeup" Reviewed-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20220223175237.6209-1-mario.limonciello@amd.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com (cherry picked from commit 68af28426b3ca1bf9ba21c7d8bdd0ff639e5134c)
This didn't apply cleanly to 5.16.y because 5.16.y doesn't contain the STB feature. Manually fixed up the commit for this. This is *only* intended for 5.16.
Now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org