This patchset adds cpufreq callbacks to dpm_{suspend|resume}() for handling suspend/resume of cpufreq governors and core. This is required for early suspend and late resume of governors and cpufreq core.
There are multiple problems that are fixed by this patchset:

- Nishanth Menon (TI) found an interesting problem on his platform, OMAP. His board wasn't working well with suspend/resume, as calls for removing non-boot CPUs were turning into calls to the driver's ->target(), which then tries to play with regulators. But the regulators and their I2C bus were already suspended, and this resulted in a failure. Many platforms have such problems (Samsung, Tegra, etc.). They solved it with driver-specific PM notifiers in which they disabled their driver's ->target() routine. Most of these are updated in this patchset to use the new infrastructure.

- Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found another issue where the tunables configuration for clusters/sockets with non-boot CPUs was getting lost after suspend/resume, as we were notifying governors with CPUFREQ_GOV_POLICY_EXIT on removal of the last CPU for that policy and so deallocating memory for the tunables. This is also fixed by this patchset, as we don't allow any operation on governors during suspend/resume now.
So, to solve these issues, we introduce early suspend and late resume callbacks, which remove the need for cpufreq drivers to implement PM notifiers that disable frequency transitions after suspend and before resume.
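The idea described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: every `model_*` name is hypothetical, and the real logic lives in `__cpufreq_governor()` and the `cpufreq_suspended` flag shown in the first patch. The sketch shows why a POLICY_EXIT event arriving during CPU hot-unplug no longer frees the tunables once the flag is set.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the CPUFREQ_GOV_* events. */
enum model_event { MODEL_GOV_START, MODEL_GOV_STOP, MODEL_GOV_POLICY_EXIT };

struct model_policy {
	bool governor_running;
	bool tunables_allocated;
};

/* Models the cpufreq_suspended flag introduced by the series. */
static bool model_suspended;

/* Governor events are silently ignored while the flag is set. */
static int model_governor(struct model_policy *policy, enum model_event event)
{
	if (model_suspended)
		return 0;

	switch (event) {
	case MODEL_GOV_START:
		policy->governor_running = true;
		break;
	case MODEL_GOV_STOP:
		policy->governor_running = false;
		break;
	case MODEL_GOV_POLICY_EXIT:
		policy->tunables_allocated = false;
		break;
	}
	return 0;
}

/* Early suspend: stop the governor first, then block further events. */
static void model_suspend(struct model_policy *policy)
{
	model_governor(policy, MODEL_GOV_STOP);
	model_suspended = true;
}

/* Late resume: unblock events first, then restart the governor. */
static void model_resume(struct model_policy *policy)
{
	model_suspended = false;
	model_governor(policy, MODEL_GOV_START);
}
```

Note the ordering: the flag is set after the STOP event and cleared before the START event, otherwise those two events would be dropped as well.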
@Nishanth: Can you please test V2 as well and confirm that suspend_noirq() doesn't work for you? I am sure it will not, but it would be better if you could confirm that.
Viresh Kumar (6):
  cpufreq: suspend governors on system suspend/hibernate
  cpufreq: call driver's suspend/resume for each policy
  cpufreq: Implement cpufreq_generic_suspend()
  cpufreq: exynos: Use cpufreq_generic_suspend()
  cpufreq: s5pv210: Use cpufreq_generic_suspend()
  cpufreq: Tegra: Use cpufreq_generic_suspend()

 drivers/base/power/main.c         |   5 ++
 drivers/cpufreq/cpufreq.c         | 133 +++++++++++++++++++++-----------------
 drivers/cpufreq/exynos-cpufreq.c  |  97 ++-------------------------
 drivers/cpufreq/s5pv210-cpufreq.c |  49 +-------------
 drivers/cpufreq/tegra-cpufreq.c   |  54 ++--------------
 include/linux/cpufreq.h           |   6 ++
 6 files changed, 99 insertions(+), 245 deletions(-)
This patch adds cpufreq callbacks to dpm_{suspend|resume}() for handling suspend/resume of cpufreq governors. This is required for early suspend and late resume of governors and cpufreq core.
There are multiple problems that are fixed by this patch:

- Nishanth Menon (TI) found an interesting problem on his platform, OMAP. His board wasn't working well with suspend/resume, as calls for removing non-boot CPUs were turning into calls to the driver's ->target(), which then tries to play with regulators. But the regulators and their I2C bus were already suspended, and this resulted in a failure. Many platforms have such problems (Samsung, Tegra, etc.). They solved it with driver-specific PM notifiers in which they disabled their driver's ->target() routine.

- Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found another issue where the tunables configuration for clusters/sockets with non-boot CPUs was getting lost after suspend/resume, as we were notifying governors with CPUFREQ_GOV_POLICY_EXIT on removal of the last CPU for that policy and so deallocating memory for the tunables. This is also fixed by this patch, as we don't allow any operation on governors during suspend/resume now.
Reported-by: Lan Tianyu tianyu.lan@intel.com
Reported-by: Nishanth Menon nm@ti.com
Reported-by: Jinhyuk Choi jinchoi@broadcom.com
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
---
 drivers/base/power/main.c |  5 +++++
 drivers/cpufreq/cpufreq.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/cpufreq.h   |  3 +++
 3 files changed, 58 insertions(+)
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 1b41fca..c9fbb9d 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -29,6 +29,7 @@
 #include <linux/async.h>
 #include <linux/suspend.h>
 #include <trace/events/power.h>
+#include <linux/cpufreq.h>
 #include <linux/cpuidle.h>
 #include <linux/timer.h>

@@ -789,6 +790,8 @@ void dpm_resume(pm_message_t state)
 	mutex_unlock(&dpm_list_mtx);
 	async_synchronize_full();
 	dpm_show_time(starttime, state, NULL);
+
+	cpufreq_resume();
 }

 /**
@@ -1259,6 +1262,8 @@ int dpm_suspend(pm_message_t state)

 	might_sleep();

+	cpufreq_suspend();
+
 	mutex_lock(&dpm_list_mtx);
 	pm_transition = state;
 	async_error = 0;
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 02d534d..b6c7821 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -26,6 +26,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
+#include <linux/suspend.h>
 #include <linux/syscore_ops.h>
 #include <linux/tick.h>
 #include <trace/events/power.h>
@@ -47,6 +48,9 @@ static LIST_HEAD(cpufreq_policy_list);
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif

+/* Flag to suspend/resume CPUFreq governors */
+static bool cpufreq_suspended;
+
 static inline bool has_target(void)
 {
 	return cpufreq_driver->target_index || cpufreq_driver->target;
@@ -1462,6 +1466,48 @@ static struct subsys_interface cpufreq_interface = {
 	.remove_dev	= cpufreq_remove_dev,
 };

+/*
+ * Callbacks for suspending/resuming governors as some platforms can't change
+ * frequency after this point in suspend cycle. Because some of the devices
+ * (like: i2c, regulators, etc) they use for changing frequency are suspended
+ * quickly after this point.
+ */
+void cpufreq_suspend(void)
+{
+	struct cpufreq_policy *policy;
+
+	if (!has_target())
+		return;
+
+	pr_debug("%s: Suspending Governors\n", __func__);
+
+	list_for_each_entry(policy, &cpufreq_policy_list, policy_list)
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
+			pr_err("%s: Failed to stop governor for policy: %p\n",
+				__func__, policy);
+
+	cpufreq_suspended = true;
+}
+
+void cpufreq_resume(void)
+{
+	struct cpufreq_policy *policy;
+
+	if (!has_target())
+		return;
+
+	pr_debug("%s: Resuming Governors\n", __func__);
+
+	cpufreq_suspended = false;
+
+	list_for_each_entry(policy, &cpufreq_policy_list, policy_list)
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_START) ||
+				__cpufreq_governor(policy,
+					CPUFREQ_GOV_LIMITS))
+			pr_err("%s: Failed to start governor for policy: %p\n",
+				__func__, policy);
+}
+
 /**
  * cpufreq_bp_suspend - Prepare the boot CPU for system suspend.
  *
@@ -1764,6 +1810,10 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 	struct cpufreq_governor *gov = NULL;
 #endif

+	/* Don't start any governor operations if we are entering suspend */
+	if (cpufreq_suspended)
+		return 0;
+
 	if (policy->governor->max_transition_latency &&
 	    policy->cpuinfo.transition_latency >
 	    policy->governor->max_transition_latency) {
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index dc196bb..6d93f91 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -255,6 +255,9 @@ struct cpufreq_driver {
 int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);

+void cpufreq_suspend(void);
+void cpufreq_resume(void);
+
 const char *cpufreq_get_current_driver(void);

 static inline void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
On Monday, November 25, 2013 07:41:41 PM Viresh Kumar wrote:
This patch adds cpufreq callbacks to dpm_{suspend|resume}() for handling suspend/resume of cpufreq governors. This is required for early suspend and late resume of governors and cpufreq core.
There are multiple problems that are fixed by this patch:
- Nishanth Menon (TI) found an interesting problem on his platform, OMAP. His board wasn't working well with suspend/resume, as calls for removing non-boot CPUs were turning into calls to the driver's ->target(), which then tries to play with regulators. But the regulators and their I2C bus were already suspended, and this resulted in a failure. Many platforms have such problems (Samsung, Tegra, etc.). They solved it with driver-specific PM notifiers in which they disabled their driver's ->target() routine.

- Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found another issue where the tunables configuration for clusters/sockets with non-boot CPUs was getting lost after suspend/resume, as we were notifying governors with CPUFREQ_GOV_POLICY_EXIT on removal of the last CPU for that policy and so deallocating memory for the tunables. This is also fixed by this patch, as we don't allow any operation on governors during suspend/resume now.
Reported-by: Lan Tianyu tianyu.lan@intel.com
Reported-by: Nishanth Menon nm@ti.com
Reported-by: Jinhyuk Choi jinchoi@broadcom.com
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org

 drivers/base/power/main.c |  5 +++++
 drivers/cpufreq/cpufreq.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/cpufreq.h   |  3 +++
 3 files changed, 58 insertions(+)
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 1b41fca..c9fbb9d 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -29,6 +29,7 @@
 #include <linux/async.h>
 #include <linux/suspend.h>
 #include <trace/events/power.h>
+#include <linux/cpufreq.h>
 #include <linux/cpuidle.h>
 #include <linux/timer.h>

@@ -789,6 +790,8 @@ void dpm_resume(pm_message_t state)
 	mutex_unlock(&dpm_list_mtx);
 	async_synchronize_full();
 	dpm_show_time(starttime, state, NULL);
+
+	cpufreq_resume();
 }

 /**
@@ -1259,6 +1262,8 @@ int dpm_suspend(pm_message_t state)

 	might_sleep();

+	cpufreq_suspend();
+
 	mutex_lock(&dpm_list_mtx);
 	pm_transition = state;
 	async_error = 0;
Shouldn't it do cpufreq_resume() on errors?
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 02d534d..b6c7821 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -26,6 +26,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
+#include <linux/suspend.h>
 #include <linux/syscore_ops.h>
 #include <linux/tick.h>
 #include <trace/events/power.h>
@@ -47,6 +48,9 @@ static LIST_HEAD(cpufreq_policy_list);
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif

+/* Flag to suspend/resume CPUFreq governors */
+static bool cpufreq_suspended;
+
 static inline bool has_target(void)
 {
 	return cpufreq_driver->target_index || cpufreq_driver->target;
@@ -1462,6 +1466,48 @@ static struct subsys_interface cpufreq_interface = {
 	.remove_dev	= cpufreq_remove_dev,
 };

+/*
+ * Callbacks for suspending/resuming governors as some platforms can't change
+ * frequency after this point in suspend cycle. Because some of the devices
+ * (like: i2c, regulators, etc) they use for changing frequency are suspended
+ * quickly after this point.
+ */
+void cpufreq_suspend(void)
+{
+	struct cpufreq_policy *policy;
+
+	if (!has_target())
+		return;
+
+	pr_debug("%s: Suspending Governors\n", __func__);
+
+	list_for_each_entry(policy, &cpufreq_policy_list, policy_list)
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
+			pr_err("%s: Failed to stop governor for policy: %p\n",
+				__func__, policy);
This appears to be racy. Is it really racy, or just seemingly?
+
+	cpufreq_suspended = true;
+}
+
+void cpufreq_resume(void)
+{
+	struct cpufreq_policy *policy;
+
+	if (!has_target())
+		return;
+
+	pr_debug("%s: Resuming Governors\n", __func__);
+
+	cpufreq_suspended = false;
+
+	list_for_each_entry(policy, &cpufreq_policy_list, policy_list)
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_START) ||
+				__cpufreq_governor(policy,
+					CPUFREQ_GOV_LIMITS))
+			pr_err("%s: Failed to start governor for policy: %p\n",
+				__func__, policy);
+}
+
 /**
  * cpufreq_bp_suspend - Prepare the boot CPU for system suspend.
  *
@@ -1764,6 +1810,10 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 	struct cpufreq_governor *gov = NULL;
 #endif

+	/* Don't start any governor operations if we are entering suspend */
+	if (cpufreq_suspended)
+		return 0;
+
 	if (policy->governor->max_transition_latency &&
 	    policy->cpuinfo.transition_latency >
 	    policy->governor->max_transition_latency) {
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index dc196bb..6d93f91 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -255,6 +255,9 @@ struct cpufreq_driver {
 int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);

+void cpufreq_suspend(void);
+void cpufreq_resume(void);
+
 const char *cpufreq_get_current_driver(void);

 static inline void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
Thanks!
On 26 November 2013 04:59, Rafael J. Wysocki rjw@rjwysocki.net wrote:
@@ -1259,6 +1262,8 @@ int dpm_suspend(pm_message_t state)

 	might_sleep();

+	cpufreq_suspend();
+
 	mutex_lock(&dpm_list_mtx);
 	pm_transition = state;
 	async_error = 0;
Shouldn't it do cpufreq_resume() on errors?
Yes, and I believe this is already done: if dpm_suspend() fails, dpm_resume() gets called. Doesn't it?
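The error path being discussed can be modelled in userspace as follows. This is a sketch under stated assumptions: the `model_*` functions are hypothetical stand-ins for dpm_suspend(), dpm_resume() and their caller, and the only point illustrated is that a failed suspend is followed by the resume path, which undoes the governor stop.

```c
#include <errno.h>
#include <stdbool.h>

/* Models whether cpufreq governors are currently stopped. */
static bool model_governors_stopped;

static void model_cpufreq_suspend(void) { model_governors_stopped = true; }
static void model_cpufreq_resume(void)  { model_governors_stopped = false; }

/* Stand-in for dpm_suspend(): governors are stopped before devices suspend. */
static int model_dpm_suspend(bool device_fails)
{
	model_cpufreq_suspend();
	return device_fails ? -EIO : 0;
}

/* Stand-in for dpm_resume(): governors restart after devices resume. */
static void model_dpm_resume(void)
{
	model_cpufreq_resume();
}

/* Stand-in for the caller: on failure the resume path runs, undoing the stop. */
static int model_suspend_devices(bool device_fails)
{
	int error = model_dpm_suspend(device_fails);

	if (error)
		model_dpm_resume();
	return error;
}
```

Because the resume hook sits in the same function that the caller invokes for recovery, no separate error handling is needed inside the suspend hook itself.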
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
+void cpufreq_suspend(void)
+{
+	struct cpufreq_policy *policy;
+
+	if (!has_target())
+		return;
+
+	pr_debug("%s: Suspending Governors\n", __func__);
+
+	list_for_each_entry(policy, &cpufreq_policy_list, policy_list)
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
+			pr_err("%s: Failed to stop governor for policy: %p\n",
+				__func__, policy);
This appears to be racy. Is it really racy, or just seemingly?
Why does it look racy to you? Userspace should be frozen by now, and policy_list should be stable as well, since nobody would touch it.
On Tuesday, November 26, 2013 07:56:19 AM Viresh Kumar wrote:
On 26 November 2013 04:59, Rafael J. Wysocki rjw@rjwysocki.net wrote:
@@ -1259,6 +1262,8 @@ int dpm_suspend(pm_message_t state)

 	might_sleep();

+	cpufreq_suspend();
+
 	mutex_lock(&dpm_list_mtx);
 	pm_transition = state;
 	async_error = 0;
Shouldn't it do cpufreq_resume() on errors?
Yes and this is already done I believe. In case dpm_suspend() fails, dpm_resume() gets called. Isn't it?
OK
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
+void cpufreq_suspend(void)
+{
+	struct cpufreq_policy *policy;
+
+	if (!has_target())
+		return;
+
+	pr_debug("%s: Suspending Governors\n", __func__);
+
+	list_for_each_entry(policy, &cpufreq_policy_list, policy_list)
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
+			pr_err("%s: Failed to stop governor for policy: %p\n",
+				__func__, policy);
This appears to be racy. Is it really racy, or just seemingly?
Why does it look racy to you? Userspace should be frozen by now, policy_list should be stable as well as nobody would touch it.
You're stopping governors while they may be in use in principle. Do we have suitable synchronization in place for that?
Rafael
On Tuesday, November 26, 2013 09:23:15 PM Rafael J. Wysocki wrote:
On Tuesday, November 26, 2013 07:56:19 AM Viresh Kumar wrote:
On 26 November 2013 04:59, Rafael J. Wysocki rjw@rjwysocki.net wrote:
@@ -1259,6 +1262,8 @@ int dpm_suspend(pm_message_t state)

 	might_sleep();

+	cpufreq_suspend();
+
 	mutex_lock(&dpm_list_mtx);
 	pm_transition = state;
 	async_error = 0;
Shouldn't it do cpufreq_resume() on errors?
Yes and this is already done I believe. In case dpm_suspend() fails, dpm_resume() gets called. Isn't it?
OK
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
+void cpufreq_suspend(void)
+{
+	struct cpufreq_policy *policy;
+
+	if (!has_target())
+		return;
+
+	pr_debug("%s: Suspending Governors\n", __func__);
+
+	list_for_each_entry(policy, &cpufreq_policy_list, policy_list)
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
+			pr_err("%s: Failed to stop governor for policy: %p\n",
+				__func__, policy);
This appears to be racy. Is it really racy, or just seemingly?
Why does it look racy to you? Userspace should be frozen by now, policy_list should be stable as well as nobody would touch it.
You're stopping governors while they may be in use in principle. Do we have suitable synchronization in place for that?
Anyway, if you did what I asked you to do and put the cpufreq suspend/resume into dpm_suspend/resume_noirq(), I'd probably take this for 3.13. However, since you've decided to put those things somewhere else thus making the change much more intrusive, I can only queue it up for 3.14.
This means I'm going to take Tianyu's patch as a stopgap for 3.13.
Thanks!
On 27 November 2013 07:12, Rafael J. Wysocki rjw@rjwysocki.net wrote:
Anyway, if you did what I asked you to do and put the cpufreq suspend/resume into dpm_suspend/resume_noirq(), I'd probably take this for 3.13. However, since you've decided to put those things somewhere else thus making the change much more intrusive, I can only queue it up for 3.14.
This means I'm going to take the Tianyu's patch as a stop gap for 3.13.
There were design issues with that patch, actually, as I pointed out earlier (handling the EXIT part in the core and the INIT part in the governors). So, in case we need to get something in for v3.13, I will send a short version of this series with callbacks from suspend_noirq.
Get that one instead.
-- viresh
On 27 November 2013 11:07, Viresh Kumar wrote:
On 27 November 2013 07:12, Rafael J. Wysocki rjw@rjwysocki.net wrote:
Anyway, if you did what I asked you to do and put the cpufreq suspend/resume into dpm_suspend/resume_noirq(), I'd probably take this for 3.13. However, since you've decided to put those things somewhere else thus making the change much more intrusive, I can only queue it up for 3.14.
This means I'm going to take the Tianyu's patch as a stop gap for 3.13.
Hi Viresh: First, I agree with the new solution you are working on. :) But actually I don't totally agree that my original patch has a design issue, because I think a governor should be able to check, when doing INIT, whether it has already been EXITed, and it should return an error code at that point. The design is meant to make the governor code more robust in the case where a governor is reinitialized before EXIT. Just my view. Sorry for the noise.
There were design issues with that patch actually, as I pointed out earlier (handling EXIT part in core and INIT in governors).. And so in case we need to get something for v3.13, I will send a short version of this series with callbacks from suspend_noirq.
Get that one instead.
-- viresh
On 27 November 2013 12:38, Lan Tianyu tianyu.lan@intel.com wrote:
Hi Viresh:
Hey Lan,
First, I agree with the new solution you are working on. :)
Thanks :)
But actually I don't totally agree that my original patch has a design issue, because I think a governor should be able to check, when doing INIT, whether it has already been EXITed, and it should return an error code at that point. The design is meant to make the governor code more robust in the case where a governor is reinitialized before EXIT. Just my view.

Sorry for the noise.
Ahh, these are useful discussions. Everyone has their own thoughts, and it's up to all of us to get meaningful stuff out of them.

I agree with everything you wrote above, but this isn't exactly what's being done in your patch. I was more concerned about this stuff:
On 22 November 2013 13:08, Lan Tianyu tianyu.lan@intel.com wrote:
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c

	if (has_target() && !frozen) {
		ret = __cpufreq_governor(policy,
				CPUFREQ_GOV_POLICY_EXIT);

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
@@ -204,9 +204,20 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,

	switch (event) {
	case CPUFREQ_GOV_POLICY_INIT:
		/*
		 * In order to keep governor data across suspend/resume,
		 * Governor doesn't exit when suspend and will be
		 * reinitialized when resume. Here check policy governor
		 * data to determine whether the governor has been exited.
		 * If not, return EALREADY.
		 */
		if (have_governor_per_policy()) {
			WARN_ON(dbs_data);
			if (dbs_data)
				return -EALREADY;
		} else if (dbs_data) {
			if (policy->governor_data == dbs_data)
				return -EALREADY;
			dbs_data->usage_count++;
			policy->governor_data = dbs_data;
			return 0;
Here the cpufreq core has skipped the call to the governor's EXIT, and so it shouldn't pass the following INIT call on to the governor either.

That's a bit wrong. These two calls work in pairs and are exact opposites of each other. So if some decision has to be taken, it should be taken either completely at the governor level or completely at the core level. Doing things partly in the governor and partly in the core is an invitation to new bugs/problems :)
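To make the pairing argument concrete, here is a minimal userspace sketch. It is an assumption-laden model, not the governor code: `model_tunables`, `model_gov_init()`, `model_gov_exit()` and `model_core_event()` are hypothetical names, and the usage counter only loosely mirrors `dbs_data->usage_count`. The core filters both INIT and EXIT with the same flag, so the counter stays balanced and the tunables survive.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for governor tunables shared between policies. */
struct model_tunables {
	unsigned int sampling_rate;
	unsigned int usage_count;
};

static struct model_tunables *model_data;
static bool model_suspended;

/* Governor-level INIT: allocate on first use, otherwise take a reference. */
static int model_gov_init(void)
{
	if (model_data) {
		model_data->usage_count++;
		return 0;
	}

	model_data = calloc(1, sizeof(*model_data));
	if (!model_data)
		return -ENOMEM;

	model_data->usage_count = 1;
	model_data->sampling_rate = 10000;	/* arbitrary default */
	return 0;
}

/* Governor-level EXIT: drop a reference, free on the last one. */
static void model_gov_exit(void)
{
	if (!model_data)
		return;

	if (--model_data->usage_count == 0) {
		free(model_data);
		model_data = NULL;
	}
}

enum model_event { MODEL_POLICY_INIT, MODEL_POLICY_EXIT };

/* Core-level filter: INIT and EXIT are both skipped while suspended. */
static int model_core_event(enum model_event event)
{
	if (model_suspended)
		return 0;

	if (event == MODEL_POLICY_INIT)
		return model_gov_init();

	model_gov_exit();
	return 0;
}
```

If only EXIT were filtered in the core while INIT still reached `model_gov_init()`, the reference count would grow by one on every suspend/resume cycle, which is the kind of imbalance the governor-side `-EALREADY` check was trying to paper over.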
Nothing personal, though. Recently, patches were sent by several people (you, Nishanth, etc.) which I have simply overridden with my own versions. It wasn't about getting my patch count higher :) but about putting the solution in the right place instead of solving the problem at the wrong location.

I am already having a tough time upstreaming patches for cpufreq consolidation, as the number of patches is huge. It takes time for people to absorb and test them. Rafael has finally taken almost all of them for v3.13, but I understand it's difficult for him as well, and he did his job wonderfully :)

So I don't really want to get any new stuff in that will surely be consolidated later. Let's do it now; we have had enough of it :)

In fact, related to your patch, I was already thinking of getting rid of the "frozen" variable and function parameter, as we now know the status from a global variable, cpufreq_suspended. So we don't actually need to pass any additional parameter to the many routines that currently take something like 'frozen'.
-- viresh
On 27 November 2013 01:53, Rafael J. Wysocki rjw@rjwysocki.net wrote:
On Tuesday, November 26, 2013 07:56:19 AM Viresh Kumar wrote:
On 26 November 2013 04:59, Rafael J. Wysocki rjw@rjwysocki.net wrote:
This appears to be racy. Is it really racy, or just seemingly?
Why does it look racy to you? Userspace should be frozen by now, policy_list should be stable as well as nobody would touch it.
You're stopping governors while they may be in use in principle. Do we have suitable synchronization in place for that?
At what point exactly in the suspend cycle do we suspend timers and workqueues? I thought userspace would be frozen by now, and so would the governors.
On Monday 25 November 2013 07:41 PM, Viresh Kumar wrote:
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index dc196bb..6d93f91 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -255,6 +255,9 @@ struct cpufreq_driver {
 int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);

+void cpufreq_suspend(void);
+void cpufreq_resume(void);
+
 const char *cpufreq_get_current_driver(void);

 static inline void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
A minor fix here to get kernel compiled without cpufreq support enabled:
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 8d8b2f4..d40809d 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -259,9 +259,6 @@ struct cpufreq_driver {
 int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);

-void cpufreq_suspend(void);
-void cpufreq_resume(void);
-
 const char *cpufreq_get_current_driver(void);

 static inline void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
@@ -287,6 +284,14 @@ cpufreq_verify_within_cpu_limits(struct cpufreq_policy *policy)
 			policy->cpuinfo.max_freq);
 }

+#ifdef CONFIG_CPU_FREQ
+void cpufreq_suspend(void);
+void cpufreq_resume(void);
+#else
+static inline void cpufreq_suspend(void) {}
+static inline void cpufreq_resume(void) {}
+#endif
+
 /*********************************************************************
  *                     CPUFREQ NOTIFIER INTERFACE                    *
  *********************************************************************/
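The stub pattern used in this fix can be tried out in userspace. The sketch below is illustrative only: `MODEL_CPU_FREQ` is a hypothetical macro standing in for CONFIG_CPU_FREQ, and the `model_*` functions are not kernel symbols. With the macro undefined, the callers still compile and the calls reduce to empty inline functions. Note that a bare `#elif` with no condition is not valid preprocessor syntax, so the fallback branch has to be `#else`.

```c
/* MODEL_CPU_FREQ is a hypothetical stand-in for CONFIG_CPU_FREQ. */
#ifdef MODEL_CPU_FREQ
void model_cpufreq_suspend(void);
void model_cpufreq_resume(void);
#else
/* Feature compiled out: callers still build, the calls do nothing. */
static inline void model_cpufreq_suspend(void) {}
static inline void model_cpufreq_resume(void) {}
#endif

/* A caller that needs no #ifdef of its own, like dpm_suspend(). */
static int model_dpm_suspend(void)
{
	model_cpufreq_suspend();
	return 0;
}

/* The matching resume-side caller, like dpm_resume(). */
static void model_dpm_resume(void)
{
	model_cpufreq_resume();
}
```

Keeping the conditional in the header means call sites such as drivers/base/power/main.c stay free of configuration checks.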
Earlier, the cpufreq suspend/resume callbacks into drivers were called only for the boot CPU, because by the time the callbacks ran the non-boot CPUs had already been removed. Because we might still need driver-specific actions on suspend/resume, it's better to invoke these existing driver callbacks from the early suspend/late resume calls.

In effect, we call suspend/resume for each policy. The resume part also takes care of synchronising the frequency of the boot CPU, which might turn out to be different from what the cpufreq core believes.

Hence, the earlier syscore infrastructure is removed now.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
---
 drivers/cpufreq/cpufreq.c | 98 +++++++++--------------------------------------
 1 file changed, 18 insertions(+), 80 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b6c7821..026efe4a 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -27,7 +27,6 @@
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/suspend.h>
-#include <linux/syscore_ops.h>
 #include <linux/tick.h>
 #include <trace/events/power.h>

@@ -1481,10 +1480,15 @@ void cpufreq_suspend(void)

 	pr_debug("%s: Suspending Governors\n", __func__);

-	list_for_each_entry(policy, &cpufreq_policy_list, policy_list)
+	list_for_each_entry(policy, &cpufreq_policy_list, policy_list) {
 		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
 			pr_err("%s: Failed to stop governor for policy: %p\n",
 				__func__, policy);
+		else if (cpufreq_driver->suspend &&
+				cpufreq_driver->suspend(policy))
+			pr_err("%s: Failed to suspend driver: %p\n", __func__,
+				policy);
+	}

 	cpufreq_suspended = true;
 }
@@ -1500,92 +1504,27 @@ void cpufreq_resume(void)

 	cpufreq_suspended = false;

-	list_for_each_entry(policy, &cpufreq_policy_list, policy_list)
+	list_for_each_entry(policy, &cpufreq_policy_list, policy_list) {
 		if (__cpufreq_governor(policy, CPUFREQ_GOV_START) ||
 				__cpufreq_governor(policy,
 					CPUFREQ_GOV_LIMITS))
 			pr_err("%s: Failed to start governor for policy: %p\n",
 				__func__, policy);
-}
-
-/**
- * cpufreq_bp_suspend - Prepare the boot CPU for system suspend.
- *
- * This function is only executed for the boot processor. The other CPUs
- * have been put offline by means of CPU hotplug.
- */
-static int cpufreq_bp_suspend(void)
-{
-	int ret = 0;
-
-	int cpu = smp_processor_id();
-	struct cpufreq_policy *policy;
-
-	pr_debug("suspending cpu %u\n", cpu);
-
-	/* If there's no policy for the boot CPU, we have nothing to do. */
-	policy = cpufreq_cpu_get(cpu);
-	if (!policy)
-		return 0;
-
-	if (cpufreq_driver->suspend) {
-		ret = cpufreq_driver->suspend(policy);
-		if (ret)
-			printk(KERN_ERR "cpufreq: suspend failed in ->suspend "
-					"step on CPU %u\n", policy->cpu);
-	}
-
-	cpufreq_cpu_put(policy);
-	return ret;
-}
-
-/**
- * cpufreq_bp_resume - Restore proper frequency handling of the boot CPU.
- *
- *	1.) resume CPUfreq hardware support (cpufreq_driver->resume())
- *	2.) schedule call cpufreq_update_policy() ASAP as interrupts are
- *	    restored. It will verify that the current freq is in sync with
- *	    what we believe it to be. This is a bit later than when it
- *	    should be, but nonethteless it's better than calling
- *	    cpufreq_driver->get() here which might re-enable interrupts...
- *
- * This function is only executed for the boot CPU. The other CPUs have not
- * been turned on yet.
- */
-static void cpufreq_bp_resume(void)
-{
-	int ret = 0;
+		else if (cpufreq_driver->resume &&
+				cpufreq_driver->resume(policy))
+			pr_err("%s: Failed to resume driver: %p\n", __func__,
+				policy);

-	int cpu = smp_processor_id();
-	struct cpufreq_policy *policy;
-
-	pr_debug("resuming cpu %u\n", cpu);
-
-	/* If there's no policy for the boot CPU, we have nothing to do. */
-	policy = cpufreq_cpu_get(cpu);
-	if (!policy)
-		return;
-
-	if (cpufreq_driver->resume) {
-		ret = cpufreq_driver->resume(policy);
-		if (ret) {
-			printk(KERN_ERR "cpufreq: resume failed in ->resume "
-					"step on CPU %u\n", policy->cpu);
-			goto fail;
-		}
+		/*
+		 * schedule call cpufreq_update_policy() for boot CPU, i.e. last
+		 * policy in list. It will verify that the current freq is in
+		 * sync with what we believe it to be.
+		 */
+		if (list_is_last(&policy->policy_list, &cpufreq_policy_list))
+			schedule_work(&policy->update);
 	}
-
-	schedule_work(&policy->update);
-
-fail:
-	cpufreq_cpu_put(policy);
 }

-static struct syscore_ops cpufreq_syscore_ops = {
-	.suspend	= cpufreq_bp_suspend,
-	.resume		= cpufreq_bp_resume,
-};
-
 /**
  * cpufreq_get_current_driver - return current driver's name
  *
@@ -2271,7 +2210,6 @@ static int __init cpufreq_core_init(void)

 	cpufreq_global_kobject = kobject_create();
 	BUG_ON(!cpufreq_global_kobject);
-	register_syscore_ops(&cpufreq_syscore_ops);

 	return 0;
 }
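The per-policy behaviour this patch describes can be modelled in userspace as below. This is a sketch, not the kernel implementation: the `model_*` types and functions are hypothetical, an array replaces `cpufreq_policy_list`, and a flag replaces `schedule_work(&policy->update)`. It shows optional driver callbacks being invoked once per policy, with the frequency re-sync requested only for the last policy in the list, which the patch's comment takes to be the boot CPU's policy.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_policy. */
struct model_policy {
	int cpu;
	int driver_suspended;
	int update_scheduled;
};

/* Hypothetical stand-in for the optional driver callbacks. */
struct model_driver {
	int (*suspend)(struct model_policy *policy);
	int (*resume)(struct model_policy *policy);
};

/* Call the driver's ->suspend() for every policy; count the failures. */
static int model_suspend_all(const struct model_driver *drv,
			     struct model_policy *policies, size_t count)
{
	int failures = 0;
	size_t i;

	for (i = 0; i < count; i++)
		if (drv->suspend && drv->suspend(&policies[i]))
			failures++;

	return failures;
}

/* Call ->resume() for every policy; re-sync only the last one. */
static int model_resume_all(const struct model_driver *drv,
			    struct model_policy *policies, size_t count)
{
	int failures = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (drv->resume && drv->resume(&policies[i]))
			failures++;

		/* Models schedule_work() for the last policy in the list. */
		if (i == count - 1)
			policies[i].update_scheduled = 1;
	}

	return failures;
}

/* Example callbacks for a hypothetical driver. */
static int model_drv_suspend(struct model_policy *policy)
{
	policy->driver_suspended = 1;
	return 0;
}

static int model_drv_resume(struct model_policy *policy)
{
	policy->driver_suspended = 0;
	return 0;
}
```

A driver that leaves both callbacks unset is handled by the NULL checks, matching the `cpufreq_driver->suspend &&` and `cpufreq_driver->resume &&` tests in the diff.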
Multiple platforms need to set the CPU to a particular frequency before suspending the system, so they need a common infrastructure, which this patch provides. Those platforms just need to initialize their ->suspend() pointers with the generic routine.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
---
 drivers/cpufreq/cpufreq.c | 25 +++++++++++++++++++++++++
 include/linux/cpufreq.h   |  3 +++
 2 files changed, 28 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 026efe4a..f6da551 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1466,6 +1466,31 @@ static struct subsys_interface cpufreq_interface = {
 };

 /*
+ * In case platform wants some specific frequency to be configured
+ * during suspend..
+ */
+int cpufreq_generic_suspend(struct cpufreq_policy *policy)
+{
+	int ret;
+
+	if (!policy->suspend_freq) {
+		pr_err("%s: suspend_freq can't be zero\n", __func__);
+		return -EINVAL;
+	}
+
+	pr_debug("%s: Setting suspend-freq: %u\n", __func__,
+			policy->suspend_freq);
+
+	ret = __cpufreq_driver_target(policy, policy->suspend_freq,
+			CPUFREQ_RELATION_H);
+	if (ret)
+		pr_err("%s: unable to set suspend-freq: %u. err: %d\n",
+				__func__, policy->suspend_freq, ret);
+
+	return ret;
+}
+
+/*
  * Callbacks for suspending/resuming governors as some platforms can't change
  * frequency after this point in suspend cycle. Because some of the devices
  * (like: i2c, regulators, etc) they use for changing frequency are suspended
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 6d93f91..94ccac5 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -72,6 +72,8 @@ struct cpufreq_policy {
 	unsigned int		max;    /* in kHz */
 	unsigned int		cur;    /* in kHz, only needed if cpufreq
 					 * governors are used */
+	unsigned int		suspend_freq; /* freq to set during suspend */
+
 	unsigned int		policy; /* see above */
 	struct cpufreq_governor	*governor; /* see below */
 	void			*governor_data;
@@ -257,6 +259,7 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);

 void cpufreq_suspend(void);
 void cpufreq_resume(void);
+int cpufreq_generic_suspend(struct cpufreq_policy *policy);

 const char *cpufreq_get_current_driver(void);
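How a driver would rely on the generic routine can be sketched in userspace as follows. Everything here is a hypothetical model: `model_policy`, `model_set_freq()` and `model_generic_suspend()` only imitate the shape of `cpufreq_generic_suspend()`, and the frequencies are made-up values in kHz. The sketch shows the two outcomes of the routine: the policy is moved to its `suspend_freq`, or the call is rejected with -EINVAL when the driver never set one.

```c
#include <errno.h>

/* Hypothetical stand-in for the relevant fields of struct cpufreq_policy. */
struct model_policy {
	unsigned int cur;		/* current frequency, kHz */
	unsigned int suspend_freq;	/* frequency to set during suspend, kHz */
};

/* Hypothetical stand-in for __cpufreq_driver_target(); always succeeds. */
static int model_set_freq(struct model_policy *policy, unsigned int khz)
{
	policy->cur = khz;
	return 0;
}

/* Models cpufreq_generic_suspend(): a zero suspend_freq is rejected. */
static int model_generic_suspend(struct model_policy *policy)
{
	if (!policy->suspend_freq)
		return -EINVAL;

	return model_set_freq(policy, policy->suspend_freq);
}
```

In the series itself, the exynos conversion fills `policy->suspend_freq` from its ->init() callback and points `.suspend` at the generic routine, so a driver opts in with two assignments.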
Currently we have implemented PM notifiers to disable/enable the ->target() routine's functionality during suspend/resume.

Now that support is present in the cpufreq core, let's use it.

Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
---
 drivers/cpufreq/exynos-cpufreq.c | 97 +++-------------------------------------
 1 file changed, 6 insertions(+), 91 deletions(-)
diff --git a/drivers/cpufreq/exynos-cpufreq.c b/drivers/cpufreq/exynos-cpufreq.c
index f3c2287..88a4e28 100644
--- a/drivers/cpufreq/exynos-cpufreq.c
+++ b/drivers/cpufreq/exynos-cpufreq.c
@@ -16,7 +16,6 @@
 #include <linux/slab.h>
 #include <linux/regulator/consumer.h>
 #include <linux/cpufreq.h>
-#include <linux/suspend.h>

 #include <plat/cpu.h>

@@ -26,10 +25,6 @@ static struct exynos_dvfs_info *exynos_info;

 static struct regulator *arm_regulator;

-static unsigned int locking_frequency;
-static bool frequency_locked;
-static DEFINE_MUTEX(cpufreq_lock);
-
 static unsigned int exynos_getspeed(unsigned int cpu)
 {
 	return clk_get_rate(exynos_info->cpu_clk) / 1000;
@@ -138,82 +133,12 @@ out:

 static int exynos_target(struct cpufreq_policy *policy, unsigned int index)
 {
-	struct cpufreq_frequency_table *freq_table = exynos_info->freq_table;
-	int ret = 0;
-
-	mutex_lock(&cpufreq_lock);
-
-	if (frequency_locked)
-		goto out;
-
-	ret = exynos_cpufreq_scale(freq_table[index].frequency);
-
-out:
-	mutex_unlock(&cpufreq_lock);
-
-	return ret;
-}
-
-#ifdef CONFIG_PM
-static int exynos_cpufreq_suspend(struct cpufreq_policy *policy)
-{
-	return 0;
-}
-
-static int exynos_cpufreq_resume(struct cpufreq_policy *policy)
-{
-	return 0;
+	return exynos_cpufreq_scale(exynos_info->freq_table[index].frequency);
 }
-#endif
-
-/**
- * exynos_cpufreq_pm_notifier - block CPUFREQ's activities in suspend-resume
- *			context
- * @notifier
- * @pm_event
- * @v
- *
- * While frequency_locked == true, target() ignores every frequency but
- * locking_frequency. The locking_frequency value is the initial frequency,
- * which is set by the bootloader. In order to eliminate possible
- * inconsistency in clock values, we save and restore frequencies during
- * suspend and resume and block CPUFREQ activities. Note that the standard
- * suspend/resume cannot be used as they are too deep (syscore_ops) for
- * regulator actions.
- */
-static int exynos_cpufreq_pm_notifier(struct notifier_block *notifier,
-				       unsigned long pm_event, void *v)
-{
-	int ret;
-
-	switch (pm_event) {
-	case PM_SUSPEND_PREPARE:
-		mutex_lock(&cpufreq_lock);
-		frequency_locked = true;
-		mutex_unlock(&cpufreq_lock);
-
-		ret = exynos_cpufreq_scale(locking_frequency);
-		if (ret < 0)
-			return NOTIFY_BAD;
-
-		break;
-
-	case PM_POST_SUSPEND:
-		mutex_lock(&cpufreq_lock);
-		frequency_locked = false;
-		mutex_unlock(&cpufreq_lock);
-		break;
-	}
-
-	return NOTIFY_OK;
-}
-
-static struct notifier_block exynos_cpufreq_nb = {
-	.notifier_call = exynos_cpufreq_pm_notifier,
-};

 static int exynos_cpufreq_cpu_init(struct cpufreq_policy *policy)
 {
+	policy->suspend_freq = exynos_getspeed(policy->cpu);
 	return cpufreq_generic_init(policy, exynos_info->freq_table, 100000);
 }

@@ -227,8 +152,7 @@ static struct cpufreq_driver exynos_driver = {
 	.name		= "exynos_cpufreq",
 	.attr		= cpufreq_generic_attr,
 #ifdef CONFIG_PM
-	.suspend	= exynos_cpufreq_suspend,
-	.resume		= exynos_cpufreq_resume,
+	.suspend	= cpufreq_generic_suspend,
 #endif
 };

@@ -263,19 +187,10 @@ static int __init exynos_cpufreq_init(void)
 		goto err_vdd_arm;
 	}

-	locking_frequency = exynos_getspeed(0);
-
-	register_pm_notifier(&exynos_cpufreq_nb);
-
-	if (cpufreq_register_driver(&exynos_driver)) {
-		pr_err("%s: failed to register cpufreq driver\n", __func__);
-		goto err_cpufreq;
-	}
-
-	return 0;
-err_cpufreq:
-	unregister_pm_notifier(&exynos_cpufreq_nb);
+	if (!cpufreq_register_driver(&exynos_driver))
+		return 0;

+	pr_err("%s: failed to register cpufreq driver\n", __func__);
 	regulator_put(arm_regulator);
 err_vdd_arm:
 	kfree(exynos_info);
Currently, this driver implements PM notifiers to disable and re-enable its ->target() routine during suspend/resume.

Now that the cpufreq core supports this, let's use it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/s5pv210-cpufreq.c | 49 +++------------------------------------
 1 file changed, 3 insertions(+), 46 deletions(-)

diff --git a/drivers/cpufreq/s5pv210-cpufreq.c b/drivers/cpufreq/s5pv210-cpufreq.c
index e3973da..89c052e 100644
--- a/drivers/cpufreq/s5pv210-cpufreq.c
+++ b/drivers/cpufreq/s5pv210-cpufreq.c
@@ -18,7 +18,6 @@
 #include <linux/cpufreq.h>
 #include <linux/reboot.h>
 #include <linux/regulator/consumer.h>
-#include <linux/suspend.h>

 #include <mach/map.h>
 #include <mach/regs-clock.h>
@@ -444,18 +443,6 @@ exit:
 	return ret;
 }

-#ifdef CONFIG_PM
-static int s5pv210_cpufreq_suspend(struct cpufreq_policy *policy)
-{
-	return 0;
-}
-
-static int s5pv210_cpufreq_resume(struct cpufreq_policy *policy)
-{
-	return 0;
-}
-#endif
-
 static int check_mem_type(void __iomem *dmc_reg)
 {
 	unsigned long val;
@@ -511,6 +498,7 @@ static int __init s5pv210_cpu_init(struct cpufreq_policy *policy)
 	s5pv210_dram_conf[1].refresh = (__raw_readl(S5P_VA_DMC1 + 0x30) * 1000);
 	s5pv210_dram_conf[1].freq = clk_get_rate(dmc1_clk);

+	policy->suspend_freq = SLEEP_FREQ;
 	return cpufreq_generic_init(policy, s5pv210_freq_table, 40000);

 out_dmc1:
@@ -520,32 +508,6 @@ out_dmc0:
 	return ret;
 }

-static int s5pv210_cpufreq_notifier_event(struct notifier_block *this,
-					  unsigned long event, void *ptr)
-{
-	int ret;
-
-	switch (event) {
-	case PM_SUSPEND_PREPARE:
-		ret = cpufreq_driver_target(cpufreq_cpu_get(0), SLEEP_FREQ, 0);
-		if (ret < 0)
-			return NOTIFY_BAD;
-
-		/* Disable updation of cpu frequency */
-		no_cpufreq_access = true;
-		return NOTIFY_OK;
-	case PM_POST_RESTORE:
-	case PM_POST_SUSPEND:
-		/* Enable updation of cpu frequency */
-		no_cpufreq_access = false;
-		cpufreq_driver_target(cpufreq_cpu_get(0), SLEEP_FREQ, 0);
-
-		return NOTIFY_OK;
-	}
-
-	return NOTIFY_DONE;
-}
-
 static int s5pv210_cpufreq_reboot_notifier_event(struct notifier_block *this,
 						 unsigned long event, void *ptr)
 {
@@ -567,15 +529,11 @@ static struct cpufreq_driver s5pv210_driver = {
 	.init = s5pv210_cpu_init,
 	.name = "s5pv210",
 #ifdef CONFIG_PM
-	.suspend = s5pv210_cpufreq_suspend,
-	.resume = s5pv210_cpufreq_resume,
+	.suspend = cpufreq_generic_suspend,
+	.resume = cpufreq_generic_suspend, /* We need to set SLEEP FREQ again */
 #endif
 };

-static struct notifier_block s5pv210_cpufreq_notifier = {
-	.notifier_call = s5pv210_cpufreq_notifier_event,
-};
-
 static struct notifier_block s5pv210_cpufreq_reboot_notifier = {
 	.notifier_call = s5pv210_cpufreq_reboot_notifier_event,
 };
@@ -595,7 +553,6 @@ static int __init s5pv210_cpufreq_init(void)
 		return PTR_ERR(int_regulator);
 	}

-	register_pm_notifier(&s5pv210_cpufreq_notifier);
 	register_reboot_notifier(&s5pv210_cpufreq_reboot_notifier);

 	return cpufreq_register_driver(&s5pv210_driver);

Currently, this driver implements PM notifiers to disable and re-enable its ->target() routine during suspend/resume.

Now that the cpufreq core supports this, let's use it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/tegra-cpufreq.c | 54 +++++------------------------------------
 1 file changed, 6 insertions(+), 48 deletions(-)

diff --git a/drivers/cpufreq/tegra-cpufreq.c b/drivers/cpufreq/tegra-cpufreq.c
index f42df7e..336368b 100644
--- a/drivers/cpufreq/tegra-cpufreq.c
+++ b/drivers/cpufreq/tegra-cpufreq.c
@@ -26,7 +26,6 @@
 #include <linux/err.h>
 #include <linux/clk.h>
 #include <linux/io.h>
-#include <linux/suspend.h>

 static struct cpufreq_frequency_table freq_table[] = {
 	{ .frequency = 216000 },
@@ -48,8 +47,6 @@ static struct clk *pll_p_clk;
 static struct clk *emc_clk;

 static unsigned long target_cpu_speed[NUM_CPUS];
-static DEFINE_MUTEX(tegra_cpu_lock);
-static bool is_suspended;

 static unsigned int tegra_getspeed(unsigned int cpu)
 {
@@ -137,50 +134,10 @@ static unsigned long tegra_cpu_highest_speed(void)

 static int tegra_target(struct cpufreq_policy *policy, unsigned int index)
 {
-	unsigned int freq;
-	int ret = 0;
-
-	mutex_lock(&tegra_cpu_lock);
-
-	if (is_suspended) {
-		ret = -EBUSY;
-		goto out;
-	}
-
-	freq = freq_table[index].frequency;
-
-	target_cpu_speed[policy->cpu] = freq;
-
-	ret = tegra_update_cpu_speed(policy, tegra_cpu_highest_speed());
-
-out:
-	mutex_unlock(&tegra_cpu_lock);
-	return ret;
+	target_cpu_speed[policy->cpu] = freq_table[index].frequency;
+	return tegra_update_cpu_speed(policy, tegra_cpu_highest_speed());
 }

-static int tegra_pm_notify(struct notifier_block *nb, unsigned long event,
-	void *dummy)
-{
-	mutex_lock(&tegra_cpu_lock);
-	if (event == PM_SUSPEND_PREPARE) {
-		struct cpufreq_policy *policy = cpufreq_cpu_get(0);
-		is_suspended = true;
-		pr_info("Tegra cpufreq suspend: setting frequency to %d kHz\n",
-			freq_table[0].frequency);
-		tegra_update_cpu_speed(policy, freq_table[0].frequency);
-		cpufreq_cpu_put(policy);
-	} else if (event == PM_POST_SUSPEND) {
-		is_suspended = false;
-	}
-	mutex_unlock(&tegra_cpu_lock);
-
-	return NOTIFY_OK;
-}
-
-static struct notifier_block tegra_cpu_pm_notifier = {
-	.notifier_call = tegra_pm_notify,
-};
-
 static int tegra_cpu_init(struct cpufreq_policy *policy)
 {
 	int ret;
@@ -192,6 +149,7 @@ static int tegra_cpu_init(struct cpufreq_policy *policy)
 	clk_prepare_enable(cpu_clk);

 	target_cpu_speed[policy->cpu] = tegra_getspeed(policy->cpu);
+	policy->suspend_freq = freq_table[0].frequency;

 	/* FIXME: what's the actual transition time? */
 	ret = cpufreq_generic_init(policy, freq_table, 300 * 1000);
@@ -201,9 +159,6 @@ static int tegra_cpu_init(struct cpufreq_policy *policy)
 		return ret;
 	}

-	if (policy->cpu == 0)
-		register_pm_notifier(&tegra_cpu_pm_notifier);
-
 	return 0;
 }

@@ -223,6 +178,9 @@ static struct cpufreq_driver tegra_cpufreq_driver = {
 	.exit = tegra_cpu_exit,
 	.name = "tegra",
 	.attr = cpufreq_generic_attr,
+#ifdef CONFIG_PM
+	.suspend = cpufreq_generic_suspend,
+#endif
 };

 static int __init tegra_cpufreq_init(void)

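As a reading aid, the pattern shared by the driver conversions above can be modelled in plain user-space C. This is only a sketch under stated assumptions: every `model_*` name is hypothetical, nothing here is kernel API, and the ordering (park the policy at `suspend_freq` first, then refuse further requests with -EBUSY) is an assumption about how the core-side gating behaves, not a copy of the code in patches 1-3.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct cpufreq_policy. */
struct model_policy {
	unsigned int cur;		/* current frequency in kHz */
	unsigned int suspend_freq;	/* frequency to hold across suspend, kHz */
};

/* Core-owned flag (assumption): set once the system is suspending. */
static bool model_core_suspended;

/* Driver ->target() stand-in: no mutex, no is_suspended bookkeeping. */
int model_driver_target(struct model_policy *policy, unsigned int freq)
{
	policy->cur = freq;
	return 0;
}

/* Core entry point for governor requests; refuses them while suspended. */
int model_core_target(struct model_policy *policy, unsigned int freq)
{
	assert(policy != NULL);
	if (model_core_suspended)
		return -EBUSY;
	return model_driver_target(policy, freq);
}

/* Generic ->suspend() stand-in: park the policy at its suspend_freq. */
int model_generic_suspend(struct model_policy *policy)
{
	assert(policy != NULL);
	if (!policy->suspend_freq)
		return -EINVAL;	/* driver did not set suspend_freq in init */
	return model_driver_target(policy, policy->suspend_freq);
}

/* Core suspend path: call the driver's suspend hook, then block requests. */
int model_core_suspend(struct model_policy *policy)
{
	int ret = model_generic_suspend(policy);

	if (ret)
		return ret;
	model_core_suspended = true;
	return 0;
}

/* Core resume path: allow frequency requests again. */
void model_core_resume(void)
{
	model_core_suspended = false;
}
```

In this model, a policy with suspend_freq set to 200000 ends up at 200000 kHz after model_core_suspend(), and any later model_core_target() request fails with -EBUSY until model_core_resume() runs. The point of the design is that the per-driver mutex and suspended flag become unnecessary once the gating lives in one place.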
On 11/25/2013 07:11 AM, Viresh Kumar wrote:
This patchset adds cpufreq callbacks to dpm_{suspend|resume}() for handling suspend/resume of cpufreq governors and core. This is required for early suspend and late resume of governors and cpufreq core.
Patches 1-3,6, Tested-by: Stephen Warren <swarren@nvidia.com>

Patch 6, Acked-by: Stephen Warren <swarren@nvidia.com>
Thanks.
On 11/25/2013 08:11 AM, Viresh Kumar wrote:
This patchset adds cpufreq callbacks to dpm_{suspend|resume}() for handling suspend/resume of cpufreq governors and core. This is required for early suspend and late resume of governors and cpufreq core.
Viresh Kumar (6):
  cpufreq: suspend governors on system suspend/hibernate
  cpufreq: call driver's suspend/resume for each policy

patches 1-2, Tested-by: Nishanth Menon <nm@ti.com>
http://pastebin.mozilla.org/3670932

Prior to these two patches: http://pastebin.mozilla.org/3670933
cpufreq driver used: cpufreq_cpu0
On 2013-11-25 22:11, Viresh Kumar wrote:
This patchset adds cpufreq callbacks to dpm_{suspend|resume}() for handling suspend/resume of cpufreq governors and core. This is required for early suspend and late resume of governors and cpufreq core.
Viresh Kumar (6):
  cpufreq: suspend governors on system suspend/hibernate
  cpufreq: call driver's suspend/resume for each policy
  cpufreq: Implement cpufreq_generic_suspend()
  cpufreq: exynos: Use cpufreq_generic_suspend()
  cpufreq: s5pv210: Use cpufreq_generic_suspend()
  cpufreq: Tegra: Use cpufreq_generic_suspend()

Patch 1-2, Tested-by: Lan Tianyu <tianyu.lan@intel.com>