Hi Rafael,
The aim of this series is to stop managing cpufreq sysfs directories on CPU hotplugs.
Currently on removal of a 'cpu != policy->cpu', we remove its sysfs directories by removing the soft-link. And on removal of policy->cpu, we migrate the sysfs directories to the next cpu. But if policy->cpu was the last CPU, we remove the policy completely and allocate it again as soon as the CPUs come back. This has shortcomings:
- Code Complexity - Slower hotplug - sysfs file permissions are reset after all policy->cpus are offlined - CPUFreq stats history lost after all policy->cpus are offlined - Special management of sysfs stuff during suspend/resume
To make things simple we stop playing with sysfs files unless the driver is getting removed. Also the policy is kept intact to be used later.
First few patches provide a clean base for others *more important* patches.
Rebased-over: your bleeding edge branch as there were dependencies on my earlier patches.
Pushed here:
git://git.linaro.org/people/viresh.kumar/linux.git cpufreq/core/sysfs
v1->V2: - Dropped the idea of using policy-lists for getting policy for any cpu - Also dropped fallback list and its per-cpu variable - Stopped cleaning cpufreq_cpu_data and doing list_del(policy) on logical hotplug. - Added support for physical hotplug of CPUs (Untested).
@Srivatsa: Can you please have a look at the above change? I have cc'd you only on this one.
Saravana Kannan (1): cpufreq: Track cpu managing sysfs kobjects separately
Cc: Srivatsa Bhat srivatsa@mit.edu
Viresh Kumar (19): cpufreq: Add doc style comment about cpufreq_cpu_{get|put}() cpufreq: Merge __cpufreq_add_dev() and cpufreq_add_dev() cpufreq: Throw warning when we try to get policy for an invalid CPU cpufreq: Keep a single path for adding managed CPUs cpufreq: Clear policy->cpus even for the last CPU cpufreq: Create for_each_{in}active_policy() cpufreq: Call schedule_work() for the last active policy cpufreq: Don't clear cpufreq_cpu_data and policy list for inactive policies cpufreq: Get rid of cpufreq_cpu_data_fallback cpufreq: Don't traverse list of all policies for adding policy for a cpu cpufreq: Manage governor usage history with 'policy->last_governor' cpufreq: Mark policy->governor = NULL for inactive policies cpufreq: Don't allow updating inactive-policies from sysfs cpufreq: Stop migrating sysfs files on hotplug cpufreq: Remove cpufreq_update_policy() cpufreq: Initialize policy->kobj while allocating policy cpufreq: Call cpufreq_policy_put_kobj() from cpufreq_policy_free() cpufreq: Restart governor as soon as possible cpufreq: Add support for physical hoplug of CPUs
drivers/cpufreq/cpufreq.c | 593 ++++++++++++++++++++++++++-------------------- include/linux/cpufreq.h | 5 +- 2 files changed, 340 insertions(+), 258 deletions(-)
This clearly states what the code inside these routines is doing and how these must be used.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 28e59a48b35f..fc51b8fbb0b0 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -207,6 +207,23 @@ struct cpufreq_policy *cpufreq_cpu_get_raw(unsigned int cpu) return per_cpu(cpufreq_cpu_data, cpu); }
+/** + * cpufreq_cpu_get: returns policy for a cpu and marks it busy. + * + * @cpu: cpu to find policy for. + * + * This returns policy for 'cpu', returns NULL if it doesn't exist. + * It also increments the kobject reference count to mark it busy and so would + * require a corresponding call to cpufreq_cpu_put() to decrement it back. + * If corresponding call cpufreq_cpu_put() isn't made, the policy wouldn't be + * freed as that depends on the kobj count. + * + * It also takes a read-lock of 'cpufreq_rwsem' and doesn't put it back if a + * valid policy is found. This is done to make sure the driver doesn't get + * unregistered while the policy is being used. + * + * Return: A valid policy on success, otherwise NULL on failure. + */ struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu) { struct cpufreq_policy *policy = NULL; @@ -237,6 +254,16 @@ struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu) } EXPORT_SYMBOL_GPL(cpufreq_cpu_get);
+/** + * cpufreq_cpu_put: Decrements the usage count of a policy + * + * @policy: policy earlier returned by cpufreq_cpu_get(). + * + * This decrements the kobject reference count incremented earlier by calling + * cpufreq_cpu_get(). + * + * It also drops the read-lock of 'cpufreq_rwsem' taken at cpufreq_cpu_get(). + */ void cpufreq_cpu_put(struct cpufreq_policy *policy) { kobject_put(&policy->kobj);
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
This clearly states what the code inside these routines is doing and how these must be used.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 28e59a48b35f..fc51b8fbb0b0 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -207,6 +207,23 @@ struct cpufreq_policy *cpufreq_cpu_get_raw(unsigned int cpu) return per_cpu(cpufreq_cpu_data, cpu); }
+/**
- cpufreq_cpu_get: returns policy for a cpu and marks it busy.
- @cpu: cpu to find policy for.
- This returns policy for 'cpu', returns NULL if it doesn't exist.
- It also increments the kobject reference count to mark it busy and so would
- require a corresponding call to cpufreq_cpu_put() to decrement it back.
- If corresponding call cpufreq_cpu_put() isn't made, the policy wouldn't be
- freed as that depends on the kobj count.
- It also takes a read-lock of 'cpufreq_rwsem' and doesn't put it back if a
- valid policy is found. This is done to make sure the driver doesn't get
- unregistered while the policy is being used.
- Return: A valid policy on success, otherwise NULL on failure.
- */ struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu) { struct cpufreq_policy *policy = NULL;
@@ -237,6 +254,16 @@ struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu) } EXPORT_SYMBOL_GPL(cpufreq_cpu_get);
+/**
- cpufreq_cpu_put: Decrements the usage count of a policy
- @policy: policy earlier returned by cpufreq_cpu_get().
- This decrements the kobject reference count incremented earlier by calling
- cpufreq_cpu_get().
- It also drops the read-lock of 'cpufreq_rwsem' taken at cpufreq_cpu_get().
- */ void cpufreq_cpu_put(struct cpufreq_policy *policy) { kobject_put(&policy->kobj);
Acked-by: Saravana Kannan skannan@codeaurora.org
-Saravana
cpufreq_add_dev() is an unnecessary wrapper over __cpufreq_add_dev(). Merge them.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 29 ++++++++++++----------------- 1 file changed, 12 insertions(+), 17 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index fc51b8fbb0b0..4884caf92bff 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1125,7 +1125,16 @@ static int update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu, return 0; }
-static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) +/** + * cpufreq_add_dev - add a CPU device + * + * Adds the cpufreq interface for a CPU device. + * + * The Oracle says: try running cpufreq registration/unregistration concurrently + * with with cpu hotplugging and all hell will break loose. Tried to clean this + * mess up, but more thorough testing is needed. - Mathieu + */ +static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) { unsigned int j, cpu = dev->id; int ret = -ENOMEM; @@ -1336,20 +1345,6 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) return ret; }
-/** - * cpufreq_add_dev - add a CPU device - * - * Adds the cpufreq interface for a CPU device. - * - * The Oracle says: try running cpufreq registration/unregistration concurrently - * with with cpu hotplugging and all hell will break loose. Tried to clean this - * mess up, but more thorough testing is needed. - Mathieu - */ -static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) -{ - return __cpufreq_add_dev(dev, sif); -} - static int __cpufreq_remove_dev_prepare(struct device *dev, struct subsys_interface *sif) { @@ -2328,7 +2323,7 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb, if (dev) { switch (action & ~CPU_TASKS_FROZEN) { case CPU_ONLINE: - __cpufreq_add_dev(dev, NULL); + cpufreq_add_dev(dev, NULL); break;
case CPU_DOWN_PREPARE: @@ -2340,7 +2335,7 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb, break;
case CPU_DOWN_FAILED: - __cpufreq_add_dev(dev, NULL); + cpufreq_add_dev(dev, NULL); break; } }
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
cpufreq_add_dev() is an unnecessary wrapper over __cpufreq_add_dev(). Merge them.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 29 ++++++++++++----------------- 1 file changed, 12 insertions(+), 17 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index fc51b8fbb0b0..4884caf92bff 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1125,7 +1125,16 @@ static int update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu, return 0; }
-static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) +/**
- cpufreq_add_dev - add a CPU device
- Adds the cpufreq interface for a CPU device.
- The Oracle says: try running cpufreq registration/unregistration concurrently
- with with cpu hotplugging and all hell will break loose. Tried to clean this
- mess up, but more thorough testing is needed. - Mathieu
- */
+static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) { unsigned int j, cpu = dev->id; int ret = -ENOMEM; @@ -1336,20 +1345,6 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) return ret; }
-/**
- cpufreq_add_dev - add a CPU device
- Adds the cpufreq interface for a CPU device.
- The Oracle says: try running cpufreq registration/unregistration concurrently
- with with cpu hotplugging and all hell will break loose. Tried to clean this
- mess up, but more thorough testing is needed. - Mathieu
- */
-static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) -{
- return __cpufreq_add_dev(dev, sif);
-}
- static int __cpufreq_remove_dev_prepare(struct device *dev, struct subsys_interface *sif) {
@@ -2328,7 +2323,7 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb, if (dev) { switch (action & ~CPU_TASKS_FROZEN) { case CPU_ONLINE:
__cpufreq_add_dev(dev, NULL);
cpufreq_add_dev(dev, NULL); break;
case CPU_DOWN_PREPARE:
@@ -2340,7 +2335,7 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb, break;
case CPU_DOWN_FAILED:
__cpufreq_add_dev(dev, NULL);
} }cpufreq_add_dev(dev, NULL); break;
Acked-by: Saravana Kannan skannan@codeaurora.org
-Saravana
Simply returning here with an error is not enough. It shouldn't be allowed at all to try calling cpufreq_cpu_get() for an invalid CPU.
Add a WARN here to make it clear that it wouldn't be acceptable at all.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 4884caf92bff..9da8ed5bdd21 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -229,7 +229,7 @@ struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu) struct cpufreq_policy *policy = NULL; unsigned long flags;
- if (cpu >= nr_cpu_ids) + if (WARN_ON(cpu >= nr_cpu_ids)) return NULL;
if (!down_read_trylock(&cpufreq_rwsem))
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
Simply returning here with an error is not enough. It shouldn't be allowed at all to try calling cpufreq_cpu_get() for an invalid CPU.
Add a WARN here to make it clear that it wouldn't be acceptable at all.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 4884caf92bff..9da8ed5bdd21 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -229,7 +229,7 @@ struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu) struct cpufreq_policy *policy = NULL; unsigned long flags;
- if (cpu >= nr_cpu_ids)
if (WARN_ON(cpu >= nr_cpu_ids)) return NULL;
if (!down_read_trylock(&cpufreq_rwsem))
Acked-by: Saravana Kannan skannan@codeaurora.org
-Saravana
There are two cases when we may try to add already managed CPUs: - On boot, the first cpu has marked all policy->cpus managed and so we will find policy for all other policy->cpus later on. - When a managed cpu is hotplugged out and later brought back in.
Currently we manage these two with separate paths and checks. While the first one is detected by testing cpu against 'policy->cpus', the other one is detected by testing cpu against 'policy->related_cpus'.
We can manage both these via a single path and there is no need to do special checking for the first one.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 9da8ed5bdd21..d06d1241a900 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -992,6 +992,10 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, int ret = 0; unsigned long flags;
+ /* Is cpu already managed ? */ + if (cpumask_test_cpu(cpu, policy->cpus)) + return 0; + if (has_target()) { ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP); if (ret) { @@ -1147,16 +1151,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
pr_debug("adding CPU %u\n", cpu);
- /* check whether a different CPU already registered this - * CPU because it is in the same boat. */ - policy = cpufreq_cpu_get_raw(cpu); - if (unlikely(policy)) - return 0; - if (!down_read_trylock(&cpufreq_rwsem)) return 0;
- /* Check if this cpu was hot-unplugged earlier and has siblings */ + /* Check if this cpu already has a policy to manage it */ read_lock_irqsave(&cpufreq_driver_lock, flags); for_each_policy(policy) { if (cpumask_test_cpu(cpu, policy->related_cpus)) {
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
There are two cases when we may try to add already managed CPUs:
The term manages CPUs is a bit unclear/confusing. When I hear managed CPUs, I think CPUs that are already marked in policy->cpus and the governor has already been started on them.
Can you clarify this commit text please?
- On boot, the first cpu has marked all policy->cpus managed and so we will find policy for all other policy->cpus later on.
- When a managed cpu is hotplugged out and later brought back in.
Currently we manage these two with separate paths and checks. While the first one is detected by testing cpu against 'policy->cpus', the other one is detected by testing cpu against 'policy->related_cpus'.
We can manage both these via a single path and there is no need to do special checking for the first one.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 9da8ed5bdd21..d06d1241a900 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -992,6 +992,10 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, int ret = 0; unsigned long flags;
- /* Is cpu already managed ? */
- if (cpumask_test_cpu(cpu, policy->cpus))
return 0;
- if (has_target()) { ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP); if (ret) {
@@ -1147,16 +1151,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
pr_debug("adding CPU %u\n", cpu);
/* check whether a different CPU already registered this
* CPU because it is in the same boat. */
policy = cpufreq_cpu_get_raw(cpu);
if (unlikely(policy))
return 0;
if (!down_read_trylock(&cpufreq_rwsem)) return 0;
/* Check if this cpu was hot-unplugged earlier and has siblings */
- /* Check if this cpu already has a policy to manage it */ read_lock_irqsave(&cpufreq_driver_lock, flags); for_each_policy(policy) { if (cpumask_test_cpu(cpu, policy->related_cpus)) {
If comment text is fixed, Acked-by: Saravana Kannan skannan@codeaurora.org
-Saravana
On 20 March 2015 at 06:07, Saravana Kannan skannan@codeaurora.org wrote:
The term manages CPUs is a bit unclear/confusing. When I hear managed CPUs, I think CPUs that are already marked in policy->cpus and the governor has already been started on them.
Can you clarify this commit text please?
Sure.
We clear policy->cpus mask while CPUs are hotplugged out. We do it for all CPUs except the last CPU of the policy. I don't remember what the rationale behind that was, but I couldn't think of anything that will break if we remove this conditional clearing and always clear policy->cpus.
The benefit we get out of it is, we can know if a policy is active or not by checking if this field is empty or not. That will be used by later commits.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index d06d1241a900..6ed87d02d293 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1430,9 +1430,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
down_write(&policy->rwsem); cpus = cpumask_weight(policy->cpus); - - if (cpus > 1) - cpumask_clear_cpu(cpu, policy->cpus); + cpumask_clear_cpu(cpu, policy->cpus); up_write(&policy->rwsem);
/* If cpu is last user of policy, free policy */
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
We clear policy->cpus mask while CPUs are hotplugged out. We do it for all CPUs except the last CPU of the policy. I don't remember what the rationale behind that was, but I couldn't think of anything that will break if we remove this conditional clearing and always clear policy->cpus.
The benefit we get out of it is, we can know if a policy is active or not by checking if this field is empty or not. That will be used by later commits.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index d06d1241a900..6ed87d02d293 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1430,9 +1430,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
down_write(&policy->rwsem); cpus = cpumask_weight(policy->cpus);
- if (cpus > 1)
cpumask_clear_cpu(cpu, policy->cpus);
cpumask_clear_cpu(cpu, policy->cpus); up_write(&policy->rwsem);
/* If cpu is last user of policy, free policy */
Acked-by: Saravana Kannan skannan@codeaurora.org
-Saravana
policy->cpus is cleared unconditionally now on hotplug-out of a CPU and it can be checked to know if a policy is active or inactive. Create helper routines to iterate over all active/inactive policies, based on policy->cpus field.
Replace all instances of for_each_policy() with for_each_active_policy() to make them iterate only for active policies. (We haven't made changes yet to keep inactive policies in the same list, but that will be followed in a later patch).
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 81 +++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 72 insertions(+), 9 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 6ed87d02d293..d3f9ce3b94d3 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -32,11 +32,74 @@ #include <trace/events/power.h>
/* Macros to iterate over lists */ -/* Iterate over online CPUs policies */ static LIST_HEAD(cpufreq_policy_list); +static inline bool policy_is_inactive(struct cpufreq_policy *policy) +{ + return cpumask_empty(policy->cpus); +} + +/* + * Finds Next Acive/Inactive policy + * + * policy: Previous policy. + * active: Looking for active policy (true) or Inactive policy (false). + */ +static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy, + bool active) +{ + while (1) { + if (likely(policy)) + policy = list_next_entry(policy, policy_list); + else + policy = list_first_entry(&cpufreq_policy_list, + typeof(*policy), policy_list); + + /* No more policies */ + if (&policy->policy_list == &cpufreq_policy_list) + return policy; + + /* + * Table to explains logic behind following expression: + * + * Active policy_is_inactive Result + * (policy or next) + * + * 0 0 next + * 0 1 policy + * 1 0 policy + * 1 1 next + */ + if (active ^ policy_is_inactive(policy)) + return policy; + }; +} + +/* + * Iterate over online CPUs policies + * + * Explanation: + * - expr1: marks __temp NULL and gets the first __active policy. + * - expr2: checks if list is finished, if yes then it sets __policy to the last + * __active policy and returns 0 to end the loop. + * - expr3: preserves last __active policy and gets the next one. + */ +#define __for_each_active_policy(__policy, __temp, __active) \ + for (__temp = NULL, __policy = next_policy(NULL, __active); \ + &__policy->policy_list != &cpufreq_policy_list || \ + ((__policy = __temp) && 0); \ + __temp = __policy, __policy = next_policy(__policy, __active)) + #define for_each_policy(__policy) \ list_for_each_entry(__policy, &cpufreq_policy_list, policy_list)
+/* + * Routines to iterate over active or inactive policies + * __policy: next active/inactive policy will be returned in this. + * __temp: for internal purpose, not to be used by caller. + */ +#define for_each_active_policy(__policy, __temp) __for_each_active_policy(__policy, __temp, true) +#define for_each_inactive_policy(__policy, __temp) __for_each_active_policy(__policy, __temp, false) + /* Iterate over governors */ static LIST_HEAD(cpufreq_governor_list); #define for_each_governor(__governor) \ @@ -1142,7 +1205,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) { unsigned int j, cpu = dev->id; int ret = -ENOMEM; - struct cpufreq_policy *policy; + struct cpufreq_policy *policy, *tpolicy; unsigned long flags; bool recover_policy = cpufreq_suspended;
@@ -1156,7 +1219,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
/* Check if this cpu already has a policy to manage it */ read_lock_irqsave(&cpufreq_driver_lock, flags); - for_each_policy(policy) { + for_each_active_policy(policy, tpolicy) { if (cpumask_test_cpu(cpu, policy->related_cpus)) { read_unlock_irqrestore(&cpufreq_driver_lock, flags); ret = cpufreq_add_policy_cpu(policy, cpu, dev); @@ -1664,7 +1727,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend); */ void cpufreq_suspend(void) { - struct cpufreq_policy *policy; + struct cpufreq_policy *policy, *tpolicy;
if (!cpufreq_driver) return; @@ -1674,7 +1737,7 @@ void cpufreq_suspend(void)
pr_debug("%s: Suspending Governors\n", __func__);
- for_each_policy(policy) { + for_each_active_policy(policy, tpolicy) { if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP)) pr_err("%s: Failed to stop governor for policy: %p\n", __func__, policy); @@ -1696,7 +1759,7 @@ void cpufreq_suspend(void) */ void cpufreq_resume(void) { - struct cpufreq_policy *policy; + struct cpufreq_policy *policy, *tpolicy;
if (!cpufreq_driver) return; @@ -1708,7 +1771,7 @@ void cpufreq_resume(void)
pr_debug("%s: Resuming Governors\n", __func__);
- for_each_policy(policy) { + for_each_active_policy(policy, tpolicy) { if (cpufreq_driver->resume && cpufreq_driver->resume(policy)) pr_err("%s: Failed to resume driver: %p\n", __func__, policy); @@ -2348,10 +2411,10 @@ static struct notifier_block __refdata cpufreq_cpu_notifier = { static int cpufreq_boost_set_sw(int state) { struct cpufreq_frequency_table *freq_table; - struct cpufreq_policy *policy; + struct cpufreq_policy *policy, *tpolicy; int ret = -EINVAL;
- for_each_policy(policy) { + for_each_active_policy(policy, tpolicy) { freq_table = cpufreq_frequency_get_table(policy->cpu); if (freq_table) { ret = cpufreq_frequency_table_cpuinfo(policy,
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
policy->cpus is cleared unconditionally now on hotplug-out of a CPU and it can be checked to know if a policy is active or inactive. Create helper routines to iterate over all active/inactive policies, based on policy->cpus field.
Replace all instances of for_each_policy() with for_each_active_policy() to make them iterate only for active policies. (We haven't made changes yet to keep inactive policies in the same list, but that will be followed in a later patch).
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 81 +++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 72 insertions(+), 9 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 6ed87d02d293..d3f9ce3b94d3 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -32,11 +32,74 @@ #include <trace/events/power.h>
/* Macros to iterate over lists */ -/* Iterate over online CPUs policies */ static LIST_HEAD(cpufreq_policy_list); +static inline bool policy_is_inactive(struct cpufreq_policy *policy) +{
- return cpumask_empty(policy->cpus);
+}
+/*
- Finds Next Acive/Inactive policy
- policy: Previous policy.
- active: Looking for active policy (true) or Inactive policy (false).
- */
+static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy,
bool active)
+{
- while (1) {
I don't like while(1) unless it's really meant to be an infinite loop. I think a do while would work here and also be more compact and readable.
if (likely(policy))
policy = list_next_entry(policy, policy_list);
else
policy = list_first_entry(&cpufreq_policy_list,
typeof(*policy), policy_list);
Can't you just move this part into expr1? That would make it much clear/easier to understand
/* No more policies */
if (&policy->policy_list == &cpufreq_policy_list)
return policy;
I'm kinda confused by the fact that you return policy here unconditionally. I think it's a bug. No? Eg: Is there's only one policy in the system and you are looking for an inactive policy. Wouldn't this return the only policy -- the one that's active.
/*
* Table to explains logic behind following expression:
*
* Active policy_is_inactive Result
* (policy or next)
*
* 0 0 next
* 0 1 policy
* 1 0 policy
* 1 1 next
*/
if (active ^ policy_is_inactive(policy))
return policy;
- };
+}
+/*
- Iterate over online CPUs policies
- Explanation:
- expr1: marks __temp NULL and gets the first __active policy.
- expr2: checks if list is finished, if yes then it sets __policy to the last
__active policy and returns 0 to end the loop.
- expr3: preserves last __active policy and gets the next one.
- */
+#define __for_each_active_policy(__policy, __temp, __active) \
- for (__temp = NULL, __policy = next_policy(NULL, __active); \
&__policy->policy_list != &cpufreq_policy_list || \
((__policy = __temp) && 0); \
__temp = __policy, __policy = next_policy(__policy, __active))
- #define for_each_policy(__policy) \ list_for_each_entry(__policy, &cpufreq_policy_list, policy_list)
+/*
- Routines to iterate over active or inactive policies
- __policy: next active/inactive policy will be returned in this.
- __temp: for internal purpose, not to be used by caller.
- */
+#define for_each_active_policy(__policy, __temp) __for_each_active_policy(__policy, __temp, true) +#define for_each_inactive_policy(__policy, __temp) __for_each_active_policy(__policy, __temp, false)
- /* Iterate over governors */ static LIST_HEAD(cpufreq_governor_list); #define for_each_governor(__governor) \
Stuff below this looks fine.
@@ -1142,7 +1205,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) { unsigned int j, cpu = dev->id; int ret = -ENOMEM;
- struct cpufreq_policy *policy;
- struct cpufreq_policy *policy, *tpolicy; unsigned long flags; bool recover_policy = cpufreq_suspended;
<snip>
-Saravana
On 20 March 2015 at 06:31, Saravana Kannan skannan@codeaurora.org wrote:
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
+static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy,
bool active)
+{
while (1) {
I don't like while(1) unless it's really meant to be an infinite loop. I
I don't hate it that much, and neither does other parts of kernel :)
think a do while would work here and also be more compact and readable.
So you want this:
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index d3f9ce3b94d3..ecbd8c2118c2 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -47,7 +47,7 @@ static inline bool policy_is_inactive(struct cpufreq_policy *policy) static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy, bool active) { - while (1) { + do { if (likely(policy)) policy = list_next_entry(policy, policy_list); else @@ -69,9 +69,9 @@ static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy, * 1 0 policy * 1 1 next */ - if (active ^ policy_is_inactive(policy)) - return policy; - }; + } while (!(active ^ policy_is_inactive(policy))); + + return policy; }
Not sure which one looked better :)
if (likely(policy))
policy = list_next_entry(policy, policy_list);
else
policy = list_first_entry(&cpufreq_policy_list,
typeof(*policy),
policy_list);
Can't you just move this part into expr1? That would make it much clear/easier to understand
No, because we want that for-loop to iterate over active/inactive policies only, and we need to run this routine to find it..
For ex: - We want to iterate over active policies only - The first policy of the list is inactive - The change you are suggesting will break things here..
/* No more policies */
if (&policy->policy_list == &cpufreq_policy_list)
return policy;
I'm kinda confused by the fact that you return policy here unconditionally. I think it's a bug. No? Eg: Is there's only one policy in the system and you are looking for an inactive policy. Wouldn't this return the only policy -- the one that's active.
No. What we are returning here isn't a real policy actually but an container-of of the HEAD of the list, so it only has a valid ->policy_list value, others might give us a crash. See how list_next_entry() works :)
On 03/19/2015 09:41 PM, Viresh Kumar wrote:
On 20 March 2015 at 06:31, Saravana Kannan skannan@codeaurora.org wrote:
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
+static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy,
bool active)
+{
while (1) {
I don't like while(1) unless it's really meant to be an infinite loop. I
I don't hate it that much, and neither does other parts of kernel :)
think a do while would work here and also be more compact and readable.
So you want this:
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index d3f9ce3b94d3..ecbd8c2118c2 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -47,7 +47,7 @@ static inline bool policy_is_inactive(struct cpufreq_policy *policy) static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy, bool active) {
while (1) {
do { if (likely(policy)) policy = list_next_entry(policy, policy_list); else
@@ -69,9 +69,9 @@ static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy, * 1 0 policy * 1 1 next */
if (active ^ policy_is_inactive(policy))
return policy;
};
} while (!(active ^ policy_is_inactive(policy)));
}return policy;
Yes please!! The other uses like inside a thread seem more reasonable to me.
Not sure which one looked better :)
if (likely(policy))
policy = list_next_entry(policy, policy_list);
else
policy = list_first_entry(&cpufreq_policy_list,
typeof(*policy),
policy_list);
Can't you just move this part into expr1? That would make it much clear/easier to understand
No, because we want that for-loop to iterate over active/inactive policies only, and we need to run this routine to find it..
For ex:
- We want to iterate over active policies only
- The first policy of the list is inactive
- The change you are suggesting will break things here..
Ah, right. Makes sense.
/* No more policies */
if (&policy->policy_list == &cpufreq_policy_list)
return policy;
I'm kinda confused by the fact that you return policy here unconditionally. I think it's a bug. No? Eg: Is there's only one policy in the system and you are looking for an inactive policy. Wouldn't this return the only policy -- the one that's active.
No. What we are returning here isn't a real policy actually but an container-of of the HEAD of the list, so it only has a valid ->policy_list value, others might give us a crash. See how list_next_entry() works :)
I thought the last valid entry is what would point to the list head. Not the actual list head. I'll check again.
-Saravana
On Friday, March 20, 2015 12:18:49 PM Saravana Kannan wrote:
On 03/19/2015 09:41 PM, Viresh Kumar wrote:
On 20 March 2015 at 06:31, Saravana Kannan skannan@codeaurora.org wrote:
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
+static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy,
bool active)
+{
while (1) {
I don't like while(1) unless it's really meant to be an infinite loop. I
I don't hate it that much, and neither does other parts of kernel :)
think a do while would work here and also be more compact and readable.
So you want this:
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index d3f9ce3b94d3..ecbd8c2118c2 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -47,7 +47,7 @@ static inline bool policy_is_inactive(struct cpufreq_policy *policy) static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy, bool active) {
while (1) {
do { if (likely(policy)) policy = list_next_entry(policy, policy_list); else
@@ -69,9 +69,9 @@ static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy, * 1 0 policy * 1 1 next */
if (active ^ policy_is_inactive(policy))
return policy;
};
} while (!(active ^ policy_is_inactive(policy)));
}return policy;
Yes please!! The other uses like inside a thread seem more reasonable to me.
Viresh, have you ever sent an update of this one?
On 8 May 2015 at 03:41, Rafael J. Wysocki rjw@rjwysocki.net wrote:
Viresh, have you ever sent an update of this one?
I haven't sent updates for any patches, was waiting for reviews to finish. But will send the new copies now.
We need to call schedule_work() only for the policy belonging to boot CPU, i.e. CPU0. Checking for list_is_last() wouldn't work anymore as there can be inactive policies present in the list after the last active one. 'policy' still points to the last active policy at the termination of the for_each_active_policy() loop, use that to call schedule_work().
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index d3f9ce3b94d3..e728c796c327 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1779,15 +1779,14 @@ void cpufreq_resume(void) || __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS)) pr_err("%s: Failed to start governor for policy: %p\n", __func__, policy); - - /* - * schedule call cpufreq_update_policy() for boot CPU, i.e. last - * policy in list. It will verify that the current freq is in - * sync with what we believe it to be. - */ - if (list_is_last(&policy->policy_list, &cpufreq_policy_list)) - schedule_work(&policy->update); } + + /* + * schedule call cpufreq_update_policy() for boot CPU, i.e. last + * policy in list. It will verify that the current freq is in + * sync with what we believe it to be. + */ + schedule_work(&policy->update); }
/**
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
We need to call schedule_work() only for the policy belonging to boot CPU, i.e. CPU0. Checking for list_is_last() wouldn't work anymore as there can be inactive policies present in the list after the last active one. 'policy' still points to the last active policy at the termination of the for_each_active_policy() loop, use that to call schedule_work().
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index d3f9ce3b94d3..e728c796c327 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1779,15 +1779,14 @@ void cpufreq_resume(void) || __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS)) pr_err("%s: Failed to start governor for policy: %p\n", __func__, policy);
/*
* schedule call cpufreq_update_policy() for boot CPU, i.e. last
* policy in list. It will verify that the current freq is in
* sync with what we believe it to be.
*/
if (list_is_last(&policy->policy_list, &cpufreq_policy_list))
}schedule_work(&policy->update);
/*
* schedule call cpufreq_update_policy() for boot CPU, i.e. last
* policy in list. It will verify that the current freq is in
* sync with what we believe it to be.
*/
schedule_work(&policy->update); }
/**
Ok, I don't think this will work after the next patch/end of the series.
Some ground rules/clarification: When the suspend/resume code and this cpufreq_suspend/resume functions talk about boot CPU, they don't really mean the boot CPU. It really means the CPU with the smallest CPU ID when suspend is initiated. For example in a 4 core system, if CPU0 is hotplugged off, then "boot CPU" in the context of suspend/resume is actually CPU1. That's the reason we can't simply do "get policy of CPU0" and the schedule work.
So what this patch really is assuming is that the last active policy in the policy list corresponds to the CPU with the smallest CPU ID when suspend is initiated.
That was true when we added/deleted policies when CPUs are hotplugged on/off (even during lightweight tear-down, we did add/delete). However, at the end of the series, we won't be adding/deleting these policies from the list for each hotplug add/remove. So, the last active policy in the list doesn't always correspond to the online CPU with the smallest CPU ID during suspend/resume.
Here's an example case: * Assume we never physically hot plug the CPUs. * Assume it's a system with 4 independent CPUs. * We boot with just CPU0 active. * So, the policy list will now have: CPU0 policy. * CPU2 is then brought online. * So, the policy list will now have: CPU2 policy, CPU0 policy. * CPU1 is then brought online. * So, the policy list will now have: CPU1 policy CPU2 policy, CPU0 policy. * CPU0 is taken offline. * So, the policy list will now have: CPU1 policy CPU2 policy, CPU0 inactive policy. * Now suspend happens and then we resume. * We really need to be scheduling the work for CPU1 since that's the "boot cpu" for suspend/resume. * But the last active policy is for CPU2 and we schedule work for that.
Nack for the patch, unless I missed some list maintenance code.
-Saravana
On 2 April 2015 at 09:10, Saravana Kannan skannan@codeaurora.org wrote:
Ok, I don't think this will work after the next patch/end of the series.
Sigh, Its broken today as well. See my another patch:
[PATCH] cpufreq: Schedule work for the first-online CPU on resume
This patch is dropped now.
On Thursday, April 02, 2015 10:32:54 AM Viresh Kumar wrote:
On 2 April 2015 at 09:10, Saravana Kannan skannan@codeaurora.org wrote:
Ok, I don't think this will work after the next patch/end of the series.
Sigh, Its broken today as well. See my another patch:
[PATCH] cpufreq: Schedule work for the first-online CPU on resume
This patch is dropped now.
What exactly does this mean?
On 8 May 2015 at 03:43, Rafael J. Wysocki rjw@rjwysocki.net wrote:
What exactly does this mean?
That the $Subject patch is replaced by:
c75de0ac0756 ("cpufreq: Schedule work for the first-online CPU on resume")
as its not that my series has broken what Saravana has found out. But it was broken even without my series.
Now that we can check policy->cpus to find if policy is active or not, we don't need to clean cpufreq_cpu_data and delete policy from the list on light weight tear down of policies (like in suspend).
To make it consistent and clean, set cpufreq_cpu_data for all related CPUs when the policy is first created and clean it only while it is freed.
Also update cpufreq_cpu_get_raw() to check if cpu is part of policy->cpus mask, so that we don't end up getting policies for unmanaged CPUs.
In order to make sure that no users of 'policy' are using an inactive policy, use cpufreq_cpu_get_raw() instead of directly accessing cpufreq_cpu_data.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 79 ++++++++++++++++++++--------------------------- 1 file changed, 34 insertions(+), 45 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index e728c796c327..e27b2a7bd9b3 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -250,10 +250,18 @@ int cpufreq_generic_init(struct cpufreq_policy *policy, } EXPORT_SYMBOL_GPL(cpufreq_generic_init);
-unsigned int cpufreq_generic_get(unsigned int cpu) +/* Only for cpufreq core internal use */ +struct cpufreq_policy *cpufreq_cpu_get_raw(unsigned int cpu) { struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
+ return policy && cpumask_test_cpu(cpu, policy->cpus) ? policy : NULL; +} + +unsigned int cpufreq_generic_get(unsigned int cpu) +{ + struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu); + if (!policy || IS_ERR(policy->clk)) { pr_err("%s: No %s associated to cpu: %d\n", __func__, policy ? "clk" : "policy", cpu); @@ -264,12 +272,6 @@ unsigned int cpufreq_generic_get(unsigned int cpu) } EXPORT_SYMBOL_GPL(cpufreq_generic_get);
-/* Only for cpufreq core internal use */ -struct cpufreq_policy *cpufreq_cpu_get_raw(unsigned int cpu) -{ - return per_cpu(cpufreq_cpu_data, cpu); -} - /** * cpufreq_cpu_get: returns policy for a cpu and marks it busy. * @@ -303,7 +305,7 @@ struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu)
if (cpufreq_driver) { /* get the CPU */ - policy = per_cpu(cpufreq_cpu_data, cpu); + policy = cpufreq_cpu_get_raw(cpu); if (policy) kobject_get(&policy->kobj); } @@ -1053,7 +1055,6 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu, struct device *dev) { int ret = 0; - unsigned long flags;
/* Is cpu already managed ? */ if (cpumask_test_cpu(cpu, policy->cpus)) @@ -1068,13 +1069,7 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, }
down_write(&policy->rwsem); - - write_lock_irqsave(&cpufreq_driver_lock, flags); - cpumask_set_cpu(cpu, policy->cpus); - per_cpu(cpufreq_cpu_data, cpu) = policy; - write_unlock_irqrestore(&cpufreq_driver_lock, flags); - up_write(&policy->rwsem);
if (has_target()) { @@ -1165,6 +1160,17 @@ static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy)
static void cpufreq_policy_free(struct cpufreq_policy *policy) { + unsigned long flags; + int cpu; + + /* Remove policy from list */ + write_lock_irqsave(&cpufreq_driver_lock, flags); + list_del(&policy->policy_list); + + for_each_cpu(cpu, policy->related_cpus) + per_cpu(cpufreq_cpu_data, cpu) = NULL; + write_unlock_irqrestore(&cpufreq_driver_lock, flags); + free_cpumask_var(policy->related_cpus); free_cpumask_var(policy->cpus); kfree(policy); @@ -1286,12 +1292,12 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) __func__, ret); goto err_init_policy_kobj; } - }
- write_lock_irqsave(&cpufreq_driver_lock, flags); - for_each_cpu(j, policy->cpus) - per_cpu(cpufreq_cpu_data, j) = policy; - write_unlock_irqrestore(&cpufreq_driver_lock, flags); + write_lock_irqsave(&cpufreq_driver_lock, flags); + for_each_cpu(j, policy->related_cpus) + per_cpu(cpufreq_cpu_data, j) = policy; + write_unlock_irqrestore(&cpufreq_driver_lock, flags); + }
if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { policy->cur = cpufreq_driver->get(policy->cpu); @@ -1350,11 +1356,11 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) goto err_out_unregister; blocking_notifier_call_chain(&cpufreq_policy_notifier_list, CPUFREQ_CREATE_POLICY, policy); - }
- write_lock_irqsave(&cpufreq_driver_lock, flags); - list_add(&policy->policy_list, &cpufreq_policy_list); - write_unlock_irqrestore(&cpufreq_driver_lock, flags); + write_lock_irqsave(&cpufreq_driver_lock, flags); + list_add(&policy->policy_list, &cpufreq_policy_list); + write_unlock_irqrestore(&cpufreq_driver_lock, flags); + }
cpufreq_init_policy(policy);
@@ -1378,11 +1384,6 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
err_out_unregister: err_get_freq: - write_lock_irqsave(&cpufreq_driver_lock, flags); - for_each_cpu(j, policy->cpus) - per_cpu(cpufreq_cpu_data, j) = NULL; - write_unlock_irqrestore(&cpufreq_driver_lock, flags); - if (!recover_policy) { kobject_put(&policy->kobj); wait_for_completion(&policy->kobj_unregister); @@ -1418,7 +1419,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
write_lock_irqsave(&cpufreq_driver_lock, flags);
- policy = per_cpu(cpufreq_cpu_data, cpu); + policy = cpufreq_cpu_get_raw(cpu);
/* Save the policy somewhere when doing a light-weight tear-down */ if (cpufreq_suspended) @@ -1476,15 +1477,9 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, static int __cpufreq_remove_dev_finish(struct device *dev, struct subsys_interface *sif) { - unsigned int cpu = dev->id, cpus; + unsigned int cpu = dev->id; int ret; - unsigned long flags; - struct cpufreq_policy *policy; - - write_lock_irqsave(&cpufreq_driver_lock, flags); - policy = per_cpu(cpufreq_cpu_data, cpu); - per_cpu(cpufreq_cpu_data, cpu) = NULL; - write_unlock_irqrestore(&cpufreq_driver_lock, flags); + struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu);
if (!policy) { pr_debug("%s: No cpu_data found\n", __func__); @@ -1492,12 +1487,11 @@ static int __cpufreq_remove_dev_finish(struct device *dev, }
down_write(&policy->rwsem); - cpus = cpumask_weight(policy->cpus); cpumask_clear_cpu(cpu, policy->cpus); up_write(&policy->rwsem);
/* If cpu is last user of policy, free policy */ - if (cpus == 1) { + if (policy_is_inactive(policy)) { if (has_target()) { ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT); @@ -1519,11 +1513,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev, if (cpufreq_driver->exit) cpufreq_driver->exit(policy);
- /* Remove policy from list of active policies */ - write_lock_irqsave(&cpufreq_driver_lock, flags); - list_del(&policy->policy_list); - write_unlock_irqrestore(&cpufreq_driver_lock, flags); - if (!cpufreq_suspended) cpufreq_policy_free(policy); } else if (has_target()) {
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
Now that we can check policy->cpus to find if policy is active or not, we don't need to clean cpufreq_cpu_data and delete policy from the list on light weight tear down of policies (like in suspend).
To make it consistent and clean, set cpufreq_cpu_data for all related CPUs when the policy is first created and clean it only while it is freed.
Also update cpufreq_cpu_get_raw() to check if cpu is part of policy->cpus mask, so that we don't end up getting policies for unmanaged CPUs.
When you say unmanaged CPUs, do you mean offline CPUs? If so, could you use that term instead? The term unmanaged is kinda ambiguous. This comment applies to the entire series.
In order to make sure that no users of 'policy' are using an inactive policy, use cpufreq_cpu_get_raw() instead of directly accessing cpufreq_cpu_data.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 79 ++++++++++++++++++++--------------------------- 1 file changed, 34 insertions(+), 45 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index e728c796c327..e27b2a7bd9b3 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -250,10 +250,18 @@ int cpufreq_generic_init(struct cpufreq_policy *policy, } EXPORT_SYMBOL_GPL(cpufreq_generic_init);
<SNIP>
@@ -1053,7 +1055,6 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu, struct device *dev) { int ret = 0;
unsigned long flags;
/* Is cpu already managed ? */ if (cpumask_test_cpu(cpu, policy->cpus))
@@ -1068,13 +1069,7 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, }
down_write(&policy->rwsem);
write_lock_irqsave(&cpufreq_driver_lock, flags);
cpumask_set_cpu(cpu, policy->cpus);
per_cpu(cpufreq_cpu_data, cpu) = policy;
write_unlock_irqrestore(&cpufreq_driver_lock, flags);
up_write(&policy->rwsem);
if (has_target()) {
@@ -1165,6 +1160,17 @@ static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy)
static void cpufreq_policy_free(struct cpufreq_policy *policy) {
- unsigned long flags;
- int cpu;
- /* Remove policy from list */
- write_lock_irqsave(&cpufreq_driver_lock, flags);
- list_del(&policy->policy_list);
- for_each_cpu(cpu, policy->related_cpus)
per_cpu(cpufreq_cpu_data, cpu) = NULL;
- write_unlock_irqrestore(&cpufreq_driver_lock, flags);
- free_cpumask_var(policy->related_cpus); free_cpumask_var(policy->cpus); kfree(policy);
@@ -1286,12 +1292,12 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) __func__, ret); goto err_init_policy_kobj; }
}
write_lock_irqsave(&cpufreq_driver_lock, flags);
for_each_cpu(j, policy->cpus)
per_cpu(cpufreq_cpu_data, j) = policy;
write_unlock_irqrestore(&cpufreq_driver_lock, flags);
write_lock_irqsave(&cpufreq_driver_lock, flags);
for_each_cpu(j, policy->related_cpus)
per_cpu(cpufreq_cpu_data, j) = policy;
write_unlock_irqrestore(&cpufreq_driver_lock, flags);
- }
Why not do this is cpufreq_policy_alloc() to be consistent/symmetric with cpufreq_policy_free()? You had to set cpufreq_cpu_data/add to the list this late in cpufreq_add_dev before because you shouldn't be adding a unfinished policy to the cpufreq_cpu_data/list. But now that you check for policy->cpus before returning it, you may be do all of this in cpufreq_policy_alloc()?
if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { policy->cur = cpufreq_driver->get(policy->cpu); @@ -1350,11 +1356,11 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) goto err_out_unregister; blocking_notifier_call_chain(&cpufreq_policy_notifier_list, CPUFREQ_CREATE_POLICY, policy);
}
write_lock_irqsave(&cpufreq_driver_lock, flags);
list_add(&policy->policy_list, &cpufreq_policy_list);
write_unlock_irqrestore(&cpufreq_driver_lock, flags);
write_lock_irqsave(&cpufreq_driver_lock, flags);
list_add(&policy->policy_list, &cpufreq_policy_list);
write_unlock_irqrestore(&cpufreq_driver_lock, flags);
- }
You could probably do this as part of policy alloc too?
cpufreq_init_policy(policy);
@@ -1378,11 +1384,6 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
Rest of the patch looks good.
-Saravana
On 2 April 2015 at 09:44, Saravana Kannan skannan@codeaurora.org wrote:
When you say unmanaged CPUs, do you mean offline CPUs? If so, could you use that term instead? The term unmanaged is kinda ambiguous. This comment applies to the entire series.
Hmm, will take a note of that.
write_lock_irqsave(&cpufreq_driver_lock, flags);
for_each_cpu(j, policy->related_cpus)
per_cpu(cpufreq_cpu_data, j) = policy;
write_unlock_irqrestore(&cpufreq_driver_lock, flags);
}
Why not do this is cpufreq_policy_alloc() to be consistent/symmetric with cpufreq_policy_free()? You had to set cpufreq_cpu_data/add to the list this late in cpufreq_add_dev before because you shouldn't be adding a unfinished policy to the cpufreq_cpu_data/list. But now that you check for policy->cpus before returning it, you may be do all of this in cpufreq_policy_alloc()?
We don't know related_cpus until the time ->init() is called and that happens after alloc()..
write_lock_irqsave(&cpufreq_driver_lock, flags);
list_add(&policy->policy_list, &cpufreq_policy_list);
write_unlock_irqrestore(&cpufreq_driver_lock, flags);
}
You could probably do this as part of policy alloc too?
I wouldn't like to add a unfinished policy (which may or maynot be simultaneously referred by other parts of code) to the list.
We can extract the same information from cpufreq_cpu_data as it is also available for inactive policies now.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 25 ++++++------------------- 1 file changed, 6 insertions(+), 19 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index e27b2a7bd9b3..f849b2a33d3e 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -112,7 +112,6 @@ static LIST_HEAD(cpufreq_governor_list); */ static struct cpufreq_driver *cpufreq_driver; static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data); -static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data_fallback); static DEFINE_RWLOCK(cpufreq_driver_lock); DEFINE_MUTEX(cpufreq_governor_lock);
@@ -1092,13 +1091,14 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu) unsigned long flags;
read_lock_irqsave(&cpufreq_driver_lock, flags); - - policy = per_cpu(cpufreq_cpu_data_fallback, cpu); - + policy = per_cpu(cpufreq_cpu_data, cpu); read_unlock_irqrestore(&cpufreq_driver_lock, flags);
- if (policy) + if (likely(policy)) { + /* Policy should be inactive here */ + WARN_ON(!policy_is_inactive(policy)); policy->governor = NULL; + }
return policy; } @@ -1394,11 +1394,8 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) if (cpufreq_driver->exit) cpufreq_driver->exit(policy); err_set_policy_cpu: - if (recover_policy) { - /* Do not leave stale fallback data behind. */ - per_cpu(cpufreq_cpu_data_fallback, cpu) = NULL; + if (recover_policy) cpufreq_policy_put_kobj(policy); - } cpufreq_policy_free(policy);
nomem_out: @@ -1412,21 +1409,11 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, { unsigned int cpu = dev->id, cpus; int ret; - unsigned long flags; struct cpufreq_policy *policy;
pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
- write_lock_irqsave(&cpufreq_driver_lock, flags); - policy = cpufreq_cpu_get_raw(cpu); - - /* Save the policy somewhere when doing a light-weight tear-down */ - if (cpufreq_suspended) - per_cpu(cpufreq_cpu_data_fallback, cpu) = policy; - - write_unlock_irqrestore(&cpufreq_driver_lock, flags); - if (!policy) { pr_debug("%s: No cpu_data found\n", __func__); return -EINVAL;
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
We can extract the same information from cpufreq_cpu_data as it is also available for inactive policies now.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 25 ++++++------------------- 1 file changed, 6 insertions(+), 19 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index e27b2a7bd9b3..f849b2a33d3e 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -112,7 +112,6 @@ static LIST_HEAD(cpufreq_governor_list); */ static struct cpufreq_driver *cpufreq_driver; static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data); -static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data_fallback); static DEFINE_RWLOCK(cpufreq_driver_lock); DEFINE_MUTEX(cpufreq_governor_lock);
@@ -1092,13 +1091,14 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu) unsigned long flags;
read_lock_irqsave(&cpufreq_driver_lock, flags);
- policy = per_cpu(cpufreq_cpu_data_fallback, cpu);
- policy = per_cpu(cpufreq_cpu_data, cpu); read_unlock_irqrestore(&cpufreq_driver_lock, flags);
- if (policy)
if (likely(policy)) {
/* Policy should be inactive here */
WARN_ON(!policy_is_inactive(policy));
policy->governor = NULL;
}
return policy; }
@@ -1394,11 +1394,8 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) if (cpufreq_driver->exit) cpufreq_driver->exit(policy); err_set_policy_cpu:
- if (recover_policy) {
/* Do not leave stale fallback data behind. */
per_cpu(cpufreq_cpu_data_fallback, cpu) = NULL;
- if (recover_policy) cpufreq_policy_put_kobj(policy);
} cpufreq_policy_free(policy);
nomem_out:
@@ -1412,21 +1409,11 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, { unsigned int cpu = dev->id, cpus; int ret;
unsigned long flags; struct cpufreq_policy *policy;
pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
write_lock_irqsave(&cpufreq_driver_lock, flags);
policy = cpufreq_cpu_get_raw(cpu);
/* Save the policy somewhere when doing a light-weight tear-down */
if (cpufreq_suspended)
per_cpu(cpufreq_cpu_data_fallback, cpu) = policy;
write_unlock_irqrestore(&cpufreq_driver_lock, flags);
if (!policy) { pr_debug("%s: No cpu_data found\n", __func__); return -EINVAL;
Acked-by: Saravana Kannan skannan@codeaurora.org
We reach here while adding policy for a CPU and enter into the 'if' block only if a policy already exists for the CPU in question. As cpufreq_cpu_data is filled once and for all now, we can reuse it to find the relevant policy instead of traversing the list of all policies.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index f849b2a33d3e..20d5f4590c4b 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1211,7 +1211,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) { unsigned int j, cpu = dev->id; int ret = -ENOMEM; - struct cpufreq_policy *policy, *tpolicy; + struct cpufreq_policy *policy; unsigned long flags; bool recover_policy = cpufreq_suspended;
@@ -1224,16 +1224,13 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) return 0;
/* Check if this cpu already has a policy to manage it */ - read_lock_irqsave(&cpufreq_driver_lock, flags); - for_each_active_policy(policy, tpolicy) { - if (cpumask_test_cpu(cpu, policy->related_cpus)) { - read_unlock_irqrestore(&cpufreq_driver_lock, flags); - ret = cpufreq_add_policy_cpu(policy, cpu, dev); - up_read(&cpufreq_rwsem); - return ret; - } + policy = per_cpu(cpufreq_cpu_data, cpu); + if (policy && !policy_is_inactive(policy)) { + WARN_ON(!cpumask_test_cpu(cpu, policy->related_cpus)); + ret = cpufreq_add_policy_cpu(policy, cpu, dev); + up_read(&cpufreq_rwsem); + return ret; } - read_unlock_irqrestore(&cpufreq_driver_lock, flags);
/* * Restore the saved policy when doing light-weight init and fall back
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
We reach here while adding policy for a CPU and enter into the 'if' block only if a policy already exists for the CPU in question. As cpufreq_cpu_data is filled once and for all now, we can reuse it to find the relevant policy instead of traversing the list of all policies.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index f849b2a33d3e..20d5f4590c4b 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1211,7 +1211,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) { unsigned int j, cpu = dev->id; int ret = -ENOMEM;
- struct cpufreq_policy *policy, *tpolicy;
- struct cpufreq_policy *policy; unsigned long flags; bool recover_policy = cpufreq_suspended;
@@ -1224,16 +1224,13 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) return 0;
/* Check if this cpu already has a policy to manage it */
- read_lock_irqsave(&cpufreq_driver_lock, flags);
- for_each_active_policy(policy, tpolicy) {
if (cpumask_test_cpu(cpu, policy->related_cpus)) {
read_unlock_irqrestore(&cpufreq_driver_lock, flags);
ret = cpufreq_add_policy_cpu(policy, cpu, dev);
up_read(&cpufreq_rwsem);
return ret;
}
- policy = per_cpu(cpufreq_cpu_data, cpu);
- if (policy && !policy_is_inactive(policy)) {
WARN_ON(!cpumask_test_cpu(cpu, policy->related_cpus));
ret = cpufreq_add_policy_cpu(policy, cpu, dev);
up_read(&cpufreq_rwsem);
}return ret;
read_unlock_irqrestore(&cpufreq_driver_lock, flags);
/*
- Restore the saved policy when doing light-weight init and fall back
Acked-by: Saravana Kannan skannan@codeaurora.org
History of which governor was used last is common to all CPUs within a policy and maintaining it per-cpu isn't the best approach for sure.
Apart from wasting memory, this also increases the complexity of managing this data structure as it has to be updated for all CPUs.
To make that somewhat simpler, lets store this information in a new field 'last_governor' in struct cpufreq_policy and update it on removal of last cpu of a policy.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 30 +++++++++++++++--------------- include/linux/cpufreq.h | 1 + 2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 20d5f4590c4b..b7cd1bf97044 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -115,9 +115,6 @@ static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data); static DEFINE_RWLOCK(cpufreq_driver_lock); DEFINE_MUTEX(cpufreq_governor_lock);
-/* This one keeps track of the previously set governor of a removed CPU */ -static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor); - /* Flag to suspend/resume CPUFreq governors */ static bool cpufreq_suspended;
@@ -1028,7 +1025,7 @@ static void cpufreq_init_policy(struct cpufreq_policy *policy) memcpy(&new_policy, policy, sizeof(*policy));
/* Update governor of new_policy to the governor used before hotplug */ - gov = find_governor(per_cpu(cpufreq_cpu_governor, policy->cpu)); + gov = find_governor(policy->last_governor); if (gov) pr_debug("Restoring governor %s for cpu %d\n", policy->governor->name, policy->cpu); @@ -1422,14 +1419,15 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, pr_err("%s: Failed to stop governor\n", __func__); return ret; } - - strncpy(per_cpu(cpufreq_cpu_governor, cpu), - policy->governor->name, CPUFREQ_NAME_LEN); }
- down_read(&policy->rwsem); + down_write(&policy->rwsem); cpus = cpumask_weight(policy->cpus); - up_read(&policy->rwsem); + + if (has_target() && cpus == 1) + strncpy(policy->last_governor, policy->governor->name, + CPUFREQ_NAME_LEN); + up_write(&policy->rwsem);
if (cpu != policy->cpu) { sysfs_remove_link(&dev->kobj, "cpufreq"); @@ -2142,7 +2140,8 @@ EXPORT_SYMBOL_GPL(cpufreq_register_governor);
void cpufreq_unregister_governor(struct cpufreq_governor *governor) { - int cpu; + struct cpufreq_policy *policy, *tpolicy; + unsigned long flags;
if (!governor) return; @@ -2150,12 +2149,13 @@ void cpufreq_unregister_governor(struct cpufreq_governor *governor) if (cpufreq_disabled()) return;
- for_each_present_cpu(cpu) { - if (cpu_online(cpu)) - continue; - if (!strcmp(per_cpu(cpufreq_cpu_governor, cpu), governor->name)) - strcpy(per_cpu(cpufreq_cpu_governor, cpu), "\0"); + /* clear last_governor for all inactive policies */ + read_lock_irqsave(&cpufreq_driver_lock, flags); + for_each_inactive_policy(policy, tpolicy) { + if (!strcmp(policy->last_governor, governor->name)) + strcpy(policy->last_governor, "\0"); } + read_unlock_irqrestore(&cpufreq_driver_lock, flags);
mutex_lock(&cpufreq_governor_mutex); list_del(&governor->governor_list); diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 2ee4888c1f47..48e37c07eb84 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -80,6 +80,7 @@ struct cpufreq_policy { struct cpufreq_governor *governor; /* see below */ void *governor_data; bool governor_enabled; /* governor start/stop flag */ + char last_governor[CPUFREQ_NAME_LEN]; /* last governor used */
struct work_struct update; /* if update_policy() needs to be * called, but you're in IRQ context */
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
History of which governor was used last is common to all CPUs within a policy and maintaining it per-cpu isn't the best approach for sure.
Apart from wasting memory, this also increases the complexity of managing this data structure as it has to be updated for all CPUs.
To make that somewhat simpler, lets store this information in a new field 'last_governor' in struct cpufreq_policy and update it on removal of last cpu of a policy.
This governor restore code actually has bug -- at least in 3.12 (I think) but probably has a bug in this kernel too. So, this patch actually also fixes a bug.
The bug has to do with restoring the wrong governor (as in, not the governor that was last picked by userspace) for a particular hot plug sequence.
Assume one cluster has CPUs 2, 3. Assume default governor is performance. CPU2 is first brought online. Governor is changed to ondemand. CPU2 is taken offline. CPU3 is brought online. Governor gets set to performance. So, governor for the cluster is now not what was last picked by userspace.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 30 +++++++++++++++--------------- include/linux/cpufreq.h | 1 + 2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 20d5f4590c4b..b7cd1bf97044 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -115,9 +115,6 @@ static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data); static DEFINE_RWLOCK(cpufreq_driver_lock); DEFINE_MUTEX(cpufreq_governor_lock);
-/* This one keeps track of the previously set governor of a removed CPU */ -static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
- /* Flag to suspend/resume CPUFreq governors */ static bool cpufreq_suspended;
@@ -1028,7 +1025,7 @@ static void cpufreq_init_policy(struct cpufreq_policy *policy) memcpy(&new_policy, policy, sizeof(*policy));
/* Update governor of new_policy to the governor used before hotplug */
- gov = find_governor(per_cpu(cpufreq_cpu_governor, policy->cpu));
- gov = find_governor(policy->last_governor); if (gov) pr_debug("Restoring governor %s for cpu %d\n", policy->governor->name, policy->cpu);
@@ -1422,14 +1419,15 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, pr_err("%s: Failed to stop governor\n", __func__); return ret; }
strncpy(per_cpu(cpufreq_cpu_governor, cpu),
policy->governor->name, CPUFREQ_NAME_LEN);
}
down_read(&policy->rwsem);
- down_write(&policy->rwsem); cpus = cpumask_weight(policy->cpus);
- up_read(&policy->rwsem);
if (has_target() && cpus == 1)
strncpy(policy->last_governor, policy->governor->name,
CPUFREQ_NAME_LEN);
up_write(&policy->rwsem);
if (cpu != policy->cpu) { sysfs_remove_link(&dev->kobj, "cpufreq");
@@ -2142,7 +2140,8 @@ EXPORT_SYMBOL_GPL(cpufreq_register_governor);
void cpufreq_unregister_governor(struct cpufreq_governor *governor) {
- int cpu;
struct cpufreq_policy *policy, *tpolicy;
unsigned long flags;
if (!governor) return;
@@ -2150,12 +2149,13 @@ void cpufreq_unregister_governor(struct cpufreq_governor *governor) if (cpufreq_disabled()) return;
- for_each_present_cpu(cpu) {
if (cpu_online(cpu))
continue;
if (!strcmp(per_cpu(cpufreq_cpu_governor, cpu), governor->name))
strcpy(per_cpu(cpufreq_cpu_governor, cpu), "\0");
- /* clear last_governor for all inactive policies */
- read_lock_irqsave(&cpufreq_driver_lock, flags);
You are writing to a variable that's protected. This should be a write lock.
for_each_inactive_policy(policy, tpolicy) {
if (!strcmp(policy->last_governor, governor->name))
strcpy(policy->last_governor, "\0");
}
read_unlock_irqrestore(&cpufreq_driver_lock, flags);
mutex_lock(&cpufreq_governor_mutex); list_del(&governor->governor_list);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 2ee4888c1f47..48e37c07eb84 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -80,6 +80,7 @@ struct cpufreq_policy { struct cpufreq_governor *governor; /* see below */ void *governor_data; bool governor_enabled; /* governor start/stop flag */
char last_governor[CPUFREQ_NAME_LEN]; /* last governor used */
struct work_struct update; /* if update_policy() needs to be * called, but you're in IRQ context */
Nack due to lock.
-Saravana
On 2 April 2015 at 10:04, Saravana Kannan skannan@codeaurora.org wrote:
This governor restore code actually has bug -- at least in 3.12 (I think) but probably has a bug in this kernel too. So, this patch actually also fixes a bug.
The bug has to do with restoring the wrong governor (as in, not the governor that was last picked by userspace) for a particular hot plug sequence.
Assume one cluster has CPUs 2, 3. Assume default governor is performance. CPU2 is first brought online. Governor is changed to ondemand. CPU2 is taken offline. CPU3 is brought online. Governor gets set to performance. So, governor for the cluster is now not what was last picked by userspace.
Wasn't aware of that, you are right. Updated my logs for this.
for_each_present_cpu(cpu) {
if (cpu_online(cpu))
continue;
if (!strcmp(per_cpu(cpufreq_cpu_governor, cpu),
governor->name))
strcpy(per_cpu(cpufreq_cpu_governor, cpu), "\0");
/* clear last_governor for all inactive policies */
read_lock_irqsave(&cpufreq_driver_lock, flags);
You are writing to a variable that's protected. This should be a write lock.
cpufreq_driver_lock isn't responsible for protecting policy fields, but the list of policies. And so was a read lock here.
for_each_inactive_policy(policy, tpolicy) {
if (!strcmp(policy->last_governor, governor->name))
strcpy(policy->last_governor, "\0"); }
read_unlock_irqrestore(&cpufreq_driver_lock, flags);
And we don't need to take writelock to policy->rwsem here as the policy is already inactive.
Nack due to lock.
Just to clear the concepts, and not fighting with you :)
I don't think you used Nack correctly here. You have raised some mistakes in the patch here, and its not that the patch and the idea behind it is not acceptable.
So, probably Nack was a bit harsh :)
Later commits would change the way policies are managed today. Policies wouldn't be freed on cpu hotplug (currently they aren't freed on suspend), and while the CPU is offline, the sysfs cpufreq files would still be present.
Because we don't mark policy->governor as NULL, it still contains pointer of the last governor it used. And when we read the 'scaling_governor' file, it shows the old value.
To prevent from this, mark policy->governor as NULL after we have issued CPUFREQ_GOV_POLICY_EXIT event for its governor.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index b7cd1bf97044..cab4cfdd3ebb 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1094,7 +1094,6 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu) if (likely(policy)) { /* Policy should be inactive here */ WARN_ON(!policy_is_inactive(policy)); - policy->governor = NULL; }
return policy; @@ -1497,6 +1496,8 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
if (!cpufreq_suspended) cpufreq_policy_free(policy); + else + policy->governor = NULL; } else if (has_target()) { ret = __cpufreq_governor(policy, CPUFREQ_GOV_START); if (!ret)
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
Later commits would change the way policies are managed today. Policies wouldn't be freed on cpu hotplug (currently they aren't freed on suspend), and while the CPU is offline, the sysfs cpufreq files would still be present.
Because we don't mark policy->governor as NULL, it still contains pointer of the last governor it used. And when we read the 'scaling_governor' file, it shows the old value.
What's wrong with this behavior? If you cat scaling_min/max_freq, it's going to show the last minimum and maximum freqs too. I don't think this needs to be set to NULL until the governor is unloaded. Which you are already taking care of in the previous patch.
It's handy to be able to tell what governor will be restored when a cluster if offline.
Nacked because I think this patch is not helping.
-Saravana
On 2 April 2015 at 10:08, Saravana Kannan skannan@codeaurora.org wrote:
What's wrong with this behavior? If you cat scaling_min/max_freq, it's going to show the last minimum and maximum freqs too. I don't think this needs to be set to NULL until the governor is unloaded. Which you are already taking care of in the previous patch.
It's handy to be able to tell what governor will be restored when a cluster if offline.
Nacked because I think this patch is not helping.
Oh, I really believe this patch makes sense. :)
Don't confuse it with the 'last-governor' thing or scaling_min/max stuff.. Its different.
Consider a scenario: - Cluster was using ONDEMAND governor. - Cluster hotplugged out - Governor removed, but the pointer policy->governor isn't updated.
-> At this point accessing 'scaling_governor' or 'scaling_setspeed' can get called and that will result in a crash.
- We go ahead and insert the governor again
-> At this point also the same problem will happen as governor will get allocated again.
Now, what we are talking about here is a pointer to the governor, not its name which is stored in 'last_governor' in the previous patch.
We store that name as it is used for internal working of the core, not for userspace.
What we expose to userspace must be consistent, and so if the governor is removed, we will not be able to provide 'scaling_setspeed' or 'scaling_governor' to userspace.
Also, it is scaling-governor, not what the next governor is going to be. Plus, what should we print on scaling_setspeed of a cluster not running anymore ?
And so why keep something that we can't provide to userspace all the time. That may break userspace if it relies on it..
On 04/01/2015 11:09 PM, Viresh Kumar wrote:
On 2 April 2015 at 10:08, Saravana Kannan skannan@codeaurora.org wrote:
What's wrong with this behavior? If you cat scaling_min/max_freq, it's going to show the last minimum and maximum freqs too. I don't think this needs to be set to NULL until the governor is unloaded. Which you are already taking care of in the previous patch.
It's handy to be able to tell what governor will be restored when a cluster if offline.
Nacked because I think this patch is not helping.
Oh, I really believe this patch makes sense. :)
Don't confuse it with the 'last-governor' thing or scaling_min/max stuff.. Its different.
Ok, I agree that I misread your other patch and mentally read it as setting both last_governor to "" and the governor pointer to NULL. I think if you set the governor to NULL in the same for loop, you would solve the issues you mention below without compromising the case where the governor is not removed and you still let userspace query what the last governor is without having to online a CPU.
I think that's way more user friendly without and real negatives. We shouldn't make it any less user friendly than it needs to be.
Consider a scenario:
- Cluster was using ONDEMAND governor.
- Cluster hotplugged out
- Governor removed, but the pointer policy->governor isn't updated.
-> At this point accessing 'scaling_governor' or 'scaling_setspeed' can get called and that will result in a crash.
- We go ahead and insert the governor again
-> At this point also the same problem will happen as governor will get allocated again.
Now, what we are talking about here is a pointer to the governor, not its name which is stored in 'last_governor' in the previous patch.
We store that name as it is used for internal working of the core, not for userspace.
What we expose to userspace must be consistent, and so if the governor is removed, we will not be able to provide 'scaling_setspeed' or 'scaling_governor' to userspace.
Also, it is scaling-governor, not what the next governor is going to be. Plus, what should we print on scaling_setspeed of a cluster not running anymore ?
The same could be said for all the files in that directory. What does it mean to say scaling_min_freq for a CPU that's hotplugged out. And the argument can be make for cases when the CPU is power collapsed too. What does it mean to ask for "current frequency" when it's powered off. The point is that these files are representing what will be done when the CPUs are clocked. Not what's happening at this absolute instance.
And so why keep something that we can't provide to userspace all the time. That may break userspace if it relies on it..
Because there are plenty of use cases where the module isn't going to be removed since it's not even a module. So, still letting userspace figure out what the last used governor was and what will it come up with is still useful. And this isn't going to break userspace any more than returning an error all the time.
I would strongly prefer that this is not done and just clear out the pointer if/when the governor module is removed.
Nack
Thanks, Saravana
On 4 April 2015 at 06:50, Saravana Kannan skannan@codeaurora.org wrote:
I would strongly prefer that this is not done and just clear out the pointer if/when the governor module is removed.
Okay, I give up ..
This looks fine ?
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 4604ad5d9c0c..1f285b06d473 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1094,7 +1094,6 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu) if (likely(policy)) { /* Policy should be inactive here */ WARN_ON(!policy_is_inactive(policy)); - policy->governor = NULL; }
return policy; @@ -2156,8 +2155,10 @@ void cpufreq_unregister_governor(struct cpufreq_governor *governor) /* clear last_governor for all inactive policies */ read_lock_irqsave(&cpufreq_driver_lock, flags); for_each_inactive_policy(policy, tpolicy) { - if (!strcmp(policy->last_governor, governor->name)) + if (!strcmp(policy->last_governor, governor->name)) { + policy->governor = NULL; strcpy(policy->last_governor, "\0"); + } } read_unlock_irqrestore(&cpufreq_driver_lock, flags);
Later commits would change the way policies are managed today. Policies wouldn't be freed on cpu hotplug (currently they aren't freed only for suspend), and while the CPU is offline, the sysfs cpufreq files would still be present.
User may accidentally try to update the sysfs files in following directory: '/sys/devices/system/cpu/cpuX/cpufreq/'. And that would result in undefined behavior as policy wouldn't be active then.
Apart from updating the store() routine, we also update __cpufreq_get() which can call cpufreq_out_of_sync(). The later routine tries to update policy->cur and starts notifying kernel about it.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index cab4cfdd3ebb..155e6ff2fa85 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -886,11 +886,22 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr,
down_write(&policy->rwsem);
+ /* + * Policy might not be active currently, and so we shouldn't try + * updating any values here. policy->cpus is cleared for inactive policy + * and so cpufreq_cpu_get_raw() should fail. + */ + if (unlikely(policy_is_inactive(policy))) { + ret = -EPERM; + goto unlock_policy_rwsem; + } + if (fattr->store) ret = fattr->store(policy, buf, count); else ret = -EIO;
+unlock_policy_rwsem: up_write(&policy->rwsem);
up_read(&cpufreq_rwsem); @@ -1620,6 +1631,14 @@ static unsigned int __cpufreq_get(struct cpufreq_policy *policy)
ret_freq = cpufreq_driver->get(policy->cpu);
+ /* + * Policy might not be active currently, and so we shouldn't try + * updating any values here. policy->cpus is cleared for inactive policy + * and so cpufreq_cpu_get_raw() should fail. + */ + if (unlikely(policy_is_inactive(policy))) + return ret_freq; + if (ret_freq && policy->cur && !(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) { /* verify no discrepancy between actual and
From: Saravana Kannan skannan@codeaurora.org
In order to prepare for the next few commits, that will stop migrating sysfs files on cpu hotplug, this patch starts managing sysfs-cpu separately.
The behavior is still the same as we are still migrating sysfs files on hotplug, later commits would change that.
Signed-off-by: Saravana Kannan skannan@codeaurora.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 11 +++++++---- include/linux/cpufreq.h | 4 +++- 2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 155e6ff2fa85..fff08145d9ff 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -981,7 +981,7 @@ static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy) for_each_cpu(j, policy->cpus) { struct device *cpu_dev;
- if (j == policy->cpu) + if (j == policy->kobj_cpu) continue;
pr_debug("Adding link for CPU: %u\n", j); @@ -1200,6 +1200,7 @@ static int update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu,
down_write(&policy->rwsem); policy->cpu = cpu; + policy->kobj_cpu = cpu; up_write(&policy->rwsem);
return 0; @@ -1257,10 +1258,12 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) * the creation of a brand new one. So we need to perform this update * by invoking update_policy_cpu(). */ - if (recover_policy && cpu != policy->cpu) + if (recover_policy && cpu != policy->cpu) { WARN_ON(update_policy_cpu(policy, cpu, dev)); - else + } else { policy->cpu = cpu; + policy->kobj_cpu = cpu; + }
cpumask_copy(policy->cpus, cpumask_of(cpu));
@@ -1439,7 +1442,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, CPUFREQ_NAME_LEN); up_write(&policy->rwsem);
- if (cpu != policy->cpu) { + if (cpu != policy->kobj_cpu) { sysfs_remove_link(&dev->kobj, "cpufreq"); } else if (cpus > 1) { /* Nominate new CPU */ diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 48e37c07eb84..29ad97c34fd5 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -65,7 +65,9 @@ struct cpufreq_policy {
unsigned int shared_type; /* ACPI: ANY or ALL affected CPUs should set cpufreq */ - unsigned int cpu; /* cpu nr of CPU managing this policy */ + unsigned int cpu; /* cpu managing this policy, must be online */ + unsigned int kobj_cpu; /* cpu managing sysfs files, can be offline */ + struct clk *clk; struct cpufreq_cpuinfo cpuinfo;/* see above */
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
From: Saravana Kannan skannan@codeaurora.org
In order to prepare for the next few commits, that will stop migrating sysfs files on cpu hotplug, this patch starts managing sysfs-cpu separately.
The behavior is still the same as we are still migrating sysfs files on hotplug, later commits would change that.
Signed-off-by: Saravana Kannan skannan@codeaurora.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 11 +++++++---- include/linux/cpufreq.h | 4 +++- 2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 155e6ff2fa85..fff08145d9ff 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -981,7 +981,7 @@ static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy) for_each_cpu(j, policy->cpus) { struct device *cpu_dev;
if (j == policy->cpu)
if (j == policy->kobj_cpu) continue;
pr_debug("Adding link for CPU: %u\n", j);
@@ -1200,6 +1200,7 @@ static int update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu,
down_write(&policy->rwsem); policy->cpu = cpu;
policy->kobj_cpu = cpu; up_write(&policy->rwsem);
return 0;
@@ -1257,10 +1258,12 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) * the creation of a brand new one. So we need to perform this update * by invoking update_policy_cpu(). */
- if (recover_policy && cpu != policy->cpu)
- if (recover_policy && cpu != policy->cpu) { WARN_ON(update_policy_cpu(policy, cpu, dev));
- else
} else { policy->cpu = cpu;
policy->kobj_cpu = cpu;
}
cpumask_copy(policy->cpus, cpumask_of(cpu));
@@ -1439,7 +1442,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, CPUFREQ_NAME_LEN); up_write(&policy->rwsem);
- if (cpu != policy->cpu) {
- if (cpu != policy->kobj_cpu) { sysfs_remove_link(&dev->kobj, "cpufreq"); } else if (cpus > 1) { /* Nominate new CPU */
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 48e37c07eb84..29ad97c34fd5 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -65,7 +65,9 @@ struct cpufreq_policy {
unsigned int shared_type; /* ACPI: ANY or ALL affected CPUs should set cpufreq */
- unsigned int cpu; /* cpu nr of CPU managing this policy */
- unsigned int cpu; /* cpu managing this policy, must be online */
- unsigned int kobj_cpu; /* cpu managing sysfs files, can be offline */
- struct clk *clk; struct cpufreq_cpuinfo cpuinfo;/* see above */
Acked?
-Saravana
On 2 April 2015 at 10:10, Saravana Kannan skannan@codeaurora.org wrote:
Acked?
Yeah, you can Ack the final version..
When we hot-unplug a cpu, we remove its sysfs cpufreq directory and if the outgoing cpu was the owner of policy->kobj earlier then we migrate the sysfs directory to under another online cpu.
There are few disadvantages this brings: - Code Complexity - Slower hotplug/suspend/resume - sysfs file permissions are reset after all policy->cpus are offlined - CPUFreq stats history lost after all policy->cpus are offlined - Special management of sysfs stuff during suspend/resume
To overcome these, this patch modifies the way sysfs directories are managed: - Select sysfs kobjects owner while initializing policy and don't change it during hotplugs. Track it with kobj_cpu created earlier.
- Create symlinks for all related CPUs (can be offline) instead of affected CPUs on policy initialization and remove them only when the policy is freed.
- Free policy structure only on the removal of cpufreq-driver and not during hotplug/suspend/resume, detected by checking 'struct subsys_interface *' (Valid only when called from subsys_interface_unregister() while unregistering driver).
[ Something similar attempted by Saravana earlier ]
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 158 ++++++++++++++++++++++------------------------ 1 file changed, 74 insertions(+), 84 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index fff08145d9ff..29cb952960fa 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -972,25 +972,36 @@ void cpufreq_sysfs_remove_file(const struct attribute *attr) } EXPORT_SYMBOL(cpufreq_sysfs_remove_file);
-/* symlink affected CPUs */ -static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy) +/* Add/remove symlinks for all related CPUs */ +static int cpufreq_add_remove_dev_symlink(struct cpufreq_policy *policy, + bool add) { unsigned int j; int ret = 0;
- for_each_cpu(j, policy->cpus) { + for_each_cpu(j, policy->related_cpus) { struct device *cpu_dev;
if (j == policy->kobj_cpu) continue;
- pr_debug("Adding link for CPU: %u\n", j); + pr_debug("%s: %s symlink for CPU: %u\n", __func__, + add ? "Adding" : "Removing", j); + cpu_dev = get_cpu_device(j); - ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj, - "cpufreq"); - if (ret) - break; + if (WARN_ON(!cpu_dev)) + continue; + + if (add) { + ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj, + "cpufreq"); + if (ret) + break; + } else { + sysfs_remove_link(&cpu_dev->kobj, "cpufreq"); + } } + return ret; }
@@ -1024,7 +1035,7 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy, return ret; }
- return cpufreq_add_dev_symlink(policy); + return cpufreq_add_remove_dev_symlink(policy, true); }
static void cpufreq_init_policy(struct cpufreq_policy *policy) @@ -1090,7 +1101,7 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, } }
- return sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq"); + return 0; }
static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu) @@ -1110,7 +1121,7 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu) return policy; }
-static struct cpufreq_policy *cpufreq_policy_alloc(void) +static struct cpufreq_policy *cpufreq_policy_alloc(int cpu) { struct cpufreq_policy *policy;
@@ -1131,6 +1142,11 @@ static struct cpufreq_policy *cpufreq_policy_alloc(void) init_completion(&policy->kobj_unregister); INIT_WORK(&policy->update, handle_update);
+ policy->cpu = cpu; + + /* Set this once on allocation */ + policy->kobj_cpu = cpu; + return policy;
err_free_cpumask: @@ -1149,10 +1165,11 @@ static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy) blocking_notifier_call_chain(&cpufreq_policy_notifier_list, CPUFREQ_REMOVE_POLICY, policy);
- down_read(&policy->rwsem); + down_write(&policy->rwsem); + cpufreq_add_remove_dev_symlink(policy, false); kobj = &policy->kobj; cmp = &policy->kobj_unregister; - up_read(&policy->rwsem); + up_write(&policy->rwsem); kobject_put(kobj);
/* @@ -1183,27 +1200,14 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy) kfree(policy); }
-static int update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu, - struct device *cpu_dev) +static void update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu) { - int ret; - if (WARN_ON(cpu == policy->cpu)) - return 0; - - /* Move kobject to the new policy->cpu */ - ret = kobject_move(&policy->kobj, &cpu_dev->kobj); - if (ret) { - pr_err("%s: Failed to move kobj: %d\n", __func__, ret); - return ret; - } + return;
down_write(&policy->rwsem); policy->cpu = cpu; - policy->kobj_cpu = cpu; up_write(&policy->rwsem); - - return 0; }
/** @@ -1221,7 +1225,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) int ret = -ENOMEM; struct cpufreq_policy *policy; unsigned long flags; - bool recover_policy = cpufreq_suspended; + bool recover_policy = !sif;
if (cpu_is_offline(cpu)) return 0; @@ -1247,7 +1251,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) policy = recover_policy ? cpufreq_policy_restore(cpu) : NULL; if (!policy) { recover_policy = false; - policy = cpufreq_policy_alloc(); + policy = cpufreq_policy_alloc(cpu); if (!policy) goto nomem_out; } @@ -1258,12 +1262,8 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) * the creation of a brand new one. So we need to perform this update * by invoking update_policy_cpu(). */ - if (recover_policy && cpu != policy->cpu) { - WARN_ON(update_policy_cpu(policy, cpu, dev)); - } else { - policy->cpu = cpu; - policy->kobj_cpu = cpu; - } + if (recover_policy && cpu != policy->cpu) + update_policy_cpu(policy, cpu);
cpumask_copy(policy->cpus, cpumask_of(cpu));
@@ -1442,29 +1442,14 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, CPUFREQ_NAME_LEN); up_write(&policy->rwsem);
- if (cpu != policy->kobj_cpu) { - sysfs_remove_link(&dev->kobj, "cpufreq"); - } else if (cpus > 1) { - /* Nominate new CPU */ - int new_cpu = cpumask_any_but(policy->cpus, cpu); - struct device *cpu_dev = get_cpu_device(new_cpu); - - sysfs_remove_link(&cpu_dev->kobj, "cpufreq"); - ret = update_policy_cpu(policy, new_cpu, cpu_dev); - if (ret) { - if (sysfs_create_link(&cpu_dev->kobj, &policy->kobj, - "cpufreq")) - pr_err("%s: Failed to restore kobj link to cpu:%d\n", - __func__, cpu_dev->id); - return ret; - } + if (cpu != policy->cpu) + return 0;
- if (!cpufreq_suspended) - pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n", - __func__, new_cpu, cpu); - } else if (cpufreq_driver->stop_cpu) { + if (cpus > 1) + /* Nominate new CPU */ + update_policy_cpu(policy, cpumask_any_but(policy->cpus, cpu)); + else if (cpufreq_driver->stop_cpu) cpufreq_driver->stop_cpu(policy); - }
return 0; } @@ -1485,34 +1470,11 @@ static int __cpufreq_remove_dev_finish(struct device *dev, cpumask_clear_cpu(cpu, policy->cpus); up_write(&policy->rwsem);
- /* If cpu is last user of policy, free policy */ - if (policy_is_inactive(policy)) { - if (has_target()) { - ret = __cpufreq_governor(policy, - CPUFREQ_GOV_POLICY_EXIT); - if (ret) { - pr_err("%s: Failed to exit governor\n", - __func__); - return ret; - } - } + /* Not the last cpu of policy, start governor again ? */ + if (!policy_is_inactive(policy)) { + if (!has_target()) + return 0;
- if (!cpufreq_suspended) - cpufreq_policy_put_kobj(policy); - - /* - * Perform the ->exit() even during light-weight tear-down, - * since this is a core component, and is essential for the - * subsequent light-weight ->init() to succeed. - */ - if (cpufreq_driver->exit) - cpufreq_driver->exit(policy); - - if (!cpufreq_suspended) - cpufreq_policy_free(policy); - else - policy->governor = NULL; - } else if (has_target()) { ret = __cpufreq_governor(policy, CPUFREQ_GOV_START); if (!ret) ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS); @@ -1521,8 +1483,36 @@ static int __cpufreq_remove_dev_finish(struct device *dev, pr_err("%s: Failed to start governor\n", __func__); return ret; } + + return 0; + } + + /* If cpu is last user of policy, free policy */ + if (has_target()) { + ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT); + if (ret) { + pr_err("%s: Failed to exit governor\n", __func__); + return ret; + } }
+ /* Free the policy kobjects only if the driver is getting removed. */ + if (sif) + cpufreq_policy_put_kobj(policy); + + /* + * Perform the ->exit() even during light-weight tear-down, + * since this is a core component, and is essential for the + * subsequent light-weight ->init() to succeed. + */ + if (cpufreq_driver->exit) + cpufreq_driver->exit(policy); + + if (sif) + cpufreq_policy_free(policy); + else + policy->governor = NULL; + return 0; }
cpufreq_update_policy() was kept as a separate routine earlier as it was handling migration of sysfs directories, which isn't done anymore. It is only updating policy->cpu now and can be removed.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 32 ++++++++++---------------------- 1 file changed, 10 insertions(+), 22 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 29cb952960fa..814a8c70e2b1 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1116,6 +1116,10 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu) if (likely(policy)) { /* Policy should be inactive here */ WARN_ON(!policy_is_inactive(policy)); + + down_write(&policy->rwsem); + policy->cpu = cpu; + up_write(&policy->rwsem); }
return policy; @@ -1200,16 +1204,6 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy) kfree(policy); }
-static void update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu) -{ - if (WARN_ON(cpu == policy->cpu)) - return; - - down_write(&policy->rwsem); - policy->cpu = cpu; - up_write(&policy->rwsem); -} - /** * cpufreq_add_dev - add a CPU device * @@ -1256,15 +1250,6 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) goto nomem_out; }
- /* - * In the resume path, since we restore a saved policy, the assignment - * to policy->cpu is like an update of the existing policy, rather than - * the creation of a brand new one. So we need to perform this update - * by invoking update_policy_cpu(). - */ - if (recover_policy && cpu != policy->cpu) - update_policy_cpu(policy, cpu); - cpumask_copy(policy->cpus, cpumask_of(cpu));
/* call driver. From then on the cpufreq must be able @@ -1445,11 +1430,14 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, if (cpu != policy->cpu) return 0;
- if (cpus > 1) + if (cpus > 1) { /* Nominate new CPU */ - update_policy_cpu(policy, cpumask_any_but(policy->cpus, cpu)); - else if (cpufreq_driver->stop_cpu) + down_write(&policy->rwsem); + policy->cpu = cpumask_any_but(policy->cpus, cpu); + up_write(&policy->rwsem); + } else if (cpufreq_driver->stop_cpu) { cpufreq_driver->stop_cpu(policy); + }
return 0; }
policy->kobj is required to be initialized once in the lifetime of a policy. Currently we are initializing it from __cpufreq_add_dev() and that doesn't look to be the best place for doing so as we have to do this on special cases (like: !recover_policy).
We can initialize it from a more obvious place cpufreq_policy_alloc() and that will make code look cleaner, specially the error handling part.
The error handling part of __cpufreq_add_dev() was doing almost the same thing while recover_policy is true or false. Fix that as well by always calling cpufreq_policy_put_kobj() with an additional parameter to skip notification part of it.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 44 ++++++++++++++++++++------------------------ 1 file changed, 20 insertions(+), 24 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 814a8c70e2b1..4fb5041ebfab 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1125,9 +1125,10 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu) return policy; }
-static struct cpufreq_policy *cpufreq_policy_alloc(int cpu) +static struct cpufreq_policy *cpufreq_policy_alloc(struct device *dev) { struct cpufreq_policy *policy; + int ret;
policy = kzalloc(sizeof(*policy), GFP_KERNEL); if (!policy) @@ -1139,6 +1140,13 @@ static struct cpufreq_policy *cpufreq_policy_alloc(int cpu) if (!zalloc_cpumask_var(&policy->related_cpus, GFP_KERNEL)) goto err_free_cpumask;
+ ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq, &dev->kobj, + "cpufreq"); + if (ret) { + pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret); + goto err_free_rcpumask; + } + INIT_LIST_HEAD(&policy->policy_list); init_rwsem(&policy->rwsem); spin_lock_init(&policy->transition_lock); @@ -1146,13 +1154,15 @@ static struct cpufreq_policy *cpufreq_policy_alloc(int cpu) init_completion(&policy->kobj_unregister); INIT_WORK(&policy->update, handle_update);
- policy->cpu = cpu; + policy->cpu = dev->id;
/* Set this once on allocation */ - policy->kobj_cpu = cpu; + policy->kobj_cpu = dev->id;
return policy;
+err_free_rcpumask: + free_cpumask_var(policy->related_cpus); err_free_cpumask: free_cpumask_var(policy->cpus); err_free_policy: @@ -1161,13 +1171,14 @@ static struct cpufreq_policy *cpufreq_policy_alloc(int cpu) return NULL; }
-static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy) +static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy, bool notify) { struct kobject *kobj; struct completion *cmp;
- blocking_notifier_call_chain(&cpufreq_policy_notifier_list, - CPUFREQ_REMOVE_POLICY, policy); + if (notify) + blocking_notifier_call_chain(&cpufreq_policy_notifier_list, + CPUFREQ_REMOVE_POLICY, policy);
down_write(&policy->rwsem); cpufreq_add_remove_dev_symlink(policy, false); @@ -1245,7 +1256,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) policy = recover_policy ? cpufreq_policy_restore(cpu) : NULL; if (!policy) { recover_policy = false; - policy = cpufreq_policy_alloc(cpu); + policy = cpufreq_policy_alloc(dev); if (!policy) goto nomem_out; } @@ -1276,15 +1287,6 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) policy->user_policy.min = policy->min; policy->user_policy.max = policy->max;
- /* prepare interface data */ - ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq, - &dev->kobj, "cpufreq"); - if (ret) { - pr_err("%s: failed to init policy->kobj: %d\n", - __func__, ret); - goto err_init_policy_kobj; - } - write_lock_irqsave(&cpufreq_driver_lock, flags); for_each_cpu(j, policy->related_cpus) per_cpu(cpufreq_cpu_data, j) = policy; @@ -1376,18 +1378,12 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
err_out_unregister: err_get_freq: - if (!recover_policy) { - kobject_put(&policy->kobj); - wait_for_completion(&policy->kobj_unregister); - } -err_init_policy_kobj: up_write(&policy->rwsem);
if (cpufreq_driver->exit) cpufreq_driver->exit(policy); err_set_policy_cpu: - if (recover_policy) - cpufreq_policy_put_kobj(policy); + cpufreq_policy_put_kobj(policy, recover_policy); cpufreq_policy_free(policy);
nomem_out: @@ -1486,7 +1482,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
/* Free the policy kobjects only if the driver is getting removed. */ if (sif) - cpufreq_policy_put_kobj(policy); + cpufreq_policy_put_kobj(policy, true);
/* * Perform the ->exit() even during light-weight tear-down,
cpufreq_policy_put_kobj() is actually part of freeing the policy and can be called from cpufreq_policy_free() directly instead of a separate call.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 4fb5041ebfab..b6d5d94a0d3b 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1197,7 +1197,7 @@ static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy, bool notify) pr_debug("wait complete\n"); }
-static void cpufreq_policy_free(struct cpufreq_policy *policy) +static void cpufreq_policy_free(struct cpufreq_policy *policy, bool notify) { unsigned long flags; int cpu; @@ -1210,6 +1210,7 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy) per_cpu(cpufreq_cpu_data, cpu) = NULL; write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+ cpufreq_policy_put_kobj(policy, notify); free_cpumask_var(policy->related_cpus); free_cpumask_var(policy->cpus); kfree(policy); @@ -1383,9 +1384,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) if (cpufreq_driver->exit) cpufreq_driver->exit(policy); err_set_policy_cpu: - cpufreq_policy_put_kobj(policy, recover_policy); - cpufreq_policy_free(policy); - + cpufreq_policy_free(policy, recover_policy); nomem_out: up_read(&cpufreq_rwsem);
@@ -1480,10 +1479,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev, } }
- /* Free the policy kobjects only if the driver is getting removed. */ - if (sif) - cpufreq_policy_put_kobj(policy, true); - /* * Perform the ->exit() even during light-weight tear-down, * since this is a core component, and is essential for the @@ -1492,8 +1487,9 @@ static int __cpufreq_remove_dev_finish(struct device *dev, if (cpufreq_driver->exit) cpufreq_driver->exit(policy);
+ /* Free the policy only if the driver is getting removed. */ if (sif) - cpufreq_policy_free(policy); + cpufreq_policy_free(policy, true); else policy->governor = NULL;
On cpu hot-unplug, we don't need to wait for POST_DEAD notification to restart the governor if the policy has atleast one online cpu left. We can restart the governor right from the DOWN_PREPARE notification instead.
[ Something similar attempted by Saravana earlier ]
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 58 ++++++++++++++++++++--------------------------- 1 file changed, 24 insertions(+), 34 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index b6d5d94a0d3b..ffb6700c7195 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1394,8 +1394,8 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) static int __cpufreq_remove_dev_prepare(struct device *dev, struct subsys_interface *sif) { - unsigned int cpu = dev->id, cpus; - int ret; + unsigned int cpu = dev->id; + int ret = 0; struct cpufreq_policy *policy;
pr_debug("%s: unregistering CPU %u\n", __func__, cpu); @@ -1415,26 +1415,33 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, }
down_write(&policy->rwsem); - cpus = cpumask_weight(policy->cpus); + cpumask_clear_cpu(cpu, policy->cpus);
- if (has_target() && cpus == 1) - strncpy(policy->last_governor, policy->governor->name, - CPUFREQ_NAME_LEN); + if (policy_is_inactive(policy)) { + if (has_target()) + strncpy(policy->last_governor, policy->governor->name, + CPUFREQ_NAME_LEN); + } else if (cpu == policy->cpu) { + /* Nominate new CPU */ + policy->cpu = cpumask_any(policy->cpus); + } up_write(&policy->rwsem);
- if (cpu != policy->cpu) - return 0; + /* Start governor again for active policy */ + if (!policy_is_inactive(policy)) { + if (has_target()) { + ret = __cpufreq_governor(policy, CPUFREQ_GOV_START); + if (!ret) + ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
- if (cpus > 1) { - /* Nominate new CPU */ - down_write(&policy->rwsem); - policy->cpu = cpumask_any_but(policy->cpus, cpu); - up_write(&policy->rwsem); + if (ret) + pr_err("%s: Failed to start governor\n", __func__); + } } else if (cpufreq_driver->stop_cpu) { cpufreq_driver->stop_cpu(policy); }
- return 0; + return ret; }
static int __cpufreq_remove_dev_finish(struct device *dev, @@ -1442,33 +1449,16 @@ static int __cpufreq_remove_dev_finish(struct device *dev, { unsigned int cpu = dev->id; int ret; - struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu); + struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
if (!policy) { pr_debug("%s: No cpu_data found\n", __func__); return -EINVAL; }
- down_write(&policy->rwsem); - cpumask_clear_cpu(cpu, policy->cpus); - up_write(&policy->rwsem); - - /* Not the last cpu of policy, start governor again ? */ - if (!policy_is_inactive(policy)) { - if (!has_target()) - return 0; - - ret = __cpufreq_governor(policy, CPUFREQ_GOV_START); - if (!ret) - ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS); - - if (ret) { - pr_err("%s: Failed to start governor\n", __func__); - return ret; - } - + /* Only proceed for inactive policies */ + if (!policy_is_inactive(policy)) return 0; - }
/* If cpu is last user of policy, free policy */ if (has_target()) {
It is possible to physically hotplug the CPUs and it happens in this sequence.
Hot removal: - CPU is offlined first, ~ 'echo 0 > /sys/devices/system/cpu/cpuX/online' - Then its device is removed along with all sysfs files, cpufreq core notified with cpufreq_remove_dev() callback from subsys-interface..
Hot addition: - First the device along with its sysfs files is added, cpufreq core notified with cpufreq_add_dev() callback from subsys-interface.. - CPU is onlined, ~ 'echo 1 > /sys/devices/system/cpu/cpuX/online'
This needs to be handled specially as current code isn't taking care of this case. Because we will hit the same routines with both hotplug and subsys callbacks, we can handle most of the stuff with regular hotplug callback paths. We only need to take care of adding/removing cpufreq sysfs links or freeing policy from subsys callbacks.
And that can be sensed easily as cpu would be offline in those cases. This patch adds special code in those paths to take care of policy and its links.
cpufreq_add_remove_dev_symlink() is also broken into another routine add_remove_cpu_dev_symlink() to reuse code at several places.
Cc: Srivatsa Bhat srivatsa@mit.edu Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 87 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 65 insertions(+), 22 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index ffb6700c7195..3213d81a822c 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -972,6 +972,26 @@ void cpufreq_sysfs_remove_file(const struct attribute *attr) } EXPORT_SYMBOL(cpufreq_sysfs_remove_file);
+static inline int add_remove_cpu_dev_symlink(struct cpufreq_policy *policy, + int cpu, bool add) +{ + struct device *cpu_dev; + + pr_debug("%s: %s symlink for CPU: %u\n", __func__, + add ? "Adding" : "Removing", cpu); + + cpu_dev = get_cpu_device(cpu); + if (WARN_ON(!cpu_dev)) + return 0; + + if (add) + return sysfs_create_link(&cpu_dev->kobj, &policy->kobj, + "cpufreq"); + + sysfs_remove_link(&cpu_dev->kobj, "cpufreq"); + return 0; +} + /* Add/remove symlinks for all related CPUs */ static int cpufreq_add_remove_dev_symlink(struct cpufreq_policy *policy, bool add) @@ -979,27 +999,14 @@ static int cpufreq_add_remove_dev_symlink(struct cpufreq_policy *policy, unsigned int j; int ret = 0;
- for_each_cpu(j, policy->related_cpus) { - struct device *cpu_dev; - + /* Some related CPUs might not be present (physically hotplugged) */ + for_each_cpu_and(j, policy->related_cpus, cpu_present_mask) { if (j == policy->kobj_cpu) continue;
- pr_debug("%s: %s symlink for CPU: %u\n", __func__, - add ? "Adding" : "Removing", j); - - cpu_dev = get_cpu_device(j); - if (WARN_ON(!cpu_dev)) - continue; - - if (add) { - ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj, - "cpufreq"); - if (ret) - break; - } else { - sysfs_remove_link(&cpu_dev->kobj, "cpufreq"); - } + ret = add_remove_cpu_dev_symlink(policy, j, add); + if (ret) + break; }
return ret; @@ -1233,11 +1240,23 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) unsigned long flags; bool recover_policy = !sif;
- if (cpu_is_offline(cpu)) - return 0; - pr_debug("adding CPU %u\n", cpu);
+ /* + * Only possible if 'cpu' wasn't physically present earlier and we are + * here from subsys_interface add callback. A hotplug notifier will + * follow and we will handle it like logical CPU hotplug then. For now, + * just create the sysfs link. + */ + if (cpu_is_offline(cpu)) { + policy = per_cpu(cpufreq_cpu_data, cpu); + /* No need to create link of the first cpu of a policy */ + if (!policy) + return 0; + + return add_remove_cpu_dev_symlink(policy, cpu, true); + } + if (!down_read_trylock(&cpufreq_rwsem)) return 0;
@@ -1496,8 +1515,32 @@ static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif) unsigned int cpu = dev->id; int ret;
- if (cpu_is_offline(cpu)) + /* + * Only possible if 'cpu' is getting physically removed now. A hotplug + * notifier should have already been called and we just need to remove + * link or free policy here. + */ + if (cpu_is_offline(cpu)) { + struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu); + struct cpumask mask; + + if (!policy) + return 0; + + /* Prepare mask similar to related-cpus without this 'cpu' */ + cpumask_copy(&mask, policy->related_cpus); + cpumask_clear_cpu(cpu, &mask); + + /* + * Remove link if few CPUs are still present physically, else + * free policy when all are gone. + */ + if (cpumask_intersects(&mask, cpu_present_mask)) + return add_remove_cpu_dev_symlink(policy, cpu, false); + + cpufreq_policy_free(policy, true); return 0; + }
ret = __cpufreq_remove_dev_prepare(dev, sif);
On 19 February 2015 at 17:02, Viresh Kumar viresh.kumar@linaro.org wrote:
Hi Rafael,
The aim of this series is to stop managing cpufreq sysfs directories on CPU hotplugs.
Currently on removal of a 'cpu != policy->cpu', we remove its sysfs directories by removing the soft-link. And on removal of policy->cpu, we migrate the sysfs directories to the next cpu. But if policy->cpu was the last CPU, we remove the policy completely and allocate it again as soon as the CPUs come back. This has shortcomings:
- Code Complexity
- Slower hotplug
- sysfs file permissions are reset after all policy->cpus are offlined
- CPUFreq stats history lost after all policy->cpus are offlined
- Special management of sysfs stuff during suspend/resume
To make things simple we stop playing with sysfs files unless the driver is getting removed. Also the policy is kept intact to be used later.
First few patches provide a clean base for others *more important* patches.
Rebased-over: your bleeding edge branch as there were dependencies on my earlier patches.
Pushed here:
git://git.linaro.org/people/viresh.kumar/linux.git cpufreq/core/sysfs
v1->V2:
- Dropped the idea of using policy-lists for getting policy for any cpu
- Also dropped fallback list and its per-cpu variable
- Stopped cleaning cpufreq_cpu_data and doing list_del(policy) on logical hotplug.
- Added support for physical hotplug of CPUs (Untested).
Gentle reminder for reviews !! :)
On 02/26/2015 09:26 PM, Viresh Kumar wrote:
On 19 February 2015 at 17:02, Viresh Kumar viresh.kumar@linaro.org wrote:
Hi Rafael,
The aim of this series is to stop managing cpufreq sysfs directories on CPU hotplugs.
Currently on removal of a 'cpu != policy->cpu', we remove its sysfs directories by removing the soft-link. And on removal of policy->cpu, we migrate the sysfs directories to the next cpu. But if policy->cpu was the last CPU, we remove the policy completely and allocate it again as soon as the CPUs come back. This has shortcomings:
- Code Complexity
- Slower hotplug
- sysfs file permissions are reset after all policy->cpus are offlined
- CPUFreq stats history lost after all policy->cpus are offlined
- Special management of sysfs stuff during suspend/resume
To make things simple we stop playing with sysfs files unless the driver is getting removed. Also the policy is kept intact to be used later.
First few patches provide a clean base for others *more important* patches.
Rebased-over: your bleeding edge branch as there were dependencies on my earlier patches.
Pushed here:
git://git.linaro.org/people/viresh.kumar/linux.git cpufreq/core/sysfs
v1->V2:
- Dropped the idea of using policy-lists for getting policy for any cpu
- Also dropped fallback list and its per-cpu variable
- Stopped cleaning cpufreq_cpu_data and doing list_del(policy) on logical hotplug.
- Added support for physical hotplug of CPUs (Untested).
Gentle reminder for reviews !! :)
Sorry, caught up on some internal work. Should be able to get to this by Wednesday.
-Saravana
On 28 February 2015 at 08:06, Saravana Kannan skannan@codeaurora.org wrote:
On 02/26/2015 09:26 PM, Viresh Kumar wrote:
Gentle reminder for reviews !! :)
Sorry, caught up on some internal work. Should be able to get to this by Wednesday.
Will it be possible to get some reviews on this? @Rafael: Can you do them if Saravana isn't available? I don't want to miss another kernel release for this stuff.
On 03/16/2015 02:45 AM, Viresh Kumar wrote:
On 28 February 2015 at 08:06, Saravana Kannan skannan@codeaurora.org wrote:
On 02/26/2015 09:26 PM, Viresh Kumar wrote:
Gentle reminder for reviews !! :)
Sorry, caught up on some internal work. Should be able to get to this by Wednesday.
Will it be possible to get some reviews on this? @Rafael: Can you do them if Saravana isn't available? I don't want to miss another kernel release for this stuff.
Sorry for the delay. I'm a little bit more available. I'm trying to look at this right now. But obviously, more eyes the better. So, others should feel free to review/chime in.
-Saravana
On 18 March 2015 at 03:43, Saravana Kannan skannan@codeaurora.org wrote:
Sorry for the delay. I'm a little bit more available. I'm trying to look at this right now. But obviously, more eyes the better. So, others should feel free to review/chime in.
Hi Saravana/Rafael,
Can we please finish the reviews here ?
On Thursday, March 26, 2015 05:29:01 PM Viresh Kumar wrote:
On 18 March 2015 at 03:43, Saravana Kannan skannan@codeaurora.org wrote:
Sorry for the delay. I'm a little bit more available. I'm trying to look at this right now. But obviously, more eyes the better. So, others should feel free to review/chime in.
Hi Saravana/Rafael,
Can we please finish the reviews here ?
My impression was that there had been some comments to address already. Isn't that the case?
On 03/26/2015 01:28 PM, Rafael J. Wysocki wrote:
On Thursday, March 26, 2015 05:29:01 PM Viresh Kumar wrote:
On 18 March 2015 at 03:43, Saravana Kannan skannan@codeaurora.org wrote:
Sorry for the delay. I'm a little bit more available. I'm trying to look at this right now. But obviously, more eyes the better. So, others should feel free to review/chime in.
Hi Saravana/Rafael,
Can we please finish the reviews here ?
My impression was that there had been some comments to address already. Isn't that the case?
I'm not waiting on those. Just some internal fires to put out. Will review more today/tomorrow.
-Saravana
On 27 March 2015 at 01:58, Rafael J. Wysocki rjw@rjwysocki.net wrote:
My impression was that there had been some comments to address already. Isn't that the case?
It was just about coding style, nothing more. Use do-while instead of while(1) :)
On 02/19/2015 03:32 AM, Viresh Kumar wrote:
Hi Rafael,
The aim of this series is to stop managing cpufreq sysfs directories on CPU hotplugs.
Currently on removal of a 'cpu != policy->cpu', we remove its sysfs directories by removing the soft-link. And on removal of policy->cpu, we migrate the sysfs directories to the next cpu. But if policy->cpu was the last CPU, we remove the policy completely and allocate it again as soon as the CPUs come back. This has shortcomings:
- Code Complexity
- Slower hotplug
- sysfs file permissions are reset after all policy->cpus are offlined
- CPUFreq stats history lost after all policy->cpus are offlined
- Special management of sysfs stuff during suspend/resume
To make things simple we stop playing with sysfs files unless the driver is getting removed. Also the policy is kept intact to be used later.
First few patches provide a clean base for others *more important* patches.
Rebased-over: your bleeding edge branch as there were dependencies on my earlier patches.
Pushed here:
git://git.linaro.org/people/viresh.kumar/linux.git cpufreq/core/sysfs
v1->V2:
- Dropped the idea of using policy-lists for getting policy for any cpu
- Also dropped fallback list and its per-cpu variable
- Stopped cleaning cpufreq_cpu_data and doing list_del(policy) on logical hotplug.
- Added support for physical hotplug of CPUs (Untested).
@Srivatsa: Can you please have a look at the above change? I have cc'd you only on this one.
Saravana Kannan (1): cpufreq: Track cpu managing sysfs kobjects separately
Cc: Srivatsa Bhat srivatsa@mit.edu
Viresh Kumar (19): cpufreq: Add doc style comment about cpufreq_cpu_{get|put}() cpufreq: Merge __cpufreq_add_dev() and cpufreq_add_dev() cpufreq: Throw warning when we try to get policy for an invalid CPU cpufreq: Keep a single path for adding managed CPUs cpufreq: Clear policy->cpus even for the last CPU cpufreq: Create for_each_{in}active_policy() cpufreq: Call schedule_work() for the last active policy cpufreq: Don't clear cpufreq_cpu_data and policy list for inactive policies cpufreq: Get rid of cpufreq_cpu_data_fallback cpufreq: Don't traverse list of all policies for adding policy for a cpu cpufreq: Manage governor usage history with 'policy->last_governor' cpufreq: Mark policy->governor = NULL for inactive policies cpufreq: Don't allow updating inactive-policies from sysfs cpufreq: Stop migrating sysfs files on hotplug cpufreq: Remove cpufreq_update_policy() cpufreq: Initialize policy->kobj while allocating policy cpufreq: Call cpufreq_policy_put_kobj() from cpufreq_policy_free() cpufreq: Restart governor as soon as possible cpufreq: Add support for physical hoplug of CPUs
drivers/cpufreq/cpufreq.c | 593 ++++++++++++++++++++++++++-------------------- include/linux/cpufreq.h | 5 +- 2 files changed, 340 insertions(+), 258 deletions(-)
Just so it's clear I'm looking at these patches, I'm acking each patch as and when I look at them.
I might come back and nack some of them if I find an issue with patch N while looking at patch N + M. When I ack the last patch in the series, all my previous acks can be considered as final as they can be.
-Saravana
On Thursday, February 19, 2015 05:02:02 PM Viresh Kumar wrote:
Hi Rafael,
The aim of this series is to stop managing cpufreq sysfs directories on CPU hotplugs.
Currently on removal of a 'cpu != policy->cpu', we remove its sysfs directories by removing the soft-link. And on removal of policy->cpu, we migrate the sysfs directories to the next cpu. But if policy->cpu was the last CPU, we remove the policy completely and allocate it again as soon as the CPUs come back. This has shortcomings:
- Code Complexity
- Slower hotplug
- sysfs file permissions are reset after all policy->cpus are offlined
- CPUFreq stats history lost after all policy->cpus are offlined
- Special management of sysfs stuff during suspend/resume
To make things simple we stop playing with sysfs files unless the driver is getting removed. Also the policy is kept intact to be used later.
First few patches provide a clean base for others *more important* patches.
Rebased-over: your bleeding edge branch as there were dependencies on my earlier patches.
Pushed here:
git://git.linaro.org/people/viresh.kumar/linux.git cpufreq/core/sysfs
v1->V2:
- Dropped the idea of using policy-lists for getting policy for any cpu
- Also dropped fallback list and its per-cpu variable
- Stopped cleaning cpufreq_cpu_data and doing list_del(policy) on logical hotplug.
- Added support for physical hotplug of CPUs (Untested).
@Srivatsa: Can you please have a look at the above change? I have cc'd you only on this one.
Saravana Kannan (1): cpufreq: Track cpu managing sysfs kobjects separately
Cc: Srivatsa Bhat srivatsa@mit.edu
Viresh Kumar (19): cpufreq: Add doc style comment about cpufreq_cpu_{get|put}() cpufreq: Merge __cpufreq_add_dev() and cpufreq_add_dev() cpufreq: Throw warning when we try to get policy for an invalid CPU cpufreq: Keep a single path for adding managed CPUs cpufreq: Clear policy->cpus even for the last CPU cpufreq: Create for_each_{in}active_policy() cpufreq: Call schedule_work() for the last active policy cpufreq: Don't clear cpufreq_cpu_data and policy list for inactive policies cpufreq: Get rid of cpufreq_cpu_data_fallback cpufreq: Don't traverse list of all policies for adding policy for a cpu cpufreq: Manage governor usage history with 'policy->last_governor' cpufreq: Mark policy->governor = NULL for inactive policies cpufreq: Don't allow updating inactive-policies from sysfs cpufreq: Stop migrating sysfs files on hotplug cpufreq: Remove cpufreq_update_policy() cpufreq: Initialize policy->kobj while allocating policy cpufreq: Call cpufreq_policy_put_kobj() from cpufreq_policy_free() cpufreq: Restart governor as soon as possible cpufreq: Add support for physical hoplug of CPUs
drivers/cpufreq/cpufreq.c | 593 ++++++++++++++++++++++++++-------------------- include/linux/cpufreq.h | 5 +- 2 files changed, 340 insertions(+), 258 deletions(-)
I've queued up the first 5 for 4.2, but the rest needs refreshing at least.
linaro-kernel@lists.linaro.org