Hi Rafael,
As discussed here: https://www.mail-archive.com/stable@vger.kernel.org/msg87645.html
This is another attempt to fix an important bug and cleanups around it. First patch is the real fix (Accumulated all Reviewed/Tested-by's) and *should* go in 3.16-rc*.
Others are useful cleanups around it to beautify the code, and can go later in 3.17.
Based over: 3.16-rc4
Viresh Kumar (4): cpufreq: move policy kobj to policy->cpu at resume cpufreq: don't restore policy->cpus on failure to move kobj cpufreq: propagate error returned by kobject_move() cpufreq: move policy kobj from update_policy_cpu()
drivers/cpufreq/cpufreq.c | 68 +++++++++++++++++++++-------------------------- 1 file changed, 31 insertions(+), 37 deletions(-)
This is only relevant to implementations with multiple clusters, where clusters have separate clock lines but all CPUs within a cluster share it.
Consider a dual cluster platform with 2 cores per cluster. During suspend we start hot unplugging CPUs in order 1 to 3. When CPU2 is removed, policy->kobj would be moved to CPU3 and when CPU3 goes down we wouldn't free policy or its kobj as we want to retain permissions/values/etc.
Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev(). We will recover the old policy and update policy->cpu from 3 to 2 from update_policy_cpu().
But the kobj is still tied to CPU3 and isn't moved to CPU2. We wouldn't create a link for CPU2, but would try that for CPU3 while bringing it online. Which will report errors as CPU3 already has kobj assigned to it.
This bug got introduced with commit 42f921a, which overlooked this scenario.
To fix this, lets move kobj to the new policy->cpu while bringing first CPU of a cluster back. Also do a WARN_ON() if kobject_move failed, as we would reach here only for the first CPU of a non-boot cluster. And we can't recover from this situation, if kobject_move() fails.
Fixes: ("42f921a cpufreq: remove sysfs files for CPUs which failed to come back after resume") Cc: stable@vger.kernel.org # 3.13+ Reported-by: Bu Yitian ybu@qti.qualcomm.com Reported-by: Saravana Kannan skannan@codeaurora.org Tested-by: Bu Yitian ybu@qti.qualcomm.com Reviewed-by: Srivatsa S. Bhat srivatsa@mit.edu Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 62259d2..6f02485 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1153,10 +1153,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) * the creation of a brand new one. So we need to perform this update * by invoking update_policy_cpu(). */ - if (recover_policy && cpu != policy->cpu) + if (recover_policy && cpu != policy->cpu) { update_policy_cpu(policy, cpu); - else + WARN_ON(kobject_move(&policy->kobj, &dev->kobj)); + } else { policy->cpu = cpu; + }
cpumask_copy(policy->cpus, cpumask_of(cpu));
On Thursday, July 17, 2014 10:48:25 AM Viresh Kumar wrote:
This is only relevant to implementations with multiple clusters, where clusters have separate clock lines but all CPUs within a cluster share it.
Consider a dual cluster platform with 2 cores per cluster. During suspend we start hot unplugging CPUs in order 1 to 3. When CPU2 is removed, policy->kobj would be moved to CPU3 and when CPU3 goes down we wouldn't free policy or its kobj as we want to retain permissions/values/etc.
Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev(). We will recover the old policy and update policy->cpu from 3 to 2 from update_policy_cpu().
But the kobj is still tied to CPU3 and isn't moved to CPU2. We wouldn't create a link for CPU2, but would try that for CPU3 while bringing it online. Which will report errors as CPU3 already has kobj assigned to it.
This bug got introduced with commit 42f921a, which overlooked this scenario.
To fix this, lets move kobj to the new policy->cpu while bringing first CPU of a cluster back. Also do a WARN_ON() if kobject_move failed, as we would reach here only for the first CPU of a non-boot cluster. And we can't recover from this situation, if kobject_move() fails.
Fixes: ("42f921a cpufreq: remove sysfs files for CPUs which failed to come back after resume") Cc: stable@vger.kernel.org # 3.13+ Reported-by: Bu Yitian ybu@qti.qualcomm.com Reported-by: Saravana Kannan skannan@codeaurora.org Tested-by: Bu Yitian ybu@qti.qualcomm.com Reviewed-by: Srivatsa S. Bhat srivatsa@mit.edu Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Applied, thanks! The rest is on my todo list.
drivers/cpufreq/cpufreq.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 62259d2..6f02485 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1153,10 +1153,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) * the creation of a brand new one. So we need to perform this update * by invoking update_policy_cpu(). */
- if (recover_policy && cpu != policy->cpu)
- if (recover_policy && cpu != policy->cpu) { update_policy_cpu(policy, cpu);
- else
WARN_ON(kobject_move(&policy->kobj, &dev->kobj));
- } else { policy->cpu = cpu;
- }
cpumask_copy(policy->cpus, cpumask_of(cpu));
While hot-unplugging policy->cpu, we call cpufreq_nominate_new_policy_cpu() to nominate next owner of policy, i.e. policy->cpu. If we fail to move policy kobject under the new policy->cpu, we try to update policy->cpus with the old policy->cpu.
This would have been required in case old-cpu is removed from policy->cpus in the first place. But its not done before calling cpufreq_nominate_new_policy_cpu(), but during the POST_DEAD notification which happens quite late in the hot-unplugging path.
So, this is just some useless code hanging around, get rid of it.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 6f02485..e572d51 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1322,11 +1322,6 @@ static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy, ret = kobject_move(&policy->kobj, &cpu_dev->kobj); if (ret) { pr_err("%s: Failed to move kobj: %d\n", __func__, ret); - - down_write(&policy->rwsem); - cpumask_set_cpu(old_cpu, policy->cpus); - up_write(&policy->rwsem); - ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj, "cpufreq");
We are returning -EINVAL instead of the error returned from kobject_move() when it fails. Propagate the actual error number.
Also add a meaningful print when sysfs_create_link() fails.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index e572d51..ec25ca6 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1322,10 +1322,12 @@ static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy, ret = kobject_move(&policy->kobj, &cpu_dev->kobj); if (ret) { pr_err("%s: Failed to move kobj: %d\n", __func__, ret); - ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj, - "cpufreq");
- return -EINVAL; + if (sysfs_create_link(&cpu_dev->kobj, &policy->kobj, "cpufreq")) + pr_err("%s: Failed to restore kobj link to cpu:%d\n", + __func__, cpu_dev->id); + + return ret; }
return cpu_dev->id;
We are calling kobject_move() from two separate places currently and both these places share another routine update_policy_cpu() which is handling everything around updating policy->cpu. Moving ownership of policy->kobj also lies under the role of update_policy_cpu() routine and must handled from there.
So, Lets move kobject_move() to update_policy_cpu() and get rid of cpufreq_nominate_new_policy_cpu() as it doesn't have anything significant left.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 71 ++++++++++++++++++++++------------------------- 1 file changed, 33 insertions(+), 38 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index ec25ca6..d9fdedd 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1076,10 +1076,20 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy) kfree(policy); }
-static void update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu) +static int update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu, + struct device *cpu_dev) { + int ret; + if (WARN_ON(cpu == policy->cpu)) - return; + return 0; + + /* Move kobject to the new policy->cpu */ + ret = kobject_move(&policy->kobj, &cpu_dev->kobj); + if (ret) { + pr_err("%s: Failed to move kobj: %d\n", __func__, ret); + return ret; + }
down_write(&policy->rwsem);
@@ -1090,6 +1100,8 @@ static void update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu)
blocking_notifier_call_chain(&cpufreq_policy_notifier_list, CPUFREQ_UPDATE_POLICY_CPU, policy); + + return 0; }
static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) @@ -1153,12 +1165,10 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) * the creation of a brand new one. So we need to perform this update * by invoking update_policy_cpu(). */ - if (recover_policy && cpu != policy->cpu) { - update_policy_cpu(policy, cpu); - WARN_ON(kobject_move(&policy->kobj, &dev->kobj)); - } else { + if (recover_policy && cpu != policy->cpu) + WARN_ON(update_policy_cpu(policy, cpu, dev)); + else policy->cpu = cpu; - }
cpumask_copy(policy->cpus, cpumask_of(cpu));
@@ -1309,35 +1319,11 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) return __cpufreq_add_dev(dev, sif); }
-static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy, - unsigned int old_cpu) -{ - struct device *cpu_dev; - int ret; - - /* first sibling now owns the new sysfs dir */ - cpu_dev = get_cpu_device(cpumask_any_but(policy->cpus, old_cpu)); - - sysfs_remove_link(&cpu_dev->kobj, "cpufreq"); - ret = kobject_move(&policy->kobj, &cpu_dev->kobj); - if (ret) { - pr_err("%s: Failed to move kobj: %d\n", __func__, ret); - - if (sysfs_create_link(&cpu_dev->kobj, &policy->kobj, "cpufreq")) - pr_err("%s: Failed to restore kobj link to cpu:%d\n", - __func__, cpu_dev->id); - - return ret; - } - - return cpu_dev->id; -} - static int __cpufreq_remove_dev_prepare(struct device *dev, struct subsys_interface *sif) { unsigned int cpu = dev->id, cpus; - int new_cpu, ret; + int ret; unsigned long flags; struct cpufreq_policy *policy;
@@ -1377,14 +1363,23 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, if (cpu != policy->cpu) { sysfs_remove_link(&dev->kobj, "cpufreq"); } else if (cpus > 1) { - new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu); - if (new_cpu >= 0) { - update_policy_cpu(policy, new_cpu); + /* Nominate new CPU */ + int new_cpu = cpumask_any_but(policy->cpus, cpu); + struct device *cpu_dev = get_cpu_device(new_cpu);
- if (!cpufreq_suspended) - pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n", - __func__, new_cpu, cpu); + sysfs_remove_link(&cpu_dev->kobj, "cpufreq"); + ret = update_policy_cpu(policy, new_cpu, cpu_dev); + if (ret) { + if (sysfs_create_link(&cpu_dev->kobj, &policy->kobj, + "cpufreq")) + pr_err("%s: Failed to restore kobj link to cpu:%d\n", + __func__, cpu_dev->id); + return ret; } + + if (!cpufreq_suspended) + pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n", + __func__, new_cpu, cpu); } else if (cpufreq_driver->stop_cpu && cpufreq_driver->setpolicy) { cpufreq_driver->stop_cpu(policy); }
linaro-kernel@lists.linaro.org