Hi Rafael/Srivatsa,
These are some last minute fixes for 3.12. I have found them while looking at the code to fix the latest suspend/resume crashes we see (Reported by Stephen)..
I believe at some place these are behind those crashes, otherwise people with a single cluster or single policy couldn't have faced it.. Like Stephen as he was working on Tegra.
I thought you will wait for my conversation with Srivatsa to get over before actually applying his patches, but saw that just happened :)
No issues, we can talk a bit more about the problems for now and then you can get whatever patches are required for 3.12-rc2
First three of the patchset are minor cleanups (you may or may not take them for 3.12), but last two are some real fixes.. I haven't faced any issue without them but simply found them in code review.
@Stephen: Probably you can try my branch: cpufreq-suspend-fix, which has these patches without Srivatsa's patches (though some bits of those will also be required I believe for multicluster systems)..
-- viresh
Viresh Kumar (5): cpufreq: Remove extra blank line cpufreq: don't break string in print statements cpufreq: remove __cpufreq_remove_dev() cpufreq: don't update policy->cpu while removing while removing other CPUs cpufreq: use correct values of cpus in __cpufreq_remove_dev_finish()
drivers/cpufreq/cpufreq.c | 50 +++++++++++++++++++---------------------------- 1 file changed, 20 insertions(+), 30 deletions(-)
We don't need a blank line just at start of a block, lets remove it.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 5a64f66..28477eb 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1185,7 +1185,6 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, if (cpu != policy->cpu && !frozen) { sysfs_remove_link(&dev->kobj, "cpufreq"); } else if (cpus > 1) { - new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu, frozen); if (new_cpu >= 0) { WARN_ON(lock_policy_rwsem_write(cpu));
On 09/12/2013 10:55 AM, Viresh Kumar wrote:
We don't need a blank line just at start of a block, lets remove it.
Well, I felt having that line avoids clutter, especially since the code around it was already a bit hard to read..
Anyway, I don't have any strong opinions either way. So no objections from my side. (But you might have to rebase this patch on top of Rafael's tree, due to the recent changes he pushed in to this code).
Regards, Srivatsa S. Bhat
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 5a64f66..28477eb 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1185,7 +1185,6 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, if (cpu != policy->cpu && !frozen) { sysfs_remove_link(&dev->kobj, "cpufreq"); } else if (cpus > 1) {
- new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu, frozen); if (new_cpu >= 0) { WARN_ON(lock_policy_rwsem_write(cpu));
On 12 September 2013 13:46, Srivatsa S. Bhat srivatsa.bhat@linux.vnet.ibm.com wrote:
Well, I felt having that line avoids clutter, especially since the code around it was already a bit hard to read..
Well, its exact opposite for me.. somehow my attending keeps dragging to extra blank lines :)
Anyway, I don't have any strong opinions either way. So no objections from my side. (But you might have to rebase this patch on top of Rafael's tree, due to the recent changes he pushed in to this code).
I tried a rebase and couldn't find a issue.. And so not resending the series.
As a rule its better not to break string (quoted inside "") in a print statement even if it crosses 80 column boundary as that may introduce unwanted bugs and so this patch rewrites one of the print statements..
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 28477eb..31f7845 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1192,8 +1192,8 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, unlock_policy_rwsem_write(cpu);
if (!frozen) { - pr_debug("%s: policy Kobject moved to cpu: %d " - "from: %d\n",__func__, new_cpu, cpu); + pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n", + __func__, new_cpu, cpu); } } }
On 09/12/2013 10:55 AM, Viresh Kumar wrote:
As a rule its better not to break string (quoted inside "") in a print statement even if it crosses 80 column boundary as that may introduce unwanted bugs and so this patch rewrites one of the print statements..
Ok, if that is the convention, then so be it.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Reviewed-by: Srivatsa S. Bhat srivatsa.bhat@linux.vnet.ibm.com
Regards, Srivatsa S. Bhat
drivers/cpufreq/cpufreq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 28477eb..31f7845 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1192,8 +1192,8 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, unlock_policy_rwsem_write(cpu);
if (!frozen) {
pr_debug("%s: policy Kobject moved to cpu: %d "
"from: %d\n",__func__, new_cpu, cpu);
pr_debug("%s: policy Kobject moved to cpu: %d from: %d\n",
} }__func__, new_cpu, cpu); }
Nobody except cpufreq_remove_dev() is calling __cpufreq_remove_dev() and so we don't need separate routines here. Lets merge code from __cpufreq_remove_dev() to cpufreq_remove_dev() and get rid of __cpufreq_remove_dev().
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 26 ++++++++------------------ 1 file changed, 8 insertions(+), 18 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 31f7845..5e0a82e 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1285,36 +1285,26 @@ static int __cpufreq_remove_dev_finish(struct device *dev, }
/** - * __cpufreq_remove_dev - remove a CPU device + * cpufreq_remove_dev - remove a CPU device * * Removes the cpufreq interface for a CPU device. * Caller should already have policy_rwsem in write mode for this CPU. * This routine frees the rwsem before returning. */ -static inline int __cpufreq_remove_dev(struct device *dev, - struct subsys_interface *sif, - bool frozen) -{ - int ret; - - ret = __cpufreq_remove_dev_prepare(dev, sif, frozen); - - if (!ret) - ret = __cpufreq_remove_dev_finish(dev, sif, frozen); - - return ret; -} - static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif) { unsigned int cpu = dev->id; - int retval; + int ret;
if (cpu_is_offline(cpu)) return 0;
- retval = __cpufreq_remove_dev(dev, sif, false); - return retval; + ret = __cpufreq_remove_dev_prepare(dev, sif, false); + + if (!ret) + ret = __cpufreq_remove_dev_finish(dev, sif, false); + + return ret; }
static void handle_update(struct work_struct *work)
On 09/12/2013 10:55 AM, Viresh Kumar wrote:
Nobody except cpufreq_remove_dev() is calling __cpufreq_remove_dev() and so we don't need separate routines here. Lets merge code from __cpufreq_remove_dev() to cpufreq_remove_dev() and get rid of __cpufreq_remove_dev().
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Reviewed-by: Srivatsa S. Bhat srivatsa.bhat@linux.vnet.ibm.com
Regards, Srivatsa S. Bhat
drivers/cpufreq/cpufreq.c | 26 ++++++++------------------ 1 file changed, 8 insertions(+), 18 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 31f7845..5e0a82e 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1285,36 +1285,26 @@ static int __cpufreq_remove_dev_finish(struct device *dev, }
/**
- __cpufreq_remove_dev - remove a CPU device
*/
- cpufreq_remove_dev - remove a CPU device
- Removes the cpufreq interface for a CPU device.
- Caller should already have policy_rwsem in write mode for this CPU.
- This routine frees the rwsem before returning.
-static inline int __cpufreq_remove_dev(struct device *dev,
struct subsys_interface *sif,
bool frozen)
-{
- int ret;
- ret = __cpufreq_remove_dev_prepare(dev, sif, frozen);
- if (!ret)
ret = __cpufreq_remove_dev_finish(dev, sif, frozen);
- return ret;
-}
static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif) { unsigned int cpu = dev->id;
- int retval;
int ret;
if (cpu_is_offline(cpu)) return 0;
- retval = __cpufreq_remove_dev(dev, sif, false);
- return retval;
- ret = __cpufreq_remove_dev_prepare(dev, sif, false);
- if (!ret)
ret = __cpufreq_remove_dev_finish(dev, sif, false);
- return ret;
}
static void handle_update(struct work_struct *work)
With a recent change the logic here is changed a bit and I just figured out it is something we don't want.
Consider we have four CPUs (0,1,2,3) managed by a policy and policy->cpu is set to 0. Now we are suspending and hence we call __cpufreq_remove_dev_prepare() for cpu 1, 2 & 3..
With the current code we always call cpufreq_nominate_new_policy_cpu() for cpu 1, 2 & 3 whereas we should skipped most of __cpufreq_remove_dev_prepare() routine.
Lets fix it by moving the check for !frozen inside the first if block.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 5e0a82e..0e11fcb 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1182,8 +1182,9 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, cpumask_clear_cpu(cpu, policy->cpus); unlock_policy_rwsem_write(cpu);
- if (cpu != policy->cpu && !frozen) { - sysfs_remove_link(&dev->kobj, "cpufreq"); + if (cpu != policy->cpu) { + if (!frozen) + sysfs_remove_link(&dev->kobj, "cpufreq"); } else if (cpus > 1) { new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu, frozen); if (new_cpu >= 0) {
On 09/12/2013 10:55 AM, Viresh Kumar wrote:
With a recent change the logic here is changed a bit and I just figured out it is something we don't want.
Consider we have four CPUs (0,1,2,3) managed by a policy and policy->cpu is set to 0. Now we are suspending and hence we call __cpufreq_remove_dev_prepare() for cpu 1, 2 & 3..
With the current code we always call cpufreq_nominate_new_policy_cpu() for cpu 1, 2 & 3 whereas we should skipped most of __cpufreq_remove_dev_prepare() routine.
Lets fix it by moving the check for !frozen inside the first if block.
As you noted in the other thread, Rafael already applied my patch[1] which does the same thing. So I guess you'll drop this patch from your series.
[1].http://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/commit/?h=bl...
Regards, Srivatsa S. Bhat
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 5e0a82e..0e11fcb 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1182,8 +1182,9 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, cpumask_clear_cpu(cpu, policy->cpus); unlock_policy_rwsem_write(cpu);
- if (cpu != policy->cpu && !frozen) {
sysfs_remove_link(&dev->kobj, "cpufreq");
- if (cpu != policy->cpu) {
if (!frozen)
} else if (cpus > 1) { new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu, frozen); if (new_cpu >= 0) {sysfs_remove_link(&dev->kobj, "cpufreq");
This broke after a recent change "cedb70a cpufreq: Split __cpufreq_remove_dev() into two parts" from Srivatsa..
Consider a scenario where we have two CPUs in a policy (0 & 1) and we are removing cpu 1. On the call to __cpufreq_remove_dev_prepare() we have cleared 1 from policy->cpus and now on a call to __cpufreq_remove_dev_finish() we read cpumask_weight of policy->cpus, which will come as 1 and this code will behave as if we are removing the last cpu from policy :)
Fix it by clearing cpu mask in __cpufreq_remove_dev_finish() instead of __cpufreq_remove_dev_prepare().
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 0e11fcb..b556d46 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1175,12 +1175,9 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, policy->governor->name, CPUFREQ_NAME_LEN); #endif
- WARN_ON(lock_policy_rwsem_write(cpu)); + lock_policy_rwsem_read(cpu); cpus = cpumask_weight(policy->cpus); - - if (cpus > 1) - cpumask_clear_cpu(cpu, policy->cpus); - unlock_policy_rwsem_write(cpu); + unlock_policy_rwsem_read(cpu);
if (cpu != policy->cpu) { if (!frozen) @@ -1222,9 +1219,12 @@ static int __cpufreq_remove_dev_finish(struct device *dev, return -EINVAL; }
- lock_policy_rwsem_read(cpu); + WARN_ON(lock_policy_rwsem_write(cpu)); cpus = cpumask_weight(policy->cpus); - unlock_policy_rwsem_read(cpu); + + if (cpus > 1) + cpumask_clear_cpu(cpu, policy->cpus); + unlock_policy_rwsem_write(cpu);
/* If cpu is last user of policy, free policy */ if (cpus == 1) {
On 09/12/2013 10:55 AM, Viresh Kumar wrote:
This broke after a recent change "cedb70a cpufreq: Split __cpufreq_remove_dev() into two parts" from Srivatsa..
Consider a scenario where we have two CPUs in a policy (0 & 1) and we are removing cpu 1. On the call to __cpufreq_remove_dev_prepare() we have cleared 1 from policy->cpus and now on a call to __cpufreq_remove_dev_finish() we read cpumask_weight of policy->cpus, which will come as 1 and this code will behave as if we are removing the last cpu from policy :)
Fix it by clearing cpu mask in __cpufreq_remove_dev_finish() instead of __cpufreq_remove_dev_prepare().
Oops! Good catch!
That said, your fix doesn't look correct. See below.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 0e11fcb..b556d46 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1175,12 +1175,9 @@ static int __cpufreq_remove_dev_prepare(struct device *dev, policy->governor->name, CPUFREQ_NAME_LEN); #endif
- WARN_ON(lock_policy_rwsem_write(cpu));
- lock_policy_rwsem_read(cpu); cpus = cpumask_weight(policy->cpus);
- if (cpus > 1)
cpumask_clear_cpu(cpu, policy->cpus);
- unlock_policy_rwsem_write(cpu);
unlock_policy_rwsem_read(cpu);
if (cpu != policy->cpu) { if (!frozen)
Around here, we call cpufreq_nominate_new_policy_cpu(), and if we haven't cleared the CPU by then, there is a chance that it will nominate the same CPU that we are taking offline. So its important to clear the CPU before that point.
@@ -1222,9 +1219,12 @@ static int __cpufreq_remove_dev_finish(struct device *dev, return -EINVAL; }
- lock_policy_rwsem_read(cpu);
- WARN_ON(lock_policy_rwsem_write(cpu)); cpus = cpumask_weight(policy->cpus);
- unlock_policy_rwsem_read(cpu);
- if (cpus > 1)
cpumask_clear_cpu(cpu, policy->cpus);
- unlock_policy_rwsem_write(cpu);
Perhaps we can retain the above as a read operation, ...
/* If cpu is last user of policy, free policy */ if (cpus == 1) {
... and change this suitably (from 1 to 0 etc..) ? To add to it, it will look more clear as well:
if (cpus == 0) { /* No cpus in policy, so free it */ } else { /* Restart governor */ }
Regards, Srivatsa S. Bhat
On 12 September 2013 12:10, Srivatsa S. Bhat srivatsa.bhat@linux.vnet.ibm.com wrote:
That said, your fix doesn't look correct. See below.
I thought I was perfect !! :(
... and change this suitably (from 1 to 0 etc..) ? To add to it, it will look more clear as well:
if (cpus == 0) { /* No cpus in policy, so free it */ } else { /* Restart governor */ }
Currently cpus never become zero as we clear mask only while there are more than one cpu in a policy... Wait let me see what's the cleanest way to get this fixed..
On 12 September 2013 12:17, Viresh Kumar viresh.kumar@linaro.org wrote:
Currently cpus never become zero as we clear mask only while there are more than one cpu in a policy... Wait let me see what's the cleanest way to get this fixed..
Okay, simply replace cpumask_first() with cpumask_any_but() in cpufreq_nominate_new_policy_cpu() :)
On 09/12/2013 12:26 PM, Viresh Kumar wrote:
On 12 September 2013 12:17, Viresh Kumar viresh.kumar@linaro.org wrote:
Currently cpus never become zero as we clear mask only while there are more than one cpu in a policy... Wait let me see what's the cleanest way to get this fixed..
Okay, simply replace cpumask_first() with cpumask_any_but() in cpufreq_nominate_new_policy_cpu() :)
That sounds good! Even the naming is much better, it conveys the intent clearly.
Regards, Srivatsa S. Bhat
On 12 September 2013 12:46, Srivatsa S. Bhat srivatsa.bhat@linux.vnet.ibm.com wrote:
That sounds good! Even the naming is much better, it conveys the intent clearly.
Folded below change in my patch (attached):
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index b556d46..23f5845 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1111,7 +1111,7 @@ static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy, int ret;
/* first sibling now owns the new sysfs dir */ - cpu_dev = get_cpu_device(cpumask_first(policy->cpus)); + cpu_dev = get_cpu_device(cpumask_any_but(policy->cpus, old_cpu));
/* Don't touch sysfs files during light-weight tear-down */ if (frozen)
-- thanks
On Thursday, September 12, 2013 02:51:58 PM Viresh Kumar wrote:
On 12 September 2013 12:46, Srivatsa S. Bhat srivatsa.bhat@linux.vnet.ibm.com wrote:
That sounds good! Even the naming is much better, it conveys the intent clearly.
Folded below change in my patch (attached):
Please resend. And I honestly don't think that [1-3/5] are fixes and [4/5] is not needed any more.
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index b556d46..23f5845 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1111,7 +1111,7 @@ static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy, int ret;
/* first sibling now owns the new sysfs dir */
cpu_dev = get_cpu_device(cpumask_first(policy->cpus));
cpu_dev = get_cpu_device(cpumask_any_but(policy->cpus, old_cpu)); /* Don't touch sysfs files during light-weight tear-down */ if (frozen)
--
On 12 September 2013 16:17, Rafael J. Wysocki rjw@sisk.pl wrote:
Please resend. And I honestly don't think that [1-3/5] are fixes and [4/5] is not needed any more.
Okay, I will resend 5/5.. I didn't get you last two statements clearly :( - don't think that [1-3/5] are fixes: You meant they are cleanups rather? Yes, and that's why I asked you to take those on for 3.13 and not for 12..
- [4/5] is not needed any more: 4/5 is already applied by you and it came from Rafael.. I reinvented the wheel, nothing more. And so it is not required anymore :)
On Thursday, September 12, 2013 04:13:03 PM Viresh Kumar wrote:
On 12 September 2013 16:17, Rafael J. Wysocki rjw@sisk.pl wrote:
Please resend. And I honestly don't think that [1-3/5] are fixes and [4/5] is not needed any more.
Okay, I will resend 5/5.. I didn't get you last two statements clearly :(
- don't think that [1-3/5] are fixes: You meant they are cleanups rather?
Yes.
Yes, and that's why I asked you to take those on for 3.13 and not for 12..
Ah, sorry, I must have overlooked that. I didn't say I wasn't going to take them, though.
- [4/5] is not needed any more: 4/5 is already applied by you and it came
from Rafael.. I reinvented the wheel, nothing more. And so it is not required anymore :)
Surely from Srivatsa?
Thanks, Rafael
On 12 September 2013 16:26, Rafael J. Wysocki rjw@sisk.pl wrote:
Surely from Srivatsa?
Ofcourse.. I kept repeating this to myself... "Viresh, please read mails you have written before you send.. So many times you are pointed about these silly mistakes"... And I keep forgetting them, ah its a fairly small mail, what can go wrong.. :)
-- viresh
On 09/12/2013 03:21 AM, Viresh Kumar wrote:
On 12 September 2013 12:46, Srivatsa S. Bhat srivatsa.bhat@linux.vnet.ibm.com wrote:
That sounds good! Even the naming is much better, it conveys the intent clearly.
Folded below change in my patch (attached):
...
For the record, the patch which was attached to Viresh's email solves the hangs I was getting in next-20130912 during (the second?) CPU hotplug and resume. So,
Tested-by: Stephen Warren swarren@nvidia.com
Without this, I saw:
root@localhost:~# ./cpuonline.py echo 0 > /sys/devices/system/cpu/cpu1/online [ 24.049729] CPU1: shutdown echo 1 > /sys/devices/system/cpu/cpu1/online [ 26.090439] CPU1: Booted secondary processor root@localhost:~# ./cpuonline.py echo 0 > /sys/devices/system/cpu/cpu1/online [ 29.003016] CPU1: shutdown echo 1 > /sys/devices/system/cpu/cpu1/online [ 31.032562] CPU1: Booted secondary processor
[hung]
[ 240.300393] INFO: task ondemand:678 blocked for more than 120 seconds. [ 240.312842] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 240.326658] ondemand D c050f6cc 0 678 1 0x00000000 [ 240.339062] [<c050f6cc>] (__schedule+0x330/0x6d0) from [<c050fe50>] (schedule_preempt_disabled+0x24/0x34) [ 240.354904] [<c050fe50>] (schedule_preempt_disabled+0x24/0x34) from [<c050e810>] (__mutex_lock_slowpath+0x178/0x370) [ 240.371767] [<c050e810>] (__mutex_lock_slowpath+0x178/0x370) from [<c050ea14>] (mutex_lock+0xc/0x24) [ 240.387363] [<c050ea14>] (mutex_lock+0xc/0x24) from [<c00245e8>] (get_online_cpus+0x2c/0x48) [ 240.402273] [<c00245e8>] (get_online_cpus+0x2c/0x48) from [<c0342c0c>] (store+0x18/0xbc) [ 240.417688] [<c0342c0c>] (store+0x18/0xbc) from [<c011d598>] (sysfs_write_file+0x168/0x198) [ 240.432687] [<c011d598>] (sysfs_write_file+0x168/0x198) from [<c00cbf98>] (vfs_write+0xb0/0x188) [ 240.448185] [<c00cbf98>] (vfs_write+0xb0/0x188) from [<c00cc348>] (SyS_write+0x3c/0x70) [ 240.462964] [<c00cc348>] (SyS_write+0x3c/0x70) from [<c000e540>] (ret_fast_syscall+0x0/0x30) [ 240.478239] INFO: task sh:827 blocked for more than 120 seconds. [ 240.491027] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 240.505795] sh D c050f6cc 0 827 825 0x00000001 [ 240.519169] [<c050f6cc>] (__schedule+0x330/0x6d0) from [<c0510660>] (__down_write_nested+0x78/0xcc) [ 240.535666] [<c0510660>] (__down_write_nested+0x78/0xcc) from [<c0342ae0>] (lock_policy_rwsem_write+0x34/0x48) [ 240.552950] [<c0342ae0>] (lock_policy_rwsem_write+0x34/0x48) from [<c0343a04>] (cpufreq_update_policy+0x20/0xec) [ 240.570631] [<c0343a04>] (cpufreq_update_policy+0x20/0xec) from [<c0344410>] (cpufreq_cpu_callback+0x78/0x8c) [ 240.588051] [<c0344410>] (cpufreq_cpu_callback+0x78/0x8c) from [<c0042f44>] (notifier_call_chain+0x44/0x84) [ 240.605401] [<c0042f44>] (notifier_call_chain+0x44/0x84) from [<c0024594>] (__cpu_notify+0x28/0x44) [ 240.622085] [<c0024594>] (__cpu_notify+0x28/0x44) from [<c00247fc>] (_cpu_up+0x100/0x154) [ 240.638006] [<c00247fc>] (_cpu_up+0x100/0x154) from [<c00248c4>] (cpu_up+0x5c/0x84) [ 240.653405] [<c00248c4>] (cpu_up+0x5c/0x84) from [<c028a2a8>] (device_online+0x64/0x88) [ 240.669263] [<c028a2a8>] (device_online+0x64/0x88) from [<c028a328>] (store_online+0x5c/0x64) [ 240.685683] [<c028a328>] (store_online+0x5c/0x64) from [<c028816c>] (dev_attr_store+0x18/0x24) [ 240.702337] [<c028816c>] (dev_attr_store+0x18/0x24) from [<c011d598>] (sysfs_write_file+0x168/0x198) [ 240.719520] [<c011d598>] (sysfs_write_file+0x168/0x198) from [<c00cbf98>] (vfs_write+0xb0/0x188) [ 240.736614] [<c00cbf98>] (vfs_write+0xb0/0x188) from [<c00cc348>] (SyS_write+0x3c/0x70) [ 240.752799] [<c00cc348>] (SyS_write+0x3c/0x70) from [<c000e540>] (ret_fast_syscall+0x0/0x30)
On 09/11/2013 11:25 PM, Viresh Kumar wrote:
This broke after a recent change "cedb70a cpufreq: Split __cpufreq_remove_dev() into two parts" from Srivatsa..
Consider a scenario where we have two CPUs in a policy (0 & 1) and we are removing cpu 1. On the call to __cpufreq_remove_dev_prepare() we have cleared 1 from policy->cpus and now on a call to __cpufreq_remove_dev_finish() we read cpumask_weight of policy->cpus, which will come as 1 and this code will behave as if we are removing the last cpu from policy :)
Fix it by clearing cpu mask in __cpufreq_remove_dev_finish() instead of __cpufreq_remove_dev_prepare().
I see this patch isn't in linux-next yet, nor did it make 3.12-rc1. I assume it'll make 3.12-rc2? It solves various CPU hotplug and suspend/resume issues for me.
On 17 September 2013 20:50, Stephen Warren swarren@wwwdotorg.org wrote:
On 09/11/2013 11:25 PM, Viresh Kumar wrote:
This broke after a recent change "cedb70a cpufreq: Split __cpufreq_remove_dev() into two parts" from Srivatsa..
Consider a scenario where we have two CPUs in a policy (0 & 1) and we are removing cpu 1. On the call to __cpufreq_remove_dev_prepare() we have cleared 1 from policy->cpus and now on a call to __cpufreq_remove_dev_finish() we read cpumask_weight of policy->cpus, which will come as 1 and this code will behave as if we are removing the last cpu from policy :)
Fix it by clearing cpu mask in __cpufreq_remove_dev_finish() instead of __cpufreq_remove_dev_prepare().
I see this patch isn't in linux-next yet, nor did it make 3.12-rc1. I assume it'll make 3.12-rc2? It solves various CPU hotplug and suspend/resume issues for me.
Yes, I have asked Rafael to get this in for rc2 and few others too.. Waiting for his reply though..
-- viresh
On Tuesday, September 17, 2013 09:48:08 PM Viresh Kumar wrote:
On 17 September 2013 20:50, Stephen Warren swarren@wwwdotorg.org wrote:
On 09/11/2013 11:25 PM, Viresh Kumar wrote:
This broke after a recent change "cedb70a cpufreq: Split __cpufreq_remove_dev() into two parts" from Srivatsa..
Consider a scenario where we have two CPUs in a policy (0 & 1) and we are removing cpu 1. On the call to __cpufreq_remove_dev_prepare() we have cleared 1 from policy->cpus and now on a call to __cpufreq_remove_dev_finish() we read cpumask_weight of policy->cpus, which will come as 1 and this code will behave as if we are removing the last cpu from policy :)
Fix it by clearing cpu mask in __cpufreq_remove_dev_finish() instead of __cpufreq_remove_dev_prepare().
I see this patch isn't in linux-next yet, nor did it make 3.12-rc1. I assume it'll make 3.12-rc2? It solves various CPU hotplug and suspend/resume issues for me.
Yes, I have asked Rafael to get this in for rc2 and few others too.. Waiting for his reply though..
Care to remind me what the other patches were?
Rafael
On 17 September 2013 21:48, Viresh Kumar viresh.kumar@linaro.org wrote:
On 17 September 2013 20:50, Stephen Warren swarren@wwwdotorg.org wrote:
Yes, I have asked Rafael to get this in for rc2 and few others too.. Waiting for his reply though..
Its applied now by Rafael..
On 12 September 2013 10:55, Viresh Kumar viresh.kumar@linaro.org wrote:
First three of the patchset are minor cleanups (you may or may not take them for 3.12), but last two are some real fixes.. I haven't faced any issue without them but simply found them in code review.
Viresh Kumar (5): cpufreq: Remove extra blank line cpufreq: don't break string in print statements cpufreq: remove __cpufreq_remove_dev() cpufreq: don't update policy->cpu while removing while removing other CPUs cpufreq: use correct values of cpus in __cpufreq_remove_dev_finish()
So in the end there is only one real fix in this series as 4/4/ is already in..
To make it clear again these are my pending patches for cpufreq: For 3.12-rc2: - 5/5 of this patchset (modified one).. - "serialize freq transitions" that I will send separately today as Guennadi has already tested that ..
For 3.13: - First 3 patches from this series (I tried rebase them over your linux-next and didn't found any issue) - 5 patches from: http://www.gossamer-threads.com/lists/linux/kernel/1781051 - And the rest ~220 patches that I will send soon (Will rebase them over all above patches)..
Thanks. Viresh
linaro-kernel@lists.linaro.org