cpufreq core doesn't manage offline cpus and if driver->init() has returned mask including offline cpus, it may result in unwanted behavior by cpufreq core or governors.
We need to get only online cpus in this mask. There are two places to fix this mask, cpufreq core and cpufreq driver. It makes sense to do this at common place and hence is done in core.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 1f93dbd..de99517 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -970,6 +970,13 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) pr_debug("initialization failed\n"); goto err_unlock_policy; } + + /* + * affected cpus must always be the one, which are online. We aren't + * managing offline cpus here. + */ + cpumask_and(policy->cpus, policy->cpus, cpu_online_mask); + policy->user_policy.min = policy->min; policy->user_policy.max = policy->max;
Because cpufreq core and governors worry only about the online cpus, if a cpu is hot [un]plugged, we must notify governors about it, otherwise be ready to expect something unexpected.
We already have notifiers in the form of CPUFREQ_GOV_START/CPUFREQ_GOV_STOP, we just need to call them now.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index de99517..a0a33bd 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -751,11 +751,16 @@ static int cpufreq_add_dev_policy(unsigned int cpu, return -EBUSY; }
+ __cpufreq_governor(managed_policy, CPUFREQ_GOV_STOP); + spin_lock_irqsave(&cpufreq_driver_lock, flags); cpumask_copy(managed_policy->cpus, policy->cpus); per_cpu(cpufreq_cpu_data, cpu) = managed_policy; spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+ __cpufreq_governor(managed_policy, CPUFREQ_GOV_START); + __cpufreq_governor(managed_policy, CPUFREQ_GOV_LIMITS); + pr_debug("CPU already managed, adding link\n"); ret = sysfs_create_link(&dev->kobj, &managed_policy->kobj, @@ -1066,8 +1071,13 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif */ if (unlikely(cpu != data->cpu)) { pr_debug("removing link\n"); + __cpufreq_governor(data, CPUFREQ_GOV_STOP); cpumask_clear_cpu(cpu, data->cpus); spin_unlock_irqrestore(&cpufreq_driver_lock, flags); + + __cpufreq_governor(data, CPUFREQ_GOV_START); + __cpufreq_governor(data, CPUFREQ_GOV_LIMITS); + kobj = &dev->kobj; cpufreq_cpu_put(data); unlock_policy_rwsem_write(cpu);
This is how the core works: cpufreq_driver_unregister() - subsys_interface_unregister() - for_each_cpu() call cpufreq_remove_dev(), i.e. 0,1,2,3,4 when we unregister.
cpufreq_remove_dev(): - Remove policy node - Call cpufreq_add_dev() for next cpu, sharing mask with removed cpu. i.e. When cpu 0 is removed, we call it for cpu 1. And when called for cpu 2, we call it for cpu 3. - cpufreq_add_dev() would call cpufreq_driver->init() - init would return mask as AND of 2, 3 and 4 for cluster A7. - cpufreq core would do online_cpu && policy->cpus Here is the BUG(). Because cpu hasn't died but we have just unregistered the cpufreq driver, online cpu would still have cpu 2 in it. And so thing go bad again.
Solution: Keep cpumask of cpus that are registered with cpufreq core and clear cpus when we get a call from subsys_interface_unregister() via cpufreq_remove_dev().
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/cpufreq/cpufreq.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index a0a33bd..271d3be 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -47,6 +47,9 @@ static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor); #endif static DEFINE_SPINLOCK(cpufreq_driver_lock);
+/* Used when we unregister cpufreq driver */ +struct cpumask cpufreq_online_mask; + /* * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure * all cpufreq/hotplug/workqueue/etc related lock issues. @@ -981,6 +984,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) * managing offline cpus here. */ cpumask_and(policy->cpus, policy->cpus, cpu_online_mask); + cpumask_and(policy->cpus, policy->cpus, &cpufreq_online_mask);
policy->user_policy.min = policy->min; policy->user_policy.max = policy->max; @@ -1064,7 +1068,6 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif } per_cpu(cpufreq_cpu_data, cpu) = NULL;
- #ifdef CONFIG_SMP /* if this isn't the CPU which is the parent of the kobj, we * only need to unlink, put and exit @@ -1185,6 +1188,7 @@ static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif) if (unlikely(lock_policy_rwsem_write(cpu))) BUG();
+ cpumask_clear_cpu(cpu, &cpufreq_online_mask); retval = __cpufreq_remove_dev(dev, sif); return retval; } @@ -1903,6 +1907,8 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data) cpufreq_driver = driver_data; spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+ cpumask_setall(&cpufreq_online_mask); + ret = subsys_interface_register(&cpufreq_interface); if (ret) goto err_null_driver;
Hi Viresh,
On 12/16/2012 11:20 AM, Viresh Kumar wrote:
This is how the core works: cpufreq_driver_unregister()
- subsys_interface_unregister()
- for_each_cpu() call cpufreq_remove_dev(), i.e. 0,1,2,3,4 when we unregister.
cpufreq_remove_dev():
- Remove policy node
- Call cpufreq_add_dev() for next cpu, sharing mask with removed cpu. i.e. When cpu 0 is removed, we call it for cpu 1. And when called for cpu 2, we call it for cpu 3.
- cpufreq_add_dev() would call cpufreq_driver->init()
- init would return mask as AND of 2, 3 and 4 for cluster A7.
- cpufreq core would do online_cpu && policy->cpus Here is the BUG(). Because cpu hasn't died but we have just unregistered the cpufreq driver, online cpu would still have cpu 2 in it. And so thing go bad again.
Solution: Keep cpumask of cpus that are registered with cpufreq core and clear cpus when we get a call from subsys_interface_unregister() via cpufreq_remove_dev().
I took a quick look at the problem you described above, and the cpufreq code.. If we cannot avoid calling cpufreq_add_dev() from cpufreq_remove_dev(), then I can't think of anything better than what your patch does.
BTW, off-topic, while going through that path, I think I found a memory leak in __cpufreq_remove_dev():
if (unlikely(cpumask_weight(data->cpus) > 1)) { for_each_cpu(j, data->cpus) { if (j == cpu) continue; per_cpu(cpufreq_cpu_data, j) = NULL; } }
We are assigning NULL without freeing that memory.
Regards, Srivatsa S. Bhat
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index a0a33bd..271d3be 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -47,6 +47,9 @@ static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor); #endif static DEFINE_SPINLOCK(cpufreq_driver_lock);
+/* Used when we unregister cpufreq driver */ +struct cpumask cpufreq_online_mask;
/*
- cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
- all cpufreq/hotplug/workqueue/etc related lock issues.
@@ -981,6 +984,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) * managing offline cpus here. */ cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
cpumask_and(policy->cpus, policy->cpus, &cpufreq_online_mask);
policy->user_policy.min = policy->min; policy->user_policy.max = policy->max;
@@ -1064,7 +1068,6 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif } per_cpu(cpufreq_cpu_data, cpu) = NULL;
#ifdef CONFIG_SMP /* if this isn't the CPU which is the parent of the kobj, we * only need to unlink, put and exit @@ -1185,6 +1188,7 @@ static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif) if (unlikely(lock_policy_rwsem_write(cpu))) BUG();
- cpumask_clear_cpu(cpu, &cpufreq_online_mask); retval = __cpufreq_remove_dev(dev, sif); return retval;
} @@ -1903,6 +1907,8 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data) cpufreq_driver = driver_data; spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
- cpumask_setall(&cpufreq_online_mask);
- ret = subsys_interface_register(&cpufreq_interface); if (ret) goto err_null_driver;
On 3 January 2013 19:55, Srivatsa S. Bhat srivatsa.bhat@linux.vnet.ibm.com wrote:
I took a quick look at the problem you described above, and the cpufreq code.. If we cannot avoid calling cpufreq_add_dev() from cpufreq_remove_dev(), then I can't think of anything better than what your patch does.
Good :)
BTW, off-topic, while going through that path, I think I found a memory leak in __cpufreq_remove_dev():
if (unlikely(cpumask_weight(data->cpus) > 1)) { for_each_cpu(j, data->cpus) { if (j == cpu) continue; per_cpu(cpufreq_cpu_data, j) = NULL; } }
We are assigning NULL without freeing that memory.
Not really. All cpus in affected_cpus (data->cpus), share the same policy structure. We have already taken backup of cpufreq_cpu_data for the first cpu in "data" and are freeing it here:
kfree(data);
On 01/04/2013 10:49 AM, Viresh Kumar wrote:
On 3 January 2013 19:55, Srivatsa S. Bhat srivatsa.bhat@linux.vnet.ibm.com wrote:
I took a quick look at the problem you described above, and the cpufreq code.. If we cannot avoid calling cpufreq_add_dev() from cpufreq_remove_dev(), then I can't think of anything better than what your patch does.
Good :)
BTW, off-topic, while going through that path, I think I found a memory leak in __cpufreq_remove_dev():
if (unlikely(cpumask_weight(data->cpus) > 1)) { for_each_cpu(j, data->cpus) { if (j == cpu) continue; per_cpu(cpufreq_cpu_data, j) = NULL; } }
We are assigning NULL without freeing that memory.
Not really. All cpus in affected_cpus (data->cpus), share the same policy structure. We have already taken backup of cpufreq_cpu_data for the first cpu in "data" and are freeing it here:
kfree(data);
Ah, ok, got it. Thanks!
Regards, Srivatsa S. Bhat
On Sunday, December 16, 2012 11:20:08 AM Viresh Kumar wrote:
cpufreq core doesn't manage offline cpus and if driver->init() has returned mask including offline cpus, it may result in unwanted behavior by cpufreq core or governors.
We need to get only online cpus in this mask. There are two places to fix this mask, cpufreq core and cpufreq driver. It makes sense to do this at common place and hence is done in core.
Well, this series makes sense to me, but I'd like to hear what the other people think.
Thanks, Rafael
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 1f93dbd..de99517 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -970,6 +970,13 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) pr_debug("initialization failed\n"); goto err_unlock_policy; }
- /*
* affected cpus must always be the one, which are online. We aren't
* managing offline cpus here.
*/
- cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
- policy->user_policy.min = policy->min; policy->user_policy.max = policy->max;
On 16 December 2012 18:34, Rafael J. Wysocki rjw@sisk.pl wrote:
On Sunday, December 16, 2012 11:20:08 AM Viresh Kumar wrote:
cpufreq core doesn't manage offline cpus and if driver->init() has returned mask including offline cpus, it may result in unwanted behavior by cpufreq core or governors.
We need to get only online cpus in this mask. There are two places to fix this mask, cpufreq core and cpufreq driver. It makes sense to do this at common place and hence is done in core.
Well, this series makes sense to me, but I'd like to hear what the other people think.
That sounds great :)
Some more information for others about how i reached to these issues is present here:
On 16 December 2012 19:07, Viresh Kumar viresh.kumar@linaro.org wrote:
On 16 December 2012 18:34, Rafael J. Wysocki rjw@sisk.pl wrote:
Well, this series makes sense to me, but I'd like to hear what the other people think.
That sounds great :)
Some more information for others about how i reached to these issues is present here:
Hmm.. I don't know, if we are going to get any feedback from others :(
BTW, i consider them as fixes and so would make sense to get them in next rc. What do you think?
-- viresh
On Wednesday, January 02, 2013 11:59:57 AM Viresh Kumar wrote:
On 16 December 2012 19:07, Viresh Kumar viresh.kumar@linaro.org wrote:
On 16 December 2012 18:34, Rafael J. Wysocki rjw@sisk.pl wrote:
Well, this series makes sense to me, but I'd like to hear what the other people think.
That sounds great :)
Some more information for others about how i reached to these issues is present here:
Hmm.. I don't know, if we are going to get any feedback from others :(
Surely somebody cares except for us two?
BTW, i consider them as fixes and so would make sense to get them in next rc. What do you think?
Yes, if somebody tells me "yes, this fixes a problem for me". Otherwise, I don't quite see the reason.
Thanks, Rafael
On 3 January 2013 06:43, Rafael J. Wysocki rjw@sisk.pl wrote:
BTW, i consider them as fixes and so would make sense to get them in next rc. What do you think?
Yes, if somebody tells me "yes, this fixes a problem for me". Otherwise, I don't quite see the reason.
I don't know how much people test HOTPLUG, but there are clear bugs related to hotplug of cpus on a multiple cpu system :)
On Thursday, January 03, 2013 09:02:22 AM Viresh Kumar wrote:
On 3 January 2013 06:43, Rafael J. Wysocki rjw@sisk.pl wrote:
BTW, i consider them as fixes and so would make sense to get them in next rc. What do you think?
Yes, if somebody tells me "yes, this fixes a problem for me". Otherwise, I don't quite see the reason.
I don't know how much people test HOTPLUG, but there are clear bugs related to hotplug of cpus on a multiple cpu system :)
True, but have those bugs been introduced recently (ie. in v3.8-rc1 or later)?
Rafael
On 3 January 2013 17:32, Rafael J. Wysocki rjw@sisk.pl wrote:
True, but have those bugs been introduced recently (ie. in v3.8-rc1 or later)?
Don't know... I feel they were always there, its just that nobody tested it that way :)
On Friday, January 04, 2013 10:44:36 AM Viresh Kumar wrote:
On 3 January 2013 17:32, Rafael J. Wysocki rjw@sisk.pl wrote:
True, but have those bugs been introduced recently (ie. in v3.8-rc1 or later)?
Don't know... I feel they were always there, its just that nobody tested it that way :)
That exactly is my point. :-)
If they have always been there, I don't see much reason for hurrying with the fixes, so I'll queue them up for v3.9.
Thanks, Rafael
On Sunday, December 16, 2012 11:20:08 AM Viresh Kumar wrote:
cpufreq core doesn't manage offline cpus and if driver->init() has returned mask including offline cpus, it may result in unwanted behavior by cpufreq core or governors.
We need to get only online cpus in this mask. There are two places to fix this mask, cpufreq core and cpufreq driver. It makes sense to do this at common place and hence is done in core.
Applied to the linux-next branch of the linux-pm.git tree as v3.9 material.
Thanks, Rafael
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/cpufreq/cpufreq.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 1f93dbd..de99517 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -970,6 +970,13 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif) pr_debug("initialization failed\n"); goto err_unlock_policy; }
- /*
* affected cpus must always be the one, which are online. We aren't
* managing offline cpus here.
*/
- cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
- policy->user_policy.min = policy->min; policy->user_policy.max = policy->max;