Thanks for Tixy's reminder! The following commits are general thermal fixes which purpose on thermal-framework Any comments are appreciated!
Regards! Alex
[PATCH 01/11] thermal: cpu_cooling: fix 'descend' check in [PATCH 02/11] thermal: Fix binding problem when there is thermal zone [PATCH 03/11] thermal: fix cpu_cooling max_level behavior [PATCH 04/11] Thermal cpu cooling: return error if no valid cpu [PATCH 05/11] Thermal: thermal zone governor fix [PATCH 06/11] Thermal: Allow first update of cooling device state [PATCH 07/11] thermal: Bind cooling devices with the correct [PATCH 08/11] Thermal/cpu_cooling: Return directly for the cpu out of [PATCH 09/11] Thermal: update thermal zone device after setting [PATCH 10/11] thermal: Add braces around suspect code [PATCH 11/11] thermal: cpu_cooling: fix return value check in
From: Shawn Guo shawn.guo@linaro.org
The variable 'descend' is initialized as -1 in function get_property(), and will never get any chance to be updated by the following code.
if (freq != CPUFREQ_ENTRY_INVALID && descend != -1) descend = !!(freq > table[i].frequency);
This makes function get_property() return the wrong frequency for given cooling level if the frequency table is sorted in ascending. Fix it by correcting the 'descend' check in if-condition to 'descend == -1'.
Signed-off-by: Shawn Guo shawn.guo@linaro.org Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit 24c7a381720843f17efb42de81f7e85aefd6f616) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/cpu_cooling.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index 78de794..6093862 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -167,7 +167,7 @@ static int get_property(unsigned int cpu, unsigned long input, continue;
/* get the frequency order */ - if (freq != CPUFREQ_ENTRY_INVALID && descend != -1) + if (freq != CPUFREQ_ENTRY_INVALID && descend == -1) descend = !!(freq > table[i].frequency);
freq = table[i].frequency;
From: Ni Wade wni@nvidia.com
The thermal zone params can be used to set governor to specific thermal governor for thermal zone device. But if the thermal zone params has only governor name without thermal bind params, then the thermal zone device will not be binding to cooling device. Because tz->ops->bind operator is not invoked in bind_tz() and bind_cdev() when there is thermal zone params.
Signed-off-by: Wei Ni wni@nvidia.com Signed-off-by: Jinyoung Park jinyoungp@nvidia.com Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit a9f2d19ba7be38590c84487359891d45a66b62f4) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/thermal_core.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index 83cc99b..d2b7be5 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -238,10 +238,11 @@ static void bind_cdev(struct thermal_cooling_device *cdev) if (!pos->tzp && !pos->ops->bind) continue;
- if (!pos->tzp && pos->ops->bind) { + if (pos->ops->bind) { ret = pos->ops->bind(pos, cdev); if (ret) print_bind_err_msg(pos, cdev, ret); + continue; }
tzp = pos->tzp; @@ -272,8 +273,8 @@ static void bind_tz(struct thermal_zone_device *tz)
mutex_lock(&thermal_list_lock);
- /* If there is no platform data, try to use ops->bind */ - if (!tzp && tz->ops->bind) { + /* If there is ops->bind, try to use ops->bind */ + if (tz->ops->bind) { list_for_each_entry(pos, &thermal_cdev_list, node) { ret = tz->ops->bind(tz, pos); if (ret)
From: Eduardo Valentin eduardo.valentin@ti.com
As per Documentation/thermal/sysfs-api.txt, max_level is an index, not a counter. Thus, in case a CPU has 3 valid frequencies, max_level is expected to be 2, for instance.
The current code makes max_level == number of valid frequencies, which is bogus. This patch fix the cpu_cooling device by ranging max_level properly.
Reported-by: Carlos Hernandez ceh@ti.com Signed-off-by: Eduardo Valentin eduardo.valentin@ti.com Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit 1c9573a40c1d34494419f32560f28c763c504d79) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/cpu_cooling.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index 6093862..3331b1a 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -173,6 +173,8 @@ static int get_property(unsigned int cpu, unsigned long input, freq = table[i].frequency; max_level++; } + /* max_level is an index, not a counter */ + max_level--;
/* get max level */ if (property == GET_MAXL) { @@ -181,7 +183,7 @@ static int get_property(unsigned int cpu, unsigned long input, }
if (property == GET_FREQ) - level = descend ? input : (max_level - input - 1); + level = descend ? input : (max_level - input);
for (i = 0, j = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) { /* ignore invalid entry */ @@ -197,7 +199,7 @@ static int get_property(unsigned int cpu, unsigned long input,
if (property == GET_LEVEL && (unsigned int)input == freq) { /* get level by frequency */ - *output = descend ? j : (max_level - j - 1); + *output = descend ? j : (max_level - j); return 0; } if (property == GET_FREQ && level == j) {
From: Zhang Rui rui.zhang@intel.com
Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit a116776f7b6052599df0c67db29c30ea9d69d7ee) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/cpu_cooling.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index 3331b1a..e1f0830 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -173,6 +173,11 @@ static int get_property(unsigned int cpu, unsigned long input, freq = table[i].frequency; max_level++; } + + /* No valid cpu frequency entry */ + if (max_level == 0) + return -EINVAL; + /* max_level is an index, not a counter */ max_level--;
From: Zhang Rui rui.zhang@intel.com
This patch does a cleanup about the thermal zone govenor, setting and make the following rule. 1. For thermal zone devices that are registered w/o tz->tzp, they can use the default thermal governor only. 2. For thermal zone devices w/ governor name specified in tz->tzp->governor_name, we will use the default govenor if the governor specified is not available at the moment, and update tz->governor when the matched governor is registered.
This also fixes a problem that OF registered thermal zones are running with no governor.
Signed-off-by: Zhang Rui rui.zhang@intel.com Acked-by: Javi Merino javi.merino@arm.com (cherry picked from commit f2234bcd03ad031225d7dc37dd18852a2f2ff2bf) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/thermal_core.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index d2b7be5..561d61d 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -56,10 +56,15 @@ static LIST_HEAD(thermal_governor_list); static DEFINE_MUTEX(thermal_list_lock); static DEFINE_MUTEX(thermal_governor_lock);
+static struct thermal_governor *def_governor; + static struct thermal_governor *__find_governor(const char *name) { struct thermal_governor *pos;
+ if (!name || !name[0]) + return def_governor; + list_for_each_entry(pos, &thermal_governor_list, governor_list) if (!strnicmp(name, pos->name, THERMAL_NAME_LENGTH)) return pos; @@ -82,17 +87,23 @@ int thermal_register_governor(struct thermal_governor *governor) if (__find_governor(governor->name) == NULL) { err = 0; list_add(&governor->governor_list, &thermal_governor_list); + if (!def_governor && !strncmp(governor->name, + DEFAULT_THERMAL_GOVERNOR, THERMAL_NAME_LENGTH)) + def_governor = governor; }
mutex_lock(&thermal_list_lock);
list_for_each_entry(pos, &thermal_tz_list, node) { + /* + * only thermal zones with specified tz->tzp->governor_name + * may run with tz->govenor unset + */ if (pos->governor) continue; - if (pos->tzp) - name = pos->tzp->governor_name; - else - name = DEFAULT_THERMAL_GOVERNOR; + + name = pos->tzp->governor_name; + if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH)) pos->governor = governor; } @@ -330,8 +341,8 @@ static void monitor_thermal_zone(struct thermal_zone_device *tz) static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip, enum thermal_trip_type trip_type) { - if (tz->governor) - tz->governor->throttle(tz, trip); + tz->governor ? tz->governor->throttle(tz, trip) : + def_governor->throttle(tz, trip); }
static void handle_critical_trips(struct thermal_zone_device *tz, @@ -1514,7 +1525,7 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type, if (tz->tzp) tz->governor = __find_governor(tz->tzp->governor_name); else - tz->governor = __find_governor(DEFAULT_THERMAL_GOVERNOR); + tz->governor = def_governor;
mutex_unlock(&thermal_governor_lock);
From: Ni Wade wni@nvidia.com
In initialization, if the cooling device is initialized at max cooling state, and the thermal zone temperature is below the first trip point, then the cooling state can't be updated to the right state, untill the first trip point be triggered.
To fix this issue, allow first update of cooling device state during registration, initialized "updated" device field as "false" (instead of "true").
Signed-off-by: Wei Ni wni@nvidia.com Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit 5ca0cce5622bf476e3e6bf627fe8e9381d6ae174) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/thermal_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index 561d61d..07d80cc 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -1099,7 +1099,7 @@ __thermal_cooling_device_register(struct device_node *np, INIT_LIST_HEAD(&cdev->thermal_instances); cdev->np = np; cdev->ops = ops; - cdev->updated = true; + cdev->updated = false; cdev->device.class = &thermal_class; cdev->devdata = devdata; dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
From: Punit Agrawal punit.agrawal@arm.com
When binding cooling devices to thermal zones created from the device tree the minimum and maximum cooling states are in the wrong order leading to failure to bind.
Fix the order of cooling states in the call to thermal_zone_bind_cooling_device to fix this.
Cc:Zhang Rui rui.zhang@intel.com Signed-off-by: Punit Agrawal punit.agrawal@arm.com Reviewed-by: Stephen Boyd sboyd@codeaurora.org Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit dd354b84d47ec8ca53686bdb3cc1aecdeb75bef5) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/of-thermal.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c index 04b1be7..97d312f 100644 --- a/drivers/thermal/of-thermal.c +++ b/drivers/thermal/of-thermal.c @@ -156,8 +156,8 @@ static int of_thermal_bind(struct thermal_zone_device *thermal,
ret = thermal_zone_bind_cooling_device(thermal, tbp->trip_id, cdev, - tbp->min, - tbp->max); + tbp->max, + tbp->min); if (ret) return ret; }
From: Lan Tianyu tianyu.lan@intel.com
cpufreq_thermal_notifier() is to change the cpu's cpufreq in the allowed_cpus mask when associated thermal-cpufreq cdev's cooling state is changed. It's a cpufreq policy notifier handler and it will be triggered even if those cpus out of allowed_cpus has changed freq policy.
cpufreq_thermal_notifier() checks the policy->cpu. If it belongs to allowed_cpus, change max_freq(default to 0) to the desire cpufreq value and pass 0 and max_freq to cpufreq_verify_within_limits() as cpufreq scope. But if not, do nothing and max_freq will remain 0. This will cause the cpufreq scope to become 0~0. This is not right. This patch is to return directly after finding cpu not belonging to allowed_cpus.
Signed-off-by: Lan Tianyu tianyu.lan@intel.com Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit 044d5c26da262fa433dacbe1c6962459050d6b06) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/cpu_cooling.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index e1f0830..dc50315 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -329,6 +329,8 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
if (cpumask_test_cpu(policy->cpu, ¬ify_device->allowed_cpus)) max_freq = notify_device->cpufreq_val; + else + return 0;
/* Never exceed user_policy.max */ if (max_freq > policy->user_policy.max)
From: "lan,Tianyu" tianyu.lan@intel.com
This patch is to update thermal zone device after setting emul_temp in order to make governor work according to input temperature immediately.
Signed-off-by: Lan Tianyu tianyu.lan@intel.com Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit 800744bf31df54b0cd4d1104ccfa426d3f578f0e) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/thermal_core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index 07d80cc..edc0cb8 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -777,6 +777,9 @@ emul_temp_store(struct device *dev, struct device_attribute *attr, ret = tz->ops->set_emul_temp(tz, temperature); }
+ if (!ret) + thermal_zone_device_update(tz); + return ret ? ret : count; } static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
From: Stephen Boyd sboyd@codeaurora.org
It looks like this code is missing braces, otherwise the if statement shouldn't have been indented. Fix it.
Signed-off-by: Stephen Boyd sboyd@codeaurora.org Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit ca9521b770c988bb6bb8eea1241f7a487dab6ff1) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/thermal/of-thermal.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c index 97d312f..4b2b999 100644 --- a/drivers/thermal/of-thermal.c +++ b/drivers/thermal/of-thermal.c @@ -712,11 +712,12 @@ thermal_of_build_thermal_zone(struct device_node *np) }
i = 0; - for_each_child_of_node(child, gchild) + for_each_child_of_node(child, gchild) { ret = thermal_of_populate_bind_params(gchild, &tz->tbps[i++], tz->trips, tz->ntrips); if (ret) goto free_tbps; + }
finish: of_node_put(child);
From: Wei Yongjun yongjun_wei@trendmicro.com.cn
In case of error, the function thermal_cooling_device_register() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR().
Signed-off-by: Wei Yongjun yongjun_wei@trendmicro.com.cn Signed-off-by: Zhang Rui rui.zhang@intel.com (cherry picked from commit 73b9bcd76d13716cc0e0ab053f8c1ae41f47636e) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: drivers/thermal/cpu_cooling.c --- drivers/thermal/cpu_cooling.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index dc50315..8c15474 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -482,7 +482,7 @@ __cpufreq_cooling_register(struct device_node *np, if (IS_ERR(cool_dev)) { release_idr(&cpufreq_idr, cpufreq_dev->id); kfree(cpufreq_dev); - return ERR_PTR(-EINVAL); + return cool_dev; } cpufreq_dev->cool_dev = cool_dev; cpufreq_dev->cpufreq_state = 0;
On Thu, 2014-09-25 at 18:08 +0800, Alex Shi wrote:
Thanks for Tixy's reminder! The following commits are general thermal fixes which purpose on thermal-framework Any comments are appreciated!
Regards! Alex
[PATCH 01/11] thermal: cpu_cooling: fix 'descend' check in [PATCH 02/11] thermal: Fix binding problem when there is thermal zone [PATCH 03/11] thermal: fix cpu_cooling max_level behavior [PATCH 04/11] Thermal cpu cooling: return error if no valid cpu [PATCH 05/11] Thermal: thermal zone governor fix [PATCH 06/11] Thermal: Allow first update of cooling device state [PATCH 07/11] thermal: Bind cooling devices with the correct [PATCH 08/11] Thermal/cpu_cooling: Return directly for the cpu out of [PATCH 09/11] Thermal: update thermal zone device after setting [PATCH 10/11] thermal: Add braces around suspect code [PATCH 11/11] thermal: cpu_cooling: fix return value check in
Did you deliberately not include commit 50e66c7ed8 (drivers: thermal: add check when unregistering cpu cooling)?
Of the others you have in this series, 5 are those I spotted, and 2 (05/11 and 07/11) are included in the patches I received from ARM as part of their Juno delivery, so assume they are good to have too :-)
That leaves patches 02, 06, 09 and 10 which all look like they are fixing real bugs or at least inefficient and undesirable behaviour.
So, these patches all look good to me, however, as I only looked at this area of code for the first time yesterday I'm not sure how much my opinion is worth.
On 09/25/2014 10:32 PM, Jon Medhurst (Tixy) wrote:
On Thu, 2014-09-25 at 18:08 +0800, Alex Shi wrote:
Thanks for Tixy's reminder! The following commits are general thermal fixes which purpose on thermal-framework Any comments are appreciated!
Regards! Alex
[PATCH 01/11] thermal: cpu_cooling: fix 'descend' check in [PATCH 02/11] thermal: Fix binding problem when there is thermal zone [PATCH 03/11] thermal: fix cpu_cooling max_level behavior [PATCH 04/11] Thermal cpu cooling: return error if no valid cpu [PATCH 05/11] Thermal: thermal zone governor fix [PATCH 06/11] Thermal: Allow first update of cooling device state [PATCH 07/11] thermal: Bind cooling devices with the correct [PATCH 08/11] Thermal/cpu_cooling: Return directly for the cpu out of [PATCH 09/11] Thermal: update thermal zone device after setting [PATCH 10/11] thermal: Add braces around suspect code [PATCH 11/11] thermal: cpu_cooling: fix return value check in
Did you deliberately not include commit 50e66c7ed8 (drivers: thermal: add check when unregistering cpu cooling)?
Uh, that is better to be inclueded in LSK, Why I overlooked this? :(
Of the others you have in this series, 5 are those I spotted, and 2 (05/11 and 07/11) are included in the patches I received from ARM as part of their Juno delivery, so assume they are good to have too :-)
That leaves patches 02, 06, 09 and 10 which all look like they are fixing real bugs or at least inefficient and undesirable behaviour.
So, these patches all look good to me, however, as I only looked at this area of code for the first time yesterday I'm not sure how much my opinion is worth.
Thanks a lot for review! :)
linaro-kernel@lists.linaro.org