The energy calculation in EAS currently fails to take RT pressure into
account, so it is quite possible to select a CPU for a CFS task which
already has high RT pressure, further piling up its total utilization.
As a result, other CPUs with low RT pressure lose the chance to run CFS
tasks and to reduce contention between CFS and RT tasks; from a
performance point of view this is not optimal. It also harms power,
since packing RT and CFS tasks on a single CPU makes a CPU frequency
increase more likely.
To check whether tasks are spread well within a cluster for a medium
workload, we can measure the summed CPU utilization and compute the
standard deviation of the per-CPU utilization and CPU frequency. Below
is the comparison for video playback on Hikey960, before and after
applying this patch set (using the schedutil CPUFreq governor):
         Without Patch Set:                 |  With Patch Set:
CPU   Min(Util)  Mean(Util)  Max(Util)     |  Min(Util)  Mean(Util)  Max(Util)
0         7          67         205        |      8          52         170
1         4          53         227        |      9          47         188
2         4          57         191        |      8          38         192
3         4          35         165        |     16          47         146
s.d.    1.5        13.3        25.9        |    3.9        5.83        20.9
4         0          35         160        |     10          34         129
5         0          24         129        |      0          30         115
6         0          18         123        |      0          18          95
7         0          12          84        |      0          21          73
s.d.      0         9.8        31.2        |      5         7.5        24.4
The standard deviation of the per-CPU mean utilization decreases after
applying this patch set (little cluster: 13.3 vs. 5.83; big cluster:
9.8 vs. 7.5). The average CPU frequency confirms this as well:
                 Without Patch Set:  |  With Patch Set:
                 Average Frequency   |  Average Frequency
LITTLE Cluster        737MHz         |      646MHz
big Cluster           916MHz         |      922MHz
Leo Yan (4):
sched/fair: Select maximum spare capacity for idle candidate CPUs
sched: Introduce cpu_util_sum()/__cpu_util_sum() functions
sched/fair: Consider RT pressure for find_best_target()
sched/fair: Consider RT/DL pressure for energy calculation
kernel/sched/fair.c | 22 +++++++++++++++++++---
kernel/sched/sched.h | 29 +++++++++++++++++++++++++++++
2 files changed, 48 insertions(+), 3 deletions(-)
--
1.9.1
find_best_target() tries to find the target CPU where the task should be
placed, based on what the utilization of the CPU would be after the task
is placed on it. This is represented by 'new_util' in the routine.
Currently it adds task_util(p) to the wake_util of the CPU to find that
out, while it should really be adding the boosted task utilization to
wake_util, as that is what the CPU utilization would become.
This is how we used to do it before commit 3bfde3b4f848 ("ANDROID:
sched/fair: Change cpu iteration order in find_best_target()") was
merged, and that commit does not describe the rationale behind the
change.
This patch reverts to the earlier formula to calculate the new_util.
Fixes: 3bfde3b4f848 ("ANDROID: sched/fair: Change cpu iteration order in find_best_target()")
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
Just wanted to get some review done on the list before posting this to
gerrit. I'm not sure that commit was doing the right thing, as I
couldn't understand why it should be done this way.
This is for the Android 4.9 EAS dev kernel.
kernel/sched/fair.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 88abd5de69ce..1c33a2ddd39c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6860,14 +6860,13 @@ static inline int find_best_target(struct task_struct *p, int *backup_cpu,
* accounting. However, the blocked utilization may be zero.
*/
wake_util = cpu_util_wake(i, p);
- new_util = wake_util + task_util(p);
/*
* Ensure minimum capacity to grant the required boost.
* The target CPU can be already at a capacity level higher
* than the one required to boost the task.
*/
- new_util = max(min_util, new_util);
+ new_util = wake_util + min_util;
/*
* Include minimum capacity constraint:
--
2.15.0.194.g9af6a3dea062