eas-dev March 2018

eas-dev@lists.linaro.org

13 participants
11 discussions

by Jason Cren

Hi, My first mail here, please tell me if this is not the right list for my queries. I have recently got myself a thundersoft s835 development kit and set it up for power measurement. I was looking at creating the energy model for this from scratch and measuring the impact. I stumbled across something which led me to write this mail. The eas integration document mentions no support currently exists for SMT. However, my kernel configuration for some reason has SCHED_SMT enabled. The s835 is not hyper-threaded and so I am thinking it's a mistake. My questions are: 1) Is disabling SCHED_SMT enough or are there any other configurations that need to be changed as well ? 2) I'd like to understand the technical difficulties with enabling SMT support since I have a mips device which has hyperthreading. Is it that the cpu-cluster energy model would not be enough to handle proper task placement when it comes to virtual cores ? 3) Is there any on-going work on adding SMT support or is it considered as not being worth the effort ? Thanks for reading. -Jason

8 years, 2 months

[PATCH 00/15] Energy computation optimization on android-4.9-eas-dev

by Leo Yan

This patch set is to optimize the energy computation on Android kernel android-4.9-eas-dev branch [1]; Patches 0001-0012 are used to refactor the code and some minor optimization, otherwise the task energy computation is hard to landed into current code; Patch 0013 "sched/fair: Optimize energy computation with task oriented" is the core patch in whole patch set, which is mainly used to implement energy calculation for task. Patch 0014 is a sequential patch to use cached value so we can get more benefit for performance by exchanging more memory. Patch 0015 is a trival experiment patch to remove 'idle state estimation'. The testing uses rt-app to generate synthetic workload, the workload duty cycles are 1%/5%/10%/20%/30%/40%; the duration is measured interval for func select_energy_cpu_idx(), which now is used to calculation three candidates in single run. The result shows this patch set improve for energy computation duration: +----------------+-------+-------+-------+-------+-------+--------+ | workload | 1% | 5% | 10% | 20% | 30% | 40% | +----------------+-------+-------+-------+-------+-------+--------+ | w/o patch set | 17267 | 21227 | 17019 | 13914 | 15002 | 23412 | | w/t patch set | 10823 | 11924 | 10931 | 10785 | 11139 | 11223 | +----------------+-------+-------+-------+-------+-------+--------+ | Opt percentage | 37% | 43% | 36% | 22% | 26% | 52% | +----------------+-------+-------+-------+-------+-------+--------+ The detailed testing ipython notebook you could check [2][3]. [1] https://android.googlesource.com/kernel/common/+/android-4.9-eas-dev [2] https://github.com/Leo-Yan/lisa/blob/2018_03_17_android_4.9_eas_dev_nrg_com… [3] https://github.com/Leo-Yan/lisa/blob/2018_03_17_android_4.9_eas_dev_nrg_com… Leo Yan (15): sched/fair: Prepare energy env cpumask before energy calculation sched/fair: Re-define return values for select_energy_cpu_idx() sched/fair: Reduce indent in select_energy_cpu_brute() sched/fair: Fix one minor typo sched/fair: Use per cpu data to maintain energy environment sched/fair: Use cpumask to track candidates for energy calculation sched/fair: Lift CPU iteration out of calc_sg_energy() sched/fair: Introduce new function compute_task_energy() sched/fair: Decide 'eenv->sg_cap' ahead energy computation sched/fair: Use eenv::sg_cap to select capacity index sched/fair: Estimate capacity index ahead energy computation sched/fair: Refactor compute_energy() sched/fair: Optimize energy computation with task oriented sched/fair: Optimize energy calculation with cached energy data sched/fair: Remove idle state estimation kernel/sched/fair.c | 542 +++++++++++++++++++++++++--------------------------- 1 file changed, 256 insertions(+), 286 deletions(-) -- 1.9.1

8 years, 2 months

[Integration Branch] Update 16-Mar-2018

by Douglas Raillard

Hi, A new EAS integration branch is available on branch eas/next/integration here: http://linux-arm.org/git?p=linux-power.git For further information about main features, test coverage and work items for next integration please have a look at: https://developer.arm.com/open-source/energy-aware-scheduling/eas-mainline-… Best Regards, Douglas

8 years, 2 months

About group_idle_state estimation after energy_diff re-factor

by Ke Wang

hi Chris, Patrick, Quentin, On android-4.9-eas-dev branch, after the commit of energy_diff re-factor("sched/fair: re-factor energy_diff to use a single (extensible) energy_env"), the calculation of group_idle_state for previous cpu had changed. Before the re-factor, for eenv_before, there is chance to continue doing group_idle_state estimation(not "0goto end" directly); After the re-factor, for cpu_idx=EAS_CPU_PRV, there is no chance to do estimation("goto end" directly); Is it deliberate for this change? If yes, do you have any test result for this? -static int group_idle_state(struct energy_env *eenv, struct sched_group *sg) +static int group_idle_state(struct energy_env *eenv, int cpu_idx) { + struct sched_group *sg = eenv->sg; int i, state = INT_MAX; int src_in_grp, dst_in_grp; long grp_util = 0; @@ -5556,8 +5610,10 @@ static int group_idle_state(struct energy_env *eenv, struct sched_group *sg) /* Take non-cpuidle idling into account (active idle/arch_cpu_idle()) */ state++; - src_in_grp = cpumask_test_cpu(eenv->src_cpu, sched_group_cpus(sg)); - dst_in_grp = cpumask_test_cpu(eenv->dst_cpu, sched_group_cpus(sg)); + src_in_grp = cpumask_test_cpu(eenv->cpu[EAS_CPU_PRV].cpu_id, + sched_group_cpus(sg)); + dst_in_grp = cpumask_test_cpu(eenv->cpu[cpu_idx].cpu_id, + sched_group_cpus(sg)); if (src_in_grp == dst_in_grp) { /* both CPUs under consideration are in the same group or not in * either group, migration should leave idle state the same. @@ -5571,7 +5627,7 @@ static int group_idle_state(struct energy_env *eenv, struct sched_group *sg) */ for_each_cpu(i, sched_group_cpus(sg)) { grp_util += cpu_util_wake(i, eenv->p); - if (unlikely(i == eenv->trg_cpu)) + if (unlikely(i == eenv->cpu[cpu_idx].cpu_id)) grp_util += eenv->util_delta; } Thanks, Ke Wang

8 years, 2 months

[RFC] find_best_target refactor items

by Ionela Voinescu

Hi, In an attempt to clarify and simplify find_best_target I've identified four areas that I believe might benefit from simplifications/clarifications/fixing. Please consider the ideas below as proposals towards this purpose. Also, consider these listed from narrow scope to broader scope. 1. Do not consider boost for !prefer_idle tasks, but always try to minimize capacity_orig. This was more or less done already by not having boosted and not-prefer idle tasks in typical Android configurations, but was never enforced in find_best_target code. This was discussed on this list as well in the "sched/fair: Prefer low capacity idle-CPU for boosted non-prefer-idle tasks" thread. Leo, Viresh, I would encourage you to push your patch to Gerrit and continue the discussion there, as it is a valid point on its own, and less controversial than the broader scope patch at 2. 2. Do not consider boost for prefer_idle tasks, but always try to maximize capacity_orig. This is the discussion that the patch at https://android-review.googlesource.com/c/kernel/common/+/636583 is trying to trigger, although in scope it touches on the point above as well. 3. Make order of CPUs irrelevant for CPU selection in find_best_target. This is a patch I'll try to push to Gerrit tomorrow after I have more test results. This will not incorporate code for any of the points above and it will try to mimic the selection of a CPU that is done now in find_best target. But if the points above are proven on their own, this fact will simplify the code for this item and make the implementation of the item at 2. unnecessary. 4. Remove the prefer_idle case from find_best_target. https://android-review.googlesource.com/q/topic:%22strf-mainline%22+(status… This has the broadest scope but it is more difficult to validate. Although broader scope items would make some narrow scope items unnecessary, they have value on their own if the boarder scope items cannot be proven in a reasonable amount of time, as they will provide fixes and support earlier than it takes to make a full re-factor of code. - 1 will provide valuable fixes now. - 2 will simplify find_best_target decision logic to facilitate and simplify 3. - 3 will add support for tri-gear platforms for which the current order of CPUs will be incorrect in some cases. - 4 will bring us closer to the mainline behavior. I'm adding the present of pretty pictures from what I call "validation by storm": I've created some kernels with combinations of the above items and tested them on a Pixel 2 device, just to make sure there are no important regressions introduced by them. But I believe all of these need to be validated independently to make sure we don't miss important corner cases. Results at https://gist.github.com/ionela-voinescu/f89815591c7f50864188094bb8c53ec4. Given that it's difficult for all involved to keep up with discussions on both gerrit and eas-dev, I'd suggest to discuss design and direction ideas as part of this thread and push patches and individual test references to android-review directly. Let me know what you think. All comments are welcomed! Best regards, Ionela.

8 years, 2 months

[PATCH 0/4] Consider RT Pressure for Energy Saving

by Leo Yan

Currently energy calculation in EAS has missed to consider RT pressure, it's quite possible to select CPU for CFS tasks which has high RT pressure and finally accumulate total utilization; as result the other low RT pressure CPUs lose chance to run CFS tasks and reduce contention between CFS and RT tasks, from performance view this is not optimal; furthermore this also harms power data due pack RT task and CFS task on single one CPU is more easily to trigger CPU frequency increasing. We can measure the summed CPU utilization and calculate the CPU freqency standard deviation to get to if the tasks can be well spreading within the same cluster for middle workload case. So below is the comparison result for video playback on Hikey960 for before and after applied this patch set (Using schedutil CPUFreq governor): Without Patch Set: With Patch Set: CPU Min(Util) Mean(Util) Mean(Util) | Min(Util) Mean(Util) Mean(Util) 0 7 67 205 | 8 52 170 1 4 53 227 | 9 47 188 2 4 57 191 | 8 38 192 3 4 35 165 | 16 47 146 s.d. 1.5 13.3 25.9 | 3.9 5.83 20.9 4 0 35 160 | 10 34 129 5 0 24 129 | 0 30 115 6 0 18 123 | 0 18 95 7 0 12 84 | 0 21 73 s.d. 0 9.8 31.2 | 5 7.5 24.4 The standard diviation for CPU utilization mean value has been decreased after applying this patch set (Little cluster: 13.3 vs 5.83, big cluster: 9.8 vs 7.5). This also confirm from the average CPU frequency: Without Patch Set: With Patch Set: Average Frequency | Average Frequency LITTLT Cluster 737MHz | 646MHz big Cluster 916MHz | 922MHz Leo Yan (4): sched/fair: Select maximum spare capacity for idle candidate CPUs sched: Introduce cpu_util_sum()/__cpu_util_sum() functions sched/fair: Consider RT pressure for find_best_target() sched/fair: Consider RT/DL pressure for energy calculation kernel/sched/fair.c | 22 +++++++++++++++++++--- kernel/sched/sched.h | 29 +++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+), 3 deletions(-) -- 1.9.1

8 years, 2 months

eas-4.9-dev: regard of schedTune PE filter

by Leo Yan

Hi Patrick, I am reading the code for eas-4.9-dev branch [1], and found schedtune_accept_deltas() has been removed from fair.c. I think this is not purposed and might be wrongly removed by some optimization patch and should add back? Not sure if there have following patch to fix this. FYI. [1] https://android.googlesource.com/kernel/common/+/android-4.9-eas-dev/kernel… Thanks, Leo Yan

8 years, 2 months

[RESEND PATCH] sched/fair: consider RT/IRQ pressure in select_idle_sibling

by Rohit Jain

Currently fast path in the scheduler looks for an idle CPU to schedule threads on. Capacity is taken into account in the function 'select_task_rq_fair' when it calls 'wake_cap', however it ignores the instantaneous capacity and looks at the original capacity. Furthermore select_idle_sibling path of the code, ignores the RT/IRQ threads which are also running on the CPUs it is looking to schedule fair threads on. We don't necessarily have to force the code to go to slow path (by modifying wake_cap), instead we could do a better selection of the CPU in the current domain itself (i.e. in the fast path). This patch makes the fast path aware of capacity, resulting in overall performance improvements as shown in the test results. 1) KVM Test: ----------------------------------------------------------------------- In this test KVM is configured with a ubuntu VM (unchanged kernel, used Ubuntu server 16.04) which is running ping workload in a taskset along with hackbench in a separate taskset. The VM is a virtio VM (which means the host is taking and processing the interrupts). In this case, we want to avoid scheduling vcpus on CPUs which are very busy processing interrupts if there is a better choice available. This machine is a 2 socket 40 CPU 20 core Intel x86 machine. lscpu output: CPU(s): 40 On-line CPU(s) list: 0-39 Thread(s) per core: 2 Core(s) per socket: 10 Socket(s): 2 NUMA node(s): 2 NUMA node0 CPUs: 0-9,20-29 NUMA node1 CPUs: 10-19,30-39 The setup is realistic enough to mirror realistic use cases. KVM is bound to a full NUMA node with CPUs bound to whole cores, i.e. CPU 0 and 1 are bound to core 0, CPU 2 and 3 to core 1 and so on. virsh vcpupin output: VCPU: CPU Affinity ---------------------------------- 0: 0,20 1: 0,20 2: 1,21 3: 1,21 4: 2,22 5: 2,22 6: 3,23 7: 3,23 8: 4,24 9: 4,24 10: 5,25 11: 5,25 12: 6,26 13: 6,26 14: 7,27 15: 7,27 16: 8,28 17: 8,28 18: 9,29 19: 9,29 Here are the results seen with ping and hackbench running inside the KVM. Note: hackbench was run for 10000 loops (lower is better) (Improvement is show in brackets +ve is good, -ve is bad) +-------------------+-----------------+---------------------------+ | | Without patch | With patch | +---+-------+-------+-------+---------+-----------------+---------+ |FD | Groups| Tasks | Mean | Std Dev | Mean | Std Dev | +---+-------+-------+-------+---------+-----------------+---------+ |4 | 1 | 4 | 0.059 | 0.021 | 0.034 (+42.37%) | 0.008 | |4 | 2 | 8 | 0.087 | 0.031 | 0.075 (+13.79%) | 0.021 | |4 | 4 | 16 | 0.124 | 0.022 | 0.089 (+28.23%) | 0.013 | |4 | 8 | 32 | 0.149 | 0.031 | 0.126 (+15.43%) | 0.022 | +---+-------+-------+-------+---------+-----------------+---------+ |8 | 1 | 8 | 0.212 | 0.025 | 0.211 (+0.47%) | 0.023 | |8 | 2 | 16 | 0.246 | 0.055 | 0.225 (+8.54%) | 0.024 | |8 | 4 | 32 | 0.298 | 0.047 | 0.294 (+1.34%) | 0.022 | |8 | 8 | 64 | 0.407 | 0.03 | 0.378 (+7.13%) | 0.032 | +---+-------+-------+-------+---------+-----------------+---------+ |40 | 1 | 40 | 1.703 | 0.133 | 1.451 (+14.80%) | 0.072 | |40 | 2 | 80 | 2.912 | 0.204 | 2.431 (+16.52%) | 0.075 | +---+-------+-------+-------+---------+-----------------+---------+ 2) ping + hackbench test on x86 machine: ----------------------------------------------------------------------- Here hackbench is running in threaded mode along with, running ping on CPU 0 and 1 as: 'ping -l 10000 -q -s 10 -f hostX' This test is running on 2 socket, 20 core and 40 threads Intel x86 machine: runtime is in seconds (Lower is better) +-----------------------------+----------------+---------------------------+ | | Without patch | With patch | +----------+----+------+------+-------+--------+----------------+----------+ |Loops | FD |Groups|Tasks | Mean | Std Dev|Mean | Std Dev | +----------+----+------+------+-------+--------+----------------+----------+ |1,000,000 | 4 |1 |4 | 2.375 | 0.818 |1.785 (+24.84%) | 0.21 | |1,000,000 | 4 |2 |8 | 2.748 | 0.694 |2.102 (+23.51%) | 0.239 | |1,000,000 | 4 |4 |16 | 3.237 | 0.256 |2.922 (+9.73%) | 0.169 | |1,000,000 | 4 |8 |32 | 3.786 | 0.185 |3.637 (+3.94%) | 0.471 | +----------+----+------+------+-------+--------+----------------+----------+ |1,000,000 | 8 |1 |8 | 7.287 | 1.03 |5.364 (+26.39%) | 1.085 | |1,000,000 | 8 |2 |16 | 7.963 | 0.951 |6.474 (+18.70%) | 0.397 | |1,000,000 | 8 |4 |32 | 8.991 | 0.618 |8.32 (+7.46%) | 0.69 | |1,000,000 | 8 |8 |64 | 13.868| 1.195 |12.881 (+7.12%) | 0.722 | +----------+----+------+------+-------+--------+----------------+----------+ |10,000 | 40 |1 |40 | 0.828 | 0.032 |0.784 (+5.31%) | 0.010 | |10,000 | 40 |2 |80 | 1.087 | 0.246 |0.980 (+9.84%) | 0.037 | |10,000 | 40 |4 |160 | 1.611 | 0.055 |1.591 (+1.24%) | 0.039 | |10,000 | 40 |8 |320 | 2.827 | 0.031 |2.817 (+0.35%) | 0.025 | |10,000 | 40 |16 |640 | 5.107 | 0.085 |5.087 (+0.39%) | 0.045 | |10,000 | 40 |25 |1000 | 7.503 | 0.143 |7.468 (+0.46%) | 0.045 | +----------+----+------+------+-------+--------+----------------+----------+ Sanity tests: ----------------------------------------------------------------------- schbench results on 2 socket, 44 core and 88 threads Intel x86 machine, with 2 message threads (lower is better): +----------+-------------+----------+------------------+ |Threads | Latency | Without | With Patch | | | percentiles | Patch | | | | | (usec) | (usec) | +----------+-------------+----------+------------------+ |16 | 50.0000th | 60 | 62 (-3.33%) | |16 | 75.0000th | 72 | 68 (+5.56%) | |16 | 90.0000th | 81 | 80 (+2.46%) | |16 | 95.0000th | 88 | 83 (+5.68%) | |16 | *99.0000th | 100 | 92 (+8.00%) | |16 | 99.5000th | 105 | 97 (+7.62%) | |16 | 99.9000th | 110 | 100 (+9.09%) | +----------+-------------+----------+------------------+ |32 | 50.0000th | 62 | 62 (0%) | |32 | 75.0000th | 80 | 81 (0%) | |32 | 90.0000th | 93 | 94 (-1.07%) | |32 | 95.0000th | 103 | 105 (-1.94%) | |32 | *99.0000th | 121 | 121 (0%) | |32 | 99.5000th | 127 | 125 (+1.57%) | |32 | 99.9000th | 143 | 135 (+5.59%) | +----------+-------------+----------+------------------+ |44 | 50.0000th | 79 | 79 (0%) | |44 | 75.0000th | 104 | 104 (0%) | |44 | 90.0000th | 126 | 125 (+0.79%) | |44 | 95.0000th | 138 | 137 (+0.72%) | |44 | *99.0000th | 163 | 163 (0%) | |44 | 99.5000th | 174 | 171 (+1.72%) | |44 | 99.9000th | 10832 | 11248 (-3.84%) | +----------+-------------+----------+------------------+ I also ran uperf and sysbench MySQL workloads but I see no statistically significant change. Signed-off-by: Rohit Jain <rohit.k.jain(a)oracle.com> --- kernel/sched/fair.c | 38 ++++++++++++++++++++++++++++---------- 1 file changed, 28 insertions(+), 10 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 26a71eb..ce5ccf8 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5625,6 +5625,11 @@ static unsigned long capacity_orig_of(int cpu) return cpu_rq(cpu)->cpu_capacity_orig; } +static inline bool full_capacity(int cpu) +{ + return capacity_of(cpu) >= (capacity_orig_of(cpu)*3)/4; +} + static unsigned long cpu_avg_load_per_task(int cpu) { struct rq *rq = cpu_rq(cpu); @@ -6081,7 +6086,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int for_each_cpu(cpu, cpu_smt_mask(core)) { cpumask_clear_cpu(cpu, cpus); - if (!idle_cpu(cpu)) + if (!idle_cpu(cpu) || !full_capacity(cpu)) idle = false; } @@ -6102,7 +6107,8 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int */ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target) { - int cpu; + int cpu, rcpu = -1; + unsigned long max_cap = 0; if (!static_branch_likely(&sched_smt_present)) return -1; @@ -6110,11 +6116,13 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t for_each_cpu(cpu, cpu_smt_mask(target)) { if (!cpumask_test_cpu(cpu, &p->cpus_allowed)) continue; - if (idle_cpu(cpu)) - return cpu; + if (idle_cpu(cpu) && (capacity_of(cpu) > max_cap)) { + max_cap = capacity_of(cpu); + rcpu = cpu; + } } - return -1; + return rcpu; } #else /* CONFIG_SCHED_SMT */ @@ -6143,6 +6151,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t u64 time, cost; s64 delta; int cpu, nr = INT_MAX; + int best_cpu = -1; + unsigned int best_cap = 0; this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc)); if (!this_sd) @@ -6173,8 +6183,15 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t return -1; if (!cpumask_test_cpu(cpu, &p->cpus_allowed)) continue; - if (idle_cpu(cpu)) - break; + if (idle_cpu(cpu)) { + if (full_capacity(cpu)) { + best_cpu = cpu; + break; + } else if (capacity_of(cpu) > best_cap) { + best_cap = capacity_of(cpu); + best_cpu = cpu; + } + } } time = local_clock() - time; @@ -6182,7 +6199,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t delta = (s64)(time - cost) / 8; this_sd->avg_scan_cost += delta; - return cpu; + return best_cpu; } /* @@ -6193,13 +6210,14 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target) struct sched_domain *sd; int i; - if (idle_cpu(target)) + if (idle_cpu(target) && full_capacity(target)) return target; /* * If the previous cpu is cache affine and idle, don't be stupid. */ - if (prev != target && cpus_share_cache(prev, target) && idle_cpu(prev)) + if (prev != target && cpus_share_cache(prev, target) && idle_cpu(prev) && + full_capacity(prev)) return prev; sd = rcu_dereference(per_cpu(sd_llc, target)); -- 2.7.4

8 years, 2 months

[PATCH android-4.9] sched/fair: Prefer low capacity idle-CPU for boosted non-prefer-idle tasks

by Viresh Kumar

From: Leo Yan <leo.yan(a)linaro.org> The homescreen test-case shows unwanted disturbance on the big CPUs on the Hikey620 platform with Android 4.9. There are multiple reasons for that though. By default boost and prefer_idle are enabled for both top-app and foreground tasks and find_best_target() always (intentionally) prefers the big CPUs if prefer_idle is enabled. And some of the foreground tasks (like: DispSync, PowerManagerSer and PhotonicModulat) for the homescreen test-case get placed on the big CPUs eventually because of that. Even if prefer_idle is disabled for such foreground tasks, they don't end up on the little CPUs. The reason being that find_best_target() still prefers big CPUs if the task is boosted, though some of the comments in find_best_target() routine say the exact opposite of that. It eventually depends on the order in which CPUs are processed, which is from big to little for boosted tasks. This patch updates the find_best_target() routine to select low capacity idle-CPU if the task is boosted but doesn't have prefer_idle enabled. This would be the same for non-boosted tasks as well. Signed-off-by: Leo Yan <leo.yan(a)linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org> --- kernel/sched/fair.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index e45047bdd245..4534d8620989 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6998,8 +6998,11 @@ static inline int find_best_target(struct task_struct *p, int *backup_cpu, int idle_idx = idle_get_state_idx(cpu_rq(i)); /* Select idle CPU with lower cap_orig */ - if (capacity_orig > best_idle_min_cap_orig) + if (capacity_orig < best_idle_min_cap_orig) + goto found_best_idle_cpu; + else if (capacity_orig > best_idle_min_cap_orig) continue; + /* Favor CPUs that won't end up running at a * high OPP. */ @@ -7017,6 +7020,7 @@ static inline int find_best_target(struct task_struct *p, int *backup_cpu, best_idle_cstate <= idle_idx) continue; +found_best_idle_cpu: /* Keep track of best idle CPU */ best_idle_min_cap_orig = capacity_orig; target_idle_max_spare_cap = capacity_orig - -- 2.15.0.194.g9af6a3dea062

8 years, 2 months

[PATCH] sched/fair: Use task's boosted util while calculating new_util of CPU

by Viresh Kumar

find_best_target() tries to find the target CPU where the task should be placed, based on how much would be the utilization of the CPU after the task is placed on it. This is represented by 'new_util' in the routine. Currently it adds task_util(p) to the wake_util of the CPU to find that out, while it should really be adding the boosted task's util to wake_util, as that's how much the cpu utilization would be. This is how we used to do it before commit 3bfde3b4f848 ("ANDROID: sched/fair: Change cpu iteration order in find_best_target()"), was merged and the commit doesn't describe the rational behind this change. This patch reverts to the earlier formula to calculate the new_util. Fixes: 3bfde3b4f848 ("ANDROID: sched/fair: Change cpu iteration order in find_best_target()") Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org> --- Just wanted to get some review done over the list before posting it to gerrit. Not sure if this commit is doing the right thing, but I couldn't understand why it should be done this way. This is for the Android 4.9 EAS dev kernel. kernel/sched/fair.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 88abd5de69ce..1c33a2ddd39c 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6860,14 +6860,13 @@ static inline int find_best_target(struct task_struct *p, int *backup_cpu, * accounting. However, the blocked utilization may be zero. */ wake_util = cpu_util_wake(i, p); - new_util = wake_util + task_util(p); /* * Ensure minimum capacity to grant the required boost. * The target CPU can be already at a capacity level higher * than the one required to boost the task. */ - new_util = max(min_util, new_util); + new_util = wake_util + min_util; /* * Include minimum capacity constraint: -- 2.15.0.194.g9af6a3dea062

8 years, 2 months

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

eas-dev March 2018