Hi Yun,
On 29/07/2020 05:17, 向澐 wrote:
> In find_energy_efficient_cpu(), if we set a task's min util clamp to 1024, all CPUs will be skipped because of the !fits_capacity() condition, and the return value will always be prev_cpu (the initial value of best_energy_cpu).
> For this case, is it better to find the max spare capacity CPU in the whole system? E.g.:
you're highlighting an existing issue in the current code in Linux mainline as well as in Android Common Kernel.
In case the task's UCLAMP_MIN value is larger than 0.8 * 1024 (~819), there won't be any CPU in the system which fits the capacity request. But this shouldn't be a showstopper right now since you can always configure your task's UCLAMP_MIN in the range of [0 .. 819]. By default, a task's UCLAMP_MIN value is 0.
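To make the ~819 boundary concrete, here is a minimal user-space sketch of the 20% margin check. fits_capacity() follows the rule of the kernel macro of the same name; max_fitting_request() and check_full_capacity_boundary() are hypothetical helpers for illustration only, they don't exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Capacity of the biggest CPU at full capacity, as in the kernel. */
#define SCHED_CAPACITY_SCALE 1024UL

/* Same 20% headroom rule as the fits_capacity() macro in kernel/sched/fair.c. */
static bool fits_capacity(unsigned long request, unsigned long cpu_cap)
{
	return request * 1280 < cpu_cap * 1024;
}

/* Hypothetical helper: largest request that still fits on a CPU of cpu_cap. */
static unsigned long max_fitting_request(unsigned long cpu_cap)
{
	unsigned long request = 0;

	while (fits_capacity(request + 1, cpu_cap))
		request++;
	return request;
}

/* Boundary values for a CPU at full capacity (hypothetical self-check). */
static void check_full_capacity_boundary(void)
{
	assert(max_fitting_request(SCHED_CAPACITY_SCALE) == 819);
	assert(fits_capacity(819, SCHED_CAPACITY_SCALE));
	assert(!fits_capacity(820, SCHED_CAPACITY_SCALE));
	assert(!fits_capacity(1024, SCHED_CAPACITY_SCALE));
}
```

So a request of 1024 is rejected even on a CPU with capacity 1024, which is why every CPU gets skipped in your case.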
IMHO, since no CPU in the system can handle such a UCLAMP_MIN value, we should think about disabling the possibility to set those UCLAMP_MIN values in the first place.
But there is more to it than just this static 819 boundary. Since we use cpu_cap = capacity_of(cpu) (and not capacity_orig_of(cpu)), the cpu_cap might be < 1024 on a big CPU.
So even a task with smaller UCLAMP_MIN values than 819 might suffer from the fact that find_energy_efficient_cpu() returns prev_cpu in case there is capacity pressure on all the big CPUs (due to RT/DL/IRQ/thermal).
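A small sketch of that effect, under simplifying assumptions: boosted_util() is a hypothetical stand-in for the uclamp clamping (it only raises util to UCLAMP_MIN), and the reduced capacity of 900 is a made-up value for a big CPU under pressure.

```c
#include <assert.h>
#include <stdbool.h>

/* Same 20% headroom rule as the fits_capacity() macro in kernel/sched/fair.c. */
static bool fits_capacity(unsigned long request, unsigned long cpu_cap)
{
	return request * 1280 < cpu_cap * 1024;
}

/* Simplified stand-in for uclamp clamping: raise util to at least uclamp_min. */
static unsigned long boosted_util(unsigned long util, unsigned long uclamp_min)
{
	return util < uclamp_min ? uclamp_min : util;
}

/* Hypothetical scenario: a big CPU whose capacity drops from 1024 to 900. */
static void check_pressure_scenario(void)
{
	/* Small task (util 100) with UCLAMP_MIN = 750, i.e. below 819. */
	unsigned long util = boosted_util(100, 750);

	assert(util == 750);
	assert(fits_capacity(util, 1024));  /* no pressure: the CPU fits */
	assert(!fits_capacity(util, 900));  /* under pressure: the CPU is skipped */
}
```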
There are several ideas on the table to solve (or at least mitigate) this issue:
(1) Reduce the task's UCLAMP_MIN space to [0 .. 819] (mentioned above) or remap [0 .. 1023] to [0 .. 819] internally.
(2) System-wide spare capacity CPU (your proposal).
The issue w/ this is that you would return the system-wide max_spare_capacity CPU although CPUs sharing the Perf Domain with prev_cpu could serve as a best_energy_cpu. This can happen in case only prev_cpu experiences (temporary) capacity pressure.
(3) Route a task which cannot fit on prev_cpu due to its UCLAMP_MIN value through select_idle_sibling() -> select_idle_capacity(). I.e. add an appropriate check next to the rd->overutilized and sd checks at the beginning of find_energy_efficient_cpu() which does a goto fail.
Similar issue as in (2) here.
(4) Deal with the fact that we return prev_cpu in this case (current behavior). Doesn't seem too bad in combination with (1).
I guess each solution can only work in combination with (1).
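For the remap variant of (1), one possible shape is a plain linear scaling. This is only a sketch: remap_uclamp_min() and UCLAMP_MIN_FIT_LIMIT are hypothetical names, and scaling from a user range of [0 .. 1024] with integer rounding down is my assumption, not something that exists in the kernel today.

```c
#include <assert.h>
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024UL
/* Hypothetical name: largest request passing the 20% margin on capacity 1024. */
#define UCLAMP_MIN_FIT_LIMIT 819UL

/* Same 20% headroom rule as the fits_capacity() macro in kernel/sched/fair.c. */
static bool fits_capacity(unsigned long request, unsigned long cpu_cap)
{
	return request * 1280 < cpu_cap * 1024;
}

/* Hypothetical linear remap of a user-visible UCLAMP_MIN to [0 .. 819]. */
static unsigned long remap_uclamp_min(unsigned long user_value)
{
	/* Values above the scale are treated like the maximum. */
	if (user_value > SCHED_CAPACITY_SCALE)
		user_value = SCHED_CAPACITY_SCALE;
	return user_value * UCLAMP_MIN_FIT_LIMIT / SCHED_CAPACITY_SCALE;
}

/* Hypothetical self-check of the end points and the midpoint. */
static void check_remap(void)
{
	assert(remap_uclamp_min(0) == 0);
	assert(remap_uclamp_min(512) == 409);
	assert(remap_uclamp_min(1024) == 819);
	/* Even the maximum user value now fits a CPU at full capacity. */
	assert(fits_capacity(remap_uclamp_min(1024), SCHED_CAPACITY_SCALE));
}
```

Note that this only covers the static boundary. A big CPU with cpu_cap < 1024 can still reject a remapped value.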
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index da3e5b54715b..7e431195753e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6536,6 +6536,8 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 	unsigned long prev_delta = ULONG_MAX, best_delta = ULONG_MAX;
 	struct root_domain *rd = cpu_rq(smp_processor_id())->rd;
 	unsigned long cpu_cap, util, base_energy = 0;
+	unsigned long sys_max_spare_cap = 0;
+	int sys_max_spare_cap_cpu = prev_cpu;
 	int cpu, best_energy_cpu = prev_cpu;
 	struct sched_domain *sd;
 	struct perf_domain *pd;
@@ -6576,6 +6578,14 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 			cpu_cap = capacity_of(cpu);
 			spare_cap = cpu_cap - util;
 
+			/*
+			 * Find the CPU with the maximum spare capacity in
+			 * the performance domain
+			 */
+			if (spare_cap > sys_max_spare_cap) {
+				sys_max_spare_cap = spare_cap;
+				sys_max_spare_cap_cpu = cpu;
+			}
 			/*
 			 * Skip CPUs that cannot satisfy the capacity request.
 			 * IOW, placing the task there would make the CPU
@@ -6622,7 +6632,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 	 * least 6% of the energy used by prev_cpu.
 	 */
 	if (prev_delta == ULONG_MAX)
-		return best_energy_cpu;
+		return sys_max_spare_cap_cpu;
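
To illustrate the concern with returning the system-wide max spare capacity CPU, here is a rough user-space model of only the fallback path of this patch. Everything in it is an assumption made for illustration: struct cpu_model, patched_fallback() and the capacity/util numbers are hypothetical, the energy comparison is not modelled at all, and the uclamp handling is reduced to raising util to UCLAMP_MIN.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one CPU; not a kernel structure. */
struct cpu_model {
	unsigned long cap;   /* stands in for capacity_of(cpu) */
	unsigned long util;  /* utilization with the task placed on this CPU */
};

/* Same 20% headroom rule as the fits_capacity() macro in kernel/sched/fair.c. */
static bool fits_capacity(unsigned long request, unsigned long cpu_cap)
{
	return request * 1280 < cpu_cap * 1024;
}

/*
 * Models only the fallback of the patch: returns the system-wide max spare
 * capacity CPU when prev_cpu does not fit, or -1 when the (unmodelled)
 * energy comparison would decide.
 */
static int patched_fallback(const struct cpu_model *cpus, int nr_cpus,
			    unsigned long uclamp_min, int prev_cpu)
{
	unsigned long sys_max_spare_cap = 0;
	int sys_max_spare_cap_cpu = prev_cpu;
	bool prev_fits = false;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		unsigned long util = cpus[cpu].util;
		/* Saturating subtraction: a choice of this sketch. */
		unsigned long spare_cap =
			cpus[cpu].cap > util ? cpus[cpu].cap - util : 0;

		if (spare_cap > sys_max_spare_cap) {
			sys_max_spare_cap = spare_cap;
			sys_max_spare_cap_cpu = cpu;
		}

		/* Simplified uclamp: boost util up to uclamp_min. */
		if (util < uclamp_min)
			util = uclamp_min;
		if (cpu == prev_cpu && fits_capacity(util, cpus[cpu].cap))
			prev_fits = true;
	}

	return prev_fits ? -1 : sys_max_spare_cap_cpu;
}

/* Hypothetical scenario: only prev_cpu (CPU2) is under capacity pressure. */
static void check_fallback_scenario(void)
{
	/* CPUs 0-1: little; CPUs 2-3: big, sharing one perf domain. */
	const struct cpu_model cpus[] = {
		{ 446, 150 }, { 446, 120 }, { 700, 400 }, { 1024, 800 },
	};

	/* CPU3 would fit the boosted request, yet little CPU1 is returned. */
	assert(fits_capacity(800, cpus[3].cap));
	assert(patched_fallback(cpus, 4, 600, 2) == 1);
}
```

In this made-up scenario the task ends up on a little CPU that cannot satisfy its UCLAMP_MIN, although the other big CPU in prev_cpu's perf domain could have served it.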