EASv1.2 unified CPU selection for woken tasks in the function find_best_target(); this function tries to select an idle CPU, or the CPU with the lowest utilization, as the candidate CPU. This reduces scheduling latency and boosts performance for interactive scenarios.
On the other hand, this function is not comprehensive for power saving, and it is fragile on big.LITTLE cluster systems.
For example, suppose the "prefer_idle" flag is set and a small task that was running on a big core is woken up after sleeping. find_best_target() iterates over the scheduling groups to find the best target for the task, starting from the previous CPU's scheduling group; so if the previous CPU is a big core, it iterates over the big cluster first. As a result, it is much more likely to select an idle big core and bail out immediately, rather than select an idle CPU from the LITTLE cluster.
Another case: if the "prefer_idle" flag is cleared, the first iteration covers the scheduling group for the LITTLE cluster and the second covers the group for the big cluster; find_best_target() is then very likely to select a big CPU for both 'target_cpu' and 'best_idle_cpu'. This is not optimal from a power-saving perspective: we may miss the chance to select a LITTLE CPU running at a lower OPP.
So this patch adds back the function find_nrg_efficient_target(), which selects a CPU from an energy-efficiency perspective.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 kernel/sched/fair.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e9afae4..45b4080 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6310,6 +6310,73 @@ static inline int find_best_target(struct task_struct *p, bool boosted, bool pre
 	return target_cpu;
 }
+static inline int find_nrg_efficient_target(struct task_struct *p,
+					    struct sched_domain *sd)
+{
+	struct sched_group *sg, *sg_target;
+	int target_max_cap = INT_MAX;
+	int target_cpu = task_cpu(p);
+	unsigned long task_util_boosted, new_util;
+	int i;
+
+	sg = sd->groups;
+	sg_target = sg;
+
+	/*
+	 * Find group with sufficient capacity. We only get here if no cpu is
+	 * overutilized. We may end up overutilizing a cpu by adding the task,
+	 * but that should not be any worse than select_idle_sibling().
+	 * load_balance() should sort it out later as we get above the tipping
+	 * point.
+	 */
+	do {
+		/* Assuming all cpus are the same in group */
+		int max_cap_cpu = group_first_cpu(sg);
+
+		/*
+		 * Assume smaller max capacity means more energy-efficient.
+		 * Ideally we should query the energy model for the right
+		 * answer but it easily ends up in an exhaustive search.
+		 */
+		if (capacity_of(max_cap_cpu) < target_max_cap &&
+		    task_fits_max(p, max_cap_cpu)) {
+			sg_target = sg;
+			target_max_cap = capacity_of(max_cap_cpu);
+		}
+	} while (sg = sg->next, sg != sd->groups);
+
+	task_util_boosted = boosted_task_util(p);
+	/* Find cpu with sufficient capacity */
+	for_each_cpu_and(i, tsk_cpus_allowed(p), sched_group_cpus(sg_target)) {
+		/*
+		 * p's blocked utilization is still accounted for on prev_cpu
+		 * so prev_cpu will receive a negative bias due to the double
+		 * accounting. However, the blocked utilization may be zero.
+		 */
+		new_util = cpu_util(i) + task_util_boosted;
+
+		/*
+		 * Ensure minimum capacity to grant the required boost.
+		 * The target CPU can be already at a capacity level higher
+		 * than the one required to boost the task.
+		 */
+		if (new_util > capacity_orig_of(i))
+			continue;
+
+		if (new_util < capacity_curr_of(i)) {
+			target_cpu = i;
+			if (cpu_rq(i)->nr_running)
+				break;
+		}
+
+		/* cpu has capacity at higher OPP, keep it as fallback */
+		if (target_cpu == task_cpu(p))
+			target_cpu = i;
+	}
+
+	return target_cpu;
+}
+
 /*
  * Disable WAKE_AFFINE in the case where task @p doesn't fit in the
  * capacity of either the waking CPU @cpu or the previous CPU @prev_cpu.
--
1.9.1