When a task is woken up, energy aware scheduling makes one significant change: it sets 'want_affine' so that CPU selection goes through the energy aware path while the system is under the tipping point, or picks an idle sibling CPU once the system is over the tipping point. For idle sibling selection, only idle CPUs in the first level scheduling domain are considered.
As a result, when many big tasks are running, the scheduler gets no chance to migrate some of them across a higher level scheduling domain. Tasks can hardly migrate to CPUs in another cluster, so one cluster ends up packed with many tasks while the other cluster sits idle; this harms performance in the multi-threading case.
This patch adds more checking for 'want_affine'. If all CPUs in the highest capacity cluster have tasks running on them, fall back to the traditional wakeup migration path, which helps select the most idle CPU in the system. This gives more chances to migrate a task to an idle CPU, which decreases scheduling latency and improves performance.
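To illustrate the idea, here is a minimal standalone C sketch of the new decision (illustration only, not kernel code; boost_margin(), waking_cpu_is_big() and big_cpu_idle[] are made-up stand-ins for schedtune_task_margin(), the capacity_orig_of()/max_cpu_capacity comparison and idle_cpu() used by need_want_affine() in the diff below):

  /* Illustrative userspace sketch of the 'want_affine' decision. */
  #include <stdbool.h>
  #include <stdio.h>

  #define NR_BIG_CPUS 2

  /* Assume both big CPUs already have tasks running. */
  static bool big_cpu_idle[NR_BIG_CPUS] = { false, false };

  static bool boost_margin(void)      { return false; } /* no schedtune boost */
  static bool waking_cpu_is_big(void) { return true; }  /* wakeup on a big CPU */

  /*
   * Mirror of the need_want_affine() logic: keep the affine/EAS path unless
   * the whole big cluster is busy, in which case return false so the wakeup
   * falls back to the wide path and can pick an idle CPU anywhere.
   */
  static bool need_want_affine(void)
  {
  	int i;

  	if (boost_margin())
  		return true;		/* boosted task: keep EAS placement */

  	if (!waking_cpu_is_big())
  		return true;		/* not the highest capacity cluster */

  	for (i = 0; i < NR_BIG_CPUS; i++)
  		if (big_cpu_idle[i])
  			return true;	/* big cluster still has an idle CPU */

  	return false;			/* big cluster fully busy: go wide */
  }

  int main(void)
  {
  	printf("want_affine? %s\n",
  	       need_want_affine() ? "yes (EAS path)"
  				  : "no (fall back to wide wakeup)");
  	return 0;
  }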
Tested this patch with the Geekbench multi-threaded case on the ARM Juno R2 board; the score improves from 2281 to 2357, about a 3.3% performance improvement.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 kernel/sched/fair.c | 41 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f2ab238..16eb48d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5328,6 +5328,34 @@ static bool cpu_overutilized(int cpu)
 	return (capacity_of(cpu) * 1024) < (cpu_util(cpu) * capacity_margin);
 }
 
+static bool need_want_affine(struct task_struct *p, int cpu)
+{
+	int capacity = capacity_orig_of(cpu);
+	int max_capacity = cpu_rq(cpu)->rd->max_cpu_capacity.val;
+	unsigned long margin = schedtune_task_margin(p);
+	struct sched_domain *sd;
+	int affine = 0, i;
+
+	if (margin)
+		return 1;
+
+	if (capacity != max_capacity)
+		return 1;
+
+	sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd);
+	if (!sd)
+		return 1;
+
+	for_each_cpu(i, sched_domain_span(sd)) {
+		if (idle_cpu(i)) {
+			affine = 1;
+			break;
+		}
+	}
+
+	return affine;
+}
+
 #ifdef CONFIG_SCHED_TUNE
 
 static long
@@ -5891,7 +5919,7 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
 	if (sd_flag & SD_BALANCE_WAKE)
 		want_affine = (!wake_wide(p) && task_fits_max(p, cpu) &&
 			      cpumask_test_cpu(cpu, tsk_cpus_allowed(p))) ||
-			      energy_aware();
+			      (energy_aware() && need_want_affine(p, cpu));
 
 	rcu_read_lock();
 	for_each_domain(cpu, tmp) {
@@ -8030,9 +8058,14 @@ static int need_active_balance(struct lb_env *env)
 
 	if ((capacity_orig_of(env->src_cpu) < capacity_orig_of(env->dst_cpu)) &&
 	    env->src_rq->cfs.h_nr_running == 1 &&
-	    cpu_overutilized(env->src_cpu) &&
-	    !cpu_overutilized(env->dst_cpu))
-		return 1;
+	    cpu_overutilized(env->src_cpu)) {
+
+		if (idle_cpu(env->dst_cpu))
+			return 1;
+
+		if (!idle_cpu(env->dst_cpu) && !cpu_overutilized(env->dst_cpu))
+			return 1;
+	}
 
 	return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
 }
-- 
1.9.1