On 26 March 2015 at 15:19, Dietmar Eggemann dietmar.eggemann@arm.com wrote:
On 27/02/15 15:54, Vincent Guittot wrote:
When a CPU is used to handle a lot of IRQs or some RT tasks, the remaining capacity for CFS tasks can be significantly reduced. Once we detect such situation by comparing cpu_capacity_orig and cpu_capacity, we trig an idle load balance to check if it's worth moving its tasks on an idle CPU. It's worth trying to move the task before the CPU is fully utilized to minimize the preemption by irq or RT tasks.
Once the idle load_balance has selected the busiest CPU, it will look for an active load balance for only two cases :
- there is only 1 task on the busiest CPU.
- we haven't been able to move a task of the busiest rq.
A CPU with a reduced capacity is included in the 1st case, and it's worth to actively migrate its task if the idle CPU has got more available capacity for CFS tasks. This test has been added in need_active_balance.
As a sidenote, this will not generate more spurious ilb because we already trig an ilb if there is more than 1 busy cpu. If this cpu is the only one that has a task, we will trig the ilb once for migrating the task.
The nohz_kick_needed function has been cleaned up a bit while adding the new test
env.src_cpu and env.src_rq must be set unconditionnally because they are used in need_active_balance which is called even if busiest->nr_running equals 1
Signed-off-by: Vincent Guittot vincent.guittot@linaro.org
kernel/sched/fair.c | 69 ++++++++++++++++++++++++++++++++++++----------------- 1 file changed, 47 insertions(+), 22 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 7420d21..e70c315 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6855,6 +6855,19 @@ static int need_active_balance(struct lb_env *env) return 1; }
/*
* The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
* It's worth migrating the task if the src_cpu's capacity is
reduced
* because of other sched_class or IRQs if more capacity stays
* available on dst_cpu.
*/
if ((env->idle != CPU_NOT_IDLE) &&
(env->src_rq->cfs.h_nr_running == 1)) {
if ((check_cpu_capacity(env->src_rq, sd)) &&
(capacity_of(env->src_cpu)*sd->imbalance_pct <
capacity_of(env->dst_cpu)*100))
return 1;
}
}return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
@@ -6954,6 +6967,9 @@ static int load_balance(int this_cpu, struct rq *this_rq,
schedstat_add(sd, lb_imbalance[idle], env.imbalance);
env.src_cpu = busiest->cpu;
Isn't this 'env.src_cpu = busiest->cpu;' or 'env.src_cpu = cpu_of(busiest);' already needed due to the existing ASYM_PACKING check in need_active_balance() 'if ( ... && env->src_cpu > env->dst_cpu)' for CPU_NEWLY_IDLE? Otherwise like you said, in these 'busiest->nr_running equals 1' instances, env->src_cpu is un-initialized.
yes, i sent a fix for that purpose some times ago : https://lkml.org/lkml/2013/2/12/158 but it has not gone further than the mailing list.
AFAICT, SD_ASYM_PACKING can't trig an active load balance on the cpu with the lowest id without this fix as src_cpu is initialized to 0 which implies that 'env->src_cpu > env->dst_cpu' is always false
Vincent
env.src_rq = busiest;
ld_moved = 0; if (busiest->nr_running > 1) { /*
@@ -6963,8 +6979,6 @@ static int load_balance(int this_cpu, struct rq *this_rq, * correctly treated as an imbalance. */ env.flags |= LBF_ALL_PINNED;
env.src_cpu = busiest->cpu;
env.src_rq = busiest; env.loop_max = min(sysctl_sched_nr_migrate,
busiest->nr_running);
more_balance:
[...]