An idle CPU keeps a stale utilization value, and this value is not updated until the CPU is woken up. In the worst case an idle CPU may stay in idle states for a very long time (even on the order of seconds), so if the CPU had a high utilization value just before it entered idle, the scheduler keeps treating the CPU as "overutilized".
This is a defect in the scheduler load metrics, and as a result the scheduler makes wrong decisions about the tipping point. For example, the scheduler calls update_sg_lb_stats() to iterate over all CPUs, check that none of them is overutilized, and then clear the flag to indicate that the system is under the tipping point. If any idle CPU has a stale utilization value that makes it look "overutilized", update_sg_lb_stats() wrongly considers the system to be over the tipping point, even though the idle CPU has been in idle states for a long time.
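As an illustration only, the effect of one stale value on the system-wide decision can be modelled in a few lines of user-space C. This is a toy model, not kernel code: struct toy_cpu, toy_cpu_overutilized(), toy_system_overutilized() and the 1280 margin (roughly 80% of capacity) are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU snapshot; this is not the kernel's struct rq. */
struct toy_cpu {
	unsigned long util;     /* last recorded utilization, 0..1024 */
	unsigned long capacity; /* CPU capacity, 1024 = biggest CPU */
	bool idle;              /* true if the CPU is currently idle */
};

/* Assumed margin: a CPU counts as overutilized above ~80% of capacity. */
#define TOY_CAPACITY_MARGIN 1280UL

static bool toy_cpu_overutilized(const struct toy_cpu *c)
{
	return c->capacity * 1024UL < c->util * TOY_CAPACITY_MARGIN;
}

/*
 * Returns true if any CPU reports overutilization. The idle flag is
 * deliberately ignored, so a stale value on an idle CPU is enough to
 * keep the whole system flagged.
 */
static bool toy_system_overutilized(const struct toy_cpu *cpus, size_t n)
{
	assert(cpus != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (toy_cpu_overutilized(&cpus[i]))
			return true;
	return false;
}
```

In this model, a lightly loaded busy CPU (util 100) next to an idle CPU that went to sleep with util 900 is still reported as overutilized; once the idle CPU's value is brought down to 0, the flag clears.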
So essentially we need to figure out a proper method to decay idle CPU utilization. One possible method is to wake up idle CPUs from the scheduler tick, so that each idle CPU exits its idle state, updates its own utilization value, and goes back to sleep if there is no task on it. This method is suboptimal, though, and potentially harms energy, because CPUs enter and exit idle states merely to decay load metrics.
This patch uses load balancing as an opportunity to decay the blocked load of idle CPUs, so that the system eventually gets correct load metrics for idle CPUs.
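To give a feel for what decaying a blocked value means, here is a minimal user-space sketch of a geometric decay. It assumes PELT-style constants (periods of about 1 ms and a half-life of 32 periods); toy_decay_util() and both macros are hypothetical names for this sketch and it is not the implementation of update_blocked_averages().

```c
#include <assert.h>

/* Assumed PELT-style constants: half-life of 32 periods. */
#define TOY_HALFLIFE_PERIODS 32u
#define TOY_DECAY_Y 0.97857206 /* approximately 0.5^(1/32) */

/* Decay a blocked utilization value across 'periods' idle periods. */
static unsigned long toy_decay_util(unsigned long util, unsigned int periods)
{
	double v = (double)util;

	/* Whole half-lives: halve the value once per 32 periods. */
	for (; periods >= TOY_HALFLIFE_PERIODS; periods -= TOY_HALFLIFE_PERIODS)
		v *= 0.5;

	/* Remaining periods: multiply by y once per period. */
	while (periods--)
		v *= TOY_DECAY_Y;

	/* Truncate towards zero, as integer metrics would. */
	return (unsigned long)v;
}
```

Under these assumptions a value of 900 drops to 450 after 32 idle periods and to 225 after 64, so a CPU that has been idle for a second no longer contributes any meaningful utilization, provided something actually runs the decay on its behalf.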
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 kernel/sched/fair.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2a263f7..3278b563 100755
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8204,9 +8204,14 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		/*
 		 * No need to call idle_cpu() if nr_running is not 0
 		 */
-		if (!nr_running && idle_cpu(i))
+		if (!nr_running && idle_cpu(i)) {
 			sgs->idle_cpus++;
 
+			/* update idle CPU blocked load */
+			if (cpu_util(i))
+				update_blocked_averages(i);
+		}
+
 		if (cpu_overutilized(i)) {
 			overutilized = true;
 			if (!sgs->group_misfit_task && rq->misfit_task)
-- 
1.9.1