Hi Leo, Patrick,
Do we have C-state info at this point? We could perhaps only 'do something' (ignore overutilised or update load averages, whichever is best) for idle CPUs which have requested the deepest idle state, on the grounds that a shallow idle state indicates we expect to leave idle very soon. We would also need a good definition of shallow idle states, of course: on some platforms I guess the exit latency of CPU down might be reported as low enough that we enter it a lot, which makes it a bit spurious to even bother checking.
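To make the idea concrete, here is a rough userspace sketch of such a filter. It is not kernel code: struct cpu_snapshot, both functions and the latency threshold are hypothetical names and values chosen only for illustration. It combines the two conditions discussed above, "requested the deepest state" and "that state is not cheap to leave".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU snapshot; none of these names are kernel APIs. */
struct cpu_snapshot {
	bool idle;                    /* CPU is currently idle */
	int idle_state;               /* requested idle state index, 0 = shallowest */
	unsigned int exit_latency_us; /* reported exit latency of that state */
	bool overutilized;            /* result of the (possibly stale) utilization check */
};

/* Assumed cut-off: a state that is cheaper to leave is treated as shallow. */
#define DEEP_IDLE_MIN_EXIT_LATENCY_US 1000u

/* True only for an idle CPU in the deepest state with a non-trivial exit latency. */
bool in_deepest_idle(const struct cpu_snapshot *c, int deepest_state)
{
	assert(deepest_state >= 0);
	return c->idle && c->idle_state == deepest_state &&
	       c->exit_latency_us >= DEEP_IDLE_MIN_EXIT_LATENCY_US;
}

/* Should this CPU contribute to the system-wide overutilized flag? */
bool counts_as_overutilized(const struct cpu_snapshot *c, int deepest_state)
{
	if (!c->overutilized)
		return false;
	/* Discount stale utilization only when a long idle period is expected. */
	return !in_deepest_idle(c, deepest_state);
}
```

With this shape, a platform that reports a very low exit latency for its deepest state never has its idle CPUs discounted, which is one way of handling the "we enter it a lot" case.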
Leo, do you have a feel for whether this is a very rare event, or something we can easily repro with a test case?
Best Regards,
--Chris
________________________________
From: Patrick Bellasi <patrick.bellasi@arm.com>
Sent: 11 October 2016 12:35
To: leo.yan@linaro.org
Cc: eas-dev@lists.linaro.org; Dietmar Eggemann; Morten Rasmussen; Robin Randhawa; Juri Lelli; Chris Redpath; Vincent Guittot; Steve Muckle
Subject: Re: [PATCH v1 7/7] sched/fair: consider CPU overutilized only when it is not idle

On 10-Oct 16:35, Leo Yan wrote:
Energy aware scheduling reaches its tipping point when any CPU in the system is overutilized. There are several places that set the root domain's overutilized flag to indicate that the system is over the tipping point: the scheduler tick, load balance, and task enqueue. On the other hand, the scheduler only uses the load balance function update_sg_lb_stats() to iterate over all CPUs, make sure none of them is overutilized, and then clear the flag once the system is under the tipping point.
An idle CPU keeps a stale utilization value, which is not updated until the CPU is woken up. In the worst case the CPU may stay in idle states for a very long time (even on the order of seconds), so if it had a quite high utilization value before entering idle, the scheduler will always think the CPU is "overutilized" and will not switch back to the under-tipping-point state. As a result, a very small task stays on a big core for a long time because the system cannot go back to the energy aware path.
What happens instead if a busy CPU has just entered idle but is likely to exit quite soon?
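The trade-off in this exchange can be shown with a small userspace model. This is only a sketch: struct cpu_model, both functions and the margin value are assumptions made for illustration, and the form of the overutilized check (utilization scaled by a margin compared against capacity) is assumed rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4
#define CAPACITY_MARGIN 1280 /* assumed headroom factor (1280/1024), illustrative only */

struct cpu_model {
	unsigned long capacity; /* compute capacity, 1024 = biggest CPU */
	unsigned long util;     /* last recorded utilization, frozen while idle */
	bool idle;
};

/* Assumed form of the check: utilization plus margin exceeds capacity. */
bool model_cpu_overutilized(const struct cpu_model *c)
{
	return c->capacity * 1024 < c->util * CAPACITY_MARGIN;
}

/* Walk all CPUs the way a load-balance statistics pass would. */
bool system_overutilized(const struct cpu_model *cpus, int n, bool skip_idle)
{
	assert(n > 0 && n <= NR_CPUS);
	for (int i = 0; i < n; i++) {
		if (skip_idle && cpus[i].idle)
			continue; /* ignore idle CPUs entirely */
		if (model_cpu_overutilized(&cpus[i]))
			return true;
	}
	return false;
}
```

With skip_idle set to false, one idle CPU holding a stale utilization of 900 keeps the whole system flagged. With skip_idle set to true, the flag clears, but the model cannot tell a CPU that has been idle for seconds from one that went idle a moment ago and is about to resume its heavy task.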
This patch checks the CPU idle state in update_sg_lb_stats(): if a CPU is idle, it is simply considered not overutilized. This avoids idle CPUs setting the tipping point.
Maybe it's possible, just for idle CPUs marked as overutilized, to trigger an update_cfs_rq_load_avg() at this point and then verify whether the utilization signal has decayed enough for the CPU to no longer be considered overutilized?
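A userspace sketch of that "decay first, then re-check" idea follows. It is not the kernel implementation: decay_util(), overutilized_after_decay() and the constants are hypothetical, and the decay is a simplified fixed-point approximation that assumes the utilization signal halves for roughly every 32 ms spent idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HALF_LIFE_MS 32      /* assumed: signal halves every 32 ms of idle time */
#define Y_FIXED 1002         /* about 0.5^(1/32) * 1024, per-millisecond factor */
#define CAPACITY_MARGIN 1280 /* assumed headroom factor, illustrative only */

/* Decay a utilization value across idle_ms milliseconds of idleness. */
uint64_t decay_util(uint64_t util, unsigned int idle_ms)
{
	unsigned int halvings = idle_ms / HALF_LIFE_MS;
	unsigned int rest = idle_ms % HALF_LIFE_MS;

	if (halvings >= 64)
		return 0; /* fully decayed; also avoids an oversized shift */
	util >>= halvings;
	while (rest--)
		util = (util * Y_FIXED) >> 10; /* one more millisecond of decay */
	return util;
}

/* Re-evaluate the overutilized condition on the decayed value. */
bool overutilized_after_decay(uint64_t capacity, uint64_t stale_util,
			      unsigned int idle_ms)
{
	assert(capacity > 0);
	return capacity * 1024 <
	       decay_util(stale_util, idle_ms) * CAPACITY_MARGIN;
}
```

In this model a CPU that went idle one millisecond ago with a utilization of 900 still counts as overutilized, while the same CPU after 32 ms of idle does not. That gives the gradual behaviour asked about above, instead of ignoring every idle CPU unconditionally.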
Signed-off-by: Leo Yan <leo.yan@linaro.org>
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 937eca2..43eae09 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7409,7 +7409,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		if (!nr_running && idle_cpu(i))
 			sgs->idle_cpus++;
 
-		if (cpu_overutilized(i)) {
+		if (cpu_overutilized(i) && !idle_cpu(i)) {
 			*overutilized = true;
 			if (!sgs->group_misfit_task && rq->misfit_task)
 				sgs->group_misfit_task = capacity_of(i);
--
1.9.1

-- #include <best/regards.h>
Patrick Bellasi