On 12 October 2016 at 17:03, Dietmar Eggemann dietmar.eggemann@arm.com wrote:
On 26/09/16 13:19, Vincent Guittot wrote:
A task can be asynchronously detached from cfs_rq when migrating between CPUs. The load of the migrated task is then removed from source cfs_rq during its next update. We use this event to set propagation flag.
During the load balance, we take advanatge of the update of blocked load to we propagate any pending changes.
IMHO, it would be a good idea to mention that '2/7 sched: fix hierarchical order in rq->leaf_cfs_rq_list' is a hard requirement for this to work. The functionality relies on the order of cfs_rq's (top to root) in the rq->leaf_cfs_rq_list list.
yes. i will add a comment
Signed-off-by: Vincent Guittot vincent.guittot@linaro.org
kernel/sched/fair.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 8ba500f..bd3b6b9 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3221,6 +3221,7 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq) sub_positive(&sa->load_avg, r); sub_positive(&sa->load_sum, r * LOAD_AVG_MAX); removed_load = 1;
set_tg_cfs_propagate(cfs_rq); } if (atomic_long_read(&cfs_rq->removed_util_avg)) {
@@ -3228,6 +3229,7 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq) sub_positive(&sa->util_avg, r); sub_positive(&sa->util_sum, r * LOAD_AVG_MAX); removed_util = 1;
set_tg_cfs_propagate(cfs_rq); } decayed = __update_load_avg(now, cpu_of(rq_of(cfs_rq)), sa,
@@ -6607,6 +6609,10 @@ static void update_blocked_averages(int cpu)
if (update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq, true)) update_tg_load_avg(cfs_rq, 0);
/* Propagate pending load changes to the parent */
if (cfs_rq->tg->se[cpu])
update_load_avg(cfs_rq->tg->se[cpu], 0);
In my test (1 task (run/period: 8ms/16ms) in tg_root->tg_x->tg_y->*tg_z* and oscillating between cpu1 and cpu2) the cfs_rq related signals are nicely going down to 0 after the task has left the cpu but it doesn't seem to be the case for the corresponding se (cfs_rq->tg->se[cpu])?
strange because such use case is part of the functional tests that I run and it was working fine according to last test that I did
It should actually work correctly because of the update_tg_cfs_util/load() calls in update_load_avg(cfs_rq->tg->se[cpu], 0)->propagate_entity_load_avg()
Furthermore, the update of the parent cfs_rq tg_x->cfs_rq[cpu] uses the delta between previous and new value for the child tg_y->se[cpu]. So it means that if tg_x->cfs_rq[cpu]->avg.load_avg goes down to 0, tg_y->se[cpu]->avg.load_avg has at least changed and most probably goes down to 0 too
Can't it be a misplaced trace point ?
[...]