On 12/09/16 08:47, Vincent Guittot wrote:
> When a task moves from/to a cfs_rq, we set a flag which is then used to
> propagate the change at parent level (sched_entity and cfs_rq) during
> next update. If the cfs_rq is throttled, the flag will stay pending
> until the cfs_rq is unthrottled.
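To make sure I read the mechanism right, here is a minimal userspace model
of the deferred propagation (all names are made up, not the patch's code):

        #include <stdbool.h>

        struct grp {
                long load;              /* this group's load */
                long propagated;        /* last value pushed to the parent */
                bool prop_pending;      /* set when a task attaches/detaches */
                bool throttled;         /* while set, propagation is deferred */
        };

        static void attach_task(struct grp *g, long task_load)
        {
                g->load += task_load;
                g->prop_pending = true; /* consumed at the next update */
        }

        static void update_group(struct grp *g, long *parent_load)
        {
                if (!g->prop_pending || g->throttled)
                        return;         /* stays pending until unthrottled */

                *parent_load += g->load - g->propagated;
                g->propagated = g->load;
                g->prop_pending = false;
        }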
> For propagating the utilization, we copy the utilization of child cfs_rq to
s/child/group ?
> the sched_entity.
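If I follow, the utilization side is just a copy plus a delta to the
parent; a sketch with stand-in types (not the patch's helpers):

        struct avg { long util_avg; };

        static void propagate_util(struct avg *se, const struct avg *gcfs_rq,
                                   struct avg *parent)
        {
                long delta = gcfs_rq->util_avg - se->util_avg;

                se->util_avg = gcfs_rq->util_avg;       /* group util -> se */
                parent->util_avg += delta;              /* parent absorbs the change */
        }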
> For propagating the load, we have to take into account the load of the
> whole task group in order to evaluate the load of the sched_entity.
> Similarly to what was done before the rewrite of PELT, we add a
> correction factor in case the task group's load is less than its share,
> so it will contribute the same load as a task of equal weight.
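The correction factor then cancels the share scaling whenever
tg_load < shares, so the entity ends up contributing the group load
itself, like a task of equal weight. A runnable model with made-up
numbers (plain longs instead of PELT sums):

        #include <stdio.h>

        static long scale_group_load(long load, long tg_load, long shares)
        {
                long scaled = load * shares / tg_load;

                /* group consuming <1 cpu: scale back to the raw group load */
                if (tg_load < shares) {
                        scaled *= tg_load;
                        scaled /= shares;
                }
                return scaled;
        }

        int main(void)
        {
                /* half-loaded group: tg_load (512) < shares (1024) */
                printf("%ld\n", scale_group_load(512, 512, 1024));  /* 512, not 1024 */
                /* fully loaded group: no correction applied */
                printf("%ld\n", scale_group_load(512, 1024, 1024)); /* 512 */
                return 0;
        }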
What about cfs_rq->runnable_load_avg?
[...]
> +/* Take into account change of load of a child task group */
> +static inline void
> +update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +	struct cfs_rq *gcfs_rq = group_cfs_rq(se);
> +	long delta, load = gcfs_rq->avg.load_avg;
> +
> +	/* If the load of group cfs_rq is null, the load of the
> +	 * sched_entity will also be null so we can skip the formula
> +	 */
> +	if (load) {
> +		long tg_load;
> +
> +		/* Get tg's load and ensure tg_load > 0 */
> +		tg_load = atomic_long_read(&gcfs_rq->tg->load_avg) + 1;
> +
> +		/* Ensure tg_load >= load and updated with current load*/
> +		tg_load -= gcfs_rq->tg_load_avg_contrib;
> +		tg_load += load;
> +
> +		/* scale gcfs_rq's load into tg's shares*/
> +		load *= scale_load_down(gcfs_rq->tg->shares);
> +		load /= tg_load;
> +
> +		/*
> +		 * we need to compute a correction term in the case that the
> +		 * task group is consuming <1 cpu so that we would contribute
> +		 * the same load as a task of equal weight.
Wasn't 'consuming <1' related to 'NICE_0_LOAD' and not scale_load_down(gcfs_rq->tg->shares) before the rewrite of PELT (v4.2, __update_group_entity_contrib())?
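From memory, the v4.2 correction in __update_group_entity_contrib() keyed
off tg->runnable_avg against NICE_0_LOAD, roughly along these lines (a
paraphrase with stand-in parameters, not a verbatim quote):

        #define NICE_0_LOAD     1024L
        #define NICE_0_SHIFT    10

        static long old_group_contrib(long tg_load_contrib, long tg_load,
                                      long shares, long runnable_avg)
        {
                long contrib = tg_load_contrib * shares / (tg_load + 1);

                /* group consuming <1 cpu: runnable_avg below NICE_0_LOAD */
                if (runnable_avg < NICE_0_LOAD) {
                        contrib *= runnable_avg;
                        contrib >>= NICE_0_SHIFT;
                }
                return contrib;
        }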
> +		 */
> +		if (tg_load < scale_load_down(gcfs_rq->tg->shares)) {
> +			load *= tg_load;
> +			load /= scale_load_down(gcfs_rq->tg->shares);
> +		}
> +	}
[...]