On 10/11/15 09:58, Leo Yan wrote:
When clusters share a clock domain, they are bound to the same OPP; so when calculating energy data, all related CPUs should be taken into account. But the scheduler has no top-level sched group spanning all of these CPUs, so it will not iterate over all of them and will get a wrong calculation result.
This patch changes eenv->sg_cap to a struct cpumask and sets all related CPUs in it.
The 'put a struct cpumask sg_cap on the stack' idea should work, but so far we have favoured a different solution. Mainly to also cover single-cluster systems for EAS, we thought it would be better to introduce an extra sched domain 'SYS' (in your case on top of 'DIE') to:
(1) Hold cluster energy model (EM) data on a single-cluster system
(2) Offer this 'sg_shared_cap = sd->parent->groups' thing on a platform like Hikey
(3) Let cpu and cluster level EM data survive if cpus get off-lined
Obviously, this sd level can't be part of the actual scheduling decisions, so the CFS code has to change slightly for this to work. I attached the appropriate patch. It's only lightly tested on TC2 and JUNO and might not apply to the current code line.
Would be nice if you could test it and give feedback on whether you consider it something you're fine with. Thanks!
We should consider integrating one or the other solution into EAS RFC 6 to cover your topology for EAS as well.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 kernel/sched/fair.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d9d0e11..63aef51 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4892,7 +4892,7 @@ static inline bool energy_aware(void)

 struct energy_env {
 	struct sched_group	*sg_top;
-	struct sched_group	*sg_cap;
+	struct cpumask		sg_cap;
 	int			cap_idx;
 	int			usage_delta;
 	int			src_cpu;
@@ -4952,7 +4952,7 @@ unsigned long group_max_usage(struct energy_env *eenv)
 	int i, delta;
 	unsigned long max_usage = 0;

-	for_each_cpu(i, sched_group_cpus(eenv->sg_cap)) {
+	for_each_cpu(i, &eenv->sg_cap) {
 		delta = calc_usage_delta(eenv, i);
 		max_usage = max(max_usage, __get_cpu_usage(i, delta));
 	}
@@ -5054,8 +5054,19 @@ static unsigned int sched_group_energy(struct energy_env *eenv)
 	 * sched_group?
 	 */
 	sd = highest_flag_domain(cpu, SD_SHARE_CAP_STATES);
-	if (sd && sd->parent)
+	cpumask_clear(&eenv->sg_cap);
+	if (sd && sd->parent) {
 		sg_shared_cap = sd->parent->groups;
+		cpumask_or(&eenv->sg_cap, &eenv->sg_cap,
+			   sched_group_cpus(sg_shared_cap));
This path would cover platforms like TC2 and JUNO.
+	} else if (sd) {
+		sg_shared_cap = sd->groups;
+		do {
+			cpumask_or(&eenv->sg_cap, &eenv->sg_cap,
+				   sched_group_cpus(sg_shared_cap));
+		} while (sg_shared_cap = sg_shared_cap->next, sg_shared_cap != sd->groups);
+	}
This path would cover Hikey.
 	for_each_domain(cpu, sd) {
 		sg = sd->groups;
@@ -5069,11 +5080,6 @@ static unsigned int sched_group_energy(struct energy_env *eenv)
 		int sg_busy_energy, sg_idle_energy;
 		int cap_idx, idle_idx;

-		if (sg_shared_cap && sg_shared_cap->group_weight >= sg->group_weight)
-			eenv->sg_cap = sg_shared_cap;
-		else
-			eenv->sg_cap = sg;
-
 		cap_idx = find_new_capacity(eenv, sg->sge);

 		if (sg->group_weight == 1) {
@@ -5170,6 +5176,7 @@ static int energy_diff(struct energy_env *eenv)
 			energy_before += sched_group_energy(&eenv_before);
 			energy_after += sched_group_energy(eenv);
 		}
 	} while (sg = sg->next, sg != sd->groups);

 	eenv->nrg.before = energy_before;