On 19 December 2016 at 18:47, Dietmar Eggemann dietmar.eggemann@arm.com wrote:
On 12/19/2016 04:36 PM, Vincent Guittot wrote:
On 19 December 2016 at 16:02, Leo Yan leo.yan@linaro.org wrote:
On Mon, Dec 19, 2016 at 08:27:15AM +0100, Vincent Guittot wrote:
[...]
No, in Thara's patch, sd->overutilized and rd->overutilized have exactly the same meaning. We rely on the parent to share the overutilization state with the other CPUs at the same level, and rd->overutilized acts as the parent of the last sd level, but there is no difference in usage.
I think sd->overutilized and rd->overutilized have different visibility for CPUs. Please see the example below:
CPU A: SD level 1 - SG1 (CPU A), SG2 (CPU B); SD level 2 - SG5 (CPU A, CPU B), SG6 (CPU C, CPU D); RD
CPU B: SD level 1 - SG2 (CPU B), SG1 (CPU A); SD level 2 - SG5 (CPU A, CPU B), SG6 (CPU C, CPU D); RD
CPU C: SD level 1 - SG3 (CPU C), SG4 (CPU D); SD level 2 - SG6 (CPU C, CPU D), SG5 (CPU A, CPU B); RD
CPU D: SD level 1 - SG4 (CPU D), SG3 (CPU C); SD level 2 - SG6 (CPU C, CPU D), SG5 (CPU A, CPU B); RD
If CPU A sets its sd->overutilized flag in SG5 and CPU C later checks sd->overutilized, CPU C will only check the flag in SG6. So the flag set by CPU A can be observed by CPU B, but CPU C cannot observe it.
Yes, and that's normal: we set the flag in SG5 to say that load balancing is needed at SD level 1 between CPU A and CPU B. We use the SG at the parent level because it is shared between all the CPUs involved in the child sd level. But the last sd level has no parent :-) so we use rd as the parent.
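The scheme described above can be sketched as a small userspace model. This is only an illustration of the idea as stated in this thread, not the code of the patch: the structures are simplified stand-ins for the kernel ones, and `struct topology`, `build_topology()` and `sd_flag()` are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct sched_group { bool overutilized; };
struct root_domain { bool overutilized; };

struct sched_domain {
	struct sched_domain *parent; /* NULL at the last sd level */
	struct sched_group *groups;  /* local group, shared by the CPUs it spans */
	struct root_domain *rd;      /* stands in as parent of the last level */
};

enum { CPU_A, CPU_B, CPU_C, CPU_D, NR_CPUS };

struct topology {
	struct root_domain rd;
	struct sched_group sg5, sg6;      /* SG5 = {A, B}, SG6 = {C, D} */
	struct sched_domain sd1[NR_CPUS]; /* per-CPU SD level 1 */
	struct sched_domain sd2[NR_CPUS]; /* per-CPU SD level 2 */
};

/* Build the four-CPU example; level-1 groups (SG1..SG4) are not modelled. */
void build_topology(struct topology *t)
{
	*t = (struct topology){0};
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		t->sd2[cpu].groups = (cpu <= CPU_B) ? &t->sg5 : &t->sg6;
		t->sd2[cpu].rd = &t->rd;
		t->sd1[cpu].parent = &t->sd2[cpu];
		t->sd1[cpu].rd = &t->rd;
	}
}

/* Where the flag for this sd level lives: parent's local group, else rd. */
bool *sd_flag(struct sched_domain *sd)
{
	assert(sd != NULL && sd->rd != NULL);
	if (sd->parent)
		return &sd->parent->groups->overutilized;
	return &sd->rd->overutilized;
}

void set_sd_overutilized(struct sched_domain *sd) { *sd_flag(sd) = true; }
bool is_sd_overutilized(struct sched_domain *sd) { return *sd_flag(sd); }
```

In this model, setting the flag on CPU A's level-1 domain is visible from CPU B's level-1 domain but not from CPU C's, while setting it on any level-2 domain is visible to all four CPUs through rd. Both operations are a single pointer dereference, with no walk over sibling domains.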
But the rd->overutilized flag is visible to all CPUs. This is why I think the function is_sd_overutilized() should change as below: CPU C iterates over all "sd->overutilized" flags in the same schedule domain and
We use the SG at the parent level to avoid this non-scalable while loop.
IMHO, wouldn't the newly introduced struct sched_domain_shared (commit 24fc7edb92ee "sched/core: Introduce 'struct sched_domain_shared'") be the perfect infrastructure for this kind of job? It is per-cpu data which is shared between all 'identical' sched domains.
It would allow us to not touch the root_domain for this business and thus avoid this source of potential misunderstanding.
It is currently limited to solving another problem, but it was designed to be easily extended. It's not in the EAS product code line, but the latest EAS integration already has it.
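A rough userspace sketch of how such a shared structure could carry the flag is given here. It is a model under stated assumptions rather than kernel code: the `overutilized` field is hypothetical, `struct domain` and the helper functions are invented for the example, a plain `int` stands in for a reference counter, and concurrency is ignored. The model also assumes that identical domains find their common instance through the lowest-numbered CPU in their span.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_CPUS = 4 };

/* Userspace model; `overutilized` is a hypothetical field, not a kernel one. */
struct sched_domain_shared {
	int ref;           /* number of domains attached to this instance */
	bool overutilized;
};

struct domain {
	unsigned int span;                  /* bitmask of CPUs covered */
	struct sched_domain_shared *shared; /* common to all identical domains */
};

/* One slot per CPU; a span uses the slot of its lowest-numbered CPU. */
static struct sched_domain_shared sds[MAX_CPUS];

int first_cpu(unsigned int span)
{
	int cpu = 0;

	assert(span != 0 && (span >> MAX_CPUS) == 0);
	while (!(span & (1u << cpu)))
		cpu++;
	return cpu;
}

/* Attach one CPU's view of a domain to the instance shared by its span. */
void attach_shared(struct domain *sd, unsigned int span)
{
	sd->span = span;
	sd->shared = &sds[first_cpu(span)];
	sd->shared->ref++;
}

void mark_overutilized(struct domain *sd) { sd->shared->overutilized = true; }
bool domain_overutilized(const struct domain *sd) { return sd->shared->overutilized; }
```

Two domains attached with the same span (for example 0x3 for CPUs 0 and 1) end up pointing at the same instance, so a flag set through one is seen through the other, while a domain with span 0xC keeps its own state. The root_domain is not involved at any level.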
What do you guys think?
Yes, I agree that this shared struct should be used to share flags between CPUs, but it appeared quite recently and EAS was not available with this feature when the patch was written. The priority was to start review and discussion, but the next version should use it instead.