Hi Thara,
On Wed, Dec 07, 2016 at 05:22:37PM -0500, Thara Gopinath wrote:
The current implementation of overutilization aborts energy aware scheduling if any cpu in the system is overutilized. This patch introduces an overutilized flag at the sched group level instead of a single system-wide flag. Load balancing is done at a sched domain where any of the sched groups is overutilized. If energy aware scheduling is enabled and no sched group in a sched domain is overutilized, load balancing is skipped for that sched domain and energy aware scheduling continues at that level.
The implementation is based on two points:
- For every cpu in every sched domain the first group is the group that contains the cpu itself.
- sched groups are shared between cpus.
Thus if a sched group is overutilized, the overutilized flag is set at the first sched group of the parent sched domain. This ensures load balancing at the overutilized sched domain level. For example, consider a big.LITTLE system with two little cpus (CPU A and CPU B) and two big cpus (CPU C and CPU D). In this system, the hierarchy will be as follows:

CPU A
  SD level 1 - SG1 (CPU A), SG2 (CPU B)
  SD level 2 - SG5 (CPU A, CPU B), SG6 (CPU C, CPU D)
  RD
CPU B
  SD level 1 - SG2 (CPU B), SG1 (CPU A)
  SD level 2 - SG5 (CPU A, CPU B), SG6 (CPU C, CPU D)
  RD
CPU C
  SD level 1 - SG3 (CPU C), SG4 (CPU D)
  SD level 2 - SG6 (CPU C, CPU D), SG5 (CPU A, CPU B)
  RD
CPU D
  SD level 1 - SG4 (CPU D), SG3 (CPU C)
  SD level 2 - SG6 (CPU C, CPU D), SG5 (CPU A, CPU B)
  RD

In the above system, if CPU A is overutilized, the overutilized flag is set at SG5 (the first sched group of the parent sched domain). Similarly, if CPU B is overutilized, the flag is set at SG5. During load balancing at SD level 1, the overutilized flag is checked at the first sched group of the parent sched domain (SG5). If there is no parent sched domain, the flag is set/checked at the root domain. This ensures that load balancing happens irrespective of which cpu is overutilized in a sched domain.
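
To make the set/check rule described above easier to follow, here is a minimal Python sketch of the example topology. It is only a toy model, not the patch's code: the class names and the helpers build_topology(), flag_holder(), set_overutilized() and needs_load_balance() are all made up for illustration, and the only behaviour modelled is "the flag lives in the first sched group of the parent sched domain, or in the root domain if there is no parent".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SchedGroup:
    name: str
    cpus: tuple
    overutilized: bool = False


@dataclass
class RootDomain:
    overutilized: bool = False


@dataclass
class SchedDomain:
    level: int
    groups: List[SchedGroup]          # groups[0] contains the owning cpu
    parent: Optional["SchedDomain"] = None


def build_topology():
    """Build the two-little/two-big example; sched groups are shared objects."""
    sg = {
        "SG1": SchedGroup("SG1", ("A",)),
        "SG2": SchedGroup("SG2", ("B",)),
        "SG3": SchedGroup("SG3", ("C",)),
        "SG4": SchedGroup("SG4", ("D",)),
        "SG5": SchedGroup("SG5", ("A", "B")),
        "SG6": SchedGroup("SG6", ("C", "D")),
    }
    # Per-cpu group order at level 1 and level 2 (first group holds the cpu).
    order = {
        "A": (["SG1", "SG2"], ["SG5", "SG6"]),
        "B": (["SG2", "SG1"], ["SG5", "SG6"]),
        "C": (["SG3", "SG4"], ["SG6", "SG5"]),
        "D": (["SG4", "SG3"], ["SG6", "SG5"]),
    }
    rd = RootDomain()
    domains = {}
    for cpu, (level1, level2) in order.items():
        sd2 = SchedDomain(2, [sg[n] for n in level2])
        domains[cpu] = SchedDomain(1, [sg[n] for n in level1], parent=sd2)
    return domains, sg, rd


def flag_holder(sd, rd):
    """Object that carries the flag for sd: parent's first group, else the RD."""
    return sd.parent.groups[0] if sd.parent else rd


def set_overutilized(sd, rd, value=True):
    flag_holder(sd, rd).overutilized = value


def needs_load_balance(sd, rd):
    return flag_holder(sd, rd).overutilized


domains, sg, rd = build_topology()
set_overutilized(domains["A"], rd)            # CPU A becomes overutilized
print(sg["SG5"].overutilized)                 # flag lands on the shared SG5
print(needs_load_balance(domains["B"], rd))   # CPU B's level 1 sees it too
print(needs_load_balance(domains["C"], rd))   # big cluster level 1 does not
```

Because SG5 is the same object in CPU A's and CPU B's hierarchy, setting the flag from either little cpu makes it visible to both, while the root domain flag is only touched from SD level 2.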
I did some verification of this patch on Juno. Please note that I verified it on the EASv5.2 code, not on the latest EAS code base. There are four test cases:
- Case 1: one ramp-up task whose duty cycle goes from 10% to 90% in steps of 10% [1];
  Please see the analysis result in [5]:
  The magenta line: LITTLE cluster sched domain flag
  The yellow line: big cluster sched domain flag
  The red line: root domain flag
- Case 2: 4 middle workload tasks (util_avg ~= 300 < LITTLE core's capacity 447 * 0.8 = 358); check whether the tasks can be spread out within the LITTLE cluster [2];
- Case 3: 2 big tasks (util_avg = 870, > 1024 * 0.8); check whether the tasks can be spread out within the big cluster [3];
- Case 4: 6 big tasks (util_avg = 870, > 1024 * 0.8); check whether the tasks can be spread out across the two clusters [4];
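
The thresholds quoted in the cases above can be double-checked with a few lines of Python. This is only a sketch of the arithmetic: the 0.8 factor and the capacities 447 and 1024 come from the cases above, while the names MARGIN, threshold() and fits() are made up here, and the way the kernel actually implements the comparison is not modelled.

```python
LITTLE_CAPACITY = 447   # LITTLE core capacity quoted above
BIG_CAPACITY = 1024     # big core capacity quoted above
MARGIN = 0.8            # factor used in the test case descriptions


def threshold(capacity):
    """Utilization above which a cpu of this capacity counts as overutilized."""
    return capacity * MARGIN


def fits(util, capacity):
    """True if a task with this util_avg stays below the cpu's threshold."""
    return util < threshold(capacity)


for name, util in (("middle task", 300), ("big task", 870)):
    print(name, util,
          "fits LITTLE:", fits(util, LITTLE_CAPACITY),
          "fits big:", fits(util, BIG_CAPACITY))
```

Under these assumptions a middle task (300) stays below the LITTLE threshold of about 358, while a big task (870) exceeds the threshold on both core types (about 358 and 819).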
Below is a summary from the plots:
- While the ramp-up task is running, the root domain's overutilized flag is never set [5], so a "misfit" task cannot rely on the "overutilized" flag to be migrated from the LITTLE cluster to the big cluster;
- If there are big tasks, then after these tasks have been migrated onto the big cluster, the LITTLE cluster "overutilized" flag is not cleared immediately; the flag stays set for a very long time until load balance gets a chance to clear it [5][7];
- In the big cluster, if every CPU is "overutilized", the "overutilized" flag for the big cluster is frequently set and cleared, although we would expect it to stay "true"; in the LITTLE cluster, if every CPU is "overutilized", the "overutilized" flag does stay "true" during this period [8];
- The LITTLE cluster "overutilized" flag only works after the whole system is "overutilized". This is what happens in the 6 big tasks case, but in the 4 middle tasks case the LITTLE cluster "overutilized" flag is not set; so if there are several tasks on the LITTLE cluster, we cannot rely on the LITTLE cluster "overutilized" flag to spread tasks within the LITTLE cluster [6].
[1] http://people.linaro.org/~leo.yan/per_sched_domain_overutilized_flag/test_ov...
[2] http://people.linaro.org/~leo.yan/per_sched_domain_overutilized_flag/test_ov...
[3] http://people.linaro.org/~leo.yan/per_sched_domain_overutilized_flag/test_ov...
[4] http://people.linaro.org/~leo.yan/per_sched_domain_overutilized_flag/test_ov...
[5] http://people.linaro.org/~leo.yan/per_sched_domain_overutilized_flag/1_ramp_...
[6] http://people.linaro.org/~leo.yan/per_sched_domain_overutilized_flag/4_middl...
[7] http://people.linaro.org/~leo.yan/per_sched_domain_overutilized_flag/2_big_t...
[8] http://people.linaro.org/~leo.yan/per_sched_domain_overutilized_flag/6_big_t...
Thanks,
Leo Yan