Hi Joonwoo,
On Fri, Oct 21, 2016 at 04:37:33PM -0700, Joonwoo Park wrote:
On 10/10/2016 01:35 AM, Leo Yan wrote:
When migrating a task to a higher-capacity CPU, the scheduler needs to tell whether a CPU's capacity is higher or lower. capacity_of() returns the CPU capacity reduced by the share that the RT and DL classes occupy, so two CPUs with the same original capacity can return different values and therefore look as if they had different capacities.
This introduces unnecessary active load balancing for task migration within the same cluster. So use capacity_orig_of() instead, which returns a consistent value for the CPU's original capacity.
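To illustrate the effect described above, here is a minimal user-space sketch, not kernel code. The struct cpu_model and the model_* helpers are hypothetical names, and plainly subtracting the RT/DL usage is a simplifying assumption about how capacity_of() is derived.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one CPU; this is not the kernel's struct rq. */
struct cpu_model {
	unsigned long capacity_orig; /* stands in for capacity_orig_of() */
	unsigned long rt_dl_util;    /* capacity consumed by RT/DL tasks (assumed) */
};

/* Models capacity_of(): the capacity left over after RT/DL usage. */
static unsigned long model_capacity_of(const struct cpu_model *c)
{
	assert(c->rt_dl_util <= c->capacity_orig);
	return c->capacity_orig - c->rt_dl_util;
}

/* Models capacity_orig_of(): does not depend on RT/DL load. */
static unsigned long model_capacity_orig_of(const struct cpu_model *c)
{
	return c->capacity_orig;
}

/* Old comparison: can differ between identical CPUs under RT/DL load. */
static bool dst_looks_bigger_old(const struct cpu_model *src,
				 const struct cpu_model *dst)
{
	return model_capacity_of(src) < model_capacity_of(dst);
}

/* New comparison: only true when dst really is a bigger CPU. */
static bool dst_looks_bigger_new(const struct cpu_model *src,
				 const struct cpu_model *dst)
{
	return model_capacity_orig_of(src) < model_capacity_orig_of(dst);
}
```

With two little CPUs of the same original capacity (say 447, a made-up value) and an RT/DL usage of 40 on the source only, dst_looks_bigger_old() returns true while dst_looks_bigger_new() returns false, which is the spurious intra-cluster case the patch removes.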
This fixed an issue I had where meaningless active migrations happened among the little CPUs while running a simple CPU-bound task. Thanks!
Also thanks a lot for your testing :)
Signed-off-by: Leo Yan <leo.yan@linaro.org>
 kernel/sched/fair.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index dedb3e0..f2ab238 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8028,12 +8028,11 @@ static int need_active_balance(struct lb_env *env)
 		return 1;
 	}
-	if ((capacity_of(env->src_cpu) < capacity_of(env->dst_cpu)) &&
-				env->src_rq->cfs.h_nr_running == 1 &&
-				cpu_overutilized(env->src_cpu) &&
-				!cpu_overutilized(env->dst_cpu)) {
-			return 1;
-	}
+	if ((capacity_orig_of(env->src_cpu) < capacity_orig_of(env->dst_cpu)) &&
Initially I thought we should have both of them, like:

	if ((capacity_of(env->src_cpu) < capacity_of(env->dst_cpu)) &&
	    ((capacity_orig_of(env->src_cpu) <
	      capacity_orig_of(env->dst_cpu))) &&
	    env->src_rq->cfs.h_nr_running == 1 &&
	    cpu_overutilized(env->src_cpu) &&
	    !cpu_overutilized(env->dst_cpu))
But I think your version is good enough, since it always makes sure the dst CPU has more spare capacity than the src CPU after taking RT/DL task loads into account.
Yeah. cpu_overutilized() already takes RT/DL task loads into account.
+		return 1;

 	return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
 }
BTW, I think we have a potential issue here when the max capacity delta between little and big CPUs is large. For example, when little cap = 1024 and big cap = 8192.
EAS normalizes CPU capacity to the range [0..1024], so I think you mean little cap = 128 and big cap = 1024 for a single CPU, right?
If I'm not mistaken, the little and big CPUs will be marked overutilized = true when their spare capacity drops to ~204 and ~1638 respectively (20% margin). At present, we don't upmigrate a task with load = 820 (1024 - 204) from a little CPU to a big one, even though the big CPU has enough spare capacity to take the task, because the big CPU is marked as overutilized too. We might want to run this task on the big CPU?
I haven't seen such an SoC, so this is just speculation though.
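The ~204 and ~1638 figures can be reproduced with a small user-space sketch. It assumes the 20% margin is applied as util * 1280 > capacity * 1024; the constants and the model_* names are hypothetical stand-ins, not taken from the patch under review.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed encoding of the ~20% margin: 1280 / 1024 = 1.25. */
#define MODEL_CAPACITY_MARGIN 1280UL
#define MODEL_CAPACITY_SCALE  1024UL

/* Models cpu_overutilized(): true once util exceeds ~80% of capacity. */
static bool model_overutilized(unsigned long capacity, unsigned long util)
{
	assert(capacity <= ULONG_MAX / MODEL_CAPACITY_SCALE);
	assert(util <= ULONG_MAX / MODEL_CAPACITY_MARGIN);
	return capacity * MODEL_CAPACITY_SCALE < util * MODEL_CAPACITY_MARGIN;
}

/* Smallest utilization at which a CPU of this capacity is flagged. */
static unsigned long model_overutilized_threshold(unsigned long capacity)
{
	assert(capacity <= ULONG_MAX / MODEL_CAPACITY_SCALE);
	return capacity * MODEL_CAPACITY_SCALE / MODEL_CAPACITY_MARGIN + 1;
}
```

Under these assumptions a CPU with capacity 1024 is flagged at util = 820 (spare 204) and a CPU with capacity 8192 at util = 6554 (spare 1638). A big CPU at util = 6600 is therefore overutilized while still having 1592 spare, more than the 820 the little CPU's task needs.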
Are you suggesting something like the code below?
static unsigned long cpu_spare_capacity(int cpu)
{
	return max((capacity_of(cpu) - cpu_util(cpu)), 0);
}

	if ((capacity_orig_of(env->src_cpu) < capacity_orig_of(env->dst_cpu)) &&
	    env->src_rq->cfs.h_nr_running == 1 &&
	    cpu_overutilized(env->src_cpu) &&
	    cpu_util(env->src_cpu) < cpu_spare_capacity(env->dst_cpu))
		return 1;
So cpu_util(env->src_cpu) = 128 * 80% = 102 and cpu_spare_capacity(env->dst_cpu) = 1024 * 20% = 204. That means that even when the big core is overutilized, it still has enough spare capacity to take the task from the little core.
Dietmar, what do you think about this? This code was originally introduced by your patch "sched: Enable idle balance to pull single task towards cpu with higher capacity", so I'd like to get your suggestion.
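A standalone model of that condition makes the numbers easy to check. This is only a sketch: struct lb_model and the model_* functions are hypothetical, the source's overutilized state is passed in as a flag rather than computed, and the subtraction is guarded explicitly because the model uses unsigned values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the values the proposed check would read. */
struct lb_model {
	unsigned long src_capacity_orig;
	unsigned long dst_capacity_orig;
	unsigned long dst_capacity;   /* capacity left on dst after RT/DL */
	unsigned long src_util;
	unsigned long dst_util;
	unsigned int  src_nr_running;
	bool          src_overutilized;
};

/* Spare capacity, clamped at zero so unsigned math cannot wrap. */
static unsigned long model_spare_capacity(unsigned long capacity,
					  unsigned long util)
{
	return capacity > util ? capacity - util : 0;
}

/* Proposed rule: pull a lone task up if it fits into dst's spare capacity. */
static bool model_should_upmigrate(const struct lb_model *m)
{
	assert(m);
	return m->src_capacity_orig < m->dst_capacity_orig &&
	       m->src_nr_running == 1 &&
	       m->src_overutilized &&
	       m->src_util < model_spare_capacity(m->dst_capacity, m->dst_util);
}
```

With src_capacity_orig = 128, src_util = 102, dst_capacity = 1024 and dst_util = 820, the spare capacity is 204 and the model allows the migration. Once dst_util rises to 1000 the spare capacity is 24 and it no longer does.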
Thanks,
Leo Yan