The benefit we can guarantee right now is that removing this decay saves a fair amount of kernel time in the scheduler tick. :)
On 11/19/2013 09:19 PM, Alex Shi wrote:
On 11/19/2013 08:45 PM, Morten Rasmussen wrote:
On Tue, Nov 19, 2013 at 11:46:19AM +0000, Alex Shi wrote:
We know that load_idx selects a decayed cpu_load of the rq. After sched_avg was added, we have two kinds of decay for the cpu load, which is a kind of redundancy.
This patch removes the load_idx. There are 5 *_idx fields in sched_domain; busy_idx and idle_idx are usually non-zero, while newidle_idx, wake_idx and forkexec_idx are all zero on every arch. This patch removes the only place that uses busy_idx/idle_idx.
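For reference, the index selection that this patch bypasses works roughly as in the sketch below; it is a simplified stand-in for the get_sd_load_idx() logic, trimmed to just the *_idx fields mentioned above, not a verbatim kernel excerpt.

enum cpu_idle_type { CPU_NOT_IDLE, CPU_NEWLY_IDLE, CPU_IDLE };

/* Simplified stand-in for struct sched_domain: only the load index fields. */
struct sd_idx {
	unsigned int busy_idx;		/* usually non-zero */
	unsigned int idle_idx;		/* usually non-zero */
	unsigned int newidle_idx;	/* zero on all archs */
	unsigned int wake_idx;		/* zero on all archs */
	unsigned int forkexec_idx;	/* zero on all archs */
};

/* Pick which rq->cpu_load[] entry the regular balance path will look at. */
static int pick_load_idx(const struct sd_idx *sd, enum cpu_idle_type idle)
{
	switch (idle) {
	case CPU_NOT_IDLE:
		return sd->busy_idx;
	case CPU_NEWLY_IDLE:
		return sd->newidle_idx;
	default:
		return sd->idle_idx;
	}
}

So with newidle_idx, wake_idx and forkexec_idx already zero everywhere, busy_idx and idle_idx are the only indexes that still have an effect, and this is the call site the patch removes.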
Have you looked into the decay rates of cpu_load[busy_idx] and cpu_load[idle_idx] and compared them to the load in sched_avg?
The sched_avg decays according to the active tasks' load history. The cpu_load decays the past cpu load: load = (2^idx - 1)/2^idx * load + 1/2^idx * cur_load. Both of them decay over time, but sched_avg is a bit more precise.
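As a toy userspace illustration of that cpu_load formula (the numbers here are made up, this is not kernel code), a bigger idx follows the current load more slowly:

#include <stdio.h>

/* cpu_load decay per tick: load = ((2^idx - 1) * load + cur_load) / 2^idx */
static unsigned long decay_cpu_load(unsigned long load, unsigned long cur_load,
				    int idx)
{
	unsigned long scale = 1UL << idx;

	return (load * (scale - 1) + cur_load) / scale;
}

int main(void)
{
	unsigned long load1 = 0, load3 = 0;
	int tick;

	/* Feed a constant current load of 1024 and watch the two indexes converge. */
	for (tick = 1; tick <= 8; tick++) {
		load1 = decay_cpu_load(load1, 1024, 1);	/* keeps 1/2 of the history */
		load3 = decay_cpu_load(load3, 1024, 3);	/* keeps 7/8 of the history */
		printf("tick %d: cpu_load[1]=%lu cpu_load[3]=%lu\n",
		       tick, load1, load3);
	}
	return 0;
}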
I'm thinking that there must be a reason why they are not all using the same average cpu load.
I asked PeterZ about this. PeterZ had no clear answer on the cpu_load decay usage either, and he also thought these two decays are somewhat redundant.
As for the cpu_load decay, it is only used in load_balance(), to bias the load on the source cpu, or apply no bias.
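The bias works roughly as below; this is a simplified sketch of the source_load()/target_load() idea as I understand it, not a verbatim excerpt (the *_sketch names are made up):

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

/*
 * Sketch of the bias: with a non-zero load_idx the source cpu reports the
 * smaller of its decayed and instantaneous load, while the target reports the
 * larger, so a short spike on the source (or a short dip on the target) does
 * not immediately trigger a migration.
 */
static unsigned long source_load_sketch(unsigned long decayed_load,
					unsigned long cur_load, int load_idx)
{
	if (load_idx == 0)
		return cur_load;
	return min(decayed_load, cur_load);
}

static unsigned long target_load_sketch(unsigned long decayed_load,
					unsigned long cur_load, int load_idx)
{
	if (load_idx == 0)
		return cur_load;
	return max(decayed_load, cur_load);
}

With load_idx forced to 0, both sides simply report the instantaneous load, which is the whole effect of the patch below.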
Since ARM systems rarely have many tasks running in the system, this removal
Could you elaborate on the effect of this change?
I don't think the assumption that ARM systems rarely have many tasks running is generally valid. Smartphones do occasionally use all available cpus and ARM systems are used in many other segments.
I only have a Panda board at hand, and I don't know which benchmarks are good for this testing. The key to this patch's effect is testing. Would anyone like to give a hand on this?
AFAIK, there is no theory that can prove which decay is better (but obviously the community prefers sched_avg, since it was added while cpu_load already existed).
Also, this patch will affect all architectures, so we need to ensure that none of them are affected negatively.
Yes, so however thoroughly we measure on ARM, there will still need to be a lot of testing/comment on this from the community. Anyway, I have tested this patch on Intel platforms and in Fengguang's 0-day kernel testing; there is no clear regression, but also no clear improvement.
should do no harm. But it would be great if someone could measure it with some performance benchmarks.
I would suggest doing a comparison between the different cpu load averages first.
any detailed benchmarks?
Morten
Do we have any performance testing resources at Linaro or ARM?
From 6fd05051dbb5aaa28d3bfe11042cc9cbb147bf7c Mon Sep 17 00:00:00 2001
From: Alex Shi <alex.shi@linaro.org>
Date: Tue, 19 Nov 2013 17:38:07 +0800
Subject: [PATCH] sched: remove load_idx effect
Shortcut to remove the rq->cpu_load[load_idx] effect in the scheduler. All other places using rq->cpu_load use cpu_load[0].
Signed-off-by: Alex Shi <alex.shi@linaro.org>
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e8b652e..ce683aa 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5633,7 +5633,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 	if (child && child->flags & SD_PREFER_SIBLING)
 		prefer_sibling = 1;
 
-	load_idx = get_sd_load_idx(env->sd, env->idle);
+	load_idx = 0;
 
 	do {
 		struct sg_lb_stats *sgs = &tmp_sgs;
--
1.8.1.2
--
Thanks
    Alex