Wrong button make me removed others guys from the thread.
Sorry for this mistake.
On 13 September 2012 09:56, Mike Galbraith efault@gmx.de wrote:
On Thu, 2012-09-13 at 09:44 +0200, Vincent Guittot wrote:
On 13 September 2012 09:29, Mike Galbraith efault@gmx.de wrote:
On Thu, 2012-09-13 at 08:59 +0200, Vincent Guittot wrote:
On 13 September 2012 08:49, Mike Galbraith efault@gmx.de wrote:
On Thu, 2012-09-13 at 06:11 +0200, Vincent Guittot wrote:
On tickless system, one CPU runs load balance for all idle CPUs. The cpu_load of this CPU is updated before starting the load balance of each other idle CPUs. We should instead update the cpu_load of the balance_cpu.
Signed-off-by: Vincent Guittot vincent.guittot@linaro.org
kernel/sched/fair.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 1ca4fe4..9ae3a5b 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4794,14 +4794,15 @@ static void nohz_idle_balance(int this_cpu, enum cpu_idle_type idle) if (need_resched()) break;
raw_spin_lock_irq(&this_rq->lock);
update_rq_clock(this_rq);
update_idle_cpu_load(this_rq);
raw_spin_unlock_irq(&this_rq->lock);
rq = cpu_rq(balance_cpu);
raw_spin_lock_irq(&rq->lock);
update_rq_clock(rq);
update_idle_cpu_load(rq);
raw_spin_unlock_irq(&rq->lock); rebalance_domains(balance_cpu, CPU_IDLE);
rq = cpu_rq(balance_cpu); if (time_after(this_rq->next_balance, rq->next_balance)) this_rq->next_balance = rq->next_balance; }
Ew, banging locks and updating clocks to what good end?
The goal is to update the cpu_load table of the CPU before starting the load balance. Other wise we will use outdated value in the load balance sequence
If there's load to distribute, seems it should all work out fine without doing that. What harm is being done that makes this worth while?
this_load and avg_load can be wrong and make an idle CPU set as balanced compared to the busy one
I think you need to present numbers showing benefit. Crawling all over a mostly idle (4096p?) box is decidedly bad thing to do.
Yep, let me prepare some figures
You should also notice that you are already crawling all over the idle processor in rebalance_domains
Vincent
-Mike
On Thu, 2012-09-13 at 10:37 +0200, Vincent Guittot wrote:
I think you need to present numbers showing benefit. Crawling all over a mostly idle (4096p?) box is decidedly bad thing to do.
Yeah, but we're already doing that anyway.. we know nohz idle balance doesn't scale. Venki and Suresh were working on that, but with Venki switching jobs this work got dropped.
I did talk to Suresh about it a week or so ago.. I think he was going to look at it again.
On Thu, 2012-09-13 at 10:45 +0200, Peter Zijlstra wrote:
On Thu, 2012-09-13 at 10:37 +0200, Vincent Guittot wrote:
I think you need to present numbers showing benefit. Crawling all over a mostly idle (4096p?) box is decidedly bad thing to do.
Yeah, but we're already doing that anyway.. we know nohz idle balance doesn't scale. Venki and Suresh were working on that, but with Venki switching jobs this work got dropped.
I did talk to Suresh about it a week or so ago.. I think he was going to look at it again.
As a reminder to Suresh.. please also consider calc_load_{idle,idx} in this work, its another nohz 'feature' that doesn't scale for pretty much the same reason.
That is, I'm fine with the initial patches not actually fixing that, but the structure put in place for the ILB should allow for it.