Le 24 janv. 2013 17:55, "Vincent Guittot" <vincent.guittot@linaro.org> a écrit :
>
> On 24 January 2013 17:44, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 2012/12/3 Vincent Guittot <vincent.guittot@linaro.org>:
> >> With the coupled cpuidle driver (but probably also with other drivers),
> >> a CPU loops in a temporary safe state while waiting for other CPUs of its
> >> cluster to be ready to enter the coupled C-state. If an IRQ or a softirq
> >> occurs, the CPU will stay in this internal loop if there is no need
> >> to resched. The SCHED softirq clears the NOHZ and increases
> >> nr_busy_cpus. If there is no need to resched, we will not call
> >> set_cpu_sd_state_idle because of this internal loop in a cpuidle state.
> >> We have to call set_cpu_sd_state_idle in tick_nohz_irq_exit which is used
> >> to handle such situation.
> >
> > I'm a bit confused with this.
> >
> > set_cpu_sd_state_busy() is only called from nohz_kick_needed(). And it
> > checks idle_cpu() before doing anything. So if no task is going to be
> > scheduled, idle_cpu() prevents from calling set_cpu_sd_state_busy().
> >
> > I'm probably missing something.
>
> Hi Frederic
>
> I can't find back the trace that i had saved with the issue but IIRC
> the sequence is:
> The CPU is kicked for ILB
> The wake_list of the CPU becomes not empty so cpu id not idle
> CPU wakes up, updates is timer framework and call nohz_kick_needed the
> execute the ILB sequence
> we don't go out of the cpuidle driver function because we don't need
> to resched so we don't clear the busy state
>

This sequence is not the right one

> I'm going to look for the saved trace to check the sequence above

I haven't been able to reproduce the bug that this patch was supposed to solved. The patch 2 and 3 seem enough to fix the nr_busy_cpus field. I will continue to try to reproduce it but it seems that it was a side effect of the 2 others fixes of the series

Vincent
>
> Vincent
>
> >
> > Thanks.
> >
> >>
> >> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> >> ---
> >>  kernel/time/tick-sched.c |    2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> >> index 955d35b..b8d74ea 100644
> >> --- a/kernel/time/tick-sched.c
> >> +++ b/kernel/time/tick-sched.c
> >> @@ -570,6 +570,8 @@ void tick_nohz_irq_exit(void)
> >>         if (!ts->inidle)
> >>                 return;
> >>
> >> +       set_cpu_sd_state_idle();
> >> +
> >>         /* Cancel the timer because CPU already waken up from the C-states*/
> >>         menu_hrtimer_cancel();
> >>         __tick_nohz_idle_enter(ts);
> >> --
> >> 1.7.9.5
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> Please read the FAQ at  http://www.tux.org/lkml/