On Thu, Jun 05, 2014 at 12:28:27PM +0530, Viresh Kumar wrote:
When a timer is enqueued or modified on a NO_HZ_FULL target (local or remote), the target is expected to re-evaluate its timer wheel to decide whether the tick must be restarted to handle timer expirations.
If it doesn't re-evaluate the timer wheel and restart the tick, it won't be able to call the timer's handler when the timer expires. The handler is delayed until the tick is next restarted. Currently the maximum delay is 1 second, as returned by scheduler_tick_max_deferment(), but it may increase in the future.
To handle this, we currently call wake_up_nohz_cpu() from add_timer_on(), but what about timers enqueued or modified through other APIs?
For example, __mod_timer() gets the target cpu (where the timer should be enqueued) by calling get_nohz_timer_target(), which is free to return a NO_HZ_FULL cpu as well. So we *should* re-evaluate the timer wheel there too, otherwise the call to the timer's handler might be delayed as explained above.
To fix this, we can move wake_up_nohz_cpu(cpu) to internal_add_timer() so that it is handled for every add-timer API.
LKML discussion about this: https://lkml.org/lkml/2014/6/4/169
This requires internal_add_timer() to get the cpu number from the per-cpu object 'base', as not all callers have a cpu number to pass to internal_add_timer(). For example, __mod_timer() finds the timer's base from the 'timer' pointer and not through per-cpu arithmetic.
Thus, this patch adds a new field 'cpu' to 'struct tvec_base', which stores the number of the cpu the base belongs to.
The next patch will then move wake_up_nohz_cpu() to internal_add_timer().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Except for the changelog, which should talk about NO_HZ in general, this looks good.
Thanks!
 kernel/timer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/timer.c b/kernel/timer.c
index 3bb01a3..9e5f4f2 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -82,6 +82,7 @@ struct tvec_base {
 	unsigned long next_timer;
 	unsigned long active_timers;
 	unsigned long all_timers;
+	int cpu;
 	struct tvec_root tv1;
 	struct tvec tv2;
 	struct tvec tv3;
@@ -1568,6 +1569,7 @@ static int init_timers_cpu(int cpu)
 		}
 		spin_lock_init(&base->lock);
 		tvec_base_done[cpu] = 1;
+		base->cpu = cpu;
 	} else {
 		base = per_cpu(tvec_bases, cpu);
 	}
-- 
2.0.0.rc2
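
For illustration only, here is a rough userspace sketch of what the new 'cpu' field makes possible once the wake-up moves into internal_add_timer(). It is not kernel code: the structures are cut down to the bare minimum, NR_CPUS is an arbitrary value, the tvec_bases array replaces the real per-cpu variable, wake_up_nohz_cpu() is a stub that only records its argument in a hypothetical last_woken_cpu variable, and the unconditional wake-up is an assumption about the follow-up patch rather than its actual form.

```c
/* Userspace model only; this is not the kernel's timer code. */
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* arbitrary value for the sketch */

/* Cut-down stand-in for struct tvec_base. */
struct tvec_base {
	unsigned long active_timers;
	int cpu;	/* cpu that owns this base, as added by the patch */
};

/* Cut-down stand-in for struct timer_list. */
struct timer_list {
	struct tvec_base *base;
	unsigned long expires;
};

/* Plain array standing in for the per-cpu tvec_bases variable. */
struct tvec_base tvec_bases[NR_CPUS];

/* Hypothetical bookkeeping so the effect of the stub can be observed. */
int last_woken_cpu = -1;

/* Stub: the real function kicks the target cpu; this one only records it. */
void wake_up_nohz_cpu(int cpu)
{
	last_woken_cpu = cpu;
}

/* Mirrors the second hunk: remember which cpu the base belongs to. */
void init_timers_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	tvec_bases[cpu].active_timers = 0;
	tvec_bases[cpu].cpu = cpu;
}

/*
 * Assumed shape of the follow-up change: the caller passes only 'base',
 * and the cpu number is recovered from base->cpu.
 */
void internal_add_timer(struct tvec_base *base, struct timer_list *timer)
{
	timer->base = base;
	base->active_timers++;
	wake_up_nohz_cpu(base->cpu);
}
```

With every base initialised through init_timers_cpu(), calling internal_add_timer(&tvec_bases[2], &t) leaves last_woken_cpu at 2, even though the caller never handled a cpu number itself. That is the situation __mod_timer() is in when it finds the base from the 'timer' pointer.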