On 5 June 2014 18:43, Frederic Weisbecker fweisbec@gmail.com wrote:
NO_HZ_IDLE is concerned as well. get_nohz_timer_target() can return an idle CPU despite its best effort in this regard.
It can return a idle cpu but that can only be current cpu, right?
The new target has all the time it wants to become idle until we lock its timer base.
Yeah..
And it matters because NO_HZ_FULL is a corner case while NO_HZ_IDLE adds 1000 pts of reviewer's interest for your patch ;)
Yeah. I tried rewording changelog, lets see if it is any better :)
When a timer is enqueued or modified on a NO_HZ_{IDLE|FULL} target (local or remote), the target must re-evaluate its timer wheel to decide if clock-event device must be programmed to restart tick on a IDLE cpu in NO_HZ_IDLE case OR on a tickless cpu in NO_HZ_FULL case.
If it doesn't re-evaluate timer wheel, it wouldn't be able to execute timer handlers on expiration. This would be delayed until the time CPU wakes up or tick is restarted again.
To handle this, currently we are calling wake_up_nohz_cpu() from add_timer_on() but what about timers enqueued/modified with other APIs?
For example, in __mod_timer() we get target cpu (where the timer should get enqueued) by calling get_nohz_timer_target() and it is free to return a NO_HZ_FULL cpu as well. Also, the new target has all the time it wants to become idle until we lock its timer base.
So, we *should* re-evaluate timer wheel there as well, otherwise call to timer's handler might be delayed.
In order to fix this issue we can move wake_up_nohz_cpu(cpu) to internal_add_timer() so that it is well handled for any add-timer API.
LKML discussion about this: https://lkml.org/lkml/2014/6/4/169
Previous patch has already made changes to store 'cpu-id' in 'struct tvec_base' and we can get cpu number from there to pass on to wake_up_nohz_cpu().
This patch also does a minor modification: s/timer->base/base while calling tbase_get_deferrable() as base == timer->base here.