This is the next planned step of the "add ->set_dev_mode" patchset.
It's not being sent out (before the earlier patchset is accepted by all) to receive *more* criticism (I already got enough :)), but to give an overall view of where we are heading.
You can choose to skip reviewing this and concentrate on the first patchset instead until that is upstreamed :)
Oh man, I am too scared now :)
Okay, here we go:
A clockevent device is used to service timer/hrtimer requests, and the next event (when it should fire) is decided by the timer/hrtimer expiring next. When no timers/hrtimers are pending to be serviced, the expiry time is set to a special value: KTIME_MAX. This means that no events are required for an indefinite amount of time.
This would normally happen with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.
When expiry == KTIME_MAX, we either cancel the tick-sched hrtimer (NOHZ_MODE_HIGHRES) or skip reprogramming the clockevent device (NOHZ_MODE_LOWRES). But the clockevent device has already been reprogrammed from the tick handler for the next tick.
So, the clockevent device will fire one more time. In NOHZ_MODE_HIGHRES, we will consider it as a spurious interrupt and just return from hrtimer_interrupt(). In NOHZ_MODE_LOWRES, we schedule the next tick again from tick_nohz_handler()? (This is what I could read from the code, not very sure though. Otherwise, it means that in NOHZ_MODE_LOWRES we are never tickless).
Ideally, as the clockevent device is programmed in ONESHOT mode, it should just fire one more time and that's it. But many implementations (like arm_arch_timer, etc.) only have PERIODIC mode available and their drivers emulate ONESHOT over that, which means that on these platforms we will get spurious interrupts at tick rate, and that will hurt our tickless-ness badly.
To fix this, the clockevent device should be stopped, or its interrupts masked, at this point.
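To put a rough number on the problem, here is a small userspace model (not kernel code, and not part of this patchset). It assumes a device that can only fire periodically in hardware; every name in it (model_periodic_hw, model_count_spurious, MODEL_KTIME_MAX) is made up for illustration. An interrupt is counted as spurious when it arrives before the next wanted expiry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stands in for the kernel's KTIME_MAX ("no event wanted"). */
#define MODEL_KTIME_MAX INT64_MAX

/* Hypothetical model of a timer that can only fire periodically. */
struct model_periodic_hw {
	int64_t period_ns;   /* hardware fires every period_ns */
	int64_t next_expiry; /* next event the timer core actually wants */
	bool irq_masked;     /* true once the device has been stopped/masked */
};

/*
 * Count the interrupts delivered during window_ns that nobody asked for,
 * i.e. those that arrive before next_expiry.
 */
int model_count_spurious(const struct model_periodic_hw *hw, int64_t window_ns)
{
	int spurious = 0;
	int64_t now;

	assert(hw != NULL && hw->period_ns > 0);

	for (now = hw->period_ns; now <= window_ns; now += hw->period_ns) {
		if (hw->irq_masked)
			continue; /* stopped device: nothing reaches the CPU */
		if (now < hw->next_expiry)
			spurious++;
	}
	return spurious;
}
```

With a 4 ms period (HZ=250) and next_expiry set to MODEL_KTIME_MAX, one second of idle time gives 250 unwanted wakeups in this model; with the interrupt masked it gives none.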
A simple (yet hacky) solution to get this fixed could be: update hrtimer_force_reprogram() to always reprogram the clockevent device and update clockevent drivers to STOP generating events (or delay them to max time) when 'expires' is set to KTIME_MAX. But the drawback here is that every clockevent driver has to be hacked for this particular case and it's very easy for new ones to miss this. Also, the NOHZ_MODE_LOWRES problem mentioned above wouldn't be fixed by this.
However, Thomas suggested adding an optional mode: ONESHOT_STOPPED (lkml.org/lkml/2014/5/9/508) to solve this problem.
The first patch implements the required infrastructure to start/stop the clockevent device. The third patch stops clockevent devices when no longer required and the second patch starts them again once required.
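For illustration only, the per-driver hack could look roughly like the userspace sketch below. This is an assumption about its shape, not code from any driver: model_timer_regs and model_set_next_event are hypothetical names, and the real ->set_next_event() callback takes a delta in device cycles rather than an absolute expiry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stands in for the kernel's KTIME_MAX ("no event wanted"). */
#define MODEL_KTIME_MAX INT64_MAX

/* Hypothetical register state of one timer. */
struct model_timer_regs {
	bool enabled;        /* timer is allowed to raise interrupts */
	uint64_t compare_ns; /* absolute expiry programmed into the timer */
};

/*
 * Hypothetical driver hook: every driver would need this special case,
 * which is exactly the drawback described above.
 */
int model_set_next_event(struct model_timer_regs *regs, int64_t expires_ns)
{
	assert(regs != NULL);

	if (expires_ns == MODEL_KTIME_MAX) {
		/* Nothing pending: stop generating events. */
		regs->enabled = false;
		return 0;
	}

	regs->compare_ns = (uint64_t)expires_ns;
	regs->enabled = true;
	return 0;
}
```

A driver that forgets the KTIME_MAX branch silently keeps firing, which is why this approach is fragile.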
The review order can be 1,3,2 for better understanding. Patch 2 was required before 3 to keep 'git bisect' happy :)
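The interplay of the first three patches can be pictured with the following userspace sketch. It is a simplified model under my own assumptions, not the code in the patches: the struct, enum and function names are hypothetical, and only the two modes relevant here are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stands in for the kernel's KTIME_MAX ("no event wanted"). */
#define MODEL_KTIME_MAX INT64_MAX

/* Only the two modes that matter for this sketch. */
enum model_mode {
	MODEL_MODE_ONESHOT,
	MODEL_MODE_ONESHOT_STOPPED,
};

/* Hypothetical clockevent device state. */
struct model_clockevent {
	enum model_mode mode;
	int64_t next_event;
	int hw_programmed; /* how many times the hardware was programmed */
};

/* Called whenever the timer core has computed the next expiry. */
void model_program_event(struct model_clockevent *dev, int64_t expires)
{
	assert(dev != NULL);

	if (expires == MODEL_KTIME_MAX) {
		/* Patch 3's idea: nothing pending, so stop the device. */
		dev->mode = MODEL_MODE_ONESHOT_STOPPED;
		dev->next_event = MODEL_KTIME_MAX;
		return;
	}

	/* Patch 2's idea: switch back to ONESHOT before programming. */
	if (dev->mode == MODEL_MODE_ONESHOT_STOPPED)
		dev->mode = MODEL_MODE_ONESHOT;

	dev->next_event = expires;
	dev->hw_programmed++;
}
```

In this model the hardware is never touched while there is nothing to wait for, and the first real expiry brings the device back to ONESHOT.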
The fourth patch is there to catch corner cases where we try to set the next event while in ONESHOT_STOPPED mode. We will do a WARN_ON_ONCE() then. The last patch modifies a sample driver (arm_arch_timer) to demonstrate/test this patchset.
Other drivers would be updated later.
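The check added by the fourth patch can be sketched in userspace as below. Again this is only a model: model_warn_on_once imitates the "report once, but evaluate every time" behaviour of WARN_ON_ONCE() with a single flag, and refusing the request with -1 is my assumption, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum model_mode {
	MODEL_MODE_ONESHOT,
	MODEL_MODE_ONESHOT_STOPPED,
};

/* Hypothetical clockevent device state. */
struct model_clockevent {
	enum model_mode mode;
	int64_t next_event;
};

/* Number of warnings actually printed, kept for inspection. */
int model_warnings_printed;

/* Report the first hit only, but return the condition on every call. */
bool model_warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		model_warnings_printed++;
		fprintf(stderr, "WARNING (model): %s\n", what);
	}
	return cond;
}

/* Returns 0 when the event was programmed, -1 when it was refused. */
int model_program_event(struct model_clockevent *dev, int64_t expires)
{
	assert(dev != NULL);

	if (model_warn_on_once(dev->mode == MODEL_MODE_ONESHOT_STOPPED,
			       "event programmed in ONESHOT_STOPPED mode"))
		return -1;

	dev->next_event = expires;
	return 0;
}
```

Programming a stopped device twice is refused both times in this model, but the warning is printed only once.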
Viresh Kumar (5):
  clockevents: Introduce CLOCK_EVT_MODE_ONESHOT_STOPPED mode
  tick-sched: switchback to ONESHOT mode if clockevent device is stopped
  tick-sched: stop clockevent device when no longer required
  clockevents: Catch event programming in ONESHOT_STOPPED mode
  clocksource: arm_arch_timer: Add support for CLOCK_EVT_MODE_ONESHOT_STOPPED

 drivers/clocksource/arm_arch_timer.c |  1 +
 include/linux/clockchips.h           |  1 +
 include/linux/tick.h                 |  2 ++
 kernel/hrtimer.c                     | 53 +++++++++++++++++++++++++++++++++---
 kernel/time/clockevents.c            | 17 ++++++++++--
 kernel/time/tick-oneshot.c           | 20 ++++++++++++++
 kernel/time/tick-sched.c             |  4 +++
 7 files changed, 92 insertions(+), 6 deletions(-)