A clockevent device services timer/hrtimer requests, and its next event (when it should fire) is decided by the timer/hrtimer that expires next. When no timers/hrtimers are pending, the expiry time is set to a special value, KTIME_MAX, which means that no events are required for an indefinite amount of time.
This would normally happen with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.
When expiry == KTIME_MAX, we either cancel the tick-sched hrtimer (NOHZ_MODE_HIGHRES) or skip reprogramming the clockevent device (NOHZ_MODE_LOWRES). However, the clockevent device has already been reprogrammed from the tick handler for the next tick.
So the clockevent device will fire one more time. In NOHZ_MODE_HIGHRES, we treat this as a spurious interrupt and just return from hrtimer_interrupt(). In NOHZ_MODE_LOWRES, do we schedule the next tick again from tick_nohz_handler()? (This is what I could read from the code, but I am not very sure. Otherwise, it means that in NOHZ_MODE_LOWRES we are never tickless.)
Ideally, since the clockevent device is programmed in ONESHOT mode, it should fire just one more time and that's it. But many implementations (like arm_arch_timer, etc.) only have PERIODIC mode available, and their drivers emulate ONESHOT on top of it. This means that on these platforms we will get spurious interrupts at the tick rate, which hurts our tickless-ness badly.
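To put a rough number on the cost, here is a small user-space sketch (not kernel code) that counts the interrupts seen during an idle window. The enum, the function name and the two hardware classes are hypothetical, and the model assumes nothing stops or reprograms the device after the last tick was armed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hardware classes, for illustration only. */
enum sim_hw {
	SIM_HW_TRUE_ONESHOT,		/* fires once per programming */
	SIM_HW_PERIODIC_EMULATED,	/* free-running, ONESHOT emulated by driver */
};

/*
 * Count the interrupts seen during an idle window in which nothing
 * reprograms or stops the device after the last tick was armed.
 */
static uint64_t sim_idle_irqs(enum sim_hw hw, uint64_t idle_ns,
			      uint64_t period_ns)
{
	assert(period_ns > 0);

	if (idle_ns < period_ns)
		return 0;	/* window ends before the armed tick fires */
	if (hw == SIM_HW_TRUE_ONESHOT)
		return 1;	/* the already-armed tick, then silence */
	return idle_ns / period_ns;	/* one interrupt per tick period */
}
```

Assuming a 4 ms tick period (HZ=250, chosen only as an example), a one-second idle window costs one interrupt on true ONESHOT hardware and 250 on the emulated kind in this model.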
To fix these issues, the clockevent device should be stopped at this point, or its interrupts should be masked.
A simple (yet hacky) solution could be: update hrtimer_force_reprogram() to always reprogram the clockevent device, and update clockevent drivers to STOP generating events (or delay them to the max time) when 'expires' is set to KTIME_MAX. The drawback is that every clockevent driver has to be hacked for this particular case, and it's very easy for new ones to miss it. Also, the NOHZ_MODE_LOWRES problem mentioned above wouldn't be fixed by this.
However, Thomas suggested adding an optional mode, ONESHOT_STOPPED (lkml.org/lkml/2014/5/9/508), to solve this problem.
This patch adds support for this new mode to the clockevents core. clockevents_set_mode() does a WARN() if the mode is supported but we fail to switch to it. If the mode isn't supported on a platform (->set_dev_mode() returns -ENOSYS), we return early without updating dev->mode.
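The mode-switch rule described above can be modelled in user space. The following is only a sketch: the `sim_*` types, the two example drivers and the `warnings` counter (standing in for WARN_ONCE()) are hypothetical, and only the control flow is meant to mirror the change to clockevents_set_mode().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum sim_mode {
	SIM_MODE_SHUTDOWN,
	SIM_MODE_PERIODIC,
	SIM_MODE_ONESHOT,
	SIM_MODE_ONESHOT_STOPPED,
};

struct sim_dev {
	enum sim_mode mode;
	int warnings;	/* counts what WARN_ONCE() would have reported */
	int (*set_dev_mode)(enum sim_mode mode, struct sim_dev *dev);
};

/* Hypothetical driver without ONESHOT_STOPPED support. */
static int legacy_set_dev_mode(enum sim_mode mode, struct sim_dev *dev)
{
	(void)dev;
	return mode == SIM_MODE_ONESHOT_STOPPED ? -ENOSYS : 0;
}

/* Hypothetical driver that is able to stop its hardware. */
static int stoppable_set_dev_mode(enum sim_mode mode, struct sim_dev *dev)
{
	(void)mode;
	(void)dev;
	return 0;
}

/* Mirrors the control flow this patch gives clockevents_set_mode(). */
static void sim_set_mode(struct sim_dev *dev, enum sim_mode mode)
{
	int ret;

	assert(dev && dev->set_dev_mode);
	if (dev->mode == mode)
		return;

	ret = dev->set_dev_mode(mode, dev);
	if (ret) {
		/* Only ONESHOT_STOPPED is optional: keep dev->mode as it was. */
		if (mode == SIM_MODE_ONESHOT_STOPPED && ret == -ENOSYS)
			return;
		dev->warnings++;
		fprintf(stderr, "Requested mode: %d, error: %d\n", mode, ret);
	}
	dev->mode = mode;
}
```

With the legacy driver, a request for SIM_MODE_ONESHOT_STOPPED leaves the device in its previous mode without a warning. With the stoppable driver, the mode is updated as usual.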
Two new APIs, tick_stop_event() and tick_restart_event(), are also implemented.
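This patch does not add any callers for the two helpers, so the following user-space sketch only illustrates how they might be paired. `sim_program_next()` and every other `sim_*` name are hypothetical, and SIM_KTIME_MAX is a stand-in for KTIME_MAX.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_KTIME_MAX	INT64_MAX	/* stand-in for KTIME_MAX */

enum sim_state { SIM_ONESHOT, SIM_ONESHOT_STOPPED };

struct sim_tick_dev {
	enum sim_state state;
	int64_t next_event;	/* last programmed expiry, in ns */
};

/* Models tick_stop_event(): park the device in ONESHOT_STOPPED. */
static void sim_tick_stop_event(struct sim_tick_dev *dev)
{
	dev->state = SIM_ONESHOT_STOPPED;
}

/* Models tick_restart_event(): bring the device back to ONESHOT. */
static void sim_tick_restart_event(struct sim_tick_dev *dev)
{
	dev->state = SIM_ONESHOT;
}

/* Hypothetical caller: stop when nothing is pending, restart on demand. */
static void sim_program_next(struct sim_tick_dev *dev, int64_t expires)
{
	assert(dev);

	if (expires == SIM_KTIME_MAX) {
		sim_tick_stop_event(dev);	/* no event needed indefinitely */
		return;
	}
	if (dev->state == SIM_ONESHOT_STOPPED)
		sim_tick_restart_event(dev);	/* a timer was queued again */
	dev->next_event = expires;
}
```

In this model the device stays stopped for as long as the expiry is SIM_KTIME_MAX, and is switched back to ONESHOT before a real expiry is programmed.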
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 include/linux/clockchips.h |  1 +
 include/linux/tick.h       |  2 ++
 kernel/time/clockevents.c  | 14 ++++++++++++--
 kernel/time/tick-oneshot.c | 20 ++++++++++++++++++++
 4 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index 08b203e..d8a108a 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -38,6 +38,7 @@ enum clock_event_mode {
 	CLOCK_EVT_MODE_SHUTDOWN,
 	CLOCK_EVT_MODE_PERIODIC,
 	CLOCK_EVT_MODE_ONESHOT,
+	CLOCK_EVT_MODE_ONESHOT_STOPPED,
 	CLOCK_EVT_MODE_RESUME,
 };
diff --git a/include/linux/tick.h b/include/linux/tick.h
index b84773c..f9bc979 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -81,6 +81,8 @@ extern struct tick_device *tick_get_device(int cpu);
 # ifdef CONFIG_HIGH_RES_TIMERS
 extern int tick_init_highres(void);
 extern int tick_program_event(ktime_t expires, int force);
+extern void tick_stop_event(void);
+extern void tick_restart_event(void);
 extern void tick_setup_sched_timer(void);
 # endif
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index eaf5b0d..9348da1 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -107,8 +107,18 @@ void clockevents_set_mode(struct clock_event_device *dev,
 	if (dev->mode != mode) {
 		int ret = dev->set_dev_mode(mode, dev);
 
-		/* Currently available modes shouldn't fail */
-		WARN_ONCE(ret, "Requested mode: %d, error: %d\n", mode, ret);
+		if (ret) {
+			/*
+			 * Supported modes shouldn't fail and only
+			 * ONESHOT_STOPPED is optional
+			 */
+			if (mode == CLOCK_EVT_MODE_ONESHOT_STOPPED &&
+			    ret == -ENOSYS)
+				return;
+			else
+				WARN_ONCE(1, "Requested mode: %d, error: %d\n",
+					  mode, ret);
+		}
 
 		dev->mode = mode;
diff --git a/kernel/time/tick-oneshot.c b/kernel/time/tick-oneshot.c
index 8241090..a3404e6 100644
--- a/kernel/time/tick-oneshot.c
+++ b/kernel/time/tick-oneshot.c
@@ -22,6 +22,26 @@
 #include "tick-internal.h"
 
 /**
+ * tick_stop_event
+ */
+void tick_stop_event(void)
+{
+	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
+
+	/* stop clock event device */
+	clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT_STOPPED);
+}
+
+/**
+ * tick_restart_event
+ */
+void tick_restart_event(void)
+{
+	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
+	clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
+}
+
+/**
  * tick_program_event
  */
 int tick_program_event(ktime_t expires, int force)