Viresh Kumar viresh.kumar@linaro.org writes:
A clockevent device is used to service timers/hrtimers requests and the next event (when it should fire) is decided by the timer/hrtimer expiring next.
Just drop this intro sentence. It adds more confusion than value, IMO.
When no timers/hrtimers are pending to be serviced, the expiry time is set to a
drop "to be serviced"
special value: KTIME_MAX. This means that no events are required for indefinite amount of time.
s/required for indefinite amount of time/pending/
This would normally happen with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.
When expiry == KTIME_MAX, either we cancel the tick-sched hrtimer (NOHZ_MODE_HIGHRES) or skip reprogramming clockevent device (NOHZ_MODE_LOWRES). But, the clockevent device is already reprogrammed from tick-handler for next tick.
So, the clockevent device will fire one more time. In NOHZ_MODE_HIGHRES, we will consider it as a spurious interrupt and just return from hrtimer_interrupt(). In NOHZ_MODE_LOWRES, we schedule the next tick again from tick_nohz_handler()? (This is what I could read from the code, not very sure though. Otherwise, it means that in NOHZ_MODE_LOWRES we are never tickless).
Ideally, as the clock event device is programmed in ONESHOT mode it should just fire one more time and that's it.
Is this really the "ideal" case? Even if it fires one more time, it's still a spurious interrupt that causes an unncessary wakeup, right?
But many implementations (like arm_arch_timer, etc) only have PERIODIC mode available and their drivers emulate ONESHOT over that. Which means that on these platforms we will get spurious interrupts at tick rate and that will hurt our tickless-ness badly.
At this time the clockevent device should be stopped, or its interrupts may be
At which time? The copy/paste from the cover letter here adds confusion.
masked in order to get these issues fixed.
See comment in cover letter.
A simple (yet hacky) solution to get this fixed could be: update hrtimer_force_reprogram() to always reprogram clockevent device and update clockevent drivers to STOP generating events (or delay it to max time) when 'expires' is set to KTIME_MAX. But the drawback here is that every clockevent driver has to be hacked for this particular case and its very easy for new ones to miss this. Also, NOHZ_MODE_LOWRES problem mentioned above wouldn't be fixed by this.
However, Thomas suggested to add an optional mode: ONESHOT_STOPPED (lkml.org/lkml/2014/5/9/508) to solve this problem.
This patch adds support for this new mode in clockevents core. clockevents_set_mode() would do a WARN() if this mode is supported and we failed to switch to it. If it isn't supported for some platform (->set_dev_mode() returns -ENOSYS), we will not update dev->mode and return early.
Two new APIs tick_stop_event() and tick_restart_event() are also implemented.
Uh, Why?
They are introduced here without explanation and without any users. This part should be a separate patch, introduced along with their users.
IMO, you should be following the sequence as discussed earlier
- introduce new ->set_dev_mode() - convert some clockevent devices to use it - introduce ONESHOT_STOPPED (and any dependencies) - convert a platform to use it.
The ordering of this is all mixed up in this series and rather confusing to review.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
include/linux/clockchips.h | 1 + include/linux/tick.h | 2 ++ kernel/time/clockevents.c | 14 ++++++++++++-- kernel/time/tick-oneshot.c | 20 ++++++++++++++++++++ 4 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h index 08b203e..d8a108a 100644 --- a/include/linux/clockchips.h +++ b/include/linux/clockchips.h @@ -38,6 +38,7 @@ enum clock_event_mode { CLOCK_EVT_MODE_SHUTDOWN, CLOCK_EVT_MODE_PERIODIC, CLOCK_EVT_MODE_ONESHOT,
- CLOCK_EVT_MODE_ONESHOT_STOPPED, CLOCK_EVT_MODE_RESUME,
}; diff --git a/include/linux/tick.h b/include/linux/tick.h index b84773c..f9bc979 100644 --- a/include/linux/tick.h +++ b/include/linux/tick.h @@ -81,6 +81,8 @@ extern struct tick_device *tick_get_device(int cpu); # ifdef CONFIG_HIGH_RES_TIMERS extern int tick_init_highres(void); extern int tick_program_event(ktime_t expires, int force); +extern void tick_stop_event(void); +extern void tick_restart_event(void); extern void tick_setup_sched_timer(void); # endif diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c index eaf5b0d..9348da1 100644 --- a/kernel/time/clockevents.c +++ b/kernel/time/clockevents.c @@ -107,8 +107,18 @@ void clockevents_set_mode(struct clock_event_device *dev, if (dev->mode != mode) { int ret = dev->set_dev_mode(mode, dev);
/* Currently available modes shouldn't fail */
WARN_ONCE(ret, "Requested mode: %d, error: %d\n", mode, ret);
this code is not in mainline (or -next), so I'm not sure what this series applies to. Normally, it's good practice to state in the cover letter what a series applies to.
However, the above removed code has never been upstream AFAICT, so it looks to me like this series applies on top of some earlier version of your patchset, which means this series has never actually been applied to mainline and compile tested. Ugh.
Viresh, these are the kind of things that really get on the nerves of reviewers and maintainers. It's obvious on review that this patch has not been compile tested against upstream, and it's not stated in the cover letter what is should apply to, which makes me wonder how much more time I should spend on it.
Combined with the fact that I don't think you followed my previous advice about breaking up the series into the right sequence, I'm quickly arriving at burnout on this series.
Kevin