On Thu, 9 Apr 2015, Peter Zijlstra wrote:
On Thu, Apr 09, 2015 at 09:20:39AM +0200, Ingo Molnar wrote:
if at least one base is active (on my fairly standard system all cpus have at least one active hrtimer base all the time - and many cpus have two bases active), then we run hrtimer_get_softirq_time(), which dirties the cachelines of all 4 clock bases:
	base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
	base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
	base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
	base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;
so in practice we not only touch every cacheline in every timer interrupt, but we _dirty_ them, even the inactive ones.
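To make the cost concrete, here is a rough userspace model of the difference between writing all four bases and writing only the ones that have timers queued. It is a sketch, not kernel code and not a patch from this thread: the struct layout, the 64-byte cacheline size, and the helper names update_all() and update_active() are assumptions made for illustration, and the active_bases bitmask only mimics the idea of tracking which bases are in use.

```c
#include <assert.h>
#include <stdint.h>

enum { BASE_REALTIME, BASE_MONOTONIC, BASE_BOOTTIME, BASE_TAI, MAX_BASES };

/* Stand-in for a clock base; padded so each one occupies its own
 * (assumed 64-byte) cacheline, as in the scenario described above. */
struct clock_base {
	_Alignas(64) int64_t softirq_time;
};

/* Stand-in for the per-cpu base (hypothetical layout). */
struct cpu_base {
	unsigned int active_bases;	/* bit n set: clock_base[n] has timers */
	struct clock_base clock_base[MAX_BASES];
};

/* Unconditional update: stores to every base, so every line gets dirtied.
 * Returns the number of bases written. */
static int update_all(struct cpu_base *cb, const int64_t now[MAX_BASES])
{
	int written = 0;

	for (int i = 0; i < MAX_BASES; i++) {
		cb->clock_base[i].softirq_time = now[i];
		written++;
	}
	return written;
}

/* Conditional update: stores only to bases marked active, leaving the
 * lines of inactive bases untouched. Returns the number of bases written. */
static int update_active(struct cpu_base *cb, const int64_t now[MAX_BASES])
{
	int written = 0;

	for (int i = 0; i < MAX_BASES; i++) {
		if (!(cb->active_bases & (1u << i)))
			continue;
		cb->clock_base[i].softirq_time = now[i];
		written++;
	}
	return written;
}
```

With two bases active, update_all() performs four stores per tick while update_active() performs two; the model says nothing about whether the extra branch is worth it on real hardware.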
Urgh we should really _really_ kill that entire softirq mess.
That's the !highres part. We cannot kill that one unless we remove all support for machines which do not provide hardware for highres support.
Now the softirq_time thing is an optimization which we added back in the days when hrtimer went into the tree and Roman complained about the base->get_time() invocation being overkill.
The reasoning behind this was that low resolution systems do not need accurate time for the expiry and the forwarding because everything happens tick aligned.
So for !HIGHRES we have:
	static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
	{
		return timer->base->softirq_time;
	}
and for the HIGHRES case:
	static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
	{
		return timer->base->get_time();
	}
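For readers who want to poke at the difference outside the kernel, a minimal userspace sketch of the two flavours follows. Everything in it is a stand-in: ktime_t is modelled as a plain nanosecond count, and fake_clock_ns, fake_get_time(), tick() and the two cb_get_time_*() helpers are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins, NOT the kernel definitions. */
typedef int64_t ktime_t;		/* nanoseconds */

struct clock_base {
	ktime_t softirq_time;		/* snapshot refreshed once per tick */
	ktime_t (*get_time)(void);	/* reads the clock on every call */
};

struct timer {
	struct clock_base *base;
};

/* Hypothetical clock source so the demo is deterministic. */
static ktime_t fake_clock_ns;

static ktime_t fake_get_time(void)
{
	return fake_clock_ns;
}

/* Models the tick: take one clock reading and cache it in the base. */
static void tick(struct clock_base *base)
{
	base->softirq_time = base->get_time();
}

/* !HIGHRES flavour: return the cached snapshot, no clock read. */
static ktime_t cb_get_time_lowres(const struct timer *t)
{
	return t->base->softirq_time;
}

/* HIGHRES flavour: read the clock each time the callback asks. */
static ktime_t cb_get_time_highres(const struct timer *t)
{
	return t->base->get_time();
}
```

If the fake clock advances from 1000 to 1500 after a tick(), cb_get_time_lowres() still reports 1000 while cb_get_time_highres() reports 1500. That is the trade: the snapshot is cheap but only tick accurate, which is good enough when expiry and forwarding are tick aligned anyway.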
Here are the usage sites of this:
	drivers/power/reset/ltc2952-poweroff.c: now = hrtimer_cb_get_time(timer);
	kernel/sched/core.c: now = hrtimer_cb_get_time(period_timer);
	kernel/sched/deadline.c: now = hrtimer_cb_get_time(&dl_se->dl_timer);
	kernel/sched/fair.c: now = hrtimer_cb_get_time(timer);
	kernel/sched/rt.c: now = hrtimer_cb_get_time(timer);
	kernel/time/posix-timers.c: ktime_t now = hrtimer_cb_get_time(timer);
	sound/drivers/dummy.c: dpcm->base_time = hrtimer_cb_get_time(&dpcm->timer);
	sound/drivers/dummy.c: delta = ktime_us_delta(hrtimer_cb_get_time(&dpcm->timer),
So the only users where this optimization might matter are the clock monotonic ones. The few users of posix interval timers which use something other than CLOCK_MONO should not matter much.
I'd be happy to kill all of this and consolidate everything on the HIGHRES implementation.
Thanks,
tglx