Hi Thomas et al,
I was analyzing few traces with 'trace-cmd' and realized that timings weren't matching properly. For example consider this trace:
1 cat-1867 [000] d.h1 149.750891: hrtimer_expire_entry: hrtimer=ee7b3740 function=tick_sched_timer now=149740008694 2 cat-1867 [000] d.h1 149.750900: softirq_raise: vec=1 [action=TIMER] 3 cat-1867 [000] d.h1 149.750905: softirq_raise: vec=9 [action=RCU] 4 cat-1867 [000] d.h2 149.750910: sched_stat_runtime: comm=cat pid=1867 runtime=1463250 [ns] vruntime=4777557704 [ns] 5 cat-1867 [000] d.h1 149.750914: softirq_raise: vec=7 [action=SCHED] 6 cat-1867 [000] d.h1 149.750919: hrtimer_expire_exit: hrtimer=ee7b3740 7 cat-1867 [000] d.h2 149.750922: hrtimer_start: hrtimer=ee7b3740 function=tick_sched_timer expires=149745000000 softexpires=149745000000
Line 1: now=149.740008694 and trace-time: 149.750891. diff 10ms (consistently) Line 7: trace-time: 149.750922 and expires: 149.745000000, diff 5ms
So, either there is a mismatch of trace-times and required to be fixed? (@Steven)
Or we are almost always adding the tick-sched hrtimer in past?
I wasn't sure and tried to go through tick-sched and hrtimer core code again and this is what I realized.
At certain points we are trying to make sure that 'jiffies' are updated for current-time with loops like:
Lowres mode:
while (tick_nohz_reprogram(ts, now)) { now = ktime_get(); tick_do_update_jiffies64(now); }
tick_nohz_reprogram() does check if expires is in past and returns -ETIME (right?). And so we are guaranteed to updated jiffies to current time and reprogram clkevt device in future.
Highres mode: - First of all this kind of loop is missing from tick-handler: tick_sched_timer() as we are relying on hrtimer core to restart hrtimer for us and so we aren't trying to updated jiffies to latest values. And so we might (maybe always as shown in my traces) enqueue hrtimer in past.
- At some places we do have a while loop:
static void tick_nohz_restart(struct tick_sched *ts, ktime_t now) { ...
while (1) { /* Forward the time to expire in the future */ hrtimer_forward(&ts->sched_timer, now, tick_period);
if (ts->nohz_mode == NOHZ_MODE_HIGHRES) { hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED); /* Check, if the timer was already in the past */ if (hrtimer_active(&ts->sched_timer)) break; } else { ... } ... } }
But I couldn't make out that this while loop will actually run for hres mode. Unlike tick_nohz_reprogram(), hrtimer_start_expires() never fails and calls enqueue_hrtimer() which does this: timer->state |= HRTIMER_STATE_ENQUEUED;
And so hrtimer_active() is guaranteed to return 'true' ? From the comment over hrtimer_active() here it looks we are trying to make sure that we aren't adding hrtimer in past, but code doesn't look like that.
And so there is a possibility that hrtimers are getting added in past.
I tried to go to the initial commit: 79bf2bb (tick-management: dyntick / highres functionality) and see if things were different there.
But they weren't.
Sorry for the long mail :)
-- viresh
On Thu, 3 Jul 2014 18:59:31 +0530 Viresh Kumar viresh.kumar@linaro.org wrote:
Hi Thomas et al,
I was analyzing few traces with 'trace-cmd' and realized that timings weren't matching properly. For example consider this trace:
1 cat-1867 [000] d.h1 149.750891: hrtimer_expire_entry: hrtimer=ee7b3740 function=tick_sched_timer now=149740008694 2 cat-1867 [000] d.h1 149.750900: softirq_raise: vec=1 [action=TIMER] 3 cat-1867 [000] d.h1 149.750905: softirq_raise: vec=9 [action=RCU] 4 cat-1867 [000] d.h2 149.750910: sched_stat_runtime: comm=cat pid=1867 runtime=1463250 [ns] vruntime=4777557704 [ns] 5 cat-1867 [000] d.h1 149.750914: softirq_raise: vec=7 [action=SCHED] 6 cat-1867 [000] d.h1 149.750919: hrtimer_expire_exit: hrtimer=ee7b3740 7 cat-1867 [000] d.h2 149.750922: hrtimer_start: hrtimer=ee7b3740 function=tick_sched_timer expires=149745000000 softexpires=149745000000
Line 1: now=149.740008694 and trace-time: 149.750891. diff 10ms (consistently) Line 7: trace-time: 149.750922 and expires: 149.745000000, diff 5ms
So, either there is a mismatch of trace-times and required to be fixed? (@Steven)
Unless you changed clocks, the default timing of the trace is done by trace_clock_local() which basically just calls sched_clock(). Not much to be fixed there.
I would be interested in seeing another hrtimer_expire_entry to see if that is off by anything else other than 10ms. As that event is the only one shown to show what hrtimer thinks "now" is.
Also, I'm not really seeing what is wrong with the above. The expire_entry means the timer is being triggered and it raised the softirq. The hrtimer_start is a new setting to go off at 149.745. Since you noticed that the trace time stamp is 10ms ahead, that would state that the hrtimer would go off around trace timestamp 149.755.
The trace timer stamp uses a different clock than hrtimers does. Do not confuse them as the same clock. The best you can do is reference that "now" value to get the difference between the two and use that as a reference for what the time stamp of the trace should be when the hrtimer goes off again.
-- Steve
On 3 July 2014 19:29, Steven Rostedt rostedt@goodmis.org wrote:
Unless you changed clocks, the default timing of the trace is done by
I haven't..
trace_clock_local() which basically just calls sched_clock(). Not much to be fixed there.
Okay.
I would be interested in seeing another hrtimer_expire_entry to see if that is off by anything else other than 10ms. As that event is the only one shown to show what hrtimer thinks "now" is.
I checked that for some other timer and it was still 10ms. So, its not a tick-sched issue.
Also, I'm not really seeing what is wrong with the above. The expire_entry means the timer is being triggered and it raised the softirq. The hrtimer_start is a new setting to go off at 149.745. Since you noticed that the trace time stamp is 10ms ahead, that would state that the hrtimer would go off around trace timestamp 149.755.
The only part that looked odd to me was this offset of 10ms.
The trace timer stamp uses a different clock than hrtimers does. Do not confuse them as the same clock.
Hmm, that's new. And the diff might be ~10ms
The best you can do is reference that "now" value to get the difference between the two and use that as a reference for what the time stamp of the trace should be when the hrtimer goes off again.
Yeah, it was on time as this 10ms offset was consistent ..
So, there is no problem in traces apart from the fact that people might try to match trace-time and hrtimer time, like I did.
But the other issue I reported in hrtimer/tick-sched code might still be valid.
Thanks for your reply.
-- viresh
linaro-kernel@lists.linaro.org