Hi Thomas,
As suggested by you (https://lkml.org/lkml/2014/4/14/797), this is the first lot of changes I have. These are all potential bug fixes (Sorry if I haven't read the most obvious code correctly at some place :) ).
Patch 2/5 isn't a bug fix but was required as a dependency for 3/5.
Some discussions already happened for 5/5 here: https://lkml.org/lkml/2014/4/9/243 https://lkml.org/lkml/2014/4/9/346
I have tried to mark stable release wherever possible.
Viresh Kumar (5): tick-common: fix wrong check in tick_check_replacement() tick-common: don't check tick_oneshot_mode_active() from tick_check_preferred() tick-common: do additional checks in tick_check_preferred() tick-sched: don't call update_wall_time() when delta is lesser than tick_period tick-sched: replace tick_nohz_active with tick_nohz_enabled in tick_nohz_switch_to_nohz()
kernel/time/tick-common.c | 29 +++++++++++++++++++---------- kernel/time/tick-sched.c | 34 ++++++++++++++++++---------------- 2 files changed, 37 insertions(+), 26 deletions(-)
tick_check_replacement() returns if a replacement of clock_event_device is possible or not. It does this as the first check:
if (tick_check_percpu(curdev, newdev, smp_processor_id())) return false;
This looks wrong as we are returning false when tick_check_percpu() returned true. Probably Thomas forgot '!' here in his commit: 03e13cf5e ?
Fix it by placing a '!' before tick_check_percpu().
Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- kernel/time/tick-common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c index 0156612..0a0608e 100644 --- a/kernel/time/tick-common.c +++ b/kernel/time/tick-common.c @@ -276,7 +276,7 @@ static bool tick_check_preferred(struct clock_event_device *curdev, bool tick_check_replacement(struct clock_event_device *curdev, struct clock_event_device *newdev) { - if (tick_check_percpu(curdev, newdev, smp_processor_id())) + if (!tick_check_percpu(curdev, newdev, smp_processor_id())) return false;
return tick_check_preferred(curdev, newdev);
B1;3202;0c
On Tue, 15 Apr 2014, Viresh Kumar wrote:
tick_check_replacement() returns if a replacement of clock_event_device is possible or not. It does this as the first check:
if (tick_check_percpu(curdev, newdev, smp_processor_id())) return false;
This looks wrong as we are returning false when tick_check_percpu() returned true. Probably Thomas forgot '!' here in his commit: 03e13cf5e ?
Come on. You can do better changelogs.
"This looks wrong" is definitely not a good description of the problem.
Either you know WHY it is wrong, then you say so. If not, then you can send an RFC.
I fixed the changelog up this time.
Thanks,
tglx
On 16 April 2014 00:12, Thomas Gleixner tglx@linutronix.de wrote:
B1;3202;0c
What does this mean ??
On Tue, 15 Apr 2014, Viresh Kumar wrote:
tick_check_replacement() returns if a replacement of clock_event_device is possible or not. It does this as the first check:
if (tick_check_percpu(curdev, newdev, smp_processor_id())) return false;
This looks wrong as we are returning false when tick_check_percpu() returned true. Probably Thomas forgot '!' here in his commit: 03e13cf5e ?
Come on. You can do better changelogs.
:(
"This looks wrong" is definitely not a good description of the problem.
Either you know WHY it is wrong, then you say so. If not, then you can send an RFC.
I fixed the changelog up this time.
Thanks, will take care of such stuff in future.
If 'curdev' passed to tick_check_preferred() is the current clock_event_device then these two checks look exactly same, because td->mode is set to TICKDEV_MODE_ONESHOT only when the event device has ONESHOT feature.
if (curdev && (curdev->features & CLOCK_EVT_FEAT_ONESHOT)) return false;
if (tick_oneshot_mode_active()) return false;
Now left the case where 'curdev' is not the current clock_event_device. This can happen from the sequence started from clockevents_replace(). Here we are trying to find the best possible device that we should choose. And so even in this case we don't need the above check as we aren't really worried about the current device.
So, the second check can be removed.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- kernel/time/tick-common.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c index 0a0608e..69cab28 100644 --- a/kernel/time/tick-common.c +++ b/kernel/time/tick-common.c @@ -256,8 +256,6 @@ static bool tick_check_preferred(struct clock_event_device *curdev, if (!(newdev->features & CLOCK_EVT_FEAT_ONESHOT)) { if (curdev && (curdev->features & CLOCK_EVT_FEAT_ONESHOT)) return false; - if (tick_oneshot_mode_active()) - return false; }
/*
On Tue, 15 Apr 2014, Viresh Kumar wrote:
If 'curdev' passed to tick_check_preferred() is the current clock_event_device then these two checks look exactly same, because td->mode is set to TICKDEV_MODE_ONESHOT only when the event device has ONESHOT feature.
if (curdev && (curdev->features & CLOCK_EVT_FEAT_ONESHOT)) return false;
if (tick_oneshot_mode_active()) return false;
Now left the case where 'curdev' is not the current clock_event_device. This can happen from the sequence started from clockevents_replace(). Here we are trying to find the best possible device that we should choose. And so even in this case we don't need the above check as we aren't really worried about the current device.
Wrong. If curdev is NULL, you might select a device w/o ONESHOT if the system is in oneshot mode. Go figure.
Thanks,
tglx
On 16 April 2014 00:00, Thomas Gleixner tglx@linutronix.de wrote:
On Tue, 15 Apr 2014, Viresh Kumar wrote:
If 'curdev' passed to tick_check_preferred() is the current clock_event_device then these two checks look exactly same, because td->mode is set to TICKDEV_MODE_ONESHOT only when the event device has ONESHOT feature.
if (curdev && (curdev->features & CLOCK_EVT_FEAT_ONESHOT)) return false; if (tick_oneshot_mode_active()) return false;
Now left the case where 'curdev' is not the current clock_event_device. This can happen from the sequence started from clockevents_replace(). Here we are trying to find the best possible device that we should choose. And so even in this case we don't need the above check as we aren't really worried about the current device.
Wrong. If curdev is NULL, you might select a device w/o ONESHOT if the system is in oneshot mode. Go figure.
Okay, so the logs must have another case where curdev is NULL. But codewise we are already taking care of that here:
return !curdev || newdev->rating > curdev->rating || !cpumask_equal(curdev->cpumask, newdev->cpumask);
And so this patch wouldn't harm. And this is preserved in the next patch (3/5) as well, which adds checks for other cases as well.
We return false from tick_check_preferred() if newdev doesn't have ONESHOT feature but curdev has, but we don't return true when newdev has ONESHOT and curdev doesn't. Instead we go on, check ratings and other things in that case.
This patch tries to fix this by rewriting some portion of this code and adds sufficient comments to make logic clear.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- kernel/time/tick-common.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c index 69cab28..2f13889 100644 --- a/kernel/time/tick-common.c +++ b/kernel/time/tick-common.c @@ -252,18 +252,29 @@ static bool tick_check_percpu(struct clock_event_device *curdev, static bool tick_check_preferred(struct clock_event_device *curdev, struct clock_event_device *newdev) { - /* Prefer oneshot capable device */ - if (!(newdev->features & CLOCK_EVT_FEAT_ONESHOT)) { - if (curdev && (curdev->features & CLOCK_EVT_FEAT_ONESHOT)) - return false; - } + if (!curdev) + return true; + + /* + * Prefer oneshot capable device. + * return values based on if ONESHOT is available or not: + * + * curdev newdev operation + * 0 0 check priority + * 0 1 return true + * 1 0 return false + * 1 1 check priority + */ + + if ((newdev->features & CLOCK_EVT_FEAT_ONESHOT) ^ + (curdev->features & CLOCK_EVT_FEAT_ONESHOT)) + return newdev->features & CLOCK_EVT_FEAT_ONESHOT;
/* * Use the higher rated one, but prefer a CPU local device with a lower * rating than a non-CPU local device */ - return !curdev || - newdev->rating > curdev->rating || + return newdev->rating > curdev->rating || !cpumask_equal(curdev->cpumask, newdev->cpumask); }
In tick_do_update_jiffies64() we are processing ticks only if delta is greater than tick_period. This is what we are supposed to do here and it broke a bit with this patch:
commit 47a1b796306356f358e515149d86baf0cc6bf007 Author: John Stultz john.stultz@linaro.org Date: Thu Dec 12 13:10:55 2013 -0800
tick/timekeeping: Call update_wall_time outside the jiffies lock
With above patch, we might end up calling update_wall_time() even if delta is found to be smaller that tick_period. Fix this by reversing the check and returning early.
Cc: stable@vger.kernel.org # v3.14+ Cc: John Stultz john.stultz@linaro.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- kernel/time/tick-sched.c | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 9f8af69..dd35b55 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -65,26 +65,28 @@ static void tick_do_update_jiffies64(ktime_t now) write_seqlock(&jiffies_lock);
delta = ktime_sub(now, last_jiffies_update); - if (delta.tv64 >= tick_period.tv64) { - - delta = ktime_sub(delta, tick_period); - last_jiffies_update = ktime_add(last_jiffies_update, - tick_period); + if (delta.tv64 < tick_period.tv64) { + write_sequnlock(&jiffies_lock); + return; + }
- /* Slow path for long timeouts */ - if (unlikely(delta.tv64 >= tick_period.tv64)) { - s64 incr = ktime_to_ns(tick_period); + delta = ktime_sub(delta, tick_period); + last_jiffies_update = ktime_add(last_jiffies_update, tick_period);
- ticks = ktime_divns(delta, incr); + /* Slow path for long timeouts */ + if (unlikely(delta.tv64 >= tick_period.tv64)) { + s64 incr = ktime_to_ns(tick_period);
- last_jiffies_update = ktime_add_ns(last_jiffies_update, - incr * ticks); - } - do_timer(++ticks); + ticks = ktime_divns(delta, incr);
- /* Keep the tick_next_period variable up to date */ - tick_next_period = ktime_add(last_jiffies_update, tick_period); + last_jiffies_update = ktime_add_ns(last_jiffies_update, + incr * ticks); } + do_timer(++ticks); + + /* Keep the tick_next_period variable up to date */ + tick_next_period = ktime_add(last_jiffies_update, tick_period); + write_sequnlock(&jiffies_lock); update_wall_time(); }
On Tue, 15 Apr 2014, Viresh Kumar wrote:
In tick_do_update_jiffies64() we are processing ticks only if delta is greater than tick_period. This is what we are supposed to do here and it broke a bit with this patch:
commit 47a1b796306356f358e515149d86baf0cc6bf007 Author: John Stultz john.stultz@linaro.org Date: Thu Dec 12 13:10:55 2013 -0800
tick/timekeeping: Call update_wall_time outside the jiffies lock
Please look how I massaged the change log. There is no point in copying the whole gunk.
With above patch, we might end up calling update_wall_time() even if delta is found to be smaller that tick_period. Fix this by reversing the check and returning early.
Well.
Cc: stable@vger.kernel.org # v3.14+ Cc: John Stultz john.stultz@linaro.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
kernel/time/tick-sched.c | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-)
That's not how we do bug fixes if they can be done with 3 lines of change. See the commit.
Thanks,
tglx
On 16 April 2014 00:14, Thomas Gleixner tglx@linutronix.de wrote:
On Tue, 15 Apr 2014, Viresh Kumar wrote:
In tick_do_update_jiffies64() we are processing ticks only if delta is greater than tick_period. This is what we are supposed to do here and it broke a bit with this patch:
commit 47a1b796306356f358e515149d86baf0cc6bf007 Author: John Stultz john.stultz@linaro.org Date: Thu Dec 12 13:10:55 2013 -0800
tick/timekeeping: Call update_wall_time outside the jiffies lock
Please look how I massaged the change log. There is no point in copying the whole gunk.
I see.. Nice.
With above patch, we might end up calling update_wall_time() even if delta is found to be smaller that tick_period. Fix this by reversing the check and returning early.
Well.
Cc: stable@vger.kernel.org # v3.14+ Cc: John Stultz john.stultz@linaro.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
kernel/time/tick-sched.c | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-)
That's not how we do bug fixes if they can be done with 3 lines of change. See the commit.
I tried that initially but with these changes as well (which must be done now ??), which probably makes it more clear ?:
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 3cafe7d..0e70b1c 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -84,12 +84,12 @@ static void tick_do_update_jiffies64(ktime_t now)
/* Keep the tick_next_period variable up to date */ tick_next_period = ktime_add(last_jiffies_update, tick_period); + + write_sequnlock(&jiffies_lock); + update_wall_time(); } else { write_sequnlock(&jiffies_lock); - return; } - write_sequnlock(&jiffies_lock); - update_wall_time(); }
/*
Recently this patch came in:
commit d689fe222a858c767cb8594faf280048e532b53f Author: Thomas Gleixner tglx@linutronix.de Date: Wed Nov 13 21:01:57 2013 +0100
NOHZ: Check for nohz active instead of nohz enabled
and introduced a bug here. tick_nohz_active is initialized to zero and we are returning from this routine if it is zero. And hence we will never be able to cross this check. Actually this should not have been changed by above patch and should have stayed to tick_nohz_enabled.
Fix it.
Cc: stable@vger.kernel.org # 3.13+ Signed-off-by: Viresh Kumar viresh.kumar@linaro.org ---
Some discussions already happened here:
https://lkml.org/lkml/2014/4/9/243 https://lkml.org/lkml/2014/4/9/346
kernel/time/tick-sched.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index dd35b55..059e1eb 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -969,7 +969,7 @@ static void tick_nohz_switch_to_nohz(void) struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched); ktime_t next;
- if (!tick_nohz_active) + if (!tick_nohz_enabled) return;
local_irq_disable();
linaro-kernel@lists.linaro.org