This is the next planned step of "add ->set_dev_mode" patchset..
Its not being sent out (before earlier patchset is accepted by all) to receive
*more* criticism (I already got enough :)), but to give an overall view of where
we are heading.
You can choose to skip reviewing this and concentrate on the first patchset
instead unless that is upstreamed :)
Oh man, I am too scared now :)
Okay, here we go:
A clockevent device is used to service timers/hrtimers requests and the next
event (when it should fire) is decided by the timer/hrtimer expiring next. When
no timers/hrtimers are pending to be serviced, the expiry time is set to a
special value: KTIME_MAX. This means that no events are required for indefinite
amount of time.
This would normally happen with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.
When expiry == KTIME_MAX, either we cancel the tick-sched hrtimer
(NOHZ_MODE_HIGHRES) or skip reprogramming clockevent device (NOHZ_MODE_LOWRES).
But, the clockevent device is already reprogrammed from tick-handler for next
tick.
So, the clockevent device will fire one more time. In NOHZ_MODE_HIGHRES, we will
consider it as a spurious interrupt and just return from hrtimer_interrupt(). In
NOHZ_MODE_LOWRES, we schedule the next tick again from tick_nohz_handler()?
(This is what I could read from the code, not very sure though. Otherwise, it
means that in NOHZ_MODE_LOWRES we are never tickless).
Ideally, as the clock event device is programmed in ONESHOT mode it should just
fire one more time and that's it. But many implementations (like arm_arch_timer,
etc) only have PERIODIC mode available and their drivers emulate ONESHOT over
that. Which means that on these platforms we will get spurious interrupts at
tick rate and that will hurt our tickless-ness badly.
At this time the clockevent device should be stopped, or its interrupts may be
masked in order to get these issues fixed.
A simple (yet hacky) solution to get this fixed could be: update
hrtimer_force_reprogram() to always reprogram clockevent device and update
clockevent drivers to STOP generating events (or delay it to max time) when
'expires' is set to KTIME_MAX. But the drawback here is that every clockevent
driver has to be hacked for this particular case and its very easy for new ones
to miss this. Also, NOHZ_MODE_LOWRES problem mentioned above wouldn't be fixed
by this.
However, Thomas suggested to add an optional mode: ONESHOT_STOPPED
(lkml.org/lkml/2014/5/9/508) to solve this problem.
First patch implements the required infrastructure to start/stop clockevent
device. Third patch stops clockevent devices when no longer required and Second
patch starts them again once required.
The review order can be 1,3,2 for better understanding. Patch 2 was required
before 3 to keep 'git bisect' happy :)
Fourth patch is there to catch corner cases where we try to set next event while
being in ONESHOT_STOPPED mode. We will do a WARN_ON_ONCE() then. The last patch
modifies a sample driver (arm_arch_timer) to demonstrate/test this patchset.
Other drivers would be updated later.
Viresh Kumar (5):
clockevents: Introduce CLOCK_EVT_MODE_ONESHOT_STOPPED mode
tick-sched: switchback to ONESHOT mode if clockevent device is stopped
tick-sched: stop clockevent device when no longer required
clockevents: Catch event programming in ONESHOT_STOPPED mode
clocksource: arm_arch_timer: Add support for
CLOCK_EVT_MODE_ONESHOT_STOPPED
drivers/clocksource/arm_arch_timer.c | 1 +
include/linux/clockchips.h | 1 +
include/linux/tick.h | 2 ++
kernel/hrtimer.c | 53 +++++++++++++++++++++++++++++++++---
kernel/time/clockevents.c | 17 ++++++++++--
kernel/time/tick-oneshot.c | 20 ++++++++++++++
kernel/time/tick-sched.c | 4 +++
7 files changed, 92 insertions(+), 6 deletions(-)
--
2.0.0.rc2
Tegra's driver got updated a bit (00917dd cpufreq: Tegra: implement intermediate
frequency callbacks) and implements new 'intermediate freq' infrastructure of
core. Above commit updated comments about when to call
clk_prepare_enable(pll_x_clk) and Doug wasn't satisfied with those comments and
said this:
> The "Though when target-freq is intermediate freq, we don't need to
> take this reference." makes me think that this function is actually
> called when target-freq is intermediate freq. I don't think it is,
> right?
For better clarity just make that comment more explicit about when we call
tegra_target_intermediate(). Wasn't sure if we actually need a commit for this,
but anyway lets other decide if its worth enough :)
Reported-by: Doug Anderson <dianders(a)chromium.org>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
drivers/cpufreq/tegra-cpufreq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/tegra-cpufreq.c b/drivers/cpufreq/tegra-cpufreq.c
index a5fbc0a..48bc89b 100644
--- a/drivers/cpufreq/tegra-cpufreq.c
+++ b/drivers/cpufreq/tegra-cpufreq.c
@@ -73,7 +73,7 @@ static int tegra_target_intermediate(struct cpufreq_policy *policy,
* off when we move the cpu off of it as enabling it again while we
* switch to it from tegra_target() would take additional time. Though
* when target-freq is intermediate freq, we don't need to take this
- * reference.
+ * reference and so this routine isn't called at all.
*/
clk_prepare_enable(pll_x_clk);
--
2.0.0.rc2
3.13.11.3 -stable review patch. If anyone has any objections, please let me know.
------------------
From: Viresh Kumar <viresh.kumar(a)linaro.org>
commit 27630532ef5ead28b98cfe28d8f95222ef91c2b7 upstream.
Since commit d689fe222 (NOHZ: Check for nohz active instead of nohz
enabled) the tick_nohz_switch_to_nohz() function returns because it
checks for the tick_nohz_active flag. This can't be set, because the
function itself sets it.
Undo the change in tick_nohz_switch_to_nohz().
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Cc: linaro-kernel(a)lists.linaro.org
Cc: fweisbec(a)gmail.com
Cc: Arvind.Chauhan(a)arm.com
Cc: linaro-networking(a)linaro.org
Link: http://lkml.kernel.org/r/40939c05f2d65d781b92b20302b02243d0654224.139753798…
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Kamal Mostafa <kamal(a)canonical.com>
---
kernel/time/tick-sched.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index ea20f7d..29b063b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -970,7 +970,7 @@ static void tick_nohz_switch_to_nohz(void)
struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
ktime_t next;
- if (!tick_nohz_active)
+ if (!tick_nohz_enabled)
return;
local_irq_disable();
--
1.9.1
This is a note to let you know that I have just added a patch titled
tick-common: Fix wrong check in tick_check_replacement()
to the linux-3.13.y-queue branch of the 3.13.y.z extended stable tree
which can be found at:
http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/lin…
This patch is scheduled to be released in version 3.13.11.3.
If you, or anyone else, feels it should not be added to this tree, please
reply to this email.
For more information about the 3.13.y.z tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable
Thanks.
-Kamal
------
>From a7a150dca33ba90ecf23eb27b055c1a3267cb263 Mon Sep 17 00:00:00 2001
From: Viresh Kumar <viresh.kumar(a)linaro.org>
Date: Tue, 15 Apr 2014 10:54:37 +0530
Subject: tick-common: Fix wrong check in tick_check_replacement()
commit 521c42990e9d561ed5ed9f501f07639d0512b3c9 upstream.
tick_check_replacement() returns if a replacement of clock_event_device is
possible or not. It does this as the first check:
if (tick_check_percpu(curdev, newdev, smp_processor_id()))
return false;
Thats wrong. tick_check_percpu() returns true when the device is
useable. Check for false instead.
[ tglx: Massaged changelog ]
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Cc: linaro-kernel(a)lists.linaro.org
Cc: fweisbec(a)gmail.com
Cc: Arvind.Chauhan(a)arm.com
Cc: linaro-networking(a)linaro.org
Link: http://lkml.kernel.org/r/486a02efe0246635aaba786e24b42d316438bf3b.139753798…
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Kamal Mostafa <kamal(a)canonical.com>
---
kernel/time/tick-common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 162b03a..425bfae 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -275,7 +275,7 @@ static bool tick_check_preferred(struct clock_event_device *curdev,
bool tick_check_replacement(struct clock_event_device *curdev,
struct clock_event_device *newdev)
{
- if (tick_check_percpu(curdev, newdev, smp_processor_id()))
+ if (!tick_check_percpu(curdev, newdev, smp_processor_id()))
return false;
return tick_check_preferred(curdev, newdev);
--
1.9.1
This is a note to let you know that I have just added a patch titled
tick-sched: Check tick_nohz_enabled in tick_nohz_switch_to_nohz()
to the linux-3.13.y-queue branch of the 3.13.y.z extended stable tree
which can be found at:
http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/lin…
This patch is scheduled to be released in version 3.13.11.3.
If you, or anyone else, feels it should not be added to this tree, please
reply to this email.
For more information about the 3.13.y.z tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable
Thanks.
-Kamal
------
>From 05760233a5718be8d39485f78d44e50d6a721290 Mon Sep 17 00:00:00 2001
From: Viresh Kumar <viresh.kumar(a)linaro.org>
Date: Tue, 15 Apr 2014 10:54:41 +0530
Subject: tick-sched: Check tick_nohz_enabled in tick_nohz_switch_to_nohz()
commit 27630532ef5ead28b98cfe28d8f95222ef91c2b7 upstream.
Since commit d689fe222 (NOHZ: Check for nohz active instead of nohz
enabled) the tick_nohz_switch_to_nohz() function returns because it
checks for the tick_nohz_active flag. This can't be set, because the
function itself sets it.
Undo the change in tick_nohz_switch_to_nohz().
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Cc: linaro-kernel(a)lists.linaro.org
Cc: fweisbec(a)gmail.com
Cc: Arvind.Chauhan(a)arm.com
Cc: linaro-networking(a)linaro.org
Link: http://lkml.kernel.org/r/40939c05f2d65d781b92b20302b02243d0654224.139753798…
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Kamal Mostafa <kamal(a)canonical.com>
---
kernel/time/tick-sched.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index ea20f7d..29b063b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -970,7 +970,7 @@ static void tick_nohz_switch_to_nohz(void)
struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
ktime_t next;
- if (!tick_nohz_active)
+ if (!tick_nohz_enabled)
return;
local_irq_disable();
--
1.9.1
Dear all,
There's a question in the arch/arm64/kernel/entry.S as following,
/*
* EL1 mode handlers.
*/
el1_sync:
kernel_entry 1
mrs x1, esr_el1 // read the syndrome register
lsr x24, x1, #ESR_EL1_EC_SHIFT // exception class
cmp x24, #ESR_EL1_EC_DABT_EL1 // data abort in EL1
b.eq el1_da
cmp x24, #ESR_EL1_EC_SYS64 // configurable trap
b.eq el1_undef
cmp x24, #ESR_EL1_EC_SP_ALIGN // stack alignment exception
b.eq el1_sp_pc
el1_sp_pc:
/*
* Stack or PC alignment exception handling
*/
mrs x0, far_el1
- mov x1, x25 ==> this is an extra operation
mov x2, sp
b do_sp_pc_abort //Jump to C Exception handler
/**The C Exception Handler/
asmlinkage void __exception do_sp_pc_abort(unsigned long addr,
unsigned int esr,
struct pt_regs *regs)
{
...
}
We use x1 register to store the value of ESR, and check the value to identify which exception handler to jump,
And there's a weird part In stack alignment exception handler(el1_sp_pc),
Why do we need to move x25 to x1?
The ESR has been stored into x1, and should be directly pass to do_sp_pc_abort function
"MOV x1, x25" is an extra operation and do_sp_pc_abort would get the wrong value of esr...
I'm not sure whether I'm right or not, hope someone can take a look at it, thx
BRs
andy