On 6 June 2013 15:25, Borislav Petkov <bp(a)suse.de> wrote:
> The correct "fix" for this whole deal is coupling cpufreq with
> the scheduler, as it has been said so many times before. You need
> "something" which can tell you whether raising the freq. is worth it or
> not (i.e. the process is waiting on IO or is executing instructions).
Linaro has a blueprint in this direction but doesn't have any
proof-of-concept or RFC patches to share yet. That will happen soon.
On 6 June 2013 12:37, Lukasz Majewski <l.majewski(a)samsung.com> wrote:
> Subject: cpufreq: Define cpufreq_set_drv_attr_files() to add per CPU sysfs attributes
It's not per-CPU. We only add the files for policy->cpu; other routines
actually create the links.
> The cpufreq_set_drv_attr_files() function creates sysfs file entry for
> each available CPU. With it in place it is possible to add different
> set of attributes without code duplication.
Not for each available CPU: the files are created under policy->kobj and
so show up for each CPU in policy->cpus.
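For reference, this is roughly what cpufreq_add_dev_symlink() already does
for the remaining CPUs of the policy; a simplified fragment, not the exact
code:

	for_each_cpu(j, policy->cpus) {
		struct device *cpu_dev;

		if (j == policy->cpu)
			continue;

		/* cpuN/cpufreq becomes a symlink to cpu<policy->cpu>/cpufreq */
		cpu_dev = get_cpu_device(j);
		ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
					"cpufreq");
		if (ret)
			break;
	}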
> Signed-off-by: Lukasz Majewski <l.majewski(a)samsung.com>
> Signed-off-by: Myungjoo Ham <myungjoo.ham(a)samsung.com>
> ---
> drivers/cpufreq/cpufreq.c | 23 +++++++++++++++--------
> 1 file changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 1b8a48e..ca74e27 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -730,12 +730,23 @@ static int cpufreq_add_dev_symlink(unsigned int cpu,
> return ret;
> }
>
> +static int cpufreq_set_drv_attr_files(struct cpufreq_policy *policy,
> + struct freq_attr **drv_attr)
> +{
> + while ((drv_attr) && (*drv_attr)) {
> + if (sysfs_create_file(&policy->kobj, &((*drv_attr)->attr)))
> + return 1;
You are changing the semantics here. We used to return the error
value from sysfs_create_file(), whereas you are returning 1.
> + drv_attr++;
If drv_attr was valid initially, then drv_attr++ can't make it NULL.
So we don't need to re-check drv_attr on every loop iteration.
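Something like the below would address both points; just a sketch of the
idea, not a tested replacement:

static int cpufreq_set_drv_attr_files(struct cpufreq_policy *policy,
				      struct freq_attr **drv_attr)
{
	int ret;

	/* a NULL table means the driver has no extra attributes */
	if (!drv_attr)
		return 0;

	while (*drv_attr) {
		/* propagate the error code from sysfs_create_file() as-is */
		ret = sysfs_create_file(&policy->kobj, &((*drv_attr)->attr));
		if (ret)
			return ret;
		drv_attr++;
	}

	return 0;
}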
Currently, the RTC IRQ is never wakeup-enabled, so it is not capable of
bringing the system out of suspend.
On OMAP platforms, we have gotten by without this because the TWL RTC
is on an I2C-connected chip which is capable of waking up the OMAP via
the IO ring when the OMAP is in low-power states.
However, if the OMAP suspends without hitting the low-power states
(and the IO ring is not enabled), RTC wakeups will not work because
the IRQ is not wakeup enabled.
To fix this, ensure the RTC IRQ is wakeup-enabled whenever the RTC alarm
is set.
Cc: Alessandro Zummo <a.zummo(a)towertech.it>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Kevin Hilman <khilman(a)linaro.org>
---
drivers/rtc/rtc-twl.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/rtc/rtc-twl.c b/drivers/rtc/rtc-twl.c
index 8751a52..bbda0fd 100644
--- a/drivers/rtc/rtc-twl.c
+++ b/drivers/rtc/rtc-twl.c
@@ -213,12 +213,24 @@ static int mask_rtc_irq_bit(unsigned char bit)
static int twl_rtc_alarm_irq_enable(struct device *dev, unsigned enabled)
{
+ struct platform_device *pdev = to_platform_device(dev);
+ int irq = platform_get_irq(pdev, 0);
+ static bool twl_rtc_wake_enabled;
int ret;
- if (enabled)
+ if (enabled) {
ret = set_rtc_irq_bit(BIT_RTC_INTERRUPTS_REG_IT_ALARM_M);
- else
+ if (device_can_wakeup(dev) && !twl_rtc_wake_enabled) {
+ enable_irq_wake(irq);
+ twl_rtc_wake_enabled = true;
+ }
+ } else {
ret = mask_rtc_irq_bit(BIT_RTC_INTERRUPTS_REG_IT_ALARM_M);
+ if (twl_rtc_wake_enabled) {
+ disable_irq_wake(irq);
+ twl_rtc_wake_enabled = false;
+ }
+ }
return ret;
}
--
1.8.2
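Note that the enable_irq_wake() path above only triggers when the device has
been marked wakeup-capable, so that device_can_wakeup() returns true. Drivers
normally do that in probe; a minimal sketch (the twl-rtc probe may already
carry an equivalent call):

	/* in twl_rtc_probe(): declare the RTC as a wakeup source so that
	 * device_can_wakeup(dev) in twl_rtc_alarm_irq_enable() succeeds */
	device_init_wakeup(&pdev->dev, 1);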
I have faced a sequence where the Idle Load Balance was sometimes not
triggered for a while on my platform.
CPU 0 and CPU 1 are running tasks and CPU 2 is idle
CPU 1 kicks the Idle Load Balance
CPU 1 selects CPU 2 as the new Idle Load Balancer
CPU 1 sets NOHZ_BALANCE_KICK for CPU 2
CPU 1 sends a reschedule IPI to CPU 2
While CPU 2 is waking up, CPU 0 or CPU 1 migrates a waking task A to CPU 2
CPU 2 finally wakes up, runs task A and discards the Idle Load Balance
Task A quickly goes back to sleep (before a tick occurs on CPU 2)
CPU 2 goes back to idle with NOHZ_BALANCE_KICK set
Whenever CPU 2 is then selected for the ILB, no reschedule IPI will be
sent to CPU 2, which is idle, because NOHZ_BALANCE_KICK is already set,
and no Idle Load Balance will be performed.
We must wait for the sched softirq to be raised on CPU 2 by another part
of the kernel to clear NOHZ_BALANCE_KICK and come back to a normal
situation.
The proposed solution clears NOHZ_BALANCE_KICK in scheduler_ipi() if
we can't raise the sched softirq for the Idle Load Balance.
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
kernel/sched/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 58453b8..51fc715 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1420,7 +1420,8 @@ void scheduler_ipi(void)
if (unlikely(got_nohz_idle_kick() && !need_resched())) {
this_rq()->idle_balance = 1;
raise_softirq_irqoff(SCHED_SOFTIRQ);
- }
+ } else
+ clear_bit(NOHZ_BALANCE_KICK, nohz_flags(smp_processor_id()));
irq_exit();
}
--
1.7.9.5
We are currently saving the first scheduling domain for a cpu in
build_sched_domains() by iterating over the nested sd->child list. We don't
actually need to do it this way.
*per_cpu_ptr(d.sd, i) is guaranteed to be NULL at the beginning, as we have
already called __visit_domain_allocation_hell(), which memsets struct s_data
to zero.
So, save the pointer to the first SD while running the iteration loop over
the tl's.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
kernel/sched/core.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 58453b8..638f6cb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6533,16 +6533,13 @@ static int build_sched_domains(const struct cpumask *cpu_map,
sd = NULL;
for (tl = sched_domain_topology; tl->init; tl++) {
sd = build_sched_domain(tl, &d, cpu_map, attr, sd, i);
+ if (!*per_cpu_ptr(d.sd, i))
+ *per_cpu_ptr(d.sd, i) = sd;
if (tl->flags & SDTL_OVERLAP || sched_feat(FORCE_SD_OVERLAP))
sd->flags |= SD_OVERLAP;
if (cpumask_equal(cpu_map, sched_domain_span(sd)))
break;
}
-
- while (sd->child)
- sd = sd->child;
-
- *per_cpu_ptr(d.sd, i) = sd;
}
/* Build the groups for the domains */
--
1.7.12.rc2.18.g61b472e
Commit bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 brought multiple driver
support. It added a couple of new APIs to register a driver per cpu, which
led to unnecessary code complexity for handling the kernel config options
when multiple driver support is enabled or not.
The code has to keep working when multiple driver support is not enabled,
and multiple driver support has to stay compatible with the old API.
This patch removes the per-cpu registration API, which is not yet used by
any driver although the capability will be needed for the upcoming HMP
cpuidle drivers, and replaces its usage with a cpumask pointer in the
cpuidle driver structure telling which cpus are handled by the driver. That
lets the cpuidle_[un]register_driver API, as well as the recently added
cpuidle_[un]register functions, be used for multiple driver support.
The current code, rather poor in comments, has been commented and
simplified.
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
---
[V2]:
- fixed bad refcount check
- inverted clockevent notify off order at unregister time
drivers/cpuidle/cpuidle.c | 4 +-
drivers/cpuidle/driver.c | 324 ++++++++++++++++++++++++++++-----------------
include/linux/cpuidle.h | 21 +--
3 files changed, 213 insertions(+), 136 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index c3a93fe..fdc432f 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -466,7 +466,7 @@ void cpuidle_unregister(struct cpuidle_driver *drv)
int cpu;
struct cpuidle_device *device;
- for_each_possible_cpu(cpu) {
+ for_each_cpu(cpu, drv->cpumask) {
device = &per_cpu(cpuidle_dev, cpu);
cpuidle_unregister_device(device);
}
@@ -498,7 +498,7 @@ int cpuidle_register(struct cpuidle_driver *drv,
return ret;
}
- for_each_possible_cpu(cpu) {
+ for_each_cpu(cpu, drv->cpumask) {
device = &per_cpu(cpuidle_dev, cpu);
device->cpu = cpu;
diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
index 8dfaaae..0268346 100644
--- a/drivers/cpuidle/driver.c
+++ b/drivers/cpuidle/driver.c
@@ -18,206 +18,266 @@
DEFINE_SPINLOCK(cpuidle_driver_lock);
-static void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu);
-static struct cpuidle_driver * __cpuidle_get_cpu_driver(int cpu);
+#ifdef CONFIG_CPU_IDLE_MULTIPLE_DRIVERS
-static void cpuidle_setup_broadcast_timer(void *arg)
+static DEFINE_PER_CPU(struct cpuidle_driver *, cpuidle_drivers);
+
+/**
+ * __cpuidle_get_cpu_driver: returns the cpuidle driver tied with the specified
+ * cpu.
+ *
+ * @cpu: an integer specifying the cpu number
+ *
+ * Returns a pointer to struct cpuidle_driver, NULL if no driver has been
+ * registered for this driver
+ */
+static struct cpuidle_driver *__cpuidle_get_cpu_driver(int cpu)
{
- int cpu = smp_processor_id();
- clockevents_notify((long)(arg), &cpu);
+ return per_cpu(cpuidle_drivers, cpu);
}
-static void __cpuidle_driver_init(struct cpuidle_driver *drv, int cpu)
+/**
+ * __cpuidle_set_driver: assign to the per cpu variable the driver pointer for
+ * each cpu the driver is assigned to with the cpumask.
+ *
+ * @drv: a pointer to a struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static inline int __cpuidle_set_driver(struct cpuidle_driver *drv)
{
- int i;
+ int cpu;
- drv->refcnt = 0;
+ for_each_cpu(cpu, drv->cpumask) {
- for (i = drv->state_count - 1; i >= 0 ; i--) {
+ if (__cpuidle_get_cpu_driver(cpu))
+ return -EBUSY;
- if (!(drv->states[i].flags & CPUIDLE_FLAG_TIMER_STOP))
- continue;
-
- drv->bctimer = 1;
- on_each_cpu_mask(get_cpu_mask(cpu), cpuidle_setup_broadcast_timer,
- (void *)CLOCK_EVT_NOTIFY_BROADCAST_ON, 1);
- break;
+ per_cpu(cpuidle_drivers, cpu) = drv;
}
+
+ return 0;
}
-static int __cpuidle_register_driver(struct cpuidle_driver *drv, int cpu)
+/**
+ * __cpuidle_unset_driver: for each cpu the driver is handling, set the per cpu
+ * variable driver to NULL.
+ *
+ * @drv: a pointer to a struct cpuidle_driver
+ */
+static inline void __cpuidle_unset_driver(struct cpuidle_driver *drv)
{
- if (!drv || !drv->state_count)
- return -EINVAL;
-
- if (cpuidle_disabled())
- return -ENODEV;
-
- if (__cpuidle_get_cpu_driver(cpu))
- return -EBUSY;
+ int cpu;
- __cpuidle_driver_init(drv, cpu);
+ for_each_cpu(cpu, drv->cpumask) {
- __cpuidle_set_cpu_driver(drv, cpu);
+ if (drv != __cpuidle_get_cpu_driver(cpu))
+ continue;
- return 0;
+ per_cpu(cpuidle_drivers, cpu) = NULL;
+ }
}
-static void __cpuidle_unregister_driver(struct cpuidle_driver *drv, int cpu)
-{
- if (drv != __cpuidle_get_cpu_driver(cpu))
- return;
+#else
- if (!WARN_ON(drv->refcnt > 0))
- __cpuidle_set_cpu_driver(NULL, cpu);
+static struct cpuidle_driver *cpuidle_curr_driver;
- if (drv->bctimer) {
- drv->bctimer = 0;
- on_each_cpu_mask(get_cpu_mask(cpu), cpuidle_setup_broadcast_timer,
- (void *)CLOCK_EVT_NOTIFY_BROADCAST_OFF, 1);
- }
+/**
+ * __cpuidle_get_cpu_driver: returns the global cpuidle driver pointer.
+ *
+ * @cpu: an integer specifying the cpu number, this parameter is ignored
+ *
+ * Returns a pointer to a struct cpuidle_driver, NULL if no driver was
+ * previously registered
+ */
+static inline struct cpuidle_driver *__cpuidle_get_cpu_driver(int cpu)
+{
+ return cpuidle_curr_driver;
}
-#ifdef CONFIG_CPU_IDLE_MULTIPLE_DRIVERS
+/**
+ * __cpuidle_set_driver: assign the cpuidle driver pointer to the global cpuidle
+ * driver variable.
+ *
+ * @drv: a pointer to a struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static inline int __cpuidle_set_driver(struct cpuidle_driver *drv)
+{
+ if (cpuidle_curr_driver)
+ return -EBUSY;
-static DEFINE_PER_CPU(struct cpuidle_driver *, cpuidle_drivers);
+ cpuidle_curr_driver = drv;
-static void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu)
-{
- per_cpu(cpuidle_drivers, cpu) = drv;
+ return 0;
}
-static struct cpuidle_driver *__cpuidle_get_cpu_driver(int cpu)
+/**
+ * __cpuidle_unset_driver: reset the global cpuidle driver variable if the
+ * cpuidle driver pointer match it.
+ *
+ * @drv: a pointer to a struct cpuidle_driver
+ */
+static inline void __cpuidle_unset_driver(struct cpuidle_driver *drv)
{
- return per_cpu(cpuidle_drivers, cpu);
+ if (drv == cpuidle_curr_driver)
+ cpuidle_curr_driver = NULL;
}
-static void __cpuidle_unregister_all_cpu_driver(struct cpuidle_driver *drv)
+#endif
+
+/**
+ * cpuidle_setup_broadcast_timer: set the broadcast timer notification for the
+ * current cpu. This function is called per cpu context invoked by a smp cross
+ * call. It is not supposed to be called directly.
+ *
+ * @arg: a void pointer, actually used to match the smp cross call api but used
+ * as a long with two values:
+ * - CLOCK_EVT_NOTIFY_BROADCAST_ON
+ * - CLOCK_EVT_NOTIFY_BROADCAST_OFF
+ */
+static void cpuidle_setup_broadcast_timer(void *arg)
{
- int cpu;
- for_each_present_cpu(cpu)
- __cpuidle_unregister_driver(drv, cpu);
+ int cpu = smp_processor_id();
+ clockevents_notify((long)(arg), &cpu);
}
-static int __cpuidle_register_all_cpu_driver(struct cpuidle_driver *drv)
+/**
+ * __cpuidle_driver_init: initialize the driver internal data.
+ *
+ * @drv: a valid pointer to a struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static int __cpuidle_driver_init(struct cpuidle_driver *drv)
{
- int ret = 0;
- int i, cpu;
+ int i;
- for_each_present_cpu(cpu) {
- ret = __cpuidle_register_driver(drv, cpu);
- if (ret)
- break;
- }
+ drv->refcnt = 0;
- if (ret)
- for_each_present_cpu(i) {
- if (i == cpu)
- break;
- __cpuidle_unregister_driver(drv, i);
- }
+ /*
+ * we default here to all cpu possible because if the kernel
+ * boots with some cpus offline and then we online one of them
+ * the cpu notifier won't know which driver to assign
+ */
+ if (!drv->cpumask)
+ drv->cpumask = (struct cpumask *)cpu_possible_mask;
+
+ /*
+ * we look for the timer stop flag in the different states,
+ * so know we have to setup the broadcast timer. The loop is
+ * in reverse order, because usually the deeper state has this
+ * flag set
+ */
+ for (i = drv->state_count - 1; i >= 0 ; i--) {
+ if (!(drv->states[i].flags & CPUIDLE_FLAG_TIMER_STOP))
+ continue;
- return ret;
+ drv->bctimer = 1;
+ break;
+ }
+
+ return 0;
}
-int cpuidle_register_cpu_driver(struct cpuidle_driver *drv, int cpu)
+/**
+ * __cpuidle_register_driver: do some sanity checks, initializes the driver,
+ * assign the driver to the global cpuidle driver variable(s) and setup the
+ * broadcast timer if the cpuidle driver has some states which shutdown the
+ * local timer.
+ *
+ * @drv: a valid pointer to a struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static int __cpuidle_register_driver(struct cpuidle_driver *drv)
{
int ret;
- spin_lock(&cpuidle_driver_lock);
- ret = __cpuidle_register_driver(drv, cpu);
- spin_unlock(&cpuidle_driver_lock);
+ if (!drv || !drv->state_count)
+ return -EINVAL;
- return ret;
-}
+ if (cpuidle_disabled())
+ return -ENODEV;
-void cpuidle_unregister_cpu_driver(struct cpuidle_driver *drv, int cpu)
-{
- spin_lock(&cpuidle_driver_lock);
- __cpuidle_unregister_driver(drv, cpu);
- spin_unlock(&cpuidle_driver_lock);
-}
+ ret = __cpuidle_driver_init(drv);
+ if (ret)
+ return ret;
-/**
- * cpuidle_register_driver - registers a driver
- * @drv: the driver
- */
-int cpuidle_register_driver(struct cpuidle_driver *drv)
-{
- int ret;
+ ret = __cpuidle_set_driver(drv);
+ if (ret)
+ return ret;
- spin_lock(&cpuidle_driver_lock);
- ret = __cpuidle_register_all_cpu_driver(drv);
- spin_unlock(&cpuidle_driver_lock);
+ if (drv->bctimer)
+ on_each_cpu_mask(drv->cpumask, cpuidle_setup_broadcast_timer,
+ (void *)CLOCK_EVT_NOTIFY_BROADCAST_ON, 1);
- return ret;
+ return 0;
}
-EXPORT_SYMBOL_GPL(cpuidle_register_driver);
/**
- * cpuidle_unregister_driver - unregisters a driver
- * @drv: the driver
+ * __cpuidle_unregister_driver: checks the driver is no longer in use, reset the
+ * global cpuidle driver variable(s) and disable the timer broadcast
+ * notification mechanism if it was in use.
+ *
+ * @drv: a valid pointer to a struct cpuidle_driver
+ *
*/
-void cpuidle_unregister_driver(struct cpuidle_driver *drv)
+static void __cpuidle_unregister_driver(struct cpuidle_driver *drv)
{
- spin_lock(&cpuidle_driver_lock);
- __cpuidle_unregister_all_cpu_driver(drv);
- spin_unlock(&cpuidle_driver_lock);
-}
-EXPORT_SYMBOL_GPL(cpuidle_unregister_driver);
-
-#else
-
-static struct cpuidle_driver *cpuidle_curr_driver;
+ if (WARN_ON(drv->refcnt > 0))
+ return;
-static inline void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu)
-{
- cpuidle_curr_driver = drv;
-}
+ if (drv->bctimer) {
+ drv->bctimer = 0;
+ on_each_cpu_mask(drv->cpumask, cpuidle_setup_broadcast_timer,
+ (void *)CLOCK_EVT_NOTIFY_BROADCAST_OFF, 1);
+ }
-static inline struct cpuidle_driver *__cpuidle_get_cpu_driver(int cpu)
-{
- return cpuidle_curr_driver;
+ __cpuidle_unset_driver(drv);
}
/**
- * cpuidle_register_driver - registers a driver
- * @drv: the driver
+ * cpuidle_register_driver: registers a driver by taking a lock to prevent
+ * multiple callers to [un]register a driver at the same time.
+ *
+ * @drv: a pointer to a valid struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
*/
int cpuidle_register_driver(struct cpuidle_driver *drv)
{
- int ret, cpu;
+ int ret;
- cpu = get_cpu();
spin_lock(&cpuidle_driver_lock);
- ret = __cpuidle_register_driver(drv, cpu);
+ ret = __cpuidle_register_driver(drv);
spin_unlock(&cpuidle_driver_lock);
- put_cpu();
return ret;
}
EXPORT_SYMBOL_GPL(cpuidle_register_driver);
/**
- * cpuidle_unregister_driver - unregisters a driver
- * @drv: the driver
+ * cpuidle_unregister_driver: unregisters a driver by taking a lock to prevent
+ * multiple callers to [un]register a driver at the same time. The specified
+ * driver must match the driver currently registered.
+ *
+ * @drv: a pointer to a valid struct cpuidle_driver
*/
void cpuidle_unregister_driver(struct cpuidle_driver *drv)
{
- int cpu;
-
- cpu = get_cpu();
spin_lock(&cpuidle_driver_lock);
- __cpuidle_unregister_driver(drv, cpu);
+ __cpuidle_unregister_driver(drv);
spin_unlock(&cpuidle_driver_lock);
- put_cpu();
}
EXPORT_SYMBOL_GPL(cpuidle_unregister_driver);
-#endif
/**
- * cpuidle_get_driver - return the current driver
+ * cpuidle_get_driver: returns the driver tied with the current cpu.
+ *
+ * Returns a struct cpuidle_driver pointer, or NULL if no driver is registered
*/
struct cpuidle_driver *cpuidle_get_driver(void)
{
@@ -233,7 +293,12 @@ struct cpuidle_driver *cpuidle_get_driver(void)
EXPORT_SYMBOL_GPL(cpuidle_get_driver);
/**
- * cpuidle_get_cpu_driver - return the driver tied with a cpu
+ * cpuidle_get_cpu_driver: returns the driver registered with a cpu.
+ *
+ * @dev: a valid pointer to a struct cpuidle_device
+ *
+ * Returns a struct cpuidle_driver pointer, or NULL if no driver is registered
+ * for the specified cpu
*/
struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev)
{
@@ -244,6 +309,13 @@ struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev)
}
EXPORT_SYMBOL_GPL(cpuidle_get_cpu_driver);
+/**
+ * cpuidle_driver_ref: gets a refcount for the driver. Note this function takes
+ * a refcount for the driver assigned to the current cpu.
+ *
+ * Returns a struct cpuidle_driver pointer, or NULL if no driver is registered
+ * for the current cpu
+ */
struct cpuidle_driver *cpuidle_driver_ref(void)
{
struct cpuidle_driver *drv;
@@ -257,6 +329,10 @@ struct cpuidle_driver *cpuidle_driver_ref(void)
return drv;
}
+/**
+ * cpuidle_driver_unref: puts down the refcount for the driver. Note this
+ * function decrement the refcount for the driver assigned to the current cpu.
+ */
void cpuidle_driver_unref(void)
{
struct cpuidle_driver *drv = cpuidle_get_driver();
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 8f04062..63d78b1 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -101,16 +101,20 @@ static inline int cpuidle_get_last_residency(struct cpuidle_device *dev)
****************************/
struct cpuidle_driver {
- const char *name;
- struct module *owner;
- int refcnt;
+ const char *name;
+ struct module *owner;
+ int refcnt;
/* used by the cpuidle framework to setup the broadcast timer */
- unsigned int bctimer:1;
+ unsigned int bctimer:1;
+
/* states array must be ordered in decreasing power consumption */
- struct cpuidle_state states[CPUIDLE_STATE_MAX];
- int state_count;
- int safe_state_index;
+ struct cpuidle_state states[CPUIDLE_STATE_MAX];
+ int state_count;
+ int safe_state_index;
+
+ /* the driver handles the cpus in cpumask */
+ struct cpumask *cpumask;
};
#ifdef CONFIG_CPU_IDLE
@@ -135,9 +139,6 @@ extern void cpuidle_disable_device(struct cpuidle_device *dev);
extern int cpuidle_play_dead(void);
extern struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev);
-extern int cpuidle_register_cpu_driver(struct cpuidle_driver *drv, int cpu);
-extern void cpuidle_unregister_cpu_driver(struct cpuidle_driver *drv, int cpu);
-
#else
static inline void disable_cpuidle(void) { }
static inline int cpuidle_idle_call(void) { return -ENODEV; }
--
1.7.9.5
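As an illustration of the new scheme, a cpuidle driver for one cluster of an
HMP system could be registered through the unchanged cpuidle_register() API
by filling in the cpumask field. A rough sketch with hypothetical names
(big_cluster_mask and the single WFI state are assumptions, not part of this
patch):

#include <linux/module.h>
#include <linux/cpuidle.h>
#include <asm/cpuidle.h>

/* assumed to be filled by platform code with the "big" cluster CPUs */
static struct cpumask big_cluster_mask;

static struct cpuidle_driver big_cluster_idle_driver = {
	.name		= "big_cluster_idle",
	.owner		= THIS_MODULE,
	/* only the CPUs of the "big" cluster are handled by this driver;
	 * if .cpumask is left NULL the core defaults to cpu_possible_mask */
	.cpumask	= &big_cluster_mask,
	.states		= {
		ARM_CPUIDLE_WFI_STATE,		/* state 0: plain WFI */
	},
	.state_count	= 1,
};

static int __init big_cluster_idle_init(void)
{
	/* registers the driver and one cpuidle device per CPU in ->cpumask */
	return cpuidle_register(&big_cluster_idle_driver, NULL);
}
device_initcall(big_cluster_idle_init);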
The current code considers every wakeup as spurious, which is not
correct. Handle it the same way other ARM platforms do.
Signed-off-by: Sanjay Singh Rawat <sanjay.rawat(a)linaro.org>
---
arch/arm/mach-zynq/hotplug.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/arm/mach-zynq/hotplug.c b/arch/arm/mach-zynq/hotplug.c
index c89672b..a1ab22c 100644
--- a/arch/arm/mach-zynq/hotplug.c
+++ b/arch/arm/mach-zynq/hotplug.c
@@ -67,6 +67,13 @@ static inline void zynq_platform_do_lowpower(unsigned int cpu, int *spurious)
dsb();
wfi();
+ if (pen_release == cpu_logical_map(cpu)) {
+ /*
+ * OK, proper wakeup, we're done
+ */
+ break;
+ }
+
/*
* Getting here, means that we have come out of WFI without
* having been woken up - this shouldn't happen
--
1.7.9.5
This patch series adds support for DP on the Exynos5250-based Arndale board.
It is based on the "for-next" branch of
http://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung.git
Vikas Sajjan (3):
ARM: dts: Add DT node for DP controller for Arndale Board
ARM: dts: Add clock provider information for DP controller in
Exynos5250 SoC
ARM: dts: Add display timing node to exynos5250-arndale.dts
arch/arm/boot/dts/exynos5250-arndale.dts | 26 ++++++++++++++++++++++++++
arch/arm/boot/dts/exynos5250.dtsi | 2 ++
2 files changed, 28 insertions(+)
--
1.7.9.5
Most of the stuff from kernel/sched.c was moved to kernel/sched/core.c a long
time back, and the comments/Documentation never got updated.
I noticed this while going through sched-domains.txt, and so thought of
fixing it globally.
I haven't cross-checked whether the stuff referenced in sched/core.c by all
these files is still present and unchanged, as that wasn't the motive behind
this patch.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
Documentation/cgroups/cpusets.txt | 2 +-
Documentation/rt-mutex-design.txt | 2 +-
Documentation/scheduler/sched-domains.txt | 4 ++--
Documentation/spinlocks.txt | 2 +-
Documentation/virtual/uml/UserModeLinux-HOWTO.txt | 4 ++--
arch/avr32/kernel/process.c | 2 +-
arch/cris/include/arch-v10/arch/bitops.h | 2 +-
arch/ia64/kernel/head.S | 2 +-
arch/mips/kernel/mips-mt-fpaff.c | 4 ++--
arch/mips/kernel/scall32-o32.S | 5 +++--
arch/powerpc/include/asm/mmu_context.h | 2 +-
arch/tile/include/asm/processor.h | 2 +-
arch/tile/kernel/stack.c | 2 +-
arch/um/kernel/sysrq.c | 2 +-
include/linux/completion.h | 2 +-
include/linux/perf_event.h | 2 +-
include/linux/spinlock_up.h | 2 +-
include/uapi/asm-generic/unistd.h | 2 +-
kernel/cpuset.c | 4 ++--
kernel/time.c | 2 +-
kernel/workqueue_internal.h | 2 +-
21 files changed, 27 insertions(+), 26 deletions(-)
diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt
index 12e01d4..7740038 100644
--- a/Documentation/cgroups/cpusets.txt
+++ b/Documentation/cgroups/cpusets.txt
@@ -373,7 +373,7 @@ can become very uneven.
1.7 What is sched_load_balance ?
--------------------------------
-The kernel scheduler (kernel/sched.c) automatically load balances
+The kernel scheduler (kernel/sched/core.c) automatically load balances
tasks. If one CPU is underutilized, kernel code running on that
CPU will look for tasks on other more overloaded CPUs and move those
tasks to itself, within the constraints of such placement mechanisms
diff --git a/Documentation/rt-mutex-design.txt b/Documentation/rt-mutex-design.txt
index 33ed800..a5bcd7f 100644
--- a/Documentation/rt-mutex-design.txt
+++ b/Documentation/rt-mutex-design.txt
@@ -384,7 +384,7 @@ priority back.
__rt_mutex_adjust_prio examines the result of rt_mutex_getprio, and if the
result does not equal the task's current priority, then rt_mutex_setprio
is called to adjust the priority of the task to the new priority.
-Note that rt_mutex_setprio is defined in kernel/sched.c to implement the
+Note that rt_mutex_setprio is defined in kernel/sched/core.c to implement the
actual change in priority.
It is interesting to note that __rt_mutex_adjust_prio can either increase
diff --git a/Documentation/scheduler/sched-domains.txt b/Documentation/scheduler/sched-domains.txt
index 443f0c7..4af80b1 100644
--- a/Documentation/scheduler/sched-domains.txt
+++ b/Documentation/scheduler/sched-domains.txt
@@ -25,7 +25,7 @@ is treated as one entity. The load of a group is defined as the sum of the
load of each of its member CPUs, and only when the load of a group becomes
out of balance are tasks moved between groups.
-In kernel/sched.c, trigger_load_balance() is run periodically on each CPU
+In kernel/sched/core.c, trigger_load_balance() is run periodically on each CPU
through scheduler_tick(). It raises a softirq after the next regularly scheduled
rebalancing event for the current runqueue has arrived. The actual load
balancing workhorse, run_rebalance_domains()->rebalance_domains(), is then run
@@ -62,7 +62,7 @@ struct sched_domain fields, SD_FLAG_*, SD_*_INIT to get an idea of
the specifics and what to tune.
Architectures may retain the regular override the default SD_*_INIT flags
-while using the generic domain builder in kernel/sched.c if they wish to
+while using the generic domain builder in kernel/sched/core.c if they wish to
retain the traditional SMT->SMP->NUMA topology (or some subset of that). This
can be done by #define'ing ARCH_HASH_SCHED_TUNE.
diff --git a/Documentation/spinlocks.txt b/Documentation/spinlocks.txt
index 9dbe885..97eaf57 100644
--- a/Documentation/spinlocks.txt
+++ b/Documentation/spinlocks.txt
@@ -137,7 +137,7 @@ don't block on each other (and thus there is no dead-lock wrt interrupts.
But when you do the write-lock, you have to use the irq-safe version.
For an example of being clever with rw-locks, see the "waitqueue_lock"
-handling in kernel/sched.c - nothing ever _changes_ a wait-queue from
+handling in kernel/sched/core.c - nothing ever _changes_ a wait-queue from
within an interrupt, they only read the queue in order to know whom to
wake up. So read-locks are safe (which is good: they are very common
indeed), while write-locks need to protect themselves against interrupts.
diff --git a/Documentation/virtual/uml/UserModeLinux-HOWTO.txt b/Documentation/virtual/uml/UserModeLinux-HOWTO.txt
index a5f8436..f4099ca 100644
--- a/Documentation/virtual/uml/UserModeLinux-HOWTO.txt
+++ b/Documentation/virtual/uml/UserModeLinux-HOWTO.txt
@@ -3127,7 +3127,7 @@
at process_kern.c:156
#3 0x1006a052 in switch_to (prev=0x50072000, next=0x507e8000, last=0x50072000)
at process_kern.c:161
- #4 0x10001d12 in schedule () at sched.c:777
+ #4 0x10001d12 in schedule () at core.c:777
#5 0x1006a744 in __down (sem=0x507d241c) at semaphore.c:71
#6 0x1006aa10 in __down_failed () at semaphore.c:157
#7 0x1006c5d8 in segv_handler (sc=0x5006e940) at trap_user.c:174
@@ -3191,7 +3191,7 @@
at process_kern.c:161
161 _switch_to(prev, next);
(gdb)
- #4 0x10001d12 in schedule () at sched.c:777
+ #4 0x10001d12 in schedule () at core.c:777
777 switch_to(prev, next, prev);
(gdb)
#5 0x1006a744 in __down (sem=0x507d241c) at semaphore.c:71
diff --git a/arch/avr32/kernel/process.c b/arch/avr32/kernel/process.c
index e7b6149..c273100 100644
--- a/arch/avr32/kernel/process.c
+++ b/arch/avr32/kernel/process.c
@@ -341,7 +341,7 @@ unsigned long get_wchan(struct task_struct *p)
* is actually quite ugly. It might be possible to
* determine the frame size automatically at build
* time by doing this:
- * - compile sched.c
+ * - compile sched/core.c
* - disassemble the resulting sched.o
* - look for 'sub sp,??' shortly after '<schedule>:'
*/
diff --git a/arch/cris/include/arch-v10/arch/bitops.h b/arch/cris/include/arch-v10/arch/bitops.h
index be85f6d..03d9cfd 100644
--- a/arch/cris/include/arch-v10/arch/bitops.h
+++ b/arch/cris/include/arch-v10/arch/bitops.h
@@ -17,7 +17,7 @@ static inline unsigned long cris_swapnwbrlz(unsigned long w)
in another register:
! __asm__ ("swapnwbr %2\n\tlz %2,%0"
! : "=r,r" (res), "=r,X" (dummy) : "1,0" (w));
- confuses gcc (sched.c, gcc from cris-dist-1.14). */
+ confuses gcc (core.c, gcc from cris-dist-1.14). */
unsigned long res;
__asm__ ("swapnwbr %0 \n\t"
diff --git a/arch/ia64/kernel/head.S b/arch/ia64/kernel/head.S
index 9be4e49..991ca33 100644
--- a/arch/ia64/kernel/head.S
+++ b/arch/ia64/kernel/head.S
@@ -1035,7 +1035,7 @@ END(ia64_delay_loop)
* Return a CPU-local timestamp in nano-seconds. This timestamp is
* NOT synchronized across CPUs its return value must never be
* compared against the values returned on another CPU. The usage in
- * kernel/sched.c ensures that.
+ * kernel/sched/core.c ensures that.
*
* The return-value of sched_clock() is NOT supposed to wrap-around.
* If it did, it would cause some scheduling hiccups (at the worst).
diff --git a/arch/mips/kernel/mips-mt-fpaff.c b/arch/mips/kernel/mips-mt-fpaff.c
index fd814e0..cb09862 100644
--- a/arch/mips/kernel/mips-mt-fpaff.c
+++ b/arch/mips/kernel/mips-mt-fpaff.c
@@ -27,12 +27,12 @@ unsigned long mt_fpemul_threshold;
* FPU affinity with the user's requested processor affinity.
* This code is 98% identical with the sys_sched_setaffinity()
* and sys_sched_getaffinity() system calls, and should be
- * updated when kernel/sched.c changes.
+ * updated when kernel/sched/core.c changes.
*/
/*
* find_process_by_pid - find a process with a matching PID value.
- * used in sys_sched_set/getaffinity() in kernel/sched.c, so
+ * used in sys_sched_set/getaffinity() in kernel/sched/core.c, so
* cloned here.
*/
static inline struct task_struct *find_process_by_pid(pid_t pid)
diff --git a/arch/mips/kernel/scall32-o32.S b/arch/mips/kernel/scall32-o32.S
index 9b36424..e9127ec 100644
--- a/arch/mips/kernel/scall32-o32.S
+++ b/arch/mips/kernel/scall32-o32.S
@@ -476,8 +476,9 @@ einval: li v0, -ENOSYS
/*
* For FPU affinity scheduling on MIPS MT processors, we need to
* intercept sys_sched_xxxaffinity() calls until we get a proper hook
- * in kernel/sched.c. Considered only temporary we only support these
- * hooks for the 32-bit kernel - there is no MIPS64 MT processor atm.
+ * in kernel/sched/core.c. Considered only temporary we only support
+ * these hooks for the 32-bit kernel - there is no MIPS64 MT processor
+ * atm.
*/
sys mipsmt_sys_sched_setaffinity 3
sys mipsmt_sys_sched_getaffinity 3
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index a73668a..b467530 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -38,7 +38,7 @@ extern void drop_cop(unsigned long acop, struct mm_struct *mm);
/*
* switch_mm is the entry point called from the architecture independent
- * code in kernel/sched.c
+ * code in kernel/sched/core.c
*/
static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
struct task_struct *tsk)
diff --git a/arch/tile/include/asm/processor.h b/arch/tile/include/asm/processor.h
index 2b70dfb..b3f1049 100644
--- a/arch/tile/include/asm/processor.h
+++ b/arch/tile/include/asm/processor.h
@@ -225,7 +225,7 @@ extern int do_work_pending(struct pt_regs *regs, u32 flags);
/*
* Return saved (kernel) PC of a blocked thread.
- * Only used in a printk() in kernel/sched.c, so don't work too hard.
+ * Only used in a printk() in kernel/sched/core.c, so don't work too hard.
*/
#define thread_saved_pc(t) ((t)->thread.pc)
diff --git a/arch/tile/kernel/stack.c b/arch/tile/kernel/stack.c
index ed258b8..af8dfc9 100644
--- a/arch/tile/kernel/stack.c
+++ b/arch/tile/kernel/stack.c
@@ -442,7 +442,7 @@ void _KBacktraceIterator_init_current(struct KBacktraceIterator *kbt, ulong pc,
regs_to_pt_regs(&regs, pc, lr, sp, r52));
}
-/* This is called only from kernel/sched.c, with esp == NULL */
+/* This is called only from kernel/sched/core.c, with esp == NULL */
void show_stack(struct task_struct *task, unsigned long *esp)
{
struct KBacktraceIterator kbt;
diff --git a/arch/um/kernel/sysrq.c b/arch/um/kernel/sysrq.c
index 7d101a2..0dc4d1c 100644
--- a/arch/um/kernel/sysrq.c
+++ b/arch/um/kernel/sysrq.c
@@ -39,7 +39,7 @@ void show_trace(struct task_struct *task, unsigned long * stack)
static const int kstack_depth_to_print = 24;
/* This recently started being used in arch-independent code too, as in
- * kernel/sched.c.*/
+ * kernel/sched/core.c.*/
void show_stack(struct task_struct *task, unsigned long *esp)
{
unsigned long *stack;
diff --git a/include/linux/completion.h b/include/linux/completion.h
index 33f0280..3cd574d 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -5,7 +5,7 @@
* (C) Copyright 2001 Linus Torvalds
*
* Atomic wait-for-completion handler data structures.
- * See kernel/sched.c for details.
+ * See kernel/sched/core.c for details.
*/
#include <linux/wait.h>
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f463a46..5ec99e5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -803,7 +803,7 @@ static inline void perf_restore_debug_store(void) { }
#define perf_output_put(handle, x) perf_output_copy((handle), &(x), sizeof(x))
/*
- * This has to have a higher priority than migration_notifier in sched.c.
+ * This has to have a higher priority than migration_notifier in sched/core.c.
*/
#define perf_cpu_notifier(fn) \
do { \
diff --git a/include/linux/spinlock_up.h b/include/linux/spinlock_up.h
index e2369c1..8b3ac0d 100644
--- a/include/linux/spinlock_up.h
+++ b/include/linux/spinlock_up.h
@@ -67,7 +67,7 @@ static inline void arch_spin_unlock(arch_spinlock_t *lock)
#else /* DEBUG_SPINLOCK */
#define arch_spin_is_locked(lock) ((void)(lock), 0)
-/* for sched.c and kernel_lock.c: */
+/* for sched/core.c and kernel_lock.c: */
# define arch_spin_lock(lock) do { barrier(); (void)(lock); } while (0)
# define arch_spin_lock_flags(lock, flags) do { barrier(); (void)(lock); } while (0)
# define arch_spin_unlock(lock) do { barrier(); (void)(lock); } while (0)
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 0cc74c4..a20a9b4 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -361,7 +361,7 @@ __SYSCALL(__NR_syslog, sys_syslog)
#define __NR_ptrace 117
__SYSCALL(__NR_ptrace, sys_ptrace)
-/* kernel/sched.c */
+/* kernel/sched/core.c */
#define __NR_sched_setparam 118
__SYSCALL(__NR_sched_setparam, sys_sched_setparam)
#define __NR_sched_setscheduler 119
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 64b3f79..902d13f 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -540,7 +540,7 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr,
* This function builds a partial partition of the systems CPUs
* A 'partial partition' is a set of non-overlapping subsets whose
* union is a subset of that set.
- * The output of this function needs to be passed to kernel/sched.c
+ * The output of this function needs to be passed to kernel/sched/core.c
* partition_sched_domains() routine, which will rebuild the scheduler's
* load balancing domains (sched domains) as specified by that partial
* partition.
@@ -569,7 +569,7 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr,
* is a subset of one of these domains, while there are as
* many such domains as possible, each as small as possible.
* doms - Conversion of 'csa' to an array of cpumasks, for passing to
- * the kernel/sched.c routine partition_sched_domains() in a
+ * the kernel/sched/core.c routine partition_sched_domains() in a
* convenient format, that can be easily compared to the prior
* value to determine what partition elements (sched domains)
* were changed (added or removed.)
diff --git a/kernel/time.c b/kernel/time.c
index d3617db..7c7964c 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -11,7 +11,7 @@
* Modification history kernel/time.c
*
* 1993-09-02 Philip Gladstone
- * Created file with time related functions from sched.c and adjtimex()
+ * Created file with time related functions from sched/core.c and adjtimex()
* 1993-10-08 Torsten Duwe
* adjtime interface update and CMOS clock write code
* 1995-08-13 Torsten Duwe
diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h
index ad83c96..7e2204d 100644
--- a/kernel/workqueue_internal.h
+++ b/kernel/workqueue_internal.h
@@ -64,7 +64,7 @@ static inline struct worker *current_wq_worker(void)
/*
* Scheduler hooks for concurrency managed workqueue. Only to be used from
- * sched.c and workqueue.c.
+ * sched/core.c and workqueue.c.
*/
void wq_worker_waking_up(struct task_struct *task, int cpu);
struct task_struct *wq_worker_sleeping(struct task_struct *task, int cpu);
--
1.7.12.rc2.18.g61b472e
I have faced a sequence where the Idle Load Balance was sometimes not
triggered for a while on my platform.
CPU 0 and CPU 1 are running tasks and CPU 2 is idle
CPU 1 kicks the Idle Load Balance
CPU 1 selects CPU 2 as the new Idle Load Balancer
CPU 1 sets NOHZ_BALANCE_KICK for CPU 2
CPU 1 sends a reschedule IPI to CPU 2
While CPU 2 is waking up, CPU 0 or CPU 1 migrates a waking task A to CPU 2
CPU 2 finally wakes up, runs task A and discards the Idle Load Balance
Task A quickly goes back to sleep (before a tick occurs on CPU 2)
CPU 2 goes back to idle with NOHZ_BALANCE_KICK set
Whenever CPU 2 is then selected as the ILB, no reschedule IPI will be sent
because NOHZ_BALANCE_KICK is already set, and no Idle Load Balance will be
performed.
We must wait for the sched softirq to be raised on CPU 2 by another part of
the kernel in order to clear NOHZ_BALANCE_KICK.
The proposed solution clears NOHZ_BALANCE_KICK in scheduler_ipi() if
we can't raise the sched softirq for the Idle Load Balance.
Change since V1:
- moved the clearing of NOHZ_BALANCE_KICK into got_nohz_idle_kick() when the
ILB can't run on this CPU (as suggested by Peter)
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
kernel/sched/core.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 58453b8..919bee6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -633,7 +633,19 @@ void wake_up_nohz_cpu(int cpu)
static inline bool got_nohz_idle_kick(void)
{
int cpu = smp_processor_id();
- return idle_cpu(cpu) && test_bit(NOHZ_BALANCE_KICK, nohz_flags(cpu));
+
+ if (!test_bit(NOHZ_BALANCE_KICK, nohz_flags(cpu)))
+ return false;
+
+ if (idle_cpu(cpu) && !need_resched())
+ return true;
+
+ /*
+ * We can't run Idle Load Balance on this CPU for this time so we
+ * cancel it and clear NOHZ_BALANCE_KICK
+ */
+ clear_bit(NOHZ_BALANCE_KICK, nohz_flags(cpu));
+ return false;
}
#else /* CONFIG_NO_HZ_COMMON */
@@ -1393,8 +1405,9 @@ static void sched_ttwu_pending(void)
void scheduler_ipi(void)
{
- if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick()
- && !tick_nohz_full_cpu(smp_processor_id()))
+ if (llist_empty(&this_rq()->wake_list)
+ && !tick_nohz_full_cpu(smp_processor_id())
+ && !got_nohz_idle_kick())
return;
/*
@@ -1417,7 +1430,7 @@ void scheduler_ipi(void)
/*
* Check if someone kicked us for doing the nohz idle load balance.
*/
- if (unlikely(got_nohz_idle_kick() && !need_resched())) {
+ if (unlikely(got_nohz_idle_kick())) {
this_rq()->idle_balance = 1;
raise_softirq_irqoff(SCHED_SOFTIRQ);
}
--
1.7.9.5