Hi Guys,
This patchset adds APIs in the OPP layer to allow OPP transitions from
within the OPP layer. Currently, all OPP users need to replicate the
same code to switch between OPPs, while the same can be handled easily
by the OPP core.
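As a rough sketch of the intent (only dev_pm_opp_set_rate() below comes
from this series; the driver structure and frequency-table lookup are
hypothetical), a user's frequency-switch path collapses to a single
call:

#include <linux/cpufreq.h>
#include <linux/pm_opp.h>

struct private_data {			/* hypothetical driver state */
	struct device *cpu_dev;
	struct cpufreq_frequency_table *freq_table;
};

static int sketch_target_index(struct cpufreq_policy *policy,
			       unsigned int index)
{
	struct private_data *priv = policy->driver_data;
	unsigned long freq_hz = priv->freq_table[index].frequency * 1000UL;

	/*
	 * The OPP core looks up the matching OPP and sequences the
	 * regulator and clock updates; no per-driver replication needed.
	 */
	return dev_pm_opp_set_rate(priv->cpu_dev, freq_hz);
}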
The first 7 patches update the OPP core to introduce the new APIs and
the next 9 patches update cpufreq-dt accordingly.
11 out of 17 patches already carry Stephen's Reviewed-by tag; only a
few are left :)
I hope this is the last version of the series :)
Testing:
- Tested on exynos 5250-arndale (dual-cortex-A15)
- Tested for both old V1 bindings and new V2 bindings
- Tested with regulator names 'cpu-supply' and 'cpu0-supply'
- Tested with unsupported supply ranges as well, to check the
  opp-disable logic
V2->V3:
- Very minor updates.
- find_supply_name() doesn't return an error value now, and so its
  callers don't check for it.
- Consequently, we no longer need to initialize 'name' to NULL.
Viresh Kumar (16):
PM / OPP: get/put regulators from OPP core
PM / OPP: Disable OPPs that aren't supported by the regulator
PM / OPP: Introduce dev_pm_opp_get_max_volt_latency()
PM / OPP: Introduce dev_pm_opp_get_max_transition_latency()
PM / OPP: Parse clock-latency and voltage-tolerance for v1 bindings
PM / OPP: Manage device clk
PM / OPP: Add dev_pm_opp_set_rate()
cpufreq: dt: Convert few pr_debug/err() calls to dev_dbg/err()
cpufreq: dt: Rename 'need_update' to 'opp_v1'
cpufreq: dt: OPP layers handles clock-latency for V1 bindings as well
cpufreq: dt: Pass regulator name to the OPP core
cpufreq: dt: Unsupported OPPs are already disabled
cpufreq: dt: Reuse dev_pm_opp_get_max_transition_latency()
cpufreq: dt: Use dev_pm_opp_set_rate() to switch frequency
cpufreq: dt: No need to fetch voltage-tolerance
cpufreq: dt: No need to allocate resources anymore
drivers/base/power/opp/core.c | 420 ++++++++++++++++++++++++++++++++++++++++++
drivers/base/power/opp/opp.h | 13 ++
drivers/cpufreq/cpufreq-dt.c | 300 +++++++++++-------------------
include/linux/pm_opp.h | 27 +++
4 files changed, 565 insertions(+), 195 deletions(-)
--
2.7.1.370.gb2aa7f8
Jason,
Could you please review my patch below?
See also arm64 maintainer's comment:
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-January/313712.h…
Thanks,
-Takahiro AKASHI
I tried to verify kgdb in the vanilla kernel on a fast model, but it
seems that single stepping with kgdb hasn't worked correctly since its
first appearance in v3.15.
On v3.15, a 'stepi' command after breaking the kernel at some breakpoint
steps forward to the next instruction, but the succeeding 'stepi' never
goes beyond that.
On v3.16, 'stepi' moves forward and stops at the next instruction just
after enable_dbg in el1_dbg, and never goes beyond that. This change in
behavior seems to have been introduced by the following patch in v3.16:
commit 2a2830703a23 ("arm64: debug: avoid accessing mdscr_el1 on fault
paths where possible")
This patch
(1) moves kgdb_disable_single_step() from the 'c' command handling to
the single-step handler.
This ensures that single stepping takes effect at every 's' command.
Please note that, under the current implementation, the single-step bit
in spsr, which is cleared by the first single step, will not be set
again for consecutive 's' commands because the single-step bit in mdscr
is still kept on (that is, kernel_active_single_step() in
kgdb_arch_handle_exception() is true).
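(A hedged illustration of that interplay, using the existing arm64
debug-monitors helpers; kgdb_arm_step() is a made-up name, not code
from this patch:)

#include <asm/debug-monitors.h>
#include <asm/ptrace.h>

/* Hypothetical helper, illustrative only. */
static void kgdb_arm_step(struct pt_regs *regs)
{
	/*
	 * kernel_enable_single_step() sets both MDSCR_EL1.SS and SPSR.SS.
	 * The step exception clears only SPSR.SS, so this guard is why
	 * consecutive 's' commands stall: MDSCR_EL1.SS is still set,
	 * kernel_active_single_step() stays true, and SPSR.SS is never
	 * re-armed.
	 */
	if (!kernel_active_single_step())
		kernel_enable_single_step(regs);
}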
(2) re-implements kgdb_roundup_cpus() because the current implementation
enables interrupts naively. See below.
(3) removes 'enable_dbg' in el1_dbg.
The single-step bit in mdscr is turned on in do_handle_exception()->
kgdb_handle_exception() before returning to the debugged context, and if
debug exceptions are enabled in el1_dbg, we will see unexpected single-
stepping in el1_dbg.
Since v3.18, the following patch does the same:
commit 1059c6bf8534 ("arm64: debug: don't re-enable debug exceptions
on return from el1_dbg")
(4) masks interrupts while single-stepping one instruction.
If an interrupt is taken while processing a single step, debug
exceptions are unintentionally enabled by el1_irq's 'enable_dbg' before
returning to the debugged context.
Thus, as in (3), we will see unexpected single-stepping in el1_irq.
Basically (1) and (2) are for v3.15, (3) and (4) for v3.1[67].
* issue fixed by (2):
Without (2), we would see another problem if a breakpoint is set at an
interrupt-sensitive place, like gic_handle_irq():
KGDB: re-enter error: breakpoint removed ffffffc000081258
------------[ cut here ]------------
WARNING: CPU: 0 PID: 650 at kernel/debug/debug_core.c:435
kgdb_handle_exception+0x1dc/0x1f4()
Modules linked in:
CPU: 0 PID: 650 Comm: sh Not tainted 3.17.0-rc2+ #177
Call trace:
[<ffffffc000087fac>] dump_backtrace+0x0/0x130
[<ffffffc0000880ec>] show_stack+0x10/0x1c
[<ffffffc0004d683c>] dump_stack+0x74/0xb8
[<ffffffc0000ab824>] warn_slowpath_common+0x8c/0xb4
[<ffffffc0000ab90c>] warn_slowpath_null+0x14/0x20
[<ffffffc000121bfc>] kgdb_handle_exception+0x1d8/0x1f4
[<ffffffc000092ffc>] kgdb_brk_fn+0x18/0x28
[<ffffffc0000821c8>] brk_handler+0x9c/0xe8
[<ffffffc0000811e8>] do_debug_exception+0x3c/0xac
Exception stack(0xffffffc07e027650 to 0xffffffc07e027770)
...
[<ffffffc000083cac>] el1_dbg+0x14/0x68
[<ffffffc00012178c>] kgdb_cpu_enter+0x464/0x5c0
[<ffffffc000121bb4>] kgdb_handle_exception+0x190/0x1f4
[<ffffffc000092ffc>] kgdb_brk_fn+0x18/0x28
[<ffffffc0000821c8>] brk_handler+0x9c/0xe8
[<ffffffc0000811e8>] do_debug_exception+0x3c/0xac
Exception stack(0xffffffc07e027ac0 to 0xffffffc07e027be0)
...
[<ffffffc000083cac>] el1_dbg+0x14/0x68
[<ffffffc00032e4b4>] __handle_sysrq+0x11c/0x190
[<ffffffc00032e93c>] write_sysrq_trigger+0x4c/0x60
[<ffffffc0001e7d58>] proc_reg_write+0x54/0x84
[<ffffffc000192fa4>] vfs_write+0x98/0x1c8
[<ffffffc0001939b0>] SyS_write+0x40/0xa0
Once an interrupt occurs, the breakpoint at gic_handle_irq() triggers
kgdb, which then calls kgdb_roundup_cpus() to sync with the other CPUs.
The current kgdb_roundup_cpus() temporarily unmasks interrupts in order
to use smp_call_function().
This eventually allows another interrupt to occur, and likely results in
hitting the breakpoint at gic_handle_irq() again, since debug exceptions
are always enabled in el1_irq.
We can avoid this issue by specifying "nokgdbroundup" on the kernel
command line, but that would also leave the other CPUs in an unknown
state as far as kgdb is concerned, and may interfere with kgdb activity.
Signed-off-by: AKASHI Takahiro <takahiro.akashi(a)linaro.org>
---
arch/arm64/kernel/kgdb.c | 60 +++++++++++++++++++++++++++++++++++-----------
1 file changed, 46 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index a0d10c5..81b5910 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -19,9 +19,13 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
+#include <linux/cpumask.h>
#include <linux/irq.h>
+#include <linux/irq_work.h>
#include <linux/kdebug.h>
#include <linux/kgdb.h>
+#include <linux/percpu.h>
+#include <asm/ptrace.h>
#include <asm/traps.h>
struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
@@ -95,6 +99,9 @@ struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
{ "fpcr", 4, -1 },
};
+static DEFINE_PER_CPU(unsigned int, kgdb_pstate);
+static DEFINE_PER_CPU(struct irq_work, kgdb_irq_work);
+
char *dbg_get_reg(int regno, void *mem, struct pt_regs *regs)
{
if (regno >= DBG_MAX_REG_NUM || regno < 0)
@@ -176,18 +183,14 @@ int kgdb_arch_handle_exception(int exception_vector, int signo,
* over and over again.
*/
kgdb_arch_update_addr(linux_regs, remcom_in_buffer);
- atomic_set(&kgdb_cpu_doing_single_step, -1);
- kgdb_single_step = 0;
-
- /*
- * Received continue command, disable single step
- */
- if (kernel_active_single_step())
- kernel_disable_single_step();
err = 0;
break;
case 's':
+ /* mask interrupts while single stepping */
+ __this_cpu_write(kgdb_pstate, linux_regs->pstate);
+ linux_regs->pstate |= PSR_I_BIT;
+
/*
* Update step address value with address passed
* with step packet.
@@ -198,8 +201,6 @@ int kgdb_arch_handle_exception(int exception_vector, int signo,
*/
kgdb_arch_update_addr(linux_regs, remcom_in_buffer);
atomic_set(&kgdb_cpu_doing_single_step, raw_smp_processor_id());
- kgdb_single_step = 1;
-
/*
* Enable single step handling
*/
@@ -229,6 +230,18 @@ static int kgdb_compiled_brk_fn(struct pt_regs *regs, unsigned int esr)
static int kgdb_step_brk_fn(struct pt_regs *regs, unsigned int esr)
{
+ unsigned int pstate;
+
+ kernel_disable_single_step();
+ atomic_set(&kgdb_cpu_doing_single_step, -1);
+
+ /* restore interrupt mask status */
+ pstate = __this_cpu_read(kgdb_pstate);
+ if (pstate & PSR_I_BIT)
+ regs->pstate |= PSR_I_BIT;
+ else
+ regs->pstate &= ~PSR_I_BIT;
+
kgdb_handle_exception(1, SIGTRAP, 0, regs);
return 0;
}
@@ -249,16 +262,27 @@ static struct step_hook kgdb_step_hook = {
.fn = kgdb_step_brk_fn
};
-static void kgdb_call_nmi_hook(void *ignored)
+static void kgdb_roundup_hook(struct irq_work *work)
{
kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs());
}
void kgdb_roundup_cpus(unsigned long flags)
{
- local_irq_enable();
- smp_call_function(kgdb_call_nmi_hook, NULL, 0);
- local_irq_disable();
+ int cpu;
+ struct cpumask mask;
+ struct irq_work *work;
+
+ mask = *cpu_online_mask;
+ cpumask_clear_cpu(smp_processor_id(), &mask);
+ cpu = cpumask_first(&mask);
+ if (cpu >= nr_cpu_ids)
+ return;
+
+ for_each_cpu(cpu, &mask) {
+ work = per_cpu_ptr(&kgdb_irq_work, cpu);
+ irq_work_queue_on(work, cpu);
+ }
}
static int __kgdb_notify(struct die_args *args, unsigned long cmd)
@@ -299,6 +323,8 @@ static struct notifier_block kgdb_notifier = {
int kgdb_arch_init(void)
{
int ret = register_die_notifier(&kgdb_notifier);
+ int cpu;
+ struct irq_work *work;
if (ret != 0)
return ret;
@@ -306,6 +332,12 @@ int kgdb_arch_init(void)
register_break_hook(&kgdb_brkpt_hook);
register_break_hook(&kgdb_compiled_brkpt_hook);
register_step_hook(&kgdb_step_hook);
+
+ for_each_possible_cpu(cpu) {
+ work = per_cpu_ptr(&kgdb_irq_work, cpu);
+ init_irq_work(work, kgdb_roundup_hook);
+ }
+
return 0;
}
--
1.7.9.5
Hi Ard,
KASLR has been backported to LSK v4.4 on
git://git.linaro.org/kernel/linux-linaro-stable.git v4.4/topic/mm-kaslr
from v4.4.9 up to 5dd612e parisc: Use generic extable search and sort ..
Could you help review the backported commits, especially the 'Conflicts'
commits:
44b9620 arm64: kvm: deal with kernel symbols outside of linear mapping
2894f32 arm64: prevent potential circular header dependencies in asm/bug.h
d277954 Eliminate the .eh_frame sections from the aarch64 vmlinux and kernel modules
358e3c8 arm64: Use PoU cache instr for I/D coherency
Thanks
Alex
This patchset modifies the GIC driver to allow it, on supported
platforms, to route IPI interrupts to FIQ. It then uses this
feature to implement arch_trigger_all_cpu_backtrace for ARM.
In order to deliver the ARM changes neatly, we also rearrange some of
the existing x86 NMI code to make it architecture-neutral.
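As a usage illustration (the wrapper below is made up;
trigger_all_cpu_backtrace() is the existing generic entry point that
this series gives ARM a backend for):

#include <linux/nmi.h>
#include <linux/printk.h>

/* Hypothetical debug helper, illustrative only. */
static void dump_all_cpus(void)
{
	/*
	 * With this series applied, ARM implements
	 * arch_trigger_all_cpu_backtrace(), delivered via an IPI raised
	 * as FIQ where the GIC supports it, so even CPUs spinning with
	 * IRQs masked can be backtraced.
	 */
	if (!trigger_all_cpu_backtrace())
		pr_warn("all-CPU backtrace not supported on this arch\n");
}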
The patches have been runtime tested on both a system capable of
supporting FIQ (Freescale i.MX6) and one that is not (Qualcomm
Snapdragon 600). In addition, older versions of this patchset have
been tested on STiH416 and vexpress-a9. The changes to the x86 logic
were tested using QEMU.
v21:
* Change the way SGIs are raised to try to increase robustness when
starting secondary cores. This is a theoretical fix for a regression
reported by Mark Rutland on vexpress-tc2, but it also allows us to
remove igroup0_shadow entirely since it is no longer needed.
* Fix a couple of variable names and add comments to describe the
hardware behavior better (Mark Rutland).
* Improve MULTI_IRQ_HANDLER support by clearing FIQs using
handle_arch_irq (Marc Zyngier).
* Fix gic_cpu_if_down() to ensure group 1 interrupts are disabled when
the interface is brought down.
For changes in v20 and earlier see:
http://thread.gmane.org/gmane.linux.kernel/1928465
Daniel Thompson (6):
irqchip: gic: Optimize locking in gic_raise_softirq
irqchip: gic: Make gic_raise_softirq FIQ-safe
irqchip: gic: Introduce plumbing for IPI FIQ
printk: Simple implementation for NMI backtracing
x86/nmi: Use common printk functions
ARM: Add support for on-demand backtrace of other CPUs
arch/arm/Kconfig | 1 +
arch/arm/include/asm/hardirq.h | 2 +-
arch/arm/include/asm/irq.h | 5 +
arch/arm/include/asm/smp.h | 3 +
arch/arm/kernel/smp.c | 82 +++++++++++++++
arch/arm/kernel/traps.c | 13 ++-
arch/x86/Kconfig | 1 +
arch/x86/kernel/apic/hw_nmi.c | 104 ++-----------------
drivers/irqchip/irq-gic.c | 220 +++++++++++++++++++++++++++++++++++++---
include/linux/irqchip/arm-gic.h | 6 ++
include/linux/printk.h | 20 ++++
init/Kconfig | 3 +
kernel/printk/Makefile | 1 +
kernel/printk/nmi_backtrace.c | 147 +++++++++++++++++++++++++++
14 files changed, 495 insertions(+), 113 deletions(-)
create mode 100644 kernel/printk/nmi_backtrace.c
--
2.4.3
Hi Rafael,
This series aims to make traversal of the cpufreq frequency table more
efficient for platforms that already sort their frequency tables. The
cpufreq core now checks whether the freq table is sorted and applies a
different set of helpers to such tables while traversing them.
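For illustration, here is a minimal sketch of why a known-ascending
table is cheaper to search (a hypothetical helper, not the exact API
added by this series; invalid entries are ignored for brevity):

#include <linux/cpufreq.h>

/*
 * Find the index of the highest frequency <= target in a table known
 * to be sorted in ascending order: the scan can stop at the first
 * entry above the target instead of walking the whole unordered table.
 */
static int sketch_find_index_ascending(struct cpufreq_frequency_table *table,
				       unsigned int target_freq)
{
	int i, best = 0;

	for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
		if (table[i].frequency > target_freq)
			break;			/* sorted: done early */
		best = i;
	}

	return best;
}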
All the patches are pushed here for testing in case anyone wants to try:
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/sorted-freq-table
V3->V4:
- s/cpufreq_table_find_index_unsorted/cpufreq_table_index_unsorted
- return error for duplicate frequencies
- freq-table-sorted is an enum now
- Remove freq_is_invalid() and clamp the frequencies at the top of a
few helpers.
- Remove a few checks which were impossible to hit (best == -1).
Viresh Kumar (2):
cpufreq: Handle sorted frequency tables more efficiently
cpufreq: Reuse new freq-table helpers
drivers/cpufreq/acpi-cpufreq.c | 14 +-
drivers/cpufreq/amd_freq_sensitivity.c | 4 +-
drivers/cpufreq/cpufreq_ondemand.c | 6 +-
drivers/cpufreq/freq_table.c | 73 ++++++++--
drivers/cpufreq/powernv-cpufreq.c | 3 +-
drivers/cpufreq/s5pv210-cpufreq.c | 3 +-
include/linux/cpufreq.h | 234 ++++++++++++++++++++++++++++++++-
7 files changed, 306 insertions(+), 31 deletions(-)
--
2.7.4
Hi Anders,
I resolved some conflicts when merging LSK 4.1 into LSK 4.1-rt. Could
you review it?
The merge branch is pushed to the LSK linux-linaro-lsk-v4.1-rt-test
branch.
Thanks
Alex
---
commit 72316324660f953a8daee9bf9fcf942f7851b516
Merge: e91b1cc 7c0ca54
Author: Alex Shi <alex.shi(a)linaro.org>
Date: Fri Jun 17 16:48:40 2016 +0800
Merge branch 'linux-linaro-lsk-v4.1' into linux-linaro-lsk-v4.1-rt
Conflicts:
backport cg-writeback in include/linux/cgroup.h
compatible with 28f83d2 futex: Handle unlock_pi race gracefully
in kernel/futex.c
diff --cc arch/arm64/Kconfig
index 1f05b35,2b5b0c5..d655358
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@@ -71,10 -72,9 +72,11 @@@ config ARM6
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select HAVE_RCU_TABLE_FREE
+ select HAVE_PREEMPT_LAZY
select HAVE_SYSCALL_TRACEPOINTS
+ select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
+ select IRQ_FORCED_THREADING
select MODULES_USE_ELF_RELA
select NO_BOOTMEM
select OF
diff --cc include/linux/cgroup.h
index e3f81b7,56e7af9..1e6f683
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@@ -16,24 -16,266 +16,268 @@@
#include <linux/fs.h>
#include <linux/seq_file.h>
#include <linux/kernfs.h>
+#include <linux/wait.h>
+#include <linux/work-simple.h>
+ #include <linux/jump_label.h>
#include <linux/cgroup-defs.h>
-
#ifdef CONFIG_CGROUPS
+ /*
+ * All weight knobs on the default hierarhcy should use the following min,
+ * default and max values. The default value is the logarithmic center of
+ * MIN and MAX and allows 100x to be expressed in both directions.
+ */
+ #define CGROUP_WEIGHT_MIN 1
+ #define CGROUP_WEIGHT_DFL 100
+ #define CGROUP_WEIGHT_MAX 10000
+
+ /* a css_task_iter should be treated as an opaque object */
+ struct css_task_iter {
+ struct cgroup_subsys *ss;
+
+ struct list_head *cset_pos;
+ struct list_head *cset_head;
+
+ struct list_head *task_pos;
+ struct list_head *tasks_head;
+ struct list_head *mg_tasks_head;
+
+ struct css_set *cur_cset;
+ struct task_struct *cur_task;
+ struct list_head iters_node; /* css_set->task_iters */
+ };
- extern int cgroup_init_early(void);
- extern int cgroup_init(void);
- extern void cgroup_fork(struct task_struct *p);
- extern void cgroup_post_fork(struct task_struct *p);
- extern void cgroup_exit(struct task_struct *p);
- extern int cgroupstats_build(struct cgroupstats *stats,
- struct dentry *dentry);
+ extern struct cgroup_root cgrp_dfl_root;
+ extern struct css_set init_css_set;
+
+ #define SUBSYS(_x) extern struct cgroup_subsys _x ## _cgrp_subsys;
+ #include <linux/cgroup_subsys.h>
+ #undef SUBSYS
- extern int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
- struct pid *pid, struct task_struct *tsk);
+ #define SUBSYS(_x) \
+ extern struct static_key_true _x ## _cgrp_subsys_enabled_key; \
+ extern struct static_key_true _x ## _cgrp_subsys_on_dfl_key;
+ #include <linux/cgroup_subsys.h>
+ #undef SUBSYS
+
+ /**
+ * cgroup_subsys_enabled - fast test on whether a subsys is enabled
+ * @ss: subsystem in question
+ */
+ #define cgroup_subsys_enabled(ss) \
+ static_branch_likely(&ss ## _enabled_key)
+
+ /**
+ * cgroup_subsys_on_dfl - fast test on whether a subsys is on default hierarchy
+ * @ss: subsystem in question
+ */
+ #define cgroup_subsys_on_dfl(ss) \
+ static_branch_likely(&ss ## _on_dfl_key)
+
+ bool css_has_online_children(struct cgroup_subsys_state *css);
+ struct cgroup_subsys_state *css_from_id(int id, struct cgroup_subsys *ss);
+ struct cgroup_subsys_state *cgroup_get_e_css(struct cgroup *cgroup,
+ struct cgroup_subsys *ss);
+ struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry,
+ struct cgroup_subsys *ss);
+
+ bool cgroup_is_descendant(struct cgroup *cgrp, struct cgroup *ancestor);
+ int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
+ int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
+
+ int cgroup_add_dfl_cftypes(struct cgroup_subsys *ss, struct cftype *cfts);
+ int cgroup_add_legacy_cftypes(struct cgroup_subsys *ss, struct cftype *cfts);
+ int cgroup_rm_cftypes(struct cftype *cfts);
+ void cgroup_file_notify(struct cgroup_file *cfile);
+
+ char *task_cgroup_path(struct task_struct *task, char *buf, size_t buflen);
+ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry);
+ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
+ struct pid *pid, struct task_struct *tsk);
+
+ void cgroup_fork(struct task_struct *p);
+ extern int cgroup_can_fork(struct task_struct *p,
+ void *ss_priv[CGROUP_CANFORK_COUNT]);
+ extern void cgroup_cancel_fork(struct task_struct *p,
+ void *ss_priv[CGROUP_CANFORK_COUNT]);
+ extern void cgroup_post_fork(struct task_struct *p,
+ void *old_ss_priv[CGROUP_CANFORK_COUNT]);
+ void cgroup_exit(struct task_struct *p);
+ void cgroup_free(struct task_struct *p);
+
+ int cgroup_init_early(void);
+ int cgroup_init(void);
+
+ /*
+ * Iteration helpers and macros.
+ */
+
+ struct cgroup_subsys_state *css_next_child(struct cgroup_subsys_state *pos,
+ struct cgroup_subsys_state *parent);
+ struct cgroup_subsys_state *css_next_descendant_pre(struct cgroup_subsys_state *pos,
+ struct cgroup_subsys_state *css);
+ struct cgroup_subsys_state *css_rightmost_descendant(struct cgroup_subsys_state *pos);
+ struct cgroup_subsys_state *css_next_descendant_post(struct cgroup_subsys_state *pos,
+ struct cgroup_subsys_state *css);
+
+ struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset,
+ struct cgroup_subsys_state **dst_cssp);
+ struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset,
+ struct cgroup_subsys_state **dst_cssp);
+
+ void css_task_iter_start(struct cgroup_subsys_state *css,
+ struct css_task_iter *it);
+ struct task_struct *css_task_iter_next(struct css_task_iter *it);
+ void css_task_iter_end(struct css_task_iter *it);
+
+ /**
+ * css_for_each_child - iterate through children of a css
+ * @pos: the css * to use as the loop cursor
+ * @parent: css whose children to walk
+ *
+ * Walk @parent's children. Must be called under rcu_read_lock().
+ *
+ * If a subsystem synchronizes ->css_online() and the start of iteration, a
+ * css which finished ->css_online() is guaranteed to be visible in the
+ * future iterations and will stay visible until the last reference is put.
+ * A css which hasn't finished ->css_online() or already finished
+ * ->css_offline() may show up during traversal. It's each subsystem's
+ * responsibility to synchronize against on/offlining.
+ *
+ * It is allowed to temporarily drop RCU read lock during iteration. The
+ * caller is responsible for ensuring that @pos remains accessible until
+ * the start of the next iteration by, for example, bumping the css refcnt.
+ */
+ #define css_for_each_child(pos, parent) \
+ for ((pos) = css_next_child(NULL, (parent)); (pos); \
+ (pos) = css_next_child((pos), (parent)))
+
+ /**
+ * css_for_each_descendant_pre - pre-order walk of a css's descendants
+ * @pos: the css * to use as the loop cursor
+ * @root: css whose descendants to walk
+ *
+ * Walk @root's descendants. @root is included in the iteration and the
+ * first node to be visited. Must be called under rcu_read_lock().
+ *
+ * If a subsystem synchronizes ->css_online() and the start of iteration, a
+ * css which finished ->css_online() is guaranteed to be visible in the
+ * future iterations and will stay visible until the last reference is put.
+ * A css which hasn't finished ->css_online() or already finished
+ * ->css_offline() may show up during traversal. It's each subsystem's
+ * responsibility to synchronize against on/offlining.
+ *
+ * For example, the following guarantees that a descendant can't escape
+ * state updates of its ancestors.
+ *
+ * my_online(@css)
+ * {
+ * Lock @css's parent and @css;
+ * Inherit state from the parent;
+ * Unlock both.
+ * }
+ *
+ * my_update_state(@css)
+ * {
+ * css_for_each_descendant_pre(@pos, @css) {
+ * Lock @pos;
+ * if (@pos == @css)
+ * Update @css's state;
+ * else
+ * Verify @pos is alive and inherit state from its parent;
+ * Unlock @pos;
+ * }
+ * }
+ *
+ * As long as the inheriting step, including checking the parent state, is
+ * enclosed inside @pos locking, double-locking the parent isn't necessary
+ * while inheriting. The state update to the parent is guaranteed to be
+ * visible by walking order and, as long as inheriting operations to the
+ * same @pos are atomic to each other, multiple updates racing each other
+ * still result in the correct state. It's guaranateed that at least one
+ * inheritance happens for any css after the latest update to its parent.
+ *
+ * If checking parent's state requires locking the parent, each inheriting
+ * iteration should lock and unlock both @pos->parent and @pos.
+ *
+ * Alternatively, a subsystem may choose to use a single global lock to
+ * synchronize ->css_online() and ->css_offline() against tree-walking
+ * operations.
+ *
+ * It is allowed to temporarily drop RCU read lock during iteration. The
+ * caller is responsible for ensuring that @pos remains accessible until
+ * the start of the next iteration by, for example, bumping the css refcnt.
+ */
+ #define css_for_each_descendant_pre(pos, css) \
+ for ((pos) = css_next_descendant_pre(NULL, (css)); (pos); \
+ (pos) = css_next_descendant_pre((pos), (css)))
+
+ /**
+ * css_for_each_descendant_post - post-order walk of a css's descendants
+ * @pos: the css * to use as the loop cursor
+ * @css: css whose descendants to walk
+ *
+ * Similar to css_for_each_descendant_pre() but performs post-order
+ * traversal instead. @root is included in the iteration and the last
+ * node to be visited.
+ *
+ * If a subsystem synchronizes ->css_online() and the start of iteration, a
+ * css which finished ->css_online() is guaranteed to be visible in the
+ * future iterations and will stay visible until the last reference is put.
+ * A css which hasn't finished ->css_online() or already finished
+ * ->css_offline() may show up during traversal. It's each subsystem's
+ * responsibility to synchronize against on/offlining.
+ *
+ * Note that the walk visibility guarantee example described in pre-order
+ * walk doesn't apply the same to post-order walks.
+ */
+ #define css_for_each_descendant_post(pos, css) \
+ for ((pos) = css_next_descendant_post(NULL, (css)); (pos); \
+ (pos) = css_next_descendant_post((pos), (css)))
+
+ /**
+ * cgroup_taskset_for_each - iterate cgroup_taskset
+ * @task: the loop cursor
+ * @dst_css: the destination css
+ * @tset: taskset to iterate
+ *
+ * @tset may contain multiple tasks and they may belong to multiple
+ * processes.
+ *
+ * On the v2 hierarchy, there may be tasks from multiple processes and they
+ * may not share the source or destination csses.
+ *
+ * On traditional hierarchies, when there are multiple tasks in @tset, if a
+ * task of a process is in @tset, all tasks of the process are in @tset.
+ * Also, all are guaranteed to share the same source and destination csses.
+ *
+ * Iteration is not in any specific order.
+ */
+ #define cgroup_taskset_for_each(task, dst_css, tset) \
+ for ((task) = cgroup_taskset_first((tset), &(dst_css)); \
+ (task); \
+ (task) = cgroup_taskset_next((tset), &(dst_css)))
+
+ /**
+ * cgroup_taskset_for_each_leader - iterate group leaders in a cgroup_taskset
+ * @leader: the loop cursor
+ * @dst_css: the destination css
+ * @tset: takset to iterate
+ *
+ * Iterate threadgroup leaders of @tset. For single-task migrations, @tset
+ * may not contain any.
+ */
+ #define cgroup_taskset_for_each_leader(leader, dst_css, tset) \
+ for ((leader) = cgroup_taskset_first((tset), &(dst_css)); \
+ (leader); \
+ (leader) = cgroup_taskset_next((tset), &(dst_css))) \
+ if ((leader) != (leader)->group_leader) \
+ ; \
+ else
+
+ /*
+ * Inline functions.
+ */
/**
* css_get - obtain a reference on the specified css
diff --cc include/linux/sched.h
index 2fd062b,f40a1af..590e49b
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@@ -1769,20 -1698,14 +1761,20 @@@ struct task_struct
unsigned long trace;
/* bitmask and counter of trace recursion */
unsigned long trace_recursion;
+#ifdef CONFIG_WAKEUP_LATENCY_HIST
+ u64 preempt_timestamp_hist;
+#ifdef CONFIG_MISSED_TIMER_OFFSETS_HIST
+ long timer_offset;
+#endif
+#endif
#endif /* CONFIG_TRACING */
#ifdef CONFIG_MEMCG
- struct memcg_oom_info {
- struct mem_cgroup *memcg;
- gfp_t gfp_mask;
- int order;
- unsigned int may_oom:1;
- } memcg_oom;
+ struct mem_cgroup *memcg_in_oom;
+ gfp_t memcg_oom_gfp_mask;
+ int memcg_oom_order;
+
+ /* number of pages to reclaim on returning to userland */
+ unsigned int memcg_nr_pages_over_high;
#endif
#ifdef CONFIG_UPROBES
struct uprobe_task *utask;
diff --cc kernel/futex.c
index e417491,46b168e..8422f94
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@@ -2452,11 -2429,15 +2462,21 @@@ retry
*/
if (ret == -EFAULT)
goto pi_faulted;
+
/*
+ * A unconditional UNLOCK_PI op raced against a waiter
+ * setting the FUTEX_WAITERS bit. Try again.
+ */
+ if (ret == -EAGAIN) {
+ spin_unlock(&hb->lock);
+ put_futex_key(&key);
+ goto retry;
+ }
++
++ /*
+ * wake_futex_pi has detected invalid state. Tell user
+ * space.
+ */
goto out_unlock;
}
Hi Rafael et al.,
We had some discussions [1] last week on the PM mailing list about
upstreaming the cpufreq governor most widely used on Android mobile
phones and tablets: the interactive governor. People (including Rafael)
mostly agreed that we had better get it upstreamed, and here is an
attempt to upstream most of it (idle notifiers aren't included in this
series).
I picked the latest code, spread over 70-80 patches, from [2]. The
unmodified code, based on mainline, is pushed [4] for reference.
I have updated the governor to align it with the current practices
followed by mainline governors, like using utilization hooks from the
scheduler and handling the kobject (for the governor's sysfs directory)
in a race-free manner. And of course this included a general cleanup of
the governor as well. This version is pushed here [3] for testing.
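To make "utilization hooks" concrete, here is a hedged sketch of how a
governor can hang off the scheduler's update path (the names and the
callback body are hypothetical; the hook signature is the one currently
on pm/bleeding-edge):

#include <linux/percpu.h>
#include <linux/sched.h>

/* Hypothetical per-CPU governor state, illustrative only. */
struct sketch_cpu {
	struct update_util_data update_util;
};
static DEFINE_PER_CPU(struct sketch_cpu, sketch_cpus);

/* Called by the scheduler on every utilization update. */
static void sketch_update_util(struct update_util_data *data, u64 time,
			       unsigned long util, unsigned long max)
{
	/* evaluate load here and kick a frequency change if needed */
}

/* Wire the hook up when the governor starts on a CPU. */
static void sketch_start_cpu(int cpu)
{
	cpufreq_add_update_util_hook(cpu,
				     &per_cpu(sketch_cpus, cpu).update_util,
				     sketch_update_util);
}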
The Android version of the interactive governor also uses idle EXIT
notifiers, but that code isn't part of this series and will be sent
separately later. For people interested in looking at that, those are 3
minor patches on top of this series, pushed here [5].
I have intentionally not changed the core logic of the governor, as
that rather requires more in-depth knowledge of the use cases for which
the optimizations were done. So we should do any such changes later
with new patches, so that people can follow them easily.
I also haven't tried to change the userspace interface, as that could
break existing userspace that already uses this governor.
This has been lightly tested on my Exynos board (dual ARM A15), where
the governor was inserted/removed multiple times and the sysfs files
all remained functional. The frequency changes with load, etc.
This series is based on the pm/bleeding-edge branch + a few patches
from Rafael ([6] & [7]) and a few minor cleanups from me [8].
--
viresh
[1] http://marc.info/?l=linux-pm&m=146301864519072
[2] https://android.googlesource.com/kernel/common remotes/android/android-4.4
[3] git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/interactive
[4] git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/interactive-orig
[5] git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/intearctive-idle-notifier
[6] http://marc.info/?l=linux-kernel&m=146318036229244
[7] http://marc.info/?l=linux-kernel&m=146360498820804
[8] http://marc.info/?l=linux-pm&m=146357434108299
Viresh Kumar (2):
cpufreq: Move gov_attr_* macros to cpufreq.h
cpufreq: Add android's 'interactive' governor
Documentation/cpu-freq/governors.txt | 86 ++
drivers/cpufreq/Kconfig | 27 +
drivers/cpufreq/Makefile | 1 +
drivers/cpufreq/cpufreq_governor.h | 8 -
drivers/cpufreq/cpufreq_interactive.c | 1371 ++++++++++++++++++++++++++++
include/linux/cpufreq.h | 12 +
include/trace/events/cpufreq_interactive.h | 112 +++
kernel/sched/cpufreq_schedutil.c | 8 +-
8 files changed, 1613 insertions(+), 12 deletions(-)
create mode 100644 drivers/cpufreq/cpufreq_interactive.c
create mode 100644 include/trace/events/cpufreq_interactive.h
--
2.7.1.410.g6faf27b