(This patchset was already acked by the maintainers and is now being
re-targeted at v3.17. See the change history.)
(I don't think that the ptrace() discussion at the link below has any
impact on this patchset:
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-July/268923.html
)
This patchset adds system call audit support on arm64.
Both 32-bit (AUDIT_ARCH_ARM) and 64-bit (AUDIT_ARCH_AARCH64) tasks are
supported. Since arm64 has exactly the same set of system calls on LE
and BE, we don't care about endianness (or, more specifically, about
the __AUDIT_ARCH_LE bit in AUDIT_ARCH_*).
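Concretely, the arch reported to the audit subsystem ends up looking
roughly like this (a sketch only; the actual hunk in asm/syscall.h may
differ slightly, and is_compat_task() is assumed to be available
there):

/* sketch for arch/arm64/include/asm/syscall.h, with <uapi/linux/audit.h>
 * and <linux/compat.h> included */
static inline int syscall_get_arch(void)
{
	if (is_compat_task())
		return AUDIT_ARCH_ARM;		/* 32-bit (compat) task */

	return AUDIT_ARCH_AARCH64;		/* native 64-bit task */
}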
This patchset should work correctly with:
* the userspace audit tool (v2.3.6 or later)
This code was tested on both 32-bit and 64-bit LE userland
in the following two ways:
1) basic operations with auditctl/autrace
# auditctl -a exit,always -S openat -F path=/etc/inittab
# auditctl -a exit,always -F dir=/tmp -F perm=rw
# auditctl -a task,always
# autrace /bin/ls
by comparing the output of autrace with that of strace
2) audit-test-code (+ my workarounds for arm/arm64)
by running "audit-tool", "filter" and "syscalls" test categories.
Changes v9 -> v10:
* rebased on 3.16-rc3
* included Catalin's patch[1/3] and added more syscall definitions for 3.16
Changes v8 -> v9:
* rebased on 3.15-rc, especially due to the change of syscall_get_arch()
interface [1,2/2]
Changes v7 -> v8:
* aligned with the change in "audit: generic compat system call audit
support" v5 [1/2]
* aligned with the change in "arm64: split syscall_trace() into separate
functions for enter/exit" v5 [2/2]
Changes v6 -> v7:
* changed an include file in syscall.h from <linux/audit.h> to
<uapi/linux/audit.h> [1/2]
* aligned with the patch, "arm64: split syscall_trace() into separate
functions for enter/exit" [2/2]
Changes v5 -> v6:
* moved the "arm64: Add regs_return_value() in syscall.h" patch into a
separate patchset
* aligned with the change in "arm64: make a single hook to syscall_trace()
for all syscall features" v3 [1/2]
Changes v4 -> v5:
* rebased to 3.14-rcX
* added a guard against TIF_SYSCALL_AUDIT [3/3]
* aligned with the change in "arm64: make a single hook to syscall_trace()
for all syscall features" v2 [3/3]
Changes v3 -> v4:
* Modified to sync with the patch, "make a single hook to syscall_trace()
for all syscall features"
* aligned with "audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL" patch
Changes v2 -> v3:
* Remove asm/audit.h.
See "generic compat syscall audit support" patch v4
* Remove endianness dependency, ie. AUDIT_ARCH_ARMEB/AARCH64EB.
* Remove kernel/syscalls/Makefile which was used to create unistd32.h.
See Catalin's "Add __NR_* definitions for compat syscalls" patch
Changes v1 -> v2:
* Modified to utilize "generic compat system call audit" [3/6, 4/6, 5/6]
Please note that a required header, unistd_32.h, is automatically
generated from unistd32.h.
* Refer to regs->orig_x0 instead of regs->x0 as the first argument of
system call in audit_syscall_entry() [6/6]
* Include "Add regs_return_value() in syscall.h" patch [2/6],
which was not intentionally included in v1 because it could be added
by "kprobes support".
AKASHI Takahiro (2):
arm64: Add audit support
arm64: audit: Add audit hook in syscall_trace_enter/exit()
Catalin Marinas (1):
arm64: Add __NR_* definitions for compat syscalls
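For reference, the hooks added by the "Add audit hook in
syscall_trace_enter/exit()" patch above roughly amount to the
following (a sketch, not the literal diff; the register usage and the
3.16-era audit_syscall_entry() prototype taking the arch are my
assumptions and may differ from the actual hunks):

asmlinkage int syscall_trace_enter(struct pt_regs *regs)
{
	/* ... existing tracehook/tracepoint handling ... */

	audit_syscall_entry(syscall_get_arch(), regs->syscallno,
			    regs->orig_x0, regs->regs[1],
			    regs->regs[2], regs->regs[3]);

	return regs->syscallno;
}

asmlinkage void syscall_trace_exit(struct pt_regs *regs)
{
	audit_syscall_exit(regs);

	/* ... existing tracehook/tracepoint handling ... */
}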
arch/arm64/Kconfig | 2 +
arch/arm64/include/asm/syscall.h | 14 +
arch/arm64/include/asm/unistd.h | 17 +
arch/arm64/include/asm/unistd32.h | 1166 ++++++++++++++++++++++++-------------
arch/arm64/kernel/entry.S | 1 -
arch/arm64/kernel/kuser32.S | 2 +-
arch/arm64/kernel/ptrace.c | 7 +
arch/arm64/kernel/signal32.c | 2 +-
arch/arm64/kernel/sys_compat.c | 2 +-
include/uapi/linux/audit.h | 1 +
10 files changed, 810 insertions(+), 404 deletions(-)
--
1.7.9.5
This patchset implements a "kiosk" mode for the KDB debugger and is a
continuation of previous work by Anton Vorontsov (dating back to late
2012).
When kiosk mode is engaged, several kdb commands are disabled, leaving
only the status-reporting functions working normally. In particular,
arbitrary memory read/write is prevented and it is no longer possible
to alter program flow.
Note that the commands that remain enabled are sufficient to run the
post-mortem macro commands dumpcommon, dumpall and dumpcpu. One of the
motivating use-cases for this work is to enable post-mortem debugging
on embedded devices (such as phones) without allowing the debug
facility to be easily exploited to compromise user privacy. In
principle this means the feature can be enabled on production devices.
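As a rough sketch of the filtering idea (the flag names follow the
patch titles below; the kdb_kiosk toggle and the helper itself are
assumptions, not the literal implementation):

/* Hypothetical helper: in kiosk mode only flag-whitelisted commands run. */
static bool kdb_command_permitted(const kdbtab_t *cmd, int nargs)
{
	if (!kdb_kiosk)				/* assumed on/off toggle */
		return true;
	if (cmd->cmd_flags & KDB_SAFE)		/* safe with any arguments */
		return true;
	if ((cmd->cmd_flags & KDB_SAFE_NO_ARGS) && nargs == 0)
		return true;			/* safe only with no arguments */
	return false;				/* blocked in kiosk mode */
}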
There are a few patches: some are simple cleanups, some are churn-ish
but inevitable cleanups, and the rest implement the mode itself; after
all the preparations, everything is pretty straightforward. The first
patch is actually a pure bug fix (arguably unrelated to kiosk mode),
but it collides with the kiosk code that honours the sysrq mask, so I
have included it here.
Changes since v1 (circa 2012):
* ef (Display exception frame) is essentially an overly complex peek
and has therefore been marked unsafe
* bt (Stack traceback) has been marked safe only with no arguments
* sr (Magic SysRq key) honours the sysrq mask when called in kiosk
mode
* Fixed over-zealous blocking of macro commands
* Symbol lookup is forbidden by kdbgetaddrarg (more robust, better
error reporting to user)
* Fix deadlock in sr (Magic SysRq key)
* Better help text in kiosk mode
* The default (kiosk on/off) can be changed from the config file.
Anton Vorontsov (7):
kdb: Remove currently unused kdbtab_t->cmd_flags
kdb: Rename kdb_repeat_t to kdb_cmdflags_t, cmd_repeat to cmd_flags
kdb: Rename kdb_register_repeat() to kdb_register_flags()
kdb: Use KDB_REPEAT_* values as flags
kdb: Remove KDB_REPEAT_NONE flag
kdb: Mark safe commands as KDB_SAFE and KDB_SAFE_NO_ARGS
kdb: Add kiosk mode
Daniel Thompson (3):
sysrq: Implement __handle_sysrq_nolock to avoid recursive locking in
kdb
kdb: Improve usability of help text when running in kiosk mode
kdb: Allow access to sensitive commands to be restricted by default
drivers/tty/sysrq.c | 11 ++-
include/linux/kdb.h | 20 ++--
include/linux/sysrq.h | 1 +
kernel/debug/kdb/kdb_bp.c | 22 ++---
kernel/debug/kdb/kdb_main.c | 207 +++++++++++++++++++++++------------------
kernel/debug/kdb/kdb_private.h | 3 +-
kernel/trace/trace_kdb.c | 4 +-
lib/Kconfig.kgdb | 21 +++++
8 files changed, 172 insertions(+), 117 deletions(-)
--
1.9.0
Currently, if an active CPU fails to respond to a roundup request, the
CPU that requested the roundup becomes stuck. This needlessly reduces
the robustness of the debugger.
This patch introduces a timeout, allowing the system state to be
examined even when the system contains unresponsive processors. It also
modifies kdb's cpu command to reject attempts to switch to unresponsive
processors and to report their state as (D)ead.
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
Cc: Jason Wessel <jason.wessel(a)windriver.com>
Cc: Mike Travis <travis(a)sgi.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Dimitri Sivanich <sivanich(a)sgi.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: kgdb-bugreport(a)lists.sourceforge.net
---
kernel/debug/debug_core.c | 9 +++++++--
kernel/debug/kdb/kdb_main.c | 4 +++-
2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 1adf62b..acd7497 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -471,6 +471,7 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
int cpu;
int trace_on = 0;
int online_cpus = num_online_cpus();
+ u64 time_left;
kgdb_info[ks->cpu].enter_kgdb++;
kgdb_info[ks->cpu].exception_state |= exception_state;
@@ -595,9 +596,13 @@ return_normal:
/*
* Wait for the other CPUs to be notified and be waiting for us:
*/
- while (kgdb_do_roundup && (atomic_read(&masters_in_kgdb) +
- atomic_read(&slaves_in_kgdb)) != online_cpus)
+ time_left = loops_per_jiffy * HZ;
+ while (kgdb_do_roundup && --time_left &&
+ (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
+ online_cpus)
cpu_relax();
+ if (!time_left)
+ pr_crit("KGDB: Timed out waiting for secondary CPUs.\n");
/*
* At this point the primary processor is completely
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 2f7c760..49f2425 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -2157,6 +2157,8 @@ static void kdb_cpu_status(void)
for (start_cpu = -1, i = 0; i < NR_CPUS; i++) {
if (!cpu_online(i)) {
state = 'F'; /* cpu is offline */
+ } else if (!kgdb_info[i].enter_kgdb) {
+ state = 'D'; /* cpu is online but unresponsive */
} else {
state = ' '; /* cpu is responding to kdb */
if (kdb_task_state_char(KDB_TSK(i)) == 'I')
@@ -2210,7 +2212,7 @@ static int kdb_cpu(int argc, const char **argv)
/*
* Validate cpunum
*/
- if ((cpunum > NR_CPUS) || !cpu_online(cpunum))
+ if ((cpunum > NR_CPUS) || !kgdb_info[cpunum].enter_kgdb)
return KDB_BADCPUNUM;
dbg_switch_cpu = cpunum;
--
1.9.3
Issuing a stack dump feels ergonomically wrong when entering due to
NMI. Entry due to NMI is normally a reaction to a user request, either
the NMI button on a server or a "magic knock" on a UART. Therefore the
backtrace behaviour on entry due to NMI should be like SysRq-g (no
stack dump) rather than like an oops.
Note also that the stack dump does not offer any information that
cannot be trivially retrieved using the 'bt' command.
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
Cc: Jason Wessel <jason.wessel(a)windriver.com>
Cc: Mike Travis <travis(a)sgi.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: kgdb-bugreport(a)lists.sourceforge.net
---
kernel/debug/kdb/kdb_main.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 49f2425..6d19905 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -1207,7 +1207,6 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
kdb_printf("due to NonMaskable Interrupt @ "
kdb_machreg_fmt "\n",
instruction_pointer(regs));
- kdb_dumpregs(regs);
break;
case KDB_REASON_SSTEP:
case KDB_REASON_BREAK:
--
1.9.3
Part of this patchset was previously part of the larger tasks packing
patchset [1]. I have split the latter into (at least) 3 different
patchsets to make things easier:
-configuration of sched_domain topology [2]
-update and consolidation of cpu_capacity (this patchset)
-tasks packing algorithm
SMT systems are no longer the only systems that can have CPUs with an
original capacity that differs from the default value. We need to
extend the use of cpu_capacity_orig to all kinds of platforms so that
the scheduler has both the maximum capacity
(cpu_capacity_orig/capacity_orig) and the current capacity
(cpu_capacity/capacity) of CPUs and sched_groups. A new function,
arch_scale_cpu_capacity, has been created and replaces
arch_scale_smt_capacity, which was SMT-specific, in the computation of
the capacity of a CPU.
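A hedged sketch of what such a hook could look like (the weak-default
form and the exact prototype are assumptions, not necessarily what the
patches use):

/* Default: a CPU has full capacity unless the architecture overrides it. */
unsigned long __weak arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
{
	return SCHED_CAPACITY_SCALE;
}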
During load balancing, the scheduler evaluates the number of tasks
that a group of CPUs can handle. The current method assumes that tasks
have a fixed load of SCHED_LOAD_SCALE and that CPUs have a default
capacity of SCHED_CAPACITY_SCALE. This assumption generates wrong
decisions, creating ghost cores or removing real ones, when the
original capacity of the CPUs differs from the default
SCHED_CAPACITY_SCALE. We no longer try to evaluate the number of
available cores based on group_capacity; instead we detect when the
group is fully utilized.
Now that we have the original capacity of CPUs and their
activity/utilization, we can evaluate the capacity and the level of
utilization of a group of CPUs more accurately.
This patchset mainly replaces the old capacity method with a new one
and keeps the policy almost unchanged, although we could certainly take
advantage of this new statistic in several other places in the load
balancer.
Test results:
Below are the results of 4 kinds of tests:
- hackbench -l 500 -s 4096
- perf bench sched pipe -l 400000
- scp of 100MB file on the platform
- ebizzy with various number of threads
on 4 kernels:
- tip = tip/sched/core
- step1 = tip + patches(1-8)
- patchset = tip + whole patchset
- patchset+irq = tip + this patchset + irq accounting
Each test has been run 6 times and the tables below show the stdev and
the diff compared to the tip kernel.
Dual A7 tip | +step1 | +patchset | patchset+irq
stdev | results stdev | results stdev | results stdev
hackbench (lower is better) (+/-)0.64% | -0.19% (+/-)0.73% | 0.58% (+/-)1.29% | 0.20% (+/-)1.00%
perf (lower is better) (+/-)0.28% | 1.22% (+/-)0.17% | 1.29% (+/-)0.06% | 2.85% (+/-)0.33%
scp (+/-)4.81% | 2.61% (+/-)0.28% | 2.39% (+/-)0.22% | 82.18% (+/-)3.30%
ebizzy -t 1 (+/-)2.31% | -1.32% (+/-)1.90% | -0.79% (+/-)2.88% | 3.10% (+/-)2.32%
ebizzy -t 2 (+/-)0.70% | 8.29% (+/-)6.66% | 1.93% (+/-)5.47% | 2.72% (+/-)5.72%
ebizzy -t 4 (+/-)3.54% | 5.57% (+/-)8.00% | 0.36% (+/-)9.00% | 2.53% (+/-)3.17%
ebizzy -t 6 (+/-)2.36% | -0.43% (+/-)3.29% | -1.93% (+/-)3.47% | 0.57% (+/-)0.75%
ebizzy -t 8 (+/-)1.65% | -0.45% (+/-)0.93% | -1.95% (+/-)1.52% | -1.18% (+/-)1.61%
ebizzy -t 10 (+/-)2.55% | -0.98% (+/-)3.06% | -1.18% (+/-)6.17% | -2.33% (+/-)3.28%
ebizzy -t 12 (+/-)6.22% | 0.17% (+/-)5.63% | 2.98% (+/-)7.11% | 1.19% (+/-)4.68%
ebizzy -t 14 (+/-)5.38% | -0.14% (+/-)5.33% | 2.49% (+/-)4.93% | 1.43% (+/-)6.55%
Quad A15 tip | +patchset1 | +patchset2 | patchset+irq
stdev | results stdev | results stdev | results stdev
hackbench (lower is better) (+/-)0.78% | 0.87% (+/-)1.72% | 0.91% (+/-)2.02% | 3.30% (+/-)2.02%
perf (lower is better) (+/-)2.03% | -0.31% (+/-)0.76% | -2.38% (+/-)1.37% | 1.42% (+/-)3.14%
scp (+/-)0.04% | 0.51% (+/-)1.37% | 1.79% (+/-)0.84% | 1.77% (+/-)0.38%
ebizzy -t 1 (+/-)0.41% | 2.05% (+/-)0.38% | 2.08% (+/-)0.24% | 0.17% (+/-)0.62%
ebizzy -t 2 (+/-)0.78% | 0.60% (+/-)0.63% | 0.43% (+/-)0.48% | 1.61% (+/-)0.38%
ebizzy -t 4 (+/-)0.58% | -0.10% (+/-)0.97% | -0.65% (+/-)0.76% | -0.75% (+/-)0.86%
ebizzy -t 6 (+/-)0.31% | 1.07% (+/-)1.12% | -0.16% (+/-)0.87% | -0.76% (+/-)0.22%
ebizzy -t 8 (+/-)0.95% | -0.30% (+/-)0.85% | -0.79% (+/-)0.28% | -1.66% (+/-)0.21%
ebizzy -t 10 (+/-)0.31% | 0.04% (+/-)0.97% | -1.44% (+/-)1.54% | -0.55% (+/-)0.62%
ebizzy -t 12 (+/-)8.35% | -1.89% (+/-)7.64% | 0.75% (+/-)5.30% | -1.18% (+/-)8.16%
ebizzy -t 14 (+/-)13.17% | 6.22% (+/-)4.71% | 5.25% (+/-)9.14% | 5.87% (+/-)5.77%
I haven't been able to fully test the patchset on an SMT system to
check that the regression reported by Preethi has been solved, but the
various tests that I have done don't show any regression so far. The
correction of SD_PREFER_SIBLING mode, and its use at the SMT level,
should have fixed the regression.
The usage_avg_contrib is based on the current implementation of
load-average tracking. I also have a version of usage_avg_contrib that
is based on the new implementation [3], but I haven't provided the
patches and results as [3] is still under review. I can provide changes
on top of [3] to adapt how usage_avg_contrib is computed to the new
mechanism.
TODO: manage conflicts with the next version of [4]
Change since V3:
- add usage_avg_contrib statistic which sums the running time of tasks on a rq
- use usage_avg_contrib instead of runnable_avg_sum for cpu_utilization
- fix the replacement of power by capacity
- update some comments
Change since V2:
- rebase on top of capacity renaming
- fix wake_affine statistic update
- rework nohz_kick_needed
- optimize the active migration of a task from CPU with reduced capacity
- rename group_activity to group_utilization and remove the unused
total_utilization
- repair SD_PREFER_SIBLING and use it for SMT level
- reorder patchset to gather patches with same topics
Change since V1:
- add 3 fixes
- correct some commit messages
- replace capacity computation by activity
- take into account current cpu capacity
[1] https://lkml.org/lkml/2013/10/18/121
[2] https://lkml.org/lkml/2014/3/19/377
[3] https://lkml.org/lkml/2014/7/18/110
[4] https://lkml.org/lkml/2014/7/25/589
Vincent Guittot (12):
sched: fix imbalance flag reset
sched: remove a wake_affine condition
sched: fix avg_load computation
sched: Allow all archs to set the capacity_orig
ARM: topology: use new cpu_capacity interface
sched: add per rq cpu_capacity_orig
sched: test the cpu's capacity in wake affine
sched: move cfs task on a CPU with higher capacity
sched: add usage_load_avg
sched: get CPU's utilization statistic
sched: replace capacity_factor by utilization
sched: add SD_PREFER_SIBLING for SMT level
arch/arm/kernel/topology.c | 4 +-
include/linux/sched.h | 4 +-
kernel/sched/core.c | 3 +-
kernel/sched/fair.c | 350 ++++++++++++++++++++++++++-------------------
kernel/sched/sched.h | 3 +-
5 files changed, 207 insertions(+), 157 deletions(-)
--
1.9.1
Part of this patchset was previously part of the larger tasks packing
patchset [1]. I have split the latter into (at least) 3 different
patchsets to make things easier:
-configuration of sched_domain topology [2]
-update and consolidation of cpu_power (this patchset)
-tasks packing algorithm
SMT systems are no longer the only systems that can have CPUs with an
original capacity that differs from the default value. We need to
extend the use of cpu_power_orig to all kinds of platforms so that the
scheduler has both the maximum capacity (cpu_power_orig/power_orig) and
the current capacity (cpu_power/power) of CPUs and sched_groups. A new
function, arch_scale_cpu_power, has been created and replaces
arch_scale_smt_power, which was SMT-specific, in the computation of the
capacity of a CPU.
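On the ARM side ("ARM: topology: use new cpu_power interface" in the
series below), the override presumably boils down to returning the
precomputed per-cpu scale; a sketch under that assumption:

/* arch/arm/kernel/topology.c sketch: report the per-cpu scale computed
 * from the DT efficiency tables. */
unsigned long arch_scale_cpu_power(struct sched_domain *sd, int cpu)
{
	return per_cpu(cpu_scale, cpu);
}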
During load balancing, the scheduler evaluates the number of tasks
that a group of CPUs can handle. The current method assumes that tasks
have a fixed load of SCHED_LOAD_SCALE and that CPUs have a default
capacity of SCHED_POWER_SCALE. This assumption generates wrong
decisions, creating ghost cores or removing real ones, when the
original capacity of the CPUs differs from the default
SCHED_POWER_SCALE.
Now that we have the original capacity of a CPU and its
activity/utilization, we can evaluate the capacity of a group of CPUs
more accurately.
This patchset mainly replaces the old capacity method with a new one
and keeps the policy almost unchanged, although we can certainly take
advantage of this new statistic in several other places in the load
balancer.
TODO:
- align variable and field names with the renaming in [3]
Test results:
Below are the results of 2 tests:
- hackbench -l 500 -s 4096
- scp of 100MB file on the platform
on a dual cortex-A7
hackbench scp
tip/master 25.75s(+/-0.25) 5.16MB/s(+/-1.49)
+ patches 1,2 25.89s(+/-0.31) 5.18MB/s(+/-1.45)
+ patches 3-10 25.68s(+/-0.22) 7.00MB/s(+/-1.88)
+ irq accounting 25.80s(+/-0.25) 8.06MB/s(+/-0.05)
on a quad cortex-A15
hackbench scp
tip/master 15.69s(+/-0.16) 9.70MB/s(+/-0.04)
+ patches 1,2 15.53s(+/-0.13) 9.72MB/s(+/-0.05)
+ patches 3-10 15.56s(+/-0.22) 9.88MB/s(+/-0.05)
+ irq accounting 15.99s(+/-0.08) 10.37MB/s(+/-0.03)
The improvement in scp bandwidth happens when the tasks and the irq
are using different CPUs, which is a bit random without the irq
accounting config.
Change since V1:
- add 3 fixes
- correct some commit messages
- replace capacity computation by activity
- take into account current cpu capacity
[1] https://lkml.org/lkml/2013/10/18/121
[2] https://lkml.org/lkml/2014/3/19/377
[3] https://lkml.org/lkml/2014/5/14/622
Vincent Guittot (11):
sched: fix imbalance flag reset
sched: remove a wake_affine condition
sched: fix avg_load computation
sched: Allow all archs to set the power_orig
ARM: topology: use new cpu_power interface
sched: add per rq cpu_power_orig
Revert "sched: Put rq's sched_avg under CONFIG_FAIR_GROUP_SCHED"
sched: get CPU's activity statistic
sched: test the cpu's capacity in wake affine
sched: move cfs task on a CPU with higher capacity
sched: replace capacity by activity
arch/arm/kernel/topology.c | 4 +-
kernel/sched/core.c | 2 +-
kernel/sched/fair.c | 229 ++++++++++++++++++++++-----------------------
kernel/sched/sched.h | 5 +-
4 files changed, 118 insertions(+), 122 deletions(-)
--
1.9.1
From: Mark Brown <broonie(a)linaro.org>
The only requirement the scheduler has on cluster IDs is that they
must be unique. When enumerating the topology based on MPIDR
information, the kernel currently generates cluster IDs using the first
level of affinity above the core ID (either level one or two, depending
on whether the core has multiple threads); however, the ARMv8
architecture allows for up to three levels of affinity. This means that
an ARMv8 system may contain cores whose MPIDRs are identical except for
affinity level three, which with the current code will cause us to
report multiple cores with the same identification to the scheduler, in
violation of its uniqueness requirement.
Ensure that we do not violate the scheduler's requirements on systems
that use all the affinity levels by incorporating both affinity levels
two and three into the cluster ID when the cores are not threaded.
While no currently known hardware uses multi-level clusters, it is
better to program defensively: this will help ease bring-up of systems
that have them and will ensure that things like distribution install
media do not need to be respun to replace kernels in order to deploy
such systems. In the worst case the system will work but perform
suboptimally until a kernel modified to handle the new topology better
is installed; in the best case this will be an adequate description of
such topologies for the scheduler to perform well.
Signed-off-by: Mark Brown <broonie(a)linaro.org>
---
arch/arm64/kernel/topology.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index b6ee26b..5752c1b 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -255,7 +255,8 @@ void store_cpu_topology(unsigned int cpuid)
/* Multiprocessor system : Multi-threads per core */
cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
- cpuid_topo->cluster_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+ cpuid_topo->cluster_id = MPIDR_AFFINITY_LEVEL(mpidr, 2) |
+ MPIDR_AFFINITY_LEVEL(mpidr, 3) << 8;
} else {
/* Multiprocessor system : Single-thread per core */
cpuid_topo->thread_id = -1;
--
2.1.0.rc1
This patch series enables secure computing (system call filtering) on
arm64 and contains related enhancements and bug fixes.
This code was tested on ARMv8 fast model with 64-bit/32-bit userspace
using
* libseccomp v2.1.1 with modifications for arm64, especially its "live"
tests: No.20, 21 and 24.
* a modified version of Kees' seccomp test for 'changing/skipping a
syscall' and the seccomp() system call
* in-house tests for 'changing/skipping a system call' under tracing
with ptrace(PTRACE_SYSCALL) (that is, without seccomp),
with and without audit tracing.
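For the 'changing/skipping a system call' case above, a user-space
sketch of how a tracer might use the new PTRACE_SET_SYSCALL request
(the request number and the -1 'skip' convention are assumptions
borrowed from the 32-bit ARM precedent):

#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifndef PTRACE_SET_SYSCALL
#define PTRACE_SET_SYSCALL 23	/* assumed: mirrors the arch/arm request */
#endif

/* Stop the child at its next syscall entry and make the kernel skip it. */
static void skip_next_syscall(pid_t child)
{
	int status;

	ptrace(PTRACE_SYSCALL, child, NULL, NULL);	/* run to syscall entry */
	waitpid(child, &status, 0);
	ptrace(PTRACE_SET_SYSCALL, child, NULL, (void *)-1L); /* -1: skip */
	ptrace(PTRACE_SYSCALL, child, NULL, NULL);	/* resume past the skip */
	waitpid(child, &status, 0);
}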
Changes v5 -> v6:
* rebased to v3.17-rc
* changed the interface for changing/skipping a system call from
re-writing the x8 register [v5 1/3] to using a dedicated
PTRACE_SET_SYSCALL command [1/6, 2/6]
Patch [1/6] contains a checkpatch error around a switch statement, but
it won't be fixed, matching the existing code in compat_arch_ptrace().
* added a new system call, seccomp(), for compat task [4/6]
* added SIGSYS siginfo for compat task [5/6]
* changed to always execute audit exit tracing to avoid OOPs [2/6, 6/6]
Changes v4 -> v5:
* rebased to v3.16-rc
* add patch [1/3] to allow ptrace to change a system call
(please note that this patch should be applied even without seccomp.)
Changes v3 -> v4:
* removed the following patch and moved it to "arm64: prerequisites for
audit and ftrace" patchset since it is required for audit and ftrace in
case of !COMPAT, too.
"arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h"
Changes v2 -> v3:
* removed unnecessary 'type cast' operations [2/3]
* check for a return value (-1) of secure_computing() explicitly [2/3]
* aligned with the patch, "arm64: split syscall_trace() into separate
functions for enter/exit" [2/3]
* changed default of CONFIG_SECCOMP to n [2/3]
Changes v1 -> v2:
* added generic seccomp.h for arm64 to utilize it [1,2/3]
* changed syscall_trace() to return more meaningful value (-EPERM)
on seccomp failure case [2/3]
* aligned with the change in "arm64: make a single hook to syscall_trace()
for all syscall features" v2 [2/3]
* removed is_compat_task() definition from compat.h [3/3]
AKASHI Takahiro (6):
arm64: ptrace: add PTRACE_SET_SYSCALL
arm64: ptrace: allow tracer to skip a system call
asm-generic: add generic seccomp.h for secure computing mode 1
arm64: add seccomp syscall for compat task
arm64: add SIGSYS siginfo for compat task
arm64: add seccomp support
arch/arm64/Kconfig | 14 ++++++++++++
arch/arm64/include/asm/compat.h | 7 ++++++
arch/arm64/include/asm/ptrace.h | 9 ++++++++
arch/arm64/include/asm/seccomp.h | 25 ++++++++++++++++++++++
arch/arm64/include/asm/unistd.h | 5 ++++-
arch/arm64/include/asm/unistd32.h | 3 +++
arch/arm64/include/uapi/asm/ptrace.h | 1 +
arch/arm64/kernel/entry.S | 6 ++++++
arch/arm64/kernel/ptrace.c | 39 +++++++++++++++++++++++++++++++++-
arch/arm64/kernel/signal32.c | 8 +++++++
include/asm-generic/seccomp.h | 28 ++++++++++++++++++++++++
11 files changed, 143 insertions(+), 2 deletions(-)
create mode 100644 arch/arm64/include/asm/seccomp.h
create mode 100644 include/asm-generic/seccomp.h
--
1.7.9.5
Part of this patchset was previously part of the larger tasks packing
patchset [1]. I have split the latter into (at least) 3 different
patchsets to make things easier:
-configuration of sched_domain topology [2]
-update and consolidation of cpu_capacity (this patchset)
-tasks packing algorithm
SMT systems are no longer the only systems that can have CPUs with an
original capacity that differs from the default value. We need to
extend the use of (cpu_)capacity_orig to all kinds of platforms so that
the scheduler has both the maximum capacity
(cpu_capacity_orig/capacity_orig) and the current capacity
(cpu_capacity/capacity) of CPUs and sched_groups. A new function,
arch_scale_cpu_capacity, has been created and replaces
arch_scale_smt_capacity, which was SMT-specific, in the computation of
the capacity of a CPU.
During load balancing, the scheduler evaluates the number of tasks
that a group of CPUs can handle. The current method assumes that tasks
have a fixed load of SCHED_LOAD_SCALE and that CPUs have a default
capacity of SCHED_CAPACITY_SCALE. This assumption generates wrong
decisions, creating ghost cores or removing real ones, when the
original capacity of the CPUs differs from the default
SCHED_CAPACITY_SCALE. We no longer try to evaluate the number of
available cores based on group_capacity; instead we detect when the
group is fully utilized.
Now that we have the original capacity of CPUs and their
activity/utilization, we can evaluate the capacity and the level of
utilization of a group of CPUs more accurately.
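As a hedged illustration of how the new statistic might be consumed
(the field and helper names below follow the patch titles, e.g.
usage_load_avg and cpu_capacity_orig, and are assumptions rather than
the literal patch content):

/* Hypothetical helper: a CPU's utilization is the usage contribution of
 * the tasks running on it, capped by the CPU's original capacity. */
static unsigned long get_cpu_utilization(int cpu)
{
	unsigned long util = cpu_rq(cpu)->cfs.usage_load_avg;	/* assumed */
	unsigned long cap = cpu_rq(cpu)->cpu_capacity_orig;	/* assumed */

	return min(util, cap);
}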
This patchset mainly replaces the old capacity method with a new one
and keeps the policy almost unchanged, although we could certainly take
advantage of this new statistic in several other places in the load
balancer.
Test results (done on v4; no tests have been run on v5, which is only
a rebase):
Below are the results of 4 kinds of tests:
- hackbench -l 500 -s 4096
- perf bench sched pipe -l 400000
- scp of 100MB file on the platform
- ebizzy with various number of threads
on 4 kernels:
- tip = tip/sched/core
- step1 = tip + patches(1-8)
- patchset = tip + whole patchset
- patchset+irq = tip + this patchset + irq accounting
Each test has been run 6 times and the tables below show the stdev and
the diff compared to the tip kernel.
Dual A7 tip | +step1 | +patchset | patchset+irq
stdev | results stdev | results stdev | results stdev
hackbench (lower is better) (+/-)0.64% | -0.19% (+/-)0.73% | 0.58% (+/-)1.29% | 0.20% (+/-)1.00%
perf (lower is better) (+/-)0.28% | 1.22% (+/-)0.17% | 1.29% (+/-)0.06% | 2.85% (+/-)0.33%
scp (+/-)4.81% | 2.61% (+/-)0.28% | 2.39% (+/-)0.22% | 82.18% (+/-)3.30%
ebizzy -t 1 (+/-)2.31% | -1.32% (+/-)1.90% | -0.79% (+/-)2.88% | 3.10% (+/-)2.32%
ebizzy -t 2 (+/-)0.70% | 8.29% (+/-)6.66% | 1.93% (+/-)5.47% | 2.72% (+/-)5.72%
ebizzy -t 4 (+/-)3.54% | 5.57% (+/-)8.00% | 0.36% (+/-)9.00% | 2.53% (+/-)3.17%
ebizzy -t 6 (+/-)2.36% | -0.43% (+/-)3.29% | -1.93% (+/-)3.47% | 0.57% (+/-)0.75%
ebizzy -t 8 (+/-)1.65% | -0.45% (+/-)0.93% | -1.95% (+/-)1.52% | -1.18% (+/-)1.61%
ebizzy -t 10 (+/-)2.55% | -0.98% (+/-)3.06% | -1.18% (+/-)6.17% | -2.33% (+/-)3.28%
ebizzy -t 12 (+/-)6.22% | 0.17% (+/-)5.63% | 2.98% (+/-)7.11% | 1.19% (+/-)4.68%
ebizzy -t 14 (+/-)5.38% | -0.14% (+/-)5.33% | 2.49% (+/-)4.93% | 1.43% (+/-)6.55%
Quad A15 tip | +patchset1 | +patchset2 | patchset+irq
stdev | results stdev | results stdev | results stdev
hackbench (lower is better) (+/-)0.78% | 0.87% (+/-)1.72% | 0.91% (+/-)2.02% | 3.30% (+/-)2.02%
perf (lower is better) (+/-)2.03% | -0.31% (+/-)0.76% | -2.38% (+/-)1.37% | 1.42% (+/-)3.14%
scp (+/-)0.04% | 0.51% (+/-)1.37% | 1.79% (+/-)0.84% | 1.77% (+/-)0.38%
ebizzy -t 1 (+/-)0.41% | 2.05% (+/-)0.38% | 2.08% (+/-)0.24% | 0.17% (+/-)0.62%
ebizzy -t 2 (+/-)0.78% | 0.60% (+/-)0.63% | 0.43% (+/-)0.48% | 1.61% (+/-)0.38%
ebizzy -t 4 (+/-)0.58% | -0.10% (+/-)0.97% | -0.65% (+/-)0.76% | -0.75% (+/-)0.86%
ebizzy -t 6 (+/-)0.31% | 1.07% (+/-)1.12% | -0.16% (+/-)0.87% | -0.76% (+/-)0.22%
ebizzy -t 8 (+/-)0.95% | -0.30% (+/-)0.85% | -0.79% (+/-)0.28% | -1.66% (+/-)0.21%
ebizzy -t 10 (+/-)0.31% | 0.04% (+/-)0.97% | -1.44% (+/-)1.54% | -0.55% (+/-)0.62%
ebizzy -t 12 (+/-)8.35% | -1.89% (+/-)7.64% | 0.75% (+/-)5.30% | -1.18% (+/-)8.16%
ebizzy -t 14 (+/-)13.17% | 6.22% (+/-)4.71% | 5.25% (+/-)9.14% | 5.87% (+/-)5.77%
I haven't been able to fully test the patchset on an SMT system to
check that the regression reported by Preethi has been solved, but the
various tests that I have done don't show any regression so far. The
correction of SD_PREFER_SIBLING mode, and its use at the SMT level,
should have fixed the regression.
The usage_avg_contrib is based on the current implementation of
load-average tracking. I also have a version of usage_avg_contrib that
is based on the new implementation [3], but I haven't provided the
patches and results as [3] is still under review. I can provide changes
on top of [3] to adapt how usage_avg_contrib is computed to the new
mechanism.
Change since V4
- rebase to manage conflicts with changes in selection of busiest group [4]
Change since V3:
- add usage_avg_contrib statistic which sums the running time of tasks on a rq
- use usage_avg_contrib instead of runnable_avg_sum for cpu_utilization
- fix the replacement of power by capacity
- update some comments
Change since V2:
- rebase on top of capacity renaming
- fix wake_affine statistic update
- rework nohz_kick_needed
- optimize the active migration of a task from CPU with reduced capacity
- rename group_activity to group_utilization and remove the unused
total_utilization
- repair SD_PREFER_SIBLING and use it for SMT level
- reorder patchset to gather patches with same topics
Change since V1:
- add 3 fixes
- correct some commit messages
- replace capacity computation by activity
- take into account current cpu capacity
[1] https://lkml.org/lkml/2013/10/18/121
[2] https://lkml.org/lkml/2014/3/19/377
[3] https://lkml.org/lkml/2014/7/18/110
[4] https://lkml.org/lkml/2014/7/25/589
Vincent Guittot (12):
sched: fix imbalance flag reset
sched: remove a wake_affine condition
sched: fix avg_load computation
sched: Allow all archs to set the capacity_orig
ARM: topology: use new cpu_capacity interface
sched: add per rq cpu_capacity_orig
sched: test the cpu's capacity in wake affine
sched: move cfs task on a CPU with higher capacity
sched: add usage_load_avg
sched: get CPU's utilization statistic
sched: replace capacity_factor by utilization
sched: add SD_PREFER_SIBLING for SMT level
arch/arm/kernel/topology.c | 4 +-
include/linux/sched.h | 4 +-
kernel/sched/core.c | 3 +-
kernel/sched/fair.c | 356 ++++++++++++++++++++++++++-------------------
kernel/sched/sched.h | 3 +-
5 files changed, 211 insertions(+), 159 deletions(-)
--
1.9.1