This is the second version of my patchset for ftrace support.
Actually v1 was submitted several weeks ago, but is still held in moderation.
(Just ignore them for now.)
There is another implementation from Cavium Networks, but the two efforts
are independent, and my code has additional system call trace support.
I confirmed that I could compile the patches on v3.12-rc4 by Linaro's
coming 2013.10 gcc (4.8.2), and that the kernel worked on Fast Model
with the following tracers:
function tracer with dynamic ftrace
function graph tracer with dynamic ftrace
syscall tracepoint
irqsoff & preemptirqsoff (which use CALLER_ADDRx)
Also verified with in-kernel tests, FTRACE_SELFTEST, FTRACE_STARTUP_TEST
and EVENT_TRACE_TEST_SYSCALLS.
Patch [3/6] has warnings from checkpatch, but they follow the style of other
architectures.
Please note that the host's elf.h must have the AArch64 definitions,
EM_AARCH64 and R_AARCH64_ABS64, to build the kernel. See [4/6].
Issues
* Can we optimize register usage in asm (by not saving x0, x1 and x2)? [1/6]
* Do we need "fault protection" code in ftrace_modify_code()? [1/6]
It exists in x86 and other architectures, but not in arm.
* We may be able to use aarch64_insn_patch_text_nosync() instead of
ftrace_modify_code(). [2/6] But the former function does not use
probe_kernel_write(). Is this safe?
Changes from v1 to v2:
* split one patch into several pieces for easier review
(especially function tracer + dynamic ftrace + CALLER_ADDRx)
* put return_address() in a separate file
* renamed __mcount to _mcount (it was my mistake)
* changed stackframe handling to get parent's frame pointer
* removed ARCH_SUPPORTS_FTRACE_OPS
* switched to "hotpatch" interfaces from Huawei
* revised descriptions in comments
AKASHI Takahiro (6):
arm64: Add ftrace support
arm64: ftrace: Add dynamic ftrace support
arm64: ftrace: Add CALLER_ADDRx macros
ftrace: Add arm64 support to recordmcount
arm64: ftrace: Add system call tracepoint
arm64: Add 'notrace' attribute to unwind_frame() for ftrace
arch/arm64/Kconfig | 6 +
arch/arm64/include/asm/ftrace.h | 54 +++++++++
arch/arm64/include/asm/syscall.h | 1 +
arch/arm64/include/asm/thread_info.h | 1 +
arch/arm64/include/asm/unistd.h | 2 +
arch/arm64/kernel/Makefile | 9 +-
arch/arm64/kernel/arm64ksyms.c | 4 +
arch/arm64/kernel/entry-ftrace.S | 211 ++++++++++++++++++++++++++++++++++
arch/arm64/kernel/entry.S | 1 +
arch/arm64/kernel/ftrace.c | 186 ++++++++++++++++++++++++++++++
arch/arm64/kernel/ptrace.c | 10 ++
arch/arm64/kernel/return_address.c | 55 +++++++++
arch/arm64/kernel/stacktrace.c | 2 +-
scripts/recordmcount.c | 4 +
scripts/recordmcount.pl | 5 +
15 files changed, 549 insertions(+), 2 deletions(-)
create mode 100644 arch/arm64/include/asm/ftrace.h
create mode 100644 arch/arm64/kernel/entry-ftrace.S
create mode 100644 arch/arm64/kernel/ftrace.c
create mode 100644 arch/arm64/kernel/return_address.c
--
1.7.9.5
This patchset adds audit support on arm64.
The implementation is just like in other architectures,
and so I think little explanation is needed.
I verified this patch with some commands on both a 64-bit rootfs
and a 32-bit rootfs (but only in little-endian):
# auditctl -a exit,always -S openat -F path=/etc/inittab
# auditctl -a exit,always -F dir=/tmp -F perm=rw
# auditctl -a task,always
# autrace /bin/ls
What else?
(Thanks to Clayton for his cross-compiling patch)
I'd like to discuss the following issues:
(issues)
* AUDIT_ARCH_*
Why do we need to distinguish big-endian and little-endian? [2/4]
* AArch32
We need to add a check for identifying the endian in 32-bit tasks. [3/4]
* syscall numbers in AArch32
Currently all the definitions are added in unistd32.h with
"ifdef __AARCH32_AUDITSYSCALL" to use asm-generic/audit_*.h. [3/4]
"ifdef" is necessary to avoid a conflict with 64-bit definitions.
Do we need a more sophisticated way?
* TIF_AUDITSYSCALL
Most architectures, except x86, do not check TIF_AUDITSYSCALL. Why not? [4/4]
* Userspace audit package
There are some missing syscall definitions in lib/aarch64_table.h.
There is no support for AUDIT_ARCH_ARM (I mean LE. armeb is BE).
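
For the AUDIT_ARCH_* question above, here is a minimal sketch of how an audit arch token combines an ELF machine number with word-size and endianness flags. The constant values are the ones I recall from include/uapi/linux/audit.h and the ELF machine definitions, so please check them against your tree; the helper name audit_arch() is hypothetical and is not part of this series.

```python
# Values as recalled from the uapi headers (assumption: verify against your tree).
EM_ARM = 40
EM_AARCH64 = 183
ARCH_64BIT = 0x80000000  # __AUDIT_ARCH_64BIT
ARCH_LE = 0x40000000     # __AUDIT_ARCH_LE


def audit_arch(machine, is_64bit, little_endian):
    """Compose an AUDIT_ARCH_* style token (hypothetical helper)."""
    token = machine
    if is_64bit:
        token |= ARCH_64BIT
    if little_endian:
        token |= ARCH_LE
    return token


if __name__ == "__main__":
    # LE and BE 32-bit ARM differ only in the endianness flag.
    print(hex(audit_arch(EM_ARM, False, True)))    # AUDIT_ARCH_ARM
    print(hex(audit_arch(EM_ARM, False, False)))   # AUDIT_ARCH_ARMEB
    print(hex(audit_arch(EM_AARCH64, True, True)))
```

Because the endianness is part of the token, a filter written for the LE token does not match a BE task, which is why userspace needs a table per token.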
AKASHI Takahiro (4):
audit: Enable arm64 support
arm64: Add audit support
arm64: audit: Add AArch32 support
arm64: audit: Add audit hook in ptrace/syscall_trace
arch/arm64/Kconfig | 3 +
arch/arm64/include/asm/audit32.h | 12 ++
arch/arm64/include/asm/ptrace.h | 5 +
arch/arm64/include/asm/syscall.h | 18 ++
arch/arm64/include/asm/thread_info.h | 1 +
arch/arm64/include/asm/unistd32.h | 387 ++++++++++++++++++++++++++++++++++
arch/arm64/kernel/Makefile | 4 +
arch/arm64/kernel/audit.c | 77 +++++++
arch/arm64/kernel/audit32.c | 46 ++++
arch/arm64/kernel/entry.S | 3 +
arch/arm64/kernel/ptrace.c | 12 ++
include/uapi/linux/audit.h | 2 +
init/Kconfig | 2 +-
13 files changed, 571 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/include/asm/audit32.h
create mode 100644 arch/arm64/kernel/audit.c
create mode 100644 arch/arm64/kernel/audit32.c
--
1.7.9.5
From: Vijaya Kumar K <Vijaya.Kumar(a)caviumnetworks.com>
Based on ARM64 KGDB support patches, KGDB support for
FPSIMD is added. Only debugging of FPSIMD kernel context
is supported.
This patch requires Ard's patches for in-kernel support and the
patch below for holding the thread's fpsimd state.
http://permalink.gmane.org/gmane.linux.ports.arm.kernel/277228
So CONFIG_KERNEL_MODE_NEON should be enabled.
With this, FPSIMD registers can be viewed or set from the gdb tool.
Unlike CPU registers, the FPSIMD registers are not saved on
exception entry. Given the known restriction that FPSIMD should not be
touched in interrupt/exception context, in this patch the FPSIMD
registers are directly read/written on gdb tool request.
Here, the FPSIMD registers are read and restored for every FPSIMD
register read and write by the gdb tool. This has an impact on
gdb tool response, but it is negligible. Other architectures such as
mips are implemented similarly.
v2 changes:
- Added API to know thread fpsimd state by checking
TIF_FOREIGN_FPSTATE flag. This is based on below patch
http://permalink.gmane.org/gmane.linux.ports.arm.kernel/277228
- Allow FPSIMD register access only when FPSIMD is in use
by the current thread
v1 changes:
- Initial patch
Tested on ARM64 simulator
Vijaya Kumar K (1):
ARM64: KGDB: Add FP/SIMD debug support
arch/arm64/include/asm/fpsimd.h | 1 +
arch/arm64/kernel/fpsimd.c | 5 ++
arch/arm64/kernel/kgdb.c | 105 +++++++++++++++++++++++++--------------
3 files changed, 73 insertions(+), 38 deletions(-)
--
1.7.9.5
Patchsets related to hibernation resume:
- enhancement to make the use of an existing resume file more general
- enhance name_to_dev_t to ignore trailing newlines coming from userspace.
Both patches are based on the 3.12-rc3 tag. This was tested on a
Pandaboard with partial hibernation support, and compiled for x86.
[PATCH 1/2] init/do_mounts.c: ignore final \n in name_to_dev_t
init/do_mounts.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)
Changes name_to_dev_t to handle a trailing newline in the
input buffer, which will allow name_to_dev_t to be used
directly with user buffers without requiring a copy.
Also adds a const to the name parameter which reflects
how name_to_dev_t is treating the input buffer currently.
This also allows direct use of user buffers
(from resume_store for example).
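
A minimal sketch of the trailing-newline handling described above, covering only the "major:minor" form. This is an illustration, not the code from the patch: parse_major_minor() is a hypothetical name, and the 20-bit minor field mirrors the kernel's internal dev_t layout as I understand it.

```python
import re

MINORBITS = 20  # assumption: internal dev_t keeps the minor in the low 20 bits
_MAJOR_MINOR = re.compile(r"([0-9]+):([0-9]+)")


def parse_major_minor(buf):
    """Parse "major:minor", ignoring one final newline (hypothetical helper).

    Returns 0 when the text is not a valid "major:minor" pair.
    """
    if buf.endswith("\n"):
        buf = buf[:-1]  # drop the newline that `echo` appends
    match = _MAJOR_MINOR.fullmatch(buf)
    if match is None:
        return 0
    major, minor = int(match.group(1)), int(match.group(2))
    return (major << MINORBITS) | minor


if __name__ == "__main__":
    # Both spellings name the same device.
    print(parse_major_minor("8:1\n") == parse_major_minor("8:1"))
```

The point of the sketch is that the caller can pass the buffer as written by userspace, without first copying it to strip the newline.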
[PATCH 2/2] PM / Hibernate: use name_to_dev_t to parse resume
kernel/power/hibernate.c | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)
Use name_to_dev_t to parse the /sys/power/resume file, making the
syntax more flexible. It supports the previously used syntax
and additionally can support other formats such as
/dev/devicenode and UUID= formats.
By changing /sys/power/resume to accept the same syntax as
the resume=device parameter, we can parse the resume=device
in the initrd init script and use the resume device directly
from the kernel command line.
Changes in v3:
--------------
* Dropped documentation patch as it went in through trivial
* Added patch for name_to_dev_t to support directly parsing userspace
buffer
Changes in v2:
--------------
* Added check for null return of kstrndup in hibernate.c
Thanks,
Sebastian
[repost: adding kvmarm mailing list as per Christoffer's request]
Hi Guys,
Here is a series that enables KVM support for V7 big endian kernels. Mostly
it deals with BE KVM host support. Marc Zyngier showed before with his patches
how a BE guest could run on top of an LE host. With these patches a BE guest
runs on top of a BE host. Using Marc's kvmtool with a few additional changes,
I tested that a BE host could run an LE guest. I also verified that there were
no regressions in the BE guest on top of LE host case.
Note that the posted series covers only kernel side changes. The changes were
tested inside a bigger setup with additional changes in qemu and kvmtool.
I will post those changes separately to the proper aliases, but for
completeness' sake Appendix A gives pointers to git repositories and branches
with all needed changes.
Please note the first patch is not related to BE KVM per se. I ran
into an issue of conflicting 'push' identifier use while trying to include
assembler.h into KVM .S files. Details of the issue I observed are covered in
Appendix B. The first patch is my take on solving it.
Victor Kamensky (5):
ARM: kvm: replace push and pop with stmdb and ldmia instrs to enable
assembler.h inclusion
ARM: fix KVM assembler files to work in BE case
ARM: kvm one_reg coproc set and get BE fixes
ARM: kvm vgic mmio should return data in BE format in BE case
ARM: kvm MMIO support BE host running LE code
arch/arm/include/asm/assembler.h | 7 +++
arch/arm/include/asm/kvm_asm.h | 4 +-
arch/arm/include/asm/kvm_emulate.h | 22 +++++++--
arch/arm/kvm/coproc.c | 94 ++++++++++++++++++++++++++++----------
arch/arm/kvm/init.S | 7 ++-
arch/arm/kvm/interrupts.S | 50 +++++++++++---------
arch/arm/kvm/interrupts_head.S | 61 +++++++++++++++----------
virt/kvm/arm/vgic.c | 4 +-
8 files changed, 168 insertions(+), 81 deletions(-)
--
1.8.1.4
Thanks,
Victor
Appendix A: Testing and Full Setup Description
----------------------------------------------
I) No mixed mode setup, i.e. BE guest on BE host; and LE guest
on LE host, tested to make sure there are no regressions.
KVM host and guest kernels:
TC2 on top of Linus 3.13-rc4 (this patch series):
git: git://git.linaro.org/people/victor.kamensky/linux-linaro-tracking-be.git
branch: armv7be-kvm-3.13-rc4
TC2 and Arndale on top of Linaro BE tree:
git: git://git.linaro.org/people/victor.kamensky/linux-linaro-tracking-be.git
branch: llct-be-20131216-kvm
- TC1 kernels used as guests
qemu:
git: git://git.linaro.org/people/victor.kamensky/qemu-be.git
branch: armv7be-v1
description: changes to run qemu on armeb target; and other
changes to work with be image on top of be host
kvmtool:
git: git://git.linaro.org/people/victor.kamensky/linux-linaro-tracking-be.git
branch: kvmtool-armv7be-v1
description: minimal changes to build kvmtool for armeb target; and
tiny change with virtio magic
II) Mixed mode setup: all possible combinations within V7 (LE guest on BE host;
BE guest on LE host, as in Marc's setup, tested to make sure there are no
regressions), only with kvmtool.
This work is based on Marc Zyngier's work that made a BE guest run on top
of an LE host. For this setup a special version of kvmtool should be used, and
in addition I had to apply a patch to the guest kernel that switches
virtio config reads to be LE only; it is made on top of Rusty
Russell's previous changes. Effectively I just had to make a very minor
addition to make an LE guest work on a BE host; most of the heavy lifting was
done before by Marc.
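
To illustrate what "LE only" config reads mean, here is a minimal sketch that decodes a 32-bit field from a config buffer with a fixed little-endian layout versus the host's native byte order. The helper names and the buffer are hypothetical; this is not code from the guest kernel patch.

```python
import sys


def read_config_le32(config, offset):
    """Decode a 32-bit field stored little-endian, whatever the host order."""
    return int.from_bytes(config[offset:offset + 4], "little")


def read_config_native32(config, offset):
    """Decode the same field in host byte order (sys.byteorder)."""
    return int.from_bytes(config[offset:offset + 4], sys.byteorder)


if __name__ == "__main__":
    config = bytes([0x78, 0x56, 0x34, 0x12])  # 0x12345678 stored little-endian
    print(hex(read_config_le32(config, 0)))
    # The native read only agrees with the LE read on a little-endian host.
    print(read_config_native32(config, 0) == read_config_le32(config, 0))
```

With a fixed LE layout, a BE and an LE guest both decode the same bytes to the same value, so the device side does not need to know the guest's endianness.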
KVM host kernels: as in previous setup
Guest TC1 kernels with LE virtio config patch:
git: git://git.linaro.org/people/victor.kamensky/linux-linaro-tracking-be.git
branch: virtio-leconfig-3.13-rc4
kvmtool:
git: git://git.linaro.org/people/victor.kamensky/linux-linaro-tracking-be.git
branch: kvmtool-mixed-v1
description: based on git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git
branch kvm-arm64/kvmtool-be-on-le; adds missing include fix; above armeb target
build patches; and one fix related to BE mode
qemu:
git: git://git.linaro.org/people/victor.kamensky/qemu-be.git
branch: armv7be-leconfig-v1
description: change virtio-blk so that qemu can work with a guest image
where the virtio leconfig change is made; note it does not work in mixed mode;
to do so qemu would need a bunch of changes similar to those Marc made in kvmtool
Appendix B: kvm asm file and asm/assembler.h file issue
-------------------------------------------------------
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index ddc1553..5d3b511 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -25,6 +25,7 @@
#include <asm/kvm_asm.h>
#include <asm/kvm_arm.h>
#include <asm/vfpmacros.h>
+#include <asm/assembler.h>
#include "interrupts_head.S"
.text
Including asm/assembler.h as above produces the following compilation errors:
/run/media/kamensky/wd/linaro/linux-linaro-core-tracking/092913/linux-linaro-tracking-be/arch/arm/kvm/interrupts.S: Assembler messages:
/run/media/kamensky/wd/linaro/linux-linaro-core-tracking/092913/linux-linaro-tracking-be/arch/arm/kvm/interrupts.S:51: Error: ARM register expected -- `lsr {r2,r3}'
/run/media/kamensky/wd/linaro/linux-linaro-core-tracking/092913/linux-linaro-tracking-be/arch/arm/kvm/interrupts.S:100: Error: ARM register expected -- `lsr {r2}'
/run/media/kamensky/wd/linaro/linux-linaro-core-tracking/092913/linux-linaro-tracking-be/arch/arm/kvm/interrupts.S:100: Error: ARM register expected -- `lsr {r4-r12}'
Paul & Vincent & Morten,
I came up with the following rough idea during this KS. I would like an
internal review before sending it to LKML. Would you give some comments?
==========================
1, The current scheduler load balance works in bottom-up mode: each CPU needs
to initiate the balance by itself. Take an integrated computer system with
4 levels of scheduler domains: smt/core/cpu/numa.
If there are just 2 tasks in the whole system and both are running on cpu0,
the current load balance needs to pull a task to another smt in the smt domain,
then to another core, then to another cpu, and finally to another numa node.
In total it needs 4 task moves to get the system balanced.
Generally, the task moving complexity is
O(nm log n), n := nr_cpus, m := nr_tasks
PeterZ has an excellent summary and explanation of this in
kernel/sched/fair.c:4605
Another weakness of the current LB is that every cpu needs to fetch the other
cpus' load info repeatedly and try to figure out the busiest sched
group/queue on every sched domain level, but may still not move any
task. One of the reasons is that a cpu can only pull tasks, not push them.
2, Consider the huge cost of task moving: CS, tlb/cache refill, and the
useless fetching of remote cpu load info. If we can find a better solution
for load balance, such as reducing the number of balance operations to
O(m) m := nr_tasks
it will be a great win on performance. In the above example, we could move
a task from cpu0 directly to another numa node. That needs only 1 task move
and saves 3 CS and tlb/cache refills.
To get there, a rough idea is to change the load balance behaviour
to top-down mode. Say each cpu reports its own load status in per-cpu
memory, and a baby-sitter in the system collects this load info,
decides centrally how to move tasks, and finally sends an IPI to each hungry
cpu to let it pull its load quota from the appointed cpus.
In the above example, the baby-sitter will fetch each cpu's load info,
then send a pull-task IPI to let a cpu in another numa node pull one task
from cpu0. So the task pulling still involves just 2 cpus, and we can
reuse the move_tasks functions.
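
A minimal sketch of the centralized decision step, assuming "load" is simply the number of runnable tasks each cpu reported. plan_pulls() is a hypothetical name; the sketch ignores topology, task weights and the IPI itself, and only shows that the planner can emit one pull per task that has to move.

```python
def plan_pulls(loads):
    """Return (dst_cpu, src_cpu) pulls, one task each, that level `loads`.

    `loads[i]` is the task count cpu i reported. The plan stops once the
    busiest and the idlest cpu differ by at most one task.
    """
    loads = list(loads)  # work on a copy of the reported load
    pulls = []
    if not loads:
        return pulls
    cpus = range(len(loads))
    while True:
        src = max(cpus, key=loads.__getitem__)  # busiest cpu
        dst = min(cpus, key=loads.__getitem__)  # hungriest cpu
        if loads[src] - loads[dst] <= 1:
            break
        loads[src] -= 1
        loads[dst] += 1
        pulls.append((dst, src))  # dst would get the pull-task IPI
    return pulls


if __name__ == "__main__":
    # Two tasks on cpu0 of an 8-cpu system: a single pull balances it.
    print(plan_pulls([2, 0, 0, 0, 0, 0, 0, 0]))
```

A real planner would pick the destination by topology (for example a cpu in another numa node, as in the example above) rather than by lowest index.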
BTW, the baby-sitter can take care of all kinds of balance: regular balance,
idle balance, wake up balance.
3, One concern about top-down mode is that the baby-sitter needs remote
access to cpu load info at the top domain level every time. But the fact is
that the current load balance also needs to get remote cpu load info for top
level domain balance, and even worse, such remote access may be useless.
-- since there is just one thread reading the info and no competing
writer, Paul, do you think it is a concern worth worrying about?
BTW, to reduce unnecessary remote info fetching, we can use the current
idle_cpus_mask in nohz and simply skip the idle cpus in this cpumask.
4, From a power saving POV, top-down gives the whole system's cpu topology
info directly. So besides reducing CS, it can reduce the disturbance of idle
cpus by transient tasks, and let idle cpus sleep better.
--
Thanks
Alex
This patch set is based on part1 "Make ACPI core running on ARM64" patch
set.
Now that we can get the ACPI tables from UEFI, we can use these tables
to initialise the system.
The GIC structure (meaning the GIC cpu interface) and the GIC distributor
structure in the MADT table contain the GIC cpu interface base address
and the GIC distributor base address, which can be used to initialise the GIC.
Furthermore, the parked address in the GIC structure can be used as the cpu
release address for spin table SMP initialisation.
This patch set uses this information to init SMP and GIC.
Please refer to chapters 5.2.12.14/15 of the ACPI 5.0 spec for GIC and GIC
distributor structure information.
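
A minimal sketch of how the MADT entries can be walked to find the two structures mentioned above. It relies only on each interrupt controller structure starting with a one-byte type and a one-byte length; the type values 11 (GIC) and 12 (GIC distributor) are my reading of the ACPI 5.0 numbering and should be checked against the spec. The function names are hypothetical, and field offsets inside the structures are deliberately not decoded.

```python
MADT_TYPE_GIC = 11              # assumption: GIC structure type in ACPI 5.0
MADT_TYPE_GIC_DISTRIBUTOR = 12  # assumption: GIC distributor structure type


def iter_madt_entries(entries):
    """Yield (type, raw_bytes) for each structure in the MADT entry area.

    `entries` is the byte area that follows the fixed MADT header.
    """
    offset = 0
    while offset + 2 <= len(entries):
        etype, length = entries[offset], entries[offset + 1]
        if length < 2 or offset + length > len(entries):
            raise ValueError("malformed MADT entry at offset %d" % offset)
        yield etype, entries[offset:offset + length]
        offset += length


def find_gic_entries(entries):
    """Return only the GIC and GIC distributor structures."""
    wanted = (MADT_TYPE_GIC, MADT_TYPE_GIC_DISTRIBUTOR)
    return [(t, raw) for t, raw in iter_madt_entries(entries) if t in wanted]


if __name__ == "__main__":
    blob = bytes([11, 4, 0xAA, 0xBB, 12, 3, 0xCC, 0, 2])
    for etype, raw in find_gic_entries(blob):
        print(etype, raw.hex())
```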
Amit Daniel Kachhap (1):
irqdomain: Add a new API irq_create_acpi_mapping()
Hanjun Guo (8):
ARM64 / ACPI: Implement core functions for parsing MADT table
ARM64 / ACPI: Prefill cpu possible/present maps and map logical cpu
id to APIC id
ARM64 / ACPI: Introduce map_gic_id() to get apic id from MADT or _MAT
method
ARM64 / ACPI: Use Parked Address in GIC structure for spin table SMP
initialisation
ACPI: Define ACPI_IRQ_MODEL_GIC needed for arm
Irqchip / gic: Set as default domain so we can access from ACPI
ACPI / ARM64: Update acpi_register_gsi to register with the core IRQ
subsystem
ACPI / GIC: Initialize GIC using the information in MADT
arch/arm64/include/asm/acpi.h | 16 +-
arch/arm64/kernel/irq.c | 5 +
arch/arm64/kernel/setup.c | 2 +
arch/arm64/kernel/smp.c | 2 +
arch/arm64/kernel/smp_spin_table.c | 16 +-
drivers/acpi/bus.c | 3 +
drivers/acpi/plat/arm-core.c | 397 +++++++++++++++++++++++++++++++++++-
drivers/acpi/processor_core.c | 26 +++
drivers/acpi/tables.c | 21 ++
drivers/irqchip/irq-gic.c | 7 +
include/linux/acpi.h | 9 +
kernel/irq/irqdomain.c | 27 +++
12 files changed, 521 insertions(+), 10 deletions(-)
--
1.7.9.5
Add AARCH64 specific support. This includes the following:
- AARCH64 perf registers definition and hooks,
- compat mode registers use, i.e. profiling a 32-bit binary on
a 64-bit system,
- unwinding using the dwarf information from the .debug_frame
section of the ELF binary; only in 64-bit mode,
- unwinding using the frame pointer information; in 64-bit and
compat modes.
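
A minimal sketch of frame pointer unwinding in 64-bit mode, assuming the AArch64 frame record layout where the frame pointer points at a pair holding the caller's frame pointer followed by the saved link register. Memory is modelled as a dictionary and unwind_fp() is a hypothetical name; this is not the code from the series, and the compat mode layout is not covered.

```python
def unwind_fp(memory, fp, max_depth=64):
    """Collect return addresses by following AArch64-style frame records.

    `memory` maps an address to the 64-bit value stored there; the record
    at `fp` holds the caller's fp at `fp` and the saved lr at `fp + 8`.
    """
    callchain = []
    while fp and len(callchain) < max_depth:
        if fp not in memory or fp + 8 not in memory:
            break  # unreadable frame: stop unwinding
        callchain.append(memory[fp + 8])
        next_fp = memory[fp]
        if next_fp <= fp:
            break  # the stack grows down, so caller frames sit higher
        fp = next_fp
    return callchain


if __name__ == "__main__":
    stack = {
        0x1000: 0x1020, 0x1008: 0xAAA,  # innermost frame
        0x1020: 0x1040, 0x1028: 0xBBB,
        0x1040: 0x0,    0x1048: 0xCCC,  # outermost frame, fp chain ends
    }
    print([hex(addr) for addr in unwind_fp(stack, 0x1000)])
```

The depth limit and the "frames must move up" check keep the walk bounded when the user stack is corrupted.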
ToDo:
- add support for unwinding using the dwarf information in compat
mode. This requires some changes to the libunwind code.
Tested on ARMv7 and ARMv8 platforms. The compat mode has been tested
on ARMv8 using statically built 32-bit binaries.
Jean Pihet (3):
ARM64: perf: add support for perf registers API
ARM64: perf: wire up perf_regs and unwind support
ARM64: perf: add support for frame pointer unwinding in compat mode
arch/arm64/Kconfig | 2 +
arch/arm64/include/asm/ptrace.h | 1 +
arch/arm64/include/uapi/asm/Kbuild | 1 +
arch/arm64/include/uapi/asm/perf_regs.h | 40 ++++++++++++++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/perf_event.c | 75 +++++++++++++++++++++++---
arch/arm64/kernel/perf_regs.c | 44 ++++++++++++++++
tools/perf/arch/arm64/Makefile | 7 +++
tools/perf/arch/arm64/include/perf_regs.h | 88 +++++++++++++++++++++++++++++++
tools/perf/arch/arm64/util/dwarf-regs.c | 80 ++++++++++++++++++++++++++++
tools/perf/arch/arm64/util/unwind.c | 82 ++++++++++++++++++++++++++++
tools/perf/config/Makefile | 8 ++-
12 files changed, 420 insertions(+), 9 deletions(-)
create mode 100644 arch/arm64/include/uapi/asm/perf_regs.h
create mode 100644 arch/arm64/kernel/perf_regs.c
create mode 100644 tools/perf/arch/arm64/Makefile
create mode 100644 tools/perf/arch/arm64/include/perf_regs.h
create mode 100644 tools/perf/arch/arm64/util/dwarf-regs.c
create mode 100644 tools/perf/arch/arm64/util/unwind.c
--
1.7.11.7