Adding libdw DWARF post unwind support, which is part of elfutils-devel/libdw-dev package from version 0.158.
Also includes the test suite for dwarf unwinding, by adding the arch specific test code and the perf_regs_load function.
This series depends on the following kernel patch series:
- AARCH64 unwinding support [1]. Already mainlined.
- ARM libdw integration [2],
and on the changes from the branch for:
- libdw AARCH64 unwinding support [3].

[1] http://www.spinics.net/lists/arm-kernel/msg304483.html
[2] https://lkml.org/lkml/2014/5/6/366
[3] https://git.fedorahosted.org/cgit/elfutils.git/log/?h=mjw/aarch64-unwind
ToDo: investigate the libdw unwinding problem with compat binaries (i.e. ARMv7 binaries running on ARMv8). Since this functionality works ok with libunwind, the problem should be in libdw compat support [3].
Jean Pihet (3):
  perf tests: Introduce perf_regs_load function on ARM64
  perf tests: Add dwarf unwind test on ARM64
  perf tools: Add libdw DWARF post unwind support for ARM64

 tools/perf/Makefile.perf                   |  2 +-
 tools/perf/arch/arm64/Makefile             |  7 ++++
 tools/perf/arch/arm64/include/perf_regs.h  |  5 +++
 tools/perf/arch/arm64/tests/dwarf-unwind.c | 59 ++++++++++++++++++++++++++++++
 tools/perf/arch/arm64/tests/regs_load.S    | 39 ++++++++++++++++++++
 tools/perf/arch/arm64/util/unwind-libdw.c  | 53 +++++++++++++++++++++++++++
 tools/perf/tests/builtin-test.c            |  3 +-
 tools/perf/tests/tests.h                   |  3 +-
 8 files changed, 168 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/arch/arm64/tests/dwarf-unwind.c
 create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
 create mode 100644 tools/perf/arch/arm64/util/unwind-libdw.c
--- Rebased on the latest jolsa/perf/core
Introducing the perf_regs_load function, which is going to be used by the dwarf unwind test in the following patches.
It takes a single argument, a pointer to the regs dump buffer, and populates it with the current register values, as expected by the perf built-in unwinding test.
Signed-off-by: Jean Pihet jean.pihet@linaro.org
Cc: Steve Capper steve.capper@linaro.org
Cc: Corey Ashford cjashfor@linux.vnet.ibm.com
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Ingo Molnar mingo@kernel.org
Cc: Namhyung Kim namhyung@kernel.org
Cc: Paul Mackerras paulus@samba.org
Cc: Peter Zijlstra a.p.zijlstra@chello.nl
Cc: Arnaldo Carvalho de Melo acme@infradead.org
Cc: David Ahern dsahern@gmail.com
Cc: Jiri Olsa jolsa@redhat.com
---
 tools/perf/arch/arm64/Makefile            |  1 +
 tools/perf/arch/arm64/include/perf_regs.h |  2 ++
 tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
 3 files changed, 42 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
index 67e9b3d..9b8f87e 100644
--- a/tools/perf/arch/arm64/Makefile
+++ b/tools/perf/arch/arm64/Makefile
@@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
 endif
diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
index 2359546..1e052f1 100644
--- a/tools/perf/arch/arm64/include/perf_regs.h
+++ b/tools/perf/arch/arm64/include/perf_regs.h
@@ -9,6 +9,8 @@
 #define PERF_REG_IP	PERF_REG_ARM64_PC
 #define PERF_REG_SP	PERF_REG_ARM64_SP
 
+void perf_regs_load(u64 *regs);
+
 static inline const char *perf_reg_name(int id)
 {
 	switch (id) {
diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
new file mode 100644
index 0000000..92ab968
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/regs_load.S
@@ -0,0 +1,39 @@
+#include <linux/linkage.h>
+
+/*
+ * Implementation of void perf_regs_load(u64 *regs);
+ *
+ * This functions fills in the 'regs' buffer from the actual registers values,
+ * in the way the perf built-in unwinding test expects them:
+ * - the PC at the time at the call to this function. Since this function
+ *   is called using a bl instruction, the PC value is taken from LR,
+ * - the current SP (not touched by this function),
+ * - the current value of LR is merely retrieved and stored because the
+ *   value before the call to this function is unknown at this time; it will
+ *   be unwound from the dwarf information in unwind__get_entries.
+ */
+
+.text
+.type perf_regs_load,%function
+ENTRY(perf_regs_load)
+	stp x0,  x1,  [x0], #16	// store x0..x29
+	stp x2,  x3,  [x0], #16
+	stp x4,  x5,  [x0], #16
+	stp x6,  x7,  [x0], #16
+	stp x8,  x9,  [x0], #16
+	stp x10, x11, [x0], #16
+	stp x12, x13, [x0], #16
+	stp x14, x15, [x0], #16
+	stp x16, x17, [x0], #16
+	stp x18, x19, [x0], #16
+	stp x20, x21, [x0], #16
+	stp x22, x23, [x0], #16
+	stp x24, x25, [x0], #16
+	stp x26, x27, [x0], #16
+	stp x28, x29, [x0], #16
+	mov x1,  sp
+	stp x30, x1,  [x0], #16	// store lr and sp
+	str x30, [x0]		// store pc as lr in order to skip the call
+				//  to this function
+	ret
+ENDPROC(perf_regs_load)
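The buffer layout that the assembly above produces can be modelled in portable C. This is only an illustrative sketch: the slot order follows the stores in regs_load.S (x0..x29, then LR, SP and PC), while `struct fake_cpu`, `model_regs_load()` and the `SLOT_*` names are hypothetical and do not exist in perf.

```c
#include <assert.h>
#include <stdint.h>

/* Slot order of the dump buffer, following the stores in regs_load.S:
 * x0..x29 first, then LR, SP and finally PC. Names are illustrative. */
enum {
	SLOT_X0 = 0, SLOT_X29 = 29, SLOT_LR = 30,
	SLOT_SP = 31, SLOT_PC = 32, SLOT_MAX = 33
};

/* Hypothetical stand-in for the live CPU state at the call site. */
struct fake_cpu {
	uint64_t x[31];	/* x0..x30, x30 being the link register */
	uint64_t sp;
};

/* Model of perf_regs_load(): copy the registers and report LR as PC,
 * so that unwinding starts in the caller rather than in the helper. */
static void model_regs_load(uint64_t *regs, const struct fake_cpu *cpu)
{
	assert(regs && cpu);
	for (int i = SLOT_X0; i <= SLOT_LR; i++)
		regs[i] = cpu->x[i];
	regs[SLOT_SP] = cpu->sp;
	regs[SLOT_PC] = cpu->x[30];
}
```

Reporting LR as PC is the design choice the comment in regs_load.S describes: the helper was entered through a bl instruction, so LR holds the address in the caller where the unwind should begin.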
Adding dwarf unwind test, that sets up live machine data over the perf test thread and does the remote unwind.
Need to use -fno-optimize-sibling-calls for test compilation, otherwise 'krava_*' function calls are optimized into jumps and omitted from the stack unwind.
Cc: Jiri Olsa jolsa@redhat.com
Cc: Corey Ashford cjashfor@linux.vnet.ibm.com
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Ingo Molnar mingo@kernel.org
Cc: Namhyung Kim namhyung@kernel.org
Cc: Paul Mackerras paulus@samba.org
Cc: Peter Zijlstra a.p.zijlstra@chello.nl
Cc: Arnaldo Carvalho de Melo acme@infradead.org
Cc: David Ahern dsahern@gmail.com
Signed-off-by: Jean Pihet jean.pihet@linaro.org
---
 tools/perf/Makefile.perf                   |  2 +-
 tools/perf/arch/arm64/Makefile             |  1 +
 tools/perf/arch/arm64/include/perf_regs.h  |  3 ++
 tools/perf/arch/arm64/tests/dwarf-unwind.c | 59 ++++++++++++++++++++++++++++++
 tools/perf/tests/builtin-test.c            |  3 +-
 tools/perf/tests/tests.h                   |  3 +-
 6 files changed, 68 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/arch/arm64/tests/dwarf-unwind.c
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index dea2d633..6cde50f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -411,7 +411,7 @@ LIB_OBJS += $(OUTPUT)tests/code-reading.o
 LIB_OBJS += $(OUTPUT)tests/sample-parsing.o
 LIB_OBJS += $(OUTPUT)tests/parse-no-sample-id-all.o
 ifndef NO_DWARF_UNWIND
-ifeq ($(ARCH),$(filter $(ARCH),x86 arm))
+ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64))
 LIB_OBJS += $(OUTPUT)tests/dwarf-unwind.o
 endif
 endif
diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
index 9b8f87e..221f21d 100644
--- a/tools/perf/arch/arm64/Makefile
+++ b/tools/perf/arch/arm64/Makefile
@@ -5,4 +5,5 @@ endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/dwarf-unwind.o
 endif
diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
index 1e052f1..e74df99 100644
--- a/tools/perf/arch/arm64/include/perf_regs.h
+++ b/tools/perf/arch/arm64/include/perf_regs.h
@@ -9,6 +9,9 @@
 #define PERF_REG_IP	PERF_REG_ARM64_PC
 #define PERF_REG_SP	PERF_REG_ARM64_SP
 
+#define PERF_REGS_MAX	PERF_REG_ARM64_MAX
+#define PERF_SAMPLE_REGS_ABI	PERF_SAMPLE_REGS_ABI_64
+
 void perf_regs_load(u64 *regs);
 
 static inline const char *perf_reg_name(int id)
diff --git a/tools/perf/arch/arm64/tests/dwarf-unwind.c b/tools/perf/arch/arm64/tests/dwarf-unwind.c
new file mode 100644
index 0000000..0aa64f3
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/dwarf-unwind.c
@@ -0,0 +1,59 @@
+#include <string.h>
+#include "perf_regs.h"
+#include "thread.h"
+#include "map.h"
+#include "event.h"
+#include "tests/tests.h"
+
+#define STACK_SIZE 8192
+
+static int sample_ustack(struct perf_sample *sample,
+			 struct thread *thread, u64 *regs)
+{
+	struct stack_dump *stack = &sample->user_stack;
+	struct map *map;
+	unsigned long sp;
+	u64 stack_size, *buf;
+
+	buf = malloc(STACK_SIZE);
+	if (!buf) {
+		pr_debug("failed to allocate sample uregs data\n");
+		return -1;
+	}
+
+	sp = (unsigned long) regs[PERF_REG_ARM64_SP];
+
+	map = map_groups__find(&thread->mg, MAP__FUNCTION, (u64) sp);
+	if (!map) {
+		pr_debug("failed to get stack map\n");
+		return -1;
+	}
+
+	stack_size = map->end - sp;
+	stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
+
+	memcpy(buf, (void *) sp, stack_size);
+	stack->data = (char *) buf;
+	stack->size = stack_size;
+	return 0;
+}
+
+int test__arch_unwind_sample(struct perf_sample *sample,
+			     struct thread *thread)
+{
+	struct regs_dump *regs = &sample->user_regs;
+	u64 *buf;
+
+	buf = malloc(sizeof(u64) * PERF_REGS_MAX);
+	if (!buf) {
+		pr_debug("failed to allocate sample uregs data\n");
+		return -1;
+	}
+
+	perf_regs_load(buf);
+	regs->abi = PERF_SAMPLE_REGS_ABI;
+	regs->regs = buf;
+	regs->mask = PERF_REGS_MASK;
+
+	return sample_ustack(sample, thread, buf);
+}
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 5e0764b..7921aa0 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -115,7 +115,8 @@ static struct test {
 		.desc = "Test parsing with no sample_id_all bit set",
 		.func = test__parse_no_sample_id_all,
 	},
-#if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
+#if defined(__x86_64__) || defined(__i386__) || \
+    defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 	{
 		.desc = "Test dwarf unwind",
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 8f91fb0..426680e 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -45,7 +45,8 @@ int test__hists_filter(void);
 int test__mmap_thread_lookup(void);
 int test__thread_mg_share(void);
 
-#if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
+#if defined(__x86_64__) || defined(__i386__) || \
+    defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
On Tue, May 06, 2014 at 05:55:32PM +0200, Jean Pihet wrote:
SNIP

> +#include "tests/tests.h"
> +
> +#define STACK_SIZE 8192
> +
> +static int sample_ustack(struct perf_sample *sample,
> +			 struct thread *thread, u64 *regs)
> +{
> +	struct stack_dump *stack = &sample->user_stack;
> +	struct map *map;
> +	unsigned long sp;
> +	u64 stack_size, *buf;
> +
> +	buf = malloc(STACK_SIZE);
> +	if (!buf) {
> +		pr_debug("failed to allocate sample uregs data\n");
> +		return -1;
> +	}
> +
> +	sp = (unsigned long) regs[PERF_REG_ARM64_SP];
> +
> +	map = map_groups__find(&thread->mg, MAP__FUNCTION, (u64) sp);
> +	if (!map) {
> +		pr_debug("failed to get stack map\n");
> +		return -1;
> +	}

there's a memory leak of 'buf' already fixed for x86:

  perf tests x86: Fix memory leak in sample_ustack()
  commit 763d7f5f2718f085bab5a9e63308349728f3ad12
  Author: Masanari Iida standby24x7@gmail.com
  Date:   Sun Apr 20 00:16:41 2014 +0900
jirka
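The leak is on the path where the map lookup fails after 'buf' has been allocated. A minimal sketch of that error path with the buffer released is shown below. It is not the code of the referenced x86 commit: `struct fake_map`, `find_stack_map()` and `sample_stack()` are hypothetical stand-ins so that the example builds outside the perf tree.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define STACK_SIZE 8192

/* Hypothetical stand-in for a perf map: an address range [start, end). */
struct fake_map {
	uint64_t start, end;
};

/* Hypothetical lookup: returns the map if sp lies inside it, else NULL. */
static const struct fake_map *find_stack_map(const struct fake_map *map,
					     uint64_t sp)
{
	return (map && sp >= map->start && sp < map->end) ? map : NULL;
}

/* Copies up to STACK_SIZE bytes of the stack above sp into a new buffer.
 * 'backing' holds the bytes of the mapped range, starting at map->start. */
static int sample_stack(const struct fake_map *maps,
			const unsigned char *backing, uint64_t sp,
			unsigned char **data, size_t *size)
{
	const struct fake_map *map;
	unsigned char *buf;
	size_t len;

	buf = malloc(STACK_SIZE);
	if (!buf)
		return -1;

	map = find_stack_map(maps, sp);
	if (!map) {
		free(buf);	/* release the buffer before bailing out */
		return -1;
	}

	len = map->end - sp;
	if (len > STACK_SIZE)
		len = STACK_SIZE;

	memcpy(buf, backing + (sp - map->start), len);
	*data = buf;
	*size = len;
	return 0;
}
```

On success the caller owns the returned buffer; on either failure nothing remains allocated.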
Adding libdw DWARF post unwind support, which is part of elfutils-devel/libdw-dev package from version 0.158.
Note: the libdw code needs some support for dwarf unwinding on ARM64, this code is submitted separately on the elfutils ML.
The new code is contained in unwind-libdw.c object, and implements unwind__get_entries unwind interface function.
Signed-off-by: Jean Pihet jean.pihet@linaro.org
Cc: Jiri Olsa jolsa@redhat.com
Cc: Corey Ashford cjashfor@linux.vnet.ibm.com
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Ingo Molnar mingo@kernel.org
Cc: Namhyung Kim namhyung@kernel.org
Cc: Paul Mackerras paulus@samba.org
Cc: Peter Zijlstra a.p.zijlstra@chello.nl
Cc: Arnaldo Carvalho de Melo acme@infradead.org
Cc: David Ahern dsahern@gmail.com
---
 tools/perf/arch/arm64/Makefile            |  5 +++
 tools/perf/arch/arm64/util/unwind-libdw.c | 53 +++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)
 create mode 100644 tools/perf/arch/arm64/util/unwind-libdw.c
diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
index 221f21d..09d6215 100644
--- a/tools/perf/arch/arm64/Makefile
+++ b/tools/perf/arch/arm64/Makefile
@@ -4,6 +4,11 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
+endif
+ifndef NO_LIBDW_DWARF_UNWIND
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libdw.o
+endif
+ifndef NO_DWARF_UNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/dwarf-unwind.o
 endif
diff --git a/tools/perf/arch/arm64/util/unwind-libdw.c b/tools/perf/arch/arm64/util/unwind-libdw.c
new file mode 100644
index 0000000..8d24958
--- /dev/null
+++ b/tools/perf/arch/arm64/util/unwind-libdw.c
@@ -0,0 +1,53 @@
+#include <elfutils/libdwfl.h>
+#include "../../util/unwind-libdw.h"
+#include "../../util/perf_regs.h"
+
+bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
+{
+	struct unwind_info *ui = arg;
+	struct regs_dump *user_regs = &ui->sample->user_regs;
+	Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];
+
+#define REG(r) ({						\
+	Dwarf_Word val = 0;					\
+	perf_reg_value(&val, user_regs, PERF_REG_ARM64_##r);	\
+	val;							\
+})
+
+	dwarf_regs[0]  = REG(X0);
+	dwarf_regs[1]  = REG(X1);
+	dwarf_regs[2]  = REG(X2);
+	dwarf_regs[3]  = REG(X3);
+	dwarf_regs[4]  = REG(X4);
+	dwarf_regs[5]  = REG(X5);
+	dwarf_regs[6]  = REG(X6);
+	dwarf_regs[7]  = REG(X7);
+	dwarf_regs[8]  = REG(X8);
+	dwarf_regs[9]  = REG(X9);
+	dwarf_regs[10] = REG(X10);
+	dwarf_regs[11] = REG(X11);
+	dwarf_regs[12] = REG(X12);
+	dwarf_regs[13] = REG(X13);
+	dwarf_regs[14] = REG(X14);
+	dwarf_regs[15] = REG(X15);
+	dwarf_regs[16] = REG(X16);
+	dwarf_regs[17] = REG(X17);
+	dwarf_regs[18] = REG(X18);
+	dwarf_regs[19] = REG(X19);
+	dwarf_regs[20] = REG(X20);
+	dwarf_regs[21] = REG(X21);
+	dwarf_regs[22] = REG(X22);
+	dwarf_regs[23] = REG(X23);
+	dwarf_regs[24] = REG(X24);
+	dwarf_regs[25] = REG(X25);
+	dwarf_regs[26] = REG(X26);
+	dwarf_regs[27] = REG(X27);
+	dwarf_regs[28] = REG(X28);
+	dwarf_regs[29] = REG(X29);
+	dwarf_regs[30] = REG(LR);
+	dwarf_regs[31] = REG(SP);
+	dwarf_regs[32] = REG(PC);
+
+	return dwfl_thread_state_registers(thread, 0, PERF_REG_ARM64_MAX,
+					   dwarf_regs);
+}
Hi Jean,
On Tue, May 06, 2014 at 04:55:33PM +0100, Jean Pihet wrote:
> Adding libdw DWARF post unwind support, which is part of elfutils-devel/libdw-dev package from version 0.158.
>
> Note: the libdw code needs some support for dwarf unwinding on ARM64, this code is submitted separately on the elfutils ML.
>
> The new code is contained in unwind-libdw.c object, and implements unwind__get_entries unwind interface function.

Are you planning to implement support for 32-bit ARM too? If so, we'll need compat handling here again (your favourite!).

> +bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
> +{
> +	struct unwind_info *ui = arg;
> +	struct regs_dump *user_regs = &ui->sample->user_regs;
> +	Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];

Shouldn't this be PERF_REG_ARM64_MAX - 1?
Will
Hi Will,
On 6 May 2014 19:00, Will Deacon will.deacon@arm.com wrote:
> Hi Jean,
>
> On Tue, May 06, 2014 at 04:55:33PM +0100, Jean Pihet wrote:
> > Adding libdw DWARF post unwind support, which is part of elfutils-devel/libdw-dev package from version 0.158.
> >
> > Note: the libdw code needs some support for dwarf unwinding on ARM64, this code is submitted separately on the elfutils ML.
> >
> > The new code is contained in unwind-libdw.c object, and implements unwind__get_entries unwind interface function.
>
> Are you planning to implement support for 32-bit ARM too? If so, we'll need compat handling here again (your favourite!).

Yes! Another patch set (sent just before this one) targets ARM. There is a nice ToDo in the cover letter: handle compat mode correctly. In fact I sent a patch to libdw, so it supports it already but is somewhat broken for compat mode. This is on my preferred ToDo list ;-)

> > +bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
> > +{
> > +	struct unwind_info *ui = arg;
> > +	struct regs_dump *user_regs = &ui->sample->user_regs;
> > +	Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];
>
> Shouldn't this be PERF_REG_ARM64_MAX - 1?

Ah, well spotted! I will change although it shouldn't harm, right?

> Will

Thx for reviewing,
Jean
On Tue, May 06, 2014 at 06:41:55PM +0100, Jean Pihet wrote:
> Hi Will,
>
> On 6 May 2014 19:00, Will Deacon will.deacon@arm.com wrote:
> > Hi Jean,
> >
> > On Tue, May 06, 2014 at 04:55:33PM +0100, Jean Pihet wrote:
> > > Adding libdw DWARF post unwind support, which is part of elfutils-devel/libdw-dev package from version 0.158.
> > >
> > > Note: the libdw code needs some support for dwarf unwinding on ARM64, this code is submitted separately on the elfutils ML.
> > >
> > > The new code is contained in unwind-libdw.c object, and implements unwind__get_entries unwind interface function.
> >
> > Are you planning to implement support for 32-bit ARM too? If so, we'll need compat handling here again (your favourite!).
>
> Yes! Another patch set (sent just before this one) targets ARM. There is a nice ToDo in the cover letter: handle compat mode correctly. In fact I sent a patch to libdw, so it supports it already but is somewhat broken for compat mode. This is on my preferred ToDo list ;-)
>
> > > +bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
> > > +{
> > > +	struct unwind_info *ui = arg;
> > > +	struct regs_dump *user_regs = &ui->sample->user_regs;
> > > +	Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];
> >
> > Shouldn't this be PERF_REG_ARM64_MAX - 1?
>
> Ah, well spotted! I will change although it shouldn't harm, right?

Actually, looking again, I think I'm wrong and your code was right first time! It looks like dwfl_thread_state_registers takes the limit too, so I don't think you need to change anything (except for adding compat support).

Sorry about that,
Will
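The sizing question comes down to a count versus a last index. In the sketch below the enum is a hypothetical mirror of the numbering used by the patch (PC is stored in slot 32, so the MAX sentinel is 33), and `set_registers()` is a stand-in for a call that, like dwfl_thread_state_registers in the patch, receives a first register number and a register count. None of these names come from perf or libdw.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the numbering used by the patch: PC sits in
 * slot 32, so the MAX sentinel is a count (33), not the last index. */
enum { REG_X0 = 0, REG_LR = 30, REG_SP = 31, REG_PC = 32, REG_MAX };

/* Stand-in for a consumer that takes (firstreg, nregs, regs): it reads
 * regs[0] .. regs[nregs - 1] and stores them at dst[firstreg] onwards. */
static void set_registers(uint64_t *dst, int firstreg, unsigned nregs,
			  const uint64_t *regs)
{
	assert(dst && regs && firstreg >= 0);
	for (unsigned i = 0; i < nregs; i++)
		dst[firstreg + i] = regs[i];
}

/* Fill an array of REG_MAX entries and hand over all of them. */
static void publish_all(uint64_t *dst)
{
	uint64_t regs[REG_MAX];	/* valid indices: 0 .. REG_MAX - 1 */

	for (unsigned i = 0; i < REG_MAX; i++)
		regs[i] = 1000 + i;
	set_registers(dst, 0, REG_MAX, regs);
}
```

With an array of REG_MAX - 1 entries the PC slot would not fit, and passing REG_MAX - 1 as the count would leave it unset, which is why declaring the array and passing the count with the same MAX value is consistent.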
Hi Will,
On Tue, May 6, 2014 at 7:52 PM, Will Deacon will.deacon@arm.com wrote:
> On Tue, May 06, 2014 at 06:41:55PM +0100, Jean Pihet wrote:
> > Hi Will,
> >
> > On 6 May 2014 19:00, Will Deacon will.deacon@arm.com wrote:
> > > Hi Jean,
> > >
> > > On Tue, May 06, 2014 at 04:55:33PM +0100, Jean Pihet wrote:
> > > > Adding libdw DWARF post unwind support, which is part of elfutils-devel/libdw-dev package from version 0.158.
> > > >
> > > > Note: the libdw code needs some support for dwarf unwinding on ARM64, this code is submitted separately on the elfutils ML.
> > > >
> > > > The new code is contained in unwind-libdw.c object, and implements unwind__get_entries unwind interface function.
> > >
> > > Are you planning to implement support for 32-bit ARM too? If so, we'll need compat handling here again (your favourite!).
> >
> > Yes! Another patch set (sent just before this one) targets ARM. There is a nice ToDo in the cover letter: handle compat mode correctly. In fact I sent a patch to libdw, so it supports it already but is somewhat broken for compat mode. This is on my preferred ToDo list ;-)
> >
> > > > +bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
> > > > +{
> > > > +	struct unwind_info *ui = arg;
> > > > +	struct regs_dump *user_regs = &ui->sample->user_regs;
> > > > +	Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];
> > >
> > > Shouldn't this be PERF_REG_ARM64_MAX - 1?
> >
> > Ah, well spotted! I will change although it shouldn't harm, right?
>
> Actually, looking again, I think I'm wrong and your code was right first time! It looks like dwfl_thread_state_registers takes the limit too, so I don't think you need to change anything (except for adding compat support).
>
> Sorry about that,

My bad, I haven't checked carefully enough before replying.

Thx!
Jean

> Will
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
linaro-kernel@lists.linaro.org