Hi Mathieu and Chunyan,
I am now able to get the 4.9-rc1 kernel, decoder and perf all working on Juno r2. Thanks for the support given.
@Mathieu: there are typos in your LAS16-210 slides. In particular, "--perf-thread" in "./perf record -e cs_etm/@20070000.etr/ --perf-thread ./main" should be "--per-thread". The same typo appears in a few other places.
Thanks again.
Regards, Yan Lin Aung
-----Original Message-----
From: Mathieu Poirier [mailto:mathieu.poirier@linaro.org]
Sent: Thursday, November 17, 2016 11:35 PM
To: Yan Lin Aung (Dr)
Cc: Yan Lin Aung; coresight@lists.linaro.org
Subject: Re: CoreSight post from yan_lin_aung@yahoo.com requires approval
On 11 November 2016 at 09:30, Yan Lin Aung (Dr) yan_lin_aung@pmail.ntu.edu.sg wrote:
Hi Mathieu,
The platform is DragonBoard 410c. The latest kernel for this platform is 4.4.23 as mentioned at this link: https://builds.96boards.org/releases/dragonboard410c/linaro/debian/latest/.
Actually, I am not sure how to compile the kernel branch "perf-opencsd-4.9-rc1" from the OpenCSD github for this platform.
Good day Yan,
I completely missed the above line when reading your email the first time - please accept my apologies on that.
I am attaching the 4.9-rc1 configuration file I use on my setup. That kernel and configuration file will make sure the kernel boots without needing to deal with power domains with DS-5.
The 4.8 kernel you've been using has 2 problems:
1) There is a bug in the CS lookup of orphan components (fixed by Sudeep, as underlined by Chunyan).
2) Power domain management isn't functional, which requires powering up the CS blocks using DS-5.
Those two issues are addressed in the 4.9 cycle. Even then you need the code found on github as modifications to the perf user space tools aren't all upstream yet.
Thanks, Mathieu
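As a rough first check for the kernel-version point above, the running release can be compared against 4.9. This is only a sketch: the helper name kernel_at_least is made up for illustration, and passing the check says nothing about whether the perf user space changes from github are in place.

```shell
# Succeeds when a kernel release string is at least MAJOR.MINOR.
# $1: release string such as "4.9.0-rc1" (or the output of `uname -r`)
# $2: required major version, $3: required minor version
kernel_at_least() {
    major=${1%%.*}              # text before the first dot
    rest=${1#*.}                # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 4 9; then
    echo "kernel is 4.9 or newer"
else
    echo "kernel is older than 4.9"
fi
```

With the releases mentioned in this thread, "4.9.0-rc1" passes the check while "4.8.0" and "4.4.23" do not.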
Any suggestion/help from you on this?
Thanks.
Regards, Yan Lin Aung
-----Original Message-----
From: Mathieu Poirier [mailto:mathieu.poirier@linaro.org]
Sent: Friday, November 11, 2016 11:42 PM
To: Yan Lin Aung yan_lin_aung@yahoo.com
Cc: Yan Lin Aung (Dr) yan_lin_aung@pmail.ntu.edu.sg; coresight@lists.linaro.org
Subject: Re: CoreSight post from yan_lin_aung@yahoo.com requires approval
On 11 November 2016 at 06:45, Yan Lin Aung yan_lin_aung@yahoo.com wrote:
Hi Mathieu,
I have made progress with getting things up.
I now switched to another platform with quad-core A53 processors because the Juno r2 environment is a bit difficult for me to work with.
The following describes the steps taken:
- I have Linux 4.4.23 running on quad-core A53. CoreSight is enabled.
I can see CoreSight components under "/sys/bus/coresight/devices"
linaro@linaro-alip:~/OpenCSD-perf-opencsd-4.9-rc1/tools/perf$ ls /sys/bus/coresight/devices/
820000.tpiu 821000.funnel 824000.replicator 825000.etf 826000.etr 841000.funnel 85c000.etm 85d000.etm 85e000.etm 85f000.etm
I can enable/disable ETM and ETR (e.g. echo 1 > 85c000.etm/enable_source, cat 85c000.etm/enable_source).
Ok, but that is not required since identification of the sink to use is now done from the perf cmd line.
- I am able to build the OpenCSD library. The shared libraries (libcstraced_c_api.so, libcstraced.so) are then copied to "/usr/lib". Then, I tested the library by running "c_api_pkt_print_test". Sample output is shown below.
C-API packet print test Library Version 0.4.2
Idx:86; I_NOT_SYNC : I Stream not synchronised
Idx:1650; I_ASYNC : Alignment Synchronisation.
Idx:1662; I_TRACE_INFO : Trace Info.
Idx:1666; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFFFC000096A00;
Idx:1675; I_TRACE_ON : Trace On.
Idx:1676; I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0xFFFFFFC000096A00; Ctxt: AArch64,EL1, NS; CID=0x00000000; VMID=0x0000;
Idx:1692; I_ATOM_F1 : Atom format 1.; E
Idx:1693; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFFFC000594AC0;
Idx:1703; I_ATOM_F1 : Atom format 1.; E
Idx:1704; I_ADDR_S_IS0 : Address, Short, IS0.; Addr=0xFFFFFFC000592B58 ~[0x12B58]
Idx:1707; I_ATOM_F3 : Atom format 3.; ENN
Idx:1708; I_ATOM_F1 : Atom format 1.; E
Idx:1709; I_ADDR_L_32IS0 : Address, Long, 32 bit, IS0.; Addr=0x005AC4C8;
Idx:1715; I_ATOM_F2 : Atom format 2.; EE
Idx:1716; I_ADDR_L_32IS0 : Address, Long, 32 bit, IS0.; Addr=0x000EA588;
Idx:1721; I_ATOM_F3 : Atom format 3.; NNE
Idx:1722; I_ADDR_L_32IS0 : Address, Long, 32 bit, IS0.; Addr=0x00592B60;
- Then, the perf tool from the "OpenCSD-perf-opencsd-4.9-rc1" branch was compiled on the target. The compilation log follows. There seems to be no issue with compilation ("CC util/cs-etm-decoder/cs-etm-decoder.o" is compiled properly). I also exported "CSTRACE_PATH=/home/linaro/OpenCSD-0.4.2/decoder" before compilation.
linaro@linaro-alip:~/OpenCSD-perf-opencsd-4.9-rc1$ make -C tools/perf/
make: Entering directory '/home/linaro/OpenCSD-perf-opencsd-4.9-rc1/tools/perf'
BUILD: Doing 'make -j4' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ OFF ]
... bpf: [ on ]
Makefile.config:349: BPF prologue is not supported by architecture arm64, missing regs_query_register_offset() Makefile.config:400: No debug_frame support found in libunwind-aarch64 Makefile.config:459: No debug_frame support found in libunwind GEN common-cmds.h HOSTCC fixdep.o HOSTCC pmu-events/json.o HOSTLD fixdep-in.o LINK fixdep HOSTCC pmu-events/jsmn.o HOSTCC pmu-events/jevents.o CC fd/array.o CC fs/fs.o HOSTLD pmu-events/jevents-in.o CC event-parse.o LD fd/libapi-in.o CC cpu.o CC debug.o CC str_error_r.o CC fs/tracing_path.o CC exec-cmd.o PERF_VERSION = 4.9.0-rc1 Warning: tools/include/uapi/linux/bpf.h differs from kernel CC libbpf.o LD fs/libapi-in.o LD libapi-in.o AR libapi.a CC bpf.o CC help.o CC pager.o LD libbpf-in.o LINK libbpf.a LINK pmu-events/jevents CC event-plugin.o CC plugin_jbd2.o CC parse-options.o LD plugin_jbd2-in.o CC plugin_hrtimer.o LD plugin_hrtimer-in.o CC trace-seq.o CC plugin_kmem.o CC run-command.o LD plugin_kmem-in.o CC plugin_kvm.o CC parse-filter.o CC sigchain.o LD plugin_kvm-in.o CC plugin_mac80211.o LD plugin_mac80211-in.o CC plugin_sched_switch.o CC plugin_function.o CC parse-utils.o LD plugin_sched_switch-in.o CC plugin_xen.o LD plugin_xen-in.o CC plugin_scsi.o LD plugin_function-in.o CC plugin_cfg80211.o CC kbuffer-parse.o LD plugin_scsi-in.o LD plugin_cfg80211-in.o LINK plugin_jbd2.so LINK plugin_hrtimer.so LINK plugin_kmem.so LINK plugin_kvm.so LINK plugin_mac80211.so LD libtraceevent-in.o LINK libtraceevent.a LINK plugin_sched_switch.so GEN perf-archive LINK plugin_function.so GEN perf-with-kcore CC ui/gtk/browser.o LINK plugin_xen.so LINK plugin_scsi.so LINK plugin_cfg80211.so CC subcmd-config.o CC ui/gtk/hists.o LD libsubcmd-in.o CC util/alias.o AR libsubcmd.a Warning: tools/arch/x86/lib/memcpy_64.S differs from kernel Warning: tools/arch/x86/lib/memset_64.S differs from kernel Warning: tools/arch/arm/include/uapi/asm/kvm.h differs from kernel CC util/annotate.o Warning: tools/include/uapi/asm-generic/mman-common.h differs 
from kernel CC builtin-bench.o CC builtin-annotate.o CC ui/gtk/setup.o CC ui/gtk/util.o CC builtin-config.o CC builtin-diff.o CC util/block-range.o CC ui/gtk/helpline.o CC arch/common.o CC util/build-id.o CC arch/arm64/util/dwarf-regs.o CC arch/arm64/util/unwind-libunwind.o CC ui/gtk/progress.o CC builtin-evlist.o CC arch/arm64/util/../../arm/util/pmu.o CC util/config.o CC arch/arm64/util/../../arm/util/auxtrace.o CC builtin-help.o CC ui/gtk/annotate.o CC arch/arm64/util/../../arm/util/cs-etm.o CC builtin-sched.o CC util/ctype.o LD arch/arm64/util/libperf-in.o CC arch/arm64/tests/regs_load.o CC arch/arm64/tests/dwarf-unwind.o CC util/db-export.o LD ui/gtk/gtk-in.o LD arch/arm64/tests/libperf-in.o LD arch/arm64/libperf-in.o LD arch/libperf-in.o CC ui/setup.o CC util/env.o CC ui/helpline.o LD gtk-in.o GEN pmu-events/pmu-events.c CC pmu-events/pmu-events.o LD pmu-events/pmu-events-in.o CC ui/progress.o CC util/event.o CC ui/util.o GEN libtraceevent-dynamic-list CC ui/hist.o CC ui/stdio/hist.o CC builtin-buildid-list.o CC builtin-buildid-cache.o CC builtin-list.o CC ui/browser.o CC util/evlist.o CC builtin-record.o CC builtin-report.o CC ui/browsers/annotate.o CC ui/browsers/hists.o CC util/evsel.o CC builtin-stat.o CC builtin-timechart.o CC builtin-top.o CC util/evsel_fprintf.o CC builtin-script.o CC util/find_bit.o CC util/kallsyms.o CC util/levenshtein.o CC util/llvm-utils.o BISON util/parse-events-bison.c CC builtin-kmem.o CC ui/browsers/map.o CC util/perf_regs.o CC util/path.o CC ui/browsers/scripts.o CC util/rbtree.o CC ui/browsers/header.o CC util/libstring.o CC builtin-lock.o CC util/bitmap.o LD ui/browsers/libperf-in.o CC util/hweight.o CC ui/tui/setup.o CC util/quote.o CC util/strbuf.o CC ui/tui/util.o CC util/string.o CC builtin-kvm.o CC builtin-inject.o CC ui/tui/helpline.o CC util/strlist.o CC ui/tui/progress.o CC util/strfilter.o LD ui/tui/libperf-in.o LD ui/libperf-in.o GEN python/perf.so CC scripts/perl/Perf-Trace-Util/Context.o CC builtin-mem.o CC 
util/top.o LD scripts/perl/Perf-Trace-Util/libperf-in.o CC scripts/python/Perf-Trace-Util/Context.o CC builtin-data.o CC util/usage.o LD scripts/python/Perf-Trace-Util/libperf-in.o LD scripts/libperf-in.o CC builtin-version.o CC builtin-trace.o CC util/dso.o CC builtin-probe.o CC bench/sched-messaging.o CC bench/sched-pipe.o CC util/symbol.o CC bench/mem-functions.o CC bench/futex-hash.o CC bench/futex-wake.o CC bench/futex-wake-parallel.o CC bench/futex-requeue.o CC util/symbol_fprintf.o CC util/color.o CC bench/futex-lock-pi.o CC bench/numa.o CC util/header.o CC util/callchain.o CC util/values.o LD bench/perf-in.o CC tests/builtin-test.o CC tests/parse-events.o CC perf.o CC util/debug.o CC util/machine.o CC util/map.o CC util/pstack.o CC tests/dso-data.o CC util/session.o CC tests/attr.o CC util/syscalltbl.o CC tests/vmlinux-kallsyms.o CC util/ordered-events.o CC tests/openat-syscall.o CC tests/openat-syscall-all-cpus.o CC tests/openat-syscall-tp-fields.o CC tests/mmap-basic.o CC util/comm.o CC tests/perf-record.o CC util/thread.o CC tests/evsel-roundtrip-name.o CC tests/evsel-tp-sched.o CC tests/fdarray.o CC util/thread_map.o CC tests/pmu.o CC tests/hists_common.o CC util/trace-event-parse.o CC tests/hists_link.o CC util/parse-events-bison.o BISON util/pmu-bison.c CC util/trace-event-read.o CC util/trace-event-info.o CC tests/hists_filter.o CC util/trace-event-scripting.o CC util/trace-event.o CC tests/hists_output.o CC tests/hists_cumulate.o CC util/svghelper.o CC tests/python-use.o CC tests/bp_signal.o CC tests/bp_signal_overflow.o CC tests/task-exit.o CC tests/sw-clock.o CC tests/mmap-thread-lookup.o CC util/sort.o CC util/hist.o CC tests/thread-mg-share.o CC tests/switch-tracking.o CC util/util.o CC tests/keep-tracking.o CC util/xyarray.o CC tests/code-reading.o CC tests/sample-parsing.o CC tests/parse-no-sample-id-all.o CC tests/kmod-path.o CC tests/thread-map.o CC util/cpumap.o CC util/cgroup.o CC tests/llvm.o CC util/target.o CC tests/bpf.o CC 
tests/topology.o CC util/rblist.o CC util/intlist.o CC util/vdso.o CC util/counts.o CC tests/cpumap.o CC tests/stat.o CC tests/event_update.o CC tests/event-times.o CC util/stat.o CC util/stat-shadow.o CC tests/backward-ring-buffer.o CC tests/sdt.o CC tests/is_printable_array.o CC tests/bitmap.o CC tests/dwarf-unwind.o CC tests/llvm-src-base.o CC util/record.o CC tests/llvm-src-kbuild.o CC tests/llvm-src-prologue.o CC tests/llvm-src-relocation.o CC util/srcline.o CC util/data.o LD tests/perf-in.o CC util/tsc.o CC util/cloexec.o CC util/call-path.o CC util/thread-stack.o CC util/auxtrace.o CC util/intel-pt-decoder/intel-pt-pkt-decoder.o LD perf-in.o GEN util/intel-pt-decoder/inat-tables.c CC util/intel-pt-decoder/intel-pt-log.o CC util/intel-pt-decoder/intel-pt-decoder.o CC util/intel-pt-decoder/intel-pt-insn-decoder.o CC util/cs-etm-decoder/cs-etm-decoder.o LD util/cs-etm-decoder/libperf-in.o CC util/scripting-engines/trace-event-perl.o CC util/scripting-engines/trace-event-python.o CC util/intel-pt.o LD util/intel-pt-decoder/libperf-in.o CC util/intel-bts.o CC util/cs-etm.o LD util/scripting-engines/libperf-in.o CC util/parse-branch-options.o CC util/parse-regs-options.o CC util/term.o CC util/help-unknown-cmd.o CC util/mem-events.o CC util/vsprintf.o CC util/drv_configs.o CC util/bpf-loader.o CC util/symbol-elf.o CC util/probe-file.o CC util/probe-event.o CC util/probe-finder.o CC util/dwarf-aux.o CC util/dwarf-regs.o CC util/unwind-libunwind-local.o CC util/unwind-libunwind.o CC util/libunwind/arm64.o CC util/zlib.o CC util/lzma.o CC util/demangle-java.o CC util/demangle-rust.o CC util/jitdump.o CC util/genelf.o CC util/genelf_debug.o FLEX util/parse-events-flex.c FLEX util/pmu-flex.c CC util/parse-events.o CC util/pmu-bison.o CC util/parse-events-flex.o CC util/pmu.o CC util/pmu-flex.o LD util/libperf-in.o LD libperf-in.o AR libperf.a LINK perf LINK libperf-gtk.so make: Leaving directory '/home/linaro/OpenCSD-perf-opencsd-4.9-rc1/tools/perf'
- Then, I tested perf as follows:
linaro@linaro-alip:~/OpenCSD-perf-opencsd-4.9-rc1/tools/perf$ ./perf record -e cs_etm/@826000.etr/ --per-thread uname
invalid or unsupported event: 'cs_etm/@826000.etr/'
Run 'perf list' for a list of valid events

Usage: perf record [<options>] [<command>]
   or: perf record [<options>] -- <command> [<options>]

    -e, --event <event>   event selector. use 'perf list' to list available events
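The checks implied by the steps above (CoreSight sources visible in sysfs, OpenCSD libraries installed, CSTRACE_PATH exported before building perf) can be sketched as a few shell helpers. The function names are invented for illustration. The library names, the enable_source file and the CSTRACE_PATH variable are the ones quoted in the message. Every directory is passed in as an argument, so nothing here assumes a particular board layout.

```shell
# Print each CoreSight source and the content of its enable_source file.
# $1: CoreSight sysfs directory, e.g. /sys/bus/coresight/devices
list_cs_sources() {
    for f in "$1"/*/enable_source; do
        [ -e "$f" ] || continue          # glob matched nothing
        printf '%s %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

# Report whether the two OpenCSD shared libraries are present.
# $1: library directory, e.g. /usr/lib
# Returns 1 if at least one library is missing.
check_opencsd_libs() {
    missing=0
    for lib in libcstraced.so libcstraced_c_api.so; do
        if [ -e "$1/$lib" ]; then
            echo "found $lib"
        else
            echo "missing $lib"
            missing=1
        fi
    done
    return "$missing"
}

# Confirm CSTRACE_PATH names an existing directory before building perf.
check_cstrace_path() {
    if [ -z "${CSTRACE_PATH:-}" ]; then
        echo "CSTRACE_PATH is not set"
        return 1
    fi
    if [ ! -d "$CSTRACE_PATH" ]; then
        echo "CSTRACE_PATH is not a directory"
        return 1
    fi
    echo "CSTRACE_PATH ok"
}
```

On the board described here, the calls would be along the lines of list_cs_sources /sys/bus/coresight/devices and check_opencsd_libs /usr/lib. None of these checks can show whether the kernel exposes the cs_etm event to perf.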
It seems that "perf" is not able to populate the cs_etm events. "perf list" output shows:
linaro@linaro-alip:~/OpenCSD-perf-opencsd-4.9-rc1/tools/perf$ ./perf list
List of pre-defined events (to be used in -e):
branch-misses [Hardware event]
cache-misses [Hardware event]
cache-references [Hardware event]
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]

alignment-faults [Software event]
bpf-output [Software event]
context-switches OR cs [Software event]
cpu-clock [Software event]
cpu-migrations OR migrations [Software event]
dummy [Software event]
emulation-faults [Software event]
major-faults [Software event]
minor-faults [Software event]
page-faults OR faults [Software event]
task-clock [Software event]

L1-dcache-load-misses [Hardware cache event]
L1-dcache-loads [Hardware cache event]
L1-dcache-store-misses [Hardware cache event]
L1-dcache-stores [Hardware cache event]
branch-load-misses [Hardware cache event]
branch-loads [Hardware cache event]

rNNN [Raw hardware event descriptor]
cpu/t1=v1[,t2=v2,t3 ...]/modifier [Raw hardware event descriptor]
(see 'man perf-list' on how to encode it)

mem:<addr>[/len][:access] [Hardware breakpoint]
I think I am very close to getting things up properly. Also, I need to get this working for a research project here.
Any idea why "perf" is not showing the events corresponding to "cs_etm"? Where might the issue be? Your kind help on this would be very much appreciated.
Did you get the kernel from github as per my previous email? Kernel 4.4.23 is very old and doesn't have all of the CoreSight features required to integrate with perf.
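One way to tell whether the running kernel has the perf integration at all is to look for the cs_etm PMU in sysfs. The sketch below assumes the CoreSight PMU shows up under /sys/bus/event_source/devices like other perf PMUs. The helper name has_cs_etm_pmu and its optional root argument are made up for illustration.

```shell
# Succeeds when the kernel has registered the cs_etm PMU with perf.
# $1: optional sysfs root, "/sys" by default (an argument so the check can
#     also be pointed at a scratch directory).
has_cs_etm_pmu() {
    [ -d "${1:-/sys}/bus/event_source/devices/cs_etm" ]
}

if has_cs_etm_pmu; then
    echo "cs_etm PMU registered"
else
    echo "cs_etm PMU not registered"
fi
```

If the directory is absent, "perf list" has nothing to report for cs_etm, no matter how the perf binary was built.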
Also, keep in mind that since you are not working with Juno, CPUIdle _needs_ to be disabled and you have to make sure all power domains and clocks for the CoreSight IP blocks are managed properly. Out of curiosity, what platform is this?
Mathieu
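For the CPUIdle point above, the message does not say how CPUIdle should be disabled. One option is the cpuidle.off=1 kernel boot parameter, and the sketch below only checks for that case. It cannot detect a kernel built without CPUIdle support. The helper name is hypothetical.

```shell
# Succeeds when cpuidle.off=1 appears as a whole word on the kernel
# command line.
# $1: optional file to read, /proc/cmdline by default
cpuidle_off_on_cmdline() {
    grep -Eq '(^|[[:space:]])cpuidle\.off=1([[:space:]]|$)' "${1:-/proc/cmdline}"
}
```

Typical use would be: cpuidle_off_on_cmdline || echo "cpuidle.off=1 not found on the kernel command line".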
Looking forward to hearing from you, and thanks.
Regards, Yan Lin Aung
On Thursday, November 10, 2016 12:23 AM, Mathieu Poirier mathieu.poirier@linaro.org wrote:
On 9 November 2016 at 08:06, Yan Lin Aung yan_lin_aung@yahoo.com wrote:
Hi Mathieu,
Thanks for your reply. Sorry for a bit of delay on my response.
Just a bit of an introduction: I am a research staff member at Nanyang Technological University, Singapore.
Basically, I used the build scripts provided with Linaro deliverables for Juno and TC2 from ARM at this link: https://community.arm.com/docs/DOC-10803
I am able to get the system running either with prebuilt binaries or by building from source. For Linaro release 16.09 with the build-from-source option, Linux 4.8.x runs on Juno r2. In the default configuration, CoreSight was not activated. I tried updating the config file to enable the CoreSight driver and recompiled. However, the CoreSight devices are somehow not populated.
I would like to have a setup with which I can, at the very minimum, run the experiments demonstrated in your presentation. So my specific question is: how should I proceed to get CoreSight, perf with CoreSight, and OpenCSD working properly using Linaro release 16.09?
You won't have the required pieces in 16.09. To replicate the examples shown in the presentation you will have to use the kernel found on github [1]. Since you have a Juno R2 I suggest using branch perf-opencsd-4.9-rc1 - that way you won't have to deal with power domain management. Note that CoreSight is not part of the default V8 configuration and needs to be explicitly enabled.
Thanks, Mathieu
[1]. https://github.com/Linaro/OpenCSD
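Since CoreSight has to be enabled explicitly, it can help to list which CoreSight options a given kernel configuration actually switches on. The sketch below assumes the relevant option names start with CONFIG_CORESIGHT. The helper name is invented, and the path to the config file is whatever the build in question uses.

```shell
# Print the CoreSight options that are set to y or m in a kernel config.
# $1: path to a kernel .config file
# Returns non-zero when no such option is enabled.
enabled_coresight_options() {
    grep -E '^CONFIG_CORESIGHT[A-Z0-9_]*=[ym]$' "$1"
}
```

Run as enabled_coresight_options .config from the kernel build directory. Empty output means no CoreSight option is enabled in that file.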
If you are not using Linaro release 16.09 and have other means of getting things up with CoreSight, perf and OpenCSD on Juno, please share them with me. I am quite keen to follow your steps and try them out on my side.
Looking forward to hearing from you, and thanks.
Regards, Yan Lin Aung
On Monday, November 7, 2016 11:32 PM, Mathieu Poirier mathieu.poirier@linaro.org wrote:
---------- Forwarded message ----------
From: Yan Lin Aung yan_lin_aung@yahoo.com
To: "coresight@lists.linaro.org" coresight@lists.linaro.org
Cc:
Date: Mon, 7 Nov 2016 03:45:45 +0000 (UTC)
Subject: perf with CoreSight and OpenCSD on TC2 and Juno r2
Hi Linaro Coresight Team,
I came across "Hardware Assisted Tracing on ARM with CoreSight and OpenCSD" by Mathieu Poirier. In his presentation, he mentioned that the reference platforms for evaluating perf with CoreSight and OpenCSD are Vexpress TC2 and Juno (page 7 of his slides).
I just checked the "HOWTO.MD" at OpenCSD github site. However, there is very limited info on how to get started with Vexpress TC2 and Juno.
I have access to the TC2 and Juno r2 platforms. Please provide a more detailed getting-started guide for trying out perf with CoreSight and OpenCSD on either TC2 or Juno r2.
Hello Yan Lin,
You are correct, the HOWTO.md on github concentrates on CoreSight and doesn't address platform specifics - something like this would be out of scope. I'm not exactly sure of what you are looking for in a "getting started guide"... Both Juno and TC2 are well supported upstream and can be booted with a mainline kernel. The choice of bootloader and user space are entirely up to users and don't affect the CoreSight suite nor its integration with the perf subsystem.
The fact that you have access to both platforms leads me to believe you are part of a large organisation. As such there are definitely people around you with experience on how to set up the platforms.
I can try to answer specific questions if you have any.
Thanks, Mathieu
Thanks.
Regards, Yan Lin Aung