Hello,
I am trying to use the CoreSight drivers on a Juno r0 board with 4.9 Linux
sources (Linaro release 17.01 with 'latest-armlt').
I have tried the configuration with ETM as the source and ETF as the
sink and it is working as expected.
But with ETR as the sink, a kernel panic occurs when the tracing is stopped.
The bug can be reproduced with the following steps:
/ # echo 1 > /sys/bus/coresight/devices/20070000.etr/enable_sink
/ # echo 1 > /sys/bus/coresight/devices/22040000.etm/enable_source
/ # echo 0 > /sys/bus/coresight/devices/22040000.etm/enable_source
Sometimes the system also hangs without printing a panic message.
I am attaching the log file and .config file along with this mail. From
the logs, it looks like an ARM SCP firmware problem.
Let me know if I am missing some steps/configuration, or if this is a
known hardware/firmware problem with, hopefully, some existing workarounds.
Thanks and regards,
Don Kuzhiyelil
Good day to all,
A patch sent by Suzuki a few weeks ago [1] unearthed a problem with
how we deal with the "enable_sink" flag in the CS core. So far we
have been concentrating on system-wide trace scenarios [2] but per-CPU
[3] scenarios are also valid. In system-wide mode a single event is
generated by the perf user space and communicated to the kernel. In
per-CPU mode an event is generated for each CPU present in the system
or specified on the cmd line, and that is where our handling of the
"enable_sink" flag fails (get back to me if you want more details on
that).
My solution is to add the sink definition to the perf_event_attr
structure [4] that gets sent down to the kernel from user space. That
way there is no confusion about which sink belongs to which event. To
do that I will need to have a chat with the guys in the #perf IRC
channel, something I expect to be fairly tedious.
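To make the idea a little more concrete, here is a minimal sketch of the
kind of extension I have in mind. This assumes a dedicated field would be
acceptable to the perf folks; the name "sink_id" and its encoding are
placeholders, not an agreed ABI:

#include <linux/types.h>

/*
 * Hypothetical sketch only: the field name, width and placement are
 * placeholders.  The point is simply that each event carries its own
 * sink identification from user space down to the kernel, so a per-CPU
 * event knows which sink it drives.
 */
struct perf_event_attr_sketch {
	__u32	type;		/* PMU type, e.g. the cs_etm PMU */
	__u64	config;		/* event configuration bits */
	/* ... the rest of the existing perf_event_attr fields ... */

	/*
	 * CoreSight sink for this event, e.g. a hash of "20070000.etr".
	 * Zero would mean "no sink specified, let the kernel pick one".
	 */
	__u64	sink_id;
};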
But before moving ahead we need to agree on the syntax we want to have
in the future. That way what I do now with the perf folks doesn't
have to be undone in a few months.
For the following I will be using figure 2-9 on page 2-33 in this document [5].
So far we have been using this syntax:
# perf record -e cs_etm/@20070000.etr/ --per-thread $COMMAND
This will instruct perf to select the ETR as a sink. Up to now, not
specifying a sink has been treated as an error condition since perf
doesn't know which sink to select.
The main goal of writing all this is to suggest that we revisit that.
What I am proposing is that _if_ a sink is omitted on the perf command
line, the perf infrastructure will pick the _first_ sink it finds when
doing a walk through of the CS topology. This is very advantageous
when thinking about the syntax required to support upcoming systems
where we have a one-to-one mapping between source and sink.
In such a system specifying sinks for each CPU on the perf command
line simply doesn't scale. Even on a small system I don't see users
specifying a sink for each CPU. Since the sink for each CPU will be
the first one found during the walk through, it is implicit that this
sink should be used and doesn't need to be specified explicitly.
It would also allow for the support of topologies like Juno-R1 [5]
where we have a couple of ETFs in the middle. Those are perfectly
valid sinks, but the current scheme doesn't allow us to use
them. If we pick the first sink we find along the way we can
automatically support something like this.
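For the sake of discussion, the selection logic I picture looks roughly
like the sketch below. The types and helper names are made up for
illustration (this is not the CoreSight framework API), but it shows the
depth-first walk that stops at the first sink encountered:

#include <stdbool.h>
#include <stddef.h>

/* Illustrative types only - not the real coresight structures. */
struct cs_device {
	bool is_sink;			/* ETB/ETF/ETR style component */
	size_t nr_outports;
	struct cs_device **child;	/* devices on our output ports */
};

/*
 * Walk the trace path starting at a source (ETM/PTM) and return the
 * first sink encountered.  With a one-to-one source/sink mapping this
 * trivially resolves to the per-CPU sink; on a Juno-R1 style topology
 * it settles on the first ETF/ETR found along the way.
 */
static struct cs_device *cs_find_first_sink(struct cs_device *dev)
{
	size_t i;
	struct cs_device *sink;

	if (!dev)
		return NULL;

	if (dev->is_sink)
		return dev;

	for (i = 0; i < dev->nr_outports; i++) {
		sink = cs_find_first_sink(dev->child[i]);
		if (sink)
			return sink;
	}

	return NULL;
}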
I have reflected quite extensively on this and I think it can work.
The only time it can fail is if at some point we get more than one
sink associated with each tracer. But how likely is this?
What we decide now will not be undone easily, if at all. Please read
my email a couple of times and give it some consideration. Comments
and ideas are welcome.
Best regards,
Mathieu
[1]. https://patchwork.kernel.org/patch/9657141/
[2]. perf record -e cs_etm/@20070000.etr/u --per-thread $COMMAND
[3]. perf record -e cs_etm/@20070000.etr/u -C 0,2-3 $COMMAND
[4]. http://lxr.free-electrons.com/source/include/uapi/linux/perf_event.h#L283
[5]. http://infocenter.arm.com/help/topic/com.arm.doc.ddi0515d.b/DDI0515D_b_juno…
On 17 April 2017 at 10:45, Bamvor Zhang Jian
<bamvor.zhangjian(a)linaro.org> wrote:
> Hi,
>
> On 17 April 2017 at 21:21, Mathieu Poirier <mathieu.poirier(a)linaro.org> wrote:
>> [snip]
>>
>> I have been travelling for the last two weeks and don't remember if I
>> have answered this already.
>>
>>>>
>>>> The wrapping around is normal and can't be avoided. Problems happen
>>>> when we get a wrap around and Perf decides to concatenate buffers in a
>>>> single notification to user space. When that happens there is no way
>>>> for the decoding library to know the boundaries of individual packets,
>>>> resulting in a loss of synchronisation and of proper trace decoding.
>>>>
>>>> This is very difficult to duplicate, hence taking a long time to deal
>>>> with. I am still not sure about the conditions needed for Perf to
>>>> concatenate buffers together.
>>> Thanks for your explanation. So, will it exist in any scenario?
>>
>> Yes, this can happen in any scenario.
>>
>>> If so, I think we need this patch too.
>>> We encountered the wrong package recently and we already use a
>>> filter. How could we know if it is the same issue?
>>
>> Sorry, I don't understand what you mean here by "wrong package" and
>> the correlation it has with filters. Please explain further if you
>> still need input on this.
> 'Wrong packages' seems a little bit misleading.
"packet" is the word you are looking for.
> I would say there are some
> unexpected packages. Maybe I could send you the full log of packages
> tomorrow.
I won't be able to look at it.
> In summary, we found the following issues:
> 1. We found that if we start the etm in the start/stop range, sometimes the
> first address package (a jump or an instruction after an isb) may be lost.
> If we add a loop of 100 iterations after the coresight timeout of the etm
> enable, the package is not lost. I suspect there are some configuration issues
> in our etm, but I have not found a clue so far. Do you have any suggestions
> from your side?
I strongly suggest you purchase a dragonboard. If you can reproduce
the problems you are seeing on your platform on the dragonboard, it is
much easier for us to work on the issues you may encounter. This will
be money well invested (they are very cheap).
Believe me, I wish we could use the HiKey but CS support on that
platform isn't very encouraging at this time (not for lack of
trying).
> 2. We found that there are some all-zero (64-bit) packages. Is that usual?
It isn't, no. Once again, if you can reproduce this on a dragonboard
I'd be happy to look at it. Otherwise it is impossible for me to
help.
>
> There are probably some other questions, but it is too late for me (midnight,
> Beijing time) and I cannot recall the details. If I find other issues,
> I will send them to you later.
Very well.
Mathieu
>
> Thanks.
>
> Bamvor
>>
>>
>>> I saw the description
>>> of overflow. Could it be monitored by CTI?
>>
>> CTIs would indeed prevent this from happening.
>>
>>>
>>> Regards
>>>
>>> Bamvor
>>>
>>>>
>>>>>
>>>>> Regards
>>>>>
>>>>> Bamvor
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Bamvor
>>>>>>>> After that I may be
>>>>>>>> able to look at it if nothing else gets in the way.
Hi,
I have a question about tracing on Cortex-A. I have a platform based on 2 Cortex-A7 cores, and I need to trace the executed application. How can I activate tracing under Linux (without using external hardware) in order to access the CoreSight components (STM, ETM, PTM, CTI, ETF...), configure them, and extract the trace data? Could you please describe the different steps that would allow me to trace my application, and which library should be used for the trace?
Best regards, Karim.
Hi,
I was wondering about the reasons ARM decided to move towards the PTM source
component instead of the ETM component. What is the main reason that pushed
for this change?
Thanks for your reply.
Best regards,
MAW
Add information on kernel versioning at the top of the file and
replace all instances of numerical kernel versions with a
generic notation.
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
---
HOWTO.md | 36 +++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 17 deletions(-)
diff --git a/HOWTO.md b/HOWTO.md
index 2a6be3db0164..71b228c8f5f7 100644
--- a/HOWTO.md
+++ b/HOWTO.md
@@ -7,16 +7,18 @@ This HOWTO explains how to use the perf cmd line tools and the openCSD
library to collect and extract program flow traces generated by the
CoreSight IP blocks on a Linux system. The examples have been generated using
an aarch64 Juno-r0 platform. All information is considered accurate and tested
-using library version v0.5 and the latest perf branch `perf-opencsd-4.9`
-on the [OpenCSD github repository][1].
+using library version v0.5 and the latest coresight/perf integration branch on
+the [OpenCSD github repository][1]. That branch is labelled
+`perf-opencsd-($VERSION)`, where ($VERSION) carries the latest kernel version
+number.
On Target Trace Acquisition - Perf Record
-----------------------------------------
All the enhancement to the Perf tools that support the new `cs_etm` pmu have
not been upstreamed yet. To get the required functionality branch
-`perf-opencsd-4.9` needs to be downloaded to the target system where
-traces are to be collected. This branch is an upstream v4.9 kernel
+`perf-opencsd-($VERSION)` needs to be downloaded to the target system where
+traces are to be collected. This branch is a vanilla upstream kernel
supplemented with modifications to the CoreSight framework and drivers to be
usable by the Perf core. The remaining out of tree patches are being
upstreamed incrementally.
@@ -277,14 +279,14 @@ the host's (which has nothing to do with the target) architecture:
Off Target Perf Tools Compilation
---------------------------------
As stated above not all the pieces of the solution have been upstreamed. To
-get all the components branch `perf-opencsd-4.9` needs to be
+get all the components the latest `perf-opencsd-($VERSION)` needs to be
obtained:
- linaro@t430:~/linaro/coresight$ git clone -b perf-opencsd-4.9 https://github.com/Linaro/OpenCSD.git perf-opencsd-4.9
+ linaro@t430:~/linaro/coresight$ git clone -b perf-opencsd-($VERSION) https://github.com/Linaro/OpenCSD.git perf-opencsd-($VERSION)
...
...
- linaro@t430:~/linaro/coresight$ ls perf-opencsd-4.9/
+ linaro@t430:~/linaro/coresight$ ls perf-opencsd-($VERSION)/
arch certs CREDITS Documentation firmware include ipc Kconfig lib Makefile net REPORTING-BUGS scripts sound usr
block COPYING crypto drivers fs init Kbuild kernel MAINTAINERS mm README samples security tools virt
@@ -295,12 +297,12 @@ successful, but handling of CoreSight trace data won't be supported.
**See perf-test-scripts below for assistance in creating a build and test enviroment.**
- linaro@t430:~/linaro/coresight$ cd perf-opencsd-4.9
+ linaro@t430:~/linaro/coresight$ cd perf-opencsd-($VERSION)
linaro@t430:~/linaro/coresight/perf-opencsd-4.9$ export CSTRACE_PATH=~/linaro/coresight/my-opencsd/decoder
linaro@t430:~/linaro/coresight/perf-opencsd-4.9$ make -C tools/perf
...
...
- linaro@t430:~/linaro/coresight/perf-opencsd-4.9$ ls -l tools/perf/perf
+ linaro@t430:~/linaro/coresight/perf-opencsd-($VERSION)$ ls -l tools/perf/perf
-rwxrwxr-x 1 linaro linaro 6276360 Mar 3 10:05 tools/perf/perf
@@ -339,7 +341,7 @@ to be sure everything is clean.
linaro@t430:~/linaro/coresight/sept20$ rm -rf ~/.debug
linaro@t430:~/linaro/coresight/sept20$ cp -dpR .debug ~/
linaro@t430:~/linaro/coresight/sept20$ export LD_LIBRARY_PATH=~/linaro/coresight/my-opencsd/decoder/lib/linux64/dbg/
- linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-4.9/tools/perf/perf report --stdio
+ linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-($VERSION)/tools/perf/perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
@@ -383,7 +385,7 @@ to be sure everything is clean.
Additional data can be obtained, which contains a dump of the trace packets received using the command
- mjl@ubuntu-vbox:./perf-opencsd-4.9/coresight/tools/perf/perf report --stdio --dump
+ mjl@ubuntu-vbox:./perf-opencsd-($VERSION)/coresight/tools/perf/perf report --stdio --dump
resulting a large amount of data, trace looking like:-
@@ -432,10 +434,10 @@ Trace Decoding with Perf Script
Working with perf scripts needs more command line options but yields
interesting results.
- linaro@t430:~/linaro/coresight/sept20$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-4.9/tools/perf/
+ linaro@t430:~/linaro/coresight/sept20$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-($VERSION)/tools/perf/
linaro@t430:~/linaro/coresight/sept20$ export SCRIPT_PATH=$EXEC_PATH/scripts/python/
linaro@t430:~/linaro/coresight/sept20$ export XTOOL_PATH=/your/aarch64/toolchain/path/bin/
- linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-4.9/tools/perf/perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump
+ linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-($VERSION)/tools/perf/perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump
7f89f24d80: 910003e0 mov x0, sp
7f89f24d84: 94000d53 bl 7f89f282d0 <free@plt+0x3790>
@@ -467,18 +469,18 @@ Kernel Trace Decoding
When dealing with kernel space traces the vmlinux file has to be communicated
explicitely to perf using the "--vmlinux" command line option:
- linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-4.9/tools/perf/perf report --stdio --vmlinux=./vmlinux
+ linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-($VERSION)/tools/perf/perf report --stdio --vmlinux=./vmlinux
...
...
- linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-4.9/tools/perf/perf script --vmlinux=./vmlinux
+ linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-($VERSION)/tools/perf/perf script --vmlinux=./vmlinux
When using scripts things get a little more convoluted. Using the same example
an above but for traces but for kernel traces, the command line becomes:
- linaro@t430:~/linaro/coresight/sept20$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-4.9/tools/perf/
+ linaro@t430:~/linaro/coresight/sept20$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-($VERSION)/tools/perf/
linaro@t430:~/linaro/coresight/sept20$ export SCRIPT_PATH=$EXEC_PATH/scripts/python/
linaro@t430:~/linaro/coresight/sept20$ export XTOOL_PATH=/your/aarch64/toolchain/path/bin/
- linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-4.9/tools/perf/perf --exec-path=${EXEC_PATH} script \
+ linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-($VERSION)/tools/perf/perf --exec-path=${EXEC_PATH} script \
--vmlinux=./vmlinux \
--script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- \
-d ${XTOOLS_PATH}/aarch64-linux-gnu-objdump \
--
2.7.4
Hi,
Couple of questions here about perf/OpenCSD and dynamic code.
- what's the status of decode for arbitrarily loaded/unloaded shared objects?
I.e. decode where the code is in an ELF binary and we just need to know which
binaries are at what address and at what time. Can we match up mmap events
with the captured ETM trace to do the decode, even with modules being
unloaded and overwritten by other modules?
- what's the status of decode for JITted code? Here you'd need to capture the
actual contents of the code cache, to be able to decode it.
Are these ready to go or is there still some work to do?
Thanks,
Al
Good day,
First and foremost, for coresight related questions please CC the
coresight mailing list. There are a lot of knowledgeable people on it
who can help with your questions.
On 29 March 2017 at 04:09, Kaiyou Wang <Kaiyou.Wang(a)arm.com> wrote:
> Hi Mathieu,
>
>
>
> I have found the root cause of the issue, thanks.
>
>
>
> Another questions,
>
> 1> I ported the coresight dts config from JUNO-busybox to JUNO-android, but
> the cpu hung during the kernel boot stage; do you have any suggestions?
The first thing to do would be to disable coresight. If the board
still hangs, there is not much I can do for you. Otherwise there is
either a problem with the power domains or the clocks. You will likely
have to instrument the kernel to see where things hang.
>
> 2> I tried to use the sysFS attributes of CoreSight to trace data, but decoding
> the data with OpenCSD failed.
The interesting question here is how you tried to use the OpenCSD
library to decode traces acquired via sysFS. OpenCSD requires a lot
of trace configuration information that is not present in raw sysFS
traces. The integration between coresight and perf is an example of
how that trace configuration information needs to be communicated to
the library.
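To give a rough idea of what is missing from a raw sysFS capture, the
sketch below lists the kind of per-CPU configuration an ETMv4 decoder
needs before it can interpret a trace stream. The structure and its
layout are illustrative only (the real perf metadata format differs),
but the registers named in the comments are the sort of per-tracer
information the perf/coresight integration records for each CPU and a
plain sysFS collection does not:

#include <linux/types.h>

/*
 * Illustrative sketch, not the actual perf.data metadata layout.
 * Without this kind of per-CPU information a decoder cannot make
 * sense of the raw byte stream coming out of a sink.
 */
struct etmv4_decode_config_sketch {
	__u64 cpu;		/* CPU the tracer is attached to */
	__u64 trcconfigr;	/* trace unit configuration (TRCCONFIGR) */
	__u64 trctraceidr;	/* trace stream ID (TRCTRACEIDR) */
	__u64 trcidr0;		/* implemented features (TRCIDR0/1/2/8) */
	__u64 trcidr1;
	__u64 trcidr2;
	__u64 trcidr8;
	__u64 trcauthstatus;	/* authentication status (TRCAUTHSTATUS) */
};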
>
> Is there any introduction to CoreSight sysFS usage? And can perf decode the
> raw trace data?
No - perf can't decode raw traces gathered from sysFS, for the reasons
mentioned above.
Regards,
Mathieu
>
>
>
> thanks
>
>
>
>
>
> From: Kaiyou Wang
> Sent: Wednesday, March 29, 2017 10:02 AM
> To: 'mathieu.poirier(a)linaro.org'
> Subject: Some question about OpenCSD coresight drive usage
>
>
>
> Hi Mathieu,
>
>
>
> Sorry to disturb, I have some question about the coresight driver usage,
> could you give me some suggestion?
>
>
>
> I tried to perf record the trace event on the JUNO busybox platform:
>
>
>
> / # ./perf record -e cs_etm/@20010000.etf/ --per-thread uname
>
> map_groups__set_modules_path_dir: cannot open /lib/modules/4.9.0-dirty dir
>
> Problems setting modules path maps, continuing anyway...
>
> path: /sys/bus/coresight/devices/20010000.etf/enable_sink
>
> mmap size 528384B
>
> [ 2257.038441] coresight-funnel 20040000.main-funnel: FUNNEL inport 1
> enabled
>
> [ 2257.045268] coresight-funnel 230c0000.cluster1-funnel: FUNNEL inport 3
> enabled
>
> Linux
>
> [ 2257.055129] coresight-funnel 230c0000.cluster1-funnel: FUNNEL inport 3
> disabled
>
> [ 2257.062376] coresight-funnel 20040000.main-funnel: FUNNEL inport 1
> disabled
>
> [ 2257.069277] coresight-tmc 20010000.etf: TMC-ETB/ETF disabled
>
> [ perf record: Woken up 1 times to write data ]
>
> failed to write feature 8
>
> failed to write feature 9
>
> failed to write feature 14
>
> [ perf record: Captured and wrote 0.041 MB perf.data ]
>
>
>
> Then I tried to decode the perf.data on the Linux host, but it failed; could you give
> me some suggestions?
>
>
>
>
>
> kaiwan01@sha-swtool:~/code/OpenCSD/trace-data$ cat disasm.sh
>
> #!/bin/bash
>
>
>
> export
> EXEC_PATH=/home/kaiwan01/code/OpenCSD/OpenCSD-perf/perf-opencsd-4.8/tools/perf/
>
> export SCRIPT_PATH=$EXEC_PATH/scripts/python/
>
> export
> XTOOL_PATH=/home/kaiwan01/tools/gcc-linaro-4.9-2015.05-x86_64_aarch64-linux-gnu/aarch64-linux-gnu/bin/
>
>
>
> perf --exec-path=${EXEC_PATH} script
> --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d
> ${XTOOL_PATH}/aarch64-linux-gnu-objdump
>
> kaiwan01@sha-swtool:~/code/OpenCSD/trace-data$ ./disasm.sh
>
> Traceback (most recent call last):
>
> File
> "/home/kaiwan01/code/OpenCSD/OpenCSD-perf/perf-opencsd-4.8/tools/perf//scripts/python//cs-trace-disasm.py",
> line 113, in process_event
>
> disasm_output = check_output(disasm).split('\n')
>
> File "/usr/lib/python2.7/subprocess.py", line 567, in check_output
>
> process = Popen(stdout=PIPE, *popenargs, **kwargs)
>
> File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
>
> errread, errwrite)
>
> File "/usr/lib/python2.7/subprocess.py", line 1343, in _execute_child
>
> raise child_exception
>
> OSError: [Errno 2] No such file or directory
>
> Fatal Python error: problem in Python trace event handler
>
> ./disasm.sh: line 7: 24616 Aborted (core dumped) perf
> --exec-path=${EXEC_PATH} script
> --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d
> ${XTOOL_PATH}/aarch64-linux-gnu-objdump
>
> kaiwan01@sha-swtool:~/code/OpenCSD/trace-data$
>
>
>
>
>
> Thanks and Best Regards,
>
> Kaiyou
>
>
>
From gcc 6.2 and onward, the compiler complains about
'cs_etm_global_header_fmts' not being used (and rightly so).
One solution is to remove the declaration but it is a matter
of time before we need to modify the header. Another solution
is to simply print the information conveyed by the header.
There are a few advantages that come with the latter:
1) We know how many CPUs are part of the session, without having to
count the number of magic numbers.
2) We get to see if the snapshot option was specified.
3) When we change the header we know exactly what kind of header
we are dealing with.
Reported-by: Kim Phillips <kim.phillips(a)arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
---
tools/perf/util/cs-etm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 0adb8e4aff2f..82702039ab2e 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1494,6 +1494,9 @@ static void cs_etm__print_auxtrace_info(u64 *val, size_t num)
{
unsigned i,j,cpu;
+ for (i = 0; i < CS_HEADER_VERSION_0_MAX; i++)
+ fprintf(stdout, cs_etm_global_header_fmts[i], val[i]);
+
for (i = CS_HEADER_VERSION_0_MAX, cpu = 0; cpu < num; ++cpu) {
if (val[i] == __perf_cs_etmv3_magic) {
--
2.7.4