Hi,
Thanks for your time!
I am Bob, and I am interested in the CoreSight project. I have learned a lot from the web page http://www.linaro.org/blog/core-dump/coresight-perf-and-the-opencsd-library/.
I work on ARM64, and there is a known problem with perf on ARM, described in https://www.linaro.org/blog/core-dump/debugging-arm-kernels-using-nmifiq/.
For instance, when we run "dd if=/dev/urandom of=/dev/null", over 90% of the CPU time is reported as spent unlocking interrupts, and the cryptographic operations that should dominate the
use case are completely hidden.
The author, Daniel Thompson from Linaro, proposes an initial solution, but he notes that it will need further work.
CoreSight, however, traces program flow purely in hardware. If we combine CoreSight with perf and run perf record on “dd if=/dev/urandom of=/dev/null”, will the report be correct?
If it is, that would be amazing! I am eager for any related information.
I have followed the documentation to enable CoreSight and perf, but I am stuck: I cannot tell whether the result I get is correct.
I greatly appreciate your help. Thanks again for your time!
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
---
HOWTO.md | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/HOWTO.md b/HOWTO.md
index ad19e9eb4aea..47e67f734964 100644
--- a/HOWTO.md
+++ b/HOWTO.md
@@ -7,8 +7,8 @@ This HOWTO explains how to use the perf cmd line tools and the openCSD
library to collect and extract program flow traces generated by the
CoreSight IP blocks on a Linux system. The examples have been generated using
an aarch64 Juno-r0 platform. All information is considered accurate and tested
-using library branches `opencsd-0v002` and `opencsd-0v003` (decode library only)
-and the latest perf branch `perf-opencsd-4.7-rc4` (decode library + perf tools)
+using library branches `opencsd-0v002` and `opencsd-0v003` (decode library only)
+and the latest perf branch `perf-opencsd-4.7` (decode library + perf tools)
on the [OpenCSD github repository][1].
@@ -17,8 +17,8 @@ On Target Trace Acquisition - Perf Record
All the enhancement to the Perf tools that support the new `cs_etm` pmu have
not been upstreamed yet. To get the required functionality branch
-`perf-opencsd-4.7-rc4` needs to be downloaded to the target system where
-traces are to be collected. This branch is an upstream v4.7-rc4 kernel
+`perf-opencsd-4.7` needs to be downloaded to the target system where
+traces are to be collected. This branch is an upstream v4.7 kernel
supplemented with modifications to the CoreSight framework and drivers to be
usable by the Perf core. The remaining out of tree patches are being
upstreamed incrementally.
@@ -163,14 +163,14 @@ the host's (which has nothing to do with the target) architecture:
Off Target Perf Tools Compilation
---------------------------------
As stated above not all the pieces of the solution have been upstreamed. To
-get all the components branch `perf-opencsd-4.7-rc4` needs to be
+get all the components branch `perf-opencsd-4.7` needs to be
obtained:
- linaro@t430:~/linaro/coresight$ git clone -b perf-opencsd-4.7-rc4 https://github.com/Linaro/OpenCSD.git perf-opencsd-4.7-rc4
+ linaro@t430:~/linaro/coresight$ git clone -b perf-opencsd-4.7 https://github.com/Linaro/OpenCSD.git perf-opencsd-4.7
...
...
- linaro@t430:~/linaro/coresight$ ls perf-opencsd-4.7-rc4/
+ linaro@t430:~/linaro/coresight$ ls perf-opencsd-4.7/
arch certs CREDITS Documentation firmware include ipc Kconfig lib Makefile net REPORTING-BUGS scripts sound usr
block COPYING crypto drivers fs init Kbuild kernel MAINTAINERS mm README samples security tools virt
@@ -179,12 +179,12 @@ variable telling the build scripts where to find the library is needed. If
the `CSTRACE_PATH` variable is not defined the compilation will still be
successful, but handling of CoreSight trace data won't be supported.
- linaro@t430:~/linaro/coresight$ cd perf-opencsd-4.7-rc4
- linaro@t430:~/linaro/coresight/perf-opencsd-4.7-rc4$ export CSTRACE_PATH=~/linaro/coresight/opencsd-0v003/decoder
- linaro@t430:~/linaro/coresight/perf-opencsd-4.7-rc4$ make -C tools/perf
+ linaro@t430:~/linaro/coresight$ cd perf-opencsd-4.7
+ linaro@t430:~/linaro/coresight/perf-opencsd-4.7$ export CSTRACE_PATH=~/linaro/coresight/opencsd-0v003/decoder
+ linaro@t430:~/linaro/coresight/perf-opencsd-4.7$ make -C tools/perf
...
...
- linaro@t430:~/linaro/coresight/perf-opencsd-4.7-rc4$ ls -l tools/perf/perf
+ linaro@t430:~/linaro/coresight/perf-opencsd-4.7$ ls -l tools/perf/perf
-rwxrwxr-x 1 linaro linaro 6276360 Mar 3 10:05 tools/perf/perf
@@ -224,7 +224,7 @@ to be sure everything is clean.
linaro@t430:~/linaro/coresight/feb24$ rm -rf ~/.debug
linaro@t430:~/linaro/coresight/feb24$ cp -dpR .debug ~/
linaro@t430:~/linaro/coresight/feb24$ export LD_LIBRARY_PATH=~/linaro/coresight/opencsd-0v003/decoder/lib/linux64/dbg/
- linaro@t430:~/linaro/coresight/feb24$ ../perf-opencsd-4.7-rc4/tools/perf/perf report --stdio
+ linaro@t430:~/linaro/coresight/feb24$ ../perf-opencsd-4.7/tools/perf/perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
@@ -268,7 +268,7 @@ to be sure everything is clean.
Additional data can be obtained, which contains a dump of the trace packets received using the command
- mjl@ubuntu-vbox:./perf-opencsd-4.7-rc4/coresight/tools/perf/perf report --stdio --dump
+ mjl@ubuntu-vbox:./perf-opencsd-4.7/coresight/tools/perf/perf report --stdio --dump
resulting a large amount of data, trace looking like:-
@@ -317,10 +317,10 @@ Trace Decoding with Perf Script
Working with perf scripts needs more command line options but yields
interesting results.
- linaro@t430:~/linaro/coresight/feb24$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-4.7-rc4/tools/perf/
+ linaro@t430:~/linaro/coresight/feb24$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-4.7/tools/perf/
linaro@t430:~/linaro/coresight/feb24$ export SCRIPT_PATH=$EXEC_PATH/scripts/python/
linaro@t430:~/linaro/coresight/feb24$ export XTOOL_PATH=/your/aarch64/toolchain/path/bin/
- linaro@t430:~/linaro/coresight/feb24$ ../perf-opencsd-4.7-rc4/tools/perf/perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump
+ linaro@t430:~/linaro/coresight/feb24$ ../perf-opencsd-4.7/tools/perf/perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump
7f89f24d80: 910003e0 mov x0, sp
7f89f24d84: 94000d53 bl 7f89f282d0 <free@plt+0x3790>
--
2.7.4
On 13 July 2016 at 10:35, Al Grant <Al.Grant(a)arm.com> wrote:
> Hi,
>
> When you see the libraries being mapped multiple times, are you just seeing the code and data segments? I see that too, I just ignore the data segments.
>
(Taking the liberty of CC'ing the list as this is probably a topic of interest)
Each time a library is mapped perf gets notified by the mm subsystem.
Part of the notification is a new vm_area_struct that contains the new
start address of the library (vm_area_struct::vm_start). Upon
receiving the notification the new address is communicated to the ETM
drivers which do the required filter configuration. That is all good
and working well.
On ARM64 (because I _assume_ X86 folks didn't see this) we get 3
notifications. For example, notification A will have address
0x7f93a60000 while, subsequently, notifications B and C both have
address 0x7f93a70000. Note that the latter two are 64K higher than the
first one.
Once the last notification has been received the code in the main
program is executed. That code (in the main program) jumps to library
code mapped at the address it got from the first notification and not
the last one, making the filter configuration all wrong.
As such I have to understand what notifications B and C are for. Based
on the vm_area_struct::vm_flags I'm guessing some sort of accounting
feature, but I'm not sure yet. If I ignore notifications B and C, things
work amazingly well and one can really see the power offered by
coresight.
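
To make the workaround above concrete, here is a minimal Python model of the "keep the first notification, ignore the later ones" policy. It is only an illustration of the idea, not the kernel or driver code: the `MmapNotification` record, the `filter_start_addresses` helper and the library name `libexample.so` are all hypothetical, and only the two addresses come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MmapNotification:
    """Hypothetical stand-in for what perf learns from one mapping event."""
    path: str      # mapped file (the library name used below is made up)
    vm_start: int  # start address, as in vm_area_struct::vm_start


def filter_start_addresses(notifications):
    """Return {path: vm_start}, keeping only the first notification per path."""
    chosen = {}
    for n in notifications:
        # setdefault leaves an existing entry alone, so B and C are ignored.
        chosen.setdefault(n.path, n.vm_start)
    return chosen


events = [
    MmapNotification("libexample.so", 0x7F93A60000),  # notification A
    MmapNotification("libexample.so", 0x7F93A70000),  # notification B
    MmapNotification("libexample.so", 0x7F93A70000),  # notification C
]

starts = filter_start_addresses(events)
offset = events[1].vm_start - events[0].vm_start

print(hex(starts["libexample.so"]))  # address used for the filter
print(offset // 1024, "KiB")         # distance between A and B/C
```

With this policy the filter is programmed with the address from notification A, which is the one the main program actually jumps to, and the 64K gap to B and C falls out of the subtraction.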
That's where I'm at now.
Get back to me if you (or anyone else) want more information.
Mathieu
> Al
An issue was noted while running tests in Windows debug mode on the latest
development library.
As noted in the patch commit message, an initialisation issue for the
output packet structure was highlighted by the Windows debug memory
initialisation (0xcdcdcdcd), which did not show up in Linux tests (probably
due to default 0 init).
This led to the discovery of an additional issue with the setting of the
.isa field in the instruction range output packets. Prior to the patch this
was defaulting to AArch32 on Linux and was not always being set correctly in
the ETMv4 and PTM decoders. The default for PTM was probably OK in non-Thumb
cases, but the AArch64 Juno captures were consistently reporting AArch32 in
the output packets, which should have been AArch64.
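
A small Python model may help show why the fill pattern matters here. This is a sketch under stated assumptions, not the library's code: the one-field packet layout and the numeric ISA values in `ISA_NAMES` are hypothetical, chosen only so that zero reads as AArch32 as described above. The 0xCD byte is the fill pattern mentioned in this report.

```python
import struct

# Hypothetical encoding for illustration only: zero reads as AArch32.
ISA_NAMES = {0: "AArch32", 1: "AArch64"}

# Hypothetical packet layout: a single little-endian 32-bit 'isa' field.
PACKET_FORMAT = "<I"


def new_packet(fill_byte):
    """Model freshly allocated, not yet initialised packet memory."""
    return bytearray([fill_byte]) * struct.calcsize(PACKET_FORMAT)


def read_isa(packet):
    """Decode the 'isa' field, flagging values that are not a known ISA."""
    (value,) = struct.unpack(PACKET_FORMAT, packet)
    return ISA_NAMES.get(value, f"invalid (0x{value:08x})")


def set_isa(packet, value):
    """What a decoder should do before emitting the packet."""
    struct.pack_into(PACKET_FORMAT, packet, 0, value)


zero_filled = new_packet(0x00)    # memory that happens to be zero
debug_filled = new_packet(0xCD)   # debug-mode fill pattern

print(read_isa(zero_filled))      # looks valid, but is wrong for AArch64 code
print(read_isa(debug_filled))     # obviously bogus, so the bug is visible

set_isa(debug_filled, 1)          # explicit initialisation fixes both cases
print(read_isa(debug_filled))
```

In the zero-filled case the missing assignment goes unnoticed because the field happens to decode as a legal value, whereas the debug fill produces a value that cannot be mistaken for a real ISA.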
The updated release has been tested against the opencsd-perf-4.7-rc4 build
of the perf report/script tools.
As expected the output from perf report is unchanged.
However, the output from perf script, which runs the architecture-based
disassembly, is also unchanged, suggesting that this code is not at present
taking note of the ISA supplied by the trace output.
Running against both the unpatched library, with packets marked as AArch32,
and the patched library, with packets marked as AArch64, resulted in the
disassembly correctly being output as AArch64.
I assume that the disassembly routines are obtaining the current core arch
from other information in the perf.data file. We should probably consider
if this is the best way to go in this case.
Regards
Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Blackburn Design Centre. UK