On 19/10/2022 13:49, Jilin Yuan wrote:
> Delete the redundant word 'the'.
>
> Signed-off-by: Jilin Yuan <yuanjilin(a)cdjrlc.com>
Thanks, Queued.
> ---
> drivers/hwtracing/coresight/coresight-etm4x-core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index 80fefaba58ee..849c563cfc65 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -1478,7 +1478,7 @@ static int etm4_set_event_filters(struct etmv4_drvdata *drvdata,
> /*
> * If filters::ssstatus == 1, trace acquisition was
> * started but the process was yanked away before the
> - * the stop address was hit. As such the start/stop
> + * stop address was hit. As such the start/stop
> * logic needs to be re-started so that tracing can
> * resume where it left.
> *
This is weird. This shouldn't be making its way into any man page at
all. It's just a file pointing people at other parts of the kernel tree
for docs (because when I looked for docs - i found nothing for any kind
of getting started guide - I put something here as this was the obvious
place for perf related docs for testing - but was told to put the docs
elsewhere in core kernel documentation, but I put a reference here for
those who follow after to be able to more easily find it).
There is some rule somewhere it seems that makes anything perf-*.txt
into a man page. I need to rename this file it seems.
On 10/11/22 08:27, Sven Schnelle wrote:
> Hi Carsten,
>
> carsten.haitzler(a)foss.arm.com writes:
>
>> From: Carsten Haitzler <carsten.haitzler(a)arm.com>
>>
>> Add/improve documentation helping people get started with CoreSight and
>> perf as well as describe the testing and how it works.
>>
>> Cc: linux-doc(a)vger.kernel.org
>> Signed-off-by: Carsten Haitzler <carsten.haitzler(a)arm.com>
>> ---
>> .../trace/coresight/coresight-perf.rst | 158 ++++++++++++++++++
>> .../perf/Documentation/perf-arm-coresight.txt | 5 +
>> 2 files changed, 163 insertions(+)
>> create mode 100644 Documentation/trace/coresight/coresight-perf.rst
>> create mode 100644 tools/perf/Documentation/perf-arm-coresight.txt
>>
>
> Investigating a rpm build failure i see the following error with next-20221011:
>
> % cd tools/perf/Documentation
> % make O=/home/svens/linux-build/ man V=1
> rm -f /home/svens/linux-build/perf-arm-coresight.xml+ /home/svens/linux-build/perf-arm-coresight.xml && \
> asciidoc -b docbook -d manpage \
> --unsafe -f asciidoc.conf -aperf_version= \
> -aperf_date=2022-10-06 \
> -o /home/svens/linux-build/perf-arm-coresight.xml+ perf-arm-coresight.txt && \
> mv /home/svens/linux-build/perf-arm-coresight.xml+ /home/svens/linux-build/perf-arm-coresight.xml
> asciidoc: FAILED: manpage document title is mandatory
> make: *** [Makefile:266: /home/svens/linux-build/perf-arm-coresight.xml] Error 1
>
> Can you please fix this?
>
> Thanks!
From: Carsten Haitzler <carsten.haitzler(a)arm.com>
Add/improve documentation helping people get started with CoreSight and
perf as well as describe the testing and how it works.
Cc: linux-doc(a)vger.kernel.org
Signed-off-by: Carsten Haitzler <carsten.haitzler(a)arm.com>
---
.../trace/coresight/coresight-perf.rst | 158 ++++++++++++++++++
tools/perf/Documentation/arm-coresight.txt | 5 +
2 files changed, 163 insertions(+)
create mode 100644 Documentation/trace/coresight/coresight-perf.rst
create mode 100644 tools/perf/Documentation/arm-coresight.txt
diff --git a/Documentation/trace/coresight/coresight-perf.rst b/Documentation/trace/coresight/coresight-perf.rst
new file mode 100644
index 000000000000..d087aae7d492
--- /dev/null
+++ b/Documentation/trace/coresight/coresight-perf.rst
@@ -0,0 +1,158 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+CoreSight - Perf
+================
+
+ :Author: Carsten Haitzler <carsten.haitzler(a)arm.com>
+ :Date: June 29th, 2022
+
+Perf is able to locally access CoreSight trace data and store it to the
+output perf data files. This data can then be later decoded to give the
+instructions that were traced for debugging or profiling purposes. You
+can log such data with a perf record command like::
+
+ perf record -e cs_etm//u testbinary
+
+This would run some test binary (testbinary) until it exits and record
+a perf.data trace file. That file would have AUX sections if CoreSight
+is working correctly. You can dump the content of this file as
+readable text with a command like::
+
+ perf report --stdio --dump -i perf.data
+
+You should find some sections of this file have AUX data blocks like::
+
+ 0x1e78 [0x30]: PERF_RECORD_AUXTRACE size: 0x11dd0 offset: 0 ref: 0x1b614fc1061b0ad1 idx: 0 tid: 531230 cpu: -1
+
+ . ... CoreSight ETM Trace data: size 73168 bytes
+ Idx:0; ID:10; I_ASYNC : Alignment Synchronisation.
+ Idx:12; ID:10; I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
+ Idx:17; ID:10; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
+ Idx:26; ID:10; I_TRACE_ON : Trace On.
+ Idx:27; ID:10; I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFFB6069140; Ctxt: AArch64,EL0, NS;
+ Idx:38; ID:10; I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
+ Idx:39; ID:10; I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
+ Idx:40; ID:10; I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
+ Idx:41; ID:10; I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEN
+ ...
+
+If you see these above, then your system is tracing CoreSight data
+correctly.
+
+To compile perf with CoreSight support in the tools/perf directory do::
+
+ make CORESIGHT=1
+
+This requires OpenCSD to build. You may install distribution packages
+for the support such as libopencsd and libopencsd-dev or download it
+and build yourself. Upstream OpenCSD is located at:
+
+ https://github.com/Linaro/OpenCSD
+
+For complete information on building perf with CoreSight support and
+more extensive usage look at:
+
+ https://github.com/Linaro/OpenCSD/blob/master/HOWTO.md
+
+
+Kernel CoreSight Support
+------------------------
+
+You will also want CoreSight support enabled in your kernel config.
+Ensure it is enabled with::
+
+ CONFIG_CORESIGHT=y
+
+There are various other CoreSight options you probably also want
+enabled like::
+
+ CONFIG_CORESIGHT_LINKS_AND_SINKS=y
+ CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
+ CONFIG_CORESIGHT_CATU=y
+ CONFIG_CORESIGHT_SINK_TPIU=y
+ CONFIG_CORESIGHT_SINK_ETBV10=y
+ CONFIG_CORESIGHT_SOURCE_ETM4X=y
+ CONFIG_CORESIGHT_CTI=y
+ CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
+
+Please refer to the kernel configuration help for more information.
+
+Perf test - Verify kernel and userspace perf CoreSight work
+-----------------------------------------------------------
+
+When you run perf test, it will do a lot of self tests. Some of those
+tests will cover CoreSight (only if enabled and on ARM64). You
+generally would run perf test from the tools/perf directory in the
+kernel tree. Some tests will check some internal perf support like:
+
+ Check Arm CoreSight trace data recording and synthesized samples
+ Check Arm SPE trace data recording and synthesized samples
+
+Some others will actually use perf record and some test binaries that
+are in tests/shell/coresight and will collect traces to ensure a
+minimum level of functionality is met. The scripts that launch these
+tests are in the same directory. These will all look like:
+
+ CoreSight / ASM Pure Loop
+ CoreSight / Memcpy 16k 10 Threads
+ CoreSight / Thread Loop 10 Threads - Check TID
+ etc.
+
+These perf record tests will not run if the tool binaries do not exist
+in tests/shell/coresight/\*/ and will be skipped. If you do not have
+CoreSight support in hardware then either do not build perf with
+CoreSight support or remove these binaries in order to not have these
+tests fail and have them skip instead.
+
+These tests will log historical results in the current working
+directory (e.g. tools/perf) and will be named stats-\*.csv like:
+
+ stats-asm_pure_loop-out.csv
+ stats-memcpy_thread-16k_10.csv
+ ...
+
+These statistic files log some aspects of the AUX data sections in
+the perf data output counting some numbers of certain encodings (a
+good way to know that it's working in a very simple way). One problem
+with CoreSight is that given a large enough amount of data needing to
+be logged, some of it can be lost due to the processor not waking up
+in time to read out all the data from buffers etc.. You will notice
+that the amount of data collected can vary a lot per run of perf test.
+If you wish to see how this changes over time, simply run perf test
+multiple times and all these csv files will have more and more data
+appended to it that you can later examine, graph and otherwise use to
+figure out if things have become worse or better.
+
+This means sometimes these tests fail as they don't capture all the
+data needed. This is about tracking quality and amount of data
+produced over time and to see when changes to the Linux kernel improve
+quality of traces.
+
+Be aware that some of these tests take quite a while to run, specifically
+in processing the perf data file and dumping contents to then examine what
+is inside.
+
+You can change where these csv logs are stored by setting the
+PERF_TEST_CORESIGHT_STATDIR environment variable before running perf
+test like::
+
+ export PERF_TEST_CORESIGHT_STATDIR=/var/tmp
+ perf test
+
+They will also store resulting perf output data in the current
+directory for later inspection like::
+
+ perf-asm_pure_loop-out.data
+ perf-memcpy_thread-16k_10.data
+ ...
+
+You can alter where the perf data files are stored by setting the
+PERF_TEST_CORESIGHT_DATADIR environment variable such as::
+
+ PERF_TEST_CORESIGHT_DATADIR=/var/tmp
+ perf test
+
+You may wish to set these above environment variables if you wish to
+keep the output of tests outside of the current working directory for
+longer term storage and examination.
diff --git a/tools/perf/Documentation/arm-coresight.txt b/tools/perf/Documentation/arm-coresight.txt
new file mode 100644
index 000000000000..c117fc50a2a9
--- /dev/null
+++ b/tools/perf/Documentation/arm-coresight.txt
@@ -0,0 +1,5 @@
+Arm CoreSight Support
+=====================
+
+For full documentation, see Documentation/trace/coresight/coresight-perf.rst
+in the kernel tree.
--
2.32.0
This test commonly fails on Arm Juno because the instruction interval
is large enough to miss generating any samples for Perf in system-wide
mode.
Fix this by lowering the interval until a comfortable number of Perf
instructions are generated. The test is still quick to run because only
a small amount of trace is gathered.
Before:
sudo ./perf test coresight -vvv
...
Recording trace with system wide mode
Looking at perf.data file for dumping branch samples:
Looking at perf.data file for reporting branch samples:
Looking at perf.data file for instruction samples:
CoreSight system wide testing: FAIL
...
After:
sudo ./perf test coresight -vvv
...
Recording trace with system wide mode
Looking at perf.data file for dumping branch samples:
Looking at perf.data file for reporting branch samples:
Looking at perf.data file for instruction samples:
CoreSight system wide testing: PASS
...
Signed-off-by: James Clark <james.clark(a)arm.com>
---
tools/perf/tests/shell/test_arm_coresight.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/shell/test_arm_coresight.sh b/tools/perf/tests/shell/test_arm_coresight.sh
index e4cb4f1806ff..daad786cf48d 100755
--- a/tools/perf/tests/shell/test_arm_coresight.sh
+++ b/tools/perf/tests/shell/test_arm_coresight.sh
@@ -70,7 +70,7 @@ perf_report_instruction_samples() {
# 68.12% touch libc-2.27.so [.] _dl_addr
# 5.80% touch libc-2.27.so [.] getenv
# 4.35% touch ld-2.27.so [.] _dl_fixup
- perf report --itrace=i1000i --stdio -i ${perfdata} 2>&1 | \
+ perf report --itrace=i20i --stdio -i ${perfdata} 2>&1 | \
egrep " +[0-9]+\.[0-9]+% +$1" > /dev/null 2>&1
}
--
2.28.0
On 13/10/2022 14:45, Christian Borntraeger wrote:
> Probably not related to this patch set, but
> tools/perf/Documentation/perf-arm-coresight.txt
>
> results in
> cd tools/perf/Documentation/
> make man
> ASCIIDOC perf-arm-coresight.xml
> asciidoc: ERROR: perf-arm-coresight.txt: line 2: malformed manpage title
> asciidoc: ERROR: perf-arm-coresight.txt: line 3: name section expected
> asciidoc: FAILED: perf-arm-coresight.txt: line 3: section title expected
>
> in linux-next.
>
I think this is the same as what has been reported here:
https://lore.kernel.org/linux-perf-users/20220909152803.2317006-1-carsten.h…
Carsten, are you able to take a look at this?
Thanks
James
The current method for allocating trace source ID values to sources is
to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
The STM is allocated ID 0x1.
This fixed algorithm is used in both the CoreSight driver code, and by
perf when writing the trace metadata in the AUXTRACE_INFO record.
The method needs replacing as currently:-
1. It is inefficient in using available IDs.
2. Does not scale to larger systems with many cores and the algorithm
has no limits so will generate invalid trace IDs for cpu number > 44.
Additionally requirements to allocate additional system IDs on some
systems have been seen.
This patch set introduces an API that allows the allocation of trace IDs
in a dynamic manner.
Architecturally reserved IDs are never allocated, and the system is
limited to allocating only valid IDs.
Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
the new API.
For the ETMx.x devices IDs are allocated on certain events
a) When using sysfs, an ID will be allocated on hardware enable, or a read of
sysfs TRCTRACEID register and freed when the sysfs reset is written.
b) When using perf, ID is allocated on during setup AUX event, and freed on
event free. IDs are communicated using the AUX_OUTPUT_HW_ID packet.
The ID allocator is notified when perf sessions start and stop
so CPU based IDs are kept constant throughout any perf session.
Note: This patchset breaks some backward compatibility for perf record and
perf report.
The version of the AUXTRACE_INFO has been updated to reflect the fact that
the trace source IDs are generated differently. This will
mean older versions of perf report cannot decode the newer file.
Applies to coresight/next [4d45bc82df66]
Tested on DB410c
Changes since v2:
1) Improved backward compatibility: (requested by James)
Using the new version of perf on an old kernel will generate a usable file
legacy metadata values are set by the new perf and will be used if mew
ID packets are not present in the file.
Using an older version of perf / simpleperf on an updated kernel may still
work. The trace ID allocator has been updated to use the legacy ID values
where possible, so generated file and used trace IDs will match up to the
point where the legacy algorithm is broken anyway.
2) Various changes to the ID allocator and ID packet format.
(suggested by Suzuki)
3) per CPU ID info in allocator now stored as atomic type to allow a passive read
without taking the allocator spinlock. perf flow now allocates and releases ID
values in setup_aux / free_event. Device enable and event enable use the passive
read to set the allocated values. This simplifies the locking mechanisms on the
perf run and fixes issues that arose with locking dependencies.
Changes since v1:
(after feedback & discussion with Mathieu & Suzuki).
1) API has changed. The global trace ID map is managed internally, so it
is no longer passed in to the API functions.
2) perf record does not use sysfs to find the trace IDs. These are now
output as AUX_OUTPUT_HW_ID events. The drivers, perf record, and perf report
have been updated accordingly to generate and handle these events.
Mike Leach (13):
coresight: trace-id: Add API to dynamically assign Trace ID values
coresight: Remove obsolete Trace ID unniqueness checks
coresight: stm: Update STM driver to use Trace ID API
coresight: etm4x: Update ETM4 driver to use Trace ID API
coresight: etm3x: Update ETM3 driver to use Trace ID API
coresight: etmX.X: stm: Remove trace_id() callback
coresight: perf: traceid: Add perf notifiers for Trace ID
perf: cs-etm: Move mapping of Trace ID and cpu into helper function
perf: cs-etm: Update record event to use new Trace ID protocol
kernel: events: Export perf_report_aux_output_id()
perf: cs-etm: Handle PERF_RECORD_AUX_OUTPUT_HW_ID packet
coresight: events: PERF_RECORD_AUX_OUTPUT_HW_ID used for Trace ID
coresight: trace-id: Add debug & test macros to Trace ID allocation
drivers/hwtracing/coresight/Makefile | 2 +-
drivers/hwtracing/coresight/coresight-core.c | 49 +--
.../hwtracing/coresight/coresight-etm-perf.c | 23 ++
drivers/hwtracing/coresight/coresight-etm.h | 3 +-
.../coresight/coresight-etm3x-core.c | 92 +++--
.../coresight/coresight-etm3x-sysfs.c | 27 +-
.../coresight/coresight-etm4x-core.c | 79 ++++-
.../coresight/coresight-etm4x-sysfs.c | 27 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 3 +
drivers/hwtracing/coresight/coresight-stm.c | 49 +--
.../hwtracing/coresight/coresight-trace-id.c | 266 ++++++++++++++
.../hwtracing/coresight/coresight-trace-id.h | 78 +++++
include/linux/coresight-pmu.h | 35 +-
include/linux/coresight.h | 3 -
kernel/events/core.c | 1 +
tools/include/linux/coresight-pmu.h | 48 ++-
tools/perf/arch/arm/util/cs-etm.c | 21 +-
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 +
tools/perf/util/cs-etm.c | 331 +++++++++++++++---
tools/perf/util/cs-etm.h | 14 +-
20 files changed, 933 insertions(+), 225 deletions(-)
create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.c
create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.h
--
2.17.1
On 21/07/2022 14:03, Sudeep Holla wrote:
> With lockdeps enabled, we get the following warning:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> ------------------------------------------------------
> kworker/u12:1/53 is trying to acquire lock:
> ffff80000adce220 (coresight_mutex){+.+.}-{4:4}, at: coresight_set_assoc_ectdev_mutex+0x3c/0x5c
> but task is already holding lock:
> ffff80000add1f60 (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
>
> which lock already depends on the new lock.
> the existing dependency chain (in reverse order) is:
>
> -> #1 (ect_mutex){+.+.}-{4:4}:
> __mutex_lock_common+0xd8/0xe60
> mutex_lock_nested+0x44/0x50
> cti_add_assoc_to_csdev+0x4c/0x184
> coresight_register+0x2f0/0x314
> tmc_probe+0x33c/0x414
>
> -> #0 (coresight_mutex){+.+.}-{4:4}:
> __lock_acquire+0x1a20/0x32d0
> lock_acquire+0x160/0x308
> __mutex_lock_common+0xd8/0xe60
> mutex_lock_nested+0x44/0x50
> coresight_set_assoc_ectdev_mutex+0x3c/0x5c
> cti_update_conn_xrefs+0x6c/0xf8
> cti_probe+0x33c/0x394
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
> CPU0 CPU1
> ---- ----
> lock(ect_mutex);
> lock(coresight_mutex);
> lock(ect_mutex);
> lock(coresight_mutex);
> *** DEADLOCK ***
>
> 4 locks held by kworker/u12:1/53:
> #0: ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1fc/0x63c
> #1: (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x228/0x63c
> #2: (&dev->mutex){....}-{4:4}, at: __device_attach+0x48/0x1a8
> #3: (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
>
> To fix the same, call cti_add_assoc_to_csdev without the holding
> coresight_mutex and confine the locking while setting the associated
> ect / cti device using coresight_set_assoc_ectdev_mutex().
>
> Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
> Cc: Suzuki K Poulose <suzuki.poulose(a)arm.com>
> Cc: Mike Leach <mike.leach(a)linaro.org>
> Cc: Leo Yan <leo.yan(a)linaro.org>
> Signed-off-by: Sudeep Holla <sudeep.holla(a)arm.com>
Thanks for the fix. I will push this out at rc1 as fixes.
Thanks
Suzuki
I'm still leaving out CONFIG_CORESIGHT_SOURCE_ETM4X because it depends
on the outcome of the investigation into CONFIG_PID_IN_CONTEXTIDR, but
I think we should enable these ones for now and start getting some of
the benefits sooner.
Changes since v1:
* Remove CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y which shouldn't be
enabled by default
-----
As suggested by Catalin here's the change to add Coresight to defconfig.
Unfortunately I don't think we should add CONFIG_CORESIGHT_SOURCE_ETM4X
which builds a few files until [1] is merged because of the overhead
of CONFIG_PID_IN_CONTEXTIDR.
[1]: https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/T/
applies to arm64/for-next/core (e99db032d186)
James Clark (1):
arm64: defconfig: Add Coresight as module
arch/arm64/configs/defconfig | 8 ++++++++
1 file changed, 8 insertions(+)
--
2.28.0