The current method for allocating trace source ID values to sources is
to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
The STM is allocated ID 0x1.
This fixed algorithm is used in both the CoreSight driver code, and by
perf when writing the trace metadata in the AUXTRACE_INFO record.
The method needs replacing because currently:
1. It uses the available IDs inefficiently.
2. It does not scale to larger systems with many cores, and the algorithm
has no limits, so it will generate invalid trace IDs for cpu number > 47.
Additionally, requirements to allocate additional system IDs on some
systems have been seen.
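To make the scaling limit concrete, here is a minimal sketch of the fixed mapping. It assumes valid trace IDs lie in the range 0x01 to 0x6F; the constant and function names are made up for this illustration and are not the driver's.

```python
# Sketch of the legacy fixed mapping (cpu_num * 2 + 0x10).
# The valid range below is an assumption for this example.
CPU_ID_BASE = 0x10   # base value of the fixed formula
VALID_ID_MIN = 0x01  # assumed lowest valid trace ID
VALID_ID_MAX = 0x6F  # assumed highest valid trace ID


def legacy_trace_id(cpu_num):
    """Return the trace ID the fixed algorithm assigns to a CPU."""
    return cpu_num * 2 + CPU_ID_BASE


def is_valid_trace_id(trace_id):
    """Check an ID against the assumed valid range."""
    return VALID_ID_MIN <= trace_id <= VALID_ID_MAX


def first_invalid_cpu(max_cpus=256):
    """Return the lowest CPU number whose legacy ID is out of range."""
    for cpu in range(max_cpus):
        if not is_valid_trace_id(legacy_trace_id(cpu)):
            return cpu
    return None


def usable_cpu_count(max_cpus=256):
    """Count how many CPUs receive an in-range ID from the formula."""
    return sum(
        1 for cpu in range(max_cpus) if is_valid_trace_id(legacy_trace_id(cpu))
    )
```

Under that assumption, CPU 47 maps to 0x6E and CPU 48 maps to 0x70, the first out-of-range value. Because only every second ID is used, 48 CPUs exhaust a space of 111 valid IDs.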
This patch set introduces an API that allows the allocation of trace IDs
in a dynamic manner.
Architecturally reserved IDs are never allocated, and the system is
limited to allocating only valid IDs.
Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
the new API.
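The idea of dynamic allocation can be sketched as follows. This is a toy model, not the API added by the patch set: the class and method names are hypothetical, and the valid range 0x01 to 0x6F is an assumption.

```python
class TraceIdMap:
    """Toy dynamic trace ID allocator (hypothetical, not the kernel API)."""

    ID_MIN = 0x01  # assumed lowest valid ID
    ID_MAX = 0x6F  # assumed highest valid ID; values above it are reserved

    def __init__(self):
        self.used = set()   # every ID currently handed out
        self.cpu_ids = {}   # cpu number -> ID held by that CPU's source

    def _alloc(self):
        """Hand out the lowest free valid ID, or fail when none is left."""
        for trace_id in range(self.ID_MIN, self.ID_MAX + 1):
            if trace_id not in self.used:
                self.used.add(trace_id)
                return trace_id
        raise RuntimeError("no free trace IDs")

    def get_cpu_id(self, cpu):
        """Return the ID for a CPU source, allocating it on first request."""
        if cpu not in self.cpu_ids:
            self.cpu_ids[cpu] = self._alloc()
        return self.cpu_ids[cpu]

    def put_cpu_id(self, cpu):
        """Release the ID held by a CPU source, if any."""
        trace_id = self.cpu_ids.pop(cpu, None)
        if trace_id is not None:
            self.used.discard(trace_id)

    def get_system_id(self):
        """Allocate an ID for a non-CPU source such as the STM."""
        return self._alloc()

    def put_system_id(self, trace_id):
        """Release a system ID."""
        self.used.discard(trace_id)
```

In this model the CPU number no longer determines the ID, so a source on a high-numbered CPU still gets a valid value as long as any ID is free.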
perf handling is changed so that the ID associated with the CPU is read
from sysfs. The ID allocator is notified when perf events start and stop
so CPU based IDs are kept constant throughout any perf session.
For the ETMx.x devices, IDs are allocated on the following events:
a) When using sysfs, an ID is allocated on hardware enable, and freed
when the sysfs reset is written.
b) When using perf, an ID is allocated on hardware enable, and freed on
hardware disable.
In both cases, the ID is also allocated when sysfs is read to get the
current trace ID. This ensures that consistent decode metadata can be
extracted from the system when this read occurs before device enable.
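One way to keep CPU IDs constant during a perf session is to defer releases until the last session stops. The sketch below models that behaviour under stated assumptions: the class, its methods, and the deferral strategy are hypothetical and are not taken from the patches.

```python
class SessionAwareIds:
    """Toy model: CPU IDs released during a perf session are only
    freed once the last session has stopped (assumed strategy)."""

    ID_MIN, ID_MAX = 0x01, 0x6F  # assumed valid ID range

    def __init__(self):
        self.cpu_ids = {}       # cpu -> allocated trace ID
        self.pending = set()    # cpus whose release is deferred
        self.perf_sessions = 0  # number of active perf sessions

    def read_or_enable(self, cpu):
        """Return the CPU's ID, allocating one on first use
        (models both a sysfs read and a hardware enable)."""
        self.pending.discard(cpu)
        if cpu not in self.cpu_ids:
            used = set(self.cpu_ids.values())
            for trace_id in range(self.ID_MIN, self.ID_MAX + 1):
                if trace_id not in used:
                    self.cpu_ids[cpu] = trace_id
                    break
            else:
                raise RuntimeError("no free trace IDs")
        return self.cpu_ids[cpu]

    def release(self, cpu):
        """Free the CPU's ID now, or defer while perf sessions are active."""
        if cpu not in self.cpu_ids:
            return
        if self.perf_sessions > 0:
            self.pending.add(cpu)
        else:
            del self.cpu_ids[cpu]

    def perf_start(self):
        """Notify the allocator that a perf session has started."""
        self.perf_sessions += 1

    def perf_stop(self):
        """Notify the allocator that a perf session has stopped."""
        if self.perf_sessions > 0:
            self.perf_sessions -= 1
        if self.perf_sessions == 0:
            for cpu in self.pending:
                self.cpu_ids.pop(cpu, None)
            self.pending.clear()
```

With this model, an ID read before enable is the same one used during the session, even if the source is disabled and re-enabled in between.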
Note: This patchset breaks backward compatibility for perf record.
Because the method for generating the AUXTRACE_INFO metadata has
changed, using an older perf record will result in metadata that
does not match the trace IDs used in the recorded trace data.
This mismatch will cause subsequent decode to fail. Older versions of
perf will still be able to decode data generated by the updated system.
Applies to coresight/next [b54f53bc11a5]
Tested on DB410c
Mike Leach (10):
coresight: trace-id: Add API to dynamically assign trace ID values
coresight: trace-id: Set up source trace ID map for system
coresight: stm: Update STM driver to use Trace ID api
coresight: etm4x: Use trace ID API to dynamically allocate trace ID
coresight: etm3x: Use trace ID API to allocate IDs
coresight: perf: traceid: Add perf notifiers for trace ID
perf: cs-etm: Update event to read trace ID from sysfs
coresight: Remove legacy Trace ID allocation mechanism
coresight: etmX.X: stm: Remove unused legacy source trace ID ops
coresight: trace-id: Add debug & test macros to trace id allocation
drivers/hwtracing/coresight/Makefile | 2 +-
drivers/hwtracing/coresight/coresight-core.c | 64 ++---
.../hwtracing/coresight/coresight-etm-perf.c | 16 +-
drivers/hwtracing/coresight/coresight-etm.h | 3 +-
.../coresight/coresight-etm3x-core.c | 93 ++++---
.../coresight/coresight-etm3x-sysfs.c | 28 +-
.../coresight/coresight-etm4x-core.c | 63 ++++-
.../coresight/coresight-etm4x-sysfs.c | 32 ++-
drivers/hwtracing/coresight/coresight-etm4x.h | 3 +
drivers/hwtracing/coresight/coresight-priv.h | 1 +
drivers/hwtracing/coresight/coresight-stm.c | 49 +---
.../hwtracing/coresight/coresight-trace-id.c | 255 ++++++++++++++++++
.../hwtracing/coresight/coresight-trace-id.h | 69 +++++
include/linux/coresight-pmu.h | 12 -
include/linux/coresight.h | 3 -
tools/perf/arch/arm/util/cs-etm.c | 12 +-
16 files changed, 530 insertions(+), 175 deletions(-)
create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.c
create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.h
--
2.17.1
Hi all,
My colleague Junhao He noticed this issue when tracing CPU48 on the
Kunpeng920 platform. The log is as follows:
[root@localhost ~]# perf record -e cs_etm/@sink_smb1/ -C 48 -o perf.data taskset -c 48 uname -a
[root@localhost ~]# perf report -D --stdio -i perf.data > perf_48.log
0x270 [0xc8]: failed to process type: 70 [Invalid argument]
Error:
failed to process sample
[root@localhost ~]# perf -v
perf version 5.17.rc4.gdeea22e4af29
[root@localhost ~]# ldd /usr/bin/perf | grep opencsd
libopencsd_c_api.so.1 => /root/lib/libopencsd_c_api.so.1
libopencsd.so.1 => /root/lib/libopencsd.so.1
As (CORESIGHT_ETM_PMU_SEED + (cpu * 2)) is used in
coresight_get_trace_id() to calculate trace_id, if there are more than
48 CPUs on a chip, some ETM devices will have an invalid trace ID
(trace_id = 0 or trace_id > 0x6F). In this situation, we cannot
parse the trace data using the perf tool.
Perhaps we should keep trace_id in the range of 1 to 0x6F in
coresight_get_trace_id()? But there might also be a parsing problem if
a duplicate trace ID is used during collection.
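The duplicate-ID concern can be shown with a small sketch. The wrapping formula here is only one hypothetical way to force the value into 1..0x6F, not a proposal from this thread; it assumes the seed is 0x10 and that the STM holds ID 0x1.

```python
# Hypothetical wrap of the fixed formula into the range 1..0x6F.
STM_TRACE_ID = 0x01          # assumed fixed ID of the STM
ID_MIN, ID_MAX = 0x01, 0x6F  # range suggested in the text
CPU_ID_BASE = 0x10           # assumed value of CORESIGHT_ETM_PMU_SEED


def wrapped_trace_id(cpu):
    """Fold the raw (seed + cpu * 2) value back into ID_MIN..ID_MAX."""
    raw = CPU_ID_BASE + cpu * 2
    span = ID_MAX - ID_MIN + 1
    return (raw - ID_MIN) % span + ID_MIN


def find_collisions(num_cpus):
    """List (cpu, trace_id, existing_owner) for every reused ID."""
    owners = {STM_TRACE_ID: "stm"}
    collisions = []
    for cpu in range(num_cpus):
        trace_id = wrapped_trace_id(cpu)
        if trace_id in owners:
            collisions.append((cpu, trace_id, owners[trace_id]))
        else:
            owners[trace_id] = f"cpu{cpu}"
    return collisions
```

With this particular wrap, CPUs 0 to 47 keep their original IDs, but CPU 48 lands on 0x01 and collides with the STM, so the decoder could not tell the two sources apart.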
Any response will be highly appreciated.
Thanks,
Qi
On Sun, May 15, 2022 at 02:18:53PM -0700, Ian Rogers wrote:
[...]
> This looks good to me and will run on python 2. The code would be more
> idiomatic in python3 using f-strings. I'd rather the code was
> idiomatic from the beginning, but others may disagree and prefer
> python 2 compatibility (python 2 is now end of life). f-strings are
> python 3.6+ and so have been supported for 5 years.
Using f-strings is the right thing to do; I will update.
Thanks for the review and the suggestion!
Leo
The following changes since commit ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e:
Linux 5.18-rc2 (2022-04-10 14:21:36 -1000)
are available in the Git repository at:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux.git tags/coresight-next-v5.19
for you to fetch changes up to 1adff542d67a2ed1120955cb219bfff8a9c53f59:
coresight: cpu-debug: Replace mutex with mutex_trylock on panic notifier (2022-05-09 16:03:24 +0100)
----------------------------------------------------------------
Coresight changes for v5.19
Good day Greg,
Please consider these for the upcoming v5.19 merge window when you have time.
This pull request includes:
- Work to uniformise access to the ETMv4 registers, making it easier to
look for and change register accesses.
- A correction to a probing failure when looking for links between devices.
- The replacement of a call to mutex_lock() with a mutex_trylock() in the panic
notifier of the cpu-debug infrastructure to avoid a possible deadlock.
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
----------------------------------------------------------------
Guilherme G. Piccoli (1):
coresight: cpu-debug: Replace mutex with mutex_trylock on panic notifier
James Clark (15):
coresight: etm4x: Cleanup TRCIDR0 register accesses
coresight: etm4x: Cleanup TRCIDR2 register accesses
coresight: etm4x: Cleanup TRCIDR3 register accesses
coresight: etm4x: Cleanup TRCIDR4 register accesses
coresight: etm4x: Cleanup TRCIDR5 register accesses
coresight: etm4x: Cleanup TRCCONFIGR register accesses
coresight: etm4x: Cleanup TRCEVENTCTL1R register accesses
coresight: etm4x: Cleanup TRCSTALLCTLR register accesses
coresight: etm4x: Cleanup TRCVICTLR register accesses
coresight: etm3x: Cleanup ETMTECR1 register accesses
coresight: etm4x: Cleanup TRCACATRn register accesses
coresight: etm4x: Cleanup TRCSSCCRn and TRCSSCSRn register accesses
coresight: etm4x: Cleanup TRCSSPCICRn register accesses
coresight: etm4x: Cleanup TRCBBCTLR register accesses
coresight: etm4x: Cleanup TRCRSCTLRn register accesses
Mao Jinlong (1):
coresight: core: Fix coresight device probe failure issue
drivers/hwtracing/coresight/coresight-core.c | 33 ++--
drivers/hwtracing/coresight/coresight-cpu-debug.c | 7 +-
drivers/hwtracing/coresight/coresight-etm3x-core.c | 2 +-
.../hwtracing/coresight/coresight-etm3x-sysfs.c | 2 +-
drivers/hwtracing/coresight/coresight-etm4x-core.c | 136 +++++-----------
.../hwtracing/coresight/coresight-etm4x-sysfs.c | 180 +++++++++++----------
drivers/hwtracing/coresight/coresight-etm4x.h | 120 +++++++++++---
7 files changed, 268 insertions(+), 212 deletions(-)
Hi,
This change has a soft dependency on [1], but assuming the name/location
of the new sysfs interface (ts_source) doesn't change, it should be safe
to apply.
The new 'ts_source' interface allows perf to detect whether the timestamps
in the trace correspond to the value of CNTVCT_EL0, which we can convert
to a perf timestamp and store in the instruction and branch samples.
Due to the way the trace is compressed and decoded by OpenCSD, we only
know the precise time of the first instruction in a range, but I think
for now this is better than not having timestamps at all...
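For illustration, a counter value can be converted to perf time with the scaling scheme documented for the perf_event_mmap_page fields time_zero, time_shift and time_mult. This is a minimal sketch of that arithmetic; the function name is hypothetical, and whether this series uses exactly this path is an assumption.

```python
def cntvct_to_perf_time(cyc, time_zero, time_shift, time_mult):
    """Convert a raw counter value to a perf timestamp in nanoseconds.

    Splits the counter into quotient and remainder around time_shift so
    the multiplication keeps the low-order bits of the result.
    """
    quot = cyc >> time_shift
    rem = cyc & ((1 << time_shift) - 1)
    return time_zero + quot * time_mult + ((rem * time_mult) >> time_shift)
```

For example, with time_shift = 10 and time_mult = 2048 each counter tick corresponds to 2 ns, so a counter value of 1027 with time_zero = 1000 converts to 3054.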
Thanks,
German
[1] https://lore.kernel.org/all/20220503123537.1003035-1-german.gomez@arm.com/
German Gomez (4):
perf pmu: Add function to check if a pmu file exists
perf cs_etm: Keep separate symbols for ETMv4 and ETE parameters
perf cs_etm: Record ts_source in AUXTRACE_INFO for ETMv4 and ETE
perf cs_etm: Set the time field in the synthetic samples
tools/perf/arch/arm/util/cs-etm.c | 89 +++++++++++++++++++--
tools/perf/util/cs-etm.c | 126 +++++++++++++++++++++++++-----
tools/perf/util/cs-etm.h | 13 ++-
tools/perf/util/pmu.c | 17 ++++
tools/perf/util/pmu.h | 2 +
5 files changed, 221 insertions(+), 26 deletions(-)
--
2.25.1