Hi all,
I am trying to enable the CoreSight framework on STM32MP1 (ARMv7), more
specifically CPU-wide kernel trace collection with perf.
I recently came across a decoding error from OpenCSD which prevents proper
trace decoding:
> perf report --stdio
DCD_ETMV3_0018 : 0x0013 (OCSD_ERR_BAD_PACKET_SEQ) [Bad packet sequence];
TrcIdx=6969; CS ID=12; Bad Packet sequence.
0x27438 [0x8]: failed to process type: 68 [Invalid argument]
> perf report --dump
[...]
Idx:6961; ID:12; P_HDR : Atom P-header.;
Idx:6962; ID:12; BRANCH_ADDRESS : Branch address.;
Addr=0x6C67BD40 ~[0x6C67BD40]; Exception=Jazelle;
Idx:6967; ID:12; P_HDR : Atom P-header.; EE
Idx:6968; ID:12; P_HDR : Atom P-header.;
PKTP_ETMV3_0018 : 0x0013 (OCSD_ERR_BAD_PACKET_SEQ) [Bad packet
sequence]; TrcIdx=6969; CS ID=12; A-Sync ? : Unexpected byte in sequence
Idx:6969; ID:12; BAD_SEQUENCE : Invalid sequence for packet type.[A_SYNC]
Idx:6970; ID:12; P_HDR : Atom P-header.; EEEEEEEEEEEEEEE
What could explain this behavior?
Thanks,
Raphaël
Hi Mike,
I was doing a CPU hotplug test today and encountered some CTI issues.
I'd like to know whether these are known issues that someone is already working on.
If no one is working on them, I can provide some patches later.
1. Deadlock
[ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
[ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
[ 988.357689] pc : smp_call_function_single+0x158/0x1b8
[ 988.362782] lr : smp_call_function_single+0x124/0x1b8
...
[ 988.451638] Call trace:
[ 988.454119] smp_call_function_single+0x158/0x1b8
[ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
[ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
[ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
[ 988.474957] enable_source_store+0x5c/0x9c [coresight]
[ 988.480140] dev_attr_store+0x14/0x28
[ 988.483839] sysfs_kf_write+0x38/0x4c
[ 988.487532] kernfs_fop_write+0x1c0/0x2b0
[ 988.491585] vfs_write+0xfc/0x300
[ 988.494931] ksys_write+0x78/0xe0
[ 988.498283] __arm64_sys_write+0x18/0x20
[ 988.502240] el0_svc_common+0x98/0x160
[ 988.506024] do_el0_svc+0x78/0x80
[ 988.509377] el0_sync_handler+0xd4/0x270
[ 988.513337] el0_sync+0x164/0x180
Root cause:
CPU6:
Grab drvdata->spinlock in cti_enable()
Call smp_call_function_single(drvdata->ctidev.cpu, cti_enable_hw_smp_call,
drvdata, 1);
and wait for CPU2 to write the CTI HW.
CPU2:
In cti_cpu_pm_notify() with interrupts disabled, spinning on drvdata->spinlock.
CPU2 therefore never services the IPI, and CPU6 never releases the lock.
2. Warning
[ 121.436987] WARNING: CPU: 1 PID: 15 at drivers/hwtracing/coresight/coresight-core.c:227
coresight_disclaim_device+0x30/0x44 [coresight]
[ 121.438144] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 121.438156] pstate: 80c00085 (Nzcv daIf +PAN +UAO BTYPE=--)
[ 121.438167] pc : coresight_disclaim_device+0x30/0x44 [coresight]
[ 121.438203] lr : cti_dying_cpu+0x34/0x4c [coresight_cti]
Root cause:
coresight_disclaim_device() is called unconditionally in cti_dying_cpu(), while
the device is claimed only when the CTI is enabled.
3. When checking the code, I think there's also an issue with pm_runtime_get_sync().
It's called in cti_starting_cpu() but the matching put() is not called in the
dying callback, so we could end up with an unbalanced PM usage count.
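Issues 2 and 3 both come down to a setup step in one hotplug callback that has no matching teardown in the other. A minimal user-space C model of that bookkeeping might look like the sketch below. It follows the description in this mail rather than the driver source; `model_cti` and the `model_*` functions are hypothetical names, and the balanced variant is only one possible shape of a fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the per-CPU CTI bookkeeping. */
struct model_cti {
    bool enabled;   /* CTI was enabled by the user before hotplug */
    bool claimed;   /* claim tag currently held */
    int pm_count;   /* PM references taken by the hotplug callbacks */
    int warnings;   /* number of WARN-like events observed */
};

/* CPU coming online: take a PM reference, claim only if enabled. */
void model_starting_cpu(struct model_cti *c)
{
    c->pm_count++;
    if (c->enabled)
        c->claimed = true;
}

/* Behaviour as reported: disclaim unconditionally, never drop the PM ref. */
void model_dying_cpu_reported(struct model_cti *c)
{
    if (!c->claimed)
        c->warnings++;  /* disclaiming an unclaimed device warns */
    c->claimed = false;
}

/* Balanced variant (assumption): disclaim only if claimed, drop the PM ref. */
void model_dying_cpu_balanced(struct model_cti *c)
{
    if (c->claimed)
        c->claimed = false;
    assert(c->pm_count > 0);
    c->pm_count--;
}
```

With a disabled CTI, one online/offline cycle through the reported path leaves one warning and a PM count of 1; the balanced path leaves both at zero.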
Test script:
adb wait-for-device root
adb wait-for-device
:loop
adb shell "echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink"
adb shell "echo 1 > /sys/bus/coresight/devices/etm2/enable_source"
adb shell "echo 0 > /sys/devices/system/cpu/cpu2/online"
adb shell "echo 1 > /sys/devices/system/cpu/cpu2/online"
adb shell "echo 0 > /sys/devices/system/cpu/cpu2/online"
adb shell "echo 1 > /sys/devices/system/cpu/cpu2/online"
adb shell "echo 0 > /sys/bus/coresight/devices/etm2/enable_source"
goto loop
Thanks,
Tingwei
Ftrace has the ability to export trace packets to other destinations.
Currently, only function trace can be exported. This series extends the
support to event trace and trace_marker. STM is one possible destination for
exported ftrace data. Use a separate channel for each CPU to avoid mixing up
packets from different CPUs.
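As a rough illustration of the per-CPU channel idea, the mapping could be as simple as the sketch below. `model_stm_channel` is a hypothetical helper and not code from this series; it assumes the number of allocated channels equals the number of possible CPUs.

```c
#include <assert.h>

/*
 * Hypothetical helper: pick an STM channel for the CPU that emits a packet.
 * 'nr_channels' is assumed to equal the number of possible CPUs.
 */
unsigned int model_stm_channel(unsigned int cpu, unsigned int nr_channels)
{
    assert(nr_channels > 0);
    /* One channel per CPU; the modulo only guards against a bad cpu id. */
    return cpu % nr_channels;
}
```

Because each CPU writes only to its own channel, packets from two CPUs never interleave within a channel.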
Change from v2:
Change flag definition to BIT(). (Steven)
Add comment in stm_ftrace_write() to clarify it's safe to use
smp_processor_id() here since preempt is disabled. (Steven)
Change from v1:
All changes are suggested by Steven Rostedt.
Use separate flags to control function trace, event trace and trace marker.
Allocate channels according to num_possible_cpus() dynamically.
Move ftrace_exports routines up so all ftrace can use them.
Tingwei Zhang (6):
stm class: ftrace: change dependency to TRACING
tracing: add flag to control different traces
tracing: add trace_export support for event trace
tracing: add trace_export support for trace_marker
stm class: ftrace: enable supported trace export flag
stm class: ftrace: use different channel according to CPU
drivers/hwtracing/stm/Kconfig | 2 +-
drivers/hwtracing/stm/ftrace.c | 7 +-
include/linux/trace.h | 7 +
kernel/trace/trace.c | 270 ++++++++++++++++++---------------
4 files changed, 159 insertions(+), 127 deletions(-)
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Good morning,
I noticed that trace data is read from the buffer only when the target task is scheduled in or out, which leads to high variability in the trace data size.
This can be an issue because the size of the trace data (and thus its coverage) varies widely with system load.
Has making trace reads independent of task scheduling already been considered?
One possibility I'm considering is an independent timer that periodically triggers trace reads from the sink buffer, to reduce the variability.
Two buffers (struct etr_buf) could also be used: one for gathering the trace data coming from the sink, and another whose content is copied over to the perf aux buffer.
When the timer triggers, the buffers are switched so that it is always possible to copy trace data to the perf aux buffer (from one buffer) while gathering the trace data coming from all the ETMs (in the other one).
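To make the buffer switch concrete, here is a small user-space C sketch of the idea. It is only a model: `model_buf` stands in for struct etr_buf, `aux` stands in for the perf aux buffer, all names are hypothetical, and it ignores the question of how the real sink would be stopped or flushed around the switch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_BUF_SIZE 64

/* Hypothetical stand-in for one trace buffer (not struct etr_buf). */
struct model_buf {
    unsigned char data[MODEL_BUF_SIZE];
    size_t len;
};

struct model_double_buf {
    struct model_buf bufs[2];
    int active;     /* index of the buffer the sink currently fills */
};

/* Sink side: append trace bytes to the active buffer, dropping overflow. */
size_t model_sink_write(struct model_double_buf *db, const void *src, size_t n)
{
    struct model_buf *b = &db->bufs[db->active];
    size_t room = MODEL_BUF_SIZE - b->len;

    if (n > room)
        n = room;
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return n;
}

/*
 * Timer side: switch buffers, then drain the previously active one into
 * 'aux'. Returns the number of bytes copied; anything beyond 'aux_size'
 * is discarded in this model.
 */
size_t model_timer_tick(struct model_double_buf *db, void *aux, size_t aux_size)
{
    struct model_buf *full = &db->bufs[db->active];
    size_t n;

    db->active ^= 1;    /* the sink now fills the other buffer */
    n = full->len < aux_size ? full->len : aux_size;
    memcpy(aux, full->data, n);
    full->len = 0;      /* ready for the next switch */
    return n;
}
```

Each tick hands a complete buffer to the reader while the sink keeps writing into the other one, so the amount copied per read depends on the timer period rather than on scheduling events.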
I have been thinking about this solution with an N:1 source:sink topology in mind, but I'm not sure how it would fit an N:N topology, where every sink has its own buffer.
What do you think about it? Are you aware of any limitations that should be taken into account?
If we think this could work, the next step would probably be to prototype something that works on my N:1 topology board.
Thanks,
Andrea
Hello,
I am a PhD student from Virginia Commonwealth University. I wanted to use
the OpenCSD tool to decode ETM and PTM traces.
I have downloaded the OpenCSD tool and have decoded the test examples that
come with it. I have some questions about the tool:
1) OpenCSD Interface to read trace
a) As a first step towards learning about OpenCSD, I have installed it on
my PC. I have Cortex M4 and Cortex A9 boards that have ETM/PTM modules.
What are the interfaces supported by OpenCSD to read the trace from these
boards ?
b) Are there any specific boards, software and interfaces that OpenCSD has
been tested against ?
c) Can OpenCSD decoder interface with debuggers such as Segger JTrace or
Keil Ulink Pro and decode instruction traces read from them ?
d) Can OpenCSD be installed on an ARM Cortex processor and decode the traces
for an application running on the same processor or a co-processor?
2) Can we decode traces of exceptions, changes in processor instruction set
state, changes in processor security state, global system timestamps with
OpenCSD ?
3) Is there a way to get raw decoded traces which have only Atom
information (branch taken or not) and addresses from OpenCSD ?
Thank you for your time and help with these questions.
regards,
Smitha
The CoreSight driver assumes the sink is common across all the ETMs,
and tries to build a path between the ETM and the first enabled
sink found using a bus based search. This breaks sysFS usage
on implementations that have multiple per-core sinks in the
enabled state.
To fix this,
- coresight_find_sink API is updated with an additional flag
so that it is able to return an enabled sink
- coresight_get_enabled_sink API is updated to do a
connection based search when a source reference is given.
Change-Id: I6cc91ddb3ef8936a8f41a5f7c7c455b0ece9d85d
Signed-off-by: Linu Cherian <lcherian(a)marvell.com>
---
Applies on https://git.linaro.org/kernel/coresight.git/log/?h=next
Changes in V2:
- Fixed few typos in commit message
- Rephrased commit message
.../hwtracing/coresight/coresight-etm-perf.c | 2 +-
drivers/hwtracing/coresight/coresight-priv.h | 5 +-
drivers/hwtracing/coresight/coresight.c | 51 +++++++++++++++++--
3 files changed, 51 insertions(+), 7 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 1a3169e69bb1..25041d2654e3 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -223,7 +223,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
id = (u32)event->attr.config2;
sink = coresight_get_sink_by_id(id);
} else {
- sink = coresight_get_enabled_sink(true);
+ sink = coresight_get_enabled_sink(NULL, true);
}
mask = &event_data->mask;
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index f2dc625ea585..010ed26db340 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -148,10 +148,13 @@ static inline void coresight_write_reg_pair(void __iomem *addr, u64 val,
void coresight_disable_path(struct list_head *path);
int coresight_enable_path(struct list_head *path, u32 mode, void *sink_data);
struct coresight_device *coresight_get_sink(struct list_head *path);
-struct coresight_device *coresight_get_enabled_sink(bool reset);
+struct coresight_device *
+coresight_get_enabled_sink(struct coresight_device *source, bool reset);
struct coresight_device *coresight_get_sink_by_id(u32 id);
struct coresight_device *
coresight_find_default_sink(struct coresight_device *csdev);
+struct coresight_device *
+coresight_find_enabled_sink(struct coresight_device *csdev);
struct list_head *coresight_build_path(struct coresight_device *csdev,
struct coresight_device *sink);
void coresight_release_path(struct list_head *path);
diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c
index e9c90f2de34a..ae69169c58d3 100644
--- a/drivers/hwtracing/coresight/coresight.c
+++ b/drivers/hwtracing/coresight/coresight.c
@@ -566,6 +566,10 @@ static int coresight_enabled_sink(struct device *dev, const void *data)
/**
* coresight_get_enabled_sink - returns the first enabled sink found on the bus
+ * When a source reference is given, enabled sink is found using connection based
+ * search.
+ *
+ * @source: Coresight source device reference
* @deactivate: Whether the 'enable_sink' flag should be reset
*
* When operated from perf the deactivate parameter should be set to 'true'.
@@ -576,10 +580,21 @@ static int coresight_enabled_sink(struct device *dev, const void *data)
* parameter should be set to 'false', hence mandating users to explicitly
* clear the flag.
*/
-struct coresight_device *coresight_get_enabled_sink(bool deactivate)
+struct coresight_device *
+coresight_get_enabled_sink(struct coresight_device *source, bool deactivate)
{
struct device *dev = NULL;
+ struct coresight_device *sink;
+
+ if (!source)
+ goto bus_search;
+ sink = coresight_find_enabled_sink(source);
+ if (sink && deactivate)
+ sink->activated = false;
+
+ return sink;
+bus_search:
dev = bus_find_device(&coresight_bustype, NULL, &deactivate,
coresight_enabled_sink);
@@ -828,6 +843,7 @@ coresight_select_best_sink(struct coresight_device *sink, int *depth,
*
* @csdev: source / current device to check.
* @depth: [in] search depth of calling dev, [out] depth of found sink.
+ * @enabled: flag to search only enabled sinks
*
* This will walk the connection path from a source (ETM) till a suitable
* sink is encountered and return that sink to the original caller.
@@ -839,7 +855,7 @@ coresight_select_best_sink(struct coresight_device *sink, int *depth,
* return best sink found, or NULL if not found at this node or child nodes.
*/
static struct coresight_device *
-coresight_find_sink(struct coresight_device *csdev, int *depth)
+coresight_find_sink(struct coresight_device *csdev, int *depth, bool enabled)
{
int i, curr_depth = *depth + 1, found_depth = 0;
struct coresight_device *found_sink = NULL;
@@ -862,7 +878,8 @@ coresight_find_sink(struct coresight_device *csdev, int *depth)
child_dev = csdev->pdata->conns[i].child_dev;
if (child_dev)
- sink = coresight_find_sink(child_dev, &child_depth);
+ sink = coresight_find_sink(child_dev, &child_depth,
+ enabled);
if (sink)
found_sink = coresight_select_best_sink(found_sink,
@@ -872,6 +889,10 @@ coresight_find_sink(struct coresight_device *csdev, int *depth)
}
return_def_sink:
+ /* Check if we need to return an enabled sink */
+ if (enabled && found_sink)
+ if (!found_sink->activated)
+ found_sink = NULL;
/* return found sink and depth */
if (found_sink)
*depth = found_depth;
@@ -901,10 +922,30 @@ coresight_find_default_sink(struct coresight_device *csdev)
/* look for a default sink if we have not found for this device */
if (!csdev->def_sink)
- csdev->def_sink = coresight_find_sink(csdev, &depth);
+ csdev->def_sink = coresight_find_sink(csdev, &depth, false);
return csdev->def_sink;
}
+/**
+ * coresight_find_enabled_sink: Find the suitable enabled sink
+ *
+ * @csdev: starting source to find a connected sink.
+ *
+ * Walks connections graph looking for a suitable sink to enable for the
+ * supplied source. Uses CoreSight device subtypes and distance from source
+ * to select the best sink.
+ *
+ * Used in cases where the CoreSight user (sysfs) has selected a sink.
+ */
+struct coresight_device *
+coresight_find_enabled_sink(struct coresight_device *csdev)
+{
+ int depth = 0;
+
+ /* look for the enabled sink */
+ return coresight_find_sink(csdev, &depth, true);
+}
+
static int coresight_remove_sink_ref(struct device *dev, void *data)
{
struct coresight_device *sink = data;
@@ -992,7 +1033,7 @@ int coresight_enable(struct coresight_device *csdev)
* Search for a valid sink for this session but don't reset the
* "enable_sink" flag in sysFS. Users get to do that explicitly.
*/
- sink = coresight_get_enabled_sink(false);
+ sink = coresight_get_enabled_sink(csdev, false);
if (!sink) {
ret = -EINVAL;
goto out;
--
2.25.1
CoreSight ETMv4.4 introduced system instructions for accessing
the ETM. This also implies that such ETMs may not be on the AMBA bus.
Right now all the CoreSight components are accessed via memory-mapped
I/O. Also, we have some common routines in the generic coresight
driver code (e.g., CS_LOCK, claim/disclaim), which assume MMIO.
In order to keep the generic algorithms in a single place and to
allow dynamic switching for ETMs, this series introduces
an abstraction layer for accessing a coresight device. It is
designed such that MMIO accesses are fast-tracked (i.e., without
an indirect function call).
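The shape of such an abstraction can be sketched in user-space C as below. This is an illustrative model only, not the structure defined by the series: `model_access`, `model_read32`, `model_write32` and the fake `model_sysreg_*` backend are hypothetical names. The point it shows is that the MMIO case is handled with a plain branch, and only the non-MMIO case goes through function pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device access descriptor. */
struct model_access {
    bool io_mem;                    /* true: plain memory-mapped access */
    union {
        volatile uint32_t *base;    /* used when io_mem is true */
        struct {
            uint32_t (*read)(uint32_t offset);
            void (*write)(uint32_t val, uint32_t offset);
        } ops;                      /* used otherwise, e.g. sysreg access */
    };
};

static inline uint32_t model_read32(const struct model_access *a,
                                    uint32_t offset)
{
    if (a->io_mem)                  /* fast path: no indirect call */
        return a->base[offset / 4];
    return a->ops.read(offset);
}

static inline void model_write32(const struct model_access *a,
                                 uint32_t val, uint32_t offset)
{
    if (a->io_mem)
        a->base[offset / 4] = val;
    else
        a->ops.write(val, offset);
}

/* Fake non-MMIO backend for demonstration: a small register file. */
uint32_t model_sysregs[4];

uint32_t model_sysreg_read(uint32_t offset)
{
    return model_sysregs[offset / 4];
}

void model_sysreg_write(uint32_t val, uint32_t offset)
{
    model_sysregs[offset / 4] = val;
}
```

Generic routines written against `model_read32()`/`model_write32()` then work unchanged for both kinds of device, which is the property the cover letter is after for the lock and claim helpers.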
This will also help us to get rid of the driver+attribute specific
sysfs show/store routines and replace them with a single routine
to access a given register offset (which can be embedded in the
dev_ext_attribute). This is not currently implemented in the series,
but can be achieved.
Further we switch the generic routines to work with the abstraction.
With this in place, we refactor the etm4x code a bit to allow for
supporting the system instructions with very little new code. The
changes also switch to using the system instructions by default
even when we may have an MMIO.
The series has been lightly tested on a model. I would really
appreciate any testing on real hardware.
Applies on coresight/next tree. The tree is also available here :
git://linux-arm.org/linux-skp.git etm-4.4/rfc
Suzuki K Poulose (14):
coresight: etm4x: Skip save/restore before device registration
coresight: Introduce device access abstraction
coresight: tpiu: Use coresight device access abstraction
coresight: etm4x: Free up argument of etm4_init_arch_data
coresight: Convert coresight_timeout to use access abstraction
coresight: Convert claim and lock operations to use access wrappers
coresight: etm4x: Always read the registers on the host CPU
coresight: etm4x: Convert all register accesses
coresight: etm4x: Add sysreg access helpers
coresight: etm4x: Define DEVARCH register fields
coresight: etm4x: Detect system register access support
coresight: etm4x: Refactor probing routine
coresight: etm4x: Add support for sysreg only devices
dts: bindings: coresight: ETMv4.4 system register access only units
.../devicetree/bindings/arm/coresight.txt | 6 +-
drivers/hwtracing/coresight/coresight-catu.c | 17 +-
.../hwtracing/coresight/coresight-cpu-debug.c | 26 +-
.../hwtracing/coresight/coresight-cti-sysfs.c | 4 +-
drivers/hwtracing/coresight/coresight-cti.c | 31 +-
drivers/hwtracing/coresight/coresight-etb10.c | 26 +-
.../coresight/coresight-etm3x-sysfs.c | 8 +-
drivers/hwtracing/coresight/coresight-etm3x.c | 45 +-
.../coresight/coresight-etm4x-sysfs.c | 32 +-
drivers/hwtracing/coresight/coresight-etm4x.c | 580 +++++++++++-------
drivers/hwtracing/coresight/coresight-etm4x.h | 403 +++++++++++-
.../hwtracing/coresight/coresight-funnel.c | 19 +-
drivers/hwtracing/coresight/coresight-priv.h | 9 +-
.../coresight/coresight-replicator.c | 28 +-
drivers/hwtracing/coresight/coresight-stm.c | 49 +-
.../hwtracing/coresight/coresight-tmc-etf.c | 36 +-
.../hwtracing/coresight/coresight-tmc-etr.c | 19 +-
drivers/hwtracing/coresight/coresight-tmc.c | 10 +-
drivers/hwtracing/coresight/coresight-tpiu.c | 32 +-
drivers/hwtracing/coresight/coresight.c | 130 +++-
include/linux/coresight.h | 189 +++++-
21 files changed, 1273 insertions(+), 426 deletions(-)
--
2.24.1