This is an RFC patch to explore the solution to a problem we have
in the CoreSight ETM/ETE PMU.
CoreSight ETM trace allows instruction level tracing of Arm CPUs.
The ETM generates the CPU excecution trace and pumps it into CoreSight
AMBA Trace Bus and is collected by a different CoreSight component
(traditionally CoreSight TMC-ETR /ETB/ETF), called "sink".
Important to note that there is no guarantee that every CPU has
a dedicated sink. Thus multiple ETMs could pump the trace data
into the same "sink" and thus they apply additional formatting
of the trace data for the user to decode it properly and attribute
the trace data to the corresponding ETM.
However, with the introduction of Arm Trace buffer Extensions (TRBE),
we now have a dedicated per-CPU architected sink for collecting the
trace. Since the TRBE is always per-CPU, it doesn't apply any formatting
of the trace. The support for this driver is under review [1].
Now a system could have a per-cpu TRBE and one or more shared
TMC-ETRs on the system. A user could choose a "specific" sink
for a perf session (e.g, a TMC-ETR) or the driver could automatically
select the nearest sink for a given ETM. It is possible that
some ETMs could end up using TMC-ETR (e.g, if the TRBE is not
usable on the CPU) while the others using TRBE in a single
perf session. Thus we now have "formatted" trace collected
from TMC-ETR and "unformatted" trace collected from TRBE.
However, we don't get into a situation where a single event
could end up using TMC-ETR & TRBE. i.e, any AUX buffer is
guaranteed to be either RAW or FORMATTED, but not a mix
of both.
As for perf decoding, we need to know the type of the data
in the individual AUX buffers, so that it can set up the
"OpenCSD" (library for decoding CoreSight trace) decoder
instance appropriately. Thus the perf.data file must conatin
the hints for the tool to decode the data correctly.
Since this is a runtime variable, and perf tool doesn't have
a control on what sink gets used (in case of automatic sink
selection), we need this information made available from
the PMU driver for each AUX record.
This patch is an attempt to solve the problem by, adding an
AUX flag for each AUX record to indicate the type of the
trace in them. It can be defined as a PMU specific flag,
which each PMU could interpret in its on way (e.g,
PERF_AUX_FLAG_PMU_FLAG_1 or could be a dedicated
flag for the CoreSight in a "generic" form
PERF_AUX_FLAG_ALT_FMT (Thanks Mike Leach for the name).
We are looking for suggestions on how best to solve this
problem and happy to explore other options if there is
a preferred way of solving this.
[1] https://lkml.kernel.org/r/1610511498-4058-1-git-send-email-anshuman.khandua…
Suzuki K Poulose (1):
perf: Handle multiple formatted AUX records
drivers/hwtracing/coresight/coresight-etm-perf.c | 2 ++
include/linux/coresight.h | 1 +
include/uapi/linux/perf_event.h | 1 +
3 files changed, 4 insertions(+)
--
2.24.1
On production systems with ETMs enabled, it is preferred to
exclude kernel mode(NS EL1) tracing for security concerns and
support only userspace(NS EL0) tracing. So provide an option
via kconfig to exclude kernel mode tracing if it is required.
This config is disabled by default and would not affect the
current configuration which has both kernel and userspace
tracing enabled by default.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan(a)codeaurora.org>
---
drivers/hwtracing/coresight/Kconfig | 9 +++++++++
drivers/hwtracing/coresight/coresight-etm4x-core.c | 6 +++++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
index c1198245461d..52435de8824c 100644
--- a/drivers/hwtracing/coresight/Kconfig
+++ b/drivers/hwtracing/coresight/Kconfig
@@ -110,6 +110,15 @@ config CORESIGHT_SOURCE_ETM4X
To compile this driver as a module, choose M here: the
module will be called coresight-etm4x.
+config CORESIGHT_ETM4X_EXCL_KERN
+ bool "Coresight ETM 4.x exclude kernel mode tracing"
+ depends on CORESIGHT_SOURCE_ETM4X
+ help
+ This will exclude kernel mode(NS EL1) tracing if enabled. This option
+ will be useful to provide more flexible options on production systems
+ where only userspace(NS EL0) tracing might be preferred for security
+ reasons.
+
config CORESIGHT_STM
tristate "CoreSight System Trace Macrocell driver"
depends on (ARM && !(CPU_32v3 || CPU_32v4 || CPU_32v4T)) || ARM64
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index abd706b216ac..7e5669e5cd1f 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -832,6 +832,9 @@ static u64 etm4_get_ns_access_type(struct etmv4_config *config)
{
u64 access_type = 0;
+ if (IS_ENABLED(CONFIG_CORESIGHT_ETM4X_EXCL_KERN))
+ config->mode |= ETM_MODE_EXCL_KERN;
+
/*
* EXLEVEL_NS, bits[15:12]
* The Exception levels are:
@@ -849,7 +852,8 @@ static u64 etm4_get_ns_access_type(struct etmv4_config *config)
access_type = ETM_EXLEVEL_NS_HYP;
}
- if (config->mode & ETM_MODE_EXCL_USER)
+ if (config->mode & ETM_MODE_EXCL_USER &&
+ !IS_ENABLED(CONFIG_CORESIGHT_ETM4X_EXCL_KERN))
access_type |= ETM_EXLEVEL_NS_APP;
return access_type;
base-commit: 3477326277451000bc667dfcc4fd0774c039184c
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
The current fixed metadata version format (version 0), means that adding
metadata parameter items renders files from a previous version of perf
unreadable. Per CPU parameters appear in a fixed order, but there is no
field to indicate the number of ETM parameters per CPU.
This patch updates the per CPU parameter blocks to include a NR_PARAMs
value which indicates the number of parameters in the block.
The header version is incremented to 1. Fixed ordering is retained,
new ETM parameters are added to the end of the list.
The reader code is updated to be able to read current version 0 files,
For version 1, the reader will read the number of parameters in the
per CPU block. This allows the reader to process older or newer files
that may have different numbers of parameters than in use at the
time perf was built.
Changes since v1 (from Review by Leo):
1. Split printing routine into sub functions per version
2. Renamed NR_PARAMs to NR_TRC_PARAMs to emphasise use as count of trace
related parameters, not total block parameter.
3. Misc other fixes.
Signed-off-by: Mike Leach <mike.leach(a)linaro.org>
---
tools/perf/arch/arm/util/cs-etm.c | 7 +-
tools/perf/util/cs-etm.c | 212 ++++++++++++++++++++++++------
tools/perf/util/cs-etm.h | 30 ++++-
3 files changed, 200 insertions(+), 49 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index bd446aba64f7..b0470f2a955a 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -572,7 +572,7 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
struct auxtrace_record *itr,
struct perf_record_auxtrace_info *info)
{
- u32 increment;
+ u32 increment, nr_trc_params;
u64 magic;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
@@ -607,6 +607,7 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
/* How much space was used */
increment = CS_ETMV4_PRIV_MAX;
+ nr_trc_params = CS_ETMV4_PRIV_MAX - CS_ETMV4_TRCCONFIGR;
} else {
magic = __perf_cs_etmv3_magic;
/* Get configuration register */
@@ -624,11 +625,13 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
/* How much space was used */
increment = CS_ETM_PRIV_MAX;
+ nr_trc_params = CS_ETM_PRIV_MAX - CS_ETM_ETMCR;
}
/* Build generic header portion */
info->priv[*offset + CS_ETM_MAGIC] = magic;
info->priv[*offset + CS_ETM_CPU] = cpu;
+ info->priv[*offset + CS_ETM_NR_TRC_PARAMS] = nr_trc_params;
/* Where the next CPU entry should start from */
*offset += increment;
}
@@ -674,7 +677,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
/* First fill out the session header */
info->type = PERF_AUXTRACE_CS_ETM;
- info->priv[CS_HEADER_VERSION_0] = 0;
+ info->priv[CS_HEADER_VERSION] = CS_HEADER_CURRENT_VERSION;
info->priv[CS_PMU_TYPE_CPUS] = type << 32;
info->priv[CS_PMU_TYPE_CPUS] |= nr_cpu;
info->priv[CS_ETM_SNAPSHOT] = ptr->snapshot_mode;
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index a2a369e2fbb6..23115f9633dc 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2435,7 +2435,7 @@ static bool cs_etm__is_timeless_decoding(struct cs_etm_auxtrace *etm)
}
static const char * const cs_etm_global_header_fmts[] = {
- [CS_HEADER_VERSION_0] = " Header version %llx\n",
+ [CS_HEADER_VERSION] = " Header version %llx\n",
[CS_PMU_TYPE_CPUS] = " PMU type/num cpus %llx\n",
[CS_ETM_SNAPSHOT] = " Snapshot %llx\n",
};
@@ -2443,6 +2443,7 @@ static const char * const cs_etm_global_header_fmts[] = {
static const char * const cs_etm_priv_fmts[] = {
[CS_ETM_MAGIC] = " Magic number %llx\n",
[CS_ETM_CPU] = " CPU %lld\n",
+ [CS_ETM_NR_TRC_PARAMS] = " NR_TRC_PARAMS %llx\n",
[CS_ETM_ETMCR] = " ETMCR %llx\n",
[CS_ETM_ETMTRACEIDR] = " ETMTRACEIDR %llx\n",
[CS_ETM_ETMCCER] = " ETMCCER %llx\n",
@@ -2452,6 +2453,7 @@ static const char * const cs_etm_priv_fmts[] = {
static const char * const cs_etmv4_priv_fmts[] = {
[CS_ETM_MAGIC] = " Magic number %llx\n",
[CS_ETM_CPU] = " CPU %lld\n",
+ [CS_ETM_NR_TRC_PARAMS] = " NR_TRC_PARAMS %llx\n",
[CS_ETMV4_TRCCONFIGR] = " TRCCONFIGR %llx\n",
[CS_ETMV4_TRCTRACEIDR] = " TRCTRACEIDR %llx\n",
[CS_ETMV4_TRCIDR0] = " TRCIDR0 %llx\n",
@@ -2461,24 +2463,152 @@ static const char * const cs_etmv4_priv_fmts[] = {
[CS_ETMV4_TRCAUTHSTATUS] = " TRCAUTHSTATUS %llx\n",
};
-static void cs_etm__print_auxtrace_info(__u64 *val, int num)
+static const char * const param_unk_fmt =
+ " Unknown parameter [%d] %llx\n";
+static const char * const magic_unk_fmt =
+ " Magic number Unknown %llx\n";
+
+static void cs_etm__print_cpu_metadata_v0(__u64 *val, int *offset)
+{
+ int i = *offset, j, nr_params = 0;
+ __u64 magic;
+
+ /* common header block */
+ magic = val[i];
+ fprintf(stdout, cs_etm_priv_fmts[0], val[i++]);
+ fprintf(stdout, cs_etm_priv_fmts[1], val[i++]);
+
+ if (magic == __perf_cs_etmv3_magic) {
+ nr_params = CS_ETM_NR_TRC_PARAMS_V0;
+ /* after common block, offset format index past NR_PARAMS */
+ for (j = 3; j < nr_params + 3; j++, i++)
+ fprintf(stdout, cs_etm_priv_fmts[j], val[i]);
+ } else if (magic == __perf_cs_etmv4_magic) {
+ nr_params = CS_ETMV4_NR_TRC_PARAMS_V0;
+ /* after common block, offset format index past NR_PARAMS */
+ for (j = 3; j < nr_params + 3; j++, i++)
+ fprintf(stdout, cs_etmv4_priv_fmts[j], val[i]);
+ } else {
+ /* failure - note bad magic value */
+ fprintf(stdout, magic_unk_fmt, magic);
+ }
+ *offset = i;
+}
+
+static void cs_etm__print_cpu_metadata_v1(__u64 *val, int *offset)
{
- int i, j, cpu = 0;
+ int i = *offset, j, total_params = 0;
+ __u64 magic;
- for (i = 0; i < CS_HEADER_VERSION_0_MAX; i++)
- fprintf(stdout, cs_etm_global_header_fmts[i], val[i]);
+ magic = val[i + CS_ETM_MAGIC];
+ /* total params to print is NR_PARAMS + common block size for v1 */
+ total_params = val[i + CS_ETM_NR_TRC_PARAMS] + CS_ETM_COMMON_BLK_MAX_V1;
- for (i = CS_HEADER_VERSION_0_MAX; cpu < num; cpu++) {
- if (val[i] == __perf_cs_etmv3_magic)
- for (j = 0; j < CS_ETM_PRIV_MAX; j++, i++)
+ if (magic == __perf_cs_etmv3_magic) {
+ for (j = 0; j < total_params; j++, i++) {
+ /* if newer record - could be excess params */
+ if (j >= CS_ETM_PRIV_MAX)
+ fprintf(stdout, param_unk_fmt, j, val[i]);
+ else
fprintf(stdout, cs_etm_priv_fmts[j], val[i]);
- else if (val[i] == __perf_cs_etmv4_magic)
- for (j = 0; j < CS_ETMV4_PRIV_MAX; j++, i++)
+ }
+ } else if (magic == __perf_cs_etmv4_magic) {
+ /* after common block */
+ for (j = 0; j < total_params; j++, i++) {
+ /* if newer record - could be excess params */
+ if (j >= CS_ETMV4_PRIV_MAX)
+ fprintf(stdout, param_unk_fmt, j, val[i]);
+ else
fprintf(stdout, cs_etmv4_priv_fmts[j], val[i]);
- else
- /* failure.. return */
- return;
+ }
+ } else {
+ /* failure - note bad magic value */
+ fprintf(stdout, magic_unk_fmt, magic);
}
+ *offset = i;
+}
+
+static void cs_etm__print_auxtrace_info(__u64 *val, int num)
+{
+ int i, cpu = 0, version;
+
+ for (i = 0; i < CS_HEADER_VERSION_MAX; i++)
+ fprintf(stdout, cs_etm_global_header_fmts[i], val[i]);
+ version = val[0];
+ if (version > CS_HEADER_CURRENT_VERSION) {
+ /* failure.. return */
+ fprintf(stdout, " Unknown Header Version = %x, ", version);
+ fprintf(stdout, "Version supported <= %x\n", CS_HEADER_CURRENT_VERSION);
+ return;
+ }
+
+ for (i = CS_HEADER_VERSION_MAX; cpu < num; cpu++) {
+ if (version == 0)
+ cs_etm__print_cpu_metadata_v0(val, &i);
+ else if (version == 1)
+ cs_etm__print_cpu_metadata_v1(val, &i);
+ }
+}
+
+/*
+ * Read a single cpu parameter block from the auxtrace_info priv block.
+ *
+ * For version 1 there is a per cpu nr_params entry. If we are handling
+ * version 1 file, then there may be less, the same, or more params
+ * indicated by this value than the compile time number we understand.
+ *
+ * For a version 0 info block, there are a fixed number, and we need to
+ * fill out the nr_param value in the metadata we create.
+ */
+static u64 *cs_etm__create_meta_blk(u64 *buff_in, int *buff_in_offset,
+ int out_blk_size, int nr_params_v0)
+{
+ u64 *metadata = NULL;
+ int hdr_version;
+ int nr_in_params, nr_out_params, nr_cmn_params;
+ int i, k;
+
+ metadata = zalloc(sizeof(*metadata) * out_blk_size);
+ if (!metadata)
+ return NULL;
+
+ /* read block current index & version */
+ i = *buff_in_offset;
+ hdr_version = buff_in[CS_HEADER_VERSION];
+
+ if (!hdr_version) {
+ /* read version 0 info block into a version 1 metadata block */
+ nr_in_params = nr_params_v0;
+ metadata[CS_ETM_MAGIC] = buff_in[i + CS_ETM_MAGIC];
+ metadata[CS_ETM_CPU] = buff_in[i + CS_ETM_CPU];
+ metadata[CS_ETM_NR_TRC_PARAMS] = nr_in_params;
+ /* remaining block params at offset +1 from source */
+ for (k = 2; k < nr_in_params; k++)
+ metadata[k+1] = buff_in[i + k];
+ /* version 0 has 2 common params */
+ nr_cmn_params = 2;
+ } else {
+ /* read version 1 info block - input and output nr_params may differ */
+ /* version 1 has 3 common params */
+ nr_cmn_params = 3;
+ nr_in_params = buff_in[i + CS_ETM_NR_TRC_PARAMS];
+
+ /* if input has more params than output - skip excess */
+ nr_out_params = nr_in_params + nr_cmn_params;
+ if (nr_out_params > out_blk_size)
+ nr_out_params = out_blk_size;
+
+ for (k = 0; k < nr_out_params; k++)
+ metadata[k] = buff_in[i + k];
+
+ /* record the actual nr params we copied */
+ metadata[CS_ETM_NR_TRC_PARAMS] = nr_out_params - nr_cmn_params;
+ }
+
+ /* adjust in offset by number of in params used */
+ i += nr_in_params + nr_cmn_params;
+ *buff_in_offset = i;
+ return metadata;
}
int cs_etm__process_auxtrace_info(union perf_event *event,
@@ -2492,11 +2622,12 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
int info_header_size;
int total_size = auxtrace_info->header.size;
int priv_size = 0;
- int num_cpu;
- int err = 0, idx = -1;
- int i, j, k;
+ int num_cpu, trcidr_idx;
+ int err = 0;
+ int i, j;
u64 *ptr, *hdr = NULL;
u64 **metadata = NULL;
+ u64 hdr_version;
/*
* sizeof(auxtrace_info_event::type) +
@@ -2512,16 +2643,17 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
/* First the global part */
ptr = (u64 *) auxtrace_info->priv;
- /* Look for version '0' of the header */
- if (ptr[0] != 0)
+ /* Look for version of the header */
+ hdr_version = ptr[0];
+ if (hdr_version > CS_HEADER_CURRENT_VERSION)
return -EINVAL;
- hdr = zalloc(sizeof(*hdr) * CS_HEADER_VERSION_0_MAX);
+ hdr = zalloc(sizeof(*hdr) * CS_HEADER_VERSION_MAX);
if (!hdr)
return -ENOMEM;
/* Extract header information - see cs-etm.h for format */
- for (i = 0; i < CS_HEADER_VERSION_0_MAX; i++)
+ for (i = 0; i < CS_HEADER_VERSION_MAX; i++)
hdr[i] = ptr[i];
num_cpu = hdr[CS_PMU_TYPE_CPUS] & 0xffffffff;
pmu_type = (unsigned int) ((hdr[CS_PMU_TYPE_CPUS] >> 32) &
@@ -2552,35 +2684,31 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
*/
for (j = 0; j < num_cpu; j++) {
if (ptr[i] == __perf_cs_etmv3_magic) {
- metadata[j] = zalloc(sizeof(*metadata[j]) *
- CS_ETM_PRIV_MAX);
- if (!metadata[j]) {
- err = -ENOMEM;
- goto err_free_metadata;
- }
- for (k = 0; k < CS_ETM_PRIV_MAX; k++)
- metadata[j][k] = ptr[i + k];
+ metadata[j] =
+ cs_etm__create_meta_blk(ptr, &i,
+ CS_ETM_PRIV_MAX,
+ CS_ETM_NR_TRC_PARAMS_V0);
/* The traceID is our handle */
- idx = metadata[j][CS_ETM_ETMTRACEIDR];
- i += CS_ETM_PRIV_MAX;
+ trcidr_idx = CS_ETM_ETMTRACEIDR;
+
} else if (ptr[i] == __perf_cs_etmv4_magic) {
- metadata[j] = zalloc(sizeof(*metadata[j]) *
- CS_ETMV4_PRIV_MAX);
- if (!metadata[j]) {
- err = -ENOMEM;
- goto err_free_metadata;
- }
- for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
- metadata[j][k] = ptr[i + k];
+ metadata[j] =
+ cs_etm__create_meta_blk(ptr, &i,
+ CS_ETMV4_PRIV_MAX,
+ CS_ETMV4_NR_TRC_PARAMS_V0);
/* The traceID is our handle */
- idx = metadata[j][CS_ETMV4_TRCTRACEIDR];
- i += CS_ETMV4_PRIV_MAX;
+ trcidr_idx = CS_ETMV4_TRCTRACEIDR;
+ }
+
+ if (!metadata[j]) {
+ err = -ENOMEM;
+ goto err_free_metadata;
}
/* Get an RB node for this CPU */
- inode = intlist__findnew(traceid_list, idx);
+ inode = intlist__findnew(traceid_list, metadata[j][trcidr_idx]);
/* Something went wrong, no need to continue */
if (!inode) {
@@ -2601,7 +2729,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
}
/*
- * Each of CS_HEADER_VERSION_0_MAX, CS_ETM_PRIV_MAX and
+ * Each of CS_HEADER_VERSION_MAX, CS_ETM_PRIV_MAX and
* CS_ETMV4_PRIV_MAX mark how many double words are in the
* global metadata, and each cpu's metadata respectively.
* The following tests if the correct number of double words was
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 4ad925d6d799..e153d02df0de 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -17,23 +17,37 @@ struct perf_session;
*/
enum {
/* Starting with 0x0 */
- CS_HEADER_VERSION_0,
+ CS_HEADER_VERSION,
/* PMU->type (32 bit), total # of CPUs (32 bit) */
CS_PMU_TYPE_CPUS,
CS_ETM_SNAPSHOT,
- CS_HEADER_VERSION_0_MAX,
+ CS_HEADER_VERSION_MAX,
};
+/*
+ * Update the version for new format.
+ *
+ * New version 1 format adds a param count to the per cpu metadata.
+ * This allows easy adding of new metadata parameters.
+ * Requires that new params always added after current ones.
+ * Also allows client reader to handle file versions that are different by
+ * checking the number of params in the file vs the number expected.
+ */
+#define CS_HEADER_CURRENT_VERSION 1
+
/* Beginning of header common to both ETMv3 and V4 */
enum {
CS_ETM_MAGIC,
CS_ETM_CPU,
+ /* Number of trace config params in following ETM specific block */
+ CS_ETM_NR_TRC_PARAMS,
+ CS_ETM_COMMON_BLK_MAX_V1,
};
/* ETMv3/PTM metadata */
enum {
/* Dynamic, configurable parameters */
- CS_ETM_ETMCR = CS_ETM_CPU + 1,
+ CS_ETM_ETMCR = CS_ETM_COMMON_BLK_MAX_V1,
CS_ETM_ETMTRACEIDR,
/* RO, taken from sysFS */
CS_ETM_ETMCCER,
@@ -41,10 +55,13 @@ enum {
CS_ETM_PRIV_MAX,
};
+/* define fixed version 0 length - allow new format reader to read old files. */
+#define CS_ETM_NR_TRC_PARAMS_V0 (CS_ETM_ETMIDR - CS_ETM_ETMCR + 1)
+
/* ETMv4 metadata */
enum {
/* Dynamic, configurable parameters */
- CS_ETMV4_TRCCONFIGR = CS_ETM_CPU + 1,
+ CS_ETMV4_TRCCONFIGR = CS_ETM_COMMON_BLK_MAX_V1,
CS_ETMV4_TRCTRACEIDR,
/* RO, taken from sysFS */
CS_ETMV4_TRCIDR0,
@@ -55,6 +72,9 @@ enum {
CS_ETMV4_PRIV_MAX,
};
+/* define fixed version 0 length - allow new format reader to read old files. */
+#define CS_ETMV4_NR_TRC_PARAMS_V0 (CS_ETMV4_TRCAUTHSTATUS - CS_ETMV4_TRCCONFIGR + 1)
+
/*
* ETMv3 exception encoding number:
* See Embedded Trace Macrocell spcification (ARM IHI 0014Q)
@@ -162,7 +182,7 @@ struct cs_etm_packet_queue {
#define BMVAL(val, lsb, msb) ((val & GENMASK(msb, lsb)) >> lsb)
-#define CS_ETM_HEADER_SIZE (CS_HEADER_VERSION_0_MAX * sizeof(u64))
+#define CS_ETM_HEADER_SIZE (CS_HEADER_VERSION_MAX * sizeof(u64))
#define __perf_cs_etmv3_magic 0x3030303030303030ULL
#define __perf_cs_etmv4_magic 0x4040404040404040ULL
--
2.17.1
Hello coresight-list!
I have a question about the future of OpenCSD library API. How stable is
it going to be? I see there's currently v1.0, so I understand it as that
there's already some API version which is considered to be "mature". Could
anyone please help estimating whether major API changes are planned/to be
expected in upcoming years?
Thank you!
Michael
hi Mike
I am compiling , running and debugging a program generated from the same
source code on ARMv7 and ARMv8 processors. when a breakpoint is hit, I
am using the traces to reconstruct the execution flow.
I noticed that on stm32 the decoder emits the address of the breakpoint
in an OCSD_GEN_TRC_ELEM_INSTR_RANGE twice (before and after the
exception), whereas it emits it only once (after the exception).
I need to have different handling according to the situation.
what can I use to distinguish both cases and handle the traces properly:
- the ETM version (v3.x vs v4.x)
- the CPU architecture (Cortex A v7 vs cortex A v8)
- the ISA of executed program (A64 vs (A32 or T32 ))
- the exception number
below are the logs
on stm32MP1 (ARMv7, cortex A7)
[btrace] ETM trace_element: index= 12, channel= 0x10,
OCSD_GEN_TRC_ELEM_INSTR_RANGE(exec range=0x103f4:[0x103f6] num_i(1)
last_sz(2) (ISA=T32) E --- )
[btrace] ETM trace_element: index= 13, channel= 0x10,
OCSD_GEN_TRC_ELEM_EXCEPTION(excep num (0x09) )
[btrace] ETM trace_element: index= 54, channel= 0x10,
OCSD_GEN_TRC_ELEM_TRACE_ON( [begin or filter])
[btrace] ETM trace_element: index= 54, channel= 0x10,
OCSD_GEN_TRC_ELEM_PE_CONTEXT((ISA=T32) N; 32-bit; )
[btrace] ETM trace_element: index= 60, channel= 0x10,
OCSD_GEN_TRC_ELEM_INSTR_RANGE(exec range=0x103f4:[0x103f6] num_i(1)
last_sz(2) (ISA=T32) E --- )
on qcom APQ8016e(ARMv8, Cortex A53)
[btrace] ETM trace_element: index= 38, channel= 0x12,
OCSD_GEN_TRC_ELEM_INSTR_RANGE(exec range=0x4005b4:[0x4005b8] num_i(1)
last_sz(4) (ISA=A64) E --- )
[btrace] ETM trace_element: index= 38, channel= 0x12,
OCSD_GEN_TRC_ELEM_EXCEPTION(pref ret addr:0x4005b8; excep num (0x06) )
[btrace] ETM trace_element: index= 112, channel= 0x16,
OCSD_GEN_TRC_ELEM_NO_SYNC( [init-decoder])
[btrace] ETM trace_element: index= 138, channel= 0x16,
OCSD_GEN_TRC_ELEM_TRACE_ON( [begin or filter])
[btrace] ETM trace_element: index= 139, channel= 0x16,
OCSD_GEN_TRC_ELEM_PE_CONTEXT((ISA=A64) EL0N; 64-bit; )
[btrace] ETM trace_element: index= 150, channel= 0x16,
OCSD_GEN_TRC_ELEM_INSTR_RANGE(exec range=0x4005b8:[0x4005bc] num_i(1)
last_sz(4) (ISA=A64) E BR )
Kind Regards
Zied Guermazi
--
*Zied Guermazi*
founder
Trande UG
Leuschnerstraße 2
69469 Weinheim/Germany
Mobile: +491722645127
mailto:zied.guermazi@trande.de
*Trande UG*
Leuschnerstraße 2, D-69469 Weinheim; Telefon: +491722645127
Sitz der Gesellschaft: Weinheim- Registergericht: AG Mannheim HRB 736209
- Geschäftsführung: Zied Guermazi
*Confidentiality Note*
This message is intended only for the use of the named recipient(s) and
may contain confidential and/or privileged information. If you are not
the intended recipient, please contact the sender and delete the
message. Any unauthorized use of the information contained in this
message is prohibited.
This patch series is a following up for the previous version which was
delivered by Suzuki [1]. Below gives the background info for why we
need this patch series, directly quotes the description in the cover
letter of the previous version:
"With the Virtualization Host Extensions, the kernel can run at EL2.
In this case the pid is written to CONTEXTIDR_EL2 instead of the
CONTEXTIDR_EL1. Thus the normal coresight tracing will be unable
to detect the PID of the thread generating the trace by looking
at the CONTEXTIDR_EL1. Thus, depending on the kernel EL, we must
switch to tracing the correct CONTEXTIDR register.
With VHE, we must set the TRCCONFIGR.VMID and TRCCONFIGR.VMID_OPT
to include the CONTEXTIDR_EL2 as the VMID in the trace. This
requires the perf tool to detect the changes in the TRCCONFIGR and
use the VMID / CID field for the PID. The challenge here is for
the perf tool to detect the kernel behavior.
Instead of the previously proposed invasive approaches, this set
implements a less intrusive mechanism, by playing with the
perf_event.attribute.config bits."
Same as the previous series, this series keeps the same implementation
for two introduced format bits:
- contextid_in_vmid -> Is only supported when the VMID tracing
and CONTEXTIDR_EL2 both are supported. When requested the perf
etm4x backend sets (TRCCONFIGR.VMID | TRCCONFIGR.VMID_OPT).
As per ETMv4.4 TRM, when the core supports VHE, the CONTEXTIDR_EL2
tracing is mandatory. (See the field TRCID2.VMIDOPT)
- pid -> Is an alias for the correct config to enable PID tracing
on any kernel.
i.e, in EL1 kernel -> pid == contextid
EL2 kernel -> pid == contextid_in_vmid
With this, the perf tool is also updated to request the "pid"
tracing whenever available, falling back to "contextid" if it
is unavailable.
Comparing against the old version, this patch series uses the metadata
to save PID format; after add new item into metadata, it introduces
backward compatibility issue. To allow backward compatibility, this
series calculates per CPU metadata array size and avoid to use the
defined macro, so can always know the correct array size based on the
info stored in perf data file. Finally, the PID format stored in
metadata is passed to decoder and guide the decoder to set PID from
CONTEXTIDR_EL1 or VMID.
This patch series has been tested on Arm Juno-r2 board, with testing
two perf data files: one data file is recorded by the latest perf tool
after applied this patch series, and another data file is recorded by
old perf tool without this patch series, so this can prove the tool is
backward compatible.
Changes from RFC:
* Added comments to clarify cases requested (Leo);
* Explain the change to generic flags for cs_etm_set_option() in the
commit description;
* Stored PID format in metadata and passed it to decoder (Leo);
* Enhanced cs-etm for backward compatibility (Denis Nikitin).
[1] https://archive.armlinux.org.uk/lurker/message/20201110.183310.24406f33.en.…
Leo Yan (4):
perf cs-etm: Calculate per CPU metadata array size
perf cs-etm: Add PID format into metadata
perf cs-etm: Fixup PID_FMT when it is zero
perf cs-etm: Add helper cs_etm__get_pid_fmt()
Suzuki K Poulose (3):
coresight: etm-perf: Add support for PID tracing for kernel at EL2
perf cs_etm: Use pid tracing explicitly instead of contextid
perf cs-etm: Detect pid in VMID for kernel running at EL2
.../hwtracing/coresight/coresight-etm-perf.c | 14 +++
.../coresight/coresight-etm4x-core.c | 9 ++
include/linux/coresight-pmu.h | 11 ++-
tools/include/linux/coresight-pmu.h | 11 ++-
tools/perf/arch/arm/util/cs-etm.c | 89 +++++++++++++++----
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 32 ++++++-
tools/perf/util/cs-etm.c | 61 ++++++++++++-
tools/perf/util/cs-etm.h | 3 +
8 files changed, 198 insertions(+), 32 deletions(-)
--
2.25.1