Do some cleanups then add a new format attribute to set the timestamp interval for ETMv4 in Perf mode. The current interval is too high for most use cases, and particularly on the FVP the number of timestamps generated is excessive.
Although it would be good to make only SYNC timestamps the default and have counter timestamps opt-in, this would be a breaking change. We can always do that later, or disable counter timestamps from Perf.
This is added as an event format attribute, rather than a Coresight config because it's something that the driver is already configuring automatically in Perf mode with any unused counter, so it's not possible to modify this with a config.
Applies to coresight/next
Signed-off-by: James Clark james.clark@linaro.org --- Changes in v3: - Move the format attr definitions to coresight-etm-perf.h we can compile on arm32 without #ifdefs - (Leo) - Convert the new #ifdefs to a single one in an is_visible() function so that the code is cleaner - (Leo) - Drop the change to remove the holes in struct etmv4_config as they were grouped by function - (Mike) - Link to v2: https://lore.kernel.org/r/20250814-james-cs-syncfreq-v2-0-c76fcb87696d@linar...
Changes in v2: - Only show the attribute for ETMv4 to improve usability and fix the arm32 build error. Wrapping everything in IS_ENABLED(CONFIG_CORESIGHT_SOURCE_ETM4X) isn't ideal, but the -perf.c file is shared between ETMv3 and ETMv4, and there is already precedent for doing it this way. - Link to v1: https://lore.kernel.org/r/20250811-james-cs-syncfreq-v1-0-b001cd6e3404@linar...
--- James Clark (5): coresight: Change syncfreq to be a u8 coresight: Repack struct etmv4_drvdata coresight: Refactor etm4_config_timestamp_event() coresight: Add format attribute for setting the timestamp interval coresight: docs: Document etm4x ts_interval
Documentation/trace/coresight/coresight.rst | 14 +++ drivers/hwtracing/coresight/coresight-etm-perf.c | 16 ++- drivers/hwtracing/coresight/coresight-etm-perf.h | 7 ++ drivers/hwtracing/coresight/coresight-etm4x-core.c | 110 +++++++++++++-------- drivers/hwtracing/coresight/coresight-etm4x.h | 61 +++++++----- 5 files changed, 143 insertions(+), 65 deletions(-) --- base-commit: 01f96b812526a2c8dcd5c0e510dda37e09ec8bcd change-id: 20250724-james-cs-syncfreq-7c2257a38ed3
Best regards,
TRCSYNCPR.PERIOD is the only functional part of TRCSYNCPR and it only has 5 valid bits so it can be stored in a u8.
Signed-off-by: James Clark james.clark@linaro.org --- drivers/hwtracing/coresight/coresight-etm4x.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index 13ec9ecef46f..eda3a6d2e8e2 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -825,7 +825,6 @@ struct etmv4_config { u32 eventctrl1; u32 stall_ctrl; u32 ts_ctrl; - u32 syncfreq; u32 ccctlr; u32 bb_ctrl; u32 vinst_ctrl; @@ -833,6 +832,7 @@ struct etmv4_config { u32 vissctlr; u32 vipcssctlr; u8 seq_idx; + u8 syncfreq; u32 seq_ctrl[ETM_MAX_SEQ_STATES]; u32 seq_rst; u32 seq_state;
Fix holes and convert the long list of bools to single bits to save some space because there's one of these for each ETM.
Reviewed-by: Leo Yan leo.yan@arm.com Reviewed-by: Mike Leach mike.leach@linaro.org Signed-off-by: James Clark james.clark@linaro.org --- drivers/hwtracing/coresight/coresight-etm4x.h | 39 ++++++++++++++------------- 1 file changed, 20 insertions(+), 19 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index eda3a6d2e8e2..813e7208ad17 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -1019,29 +1019,30 @@ struct etmv4_drvdata { u8 ns_ex_level; u8 q_support; u8 os_lock_model; - bool sticky_enable; - bool boot_enable; - bool os_unlock; - bool instrp0; - bool q_filt; - bool trcbb; - bool trccond; - bool retstack; - bool trccci; - bool trc_error; - bool syncpr; - bool stallctl; - bool sysstall; - bool nooverflow; - bool atbtrig; - bool lpoverride; + bool sticky_enable : 1; + bool boot_enable : 1; + bool os_unlock : 1; + bool instrp0 : 1; + bool q_filt : 1; + bool trcbb : 1; + bool trccond : 1; + bool retstack : 1; + bool trccci : 1; + bool trc_error : 1; + bool syncpr : 1; + bool stallctl : 1; + bool sysstall : 1; + bool nooverflow : 1; + bool atbtrig : 1; + bool lpoverride : 1; + bool state_needs_restore : 1; + bool skip_power_up : 1; + bool paused : 1; u64 trfcr; struct etmv4_config config; u64 save_trfcr; struct etmv4_save_state *save_state; - bool state_needs_restore; - bool skip_power_up; - bool paused; + DECLARE_BITMAP(arch_features, ETM4_IMPDEF_FEATURE_MAX); };
Remove some of the magic numbers and try to clarify some of the documentation so it's clearer how this sets up the timestamp interval.
Return errors directly instead of jumping to out and returning ret, nothing needs to be cleaned up at the end and it only obscures the flow and return value.
Reviewed-by: Leo Yan leo.yan@arm.com Signed-off-by: James Clark james.clark@linaro.org --- drivers/hwtracing/coresight/coresight-etm4x-core.c | 95 ++++++++++++++-------- drivers/hwtracing/coresight/coresight-etm4x.h | 20 +++-- 2 files changed, 77 insertions(+), 38 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 020f070bf17d..920d092ef862 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -609,18 +609,33 @@ static void etm4_enable_hw_smp_call(void *info) * TRCRSCTLR1 (always true) used to get the counter to decrement. From * there a resource selector is configured with the counter and the * timestamp control register to use the resource selector to trigger the - * event that will insert a timestamp packet in the stream. + * event that will insert a timestamp packet in the stream: + * + * +--------------+ + * | Resource 1 | fixed "always-true" resource + * +--------------+ + * | + * +------v-------+ + * | Counter x | (reload to 1 on underflow) + * +--------------+ + * | + * +------v--------------+ + * | Resource Selector y | (trigger on counter x == 0) + * +---------------------+ + * | + * +------v---------------+ + * | Timestamp Generator | (timestamp on resource y) + * +----------------------+ */ static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata) { - int ctridx, ret = -EINVAL; - int counter, rselector; - u32 val = 0; + int ctridx; + int rselector; struct etmv4_config *config = &drvdata->config;
/* No point in trying if we don't have at least one counter */ if (!drvdata->nr_cntr) - goto out; + return -EINVAL;
/* Find a counter that hasn't been initialised */ for (ctridx = 0; ctridx < drvdata->nr_cntr; ctridx++) @@ -630,15 +645,17 @@ static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata) /* All the counters have been configured already, bail out */ if (ctridx == drvdata->nr_cntr) { pr_debug("%s: no available counter found\n", __func__); - ret = -ENOSPC; - goto out; + return -ENOSPC; }
/* - * Searching for an available resource selector to use, starting at - * '2' since every implementation has at least 2 resource selector. - * ETMIDR4 gives the number of resource selector _pairs_, - * hence multiply by 2. + * Searching for an available resource selector to use, starting at '2' + * since resource 0 is the fixed 'always returns false' resource and 1 + * is the fixed 'always returns true' resource. See IHI0064H_b '7.3.64 + * TRCRSCTLRn, Resource Selection Control Registers, n=2-31'. + * + * ETMIDR4 gives the number of resource selector _pairs_, hence multiply + * by 2. */ for (rselector = 2; rselector < drvdata->nr_resource * 2; rselector++) if (!config->res_ctrl[rselector]) @@ -647,13 +664,9 @@ static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata) if (rselector == drvdata->nr_resource * 2) { pr_debug("%s: no available resource selector found\n", __func__); - ret = -ENOSPC; - goto out; + return -ENOSPC; }
- /* Remember what counter we used */ - counter = 1 << ctridx; - /* * Initialise original and reload counter value to the smallest * possible value in order to get as much precision as we can. @@ -661,26 +674,42 @@ static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata) config->cntr_val[ctridx] = 1; config->cntrldvr[ctridx] = 1;
- /* Set the trace counter control register */ - val = 0x1 << 16 | /* Bit 16, reload counter automatically */ - 0x0 << 7 | /* Select single resource selector */ - 0x1; /* Resource selector 1, i.e always true */ - - config->cntr_ctrl[ctridx] = val; - - val = 0x2 << 16 | /* Group 0b0010 - Counter and sequencers */ - counter << 0; /* Counter to use */ - - config->res_ctrl[rselector] = val; + /* + * Trace Counter Control Register TRCCNTCTLRn + * + * CNTCHAIN = 0, don't reload on the previous counter + * RLDSELF = true, reload counter automatically on underflow + * RLDTYPE = 0, one reload input resource + * RLDSEL = 0, reload on resource 0 (fixed always false resource, never + * reload) + * CNTTYPE = 0, one count input resource + * CNTSEL = 1, count on resource 1 (fixed always true resource, always + * decrement) + */ + config->cntr_ctrl[ctridx] = TRCCNTCTLRn_RLDSELF | + FIELD_PREP(TRCCNTCTLRn_CNTSEL_MASK, 1);
- val = 0x0 << 7 | /* Select single resource selector */ - rselector; /* Resource selector */ + /* + * Resource Selection Control Register TRCRSCTLRn + * + * PAIRINV = 0, INV = 0, don't invert + * GROUP = 2, SELECT = ctridx, trigger when counter 'ctridx' reaches 0 + * + * Multiple counters can be selected, and each bit signifies a counter, + * so set bit 'ctridx' to select our counter. + */ + config->res_ctrl[rselector] = FIELD_PREP(TRCRSCTLRn_GROUP_MASK, 2) | + FIELD_PREP(TRCRSCTLRn_SELECT_MASK, 1 << ctridx);
- config->ts_ctrl = val; + /* + * Global Timestamp Control Register TRCTSCTLR + * + * TYPE = 0, one input resource + * SEL = rselector, generate timestamp on resource 'rselector' + */ + config->ts_ctrl = FIELD_PREP(TRCTSCTLR_SEL_MASK, rselector);
- ret = 0; -out: - return ret; + return 0; }
static int etm4_parse_event_config(struct coresight_device *csdev, diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index 813e7208ad17..a06338851ef5 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -225,6 +225,16 @@ #define TRCRSCTLRn_GROUP_MASK GENMASK(19, 16) #define TRCRSCTLRn_SELECT_MASK GENMASK(15, 0)
+#define TRCCNTCTLRn_CNTCHAIN BIT(17) +#define TRCCNTCTLRn_RLDSELF BIT(16) +#define TRCCNTCTLRn_RLDTYPE BIT(15) +#define TRCCNTCTLRn_RLDSEL_MASK GENMASK(12, 8) +#define TRCCNTCTLRn_CNTTYPE_MASK BIT(7) +#define TRCCNTCTLRn_CNTSEL_MASK GENMASK(4, 0) + +#define TRCTSCTLR_TYPE BIT(7) +#define TRCTSCTLR_SEL_MASK GENMASK(4, 0) + /* * System instructions to access ETM registers. * See ETMv4.4 spec ARM IHI0064F section 4.3.6 System instructions @@ -824,7 +834,7 @@ struct etmv4_config { u32 eventctrl0; u32 eventctrl1; u32 stall_ctrl; - u32 ts_ctrl; + u32 ts_ctrl; /* TRCTSCTLR */ u32 ccctlr; u32 bb_ctrl; u32 vinst_ctrl; @@ -837,11 +847,11 @@ struct etmv4_config { u32 seq_rst; u32 seq_state; u8 cntr_idx; - u32 cntrldvr[ETMv4_MAX_CNTR]; - u32 cntr_ctrl[ETMv4_MAX_CNTR]; - u32 cntr_val[ETMv4_MAX_CNTR]; + u32 cntrldvr[ETMv4_MAX_CNTR]; /* TRCCNTRLDVRn */ + u32 cntr_ctrl[ETMv4_MAX_CNTR]; /* TRCCNTCTLRn */ + u32 cntr_val[ETMv4_MAX_CNTR]; /* TRCCNTVRn */ u8 res_idx; - u32 res_ctrl[ETM_MAX_RES_SEL]; + u32 res_ctrl[ETM_MAX_RES_SEL]; /* TRCRSCTLRn */ u8 ss_idx; u32 ss_ctrl[ETM_MAX_SS_CMP]; u32 ss_status[ETM_MAX_SS_CMP];
Timestamps are currently emitted at the maximum rate possible, which is much too frequent for most use cases. Add an attribute to be able to set the interval. Granular control is not required, so save space in the config by interpreting it as 2 ^ ts_interval. And then 4 bits (0 - 15) is enough to set the interval to be larger than the existing SYNC timestamp interval.
No sysfs file is needed for this attribute because counter generated timestamps are only configured for Perf mode.
Only show this attribute for ETM4x because timestamps aren't configured in the same way for ETM3x. The attribute is only ever read in coresight-etm4x-core.c.
Reviewed-by: Leo Yan leo.yan@arm.com Tested-by: Leo Yan leo.yan@arm.com Signed-off-by: James Clark james.clark@linaro.org --- drivers/hwtracing/coresight/coresight-etm-perf.c | 16 +++++++++++++++- drivers/hwtracing/coresight/coresight-etm-perf.h | 7 +++++++ drivers/hwtracing/coresight/coresight-etm4x-core.c | 21 ++++++++++++--------- 3 files changed, 34 insertions(+), 10 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index f677c08233ba..0c1b990fc56e 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -13,6 +13,7 @@ #include <linux/mm.h> #include <linux/init.h> #include <linux/perf_event.h> +#include <linux/perf/arm_pmu.h> #include <linux/percpu-defs.h> #include <linux/slab.h> #include <linux/stringhash.h> @@ -69,7 +70,8 @@ PMU_FORMAT_ATTR(sinkid, "config2:0-31"); /* config ID - set if a system configuration is selected */ PMU_FORMAT_ATTR(configid, "config2:32-63"); PMU_FORMAT_ATTR(cc_threshold, "config3:0-11"); - +/* Interval = (2 ^ ts_level) */ +GEN_PMU_FORMAT_ATTR(ts_level);
/* * contextid always traces the "PID". The PID is in CONTEXTIDR_EL1 @@ -103,11 +105,23 @@ static struct attribute *etm_config_formats_attr[] = { &format_attr_configid.attr, &format_attr_branch_broadcast.attr, &format_attr_cc_threshold.attr, + &format_attr_ts_level.attr, NULL, };
+static umode_t etm_format_attr_is_visible(struct kobject *kobj, + struct attribute *attr, int unused) +{ + if (attr == &format_attr_ts_level.attr && + !IS_ENABLED(CONFIG_CORESIGHT_SOURCE_ETM4X)) + return 0; + + return attr->mode; +} + static const struct attribute_group etm_pmu_format_group = { .name = "format", + .is_visible = etm_format_attr_is_visible, .attrs = etm_config_formats_attr, };
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h index 5febbcdb8696..d2664ffb33e5 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.h +++ b/drivers/hwtracing/coresight/coresight-etm-perf.h @@ -7,6 +7,7 @@ #ifndef _CORESIGHT_ETM_PERF_H #define _CORESIGHT_ETM_PERF_H
+#include <linux/bits.h> #include <linux/percpu-defs.h> #include "coresight-priv.h"
@@ -20,6 +21,12 @@ struct cscfg_config_desc; */ #define ETM_ADDR_CMP_MAX 8
+#define ATTR_CFG_FLD_ts_level_CFG config3 +#define ATTR_CFG_FLD_ts_level_LO 12 +#define ATTR_CFG_FLD_ts_level_HI 15 +#define ATTR_CFG_FLD_ts_level_MASK GENMASK(ATTR_CFG_FLD_ts_level_HI, \ + ATTR_CFG_FLD_ts_level_LO) + /** * struct etm_filter - single instruction range or start/stop configuration. * @start_addr: The address to start tracing on. diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 920d092ef862..034844f52bb2 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -28,6 +28,7 @@ #include <linux/amba/bus.h> #include <linux/seq_file.h> #include <linux/uaccess.h> +#include <linux/perf/arm_pmu.h> #include <linux/perf_event.h> #include <linux/platform_device.h> #include <linux/pm_runtime.h> @@ -616,7 +617,7 @@ static void etm4_enable_hw_smp_call(void *info) * +--------------+ * | * +------v-------+ - * | Counter x | (reload to 1 on underflow) + * | Counter x | (reload to 2 ^ ts_level on underflow) * +--------------+ * | * +------v--------------+ @@ -627,11 +628,17 @@ static void etm4_enable_hw_smp_call(void *info) * | Timestamp Generator | (timestamp on resource y) * +----------------------+ */ -static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata) +static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata, + struct perf_event_attr *attr) { int ctridx; int rselector; struct etmv4_config *config = &drvdata->config; + u8 ts_level = ATTR_CFG_GET_FLD(attr, ts_level); + + /* Disable when ts_level == MAX */ + if (ts_level == FIELD_GET(ATTR_CFG_FLD_ts_level_MASK, UINT_MAX)) + return 0;
/* No point in trying if we don't have at least one counter */ if (!drvdata->nr_cntr) @@ -667,12 +674,8 @@ static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata) return -ENOSPC; }
- /* - * Initialise original and reload counter value to the smallest - * possible value in order to get as much precision as we can. - */ - config->cntr_val[ctridx] = 1; - config->cntrldvr[ctridx] = 1; + /* Initialise original and reload counter value. */ + config->cntr_val[ctridx] = config->cntrldvr[ctridx] = 1 << ts_level;
/* * Trace Counter Control Register TRCCNTCTLRn @@ -762,7 +765,7 @@ static int etm4_parse_event_config(struct coresight_device *csdev, * order to correlate instructions executed on different CPUs * (CPU-wide trace scenarios). */ - ret = etm4_config_timestamp_event(drvdata); + ret = etm4_config_timestamp_event(drvdata, attr);
/* * No need to go further if timestamp intervals can't
Document how the new field is used, maximum value and the interaction with SYNC timestamps.
Signed-off-by: James Clark james.clark@linaro.org --- Documentation/trace/coresight/coresight.rst | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/Documentation/trace/coresight/coresight.rst b/Documentation/trace/coresight/coresight.rst index 806699871b80..0cd83119b83f 100644 --- a/Documentation/trace/coresight/coresight.rst +++ b/Documentation/trace/coresight/coresight.rst @@ -619,6 +619,20 @@ They are also listed in the folder /sys/bus/event_source/devices/cs_etm/format/ - Cycle count threshold value. If nothing is provided here or the provided value is 0, then the default value i.e 0x100 will be used. If provided value is less than minimum cycles threshold value, as indicated via TRCIDR3.CCITMIN, then the minimum value will be used instead. + * - ts_level + - Controls frequency of timestamps. The reload value of the + timestamp counter is 2 raised to the power of this value. If the value is + 0 then the reload value is 1, if the value is 10 then the reload value is + 1024. Maximum allowed value is 15, and setting the maximum disables + generation of timestamps via the counter, freeing the counter resources. + Timestamps will be generated after 2 ^ ts_level cycles. + + Separately to this value, timestamps will also be emitted when a SYNC + packet is generated, although this is only for every 4096 bytes of trace. + Therefore it's not possible to generate timestamps less frequently than + that and ts_level timestamps are always in addition to SYNC timestamps. + Timestamps must be enabled for this to have effect. +
How to use the STM module -------------------------
On Thu, Oct 02, 2025 at 11:09:28AM +0100, James Clark wrote:
Do some cleanups then add a new format attribute to set the timestamp interval for ETMv4 in Perf mode. The current interval is too high for most use cases, and particularly on the FVP the number of timestamps generated is excessive.
For the series:
Reviewed-by: Leo Yan leo.yan@arm.com