This series is to enable AUX pause and resume on Arm CoreSight.
The first patch extracts the trace unit controlling operations to two functions. These two functions will be used by AUX pause and resume.
Patches 02 and 03 change the ETMv4 driver to prepare callback functions for AUX pause and resume.
Patch 04 changes the ETM perf layer to support AUX pause and resume in a perf session. The patches 05 updates buffer on AUX pause occasion, which can mitigate the trace data lose issue.
Patch 07 documents the AUX pause usages with Arm CoreSight.
This patch set has been verified on the Hikey960 board and TC platform. The previous one uses ETR and the later uses TRBE as sink.
It is suggested to disable CPUIdle (add `nohlt` option in Linux command line) when verifying this series. ETM and funnel drivers are found issues during CPU suspend and resume which will be addressed separately.
Changes from v2: - Rebased on CoreSight next branch. - Dropped the uAPI 'update_buf_on_pause' and updated document respectively (Suzuki). - Renamed ETM callbacks to .pause_perf() and .resume_perf() (Suzuki). - Minor improvement for error handling in the AUX resume flow.
Changes from v1: - Added validation function pointers in pause and resume APIs (Mike).
Leo Yan (6): coresight: etm4x: Extract the trace unit controlling coresight: Introduce pause and resume APIs for source coresight: etm4x: Hook pause and resume callbacks coresight: perf: Support AUX trace pause and resume coresight: perf: Update buffer on AUX pause Documentation: coresight: Document AUX pause and resume
.../trace/coresight/coresight-perf.rst | 31 ++++ drivers/hwtracing/coresight/coresight-core.c | 22 +++ .../hwtracing/coresight/coresight-etm-perf.c | 84 +++++++++- .../coresight/coresight-etm4x-core.c | 143 +++++++++++++----- drivers/hwtracing/coresight/coresight-etm4x.h | 2 + drivers/hwtracing/coresight/coresight-priv.h | 2 + include/linux/coresight.h | 4 + 7 files changed, 246 insertions(+), 42 deletions(-)
The trace unit is controlled in the ETM hardware enabling and disabling. The sequential changes for support AUX pause and resume will reuse the same operations.
Extract the operations in the etm4_{enable|disable}_trace_unit() functions. A minor improvement in etm4_enable_trace_unit() is for returning the timeout error to callers.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../coresight/coresight-etm4x-core.c | 103 +++++++++++------- 1 file changed, 62 insertions(+), 41 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index e5972f16abff..53cb0569dbbf 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -431,6 +431,44 @@ static int etm4x_wait_status(struct csdev_access *csa, int pos, int val) return coresight_timeout(csa, TRCSTATR, pos, val); }
+static int etm4_enable_trace_unit(struct etmv4_drvdata *drvdata) +{ + struct coresight_device *csdev = drvdata->csdev; + struct device *etm_dev = &csdev->dev; + struct csdev_access *csa = &csdev->access; + + /* + * ETE mandates that the TRCRSR is written to before + * enabling it. + */ + if (etm4x_is_ete(drvdata)) + etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR); + + etm4x_allow_trace(drvdata); + /* Enable the trace unit */ + etm4x_relaxed_write32(csa, 1, TRCPRGCTLR); + + /* Synchronize the register updates for sysreg access */ + if (!csa->io_mem) + isb(); + + /* wait for TRCSTATR.IDLE to go back down to '0' */ + if (etm4x_wait_status(csa, TRCSTATR_IDLE_BIT, 0)) { + dev_err(etm_dev, + "timeout while waiting for Idle Trace Status\n"); + return -ETIME; + } + + /* + * As recommended by section 4.3.7 ("Synchronization when using the + * memory-mapped interface") of ARM IHI 0064D + */ + dsb(sy); + isb(); + + return 0; +} + static int etm4_enable_hw(struct etmv4_drvdata *drvdata) { int i, rc; @@ -539,33 +577,7 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata) etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR); }
- /* - * ETE mandates that the TRCRSR is written to before - * enabling it. - */ - if (etm4x_is_ete(drvdata)) - etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR); - - etm4x_allow_trace(drvdata); - /* Enable the trace unit */ - etm4x_relaxed_write32(csa, 1, TRCPRGCTLR); - - /* Synchronize the register updates for sysreg access */ - if (!csa->io_mem) - isb(); - - /* wait for TRCSTATR.IDLE to go back down to '0' */ - if (etm4x_wait_status(csa, TRCSTATR_IDLE_BIT, 0)) - dev_err(etm_dev, - "timeout while waiting for Idle Trace Status\n"); - - /* - * As recommended by section 4.3.7 ("Synchronization when using the - * memory-mapped interface") of ARM IHI 0064D - */ - dsb(sy); - isb(); - + rc = etm4_enable_trace_unit(drvdata); done: etm4_cs_lock(drvdata, csa);
@@ -884,25 +896,12 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event, return ret; }
-static void etm4_disable_hw(void *info) +static void etm4_disable_trace_unit(struct etmv4_drvdata *drvdata) { u32 control; - struct etmv4_drvdata *drvdata = info; - struct etmv4_config *config = &drvdata->config; struct coresight_device *csdev = drvdata->csdev; struct device *etm_dev = &csdev->dev; struct csdev_access *csa = &csdev->access; - int i; - - etm4_cs_unlock(drvdata, csa); - etm4_disable_arch_specific(drvdata); - - if (!drvdata->skip_power_up) { - /* power can be removed from the trace unit now */ - control = etm4x_relaxed_read32(csa, TRCPDCR); - control &= ~TRCPDCR_PU; - etm4x_relaxed_write32(csa, control, TRCPDCR); - }
control = etm4x_relaxed_read32(csa, TRCPRGCTLR);
@@ -943,6 +942,28 @@ static void etm4_disable_hw(void *info) * of ARM IHI 0064H.b. */ isb(); +} + +static void etm4_disable_hw(void *info) +{ + u32 control; + struct etmv4_drvdata *drvdata = info; + struct etmv4_config *config = &drvdata->config; + struct coresight_device *csdev = drvdata->csdev; + struct csdev_access *csa = &csdev->access; + int i; + + etm4_cs_unlock(drvdata, csa); + etm4_disable_arch_specific(drvdata); + + if (!drvdata->skip_power_up) { + /* power can be removed from the trace unit now */ + control = etm4x_relaxed_read32(csa, TRCPDCR); + control &= ~TRCPDCR_PU; + etm4x_relaxed_write32(csa, control, TRCPDCR); + } + + etm4_disable_trace_unit(drvdata);
/* read the status of the single shot comparators */ for (i = 0; i < drvdata->nr_ss_cmp; i++) {
On Tue, 11 Mar 2025 at 17:05, Leo Yan leo.yan@arm.com wrote:
The trace unit is controlled in the ETM hardware enabling and disabling. The sequential changes for support AUX pause and resume will reuse the same operations.
Extract the operations in the etm4_{enable|disable}_trace_unit() functions. A minor improvement in etm4_enable_trace_unit() is for returning the timeout error to callers.
Signed-off-by: Leo Yan leo.yan@arm.com
.../coresight/coresight-etm4x-core.c | 103 +++++++++++------- 1 file changed, 62 insertions(+), 41 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index e5972f16abff..53cb0569dbbf 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -431,6 +431,44 @@ static int etm4x_wait_status(struct csdev_access *csa, int pos, int val) return coresight_timeout(csa, TRCSTATR, pos, val); }
+static int etm4_enable_trace_unit(struct etmv4_drvdata *drvdata) +{
struct coresight_device *csdev = drvdata->csdev;
struct device *etm_dev = &csdev->dev;
struct csdev_access *csa = &csdev->access;
/*
* ETE mandates that the TRCRSR is written to before
* enabling it.
*/
if (etm4x_is_ete(drvdata))
etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR);
etm4x_allow_trace(drvdata);
/* Enable the trace unit */
etm4x_relaxed_write32(csa, 1, TRCPRGCTLR);
/* Synchronize the register updates for sysreg access */
if (!csa->io_mem)
isb();
/* wait for TRCSTATR.IDLE to go back down to '0' */
if (etm4x_wait_status(csa, TRCSTATR_IDLE_BIT, 0)) {
dev_err(etm_dev,
"timeout while waiting for Idle Trace Status\n");
return -ETIME;
}
/*
* As recommended by section 4.3.7 ("Synchronization when using the
* memory-mapped interface") of ARM IHI 0064D
*/
dsb(sy);
isb();
return 0;
+}
static int etm4_enable_hw(struct etmv4_drvdata *drvdata) { int i, rc; @@ -539,33 +577,7 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata) etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR); }
/*
* ETE mandates that the TRCRSR is written to before
* enabling it.
*/
if (etm4x_is_ete(drvdata))
etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR);
etm4x_allow_trace(drvdata);
/* Enable the trace unit */
etm4x_relaxed_write32(csa, 1, TRCPRGCTLR);
/* Synchronize the register updates for sysreg access */
if (!csa->io_mem)
isb();
/* wait for TRCSTATR.IDLE to go back down to '0' */
if (etm4x_wait_status(csa, TRCSTATR_IDLE_BIT, 0))
dev_err(etm_dev,
"timeout while waiting for Idle Trace Status\n");
/*
* As recommended by section 4.3.7 ("Synchronization when using the
* memory-mapped interface") of ARM IHI 0064D
*/
dsb(sy);
isb();
rc = etm4_enable_trace_unit(drvdata);
done: etm4_cs_lock(drvdata, csa);
@@ -884,25 +896,12 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event, return ret; }
-static void etm4_disable_hw(void *info) +static void etm4_disable_trace_unit(struct etmv4_drvdata *drvdata) { u32 control;
struct etmv4_drvdata *drvdata = info;
struct etmv4_config *config = &drvdata->config; struct coresight_device *csdev = drvdata->csdev; struct device *etm_dev = &csdev->dev; struct csdev_access *csa = &csdev->access;
int i;
etm4_cs_unlock(drvdata, csa);
etm4_disable_arch_specific(drvdata);
if (!drvdata->skip_power_up) {
/* power can be removed from the trace unit now */
control = etm4x_relaxed_read32(csa, TRCPDCR);
control &= ~TRCPDCR_PU;
etm4x_relaxed_write32(csa, control, TRCPDCR);
} control = etm4x_relaxed_read32(csa, TRCPRGCTLR);
@@ -943,6 +942,28 @@ static void etm4_disable_hw(void *info) * of ARM IHI 0064H.b. */ isb(); +}
+static void etm4_disable_hw(void *info) +{
u32 control;
struct etmv4_drvdata *drvdata = info;
struct etmv4_config *config = &drvdata->config;
struct coresight_device *csdev = drvdata->csdev;
struct csdev_access *csa = &csdev->access;
int i;
etm4_cs_unlock(drvdata, csa);
etm4_disable_arch_specific(drvdata);
if (!drvdata->skip_power_up) {
/* power can be removed from the trace unit now */
control = etm4x_relaxed_read32(csa, TRCPDCR);
control &= ~TRCPDCR_PU;
etm4x_relaxed_write32(csa, control, TRCPDCR);
}
etm4_disable_trace_unit(drvdata); /* read the status of the single shot comparators */ for (i = 0; i < drvdata->nr_ss_cmp; i++) {
-- 2.34.1
Reviewed-by: Mike Leach <mike;leach@linaro.org>
Introduce APIs for pausing and resuming trace source and export as GPL symbols.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-core.c | 22 ++++++++++++++++++++ drivers/hwtracing/coresight/coresight-priv.h | 2 ++ include/linux/coresight.h | 4 ++++ 3 files changed, 28 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index fb43ef6a3b1f..d4c3000608f2 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -367,6 +367,28 @@ void coresight_disable_source(struct coresight_device *csdev, void *data) } EXPORT_SYMBOL_GPL(coresight_disable_source);
+void coresight_pause_source(struct coresight_device *csdev) +{ + if (!coresight_is_percpu_source(csdev)) + return; + + if (source_ops(csdev)->pause_perf) + source_ops(csdev)->pause_perf(csdev); +} +EXPORT_SYMBOL_GPL(coresight_pause_source); + +int coresight_resume_source(struct coresight_device *csdev) +{ + if (!coresight_is_percpu_source(csdev)) + return -EOPNOTSUPP; + + if (!source_ops(csdev)->resume_perf) + return -EOPNOTSUPP; + + return source_ops(csdev)->resume_perf(csdev); +} +EXPORT_SYMBOL_GPL(coresight_resume_source); + /* * coresight_disable_path_from : Disable components in the given path beyond * @nd in the list. If @nd is NULL, all the components, except the SOURCE are diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index 82644aff8d2b..2d9baa9d8228 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -249,5 +249,7 @@ void coresight_add_helper(struct coresight_device *csdev, void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev); struct coresight_device *coresight_get_percpu_sink(int cpu); void coresight_disable_source(struct coresight_device *csdev, void *data); +void coresight_pause_source(struct coresight_device *csdev); +int coresight_resume_source(struct coresight_device *csdev);
#endif diff --git a/include/linux/coresight.h b/include/linux/coresight.h index d79a242b271d..c95c72e07e02 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -398,6 +398,8 @@ struct coresight_ops_link { * is associated to. * @enable: enables tracing for a source. * @disable: disables tracing for a source. + * @resume_perf: resumes tracing for a source in perf session. + * @pause_perf: pauses tracing for a source in perf session. */ struct coresight_ops_source { int (*cpu_id)(struct coresight_device *csdev); @@ -405,6 +407,8 @@ struct coresight_ops_source { enum cs_mode mode, struct coresight_path *path); void (*disable)(struct coresight_device *csdev, struct perf_event *event); + int (*resume_perf)(struct coresight_device *csdev); + void (*pause_perf)(struct coresight_device *csdev); };
/**
On Tue, 11 Mar 2025 at 17:05, Leo Yan leo.yan@arm.com wrote:
Introduce APIs for pausing and resuming trace source and export as GPL symbols.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-core.c | 22 ++++++++++++++++++++ drivers/hwtracing/coresight/coresight-priv.h | 2 ++ include/linux/coresight.h | 4 ++++ 3 files changed, 28 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index fb43ef6a3b1f..d4c3000608f2 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -367,6 +367,28 @@ void coresight_disable_source(struct coresight_device *csdev, void *data) } EXPORT_SYMBOL_GPL(coresight_disable_source);
+void coresight_pause_source(struct coresight_device *csdev) +{
if (!coresight_is_percpu_source(csdev))
return;
if (source_ops(csdev)->pause_perf)
source_ops(csdev)->pause_perf(csdev);
+} +EXPORT_SYMBOL_GPL(coresight_pause_source);
+int coresight_resume_source(struct coresight_device *csdev) +{
if (!coresight_is_percpu_source(csdev))
return -EOPNOTSUPP;
if (!source_ops(csdev)->resume_perf)
return -EOPNOTSUPP;
return source_ops(csdev)->resume_perf(csdev);
+} +EXPORT_SYMBOL_GPL(coresight_resume_source);
/*
- coresight_disable_path_from : Disable components in the given path beyond
- @nd in the list. If @nd is NULL, all the components, except the SOURCE are
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index 82644aff8d2b..2d9baa9d8228 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -249,5 +249,7 @@ void coresight_add_helper(struct coresight_device *csdev, void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev); struct coresight_device *coresight_get_percpu_sink(int cpu); void coresight_disable_source(struct coresight_device *csdev, void *data); +void coresight_pause_source(struct coresight_device *csdev); +int coresight_resume_source(struct coresight_device *csdev);
#endif diff --git a/include/linux/coresight.h b/include/linux/coresight.h index d79a242b271d..c95c72e07e02 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -398,6 +398,8 @@ struct coresight_ops_link {
is associated to.
- @enable: enables tracing for a source.
- @disable: disables tracing for a source.
- @resume_perf: resumes tracing for a source in perf session.
*/
- @pause_perf: pauses tracing for a source in perf session.
struct coresight_ops_source { int (*cpu_id)(struct coresight_device *csdev); @@ -405,6 +407,8 @@ struct coresight_ops_source { enum cs_mode mode, struct coresight_path *path); void (*disable)(struct coresight_device *csdev, struct perf_event *event);
int (*resume_perf)(struct coresight_device *csdev);
void (*pause_perf)(struct coresight_device *csdev);
};
/**
2.34.1
Reviewed-by: Mike Leach mike.leach@linaro.org
Add callbacks for pausing and resuming the tracer.
A "paused" flag in the driver data indicates whether the tracer is paused. If the flag is set, the driver will skip starting the hardware trace. The flag is always set to false for the sysfs mode, meaning the tracer will never be paused in the case.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../coresight/coresight-etm4x-core.c | 42 ++++++++++++++++++- drivers/hwtracing/coresight/coresight-etm4x.h | 2 + 2 files changed, 43 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 53cb0569dbbf..5b69446db947 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -577,7 +577,8 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata) etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR); }
- rc = etm4_enable_trace_unit(drvdata); + if (!drvdata->paused) + rc = etm4_enable_trace_unit(drvdata); done: etm4_cs_lock(drvdata, csa);
@@ -820,6 +821,9 @@ static int etm4_enable_perf(struct coresight_device *csdev,
drvdata->trcid = path->trace_id;
+ /* Populate pause state */ + drvdata->paused = !!READ_ONCE(event->hw.aux_paused); + /* And enable it */ ret = etm4_enable_hw(drvdata);
@@ -846,6 +850,9 @@ static int etm4_enable_sysfs(struct coresight_device *csdev, struct coresight_pa
drvdata->trcid = path->trace_id;
+ /* Tracer will never be paused in sysfs mode */ + drvdata->paused = false; + /* * Executing etm4_enable_hw on the cpu whose ETM is being enabled * ensures that register writes occur when cpu is powered. @@ -1080,10 +1087,43 @@ static void etm4_disable(struct coresight_device *csdev, coresight_set_mode(csdev, CS_MODE_DISABLED); }
+static int etm4_resume_perf(struct coresight_device *csdev) +{ + struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct csdev_access *csa = &csdev->access; + + if (coresight_get_mode(csdev) != CS_MODE_PERF) + return -EINVAL; + + etm4_cs_unlock(drvdata, csa); + etm4_enable_trace_unit(drvdata); + etm4_cs_lock(drvdata, csa); + + drvdata->paused = false; + return 0; +} + +static void etm4_pause_perf(struct coresight_device *csdev) +{ + struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct csdev_access *csa = &csdev->access; + + if (coresight_get_mode(csdev) != CS_MODE_PERF) + return; + + etm4_cs_unlock(drvdata, csa); + etm4_disable_trace_unit(drvdata); + etm4_cs_lock(drvdata, csa); + + drvdata->paused = true; +} + static const struct coresight_ops_source etm4_source_ops = { .cpu_id = etm4_cpu_id, .enable = etm4_enable, .disable = etm4_disable, + .resume_perf = etm4_resume_perf, + .pause_perf = etm4_pause_perf, };
static const struct coresight_ops etm4_cs_ops = { diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index bd7db36ba197..ac649515054d 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -983,6 +983,7 @@ struct etmv4_save_state { * @state_needs_restore: True when there is context to restore after PM exit * @skip_power_up: Indicates if an implementation can skip powering up * the trace unit. + * @paused: Indicates if the trace unit is paused. * @arch_features: Bitmap of arch features of etmv4 devices. */ struct etmv4_drvdata { @@ -1036,6 +1037,7 @@ struct etmv4_drvdata { struct etmv4_save_state *save_state; bool state_needs_restore; bool skip_power_up; + bool paused; DECLARE_BITMAP(arch_features, ETM4_IMPDEF_FEATURE_MAX); };
On Tue, 11 Mar 2025 at 17:05, Leo Yan leo.yan@arm.com wrote:
Add callbacks for pausing and resuming the tracer.
A "paused" flag in the driver data indicates whether the tracer is paused. If the flag is set, the driver will skip starting the hardware trace. The flag is always set to false for the sysfs mode, meaning the tracer will never be paused in the case.
Signed-off-by: Leo Yan leo.yan@arm.com
.../coresight/coresight-etm4x-core.c | 42 ++++++++++++++++++- drivers/hwtracing/coresight/coresight-etm4x.h | 2 + 2 files changed, 43 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 53cb0569dbbf..5b69446db947 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -577,7 +577,8 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata) etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR); }
rc = etm4_enable_trace_unit(drvdata);
if (!drvdata->paused)
rc = etm4_enable_trace_unit(drvdata);
done: etm4_cs_lock(drvdata, csa);
@@ -820,6 +821,9 @@ static int etm4_enable_perf(struct coresight_device *csdev,
drvdata->trcid = path->trace_id;
/* Populate pause state */
drvdata->paused = !!READ_ONCE(event->hw.aux_paused);
/* And enable it */ ret = etm4_enable_hw(drvdata);
@@ -846,6 +850,9 @@ static int etm4_enable_sysfs(struct coresight_device *csdev, struct coresight_pa
drvdata->trcid = path->trace_id;
/* Tracer will never be paused in sysfs mode */
drvdata->paused = false;
/* * Executing etm4_enable_hw on the cpu whose ETM is being enabled * ensures that register writes occur when cpu is powered.
@@ -1080,10 +1087,43 @@ static void etm4_disable(struct coresight_device *csdev, coresight_set_mode(csdev, CS_MODE_DISABLED); }
+static int etm4_resume_perf(struct coresight_device *csdev) +{
struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
struct csdev_access *csa = &csdev->access;
if (coresight_get_mode(csdev) != CS_MODE_PERF)
return -EINVAL;
etm4_cs_unlock(drvdata, csa);
etm4_enable_trace_unit(drvdata);
etm4_cs_lock(drvdata, csa);
drvdata->paused = false;
return 0;
+}
+static void etm4_pause_perf(struct coresight_device *csdev) +{
struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
struct csdev_access *csa = &csdev->access;
if (coresight_get_mode(csdev) != CS_MODE_PERF)
return;
etm4_cs_unlock(drvdata, csa);
etm4_disable_trace_unit(drvdata);
etm4_cs_lock(drvdata, csa);
drvdata->paused = true;
+}
static const struct coresight_ops_source etm4_source_ops = { .cpu_id = etm4_cpu_id, .enable = etm4_enable, .disable = etm4_disable,
.resume_perf = etm4_resume_perf,
.pause_perf = etm4_pause_perf,
};
static const struct coresight_ops etm4_cs_ops = { diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index bd7db36ba197..ac649515054d 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -983,6 +983,7 @@ struct etmv4_save_state {
- @state_needs_restore: True when there is context to restore after PM exit
- @skip_power_up: Indicates if an implementation can skip powering up
the trace unit.
*/
- @paused: Indicates if the trace unit is paused.
- @arch_features: Bitmap of arch features of etmv4 devices.
struct etmv4_drvdata { @@ -1036,6 +1037,7 @@ struct etmv4_drvdata { struct etmv4_save_state *save_state; bool state_needs_restore; bool skip_power_up;
bool paused; DECLARE_BITMAP(arch_features, ETM4_IMPDEF_FEATURE_MAX);
};
-- 2.34.1
Reviewed-by: Mike Leach mike.leach@linaro.org
This commit supports AUX trace pause and resume in a perf session for Arm CoreSight.
First, we need to decide which flag can indicate the CoreSight PMU event has started. The 'event->hw.state' cannot be used for this purpose because its initial value and the value after hardware trace enabling are both 0.
On the other hand, the context value 'ctxt->event_data' stores the ETM private info. This pointer is valid only when the PMU event has been enabled. It is safe to permit AUX trace pause and resume operations only when it is not a NULL pointer.
To achieve fine-grained control of the pause and resume, only the tracer is disabled and enabled. This avoids the unnecessary complexity and latency caused by manipulating the entire link path.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../hwtracing/coresight/coresight-etm-perf.c | 45 ++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index f4cccd68e625..2dcf1809cb7f 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -365,6 +365,18 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, continue; }
+ /* + * If AUX pause feature is enabled but the ETM driver does not + * support the operations, clear this CPU from the mask and + * continue to next one. + */ + if (event->attr.aux_start_paused && + (!source_ops(csdev)->pause_perf || !source_ops(csdev)->resume_perf)) { + dev_err_once(&csdev->dev, "AUX pause is not supported.\n"); + cpumask_clear_cpu(cpu, mask); + continue; + } + /* * No sink provided - look for a default sink for all the ETMs, * where this event can be scheduled. @@ -450,6 +462,15 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, goto out; }
+static int etm_event_resume(struct coresight_device *csdev, + struct etm_ctxt *ctxt) +{ + if (!ctxt->event_data) + return 0; + + return coresight_resume_source(csdev); +} + static void etm_event_start(struct perf_event *event, int flags) { int cpu = smp_processor_id(); @@ -463,6 +484,14 @@ static void etm_event_start(struct perf_event *event, int flags) if (!csdev) goto fail;
+ if (flags & PERF_EF_RESUME) { + if (etm_event_resume(csdev, ctxt) < 0) { + dev_err(&csdev->dev, "Failed to resume ETM event.\n"); + goto fail; + } + return; + } + /* Have we messed up our tracking ? */ if (WARN_ON(ctxt->event_data)) goto fail; @@ -545,6 +574,16 @@ static void etm_event_start(struct perf_event *event, int flags) return; }
+static void etm_event_pause(struct coresight_device *csdev, + struct etm_ctxt *ctxt) +{ + if (!ctxt->event_data) + return; + + /* Stop tracer */ + coresight_pause_source(csdev); +} + static void etm_event_stop(struct perf_event *event, int mode) { int cpu = smp_processor_id(); @@ -555,6 +594,9 @@ static void etm_event_stop(struct perf_event *event, int mode) struct etm_event_data *event_data; struct coresight_path *path;
+ if (mode & PERF_EF_PAUSE) + return etm_event_pause(csdev, ctxt); + /* * If we still have access to the event_data via handle, * confirm that we haven't messed up the tracking. @@ -899,7 +941,8 @@ int __init etm_perf_init(void) int ret;
etm_pmu.capabilities = (PERF_PMU_CAP_EXCLUSIVE | - PERF_PMU_CAP_ITRACE); + PERF_PMU_CAP_ITRACE | + PERF_PMU_CAP_AUX_PAUSE);
etm_pmu.attr_groups = etm_pmu_attr_groups; etm_pmu.task_ctx_nr = perf_sw_context;
Hi Leo,
On Tue, 11 Mar 2025 at 17:05, Leo Yan leo.yan@arm.com wrote:
This commit supports AUX trace pause and resume in a perf session for Arm CoreSight.
First, we need to decide which flag can indicate the CoreSight PMU event has started. The 'event->hw.state' cannot be used for this purpose because its initial value and the value after hardware trace enabling are both 0.
On the other hand, the context value 'ctxt->event_data' stores the ETM private info. This pointer is valid only when the PMU event has been enabled. It is safe to permit AUX trace pause and resume operations only when it is not a NULL pointer.
To achieve fine-grained control of the pause and resume, only the tracer is disabled and enabled. This avoids the unnecessary complexity and latency caused by manipulating the entire link path.
Signed-off-by: Leo Yan leo.yan@arm.com
.../hwtracing/coresight/coresight-etm-perf.c | 45 ++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index f4cccd68e625..2dcf1809cb7f 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -365,6 +365,18 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, continue; }
/*
* If AUX pause feature is enabled but the ETM driver does not
* support the operations, clear this CPU from the mask and
* continue to next one.
*/
if (event->attr.aux_start_paused &&
(!source_ops(csdev)->pause_perf || !source_ops(csdev)->resume_perf)) {
dev_err_once(&csdev->dev, "AUX pause is not supported.\n");
cpumask_clear_cpu(cpu, mask);
continue;
}
/* * No sink provided - look for a default sink for all the ETMs, * where this event can be scheduled.
@@ -450,6 +462,15 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, goto out; }
+static int etm_event_resume(struct coresight_device *csdev,
struct etm_ctxt *ctxt)
+{
if (!ctxt->event_data)
return 0;
return coresight_resume_source(csdev);
+}
static void etm_event_start(struct perf_event *event, int flags) { int cpu = smp_processor_id(); @@ -463,6 +484,14 @@ static void etm_event_start(struct perf_event *event, int flags) if (!csdev) goto fail;
Is it possible here that the first call to etm_event_start() also has the PERF_EF_RESUME flag set? If so it looks like we need to fall through and do a "normal" start to get all the ctxt->event_data set up.
if (flags & PERF_EF_RESUME) {
if (etm_event_resume(csdev, ctxt) < 0) {
dev_err(&csdev->dev, "Failed to resume ETM event.\n");
goto fail;
}
return;
}
/* Have we messed up our tracking ? */ if (WARN_ON(ctxt->event_data)) goto fail;
@@ -545,6 +574,16 @@ static void etm_event_start(struct perf_event *event, int flags) return; }
+static void etm_event_pause(struct coresight_device *csdev,
struct etm_ctxt *ctxt)
+{
if (!ctxt->event_data)
return;
/* Stop tracer */
coresight_pause_source(csdev);
+}
static void etm_event_stop(struct perf_event *event, int mode) { int cpu = smp_processor_id(); @@ -555,6 +594,9 @@ static void etm_event_stop(struct perf_event *event, int mode) struct etm_event_data *event_data; struct coresight_path *path;
if (mode & PERF_EF_PAUSE)
return etm_event_pause(csdev, ctxt);
/* * If we still have access to the event_data via handle, * confirm that we haven't messed up the tracking.
@@ -899,7 +941,8 @@ int __init etm_perf_init(void) int ret;
etm_pmu.capabilities = (PERF_PMU_CAP_EXCLUSIVE |
PERF_PMU_CAP_ITRACE);
PERF_PMU_CAP_ITRACE |
PERF_PMU_CAP_AUX_PAUSE); etm_pmu.attr_groups = etm_pmu_attr_groups; etm_pmu.task_ctx_nr = perf_sw_context;
-- 2.34.1
If the possible issue above is prevented by perf internals
Reviewed-by: Mike Leach mike.leach@linaro.org
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK
Hi Mike,
On Tue, Apr 01, 2025 at 01:50:52PM +0100, Mike Leach wrote:
[...]
static void etm_event_start(struct perf_event *event, int flags) { int cpu = smp_processor_id(); @@ -463,6 +484,14 @@ static void etm_event_start(struct perf_event *event, int flags) if (!csdev) goto fail;
Is it possible here that the first call to etm_event_start() also has the PERF_EF_RESUME flag set?
The first call has a flow below, using flag 0 but not PERF_EF_RESUME.
etm_event_add() `> etm_event_start(event, 0);
Note: for the first call, the tracer should be disabled if 'event->hw.aux_paused' is 1. This is ensured by patch 03.
Thanks, Leo
If so it looks like we need to fall through and do a "normal" start to get all the ctxt->event_data set up.
if (flags & PERF_EF_RESUME) {
if (etm_event_resume(csdev, ctxt) < 0) {
dev_err(&csdev->dev, "Failed to resume ETM event.\n");
goto fail;
}
return;
}
/* Have we messed up our tracking ? */ if (WARN_ON(ctxt->event_data)) goto fail;
@@ -545,6 +574,16 @@ static void etm_event_start(struct perf_event *event, int flags) return; }
+static void etm_event_pause(struct coresight_device *csdev,
struct etm_ctxt *ctxt)
+{
if (!ctxt->event_data)
return;
/* Stop tracer */
coresight_pause_source(csdev);
+}
static void etm_event_stop(struct perf_event *event, int mode) { int cpu = smp_processor_id(); @@ -555,6 +594,9 @@ static void etm_event_stop(struct perf_event *event, int mode) struct etm_event_data *event_data; struct coresight_path *path;
if (mode & PERF_EF_PAUSE)
return etm_event_pause(csdev, ctxt);
/* * If we still have access to the event_data via handle, * confirm that we haven't messed up the tracking.
@@ -899,7 +941,8 @@ int __init etm_perf_init(void) int ret;
etm_pmu.capabilities = (PERF_PMU_CAP_EXCLUSIVE |
PERF_PMU_CAP_ITRACE);
PERF_PMU_CAP_ITRACE |
PERF_PMU_CAP_AUX_PAUSE); etm_pmu.attr_groups = etm_pmu_attr_groups; etm_pmu.task_ctx_nr = perf_sw_context;
-- 2.34.1
If the possible issue above is prevented by perf internals
Reviewed-by: Mike Leach mike.leach@linaro.org
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK
Hi Leo
On Tue, 1 Apr 2025 at 16:00, Leo Yan leo.yan@arm.com wrote:
Hi Mike,
On Tue, Apr 01, 2025 at 01:50:52PM +0100, Mike Leach wrote:
[...]
static void etm_event_start(struct perf_event *event, int flags) { int cpu = smp_processor_id(); @@ -463,6 +484,14 @@ static void etm_event_start(struct perf_event *event, int flags) if (!csdev) goto fail;
Is it possible here that the first call to etm_event_start() also has the PERF_EF_RESUME flag set?
The first call has a flow below, using flag 0 but not PERF_EF_RESUME.
etm_event_add() `> etm_event_start(event, 0);
When I looked at the vague comments in the perf source - it seemed to imply that ->start() calls could overlap - so the associated event that resumes trace could occur at the same time as the initialising start from paused for the trace operations.
If we are guaranteed this cannot happen then we are good to go!
Note: for the first call, the tracer should be disabled if 'event->hw.aux_paused' is 1. This is ensured by patch 03.
Yes - I saw that.
Mike
Thanks, Leo
If so it looks like we need to fall through and do a "normal" start to get all the ctxt->event_data set up.
if (flags & PERF_EF_RESUME) {
if (etm_event_resume(csdev, ctxt) < 0) {
dev_err(&csdev->dev, "Failed to resume ETM event.\n");
goto fail;
}
return;
}
/* Have we messed up our tracking ? */ if (WARN_ON(ctxt->event_data)) goto fail;
@@ -545,6 +574,16 @@ static void etm_event_start(struct perf_event *event, int flags) return; }
+static void etm_event_pause(struct coresight_device *csdev,
struct etm_ctxt *ctxt)
+{
if (!ctxt->event_data)
return;
/* Stop tracer */
coresight_pause_source(csdev);
+}
static void etm_event_stop(struct perf_event *event, int mode) { int cpu = smp_processor_id(); @@ -555,6 +594,9 @@ static void etm_event_stop(struct perf_event *event, int mode) struct etm_event_data *event_data; struct coresight_path *path;
if (mode & PERF_EF_PAUSE)
return etm_event_pause(csdev, ctxt);
/* * If we still have access to the event_data via handle, * confirm that we haven't messed up the tracking.
@@ -899,7 +941,8 @@ int __init etm_perf_init(void) int ret;
etm_pmu.capabilities = (PERF_PMU_CAP_EXCLUSIVE |
PERF_PMU_CAP_ITRACE);
PERF_PMU_CAP_ITRACE |
PERF_PMU_CAP_AUX_PAUSE); etm_pmu.attr_groups = etm_pmu_attr_groups; etm_pmu.task_ctx_nr = perf_sw_context;
-- 2.34.1
If the possible issue above is prevented by perf internals
Reviewed-by: Mike Leach mike.leach@linaro.org
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK
Hi Mike,
On Wed, Apr 02, 2025 at 09:45:20AM +0100, Mike Leach wrote:
[...]
static void etm_event_start(struct perf_event *event, int flags) { int cpu = smp_processor_id(); @@ -463,6 +484,14 @@ static void etm_event_start(struct perf_event *event, int flags) if (!csdev) goto fail;
Is it possible here that the first call to etm_event_start() also has the PERF_EF_RESUME flag set?
The first call has a flow below, using flag 0 but not PERF_EF_RESUME.
etm_event_add() `> etm_event_start(event, 0);
When I looked at the vague comments in the perf source - it seemed to imply that ->start() calls could overlap - so the associated event that resumes trace could occur at the same time as the initialising start from paused for the trace operations.
Good point. A subtle but important thing is the 'cs_etm' event must be an event group leader, otherwise, the tool reports error:
# perf record -m,64M -e cycles/aux-action=pause,period=10000000/ \ -e cycles/aux-action=resume,period=10/ \ -e cs_etm/aux-action=start-paused/u -- /mnt/sort Events with aux-action must have AUX area event group leader
If the 'cs_etm' event is the event group leader, it will be always enabled ahead other PMU events. So etm_event_start(event, 0) is invoked prior to PMU events enabling. As a result, this can avoid the race condition you mentioned.
I confirmed with ftrace log:
sort-901 [005] d..3. 1033.827186: etm_event_add <-event_sched_in sort-901 [005] d..3. 1033.827187: etm_event_start <-etm_event_add sort-901 [005] d..3. 1033.827283: armpmu_add <-event_sched_in sort-901 [005] d..3. 1033.827287: armpmu_start <-armpmu_add sort-901 [005] d..3. 1033.827288: armpmu_event_set_period <-armpmu_start sort-901 [005] d..3. 1033.827292: armpmu_add <-event_sched_in sort-901 [005] d..3. 1033.827293: armpmu_start <-armpmu_add sort-901 [005] d..3. 1033.827294: armpmu_event_set_period <-armpmu_start sort-901 [005] d..3. 1033.827298: armpmu_filter <-visit_groups_merge.constprop.0.isra.0 sort-901 [005] d..3. 1033.827298: armpmu_enable <-perf_pmu_enable sort-901 [005] d..3. 1033.827301: armpmu_enable <-perf_pmu_enable sort-901 [005] d.h1. 1033.827304: armpmu_dispatch_irq <-__handle_irq_event_percpu sort-901 [005] d.h1. 1033.827306: armpmu_event_update <-armv8pmu_handle_irq sort-901 [005] d.h1. 1033.827308: armpmu_event_set_period <-armv8pmu_handle_irq sort-901 [005] d.h.. 1033.827322: perf_event_aux_pause: event=ffff000207503e40 pause=0
If we are guaranteed this cannot happen then we are good to go!
Now I think we are safe, right? ;)
Thanks, Leo
Hi Leo
On Wed, 2 Apr 2025 at 13:31, Leo Yan leo.yan@arm.com wrote:
Hi Mike,
On Wed, Apr 02, 2025 at 09:45:20AM +0100, Mike Leach wrote:
[...]
static void etm_event_start(struct perf_event *event, int flags) { int cpu = smp_processor_id(); @@ -463,6 +484,14 @@ static void etm_event_start(struct perf_event *event, int flags) if (!csdev) goto fail;
Is it possible here that the first call to etm_event_start() also has the PERF_EF_RESUME flag set?
The first call has a flow below, using flag 0 but not PERF_EF_RESUME.
etm_event_add() `> etm_event_start(event, 0);
When I looked at the vague comments in the perf source - it seemed to imply that ->start() calls could overlap - so the associated event that resumes trace could occur at the same time as the initialising start from paused for the trace operations.
Good point. A subtle but important thing is the 'cs_etm' event must be an event group leader, otherwise, the tool reports error:
# perf record -m,64M -e cycles/aux-action=pause,period=10000000/ \ -e cycles/aux-action=resume,period=10/ \ -e cs_etm/aux-action=start-paused/u -- /mnt/sort Events with aux-action must have AUX area event group leader
If the 'cs_etm' event is the event group leader, it will be always enabled ahead other PMU events. So etm_event_start(event, 0) is invoked prior to PMU events enabling. As a result, this can avoid the race condition you mentioned.
I confirmed with ftrace log:
sort-901 [005] d..3. 1033.827186: etm_event_add <-event_sched_in sort-901 [005] d..3. 1033.827187: etm_event_start <-etm_event_add sort-901 [005] d..3. 1033.827283: armpmu_add <-event_sched_in sort-901 [005] d..3. 1033.827287: armpmu_start <-armpmu_add sort-901 [005] d..3. 1033.827288: armpmu_event_set_period <-armpmu_start sort-901 [005] d..3. 1033.827292: armpmu_add <-event_sched_in sort-901 [005] d..3. 1033.827293: armpmu_start <-armpmu_add sort-901 [005] d..3. 1033.827294: armpmu_event_set_period <-armpmu_start sort-901 [005] d..3. 1033.827298: armpmu_filter <-visit_groups_merge.constprop.0.isra.0 sort-901 [005] d..3. 1033.827298: armpmu_enable <-perf_pmu_enable sort-901 [005] d..3. 1033.827301: armpmu_enable <-perf_pmu_enable sort-901 [005] d.h1. 1033.827304: armpmu_dispatch_irq <-__handle_irq_event_percpu sort-901 [005] d.h1. 1033.827306: armpmu_event_update <-armv8pmu_handle_irq sort-901 [005] d.h1. 1033.827308: armpmu_event_set_period <-armv8pmu_handle_irq sort-901 [005] d.h.. 1033.827322: perf_event_aux_pause: event=ffff000207503e40 pause=0
If we are guaranteed this cannot happen then we are good to go!
Now I think we are safe, right? ;)
Thanks, Leo
I'd agree with that!
Mike
Due to sinks like ETR and ETB don't support interrupt handling, the hardware trace data might be lost for continuous running tasks.
This commit takes advantage of the AUX pause for updating trace buffer to mitigate the trace data losing issue.
The per CPU sink has its own interrupt handling. Thus, there will be a race condition between the updating buffer in NMI and sink's interrupt handler. To avoid the race condition, this commit disallows updating buffer on AUX pause for the per CPU sink. Currently, this is only applied for TRBE.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../hwtracing/coresight/coresight-etm-perf.c | 43 ++++++++++++++++++- 1 file changed, 41 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 2dcf1809cb7f..f1551c08ecb2 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -574,14 +574,53 @@ static void etm_event_start(struct perf_event *event, int flags) return; }
-static void etm_event_pause(struct coresight_device *csdev, +static void etm_event_pause(struct perf_event *event, + struct coresight_device *csdev, struct etm_ctxt *ctxt) { + int cpu = smp_processor_id(); + struct coresight_device *sink; + struct perf_output_handle *handle = &ctxt->handle; + struct coresight_path *path; + unsigned long size; + if (!ctxt->event_data) return;
/* Stop tracer */ coresight_pause_source(csdev); + + path = etm_event_cpu_path(ctxt->event_data, cpu); + sink = coresight_get_sink(path); + if (WARN_ON_ONCE(!sink)) + return; + + /* + * The per CPU sink has own interrupt handling, it might have + * race condition with updating buffer on AUX trace pause if + * it is invoked from NMI. To avoid the race condition, + * disallows updating buffer for the per CPU sink case. + */ + if (coresight_is_percpu_sink(sink)) + return; + + if (WARN_ON_ONCE(handle->event != event)) + return; + + if (!sink_ops(sink)->update_buffer) + return; + + size = sink_ops(sink)->update_buffer(sink, handle, + ctxt->event_data->snk_config); + if (READ_ONCE(handle->event)) { + if (!size) + return; + + perf_aux_output_end(handle, size); + perf_aux_output_begin(handle, event); + } else { + WARN_ON_ONCE(size); + } }
static void etm_event_stop(struct perf_event *event, int mode) @@ -595,7 +634,7 @@ static void etm_event_stop(struct perf_event *event, int mode) struct coresight_path *path;
if (mode & PERF_EF_PAUSE) - return etm_event_pause(csdev, ctxt); + return etm_event_pause(event, csdev, ctxt);
/* * If we still have access to the event_data via handle,
On 11/03/2025 17:04, Leo Yan wrote:
Due to sinks like ETR and ETB don't support interrupt handling, the hardware trace data might be lost for continuous running tasks.
This commit takes advantage of the AUX pause for updating trace buffer to mitigate the trace data losing issue.
The per CPU sink has its own interrupt handling. Thus, there will be a race condition between the updating buffer in NMI and sink's interrupt handler. To avoid the race condition, this commit disallows updating buffer on AUX pause for the per CPU sink. Currently, this is only applied for TRBE.
Signed-off-by: Leo Yan leo.yan@arm.com
.../hwtracing/coresight/coresight-etm-perf.c | 43 ++++++++++++++++++- 1 file changed, 41 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 2dcf1809cb7f..f1551c08ecb2 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -574,14 +574,53 @@ static void etm_event_start(struct perf_event *event, int flags) return; } -static void etm_event_pause(struct coresight_device *csdev, +static void etm_event_pause(struct perf_event *event,
{struct coresight_device *csdev, struct etm_ctxt *ctxt)
- int cpu = smp_processor_id();
- struct coresight_device *sink;
- struct perf_output_handle *handle = &ctxt->handle;
- struct coresight_path *path;
- unsigned long size;
- if (!ctxt->event_data) return;
/* Stop tracer */ coresight_pause_source(csdev);
- path = etm_event_cpu_path(ctxt->event_data, cpu);
- sink = coresight_get_sink(path);
- if (WARN_ON_ONCE(!sink))
return;
- /*
* The per CPU sink has own interrupt handling, it might have
* race condition with updating buffer on AUX trace pause if
* it is invoked from NMI. To avoid the race condition,
* disallows updating buffer for the per CPU sink case.
*/
- if (coresight_is_percpu_sink(sink))
return;
- if (WARN_ON_ONCE(handle->event != event))
return;
- if (!sink_ops(sink)->update_buffer)
return;
- size = sink_ops(sink)->update_buffer(sink, handle,
ctxt->event_data->snk_config);
I believe we keep the sink disabled/stopped in update_buffer. We need to turn it back ON after the "buffer update". May be we could use a flag to update_buffer() to indicate this is "pause" triggered update.
Cheers Suzuki
- if (READ_ONCE(handle->event)) {
if (!size)
return;
perf_aux_output_end(handle, size);
perf_aux_output_begin(handle, event);
- } else {
WARN_ON_ONCE(size);
- } }
static void etm_event_stop(struct perf_event *event, int mode) @@ -595,7 +634,7 @@ static void etm_event_stop(struct perf_event *event, int mode) struct coresight_path *path; if (mode & PERF_EF_PAUSE)
return etm_event_pause(csdev, ctxt);
return etm_event_pause(event, csdev, ctxt);
/* * If we still have access to the event_data via handle,
On Tue, 1 Apr 2025 at 13:52, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 11/03/2025 17:04, Leo Yan wrote:
Due to sinks like ETR and ETB don't support interrupt handling, the hardware trace data might be lost for continuous running tasks.
This commit takes advantage of the AUX pause for updating trace buffer to mitigate the trace data losing issue.
The per CPU sink has its own interrupt handling. Thus, there will be a race condition between the updating buffer in NMI and sink's interrupt handler. To avoid the race condition, this commit disallows updating buffer on AUX pause for the per CPU sink. Currently, this is only applied for TRBE.
Signed-off-by: Leo Yan leo.yan@arm.com
.../hwtracing/coresight/coresight-etm-perf.c | 43 ++++++++++++++++++- 1 file changed, 41 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 2dcf1809cb7f..f1551c08ecb2 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -574,14 +574,53 @@ static void etm_event_start(struct perf_event *event, int flags) return; }
-static void etm_event_pause(struct coresight_device *csdev, +static void etm_event_pause(struct perf_event *event,
struct coresight_device *csdev, struct etm_ctxt *ctxt)
{
int cpu = smp_processor_id();
struct coresight_device *sink;
struct perf_output_handle *handle = &ctxt->handle;
struct coresight_path *path;
unsigned long size;
if (!ctxt->event_data) return; /* Stop tracer */ coresight_pause_source(csdev);
path = etm_event_cpu_path(ctxt->event_data, cpu);
sink = coresight_get_sink(path);
if (WARN_ON_ONCE(!sink))
return;
/*
* The per CPU sink has own interrupt handling, it might have
* race condition with updating buffer on AUX trace pause if
* it is invoked from NMI. To avoid the race condition,
* disallows updating buffer for the per CPU sink case.
*/
if (coresight_is_percpu_sink(sink))
return;
if (WARN_ON_ONCE(handle->event != event))
return;
if (!sink_ops(sink)->update_buffer)
return;
size = sink_ops(sink)->update_buffer(sink, handle,
ctxt->event_data->snk_config);
I believe we keep the sink disabled/stopped in update_buffer. We need to turn it back ON after the "buffer update". May be we could use a flag to update_buffer() to indicate this is "pause" triggered update.
Cheers Suzuki
The sink is stopped to read data, but also important is the enable/disable refcount. The only time that "update_buffer" will read is if the sink has a refcount == 1. These are ultimately controlled by enable/disable path - which will not occur during pause/resume operations.
Regards
Mike
if (READ_ONCE(handle->event)) {
if (!size)
return;
perf_aux_output_end(handle, size);
perf_aux_output_begin(handle, event);
} else {
WARN_ON_ONCE(size);
}
}
static void etm_event_stop(struct perf_event *event, int mode)
@@ -595,7 +634,7 @@ static void etm_event_stop(struct perf_event *event, int mode) struct coresight_path *path;
if (mode & PERF_EF_PAUSE)
return etm_event_pause(csdev, ctxt);
return etm_event_pause(event, csdev, ctxt); /* * If we still have access to the event_data via handle,
On Tue, Apr 01, 2025 at 03:35:44PM +0100, Mike Leach wrote:
[...]
-static void etm_event_pause(struct coresight_device *csdev, +static void etm_event_pause(struct perf_event *event,
struct coresight_device *csdev, struct etm_ctxt *ctxt)
{
int cpu = smp_processor_id();
struct coresight_device *sink;
struct perf_output_handle *handle = &ctxt->handle;
struct coresight_path *path;
unsigned long size;
if (!ctxt->event_data) return; /* Stop tracer */ coresight_pause_source(csdev);
path = etm_event_cpu_path(ctxt->event_data, cpu);
sink = coresight_get_sink(path);
if (WARN_ON_ONCE(!sink))
return;
/*
* The per CPU sink has own interrupt handling, it might have
* race condition with updating buffer on AUX trace pause if
* it is invoked from NMI. To avoid the race condition,
* disallows updating buffer for the per CPU sink case.
*/
if (coresight_is_percpu_sink(sink))
return;
if (WARN_ON_ONCE(handle->event != event))
return;
if (!sink_ops(sink)->update_buffer)
return;
size = sink_ops(sink)->update_buffer(sink, handle,
ctxt->event_data->snk_config);
I believe we keep the sink disabled/stopped in update_buffer. We need to turn it back ON after the "buffer update". May be we could use a flag to update_buffer() to indicate this is "pause" triggered update.
Thanks for pointing out, Suzuki.
I will fix it in next spin.
The sink is stopped to read data, but also important is the enable/disable refcount. The only time that "update_buffer" will read is if the sink has a refcount == 1. These are ultimately controlled by enable/disable path - which will not occur during pause/resume operations.
Hi Mike,
My understanding is: if a sink is shared by multiple CPUs, e.g. a Perf session with CPU mode or system wide mode, then we cannot arbitrarily disable sink and update buffer, otherwise, it might cause hardware lockup issue.
For per-thread mode, the refcount == 1, we can take chance to update buffer, right?
Thanks, Leo
This adds description for AUX pause and resume. It gives introduction for what's AUX pause and resume and records usage examples.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../trace/coresight/coresight-perf.rst | 31 +++++++++++++++++++ 1 file changed, 31 insertions(+)
diff --git a/Documentation/trace/coresight/coresight-perf.rst b/Documentation/trace/coresight/coresight-perf.rst index d087aae7d492..30be89320621 100644 --- a/Documentation/trace/coresight/coresight-perf.rst +++ b/Documentation/trace/coresight/coresight-perf.rst @@ -78,6 +78,37 @@ enabled like::
Please refer to the kernel configuration help for more information.
+Fine-grained tracing with AUX pause and resume +---------------------------------------------- + +Arm CoreSight may generate a large amount of hardware trace data, which +will lead to overhead in recording and distract users when reviewing +profiling result. To mitigate the issue of excessive trace data, Perf +provides AUX pause and resume functionality for fine-grained tracing. + +The AUX pause and resume can be triggered by associated events. These +events can be ftrace tracepoints (including static and dynamic +tracepoints) or PMU events (e.g. CPU PMU cycle event). To create a perf +session with AUX pause / resume, three configuration terms are +introduced: + +- "aux-action=start-paused": it is specified for the cs_etm PMU event to + launch in a paused state. +- "aux-action=pause": an associated event is specified with this term + to pause AUX trace. +- "aux-action=resume": an associated event is specified with this term + to resume AUX trace. + +Example for triggering AUX pause and resume with ftrace tracepoints:: + + perf record -e cs_etm/aux-action=start-paused/k,syscalls:sys_enter_openat/aux-action=resume/,syscalls:sys_exit_openat/aux-action=pause/ ls + +Example for triggering AUX pause and resume with PMU event:: + + perf record -a -e cs_etm/aux-action=start-paused/k \ + -e cycles/aux-action=pause,period=10000000/ \ + -e cycles/aux-action=resume,period=1050000/ -- sleep 1 + Perf test - Verify kernel and userspace perf CoreSight work -----------------------------------------------------------