This series is to enable AUX pause and resume on Arm CoreSight.
The first patch extracts the trace unit controlling operations to two functions. These two functions will be used by AUX pause and resume.
Patches 02 and 03 change the ETMv4 driver to prepare callback functions for AUX pause and resume.
Patch 04 changes the ETM perf layer to support AUX pause and resume in a perf session. The patches 05 updates buffer on AUX pause occasion, which can mitigate the trace data lose issue.
Patch 07 documents the AUX pause usages with Arm CoreSight.
This patch set has been verified on the Hikey960 board and TC platform. The previous one uses ETR and the later uses TRBE as sink.
It is suggested to disable CPUIdle (add `nohlt` option in Linux command line) when verifying this series. ETM and funnel drivers are found issues during CPU suspend and resume which will be addressed separately.
Changes from v2: - Rebased on CoreSight next branch. - Dropped the uAPI 'update_buf_on_pause' and updated document respectively (Suzuki). - Renamed ETM callbacks to .pause_perf() and .resume_perf() (Suzuki). - Minor improvement for error handling in the AUX resume flow.
Changes from v1: - Added validation function pointers in pause and resume APIs (Mike).
Leo Yan (6): coresight: etm4x: Extract the trace unit controlling coresight: Introduce pause and resume APIs for source coresight: etm4x: Hook pause and resume callbacks coresight: perf: Support AUX trace pause and resume coresight: perf: Update buffer on AUX pause Documentation: coresight: Document AUX pause and resume
.../trace/coresight/coresight-perf.rst | 31 ++++ drivers/hwtracing/coresight/coresight-core.c | 22 +++ .../hwtracing/coresight/coresight-etm-perf.c | 84 +++++++++- .../coresight/coresight-etm4x-core.c | 143 +++++++++++++----- drivers/hwtracing/coresight/coresight-etm4x.h | 2 + drivers/hwtracing/coresight/coresight-priv.h | 2 + include/linux/coresight.h | 4 + 7 files changed, 246 insertions(+), 42 deletions(-)
The trace unit is controlled in the ETM hardware enabling and disabling. The sequential changes for support AUX pause and resume will reuse the same operations.
Extract the operations in the etm4_{enable|disable}_trace_unit() functions. A minor improvement in etm4_enable_trace_unit() is for returning the timeout error to callers.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../coresight/coresight-etm4x-core.c | 103 +++++++++++------- 1 file changed, 62 insertions(+), 41 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index e5972f16abff..53cb0569dbbf 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -431,6 +431,44 @@ static int etm4x_wait_status(struct csdev_access *csa, int pos, int val) return coresight_timeout(csa, TRCSTATR, pos, val); }
+static int etm4_enable_trace_unit(struct etmv4_drvdata *drvdata) +{ + struct coresight_device *csdev = drvdata->csdev; + struct device *etm_dev = &csdev->dev; + struct csdev_access *csa = &csdev->access; + + /* + * ETE mandates that the TRCRSR is written to before + * enabling it. + */ + if (etm4x_is_ete(drvdata)) + etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR); + + etm4x_allow_trace(drvdata); + /* Enable the trace unit */ + etm4x_relaxed_write32(csa, 1, TRCPRGCTLR); + + /* Synchronize the register updates for sysreg access */ + if (!csa->io_mem) + isb(); + + /* wait for TRCSTATR.IDLE to go back down to '0' */ + if (etm4x_wait_status(csa, TRCSTATR_IDLE_BIT, 0)) { + dev_err(etm_dev, + "timeout while waiting for Idle Trace Status\n"); + return -ETIME; + } + + /* + * As recommended by section 4.3.7 ("Synchronization when using the + * memory-mapped interface") of ARM IHI 0064D + */ + dsb(sy); + isb(); + + return 0; +} + static int etm4_enable_hw(struct etmv4_drvdata *drvdata) { int i, rc; @@ -539,33 +577,7 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata) etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR); }
- /* - * ETE mandates that the TRCRSR is written to before - * enabling it. - */ - if (etm4x_is_ete(drvdata)) - etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR); - - etm4x_allow_trace(drvdata); - /* Enable the trace unit */ - etm4x_relaxed_write32(csa, 1, TRCPRGCTLR); - - /* Synchronize the register updates for sysreg access */ - if (!csa->io_mem) - isb(); - - /* wait for TRCSTATR.IDLE to go back down to '0' */ - if (etm4x_wait_status(csa, TRCSTATR_IDLE_BIT, 0)) - dev_err(etm_dev, - "timeout while waiting for Idle Trace Status\n"); - - /* - * As recommended by section 4.3.7 ("Synchronization when using the - * memory-mapped interface") of ARM IHI 0064D - */ - dsb(sy); - isb(); - + rc = etm4_enable_trace_unit(drvdata); done: etm4_cs_lock(drvdata, csa);
@@ -884,25 +896,12 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event, return ret; }
-static void etm4_disable_hw(void *info) +static void etm4_disable_trace_unit(struct etmv4_drvdata *drvdata) { u32 control; - struct etmv4_drvdata *drvdata = info; - struct etmv4_config *config = &drvdata->config; struct coresight_device *csdev = drvdata->csdev; struct device *etm_dev = &csdev->dev; struct csdev_access *csa = &csdev->access; - int i; - - etm4_cs_unlock(drvdata, csa); - etm4_disable_arch_specific(drvdata); - - if (!drvdata->skip_power_up) { - /* power can be removed from the trace unit now */ - control = etm4x_relaxed_read32(csa, TRCPDCR); - control &= ~TRCPDCR_PU; - etm4x_relaxed_write32(csa, control, TRCPDCR); - }
control = etm4x_relaxed_read32(csa, TRCPRGCTLR);
@@ -943,6 +942,28 @@ static void etm4_disable_hw(void *info) * of ARM IHI 0064H.b. */ isb(); +} + +static void etm4_disable_hw(void *info) +{ + u32 control; + struct etmv4_drvdata *drvdata = info; + struct etmv4_config *config = &drvdata->config; + struct coresight_device *csdev = drvdata->csdev; + struct csdev_access *csa = &csdev->access; + int i; + + etm4_cs_unlock(drvdata, csa); + etm4_disable_arch_specific(drvdata); + + if (!drvdata->skip_power_up) { + /* power can be removed from the trace unit now */ + control = etm4x_relaxed_read32(csa, TRCPDCR); + control &= ~TRCPDCR_PU; + etm4x_relaxed_write32(csa, control, TRCPDCR); + } + + etm4_disable_trace_unit(drvdata);
/* read the status of the single shot comparators */ for (i = 0; i < drvdata->nr_ss_cmp; i++) {
Introduce APIs for pausing and resuming trace source and export as GPL symbols.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-core.c | 22 ++++++++++++++++++++ drivers/hwtracing/coresight/coresight-priv.h | 2 ++ include/linux/coresight.h | 4 ++++ 3 files changed, 28 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index fb43ef6a3b1f..d4c3000608f2 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -367,6 +367,28 @@ void coresight_disable_source(struct coresight_device *csdev, void *data) } EXPORT_SYMBOL_GPL(coresight_disable_source);
+void coresight_pause_source(struct coresight_device *csdev) +{ + if (!coresight_is_percpu_source(csdev)) + return; + + if (source_ops(csdev)->pause_perf) + source_ops(csdev)->pause_perf(csdev); +} +EXPORT_SYMBOL_GPL(coresight_pause_source); + +int coresight_resume_source(struct coresight_device *csdev) +{ + if (!coresight_is_percpu_source(csdev)) + return -EOPNOTSUPP; + + if (!source_ops(csdev)->resume_perf) + return -EOPNOTSUPP; + + return source_ops(csdev)->resume_perf(csdev); +} +EXPORT_SYMBOL_GPL(coresight_resume_source); + /* * coresight_disable_path_from : Disable components in the given path beyond * @nd in the list. If @nd is NULL, all the components, except the SOURCE are diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index 82644aff8d2b..2d9baa9d8228 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -249,5 +249,7 @@ void coresight_add_helper(struct coresight_device *csdev, void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev); struct coresight_device *coresight_get_percpu_sink(int cpu); void coresight_disable_source(struct coresight_device *csdev, void *data); +void coresight_pause_source(struct coresight_device *csdev); +int coresight_resume_source(struct coresight_device *csdev);
#endif diff --git a/include/linux/coresight.h b/include/linux/coresight.h index d79a242b271d..c95c72e07e02 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -398,6 +398,8 @@ struct coresight_ops_link { * is associated to. * @enable: enables tracing for a source. * @disable: disables tracing for a source. + * @resume_perf: resumes tracing for a source in perf session. + * @pause_perf: pauses tracing for a source in perf session. */ struct coresight_ops_source { int (*cpu_id)(struct coresight_device *csdev); @@ -405,6 +407,8 @@ struct coresight_ops_source { enum cs_mode mode, struct coresight_path *path); void (*disable)(struct coresight_device *csdev, struct perf_event *event); + int (*resume_perf)(struct coresight_device *csdev); + void (*pause_perf)(struct coresight_device *csdev); };
/**
Add callbacks for pausing and resuming the tracer.
A "paused" flag in the driver data indicates whether the tracer is paused. If the flag is set, the driver will skip starting the hardware trace. The flag is always set to false for the sysfs mode, meaning the tracer will never be paused in the case.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../coresight/coresight-etm4x-core.c | 42 ++++++++++++++++++- drivers/hwtracing/coresight/coresight-etm4x.h | 2 + 2 files changed, 43 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 53cb0569dbbf..5b69446db947 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -577,7 +577,8 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata) etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR); }
- rc = etm4_enable_trace_unit(drvdata); + if (!drvdata->paused) + rc = etm4_enable_trace_unit(drvdata); done: etm4_cs_lock(drvdata, csa);
@@ -820,6 +821,9 @@ static int etm4_enable_perf(struct coresight_device *csdev,
drvdata->trcid = path->trace_id;
+ /* Populate pause state */ + drvdata->paused = !!READ_ONCE(event->hw.aux_paused); + /* And enable it */ ret = etm4_enable_hw(drvdata);
@@ -846,6 +850,9 @@ static int etm4_enable_sysfs(struct coresight_device *csdev, struct coresight_pa
drvdata->trcid = path->trace_id;
+ /* Tracer will never be paused in sysfs mode */ + drvdata->paused = false; + /* * Executing etm4_enable_hw on the cpu whose ETM is being enabled * ensures that register writes occur when cpu is powered. @@ -1080,10 +1087,43 @@ static void etm4_disable(struct coresight_device *csdev, coresight_set_mode(csdev, CS_MODE_DISABLED); }
+static int etm4_resume_perf(struct coresight_device *csdev) +{ + struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct csdev_access *csa = &csdev->access; + + if (coresight_get_mode(csdev) != CS_MODE_PERF) + return -EINVAL; + + etm4_cs_unlock(drvdata, csa); + etm4_enable_trace_unit(drvdata); + etm4_cs_lock(drvdata, csa); + + drvdata->paused = false; + return 0; +} + +static void etm4_pause_perf(struct coresight_device *csdev) +{ + struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct csdev_access *csa = &csdev->access; + + if (coresight_get_mode(csdev) != CS_MODE_PERF) + return; + + etm4_cs_unlock(drvdata, csa); + etm4_disable_trace_unit(drvdata); + etm4_cs_lock(drvdata, csa); + + drvdata->paused = true; +} + static const struct coresight_ops_source etm4_source_ops = { .cpu_id = etm4_cpu_id, .enable = etm4_enable, .disable = etm4_disable, + .resume_perf = etm4_resume_perf, + .pause_perf = etm4_pause_perf, };
static const struct coresight_ops etm4_cs_ops = { diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index bd7db36ba197..ac649515054d 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -983,6 +983,7 @@ struct etmv4_save_state { * @state_needs_restore: True when there is context to restore after PM exit * @skip_power_up: Indicates if an implementation can skip powering up * the trace unit. + * @paused: Indicates if the trace unit is paused. * @arch_features: Bitmap of arch features of etmv4 devices. */ struct etmv4_drvdata { @@ -1036,6 +1037,7 @@ struct etmv4_drvdata { struct etmv4_save_state *save_state; bool state_needs_restore; bool skip_power_up; + bool paused; DECLARE_BITMAP(arch_features, ETM4_IMPDEF_FEATURE_MAX); };
This commit supports AUX trace pause and resume in a perf session for Arm CoreSight.
First, we need to decide which flag can indicate the CoreSight PMU event has started. The 'event->hw.state' cannot be used for this purpose because its initial value and the value after hardware trace enabling are both 0.
On the other hand, the context value 'ctxt->event_data' stores the ETM private info. This pointer is valid only when the PMU event has been enabled. It is safe to permit AUX trace pause and resume operations only when it is not a NULL pointer.
To achieve fine-grained control of the pause and resume, only the tracer is disabled and enabled. This avoids the unnecessary complexity and latency caused by manipulating the entire link path.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../hwtracing/coresight/coresight-etm-perf.c | 45 ++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index f4cccd68e625..2dcf1809cb7f 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -365,6 +365,18 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, continue; }
+ /* + * If AUX pause feature is enabled but the ETM driver does not + * support the operations, clear this CPU from the mask and + * continue to next one. + */ + if (event->attr.aux_start_paused && + (!source_ops(csdev)->pause_perf || !source_ops(csdev)->resume_perf)) { + dev_err_once(&csdev->dev, "AUX pause is not supported.\n"); + cpumask_clear_cpu(cpu, mask); + continue; + } + /* * No sink provided - look for a default sink for all the ETMs, * where this event can be scheduled. @@ -450,6 +462,15 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, goto out; }
+static int etm_event_resume(struct coresight_device *csdev, + struct etm_ctxt *ctxt) +{ + if (!ctxt->event_data) + return 0; + + return coresight_resume_source(csdev); +} + static void etm_event_start(struct perf_event *event, int flags) { int cpu = smp_processor_id(); @@ -463,6 +484,14 @@ static void etm_event_start(struct perf_event *event, int flags) if (!csdev) goto fail;
+ if (flags & PERF_EF_RESUME) { + if (etm_event_resume(csdev, ctxt) < 0) { + dev_err(&csdev->dev, "Failed to resume ETM event.\n"); + goto fail; + } + return; + } + /* Have we messed up our tracking ? */ if (WARN_ON(ctxt->event_data)) goto fail; @@ -545,6 +574,16 @@ static void etm_event_start(struct perf_event *event, int flags) return; }
+static void etm_event_pause(struct coresight_device *csdev, + struct etm_ctxt *ctxt) +{ + if (!ctxt->event_data) + return; + + /* Stop tracer */ + coresight_pause_source(csdev); +} + static void etm_event_stop(struct perf_event *event, int mode) { int cpu = smp_processor_id(); @@ -555,6 +594,9 @@ static void etm_event_stop(struct perf_event *event, int mode) struct etm_event_data *event_data; struct coresight_path *path;
+ if (mode & PERF_EF_PAUSE) + return etm_event_pause(csdev, ctxt); + /* * If we still have access to the event_data via handle, * confirm that we haven't messed up the tracking. @@ -899,7 +941,8 @@ int __init etm_perf_init(void) int ret;
etm_pmu.capabilities = (PERF_PMU_CAP_EXCLUSIVE | - PERF_PMU_CAP_ITRACE); + PERF_PMU_CAP_ITRACE | + PERF_PMU_CAP_AUX_PAUSE);
etm_pmu.attr_groups = etm_pmu_attr_groups; etm_pmu.task_ctx_nr = perf_sw_context;
Due to sinks like ETR and ETB don't support interrupt handling, the hardware trace data might be lost for continuous running tasks.
This commit takes advantage of the AUX pause for updating trace buffer to mitigate the trace data losing issue.
The per CPU sink has its own interrupt handling. Thus, there will be a race condition between the updating buffer in NMI and sink's interrupt handler. To avoid the race condition, this commit disallows updating buffer on AUX pause for the per CPU sink. Currently, this is only applied for TRBE.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../hwtracing/coresight/coresight-etm-perf.c | 43 ++++++++++++++++++- 1 file changed, 41 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 2dcf1809cb7f..f1551c08ecb2 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -574,14 +574,53 @@ static void etm_event_start(struct perf_event *event, int flags) return; }
-static void etm_event_pause(struct coresight_device *csdev, +static void etm_event_pause(struct perf_event *event, + struct coresight_device *csdev, struct etm_ctxt *ctxt) { + int cpu = smp_processor_id(); + struct coresight_device *sink; + struct perf_output_handle *handle = &ctxt->handle; + struct coresight_path *path; + unsigned long size; + if (!ctxt->event_data) return;
/* Stop tracer */ coresight_pause_source(csdev); + + path = etm_event_cpu_path(ctxt->event_data, cpu); + sink = coresight_get_sink(path); + if (WARN_ON_ONCE(!sink)) + return; + + /* + * The per CPU sink has own interrupt handling, it might have + * race condition with updating buffer on AUX trace pause if + * it is invoked from NMI. To avoid the race condition, + * disallows updating buffer for the per CPU sink case. + */ + if (coresight_is_percpu_sink(sink)) + return; + + if (WARN_ON_ONCE(handle->event != event)) + return; + + if (!sink_ops(sink)->update_buffer) + return; + + size = sink_ops(sink)->update_buffer(sink, handle, + ctxt->event_data->snk_config); + if (READ_ONCE(handle->event)) { + if (!size) + return; + + perf_aux_output_end(handle, size); + perf_aux_output_begin(handle, event); + } else { + WARN_ON_ONCE(size); + } }
static void etm_event_stop(struct perf_event *event, int mode) @@ -595,7 +634,7 @@ static void etm_event_stop(struct perf_event *event, int mode) struct coresight_path *path;
if (mode & PERF_EF_PAUSE) - return etm_event_pause(csdev, ctxt); + return etm_event_pause(event, csdev, ctxt);
/* * If we still have access to the event_data via handle,
This adds description for AUX pause and resume. It gives introduction for what's AUX pause and resume and records usage examples.
Signed-off-by: Leo Yan leo.yan@arm.com --- .../trace/coresight/coresight-perf.rst | 31 +++++++++++++++++++ 1 file changed, 31 insertions(+)
diff --git a/Documentation/trace/coresight/coresight-perf.rst b/Documentation/trace/coresight/coresight-perf.rst index d087aae7d492..30be89320621 100644 --- a/Documentation/trace/coresight/coresight-perf.rst +++ b/Documentation/trace/coresight/coresight-perf.rst @@ -78,6 +78,37 @@ enabled like::
Please refer to the kernel configuration help for more information.
+Fine-grained tracing with AUX pause and resume +---------------------------------------------- + +Arm CoreSight may generate a large amount of hardware trace data, which +will lead to overhead in recording and distract users when reviewing +profiling result. To mitigate the issue of excessive trace data, Perf +provides AUX pause and resume functionality for fine-grained tracing. + +The AUX pause and resume can be triggered by associated events. These +events can be ftrace tracepoints (including static and dynamic +tracepoints) or PMU events (e.g. CPU PMU cycle event). To create a perf +session with AUX pause / resume, three configuration terms are +introduced: + +- "aux-action=start-paused": it is specified for the cs_etm PMU event to + launch in a paused state. +- "aux-action=pause": an associated event is specified with this term + to pause AUX trace. +- "aux-action=resume": an associated event is specified with this term + to resume AUX trace. + +Example for triggering AUX pause and resume with ftrace tracepoints:: + + perf record -e cs_etm/aux-action=start-paused/k,syscalls:sys_enter_openat/aux-action=resume/,syscalls:sys_exit_openat/aux-action=pause/ ls + +Example for triggering AUX pause and resume with PMU event:: + + perf record -a -e cs_etm/aux-action=start-paused/k \ + -e cycles/aux-action=pause,period=10000000/ \ + -e cycles/aux-action=resume,period=1050000/ -- sleep 1 + Perf test - Verify kernel and userspace perf CoreSight work -----------------------------------------------------------