The current TRBE driver operates only in fill mode, where tracing stops at the top of buffer and a maintenance interrupt is raised. Due to interrupt latency, tracing is halted while the program continues to run, resulting in trace data lose.
This series enhances the driver for the trigger mode to mitigate trace discontinuity. The circle buffer mode is introduced to avoid any maintenance interrupts during the snapshot session.
It can be divided into three parts for easier review:
Patches 01~05: Minor refactoring for disabling operations and clearing status. Patches 06~12: Refactor fault action and trace size calculation. Patches 13~19: Support trigger and circle modes. To better utilize the new buffer modes, perf sets the watermark to one-quarter of the buffer size.
This series is applied on coresight-next branch and has been validated on Orion O6 board:
1) The trigger count and wrap mode are verified using Ftrace logs. 2) A new kunit test module verifies limit and count calculations. 3) Basic perf record / script commands and module load/unload have been tested successfully.
Signed-off-by: Leo Yan leo.yan@arm.com --- Leo Yan (19): coresight: trbe: Use helpers for checking errata coresight: trbe: Remove redundant disable operation coresight: trbe: Remove buffer disabling in trbe_handle_overflow() coresight: trbe: Remove set_trbe_disabled() from the enable flow coresight: trbe: Refactor status clearing coresight: trbe: Refactor syndrome decoding coresight: trbe: Refactor AUX flag setting coresight: trbe: Use PERF_AUX_FLAG_PARTIAL instead of PERF_AUX_FLAG_COLLISION coresight: trbe: Add fault action argument to trbe_handle_overflow() coresight: trbe: Always check fault action when updating buffer coresight: trbe: Apply overwrite erratum for only wrap event coresight: trbe: Calculate size for buffer wrapping coresight: trbe: Remove misleading comment coresight: trbe: Refactor compute_trbe_buffer_limit() coresight: trbe: Add static key for bypassing trigger mode coresight: trbe: Support trigger mode coresight: trbe: Enable circle mode for snapshot coresight: trbe: Add kunit tests perf: cs-etm: Set watermark for AUX trace
drivers/hwtracing/coresight/Kconfig | 9 + drivers/hwtracing/coresight/Makefile | 1 + .../coresight/coresight-trbe-kunit-tests.c | 536 +++++++++++++++++++++ drivers/hwtracing/coresight/coresight-trbe.c | 440 +++++++++-------- drivers/hwtracing/coresight/coresight-trbe.h | 111 ++++- tools/perf/arch/arm/util/cs-etm.c | 7 + 6 files changed, 896 insertions(+), 208 deletions(-) --- base-commit: 9e9182cab5ebc3ee7544e60ef08ba19fdf216920 change-id: 20251120-trbe_buffer_refactor_v1-1-8f8023105469
Best regards,
Use the existed helpers for checking errata instead of open coded equivalent.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 293715b4ff0eb0abe30f9b477700ca94f81cd4a2..0ddb3db0213cf0014e29decfb79da68b0a351b31 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -723,7 +723,7 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, * the space we skipped with IGNORE packets. And we are always * guaranteed to have at least a PAGE_SIZE space in the buffer. */ - if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE) && + if (trbe_may_overwrite_in_fill_mode(buf->cpudata) && !WARN_ON(size < overwrite_skip)) __trbe_pad_buf(buf, start_off, overwrite_skip);
@@ -946,7 +946,7 @@ static int trbe_apply_work_around_before_enable(struct trbe_buf *buf) * - At trace collection: * - Pad the 256bytes skipped above again with IGNORE packets. */ - if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE)) { + if (trbe_may_overwrite_in_fill_mode(buf->cpudata)) { if (WARN_ON(!IS_ALIGNED(buf->trbe_write, PAGE_SIZE))) return -EINVAL; buf->trbe_hw_base = buf->trbe_write; @@ -970,7 +970,7 @@ static int trbe_apply_work_around_before_enable(struct trbe_buf *buf) * - Adjust the TRBLIMITR.LIMIT to leave the extra PAGE outside * the TRBE's range (i.e [TRBBASER, TRBLIMITR.LIMI] ). */ - if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_WRITE_OUT_OF_RANGE)) { + if (trbe_may_write_out_of_range(buf->cpudata)) { s64 space = buf->trbe_limit - buf->trbe_write; /* * We must have more than a PAGE_SIZE worth space in the proposed
The trace unit is already disabled when trbe_stop_and_truncate_event() is called, so draining and stopping the buffer in the function is redundant.
Remove the unnecessary disable operation and rename the function to trbe_truncate_event() to better reflect its purpose.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 0ddb3db0213cf0014e29decfb79da68b0a351b31..2f44e4a65e0ee2b2c8fdd06a51ab01fc57f44a4e 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -284,18 +284,10 @@ static void trbe_report_wrap_event(struct perf_output_handle *handle) perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION); }
-static void trbe_stop_and_truncate_event(struct perf_output_handle *handle) +static void trbe_truncate_event(struct perf_output_handle *handle) { struct trbe_buf *buf = etm_perf_sink_config(handle);
- /* - * We cannot proceed with the buffer collection and we - * do not have any data for the current session. The - * etm_perf driver expects to close out the aux_buffer - * at event_stop(). So disable the TRBE here and leave - * the update_buffer() to return a 0 size. - */ - trbe_drain_and_disable_local(buf->cpudata); perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); perf_aux_output_end(handle, 0); *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL; @@ -1008,7 +1000,7 @@ static int __arm_trbe_enable(struct trbe_buf *buf, trbe_enable_hw(buf); return 0; err: - trbe_stop_and_truncate_event(handle); + trbe_truncate_event(handle); return ret; }
@@ -1169,7 +1161,7 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev) trbe_handle_spurious(handle); break; case TRBE_FAULT_ACT_FATAL: - trbe_stop_and_truncate_event(handle); + trbe_truncate_event(handle); truncated = true; break; }
When trbe_handle_overflow() runs, the buffer has already been disabled and drained, so the duplicate operation is unnecessary. Remove it.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 2f44e4a65e0ee2b2c8fdd06a51ab01fc57f44a4e..f5597bd9b5fba9a8f5053d5823b03380fd468b5c 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -1077,7 +1077,6 @@ static int trbe_handle_overflow(struct perf_output_handle *handle) * is able to detect this with a disconnected handle * (handle->event = NULL). */ - trbe_drain_and_disable_local(buf->cpudata); *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL; return -EINVAL; }
set_trbe_disabled() should never appear in the enable flow, otherwise, it may potentially hide bugs in the disable flow.
Remove set_trbe_disabled() from the enable path.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index f5597bd9b5fba9a8f5053d5823b03380fd468b5c..e426991e2c2c398a9d3982e9d0f7f542e404cbab 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -629,7 +629,6 @@ static void trbe_enable_hw(struct trbe_buf *buf) WARN_ON(buf->trbe_hw_base < buf->trbe_base); WARN_ON(buf->trbe_write < buf->trbe_hw_base); WARN_ON(buf->trbe_write >= buf->trbe_limit); - set_trbe_disabled(buf->cpudata); clr_trbe_status(); set_trbe_base_pointer(buf->trbe_hw_base); set_trbe_write_pointer(buf->trbe_write);
If the driver does not clear the status when disabling the trace buffer unit, stale state will carry over to the next enable, though the driver clears it again on enable.
Explicitly clear status after the trace is disabled in the interrupt handling and when a perf session ends. Keep the status for spurious interrupts for continuous tracing.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index e426991e2c2c398a9d3982e9d0f7f542e404cbab..9e565122816949b37b8ff5e4ba04cfbc317c6f25 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -629,7 +629,6 @@ static void trbe_enable_hw(struct trbe_buf *buf) WARN_ON(buf->trbe_hw_base < buf->trbe_base); WARN_ON(buf->trbe_write < buf->trbe_hw_base); WARN_ON(buf->trbe_write >= buf->trbe_limit); - clr_trbe_status(); set_trbe_base_pointer(buf->trbe_hw_base); set_trbe_write_pointer(buf->trbe_write);
@@ -1036,6 +1035,8 @@ static int arm_trbe_disable(struct coresight_device *csdev) return -EINVAL;
trbe_drain_and_disable_local(cpudata); + clr_trbe_status(); + buf->cpudata = NULL; cpudata->buf = NULL; cpudata->mode = CS_MODE_DISABLED; @@ -1151,6 +1152,10 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev) return IRQ_NONE;
act = trbe_get_fault_act(handle, status); + + if (act != TRBE_FAULT_ACT_SPURIOUS) + clr_trbe_status(); + switch (act) { case TRBE_FAULT_ACT_WRAP: truncated = !!trbe_handle_overflow(handle);
It gives priority to TRBSR_EL1.EA (external abort); an external abort will immediately bail out and return an error.
Next, the syndrome decoding is refactored based on two levels of information: the EC (Event Class) bits and the BSC (Trace Buffer Status Code) bits.
If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to identify the specific trace buffer event. Otherwise, any non-zero TRBSR_EL1.EC is treated as an error.
For error cases, the driver prints an error string and dumps registers for debugging.
No additional checks are required for wrap mode beyond verifying the TRBSR_EL1.WRAP bit, even on units with overwrite errata, as this bit reliably indicates a buffer wrap.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 60 +++++++++++++++++++++------- drivers/hwtracing/coresight/coresight-trbe.h | 8 ++-- 2 files changed, 51 insertions(+), 17 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 9e565122816949b37b8ff5e4ba04cfbc317c6f25..28e2bfa68074f19ccaa4a737d00af577aea818fe 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -643,29 +643,61 @@ static void trbe_enable_hw(struct trbe_buf *buf) static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle, u64 trbsr) { + const char *err_str; int ec = get_trbe_ec(trbsr); int bsc = get_trbe_bsc(trbsr); - struct trbe_buf *buf = etm_perf_sink_config(handle); - struct trbe_cpudata *cpudata = buf->cpudata;
WARN_ON(is_trbe_running(trbsr)); - if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr)) - return TRBE_FAULT_ACT_FATAL;
- if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT)) - return TRBE_FAULT_ACT_FATAL; + if (is_trbe_abort(trbsr)) { + err_str = "External abort"; + goto out_fatal; + }
- /* - * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE, - * it might write data after a WRAP event in the fill mode. - * Thus the check TRBPTR == TRBBASER will not be honored. - */ - if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) && - (trbe_may_overwrite_in_fill_mode(cpudata) || - get_trbe_write_pointer() == get_trbe_base_pointer())) + switch (ec) { + case TRBE_EC_OTHERS: + break; + case TRBE_EC_BUF_MGMT_IMPL: + err_str = "Unexpected implemented management"; + goto out_fatal; + case TRBE_EC_GP_CHECK_FAULT: + err_str = "Granule Protection Check fault"; + goto out_fatal; + case TRBE_EC_STAGE1_ABORT: + err_str = "Stage 1 data abort"; + goto out_fatal; + case TRBE_EC_STAGE2_ABORT: + err_str = "Stage 2 data abort"; + goto out_fatal; + default: + err_str = "Unknown error code"; + goto out_fatal; + } + + switch (bsc) { + case TRBE_BSC_NOT_STOPPED: + break; + case TRBE_BSC_FILLED: + break; + case TRBE_BSC_TRIGGERED: + err_str = "Unexpected trigger status"; + goto out_fatal; + default: + err_str = "Unexpected buffer status code"; + goto out_fatal; + } + + if (is_trbe_wrap(trbsr)) return TRBE_FAULT_ACT_WRAP;
return TRBE_FAULT_ACT_SPURIOUS; + +out_fatal: + pr_err_ratelimited("%s on CPU %d [TRBSR=0x%016llx, TRBPTR=0x%016llx, TRBLIMITR=0x%016llx]\n", + err_str, smp_processor_id(), trbsr, + read_sysreg_s(SYS_TRBPTR_EL1), + read_sysreg_s(SYS_TRBLIMITR_EL1)); + return TRBE_FAULT_ACT_FATAL; }
static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h index 45202c48accec7c86ba56130e2737bc2d1830fae..d7f7cd763c0c7139cf322b7336ee563073e3bea0 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.h +++ b/drivers/hwtracing/coresight/coresight-trbe.h @@ -35,9 +35,11 @@ static inline bool is_trbe_enabled(void) return trblimitr & TRBLIMITR_EL1_E; }
-#define TRBE_EC_OTHERS 0 -#define TRBE_EC_STAGE1_ABORT 36 -#define TRBE_EC_STAGE2_ABORT 37 +#define TRBE_EC_OTHERS 0x0 +#define TRBE_EC_GP_CHECK_FAULT 0X1e +#define TRBE_EC_BUF_MGMT_IMPL 0x1f +#define TRBE_EC_STAGE1_ABORT 0x24 +#define TRBE_EC_STAGE2_ABORT 0x25
static inline int get_trbe_ec(u64 trbsr) {
Rather than spreading AUX flag setting in different functions, use trbe_get_fault_act() as a central place for setting the flag.
Later we will support WRAP mode with continuous trace, so the WRAP event does not necessarily cause the trace discontinuity, change to check the stop status instead.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 38 +++++++++++++--------------- 1 file changed, 17 insertions(+), 21 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 28e2bfa68074f19ccaa4a737d00af577aea818fe..b06885a08e082fd34f68d9588518807b5c47c86e 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -265,25 +265,6 @@ static void trbe_reset_local(struct trbe_cpudata *cpudata) write_sysreg_s(0, SYS_TRBSR_EL1); }
-static void trbe_report_wrap_event(struct perf_output_handle *handle) -{ - /* - * Mark the buffer to indicate that there was a WRAP event by - * setting the COLLISION flag. This indicates to the user that - * the TRBE trace collection was stopped without stopping the - * ETE and thus there might be some amount of trace that was - * lost between the time the WRAP was detected and the IRQ - * was consumed by the CPU. - * - * Setting the TRUNCATED flag would move the event to STOPPED - * state unnecessarily, even when there is space left in the - * ring buffer. Using the COLLISION flag doesn't have this side - * effect. We only set TRUNCATED flag when there is no space - * left in the ring buffer. - */ - perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION); -} - static void trbe_truncate_event(struct perf_output_handle *handle) { struct trbe_buf *buf = etm_perf_sink_config(handle); @@ -687,6 +668,23 @@ static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *hand goto out_fatal; }
+ /* + * Mark the buffer to indicate that there was a WRAP event by + * setting the COLLISION flag. This indicates to the user that + * the TRBE trace collection was stopped without stopping the + * ETE and thus there might be some amount of trace that was + * lost between the time the WRAP was detected and the IRQ + * was consumed by the CPU. + * + * Setting the TRUNCATED flag would move the event to STOPPED + * state unnecessarily, even when there is space left in the + * ring buffer. Using the COLLISION flag doesn't have this side + * effect. We only set TRUNCATED flag when there is no space + * left in the ring buffer. + */ + if (!is_trbe_running(trbsr)) + perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION); + if (is_trbe_wrap(trbsr)) return TRBE_FAULT_ACT_WRAP;
@@ -878,7 +876,6 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev, goto done; }
- trbe_report_wrap_event(handle); wrap = true; }
@@ -1099,7 +1096,6 @@ static int trbe_handle_overflow(struct perf_output_handle *handle) if (buf->snapshot) handle->head += size;
- trbe_report_wrap_event(handle); perf_aux_output_end(handle, size); event_data = perf_aux_output_begin(handle, event); if (!event_data) {
When tracing stops, there is no collision with other samples, so using PERF_AUX_FLAG_COLLISION does not accurately reflect the trace state and may mislead userspace.
Use PERF_AUX_FLAG_PARTIAL instead to indicate that tracing stopped and the record may contain gaps.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 19 ++++++------------- 1 file changed, 6 insertions(+), 13 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index b06885a08e082fd34f68d9588518807b5c47c86e..0caa4a6b437a3aa39fc6bcc72a23711b54f7c598 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -669,21 +669,14 @@ static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *hand }
/* - * Mark the buffer to indicate that there was a WRAP event by - * setting the COLLISION flag. This indicates to the user that - * the TRBE trace collection was stopped without stopping the - * ETE and thus there might be some amount of trace that was - * lost between the time the WRAP was detected and the IRQ - * was consumed by the CPU. - * - * Setting the TRUNCATED flag would move the event to STOPPED - * state unnecessarily, even when there is space left in the - * ring buffer. Using the COLLISION flag doesn't have this side - * effect. We only set TRUNCATED flag when there is no space - * left in the ring buffer. + * Mark the buffer to indicate that the trace is stopped by setting + * the PARTIAL flag. This indicates to the user that the TRBE trace + * collection was stopped without stopping the ETE and thus there + * might be some amount of trace that was lost between the time the + * TRBE event was detected and the IRQ was consumed by the CPU. */ if (!is_trbe_running(trbsr)) - perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION); + perf_aux_output_flag(handle, PERF_AUX_FLAG_PARTIAL);
if (is_trbe_wrap(trbsr)) return TRBE_FAULT_ACT_WRAP;
Add a new argument to trbe_handle_overflow() for the fault action, which is used to compare the wrap event for trace size calculation.
No functional change intended; this is preparation for a later update.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 0caa4a6b437a3aa39fc6bcc72a23711b54f7c598..f56ecdeaa6596afb440e4d53732e08a85f9bf89d 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -1078,14 +1078,15 @@ static void trbe_handle_spurious(struct perf_output_handle *handle) set_trbe_enabled(buf->cpudata, trblimitr); }
-static int trbe_handle_overflow(struct perf_output_handle *handle) +static int trbe_handle_overflow(struct perf_output_handle *handle, + enum trbe_fault_action act) { struct perf_event *event = handle->event; struct trbe_buf *buf = etm_perf_sink_config(handle); unsigned long size; struct etm_event_data *event_data;
- size = trbe_get_trace_size(handle, buf, true); + size = trbe_get_trace_size(handle, buf, act == TRBE_FAULT_ACT_WRAP); if (buf->snapshot) handle->head += size;
@@ -1179,7 +1180,7 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
switch (act) { case TRBE_FAULT_ACT_WRAP: - truncated = !!trbe_handle_overflow(handle); + truncated = !!trbe_handle_overflow(handle, act); break; case TRBE_FAULT_ACT_SPURIOUS: trbe_handle_spurious(handle);
The current code checks the fault action only via the IRQ status bit, which is unreliable due to possible hardware latency.
Move the fault action check out of the IRQ status condition. This also causes the buffer size to be calculated for non-WRAP and fault cases, which is fine since the write pointer is trusted for the calculation.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 17 +++-------------- 1 file changed, 3 insertions(+), 14 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index f56ecdeaa6596afb440e4d53732e08a85f9bf89d..e579ea98523c24d23a0cd265dcdd0a46b52b52da 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -806,7 +806,6 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev, enum trbe_fault_action act; unsigned long size, status; unsigned long flags; - bool wrap = false;
WARN_ON(buf->cpudata != cpudata); WARN_ON(cpudata->cpu != smp_processor_id()); @@ -858,21 +857,11 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev, */ clr_trbe_irq(); isb(); - - act = trbe_get_fault_act(handle, status); - /* - * If this was not due to a WRAP event, we have some - * errors and as such buffer is empty. - */ - if (act != TRBE_FAULT_ACT_WRAP) { - size = 0; - goto done; - } - - wrap = true; }
- size = trbe_get_trace_size(handle, buf, wrap); + act = trbe_get_fault_act(handle, status); + + size = trbe_get_trace_size(handle, buf, act == TRBE_FAULT_ACT_WRAP);
done: local_irq_restore(flags);
The overwrite erratum occurs only on wrap events, so apply the extra wrap condition check in the workaround.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index e579ea98523c24d23a0cd265dcdd0a46b52b52da..2600af12a8fb94bb8c74efda2a101aacd01b0b34 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -714,7 +714,7 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, * 64bytes. Thus we ignore the potential triggering of the erratum * on WRAP and limit the data to LIMIT. */ - if (wrap) + if (wrap && trbe_may_overwrite_in_fill_mode(buf->cpudata)) write = get_trbe_limit_pointer(); else write = get_trbe_write_pointer(); @@ -736,7 +736,7 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, * the space we skipped with IGNORE packets. And we are always * guaranteed to have at least a PAGE_SIZE space in the buffer. */ - if (trbe_may_overwrite_in_fill_mode(buf->cpudata) && + if (wrap && trbe_may_overwrite_in_fill_mode(buf->cpudata) && !WARN_ON(size < overwrite_skip)) __trbe_pad_buf(buf, start_off, overwrite_skip);
Calculate the wrapped size when the end position is less than the start.
If the start equals the end, the "wrap" flag is used to decide whether the buffer is full or empty.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 2600af12a8fb94bb8c74efda2a101aacd01b0b34..48bc03bd339908b5eac9466dc60325ff1b238976 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -696,7 +696,7 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, { u64 write; u64 start_off, end_off; - u64 size; + u64 size, buf_size; u64 overwrite_skip = TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES;
/* @@ -726,11 +726,18 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, */ end_off = write - buf->trbe_base; start_off = PERF_IDX2OFF(handle->head, buf); + buf_size = buf->trbe_limit - buf->trbe_base; + + if (end_off > start_off) + size = end_off - start_off; + else if (end_off < start_off) + size = end_off + buf_size - start_off; + else if (wrap) + /* The start is the same as the end, just wrapped */ + size = buf_size; + else + size = 0;
- if (WARN_ON_ONCE(end_off < start_off)) - return 0; - - size = end_off - start_off; /* * If the TRBE is affected by the following erratum, we must fill * the space we skipped with IGNORE packets. And we are always
Since the rounded-up wakeup address is always higher than the head, the limit cannot be less than the head caused by wakeup capping.
The described scenario is never valid caused by wakeup capping, remove the comment to avoid confusion.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 16 ++-------------- 1 file changed, 2 insertions(+), 14 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 48bc03bd339908b5eac9466dc60325ff1b238976..206eaf103cd94f36220cb6bddd1a78012f5de35a 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -488,10 +488,10 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle) limit = min(limit, round_up(wakeup, PAGE_SIZE));
/* - * There are two situation when this can happen i.e limit is before + * There is a situation when this can happen i.e limit is before * the head and hence TRBE cannot be configured. * - * 1) head < tail (aligned down with PAGE_SIZE) and also they are both + * head < tail (aligned down with PAGE_SIZE) and also they are both * within the same PAGE size range. * * PAGE_SIZE @@ -502,18 +502,6 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle) * |$$$$$$$$$$$$$$$$$$$|========|$$$$$$$| * +------------|------|--------|-------+ * trbe_base trbe_base + nr_pages - * - * 2) head < wakeup (aligned up with PAGE_SIZE) < tail and also both - * head and wakeup are within same PAGE size range. - * - * PAGE_SIZE - * |----------------------| - * - * limit head wakeup tail - * +----|------|-------|--------|-------+ - * |$$$$$$$$$$$|=======|========|$$$$$$$| - * +----|------|-------|--------|-------+ - * trbe_base trbe_base + nr_pages */ if (limit > head) return limit;
The buffer currently operates in fill mode, where tracing stops when it reaches the end of the buffer and a maintenance interrupt is raised. However, due to IRQ latency, trace data may be lost during the window in which tracing is halted but the program continues to run.
To mitigate the issue, this commit enables the trigger count to support buffer maintenance without disabling tracing. This is fulfilled with two modes:
1) Set a trigger count as a watermark and use fill mode to prevent the buffer from being overwritten. Once the count is decremented to zero, an interrupt is raised for buffer maintenance, but the hardware continues collecting trace data until limit.
head watermark tail +----+---------------+---------+-------+ |$$$$| | |$$$$$$$| +----+---------------+---------+-------+ base `---- count ----' limit base + nr_pages
$$$ : Filled trace data
2) Use wrap mode so that tracing continues when reach the top of the buffer. The trigger count is configured as "Stop on trigger" to guard the trace data not to be overwritten.
watermark tail head +--------+-----------+---------+-------+ | | |$$$$$$$$$| | +--------+-----------+---------+-------+ base base + nr_pages limit
`-------> >-- counter ---------'
$$$ : Filled trace data
The modes are selected by comparing the limit with the trigger position.
An extra TRBE_FAULT_ACT_TRIG state is introduced for fault action, it is used to distinguish the trigger event from the WRAP event.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 101 +++++++++++++++++++++------ drivers/hwtracing/coresight/coresight-trbe.h | 14 ++++ 2 files changed, 94 insertions(+), 21 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 8390d0a8fe23d35945610df15f21751279ee37ee..0551ea9b4f8286c156e3c9c7ac94e2ecd3b9dc3f 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -48,6 +48,7 @@ #define TRBE_TRACE_MIN_BUF_SIZE 64
enum trbe_fault_action { + TRBE_FAULT_ACT_TRIG, TRBE_FAULT_ACT_WRAP, TRBE_FAULT_ACT_SPURIOUS, TRBE_FAULT_ACT_FATAL, @@ -67,6 +68,7 @@ struct trbe_buf { unsigned long trbe_hw_base; unsigned long trbe_limit; unsigned long trbe_write; + unsigned long trbe_count; int nr_pages; void **pages; bool snapshot; @@ -478,6 +480,10 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle) if (head < tail) limit = round_down(tail, PAGE_SIZE);
+ /* If trigger mode is enabled, no need to use limit for watermark */ + if (!static_branch_unlikely(&trbe_trigger_mode_bypass)) + goto out; + /* * Wakeup may be arbitrarily far into the future. If it's not in the * current generation, either we'll wrap before hitting it, or it's @@ -495,6 +501,7 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle) if (handle->wakeup < (handle->head + handle->size) && head <= wakeup) limit = min(limit, round_up(wakeup, PAGE_SIZE));
+out: /* * There is a situation when this can happen i.e limit is before * the head and hence TRBE cannot be configured. @@ -518,6 +525,39 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle) return 0; }
+static u64 __trbe_normal_trigger_count(struct perf_output_handle *handle) +{ + struct trbe_buf *buf = etm_perf_sink_config(handle); + struct trbe_cpudata *cpudata = buf->cpudata; + u64 limit, head, wakeup; + u64 count = 0; + + if (static_branch_unlikely(&trbe_trigger_mode_bypass)) + return 0; + + limit = buf->trbe_limit - buf->trbe_base; + head = PERF_IDX2OFF(handle->head, buf); + wakeup = PERF_IDX2OFF(handle->wakeup, buf); + + /* Set the count to guard the end of free buffer after wrap around */ + if (limit == buf->nr_pages * PAGE_SIZE && (head + handle->size) > limit) + count = handle->size; + + /* + * If the watermark is less than the limit, use the trigger count for + * the watermark maintenance. + */ + if (handle->wakeup < (handle->head + handle->size) && head <= wakeup) { + u64 wakeup_count = + round_up(wakeup - head, cpudata->trbe_hw_align); + + if (head + wakeup_count < limit) + count = wakeup_count; + } + + return count; +} + static int trbe_normal_offset(struct perf_output_handle *handle) { struct trbe_buf *buf = etm_perf_sink_config(handle); @@ -542,6 +582,7 @@ static int trbe_normal_offset(struct perf_output_handle *handle) return -ENOSPC;
buf->trbe_limit = buf->trbe_base + limit; + buf->trbe_count = __trbe_normal_trigger_count(handle); return 0; }
@@ -594,24 +635,40 @@ static void set_trbe_limit_pointer_enabled(struct trbe_buf *buf) trblimitr &= ~TRBLIMITR_EL1_TM_MASK; trblimitr &= ~TRBLIMITR_EL1_LIMIT_MASK;
- /* - * Fill trace buffer mode is used here while configuring the - * TRBE for trace capture. In this particular mode, the trace - * collection is stopped and a maintenance interrupt is raised - * when the current write pointer wraps. This pause in trace - * collection gives the software an opportunity to capture the - * trace data in the interrupt handler, before reconfiguring - * the TRBE. - */ - trblimitr |= (TRBLIMITR_EL1_FM_FILL << TRBLIMITR_EL1_FM_SHIFT) & - TRBLIMITR_EL1_FM_MASK; + if (!buf->trbe_count || + buf->trbe_write + buf->trbe_count == buf->trbe_limit) { + /* + * Fill trace buffer mode is used here while configuring the + * TRBE for trace capture. In this particular mode, the trace + * collection is stopped and a maintenance interrupt is raised + * when the current write pointer wraps. This pause in trace + * collection gives the software an opportunity to capture the + * trace data in the interrupt handler, before reconfiguring + * the TRBE. + */ + trblimitr |= FIELD_PREP(TRBLIMITR_EL1_FM_MASK, TRBLIMITR_EL1_FM_FILL) | + FIELD_PREP(TRBLIMITR_EL1_TM_MASK, TRBLIMITR_EL1_TM_IGNR); + } else if (buf->trbe_write + buf->trbe_count < buf->trbe_limit) { + /* + * Fill mode is used here to stop trace collection and prevent + * the buffer from being overwritten. Trigger mode continues + * trace collection and raises a maintenance interrupt on a + * trigger event, which acts as a watermark for notifying + * userspace. + */ + trblimitr |= FIELD_PREP(TRBLIMITR_EL1_FM_MASK, TRBLIMITR_EL1_FM_FILL) | + FIELD_PREP(TRBLIMITR_EL1_TM_MASK, TRBLIMITR_EL1_TM_IRQ); + } else if (buf->trbe_write + buf->trbe_count > buf->trbe_limit) { + /* + * Wrap buffer mode continues trace collection and raises + * maintenance interrupt on buffer wrap. Trigger mode stops + * trace on trigger event to guard the buffer from being + * overwritten. + */ + trblimitr |= FIELD_PREP(TRBLIMITR_EL1_FM_MASK, TRBLIMITR_EL1_FM_WRAP) | + FIELD_PREP(TRBLIMITR_EL1_TM_MASK, TRBLIMITR_EL1_TM_STOP); + }
- /* - * Trigger mode is not used here while configuring the TRBE for - * the trace capture. Hence just keep this in the ignore mode. - */ - trblimitr |= (TRBLIMITR_EL1_TM_IGNR << TRBLIMITR_EL1_TM_SHIFT) & - TRBLIMITR_EL1_TM_MASK; trblimitr |= (addr & PAGE_MASK); set_trbe_enabled(buf->cpudata, trblimitr); } @@ -623,6 +680,7 @@ static void trbe_enable_hw(struct trbe_buf *buf) WARN_ON(buf->trbe_write >= buf->trbe_limit); set_trbe_base_pointer(buf->trbe_hw_base); set_trbe_write_pointer(buf->trbe_write); + set_trbe_trigger_count(buf->trbe_count);
/* * Synchronize all the register updates @@ -639,8 +697,6 @@ static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *hand int ec = get_trbe_ec(trbsr); int bsc = get_trbe_bsc(trbsr);
- WARN_ON(is_trbe_running(trbsr)); - if (is_trbe_abort(trbsr)) { err_str = "External abort"; goto out_fatal; @@ -672,8 +728,7 @@ static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *hand case TRBE_BSC_FILLED: break; case TRBE_BSC_TRIGGERED: - err_str = "Unexpected trigger status"; - goto out_fatal; + break; default: err_str = "Unexpected buffer status code"; goto out_fatal; @@ -692,6 +747,9 @@ static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *hand if (is_trbe_wrap(trbsr)) return TRBE_FAULT_ACT_WRAP;
+ if (is_trbe_trg(trbsr)) + return TRBE_FAULT_ACT_TRIG; + return TRBE_FAULT_ACT_SPURIOUS;
out_fatal: @@ -1180,6 +1238,7 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev) clr_trbe_status();
switch (act) { + case TRBE_FAULT_ACT_TRIG: case TRBE_FAULT_ACT_WRAP: truncated = !!trbe_handle_overflow(handle, act); break; diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h index d7f7cd763c0c7139cf322b7336ee563073e3bea0..4c65d164a946ec9860825e7564196745b60d730b 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.h +++ b/drivers/hwtracing/coresight/coresight-trbe.h @@ -114,6 +114,20 @@ static inline void set_trbe_write_pointer(unsigned long addr) write_sysreg_s(addr, SYS_TRBPTR_EL1); }
+static inline void set_trbe_trigger_count(unsigned long count) +{ + u64 trbsr; + + write_sysreg_s(count, SYS_TRBTRG_EL1); + + /* TRBSR_EL1.TRG has been cleared in clr_trbe_status() */ + if (!count) + return; + + trbsr = read_sysreg_s(SYS_TRBSR_EL1); + write_sysreg_s(trbsr | TRBSR_EL1_TRG, SYS_TRBSR_EL1); +} + static inline unsigned long get_trbe_limit_pointer(void) { u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
A snapshot session captures trace data only when userspace receives a signal and disables tracing. Before disabling, enable circle mode instead of fill mode to avoid unnecessary interrupt handling.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 0551ea9b4f8286c156e3c9c7ac94e2ecd3b9dc3f..ee9993d518d2a41f0d709b7d0690b2dfe0bef2d9 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -635,6 +635,12 @@ static void set_trbe_limit_pointer_enabled(struct trbe_buf *buf) trblimitr &= ~TRBLIMITR_EL1_TM_MASK; trblimitr &= ~TRBLIMITR_EL1_LIMIT_MASK;
+ if (buf->snapshot) { + trblimitr |= FIELD_PREP(TRBLIMITR_EL1_FM_MASK, TRBLIMITR_EL1_FM_CBUF) | + FIELD_PREP(TRBLIMITR_EL1_TM_MASK, TRBLIMITR_EL1_TM_IGNR); + goto enable_trace_buf; + } + if (!buf->trbe_count || buf->trbe_write + buf->trbe_count == buf->trbe_limit) { /* @@ -669,6 +675,7 @@ static void set_trbe_limit_pointer_enabled(struct trbe_buf *buf) FIELD_PREP(TRBLIMITR_EL1_TM_MASK, TRBLIMITR_EL1_TM_STOP); }
+enable_trace_buf: trblimitr |= (addr & PAGE_MASK); set_trbe_enabled(buf->cpudata, trblimitr); }
Refactor compute_trbe_buffer_limit() to perform the computation and handle failures. The return type is changed from a limit offset to an error number (0 is for success).
This refactoring is for future extensions, such as calculating additional values (e.g., the trigger count). No functional changes are introduced.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 44 +++++++++++++++++----------- 1 file changed, 27 insertions(+), 17 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 206eaf103cd94f36220cb6bddd1a78012f5de35a..941aa46e9b11f60c707eb40093964de454a3fd83 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -332,7 +332,7 @@ static void trbe_pad_buf(struct perf_output_handle *handle, int len) perf_aux_output_skip(handle, len); }
-static unsigned long trbe_snapshot_offset(struct perf_output_handle *handle) +static int trbe_snapshot_offset(struct perf_output_handle *handle) { struct trbe_buf *buf = etm_perf_sink_config(handle);
@@ -341,7 +341,8 @@ static unsigned long trbe_snapshot_offset(struct perf_output_handle *handle) * the decoder to reset in case of an overflow or corruption. * So we can use the entire buffer for the snapshot mode. */ - return buf->nr_pages * PAGE_SIZE; + buf->trbe_limit = buf->trbe_base + buf->nr_pages * PAGE_SIZE; + return 0; }
static u64 trbe_min_trace_buf_size(struct perf_output_handle *handle) @@ -510,7 +511,7 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle) return 0; }
-static unsigned long trbe_normal_offset(struct perf_output_handle *handle) +static int trbe_normal_offset(struct perf_output_handle *handle) { struct trbe_buf *buf = etm_perf_sink_config(handle); u64 limit = __trbe_normal_offset(handle); @@ -529,19 +530,34 @@ static unsigned long trbe_normal_offset(struct perf_output_handle *handle) limit = __trbe_normal_offset(handle); head = PERF_IDX2OFF(handle->head, buf); } - return limit; + + if (!limit) + return -ENOSPC; + + buf->trbe_limit = buf->trbe_base + limit; + return 0; }
-static unsigned long compute_trbe_buffer_limit(struct perf_output_handle *handle) +static int trbe_compute_next(struct perf_output_handle *handle) { struct trbe_buf *buf = etm_perf_sink_config(handle); - unsigned long offset; + int ret; + + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
if (buf->snapshot) - offset = trbe_snapshot_offset(handle); + ret = trbe_snapshot_offset(handle); else - offset = trbe_normal_offset(handle); - return buf->trbe_base + offset; + ret = trbe_normal_offset(handle); + + if (ret) + return ret; + + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf); + + /* Set the base of the TRBE to the buffer base */ + buf->trbe_hw_base = buf->trbe_base; + return 0; }
static void clr_trbe_status(void) @@ -986,15 +1002,9 @@ static int __arm_trbe_enable(struct trbe_buf *buf, { int ret = 0;
- perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW); - buf->trbe_limit = compute_trbe_buffer_limit(handle); - buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf); - if (buf->trbe_limit == buf->trbe_base) { - ret = -ENOSPC; + ret = trbe_compute_next(handle); + if (ret) goto err; - } - /* Set the base of the TRBE to the buffer base */ - buf->trbe_hw_base = buf->trbe_base;
ret = trbe_apply_work_around_before_enable(buf); if (ret)
Add tests to verify the calculation of the limit and trigger count.
Because trigger mode can be disabled, provide two test suites: one with trigger mode enabled and one with it disabled.
The cpudata structure is initialized by the test stub, so move its definition into the header for including.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/Kconfig | 9 + drivers/hwtracing/coresight/Makefile | 1 + .../coresight/coresight-trbe-kunit-tests.c | 536 +++++++++++++++++++++ drivers/hwtracing/coresight/coresight-trbe.c | 112 +---- drivers/hwtracing/coresight/coresight-trbe.h | 89 ++++ 5 files changed, 660 insertions(+), 87 deletions(-)
diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig index 6a4239ebb582e95f0ebe8e9c8738a726f27f60a1..f5758563c0090141cdca67f16e7b7b32e7c75bb8 100644 --- a/drivers/hwtracing/coresight/Kconfig +++ b/drivers/hwtracing/coresight/Kconfig @@ -214,6 +214,15 @@ config CORESIGHT_TRBE To compile this driver as a module, choose M here: the module will be called coresight-trbe.
+config CORESIGHT_TRBE_KUNIT_TESTS + tristate "Enable Coresight TRBE unit tests" + depends on KUNIT + depends on CORESIGHT_TRBE + default KUNIT_ALL_TESTS + help + Enable Coresight TRBE unit tests. Only useful for development and not + intended for production. + config ULTRASOC_SMB tristate "Ultrasoc system memory buffer drivers" depends on ACPI || COMPILE_TEST diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile index ab16d06783a572ea1308dfb3a30c96df9e5ffdb7..f8961f6883d167bc2c4bca8008eceb08c3c3a0e9 100644 --- a/drivers/hwtracing/coresight/Makefile +++ b/drivers/hwtracing/coresight/Makefile @@ -57,3 +57,4 @@ obj-$(CONFIG_CORESIGHT_DUMMY) += coresight-dummy.o obj-$(CONFIG_CORESIGHT_CTCU) += coresight-ctcu.o coresight-ctcu-y := coresight-ctcu-core.o obj-$(CONFIG_CORESIGHT_KUNIT_TESTS) += coresight-kunit-tests.o +obj-$(CONFIG_CORESIGHT_TRBE_KUNIT_TESTS) += coresight-trbe-kunit-tests.o diff --git a/drivers/hwtracing/coresight/coresight-trbe-kunit-tests.c b/drivers/hwtracing/coresight/coresight-trbe-kunit-tests.c new file mode 100644 index 0000000000000000000000000000000000000000..836f76dce155d533f9076e85dc97ba25221b7bbf --- /dev/null +++ b/drivers/hwtracing/coresight/coresight-trbe-kunit-tests.c @@ -0,0 +1,536 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <kunit/test.h> +#include <kunit/device.h> +#include <linux/coresight.h> + +#include "coresight-priv.h" +#include "coresight-trbe.h" + +MODULE_IMPORT_NS("EXPORTED_FOR_KUNIT_TESTING"); + +static void test_compute_offset(struct kunit *test) +{ + struct perf_output_handle handle = { 0 }; + struct trbe_buf buf = { 0 }; + struct trbe_cpudata cpudata = { .trbe_align = PAGE_SIZE }; + unsigned long limit; + + if (!static_branch_unlikely(&trbe_trigger_mode_bypass)) + return; + + cpudata.trbe_hw_align = 1; + + buf.nr_pages = SZ_1M / SZ_4K; + buf.cpudata = &cpudata; + + handle.rb = (void *)&buf; + + /* + * ### : Free space, $$$ : Filled space + * + * |################|################| + * `head `wakeup + * `tail `limit + */ + handle.head = 0; + handle.size = SZ_1M; + handle.wakeup = SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M / 2); + + /* + * |################|################| + * `head `wakeup `tail + * `limit + */ + handle.head = 0; + handle.size = SZ_1M - 1; + handle.wakeup = SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M / 2); + + /* + * |#################################| + * `head `tail + * `wakeup `limit + */ + handle.head = 0; + handle.size = SZ_1M - 1; + handle.wakeup = 0; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, 0); + + /* + * |#################################| + * `head `tail + * `wakeup + * `limit + */ + handle.head = 0; + handle.size = SZ_1M - 1; + handle.wakeup = SZ_1M; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M - SZ_4K); + + /* + * |$$$$$$$$$$$$$$$$|########|#######| + * `head `tail + * `wakeup + * `limit + */ + handle.head = SZ_1M / 2; + handle.size = SZ_1M / 2 - 1; + handle.wakeup = SZ_1M * 3 / 4; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M * 3 / 4); + + /* + * |$$$$$$$$|$$$$$$$|################| + * `head `tail + * `wakeup + * `limit + */ + handle.head = SZ_1M / 2; + handle.size = SZ_1M / 2 - 1; + handle.wakeup = SZ_1M * 1 / 4; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M - SZ_4K); + + /* + * |$$$$$$$$$$$$$$$$|################| + * `head `tail + * `wakeup + * `limit + */ + handle.head = SZ_1M / 2; + handle.size = SZ_1M / 2 - 1; + handle.wakeup = SZ_1M - 1; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M - SZ_4K); + + /* + * |#########|$$$$$$$$$$|########|###| + * `tail `head `wakeup + * `limit + */ + handle.head = SZ_1M * 3 / 4; + handle.size = SZ_1M / 2; + handle.wakeup = handle.head + SZ_1M / 8; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M * 7 / 8); + + /* + * |####|####|$$$$$$$$$$|############| + * `tail `head + * `wakeup + * `limit + */ + handle.head = SZ_1M * 3 / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M + SZ_1M / 8; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M); + + /* + * |#######|########|$$$$$$$$$$$$$$$$| + * `head `wakeup `>tail + * `limit + */ + handle.head = SZ_1M; + handle.wakeup = SZ_1M + SZ_1M / 8; + handle.size = SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M / 8); + + /* + * |#######|$$$$$$$$$$$$$$$$$|#######| + * `tail `head + * `wakeup + * `limit + */ + handle.head = SZ_1M * 3 / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M + SZ_1M / 4; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M); + + /* + * |#######|$$$$$$$$|$$$$$$$$|#######| + * `tail `wakeup `head + * `limit + */ + handle.head = SZ_1M * 3 / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M + SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M); + + /* + * |$$$$$$$|########|########|$$$$$$$| + * `head `wakeup `tail + * `limit + */ + handle.head = SZ_1M / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M / 2); + + /* + * |$$$$$$$|#################|$$$$$$$| + * `head `tail + * `wakeup + * `limit + */ + handle.head = SZ_1M / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M * 3 / 4; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M * 3 / 4); + + /* + * |$$$$$$$|#################|$$$$$$$| + * `wakeup `head `tail + * `limit + */ + handle.head = SZ_1M / 4; + handle.size = SZ_1M / 2; + handle.wakeup = 0; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, SZ_1M * 3 / 4); + + /* + * |$$$$$$|$$$$$$$$$$$$$$$$$$$$$$$$$$| + * `head + * `tail + */ + handle.head = SZ_1M / 4; + handle.size = 0; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, 0); + + /* + * |$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$|#$| + * `head + * `tail + */ + handle.head = SZ_1M - SZ_1K * 2; + handle.size = SZ_1K; + handle.wakeup = 0; + + limit = __trbe_normal_offset(&handle); + KUNIT_ASSERT_EQ(test, limit, 0); +} + +static void test_compute_offset_and_counter(struct kunit *test) +{ + struct perf_output_handle handle = { 0 }; + struct trbe_buf buf = { 0 }; + struct trbe_cpudata cpudata = { .trbe_align = PAGE_SIZE }; + unsigned long limit; + u64 count; + + if (static_branch_unlikely(&trbe_trigger_mode_bypass)) + return; + + cpudata.trbe_hw_align = 1; + + buf.nr_pages = SZ_1M / SZ_4K; + buf.cpudata = &cpudata; + + handle.rb = (void *)&buf; + + /* + * ### : Free space, $$$ : Filled space + * + * |################|################| + * `head `wakeup `limit + * `tail + * `----- count ----' + */ + handle.head = 0; + handle.size = SZ_1M; + handle.wakeup = SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M); + KUNIT_ASSERT_EQ(test, count, SZ_1M / 2); + + /* + * |################|################| + * `head `wakeup `tail + * `limit + * `----- count ----' + */ + handle.head = 0; + handle.size = SZ_1M - 1; + handle.wakeup = SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M - SZ_4K); + KUNIT_ASSERT_EQ(test, count, SZ_1M / 2); + + /* + * |#################################| + * `head `tail + * `wakeup `limit + */ + handle.head = 0; + handle.size = SZ_1M - 1; + handle.wakeup = 0; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M - SZ_4K); + KUNIT_ASSERT_EQ(test, count, 0); + + /* + * |#################################| + * `head `tail + * `wakeup + * `limit + */ + handle.head = 0; + handle.size = SZ_1M - 1; + handle.wakeup = SZ_1M; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M - SZ_4K); + KUNIT_ASSERT_EQ(test, count, 0); + + /* + * |$$$$$$$$$$$$$$$$|########|#######| + * `head `tail + * `wakeup + * `limit + * [ count ] + */ + handle.head = SZ_1M / 2; + handle.size = SZ_1M / 2 - 1; + handle.wakeup = SZ_1M * 3 / 4; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M - SZ_4K); + KUNIT_ASSERT_EQ(test, count, SZ_1M / 4); + + /* + * |$$$$$$$$|$$$$$$$|################| + * `head `tail + * `wakeup + * `limit + */ + handle.head = SZ_1M / 2; + handle.size = SZ_1M / 2 - 1; + handle.wakeup = SZ_1M * 1 / 4; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M - SZ_4K); + KUNIT_ASSERT_EQ(test, count, 0); + + /* + * |$$$$$$$$$$$$$$$$|################| + * `head `tail + * `wakeup + * `limit + */ + handle.head = SZ_1M / 2; + handle.size = SZ_1M / 2 - 1; + handle.wakeup = SZ_1M - 1; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M - SZ_4K); + KUNIT_ASSERT_EQ(test, count, 0); + + /* + * |#########|$$$$$$$$$$|########|###| + * `tail `head `wakeup + * `limit + * [ count ] + */ + handle.head = SZ_1M * 3 / 4; + handle.size = SZ_1M / 2; + handle.wakeup = handle.head + SZ_1M / 8; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M); + KUNIT_ASSERT_EQ(test, count, SZ_1M / 8); + + /* + * |####|####|$$$$$$$$$$|############| + * `tail `head + * `wakeup + * `limit + * [ count >>> + * >>> ] + */ + handle.head = SZ_1M * 3 / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M + SZ_1M / 8; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M); + KUNIT_ASSERT_EQ(test, count, SZ_1M / 2); + + /* + * |#######|########|$$$$$$$$$$$$$$$$| + * `head `wakeup `>tail + * `limit + * [ count ] + */ + handle.head = SZ_1M; + handle.wakeup = SZ_1M + SZ_1M / 8; + handle.size = SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M / 2); + KUNIT_ASSERT_EQ(test, count, SZ_1M / 8); + + /* + * |#######|$$$$$$$$$$$$$$$$$|#######| + * `tail `head + * `wakeup + * `limit + * [ count > + * >>> ] + */ + handle.head = SZ_1M * 3 / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M + SZ_1M / 4; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M); + KUNIT_ASSERT_EQ(test, count, SZ_1M / 2); + + /* + * |#######|$$$$$$$$|$$$$$$$$|#######| + * `tail `wakeup `head + * `limit + * [ count > + * >>> ] + */ + handle.head = SZ_1M * 3 / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M + SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M); + KUNIT_ASSERT_EQ(test, count, SZ_1M / 2); + + /* + * |$$$$$$$|########|########|$$$$$$$| + * `head `wakeup `tail + * `limit + * [ count ] + */ + handle.head = SZ_1M / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M / 2; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M * 3 / 4); + KUNIT_ASSERT_EQ(test, count, SZ_1M / 4); + + /* + * |$$$$$$$|#################|$$$$$$$| + * `head `tail + * `wakeup + * `limit + */ + handle.head = SZ_1M / 4; + handle.size = SZ_1M / 2; + handle.wakeup = SZ_1M * 3 / 4; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M * 3 / 4); + KUNIT_ASSERT_EQ(test, count, 0); + + /* + * |$$$$$$$|#################|$$$$$$$| + * `wakeup `head `tail + * `limit + */ + handle.head = SZ_1M / 4; + handle.size = SZ_1M / 2; + handle.wakeup = 0; + + limit = __trbe_normal_offset(&handle); + buf.trbe_limit = limit; + count = __trbe_normal_trigger_count(&handle); + + KUNIT_ASSERT_EQ(test, limit, SZ_1M * 3 / 4); + KUNIT_ASSERT_EQ(test, count, 0); +} + +static struct kunit_case coresight_trbe_testcases[] = { + KUNIT_CASE(test_compute_offset), + KUNIT_CASE(test_compute_offset_and_counter), + {} +}; + +static struct kunit_suite coresight_trbe_test_suite = { + .name = "coresight_trbe_test_suite", + .test_cases = coresight_trbe_testcases, +}; + +kunit_test_suites(&coresight_trbe_test_suite); +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Leo Yan leo.yan@arm.com"); +MODULE_DESCRIPTION("Arm CoreSight TRBE KUnit tests"); diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index ee9993d518d2a41f0d709b7d0690b2dfe0bef2d9..25d42683ab74b55efa2e19a2d77ab8ae2d68d228 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -54,92 +54,12 @@ enum trbe_fault_action { TRBE_FAULT_ACT_FATAL, };
-struct trbe_buf { - /* - * Even though trbe_base represents vmap() - * mapped allocated buffer's start address, - * it's being as unsigned long for various - * arithmetic and comparision operations & - * also to be consistent with trbe_write & - * trbe_limit sibling pointers. - */ - unsigned long trbe_base; - /* The base programmed into the TRBE */ - unsigned long trbe_hw_base; - unsigned long trbe_limit; - unsigned long trbe_write; - unsigned long trbe_count; - int nr_pages; - void **pages; - bool snapshot; - struct trbe_cpudata *cpudata; -}; - -/* - * TRBE erratum list - * - * The errata are defined in arm64 generic cpu_errata framework. - * Since the errata work arounds could be applied individually - * to the affected CPUs inside the TRBE driver, we need to know if - * a given CPU is affected by the erratum. Unlike the other erratum - * work arounds, TRBE driver needs to check multiple times during - * a trace session. Thus we need a quicker access to per-CPU - * errata and not issue costly this_cpu_has_cap() everytime. - * We keep a set of the affected errata in trbe_cpudata, per TRBE. - * - * We rely on the corresponding cpucaps to be defined for a given - * TRBE erratum. We map the given cpucap into a TRBE internal number - * to make the tracking of the errata lean. - * - * This helps in : - * - Not duplicating the detection logic - * - Streamlined detection of erratum across the system - */ -#define TRBE_WORKAROUND_OVERWRITE_FILL_MODE 0 -#define TRBE_WORKAROUND_WRITE_OUT_OF_RANGE 1 -#define TRBE_NEEDS_DRAIN_AFTER_DISABLE 2 -#define TRBE_NEEDS_CTXT_SYNC_AFTER_ENABLE 3 -#define TRBE_IS_BROKEN 4 - -static int trbe_errata_cpucaps[] = { - [TRBE_WORKAROUND_OVERWRITE_FILL_MODE] = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE, - [TRBE_WORKAROUND_WRITE_OUT_OF_RANGE] = ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE, - [TRBE_NEEDS_DRAIN_AFTER_DISABLE] = ARM64_WORKAROUND_2064142, - [TRBE_NEEDS_CTXT_SYNC_AFTER_ENABLE] = ARM64_WORKAROUND_2038923, - [TRBE_IS_BROKEN] = ARM64_WORKAROUND_1902691, - -1, /* Sentinel, must be the last entry */ -}; - -/* The total number of listed errata in trbe_errata_cpucaps */ -#define TRBE_ERRATA_MAX (ARRAY_SIZE(trbe_errata_cpucaps) - 1) - /* * Safe limit for the number of bytes that may be overwritten * when ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE is triggered. */ #define TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES 256
-/* - * struct trbe_cpudata: TRBE instance specific data - * @trbe_flag - TRBE dirty/access flag support - * @trbe_hw_align - Actual TRBE alignment required for TRBPTR_EL1. - * @trbe_align - Software alignment used for the TRBPTR_EL1. - * @cpu - CPU this TRBE belongs to. - * @mode - Mode of current operation. (perf/disabled) - * @drvdata - TRBE specific drvdata - * @errata - Bit map for the errata on this TRBE. - */ -struct trbe_cpudata { - bool trbe_flag; - u64 trbe_hw_align; - u64 trbe_align; - int cpu; - enum cs_mode mode; - struct trbe_buf *buf; - struct trbe_drvdata *drvdata; - DECLARE_BITMAP(errata, TRBE_ERRATA_MAX); -}; - struct trbe_drvdata { struct trbe_cpudata __percpu *cpudata; struct perf_output_handle * __percpu *handle; @@ -150,7 +70,8 @@ struct trbe_drvdata { struct platform_device *pdev; };
-DEFINE_STATIC_KEY_FALSE(trbe_trigger_mode_bypass); +VISIBLE_IF_KUNIT DEFINE_STATIC_KEY_FALSE(trbe_trigger_mode_bypass); +EXPORT_SYMBOL_IF_KUNIT(trbe_trigger_mode_bypass);
#define trbe_trigger_mode_need_bypass(cpudata) \ (trbe_may_overwrite_in_fill_mode((cpudata)) || \ @@ -333,8 +254,17 @@ static void __trbe_pad_buf(struct trbe_buf *buf, u64 offset, int len)
static void trbe_pad_buf(struct perf_output_handle *handle, int len) { - struct trbe_buf *buf = etm_perf_sink_config(handle); - u64 head = PERF_IDX2OFF(handle->head, buf); + struct trbe_buf *buf; + u64 head; + + if (kunit_get_current_test()) { + handle->head += len; + handle->size -= len; + return; + } + + buf = etm_perf_sink_config(handle); + head = PERF_IDX2OFF(handle->head, buf);
__trbe_pad_buf(buf, head, len); if (!buf->snapshot) @@ -383,9 +313,11 @@ static u64 trbe_min_trace_buf_size(struct perf_output_handle *handle) * %%%% - Free area, disabled, trace will not be written * ==== - Free area, padded with ETE_IGNORE_PACKET, trace will be skipped */ -static unsigned long __trbe_normal_offset(struct perf_output_handle *handle) +VISIBLE_IF_KUNIT +unsigned long __trbe_normal_offset(struct perf_output_handle *handle) { - struct trbe_buf *buf = etm_perf_sink_config(handle); + struct trbe_buf *buf = + kunit_get_current_test() ? handle->rb : etm_perf_sink_config(handle); struct trbe_cpudata *cpudata = buf->cpudata; const u64 bufsize = buf->nr_pages * PAGE_SIZE; u64 limit = bufsize; @@ -525,9 +457,13 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle) return 0; }
-static u64 __trbe_normal_trigger_count(struct perf_output_handle *handle) +EXPORT_SYMBOL_IF_KUNIT(__trbe_normal_offset); + +VISIBLE_IF_KUNIT +u64 __trbe_normal_trigger_count(struct perf_output_handle *handle) { - struct trbe_buf *buf = etm_perf_sink_config(handle); + struct trbe_buf *buf = + kunit_get_current_test() ? handle->rb : etm_perf_sink_config(handle); struct trbe_cpudata *cpudata = buf->cpudata; u64 limit, head, wakeup; u64 count = 0; @@ -558,6 +494,8 @@ static u64 __trbe_normal_trigger_count(struct perf_output_handle *handle) return count; }
+EXPORT_SYMBOL_IF_KUNIT(__trbe_normal_trigger_count); + static int trbe_normal_offset(struct perf_output_handle *handle) { struct trbe_buf *buf = etm_perf_sink_config(handle); diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h index 4c65d164a946ec9860825e7564196745b60d730b..8f90836b5f71d44213699ec1915d59864863a4db 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.h +++ b/drivers/hwtracing/coresight/coresight-trbe.h @@ -17,8 +17,91 @@ #include <linux/platform_device.h> #include <linux/smp.h>
+#include <kunit/test-bug.h> +#include <kunit/visibility.h> + #include "coresight-etm-perf.h"
+struct trbe_buf { + /* + * Even though trbe_base represents vmap() + * mapped allocated buffer's start address, + * it's being as unsigned long for various + * arithmetic and comparision operations & + * also to be consistent with trbe_write & + * trbe_limit sibling pointers. + */ + unsigned long trbe_base; + /* The base programmed into the TRBE */ + unsigned long trbe_hw_base; + unsigned long trbe_limit; + unsigned long trbe_write; + unsigned long trbe_count; + int nr_pages; + void **pages; + bool snapshot; + struct trbe_cpudata *cpudata; +}; + +/* + * TRBE erratum list + * + * The errata are defined in arm64 generic cpu_errata framework. + * Since the errata work arounds could be applied individually + * to the affected CPUs inside the TRBE driver, we need to know if + * a given CPU is affected by the erratum. Unlike the other erratum + * work arounds, TRBE driver needs to check multiple times during + * a trace session. Thus we need a quicker access to per-CPU + * errata and not issue costly this_cpu_has_cap() everytime. + * We keep a set of the affected errata in trbe_cpudata, per TRBE. + * + * We rely on the corresponding cpucaps to be defined for a given + * TRBE erratum. We map the given cpucap into a TRBE internal number + * to make the tracking of the errata lean. + * + * This helps in : + * - Not duplicating the detection logic + * - Streamlined detection of erratum across the system + */ +#define TRBE_WORKAROUND_OVERWRITE_FILL_MODE 0 +#define TRBE_WORKAROUND_WRITE_OUT_OF_RANGE 1 +#define TRBE_NEEDS_DRAIN_AFTER_DISABLE 2 +#define TRBE_NEEDS_CTXT_SYNC_AFTER_ENABLE 3 +#define TRBE_IS_BROKEN 4 + +static int trbe_errata_cpucaps[] = { + [TRBE_WORKAROUND_OVERWRITE_FILL_MODE] = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE, + [TRBE_WORKAROUND_WRITE_OUT_OF_RANGE] = ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE, + [TRBE_NEEDS_DRAIN_AFTER_DISABLE] = ARM64_WORKAROUND_2064142, + [TRBE_NEEDS_CTXT_SYNC_AFTER_ENABLE] = ARM64_WORKAROUND_2038923, + [TRBE_IS_BROKEN] = ARM64_WORKAROUND_1902691, + -1, /* Sentinel, must be the last entry */ +}; + +/* The total number of listed errata in trbe_errata_cpucaps */ +#define TRBE_ERRATA_MAX (ARRAY_SIZE(trbe_errata_cpucaps) - 1) + +/* + * struct trbe_cpudata: TRBE instance specific data + * @trbe_flag - TRBE dirty/access flag support + * @trbe_hw_align - Actual TRBE alignment required for TRBPTR_EL1. + * @trbe_align - Software alignment used for the TRBPTR_EL1. + * @cpu - CPU this TRBE belongs to. + * @mode - Mode of current operation. (perf/disabled) + * @drvdata - TRBE specific drvdata + * @errata - Bit map for the errata on this TRBE. + */ +struct trbe_cpudata { + bool trbe_flag; + u64 trbe_hw_align; + u64 trbe_align; + int cpu; + enum cs_mode mode; + struct trbe_buf *buf; + struct trbe_drvdata *drvdata; + DECLARE_BITMAP(errata, TRBE_ERRATA_MAX); +}; + static inline bool is_trbe_available(void) { u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1); @@ -153,3 +236,9 @@ static inline void set_trbe_base_pointer(unsigned long addr) WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE)); write_sysreg_s(addr, SYS_TRBBASER_EL1); } + +#if IS_ENABLED(CONFIG_KUNIT) +DECLARE_STATIC_KEY_FALSE(trbe_trigger_mode_bypass); +unsigned long __trbe_normal_offset(struct perf_output_handle *handle); +u64 __trbe_normal_trigger_count(struct perf_output_handle *handle); +#endif
To avoid complexity, if any CPU in the system has the fill mode erratum, the driver will not use trigger mode, it simply rolls back to fill mode only and apply the workaround on it.
Add a static key to control trigger mode bypassing. During each CPU probe, the key is enabled when the relevant erratum is detected.
Signed-off-by: Leo Yan leo.yan@arm.com --- drivers/hwtracing/coresight/coresight-trbe.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 941aa46e9b11f60c707eb40093964de454a3fd83..8390d0a8fe23d35945610df15f21751279ee37ee 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -17,6 +17,7 @@
#include <asm/barrier.h> #include <asm/cpufeature.h> +#include <linux/jump_label.h> #include <linux/kvm_host.h> #include <linux/vmalloc.h>
@@ -147,6 +148,12 @@ struct trbe_drvdata { struct platform_device *pdev; };
+DEFINE_STATIC_KEY_FALSE(trbe_trigger_mode_bypass); + +#define trbe_trigger_mode_need_bypass(cpudata) \ + (trbe_may_overwrite_in_fill_mode((cpudata)) || \ + trbe_may_write_out_of_range((cpudata))) + static void trbe_check_errata(struct trbe_cpudata *cpudata) { int i; @@ -1306,6 +1313,14 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
dev_set_drvdata(&trbe_csdev->dev, cpudata); coresight_set_percpu_sink(cpu, trbe_csdev); + + /* + * If any CPU cannot use trigger mode, bypass the mode globally for + * consistent tracing behaviour. + */ + if (trbe_trigger_mode_need_bypass(cpudata)) + static_branch_enable(&trbe_trigger_mode_bypass); + return; cpu_clear: cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
The default watermark is half of the total buffer size. In many cases, the tool can not be notified with sufficient free space, especially when profiling with small AUX buffer (e.g., 64KiB).
Setting watermark to quarter of the buffer to notifies the tool to read data earlier and prevents the data loss.
Signed-off-by: Leo Yan leo.yan@arm.com --- tools/perf/arch/arm/util/cs-etm.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index ea891d12f8f40beebf8dee1d3db71cad701f5666..649b8b0d0f92b4af45fb97db9da3c5ccf24a978b 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -424,6 +424,13 @@ static int cs_etm_recording_options(struct auxtrace_record *itr, pr_debug2("%s snapshot size: %zu\n", CORESIGHT_ETM_PMU_NAME, opts->auxtrace_snapshot_size);
+ if (!opts->auxtrace_snapshot_mode && !opts->auxtrace_sample_mode) { + size_t aw = opts->auxtrace_mmap_pages * (size_t)page_size / 4; + u32 aux_watermark = aw > UINT_MAX ? UINT_MAX : aw; + + cs_etm_evsel->core.attr.aux_watermark = aux_watermark; + } + /* * To obtain the auxtrace buffer file descriptor, the auxtrace * event must come first.
Hi Leo
A couple of minor nits below. Otherwise looks like a nice cleanup
On 01/12/2025 11:21, Leo Yan wrote:
It gives priority to TRBSR_EL1.EA (external abort); an external abort will immediately bail out and return an error.
Next, the syndrome decoding is refactored based on two levels of information: the EC (Event Class) bits and the BSC (Trace Buffer Status Code) bits.
If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to identify the specific trace buffer event. Otherwise, any non-zero TRBSR_EL1.EC is treated as an error.
For error cases, the driver prints an error string and dumps registers for debugging.
minor nit: Please avoid describing the code, which cod does well. Briefly mention what you are doing, giving any additional contexts that may not be evident from the code.
e.g:
"Add support for decoding the syndrome for TRBE, providing a verbose description of the code for Fatal/unhandled events. While at it add the new definitions from the latest Arm ARM for EC and BSC"
No additional checks are required for wrap mode beyond verifying the TRBSR_EL1.WRAP bit, even on units with overwrite errata, as this bit
reliably indicates a buffer wrap.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 60 +++++++++++++++++++++------- drivers/hwtracing/coresight/coresight-trbe.h | 8 ++-- 2 files changed, 51 insertions(+), 17 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 9e565122816949b37b8ff5e4ba04cfbc317c6f25..28e2bfa68074f19ccaa4a737d00af577aea818fe 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -643,29 +643,61 @@ static void trbe_enable_hw(struct trbe_buf *buf) static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle, u64 trbsr) {
- const char *err_str; int ec = get_trbe_ec(trbsr); int bsc = get_trbe_bsc(trbsr);
- struct trbe_buf *buf = etm_perf_sink_config(handle);
- struct trbe_cpudata *cpudata = buf->cpudata;
WARN_ON(is_trbe_running(trbsr));
- if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
return TRBE_FAULT_ACT_FATAL;
- if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
return TRBE_FAULT_ACT_FATAL;
- if (is_trbe_abort(trbsr)) {
err_str = "External abort";goto out_fatal;- }
- /*
* If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,* it might write data after a WRAP event in the fill mode.* Thus the check TRBPTR == TRBBASER will not be honored.*/- if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) &&
(trbe_may_overwrite_in_fill_mode(cpudata) ||get_trbe_write_pointer() == get_trbe_base_pointer()))
- switch (ec) {
- case TRBE_EC_OTHERS:
break;- case TRBE_EC_BUF_MGMT_IMPL:
err_str = "Unexpected implemented management";
minor nit: "Unexpected IMPDEF buffer management event"
Suzuki
goto out_fatal;- case TRBE_EC_GP_CHECK_FAULT:
err_str = "Granule Protection Check fault";goto out_fatal;- case TRBE_EC_STAGE1_ABORT:
err_str = "Stage 1 data abort";goto out_fatal;- case TRBE_EC_STAGE2_ABORT:
err_str = "Stage 2 data abort";goto out_fatal;- default:
err_str = "Unknown error code";goto out_fatal;- }
- switch (bsc) {
- case TRBE_BSC_NOT_STOPPED:
break;- case TRBE_BSC_FILLED:
break;- case TRBE_BSC_TRIGGERED:
err_str = "Unexpected trigger status";goto out_fatal;- default:
err_str = "Unexpected buffer status code";goto out_fatal;- }
- if (is_trbe_wrap(trbsr)) return TRBE_FAULT_ACT_WRAP;
return TRBE_FAULT_ACT_SPURIOUS;
+out_fatal:
- pr_err_ratelimited("%s on CPU %d [TRBSR=0x%016llx, TRBPTR=0x%016llx, TRBLIMITR=0x%016llx]\n",
err_str, smp_processor_id(), trbsr,read_sysreg_s(SYS_TRBPTR_EL1),read_sysreg_s(SYS_TRBLIMITR_EL1));- return TRBE_FAULT_ACT_FATAL; }
static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h index 45202c48accec7c86ba56130e2737bc2d1830fae..d7f7cd763c0c7139cf322b7336ee563073e3bea0 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.h +++ b/drivers/hwtracing/coresight/coresight-trbe.h @@ -35,9 +35,11 @@ static inline bool is_trbe_enabled(void) return trblimitr & TRBLIMITR_EL1_E; } -#define TRBE_EC_OTHERS 0 -#define TRBE_EC_STAGE1_ABORT 36 -#define TRBE_EC_STAGE2_ABORT 37 +#define TRBE_EC_OTHERS 0x0 +#define TRBE_EC_GP_CHECK_FAULT 0X1e +#define TRBE_EC_BUF_MGMT_IMPL 0x1f +#define TRBE_EC_STAGE1_ABORT 0x24 +#define TRBE_EC_STAGE2_ABORT 0x25 static inline int get_trbe_ec(u64 trbsr) {
On 01/12/2025 11:21, Leo Yan wrote:
Rather than spreading AUX flag setting in different functions, use trbe_get_fault_act() as a central place for setting the flag.
Later we will support WRAP mode with continuous trace, so the WRAP event does not necessarily cause the trace discontinuity, change to check the stop status instead.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 38 +++++++++++++--------------- 1 file changed, 17 insertions(+), 21 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 28e2bfa68074f19ccaa4a737d00af577aea818fe..b06885a08e082fd34f68d9588518807b5c47c86e 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -265,25 +265,6 @@ static void trbe_reset_local(struct trbe_cpudata *cpudata) write_sysreg_s(0, SYS_TRBSR_EL1); } -static void trbe_report_wrap_event(struct perf_output_handle *handle) -{
- /*
* Mark the buffer to indicate that there was a WRAP event by* setting the COLLISION flag. This indicates to the user that* the TRBE trace collection was stopped without stopping the* ETE and thus there might be some amount of trace that was* lost between the time the WRAP was detected and the IRQ* was consumed by the CPU.** Setting the TRUNCATED flag would move the event to STOPPED* state unnecessarily, even when there is space left in the* ring buffer. Using the COLLISION flag doesn't have this side* effect. We only set TRUNCATED flag when there is no space* left in the ring buffer.*/- perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION);
-}
- static void trbe_truncate_event(struct perf_output_handle *handle) { struct trbe_buf *buf = etm_perf_sink_config(handle);
@@ -687,6 +668,23 @@ static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *hand goto out_fatal; }
- /*
* Mark the buffer to indicate that there was a WRAP event by* setting the COLLISION flag. This indicates to the user that* the TRBE trace collection was stopped without stopping the* ETE and thus there might be some amount of trace that was* lost between the time the WRAP was detected and the IRQ* was consumed by the CPU.** Setting the TRUNCATED flag would move the event to STOPPED* state unnecessarily, even when there is space left in the* ring buffer. Using the COLLISION flag doesn't have this side* effect. We only set TRUNCATED flag when there is no space* left in the ring buffer.*/- if (!is_trbe_running(trbsr))
Is this really required ? There is a WARN_ON(is_trbe_running(trbsr)) above ?
perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION);
Also, setting this would unnecessarily mark COLLISION while stopping the TRBE buffer normally (when called from arm_trbe_update_buffer), when there is no WRAP event ?
- if (is_trbe_wrap(trbsr)) return TRBE_FAULT_ACT_WRAP;
Couldn't this be :
/* Move the big fat comment here */ if (is_trbe_wrap(trbsr)) { perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION); return TRBE_FAULT_ACT_WRAP; }
Suzuki
On 01/12/2025 11:22, Leo Yan wrote:
The current code checks the fault action only via the IRQ status bit, which is unreliable due to possible hardware latency.
Move the fault action check out of the IRQ status condition. This also causes the buffer size to be calculated for non-WRAP and fault cases, which is fine since the write pointer is trusted for the calculation.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 17 +++-------------- 1 file changed, 3 insertions(+), 14 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index f56ecdeaa6596afb440e4d53732e08a85f9bf89d..e579ea98523c24d23a0cd265dcdd0a46b52b52da 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -806,7 +806,6 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev, enum trbe_fault_action act; unsigned long size, status; unsigned long flags;
- bool wrap = false;
WARN_ON(buf->cpudata != cpudata); WARN_ON(cpudata->cpu != smp_processor_id()); @@ -858,21 +857,11 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev, */ clr_trbe_irq(); isb();
act = trbe_get_fault_act(handle, status);/** If this was not due to a WRAP event, we have some* errors and as such buffer is empty.*/if (act != TRBE_FAULT_ACT_WRAP) {size = 0;goto done;} }wrap = true;
- size = trbe_get_trace_size(handle, buf, wrap);
- act = trbe_get_fault_act(handle, status);
Also act is valid only when there is an IRQ ?
- size = trbe_get_trace_size(handle, buf, act == TRBE_FAULT_ACT_WRAP);
We have certain assumptions in trbe_get_trace_size(), which may be broken with this change.
e.g., if we get a fatal error, we don't detect that the buffer is empty and may trigger a WARN_ON() on systems with CPU erratum as below ?
/* * If the TRBE is affected by the following erratum, we must fill * the space we skipped with IGNORE packets. And we are always * guaranteed to have at least a PAGE_SIZE space in the buffer. */ if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE) && !WARN_ON(size < overwrite_skip)) __trbe_pad_buf(buf, start_off, overwrite_skip);
Suzuki
done: local_irq_restore(flags);
On 01/12/2025 11:22, Leo Yan wrote:
The overwrite erratum occurs only on wrap events, so apply the extra wrap condition check in the workaround.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index e579ea98523c24d23a0cd265dcdd0a46b52b52da..2600af12a8fb94bb8c74efda2a101aacd01b0b34 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -714,7 +714,7 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, * 64bytes. Thus we ignore the potential triggering of the erratum * on WRAP and limit the data to LIMIT. */
- if (wrap)
- if (wrap && trbe_may_overwrite_in_fill_mode(buf->cpudata)) write = get_trbe_limit_pointer();
This must be trbe_may_write_out_of_range() ?
Suzuki
else write = get_trbe_write_pointer(); @@ -736,7 +736,7 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, * the space we skipped with IGNORE packets. And we are always * guaranteed to have at least a PAGE_SIZE space in the buffer. */
- if (trbe_may_overwrite_in_fill_mode(buf->cpudata) &&
- if (wrap && trbe_may_overwrite_in_fill_mode(buf->cpudata) && !WARN_ON(size < overwrite_skip)) __trbe_pad_buf(buf, start_off, overwrite_skip);
On 01/12/2025 11:22, Leo Yan wrote:
To avoid complexity, if any CPU in the system has the fill mode erratum, the driver will not use trigger mode, it simply rolls back to fill mode only and apply the workaround on it.
Add a static key to control trigger mode bypassing. During each CPU probe, the key is enabled when the relevant erratum is detected.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 941aa46e9b11f60c707eb40093964de454a3fd83..8390d0a8fe23d35945610df15f21751279ee37ee 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -17,6 +17,7 @@ #include <asm/barrier.h> #include <asm/cpufeature.h> +#include <linux/jump_label.h> #include <linux/kvm_host.h> #include <linux/vmalloc.h> @@ -147,6 +148,12 @@ struct trbe_drvdata { struct platform_device *pdev; }; +DEFINE_STATIC_KEY_FALSE(trbe_trigger_mode_bypass);
+#define trbe_trigger_mode_need_bypass(cpudata) \
- (trbe_may_overwrite_in_fill_mode((cpudata)) || \
trbe_may_write_out_of_range((cpudata)))
Please make that a static inline function.
static void trbe_check_errata(struct trbe_cpudata *cpudata) { int i; @@ -1306,6 +1313,14 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp dev_set_drvdata(&trbe_csdev->dev, cpudata); coresight_set_percpu_sink(cpu, trbe_csdev);
- /*
* If any CPU cannot use trigger mode, bypass the mode globally for* consistent tracing behaviour.*/- if (trbe_trigger_mode_need_bypass(cpudata))
static_branch_enable(&trbe_trigger_mode_bypass);
You only need to enable this once, not for every CPU.
Suzuki
- return; cpu_clear: cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
On Tue, Dec 02, 2025 at 11:15:47AM +0000, Suzuki K Poulose wrote:
[...]
@@ -687,6 +668,23 @@ static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *hand goto out_fatal; }
- /*
* Mark the buffer to indicate that there was a WRAP event by* setting the COLLISION flag. This indicates to the user that* the TRBE trace collection was stopped without stopping the* ETE and thus there might be some amount of trace that was* lost between the time the WRAP was detected and the IRQ* was consumed by the CPU.** Setting the TRUNCATED flag would move the event to STOPPED* state unnecessarily, even when there is space left in the* ring buffer. Using the COLLISION flag doesn't have this side* effect. We only set TRUNCATED flag when there is no space* left in the ring buffer.*/- if (!is_trbe_running(trbsr))
Is this really required ? There is a WARN_ON(is_trbe_running(trbsr)) above ?
The WARN_ON() will reports warning for running case, here it sets COLLISION flag for not running case, which is not conflict.
This works for the Fill mode (stop collection), we always set the COLLISION flag for it. Once we enable trigger and wrap modes, due to the trace continues to run, we don't set the flag for these modes.
perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION);Also, setting this would unnecessarily mark COLLISION while stopping the TRBE buffer normally (when called from arm_trbe_update_buffer), when there is no WRAP event ?
Good point. I should refactor arm_trbe_update_buffer(), it firstly reads out TRBSR, stop trace afterwards, and then call trbe_get_fault_act().
In this case, it is in running mode so COLLISION will not be set for normal disable flow.
I will add a new patch for this and place it before this one.
- if (is_trbe_wrap(trbsr)) return TRBE_FAULT_ACT_WRAP;
Couldn't this be :
/* Move the big fat comment here */ if (is_trbe_wrap(trbsr)) { perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION); return TRBE_FAULT_ACT_WRAP; }
It deliberately de-couples the COLLISION flag from the WRAP event.
In later change, a WRAP event only means buffer wrap around, it does mean the trace has been stopped. The AUX flag is only set for stopping case.
Thanks, Leo
On Tue, Dec 02, 2025 at 11:06:48AM +0000, Suzuki K Poulose wrote:
Hi Leo
A couple of minor nits below. Otherwise looks like a nice cleanup
On 01/12/2025 11:21, Leo Yan wrote:
It gives priority to TRBSR_EL1.EA (external abort); an external abort will immediately bail out and return an error.
Next, the syndrome decoding is refactored based on two levels of information: the EC (Event Class) bits and the BSC (Trace Buffer Status Code) bits.
If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to identify the specific trace buffer event. Otherwise, any non-zero TRBSR_EL1.EC is treated as an error.
For error cases, the driver prints an error string and dumps registers for debugging.
minor nit: Please avoid describing the code, which cod does well. Briefly mention what you are doing, giving any additional contexts that may not be evident from the code.
Sure, I will note this later.
e.g:
"Add support for decoding the syndrome for TRBE, providing a verbose description of the code for Fatal/unhandled events. While at it add the new definitions from the latest Arm ARM for EC and BSC"
Will do this and followed comment. Thanks for suggestion!
Leo
On Tue, Dec 02, 2025 at 12:05:52PM +0000, Suzuki K Poulose wrote:
On 01/12/2025 11:22, Leo Yan wrote:
The overwrite erratum occurs only on wrap events, so apply the extra wrap condition check in the workaround.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index e579ea98523c24d23a0cd265dcdd0a46b52b52da..2600af12a8fb94bb8c74efda2a101aacd01b0b34 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -714,7 +714,7 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, * 64bytes. Thus we ignore the potential triggering of the erratum * on WRAP and limit the data to LIMIT. */
- if (wrap)
- if (wrap && trbe_may_overwrite_in_fill_mode(buf->cpudata)) write = get_trbe_limit_pointer();
This must be trbe_may_write_out_of_range() ?
SDEN (e.g., [1]) is ambiguous about how the write pointer acts for the out-of-range erratum.
However, SDEN explicitly states for overwrite erratum: "The current write pointer also increments by the same number of cache line locations.
If this is still concerned, I can add check trbe_may_write_out_of_range() in next spin.
Thanks, Leo
[1] https://documentation-service.arm.com/static/636a660b4e6cf12278ad89c4?token=
On Tue, Dec 02, 2025 at 04:56:47PM +0000, Coresight ML wrote:
On Tue, Dec 02, 2025 at 12:05:52PM +0000, Suzuki K Poulose wrote:
On 01/12/2025 11:22, Leo Yan wrote:
The overwrite erratum occurs only on wrap events, so apply the extra wrap condition check in the workaround.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index e579ea98523c24d23a0cd265dcdd0a46b52b52da..2600af12a8fb94bb8c74efda2a101aacd01b0b34 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -714,7 +714,7 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, * 64bytes. Thus we ignore the potential triggering of the erratum * on WRAP and limit the data to LIMIT. */
- if (wrap)
- if (wrap && trbe_may_overwrite_in_fill_mode(buf->cpudata)) write = get_trbe_limit_pointer();
This must be trbe_may_write_out_of_range() ?
SDEN (e.g., [1]) is ambiguous about how the write pointer acts for the out-of-range erratum.
Thought again, we should not worry about out-of-range case.
As the workaround always ensures sufficient space beyond the limit, the write pointer remains valid within the allocated buffer even if it is updated and crosses that limit.
On 01/12/25 4:51 PM, Leo Yan wrote:
Use the existed helpers for checking errata instead of open coded
Small nit - s/the existed helpers/the existing helpers/
equivalent.> Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 293715b4ff0eb0abe30f9b477700ca94f81cd4a2..0ddb3db0213cf0014e29decfb79da68b0a351b31 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -723,7 +723,7 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle, * the space we skipped with IGNORE packets. And we are always * guaranteed to have at least a PAGE_SIZE space in the buffer. */
- if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE) &&
- if (trbe_may_overwrite_in_fill_mode(buf->cpudata) && !WARN_ON(size < overwrite_skip)) __trbe_pad_buf(buf, start_off, overwrite_skip);
@@ -946,7 +946,7 @@ static int trbe_apply_work_around_before_enable(struct trbe_buf *buf) * - At trace collection: * - Pad the 256bytes skipped above again with IGNORE packets. */
- if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE)) {
- if (trbe_may_overwrite_in_fill_mode(buf->cpudata)) { if (WARN_ON(!IS_ALIGNED(buf->trbe_write, PAGE_SIZE))) return -EINVAL; buf->trbe_hw_base = buf->trbe_write;
@@ -970,7 +970,7 @@ static int trbe_apply_work_around_before_enable(struct trbe_buf *buf) * - Adjust the TRBLIMITR.LIMIT to leave the extra PAGE outside * the TRBE's range (i.e [TRBBASER, TRBLIMITR.LIMI] ). */
- if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_WRITE_OUT_OF_RANGE)) {
- if (trbe_may_write_out_of_range(buf->cpudata)) { s64 space = buf->trbe_limit - buf->trbe_write; /*
- We must have more than a PAGE_SIZE worth space in the proposed
LGTM
Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com
On 01/12/25 4:51 PM, Leo Yan wrote:
The trace unit is already disabled when trbe_stop_and_truncate_event() is called, so draining and stopping the buffer in the function is redundant.
Remove the unnecessary disable operation and rename the function to trbe_truncate_event() to better reflect its purpose.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 0ddb3db0213cf0014e29decfb79da68b0a351b31..2f44e4a65e0ee2b2c8fdd06a51ab01fc57f44a4e 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -284,18 +284,10 @@ static void trbe_report_wrap_event(struct perf_output_handle *handle) perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION); } -static void trbe_stop_and_truncate_event(struct perf_output_handle *handle) +static void trbe_truncate_event(struct perf_output_handle *handle) { struct trbe_buf *buf = etm_perf_sink_config(handle);
- /*
* We cannot proceed with the buffer collection and we* do not have any data for the current session. The* etm_perf driver expects to close out the aux_buffer* at event_stop(). So disable the TRBE here and leave* the update_buffer() to return a 0 size.*/- trbe_drain_and_disable_local(buf->cpudata); perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); perf_aux_output_end(handle, 0); *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL;
@@ -1008,7 +1000,7 @@ static int __arm_trbe_enable(struct trbe_buf *buf, trbe_enable_hw(buf); return 0; err:
- trbe_stop_and_truncate_event(handle);
- trbe_truncate_event(handle); return ret;
} @@ -1169,7 +1161,7 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev) trbe_handle_spurious(handle); break; case TRBE_FAULT_ACT_FATAL:
trbe_stop_and_truncate_event(handle);
truncated = true; break; }trbe_truncate_event(handle);
In arm_trbe_irq_handler() function trbe_drain_and_disable_local() is called early on before TRBE_FAULT_ACT_FATAL gets handled but for the other case __arm_trbe_enable() - do all the code paths leading upto present trbe_stop_and_truncate_event() would have definitely called trbe_drain_and_disable_local() as well ? If yes, please do call out those code paths clearly in the commit message.
Besides it might be worth adding an WARN_ON() in trbe_truncate_event() if the TRBE has not been disabled as now being expected there.
On 01/12/25 4:51 PM, Leo Yan wrote:
When trbe_handle_overflow() runs, the buffer has already been disabled and drained, so the duplicate operation is unnecessary. Remove it.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index 2f44e4a65e0ee2b2c8fdd06a51ab01fc57f44a4e..f5597bd9b5fba9a8f5053d5823b03380fd468b5c 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -1077,7 +1077,6 @@ static int trbe_handle_overflow(struct perf_output_handle *handle) * is able to detect this with a disconnected handle * (handle->event = NULL). */
*this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL; return -EINVAL; }trbe_drain_and_disable_local(buf->cpudata);
Like the previous patch - trbe_drain_and_disable_local() would have run if TRBE_FAULT_ACT_WRAP case is being handled via trbe_handle_overflow(). But please do mention these details in the commit message for things to be clear and explicit.
Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com
On 01/12/25 4:51 PM, Leo Yan wrote:
set_trbe_disabled() should never appear in the enable flow, otherwise, it may potentially hide bugs in the disable flow.
Remove set_trbe_disabled() from the enable path.
IIRC without first disabling TRBLIMITR_EL1_E - TRBE registers or their fields should not be fetched or interpreted. Without that none of the subsequent HW operations should be performed inside trbe_enable_hw() leading upto enabling it.
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index f5597bd9b5fba9a8f5053d5823b03380fd468b5c..e426991e2c2c398a9d3982e9d0f7f542e404cbab 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -629,7 +629,6 @@ static void trbe_enable_hw(struct trbe_buf *buf) WARN_ON(buf->trbe_hw_base < buf->trbe_base); WARN_ON(buf->trbe_write < buf->trbe_hw_base); WARN_ON(buf->trbe_write >= buf->trbe_limit);
- set_trbe_disabled(buf->cpudata); clr_trbe_status(); set_trbe_base_pointer(buf->trbe_hw_base); set_trbe_write_pointer(buf->trbe_write);
On 01/12/25 4:51 PM, Leo Yan wrote:
If the driver does not clear the status when disabling the trace buffer unit, stale state will carry over to the next enable, though the driver clears it again on enable.
There is no problem now ! Because trbe_enable_hw() calls clr_trbe_status().
Explicitly clear status after the trace is disabled in the interrupt handling and when a perf session ends. Keep the status for spurious interrupts for continuous tracing.
But is not that the behaviour already without this change ?
clr_trbe_status() in trbe_enable_hw() ensures that no TRBE session can be started without first clearing the existing status. Still wondering what is the purpose of this change ?
Signed-off-by: Leo Yan leo.yan@arm.com
drivers/hwtracing/coresight/coresight-trbe.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index e426991e2c2c398a9d3982e9d0f7f542e404cbab..9e565122816949b37b8ff5e4ba04cfbc317c6f25 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -629,7 +629,6 @@ static void trbe_enable_hw(struct trbe_buf *buf) WARN_ON(buf->trbe_hw_base < buf->trbe_base); WARN_ON(buf->trbe_write < buf->trbe_hw_base); WARN_ON(buf->trbe_write >= buf->trbe_limit);
- clr_trbe_status(); set_trbe_base_pointer(buf->trbe_hw_base); set_trbe_write_pointer(buf->trbe_write);
@@ -1036,6 +1035,8 @@ static int arm_trbe_disable(struct coresight_device *csdev) return -EINVAL; trbe_drain_and_disable_local(cpudata);
- clr_trbe_status();
- buf->cpudata = NULL; cpudata->buf = NULL; cpudata->mode = CS_MODE_DISABLED;
@@ -1151,6 +1152,10 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev) return IRQ_NONE; act = trbe_get_fault_act(handle, status);
- if (act != TRBE_FAULT_ACT_SPURIOUS)
clr_trbe_status();- switch (act) { case TRBE_FAULT_ACT_WRAP: truncated = !!trbe_handle_overflow(handle);
On Thu, Dec 04, 2025 at 06:13:56PM +0530, Anshuman Khandual wrote:
On 01/12/25 4:51 PM, Leo Yan wrote:
set_trbe_disabled() should never appear in the enable flow, otherwise, it may potentially hide bugs in the disable flow.
Remove set_trbe_disabled() from the enable path.
IIRC without first disabling TRBLIMITR_EL1_E - TRBE registers or their fields should not be fetched or interpreted.
Yes. I think you are referring to the rule DJMDD in Arm ARM: "The PE might ignore a direct or external write to any of certain Trace Buffer Unit registers ... (when) TRBLIMITR_EL1.E is 1, and the Trace Buffer Unit is using Self-hosted mode."
Without that none of the subsequent HW operations should be performed inside trbe_enable_hw() leading upto enabling it.
Fair enough.
If we can conclude the trace unit has been disabled properly in below cases, no reason to arbitrarily calling set_trbe_disabled() during each enable.
1) The SYS_TRBLIMITR_EL1 register is cleared in trbe_reset_local() during probe phase. 2) The SYS_TRBLIMITR_EL1.E bit is cleared in arm_trbe_irq_handler() for interrupt handling. 3) The SYS_TRBLIMITR_EL1.E bit is cleared in the disable flow.
Seems to me, this is not only for code cleanup, we need to promise a sane logic in the flow.
Thanks, Leo