On Thu, 04 Jun 2026 15:34:25 +0800, Junrui Luo wrote:
> When the SMB sink is used as a perf AUX sink, smb_update_buffer() calls
> smb_sync_perf_buffer() to copy hardware trace data into the perf AUX ring
> buffer pages. It derives pg_idx = head >> PAGE_SHIFT from @head, which is
> handle->head, and indexes dst_pages[pg_idx]. The pg_idx %= nr_pages
> normalization is only applied after the first loop iteration.
>
> This leaves the initial page index underived from the buffer size, which
> can result in an out-of-bounds write past dst_pages[] when head exceeds
> the AUX buffer size.
>
> [...]
Applied, thanks!
[1/1] coresight: ultrasoc-smb: Fix OOB write in smb_sync_perf_buffer()
https://git.kernel.org/coresight/c/98495b5a4d77
Best regards,
--
Suzuki K Poulose <suzuki.poulose(a)arm.com>
Hi Amir,
On Wed, Jun 03, 2026 at 01:10:17PM -0700, Amir Ayupov wrote:
> Hi James,
>
> Thank you for picking it up.
>
> I tested the v2 patch series and it looks good. There was a minor
> difference in 2/39 tested perf data files: the number of brstack samples
> differs by one, however, there was no loss of binary profile. The resulting
> BOLT profile converted from the perf script output was identical, so I'm OK
> with v2 patch as-is.
Sorry jumping in as brstack is mentioned.
Now branch stack is maintained per-CPU wise, it mixes up branch stack
cross threads. The patch 03 in the series [1] refactors branch stack
per-thread by using common code.
Hope this can benefit a bit the profiling data quality and in case
you are interested in.
Thanks,
Leo
[1] https://lore.kernel.org/linux-perf-users/20260526-b4-arm_cs_callchain_suppo…
Fix thread tracking when decoding Coresight trace and add a new test for
it.
The new test is added as a Perf test workload instead of a custom binary
with its own build system, but this requires a new feature in Perf test
to pass in control pipes which can enable and disable events. This
scopes the recording to just the workload and helps to reduce the amount
of data recorded in tracing tests.
With this new feature we can re-write all of the Coresight tests to make
use of it and remove the remaining binaries which fixes the following
issues:
* They didn't work in out of source builds
* A lot of the tests unnecessarily required root and didn't skip
without it
* They were mainly qualitative tests which didn't look for specific
behavior
Most importantly, the long build and runtime has been reduced. On a
Radxa Orion O6, unroll_loop_thread.c took 37s to compile which is longer
than the entire Perf build. Now the build time is negligible and the
before and after test runtimes for all the Coresight tests are:
| N1SDP | Orion O6
-----------------------------------
Before | 4m 0s | 14m 49s
After | 26s | 56s
-----------------------------------
Signed-off-by: James Clark <james.clark(a)linaro.org>
---
Changes in v3:
- Minor sashiko comments
- Close some more pipes
- Fix warning messages
- Error handling improvements
- Pass packet into cs_etm__synth_instruction_sample()
- Fixup stale comment (Leo)
- Link to v2: https://lore.kernel.org/r/20260602-james-cs-context-tracking-fix-v2-0-85b5c…
Changes in v2:
- Add --workload-ctl option to Perf test
- Re-write all the Coresight tests and speed them up
- Pass packet to memory access function so frontend can use either the
previous or current packet's EL
- Link to v1: https://lore.kernel.org/r/20260526-james-cs-context-tracking-fix-v1-0-ebd60…
---
James Clark (19):
perf cs-etm: Queue context packets for frontend
perf test: Add workload-ctl option
perf test: Add a workload that forces context switches
perf test cs-etm: Test process attribution
perf test: Add deterministic workload
perf test cs-etm: Replace unroll loop thread with deterministic decode test
perf test cs-etm: Remove asm_pure_loop test
perf test cs-etm: Replace memcpy test with raw dump stress test
perf test: Add named_threads workload
perf test cs-etm: Test decoding for concurrent threads test
perf test cs-etm: Remove duplicate branch tests
perf test cs-etm: Skip if not root
perf test cs-etm: Reduce snapshot size
perf test cs-etm: Speed up basic test
perf test cs-etm: Remove unused Coresight workloads
perf test cs-etm: Make disassembly test use kcore
perf test cs-etm: Add all branch instructions to test
perf test cs-etm: Speed up disassembly test
perf test cs-etm: Move existing tests to coresight folder
Documentation/trace/coresight/coresight-perf.rst | 78 +------
MAINTAINERS | 2 -
tools/perf/Documentation/perf-test.txt | 18 +-
tools/perf/Makefile.perf | 14 +-
tools/perf/scripts/python/arm-cs-trace-disasm.py | 20 +-
tools/perf/tests/builtin-test.c | 187 +++++++++++++++-
tools/perf/tests/shell/coresight/Makefile | 29 ---
.../perf/tests/shell/coresight/Makefile.miniconfig | 14 --
tools/perf/tests/shell/coresight/asm_pure_loop.sh | 22 --
.../tests/shell/coresight/asm_pure_loop/.gitignore | 1 -
.../tests/shell/coresight/asm_pure_loop/Makefile | 34 ---
.../shell/coresight/asm_pure_loop/asm_pure_loop.S | 30 ---
.../tests/shell/coresight/concurrent_threads.sh | 45 ++++
.../tests/shell/coresight/context_switch_thread.sh | 69 ++++++
tools/perf/tests/shell/coresight/deterministic.sh | 71 +++++++
.../tests/shell/coresight/memcpy_thread/.gitignore | 1 -
.../tests/shell/coresight/memcpy_thread/Makefile | 33 ---
.../shell/coresight/memcpy_thread/memcpy_thread.c | 80 -------
.../tests/shell/coresight/memcpy_thread_16k_10.sh | 22 --
.../perf/tests/shell/coresight/raw_dump_stress.sh | 48 +++++
.../shell/{ => coresight}/test_arm_coresight.sh | 43 ++--
.../{ => coresight}/test_arm_coresight_disasm.sh | 17 +-
.../tests/shell/coresight/thread_loop/.gitignore | 1 -
.../tests/shell/coresight/thread_loop/Makefile | 33 ---
.../shell/coresight/thread_loop/thread_loop.c | 85 --------
.../shell/coresight/thread_loop_check_tid_10.sh | 23 --
.../shell/coresight/thread_loop_check_tid_2.sh | 23 --
.../shell/coresight/unroll_loop_thread/.gitignore | 1 -
.../shell/coresight/unroll_loop_thread/Makefile | 33 ---
.../unroll_loop_thread/unroll_loop_thread.c | 75 -------
.../tests/shell/coresight/unroll_loop_thread_10.sh | 22 --
tools/perf/tests/shell/lib/coresight.sh | 134 ------------
tools/perf/tests/tests.h | 3 +
tools/perf/tests/workloads/Build | 4 +
tools/perf/tests/workloads/context_switch_loop.c | 101 +++++++++
tools/perf/tests/workloads/deterministic.c | 39 ++++
tools/perf/tests/workloads/named_threads.c | 109 ++++++++++
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 21 +-
tools/perf/util/cs-etm.c | 234 ++++++++++++---------
tools/perf/util/cs-etm.h | 8 +-
40 files changed, 889 insertions(+), 938 deletions(-)
---
base-commit: 5f0ca6b80b12bab1ce06839cdffb6148bb650ff4
change-id: 20260515-james-cs-context-tracking-fix-754998bae7ed
Best regards,
--
James Clark <james.clark(a)linaro.org>
On Tue, May 19, 2026 at 04:48:08PM +0100, Yeoreum Yun wrote:
> In the perf enable path, there are missing cases where
> cscfg_csdev_disable_active_config() is not called:
>
> - Branch broadcast is selected but not supported by the hardware
> - etm4_enable_hw() fails
>
> This can lead to a leak of config_desc->active_cnt.
> Fix this by properly calling cscfg_csdev_disable_active_config()
> in these error paths.
>
> Fixes: 810ac401db1f ("coresight: etm4x: Add complex configuration handlers to etmv4")
> Signed-off-by: Yeoreum Yun <yeoreum.yun(a)arm.com>
> ---
> .../hwtracing/coresight/coresight-etm4x-core.c | 18 ++++++++++++------
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index f0a9f2ef4b27..0889937811cb 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -899,6 +899,8 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
> * Missing BB support could cause silent decode errors
> * so fail to open if it's not supported.
> */
> + if (cfg_hash)
> + cscfg_csdev_disable_active_config(csdev);
> ret = -EINVAL;
> goto out;
> } else {
> @@ -915,6 +917,7 @@ static int etm4_enable_perf(struct coresight_device *csdev,
> struct coresight_path *path)
> {
> struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> + struct perf_event_attr *attr = &event->attr;
> int ret;
>
> if (WARN_ON_ONCE(drvdata->cpu != smp_processor_id()))
> @@ -926,7 +929,7 @@ static int etm4_enable_perf(struct coresight_device *csdev,
> /* Configure the tracer based on the session's specifics */
> ret = etm4_parse_event_config(csdev, event);
> if (ret)
> - goto out;
> + goto err;
>
> drvdata->trcid = path->trace_id;
>
> @@ -935,16 +938,19 @@ static int etm4_enable_perf(struct coresight_device *csdev,
>
> /* And enable it */
> ret = etm4_enable_hw(drvdata);
> -
> -out:
> - /* Failed to start tracer; roll back to DISABLED mode */
> if (ret) {
> - coresight_set_mode(csdev, CS_MODE_DISABLED);
> - return ret;
> + if (ATTR_CFG_GET_FLD(attr, configid))
> + cscfg_csdev_disable_active_config(csdev);
> + goto err;
> }
>
> csdev->path = path;
> return 0;
> +
> +err:
> + /* Failed to start tracer; roll back to DISABLED mode */
> + coresight_set_mode(csdev, CS_MODE_DISABLED);
> + return ret;
> }
Seems to me, it is not readable if spread error handling into both child
and parent functions. I would like we can stick to use the same function
for resource allocation and rollback for failures.
How about below change (I did not verify it yet):
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 0889937811cb..8347efb2069b 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -794,8 +794,7 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
.ATTR_CFG_FLD_timestamp_CFG = U64_MAX,
};
struct perf_event_attr *attr = &event->attr;
- unsigned long cfg_hash;
- int preset, cc_threshold;
+ int cc_threshold;
u8 ts_level;
/* Clear configuration from previous run */
@@ -882,16 +881,6 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
/* bit[12], Return stack enable bit */
config->cfg |= TRCCONFIGR_RS;
- /*
- * Set any selected configuration and preset. A zero configid means no
- * configuration active, preset = 0 means no preset selected.
- */
- cfg_hash = ATTR_CFG_GET_FLD(attr, configid);
- if (cfg_hash) {
- preset = ATTR_CFG_GET_FLD(attr, preset);
- ret = cscfg_csdev_enable_active_config(csdev, cfg_hash, preset);
- }
-
/* branch broadcast - enable if selected and supported */
if (ATTR_CFG_GET_FLD(attr, branch_broadcast)) {
if (!caps->trcbb) {
@@ -899,8 +888,6 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
* Missing BB support could cause silent decode errors
* so fail to open if it's not supported.
*/
- if (cfg_hash)
- cscfg_csdev_disable_active_config(csdev);
ret = -EINVAL;
goto out;
} else {
@@ -918,7 +905,8 @@ static int etm4_enable_perf(struct coresight_device *csdev,
{
struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
struct perf_event_attr *attr = &event->attr;
- int ret;
+ unsigned long cfg_hash;
+ int preset, ret;
if (WARN_ON_ONCE(drvdata->cpu != smp_processor_id()))
return -EINVAL;
@@ -931,6 +919,18 @@ static int etm4_enable_perf(struct coresight_device *csdev,
if (ret)
goto err;
+ /*
+ * Set any selected configuration and preset. A zero configid means no
+ * configuration active, preset = 0 means no preset selected.
+ */
+ cfg_hash = ATTR_CFG_GET_FLD(attr, configid);
+ if (cfg_hash) {
+ preset = ATTR_CFG_GET_FLD(attr, preset);
+ ret = cscfg_csdev_enable_active_config(csdev, cfg_hash, preset);
+ if (ret)
+ goto err;
+ }
+
drvdata->trcid = path->trace_id;
/* Populate pause state */
@@ -938,15 +938,15 @@ static int etm4_enable_perf(struct coresight_device *csdev,
/* And enable it */
ret = etm4_enable_hw(drvdata);
- if (ret) {
- if (ATTR_CFG_GET_FLD(attr, configid))
- cscfg_csdev_disable_active_config(csdev);
- goto err;
- }
+ if (ret)
+ goto err_enable_hw;
csdev->path = path;
return 0;
+err_enable_hw:
+ if (cfg_hash)
+ cscfg_csdev_disable_active_config(csdev);
err:
/* Failed to start tracer; roll back to DISABLED mode */
coresight_set_mode(csdev, CS_MODE_DISABLED);
On Thu, May 28, 2026 at 03:26:57PM +0100, Yeoreum Yun wrote:
[...]
> > > @@ -47,7 +47,7 @@ static int etm4_cfg_map_reg_offset(struct etmv4_drvdata *drvdata,
> > > struct cscfg_regval_csdev *reg_csdev, u32 offset)
> > > {
> > > int err = -EINVAL, idx;
> > > - struct etmv4_config *drvcfg = &drvdata->config;
> > > + struct etmv4_config *drvcfg = &drvdata->active_config;
> >
> > Wouldn't it make more sense to keep using drvdata->config for cfgfs?
> > This would avoid cfgfs updating the active configuration while a session
> > is enabled.
> >
> > My understanding is after refactoring cfgfs, we will never expose
> > active_config or config anymore so this should not an issue. Before
> > that, maybe we just keep to use config so can avoid mess.
>
> I don't think so. since the cfgfs's config is updated right before
> enable and in this patchset, It's applied after "take the mode".
>
> If we do it into drvdta->config, then contention would be increased
> with the access of sysfs and that nullifies almost this patch purpose
> and because of we use "one config" this makes a corruption with perf
> and sysfs.
Here have two different contention: the contention between sysfs and
cfgfs, and the contention between cfgfs and a runtime config (active
config). My suggestion is to avoid the later contention, which is the
main idea in this series that avoid the runtime config is modified in
the middle of a session.
That said, I have no strong opinion.
> > > static void etm4_enable_sysfs_smp_call(void *info)
> > > {
> > > struct etm4_enable_arg *arg = info;
> > > + struct etmv4_drvdata *drvdata;
> > > struct coresight_device *csdev;
> > > + unsigned long cfg_hash;
> > > + int preset;
> > >
> > > if (WARN_ON(!arg))
> > > return;
> > >
> > > - csdev = arg->drvdata->csdev;
> > > + drvdata = arg->drvdata;
> > > + csdev = drvdata->csdev;
> > > if (!coresight_take_mode(csdev, CS_MODE_SYSFS)) {
> > > /* Someone is already using the tracer */
> > > arg->rc = -EBUSY;
> > > return;
> > > }
> > >
> > > - arg->rc = etm4_enable_hw(arg->drvdata);
> > > + /* enable any config activated by configfs */
> > > + cscfg_config_sysfs_get_active_cfg(&cfg_hash, &preset);
> > >
> > > - /* The tracer didn't start */
> > > + drvdata->active_config = arg->config;
> >
> > Can move this down to just before drvdata->trcid assignment so cscfg
> > operations can be put together?
>
> No. Copy the arg->config must be before cscfg_csdev_enable_active_config().
> If that's after, it overrides the "preset" and this is different from
> the former behavior.
Please copy config first, then do cscfg stuffs.
> > > + arg->rc = etm4_enable_hw(drvdata);
> > > if (arg->rc) {
> > > - coresight_set_mode(csdev, CS_MODE_DISABLED);
> > > - return;
> > > + cscfg_csdev_disable_active_config(csdev);
> > > + goto err;
> >
> > Add a new goto tag (like err_enable_hw) for the disable_active_config
> > rollback?
>
> This is only place where cscfg_csdev_disable_active_config().
> Why do we need to goto for cleanup where there is no duplication?
It is about to use a central place for handling errors.
> > > @@ -1449,7 +1458,7 @@ static void etm4_init_arch_data(void *info)
> > >
> > > /* EXLEVEL_S, bits[19:16] Secure state instruction tracing */
> > > caps->s_ex_level = FIELD_GET(TRCIDR3_EXLEVEL_S_MASK, etmidr3);
> > > - drvdata->config.s_ex_level = caps->s_ex_level;
> > > + config->s_ex_level = caps->s_ex_level;
> >
> > config->s_ex_level is redundant and not really a configuration item.
> > We should use caps->s_ex_level instead of config->s_ex_level wherever
> > it is used.
>
> Agree, but this includes it should change the omse function
> declations to pass "caps" argument:
> - etm4_set_comparator_filter()
> - etm4_set_start_stop_filter()
>
> If that's okay, I'll respin with it.
This is fine for me.
Thanks,
Leo
On Mon, Jun 01, 2026 at 10:08:55AM +0800, Yingchao Deng (Consultant) wrote:
[...]
> I noticed that among the new sysfs knobs in this series, the
> integration test registers (ittrigin[1-3], ittrigout[1-3],
> ittrigoutack[1-3], ittriginack[1-3]) have no existing documentation
> for their base (index-0) counterparts, which were added long before
> this series.
Ah, thanks for pointing out the missed piece!
> Two options for patch 5/5:
>
> (a) Document only the knobs whose base versions are already
> documented (triginstatus[1-3] and trigoutstatus[1-3]).
>
> (b) Document everything in patch 5/5 — both the new indexed
> variants and the previously missing base IT register entries.
Please use option (b). Could you fold the code piece below into patch 5?
Note, we cannot merge ittrigin and ittrigin[1-3] into a single entry
because they were introduced in different kernel versions.
@Mike, if any chance you could give a review on this?
---8<---
--- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-cti
+++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-cti
@@ -134,6 +134,60 @@ KernelVersion: 5.7
Contact: Mike Leach or Mathieu Poirier
Description: (Read) read current status of output trigger signals.
+What: /sys/bus/coresight/devices/<cti-name>/regs/itctrl
+Date: March 2020
+KernelVersion 5.7
+Contact: coresight(a)lists.linaro.org
+Description: (RW) Control integration mode.
+
+What: /sys/bus/coresight/devices/<cti-name>/regs/itchin
+Date: March 2020
+KernelVersion 5.7
+Contact: coresight(a)lists.linaro.org
+Description: (Read) Read the values of the CTCHIN inputs.
+
+What: /sys/bus/coresight/devices/<cti-name>/regs/itchinack
+Date: March 2020
+KernelVersion 5.7
+Contact: coresight(a)lists.linaro.org
+Description: (Write) Write the value of the CTCHINACK input.
+
+What: /sys/bus/coresight/devices/<cti-name>/regs/ittrigin
+Date: March 2020
+KernelVersion 5.7
+Contact: coresight(a)lists.linaro.org
+Description: (Read) Read the values of the CTTRIGIN inputs.
+
+What: /sys/bus/coresight/devices/<cti-name>/regs/ittriginack
+Date: March 2020
+KernelVersion 5.7
+Contact: coresight(a)lists.linaro.org
+Description: (Write) Write the value of the CTTRIGINACK input.
+
+What: /sys/bus/coresight/devices/<cti-name>/regs/itchout
+Date: March 2020
+KernelVersion 5.7
+Contact: coresight(a)lists.linaro.org
+Description: (RW) Read or write the value of the CTCHOUT outputs.
+
+What: /sys/bus/coresight/devices/<cti-name>/regs/itchoutack
+Date: March 2020
+KernelVersion 5.7
+Contact: coresight(a)lists.linaro.org
+Description: (Read) Read the value of the CTCHOUTACK input.
+
+What: /sys/bus/coresight/devices/<cti-name>/regs/ittrigout
+Date: March 2020
+KernelVersion 5.7
+Contact: coresight(a)lists.linaro.org
+Description: (RW) Read or write the value of the CTTRIGOUT outputs.
+
+What: /sys/bus/coresight/devices/<cti-name>/regs/ittrigoutack
+Date: March 2020
+KernelVersion 5.7
+Contact: coresight(a)lists.linaro.org
+Description: (Read) Read the value of the CTTRIGOUTACK input.
+
What: /sys/bus/coresight/devices/<cti-name>/channels/trigin_attach
Date: March 2020
KernelVersion: 5.7