[PATCH v2 0/3] Coresight TMC-ETR some bugfixes and cleanups

List overview All Threads
Download

newer

older

Re: [PATCH v4] coresight: Replace...

[PATCH v3 0/6] Refactoring finding...

Junhao He

20 Jun 2025 20 Jun '25

7:54 a.m.

This patchset builds upon Yicong's previous patches [1].

Introducing fix two race issues found by using TMC-ETR and CATU, Two cleanups found when debugging the issues.

[1] https://lore.kernel.org/linux-arm-kernel/20241202092419.11777-1-yangyicong@h...

--- Changes in v2: - Updated the commit of patch2. - Rebase to v6.16-rc1

Junhao He (1): coresight: tmc: refactor the tmc-etr mode setting to avoid race conditions

Yicong Yang (2): coresight: tmc: Add missing doc of tmc_drvdata::reading coresight: tmc: Decouple the perf buffer allocation from sysfs mode

.../hwtracing/coresight/coresight-tmc-etr.c | 102 +++++++++--------- drivers/hwtracing/coresight/coresight-tmc.h | 1 + 2 files changed, 53 insertions(+), 50 deletions(-)

-- 2.33.0

Show replies by date

Junhao He

20 Jun 20 Jun

7:54 a.m.

New subject: [PATCH v2 1/3] coresight: tmc: Add missing doc of tmc_drvdata::reading

From: Yicong Yang yangyicong@hisilicon.com

tmc_drvdata::reading is used to indicate whether a reading process is performed through /dev/xyz.tmc. Document it.

Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Yicong Yang yangyicong@hisilicon.com Signed-off-by: Junhao He hejunhao3@huawei.com --- drivers/hwtracing/coresight/coresight-tmc.h | 1 + 1 file changed, 1 insertion(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 6541a27a018e..3ca0d40c580d 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -220,6 +220,7 @@ struct tmc_resrv_buf { * @pid: Process ID of the process that owns the session that is using * this component. For example this would be the pid of the Perf * process. + * @reading: buffer's in the reading through "/dev/xyz.tmc" entry * @stop_on_flush: Stop on flush trigger user configuration. * @buf: Snapshot of the trace data for ETF/ETB. * @etr_buf: details of buffer used in TMC-ETR

-- 2.33.0

Leo Yan

2 Jul 2 Jul

3:27 p.m.

New subject: [PATCH v2 1/3] coresight: tmc: Add missing doc of tmc_drvdata::reading

On Fri, Jun 20, 2025 at 03:54:10PM +0800, Junhao He wrote:

...

From: Yicong Yang yangyicong@hisilicon.com

tmc_drvdata::reading is used to indicate whether a reading process is performed through /dev/xyz.tmc. Document it.

Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Yicong Yang yangyicong@hisilicon.com Signed-off-by: Junhao He hejunhao3@huawei.com

drivers/hwtracing/coresight/coresight-tmc.h | 1 + 1 file changed, 1 insertion(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 6541a27a018e..3ca0d40c580d 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -220,6 +220,7 @@ struct tmc_resrv_buf {
@pid: Process ID of the process that owns the session that is using
this component. For example this would be the pid of the Perf
process.
@reading: buffer's in the reading through "/dev/xyz.tmc" entry

Are you working on the latest code base? For example, the mainline kernel or coresight next branch.

The latest code already has comment for reading, and I saw a duplicated "reading" field in tmc_drvdata.

The tmc_resrv_buf structure does not have the fields "pid", "stop_on_flush", "buf", etc. They have been moved into the tmc_drvdata structure.

Thanks, Leo

...

@stop_on_flush: Stop on flush trigger user configuration.

@buf: Snapshot of the trace data for ETF/ETB.

@etr_buf: details of buffer used in TMC-ETR

-- 2.33.0

hejunhao

19 Jul 19 Jul

11:05 a.m.

New subject: [PATCH v2 1/3] coresight: tmc: Add missing doc of tmc_drvdata::reading

On 2025/7/2 23:27, Leo Yan wrote:

...

On Fri, Jun 20, 2025 at 03:54:10PM +0800, Junhao He wrote:

...
From: Yicong Yang yangyicong@hisilicon.com

tmc_drvdata::reading is used to indicate whether a reading process is performed through /dev/xyz.tmc. Document it.

Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Yicong Yang yangyicong@hisilicon.com Signed-off-by: Junhao He hejunhao3@huawei.com

drivers/hwtracing/coresight/coresight-tmc.h | 1 + 1 file changed, 1 insertion(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 6541a27a018e..3ca0d40c580d 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -220,6 +220,7 @@ struct tmc_resrv_buf {
@pid: Process ID of the process that owns the session that is using
this component. For example this would be the pid of the Perf
process.
@reading: buffer's in the reading through "/dev/xyz.tmc" entry
Are you working on the latest code base? For example, the mainline kernel or coresight next branch. The latest code already has comment for reading, and I saw a duplicated "reading" field in tmc_drvdata.

This patch is based on mainline kernel v6.16-rc1, and adds a comment to struct tmc_drvdata::reading. Perhaps, this git log could easily be misunderstood as adding comments to tmc_resrv_buf::reading.

...

The tmc_resrv_buf structure does not have the fields "pid", "stop_on_flush", "buf", etc. They have been moved into the tmc_drvdata structure.

Thanks, Leo

...

@stop_on_flush: Stop on flush trigger user configuration.

@buf: Snapshot of the trace data for ETF/ETB.

@etr_buf: details of buffer used in TMC-ETR

-- 2.33.0

CoreSight mailing list -- coresight@lists.linaro.org To unsubscribe send an email to coresight-leave@lists.linaro.org .

Junhao He

20 Jun 20 Jun

7:54 a.m.

New subject: [PATCH v2 2/3] coresight: tmc: refactor the tmc-etr mode setting to avoid race conditions

When trying to run perf and sysfs mode simultaneously, the WARN_ON() in tmc_etr_enable_hw() is triggered sometimes:

WARNING: CPU: 42 PID: 3911571 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1060 tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] [..snip..] Call trace: tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] (P) tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] (L) tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] coresight_enable_path+0x1c8/0x218 [coresight] coresight_enable_sysfs+0xa4/0x228 [coresight] enable_source_store+0x58/0xa8 [coresight] dev_attr_store+0x20/0x40 sysfs_kf_write+0x4c/0x68 kernfs_fop_write_iter+0x120/0x1b8 vfs_write+0x2c8/0x388 ksys_write+0x74/0x108 __arm64_sys_write+0x24/0x38 el0_svc_common.constprop.0+0x64/0x148 do_el0_svc+0x24/0x38 el0_svc+0x3c/0x130 el0t_64_sync_handler+0xc8/0xd0 el0t_64_sync+0x1ac/0x1b0 ---[ end trace 0000000000000000 ]---

Since the sysfs buffer allocation and the hardware enablement is not in the same critical region, it's possible to race with the perf

mode: [sysfs mode] [perf mode] tmc_etr_get_sysfs_buffer() spin_lock(&drvdata->spinlock) [sysfs buffer allocation] spin_unlock(&drvdata->spinlock) spin_lock(&drvdata->spinlock) tmc_etr_enable_hw() drvdata->etr_buf = etr_perf->etr_buf spin_unlock(&drvdata->spinlock) spin_lock(&drvdata->spinlock) tmc_etr_enable_hw() WARN_ON(drvdata->etr_buf) // WARN sicne etr_buf initialized at the perf side spin_unlock(&drvdata->spinlock)

A race condition is introduced here, perf always prioritizes execution, and warnings can lead to concerns about potential hidden bugs, such as getting out of sync.

To fix this, configure the tmc-etr mode before invoking enable_etr_perf() or enable_etr_sysfs(), explicitly check if the tmc-etr sink is already enabled in a different mode, and simplily the setup and checks for "mode". To prevent race conditions between mode transitions.

Fixes: 296b01fd106e ("coresight: Refactor out buffer allocation function for ETR") Reported-by: Yicong Yang yangyicong@hisilicon.com Closes: https://lore.kernel.org/linux-arm-kernel/20241202092419.11777-2-yangyicong@h... Signed-off-by: Junhao He hejunhao3@huawei.com --- .../hwtracing/coresight/coresight-tmc-etr.c | 73 +++++++++++-------- 1 file changed, 43 insertions(+), 30 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index b07fcdb3fe1a..252a57a8e94e 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1263,11 +1263,6 @@ static struct etr_buf *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev) raw_spin_lock_irqsave(&drvdata->spinlock, flags); }

- if (drvdata->reading || coresight_get_mode(csdev) == CS_MODE_PERF) { - ret = -EBUSY; - goto out; - } - /* * If we don't have a buffer or it doesn't match the requested size, * use the buffer allocated above. Otherwise reuse the existing buffer. @@ -1278,7 +1273,6 @@ static struct etr_buf *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev) drvdata->sysfs_buf = new_buf; }

-out: raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);

/* Free memory outside the spinlock if need be */ @@ -1289,7 +1283,7 @@ static struct etr_buf *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)

static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) { - int ret = 0; + int ret; unsigned long flags; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); struct etr_buf *sysfs_buf = tmc_etr_get_sysfs_buffer(csdev); @@ -1299,23 +1293,10 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)

raw_spin_lock_irqsave(&drvdata->spinlock, flags);

- /* - * In sysFS mode we can have multiple writers per sink. Since this - * sink is already enabled no memory is needed and the HW need not be - * touched, even if the buffer size has changed. - */ - if (coresight_get_mode(csdev) == CS_MODE_SYSFS) { - csdev->refcnt++; - goto out; - } - ret = tmc_etr_enable_hw(drvdata, sysfs_buf); - if (!ret) { - coresight_set_mode(csdev, CS_MODE_SYSFS); + if (!ret) csdev->refcnt++; - }

-out: raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);

if (!ret) @@ -1735,11 +1716,6 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) struct etr_perf_buffer *etr_perf = etm_perf_sink_config(handle);

raw_spin_lock_irqsave(&drvdata->spinlock, flags); - /* Don't use this sink if it is already claimed by sysFS */ - if (coresight_get_mode(csdev) == CS_MODE_SYSFS) { - rc = -EBUSY; - goto unlock_out; - }

if (WARN_ON(!etr_perf || !etr_perf->etr_buf)) { rc = -EINVAL; @@ -1768,7 +1744,6 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) if (!rc) { /* Associate with monitored process. */ drvdata->pid = pid; - coresight_set_mode(csdev, CS_MODE_PERF); drvdata->perf_buf = etr_perf->etr_buf; csdev->refcnt++; } @@ -1781,14 +1756,52 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) static int tmc_enable_etr_sink(struct coresight_device *csdev, enum cs_mode mode, void *data) { + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + enum cs_mode old_mode; + int rc = -EINVAL; + + scoped_guard(spinlock_irqsave, &drvdata->spinlock) { + old_mode = coresight_get_mode(csdev); + if (old_mode != CS_MODE_DISABLED && old_mode != mode) + return -EBUSY; + + if (drvdata->reading) + return -EBUSY; + + /* In sysFS mode we can have multiple writers per sink. */ + if (old_mode == CS_MODE_SYSFS) { + csdev->refcnt++; + return 0; + } + + /* + * minor note: In sysFS mode, the task1 get locked first, it setup + * etr mode to SYSFS. Then task2 get locked，it will directly return + * success even when the tmc-etr is not enabled at this moment. + * Ultimately, task1 will still successfully enable tmc-etr. + * This is a transient state and does not cause an anomaly. + */ + coresight_set_mode(csdev, mode); + } + switch (mode) { case CS_MODE_SYSFS: - return tmc_enable_etr_sink_sysfs(csdev); + rc = tmc_enable_etr_sink_sysfs(csdev); + break; case CS_MODE_PERF: - return tmc_enable_etr_sink_perf(csdev, data); + rc = tmc_enable_etr_sink_perf(csdev, data); + break; default: - return -EINVAL; + rc = -EINVAL; } + + if (rc && old_mode != mode) { + scoped_guard(spinlock_irqsave, &drvdata->spinlock) { + coresight_set_mode(csdev, old_mode); + } + } + + return rc; }

static int tmc_disable_etr_sink(struct coresight_device *csdev)

-- 2.33.0

Leo Yan

2 Jul 2 Jul

4:47 p.m.

New subject: [PATCH v2 2/3] coresight: tmc: refactor the tmc-etr mode setting to avoid race conditions

On Fri, Jun 20, 2025 at 03:54:11PM +0800, Junhao He wrote:

[...]

...

To fix this, configure the tmc-etr mode before invoking enable_etr_perf() or enable_etr_sysfs(), explicitly check if the tmc-etr sink is already enabled in a different mode, and simplily the setup and checks for "mode". To prevent race conditions between mode transitions.

I have a similiar fixing for CTI driver, see: https://lore.kernel.org/linux-arm-kernel/20250701-arm_cs_pm_fix_v3-v2-0-23eb...

[...]

...

@@ -1781,14 +1756,52 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) static int tmc_enable_etr_sink(struct coresight_device *csdev, enum cs_mode mode, void *data) {

struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);

enum cs_mode old_mode;

int rc = -EINVAL;

scoped_guard(spinlock_irqsave, &drvdata->spinlock) {

scoped_guard(raw_spinlock_irqsave, &drvdata->spinlock) {

I am concerned that the spinlock is acquired and released for several times in a single flow. It is possible to hit some corner case, e.g.,

- CPU0 acquires the lock, set sink as SYSFS mode, and releases the lock; - CPU1 acquires the lock, detects the SYSFS mode has been set, directly bail out, then enable ETM.

The problem is the sink has not been enabled yet. This leads to CPU1 to enable the tracer but prior to sink's enabling.

The key piont is we should ensure the mode is consistent to the hardware state. I will take a bit time for if we can have a better idea for this.

...

```
old_mode = coresight_get_mode(csdev);
```

if (old_mode != CS_MODE_DISABLED && old_mode != mode)

```
	return -EBUSY;
```
```
if (drvdata->reading)
```
```
	return -EBUSY;
```

/* In sysFS mode we can have multiple writers per sink. */

```
if (old_mode == CS_MODE_SYSFS) {
```
```
	csdev->refcnt++;
```

I am just wandering if we can unify the "refcnt" code for both SYSFS mode and Perf mode, and as a result, we can consolidate the code cross different drivers.

Something like:

if (!csdev->refcnt) { coresight_set_mode(csdev, mode); } else { /* The device has been configured with an incompatible mode */ if (coresight_get_mode(csdev) != mode) return -EBUSY;

csdev->refcnt++; }

Then when disable the device:

csdev->refcnt--; if (!csdev->refcnt) coresight_set_mode(csdev, CS_MODE_DISABLED);

We should not check "drvdata->reading" when enable or disable sink, as it is a flag to track if reading the trace buffer, it is irrelevant to the sink device's enabling or disabling.

Please verify perf mode (especially for system wide session) to avoid anything missed.

Thanks, Leo

...

```
	return 0;
```
```
}
```
```
/*
```

 * minor note: In sysFS mode, the task1 get locked first, it setup

 * etr mode to SYSFS. Then task2 get locked，it will directly return

 * success even when the tmc-etr is not enabled at this moment.

 * Ultimately, task1 will still successfully enable tmc-etr.

 * This is a transient state and does not cause an anomaly.

```
 */
```
```
coresight_set_mode(csdev, mode);
```
}
switch (mode) { case CS_MODE_SYSFS:

return tmc_enable_etr_sink_sysfs(csdev);

```
rc = tmc_enable_etr_sink_sysfs(csdev);
```
```
break;
```
case CS_MODE_PERF:

return tmc_enable_etr_sink_perf(csdev, data);

rc = tmc_enable_etr_sink_perf(csdev, data);

```
break;
```
default:

```
return -EINVAL;
```

```
rc = -EINVAL;
```
}
if (rc && old_mode != mode) {

scoped_guard(spinlock_irqsave, &drvdata->spinlock) {

```
	coresight_set_mode(csdev, old_mode);
```
```
}
```
}
return rc;

} static int tmc_disable_etr_sink(struct coresight_device *csdev) -- 2.33.0

hejunhao

19 Jul 19 Jul

11:09 a.m.

New subject: [PATCH v2 2/3] coresight: tmc: refactor the tmc-etr mode setting to avoid race conditions

On 2025/7/3 0:47, Leo Yan wrote:

...

On Fri, Jun 20, 2025 at 03:54:11PM +0800, Junhao He wrote:

[...]

...
To fix this, configure the tmc-etr mode before invoking enable_etr_perf() or enable_etr_sysfs(), explicitly check if the tmc-etr sink is already enabled in a different mode, and simplily the setup and checks for "mode". To prevent race conditions between mode transitions.

I have a similiar fixing for CTI driver, see: https://lore.kernel.org/linux-arm-kernel/20250701-arm_cs_pm_fix_v3-v2-0-23eb...

[...]

...
@@ -1781,14 +1756,52 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) static int tmc_enable_etr_sink(struct coresight_device *csdev, enum cs_mode mode, void *data) {

struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);

enum cs_mode old_mode;

int rc = -EINVAL;

scoped_guard(spinlock_irqsave, &drvdata->spinlock) {

scoped_guard(raw_spinlock_irqsave, &drvdata->spinlock) {

Ok, will fix it.

...

I am concerned that the spinlock is acquired and released for several times in a single flow. It is possible to hit some corner case, e.g.,

CPU0 acquires the lock, set sink as SYSFS mode, and releases the lock;

CPU1 acquires the lock, detects the SYSFS mode has been set, directly bail out, then enable ETM.

The problem is the sink has not been enabled yet. This leads to CPU1 to enable the tracer but prior to sink's enabling.

Yes, you are right. So I added a minor note for that. The impact of this corner case may not have been fully considered.

...

The key piont is we should ensure the mode is consistent to the hardware state. I will take a bit time for if we can have a better idea for this.

...
old_mode = coresight_get_mode(csdev);
if (old_mode != CS_MODE_DISABLED && old_mode != mode)
	return -EBUSY;
if (drvdata->reading)
	return -EBUSY;
/* In sysFS mode we can have multiple writers per sink. */
if (old_mode == CS_MODE_SYSFS) {
	csdev->refcnt++;
I am just wandering if we can unify the "refcnt" code for both SYSFS mode and Perf mode, and as a result, we can consolidate the code cross different drivers.

Something like:
             if (!csdev->refcnt) {
                     coresight_set_mode(csdev, mode);
             } else {
                     /* The device has been configured with an incompatible mode */
                     if (coresight_get_mode(csdev) != mode)
                             return -EBUSY;

                     csdev->refcnt++;

we need to check the mode, cannot directly increase the reference count In performance mode.

...

             }
Then when disable the device:
             csdev->refcnt--;
             if (!csdev->refcnt)
                     coresight_set_mode(csdev, CS_MODE_DISABLED);
We should not check "drvdata->reading" when enable or disable sink, as it is a flag to track if reading the trace buffer, it is irrelevant to the sink device's enabling or disabling.

In sysfs mode, it is necessary to check drvdata->reading.

...

Please verify perf mode (especially for system wide session) to avoid anything missed.

I will refactor this code and upload a new version that includes all of your suggested solutions.

Thanks, Junhao

...

Thanks, Leo

...

```
	return 0;
```
```
}
```
```
/*
```

 * minor note: In sysFS mode, the task1 get locked first, it setup

 * etr mode to SYSFS. Then task2 get locked，it will directly return

 * success even when the tmc-etr is not enabled at this moment.

 * Ultimately, task1 will still successfully enable tmc-etr.

 * This is a transient state and does not cause an anomaly.

```
 */
```
```
coresight_set_mode(csdev, mode);
```
}
switch (mode) { case CS_MODE_SYSFS:

return tmc_enable_etr_sink_sysfs(csdev);

```
rc = tmc_enable_etr_sink_sysfs(csdev);
```
```
break;
```
case CS_MODE_PERF:

return tmc_enable_etr_sink_perf(csdev, data);

rc = tmc_enable_etr_sink_perf(csdev, data);

```
break;
```
default:

```
return -EINVAL;
```

```
rc = -EINVAL;
```
}
if (rc && old_mode != mode) {

scoped_guard(spinlock_irqsave, &drvdata->spinlock) {

```
	coresight_set_mode(csdev, old_mode);
```
```
}
```
}
return rc; }

static int tmc_disable_etr_sink(struct coresight_device *csdev) -- 2.33.0

Junhao He

20 Jun 20 Jun

7:54 a.m.

New subject: [PATCH v2 3/3] coresight: tmc: Decouple the perf buffer allocation from sysfs mode

From: Yicong Yang yangyicong@hisilicon.com

Currently the perf buffer allocation follows the below logic: - if the required AUX buffer size if larger, allocate the buffer with the required size - otherwise allocate the size reference to the sysfs buffer size

This is not useful as we only collect to one AUX data, so just try to allocate the buffer match the AUX buffer size.

Suggested-by: Suzuki K Poulose suzuki.poulose@arm.com Link: https://lore.kernel.org/linux-arm-kernel/df8967cd-2157-46a2-97d9-a1aea883cf6... Signed-off-by: Yicong Yang yangyicong@hisilicon.com Signed-off-by: Junhao He hejunhao3@huawei.com --- .../hwtracing/coresight/coresight-tmc-etr.c | 29 ++++++------------- 1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 252a57a8e94e..94dc968a76da 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1327,9 +1327,7 @@ EXPORT_SYMBOL_GPL(tmc_etr_get_buffer);

/* * alloc_etr_buf: Allocate ETR buffer for use by perf. - * The size of the hardware buffer is dependent on the size configured - * via sysfs and the perf ring buffer size. We prefer to allocate the - * largest possible size, scaling down the size by half until it + * Allocate the largest possible size, scaling down the size by half until it * reaches a minimum limit (1M), beyond which we give up. */ static struct etr_buf * @@ -1341,33 +1339,24 @@ alloc_etr_buf(struct tmc_drvdata *drvdata, struct perf_event *event, unsigned long size;

node = (event->cpu == -1) ? NUMA_NO_NODE : cpu_to_node(event->cpu); - /* - * Try to match the perf ring buffer size if it is larger - * than the size requested via sysfs. - */ - if ((nr_pages << PAGE_SHIFT) > drvdata->size) { - etr_buf = tmc_alloc_etr_buf(drvdata, ((ssize_t)nr_pages << PAGE_SHIFT), - 0, node, NULL); - if (!IS_ERR(etr_buf)) - goto done; - } + + /* Use the minimum limit if the required size is smaller */ + size = (unsigned long)nr_pages << PAGE_SHIFT; + if (size < TMC_ETR_PERF_MIN_BUF_SIZE) + size = TMC_ETR_PERF_MIN_BUF_SIZE;

/* - * Else switch to configured size for this ETR - * and scale down until we hit the minimum limit. + * Try to allocate the required size for this ETR, if failed scale + * down until we hit the minimum limit. */ - size = drvdata->size; do { etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL); if (!IS_ERR(etr_buf)) - goto done; + return etr_buf; size /= 2; } while (size >= TMC_ETR_PERF_MIN_BUF_SIZE);

return ERR_PTR(-ENOMEM); - -done: - return etr_buf; }

static struct etr_buf *

-- 2.33.0

Leo Yan

2 Jul 2 Jul

5:08 p.m.

New subject: [PATCH v2 3/3] coresight: tmc: Decouple the perf buffer allocation from sysfs mode

On Fri, Jun 20, 2025 at 03:54:12PM +0800, Junhao He wrote:

[..]

...

@@ -1341,33 +1339,24 @@ alloc_etr_buf(struct tmc_drvdata *drvdata, struct perf_event *event, unsigned long size; node = (event->cpu == -1) ? NUMA_NO_NODE : cpu_to_node(event->cpu);
/*
* Try to match the perf ring buffer size if it is larger
* than the size requested via sysfs.
*/
if ((nr_pages << PAGE_SHIFT) > drvdata->size) {
etr_buf = tmc_alloc_etr_buf(drvdata, ((ssize_t)nr_pages << PAGE_SHIFT),
			    0, node, NULL);
if (!IS_ERR(etr_buf))
	goto done;
}
/* Use the minimum limit if the required size is smaller */

size = (unsigned long)nr_pages << PAGE_SHIFT;

Please change the size's type to ssize_t, then:

size = nr_pages << PAGE_SHIFT;

...

if (size < TMC_ETR_PERF_MIN_BUF_SIZE)
size = TMC_ETR_PERF_MIN_BUF_SIZE;

size = min_t(ssize_t, size, TMC_ETR_PERF_MIN_BUF_SIZE);

...

/*
* Else switch to configured size for this ETR
* and scale down until we hit the minimum limit.
* Try to allocate the required size for this ETR, if failed scale
* down until we hit the minimum limit.
*/
size = drvdata->size; do { etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL); if (!IS_ERR(etr_buf))
	goto done;
	return etr_buf;
size /= 2; } while (size >= TMC_ETR_PERF_MIN_BUF_SIZE);

Do we really need to scale down buffer size for failure cases?

I would like a straightforward code:

etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL); if (IS_ERR_OR_NULL(etr_buf)) return etr_buf;

Just a side topic, we know tmc_alloc_etr_buf() should not return NULL pointer. For a sanity check, the callers (alloc_etr_buf(), tmc_etr_get_sysfs_buffer(), etc) should valid a buffer pointer with IS_ERR_OR_NULL() rather than IS_ERR(). This can be a separate patch.

Thanks, Leo

...

return ERR_PTR(-ENOMEM);

-done:

return etr_buf;

} static struct etr_buf * -- 2.33.0

hejunhao

19 Jul 19 Jul

11:20 a.m.

New subject: [PATCH v2 3/3] coresight: tmc: Decouple the perf buffer allocation from sysfs mode

On 2025/7/3 1:08, Leo Yan wrote:

...

On Fri, Jun 20, 2025 at 03:54:12PM +0800, Junhao He wrote:

[..]

...
@@ -1341,33 +1339,24 @@ alloc_etr_buf(struct tmc_drvdata *drvdata, struct perf_event *event, unsigned long size; node = (event->cpu == -1) ? NUMA_NO_NODE : cpu_to_node(event->cpu);
/*
* Try to match the perf ring buffer size if it is larger
* than the size requested via sysfs.
*/
if ((nr_pages << PAGE_SHIFT) > drvdata->size) {
etr_buf = tmc_alloc_etr_buf(drvdata, ((ssize_t)nr_pages << PAGE_SHIFT),
			    0, node, NULL);
if (!IS_ERR(etr_buf))
	goto done;
}
/* Use the minimum limit if the required size is smaller */

size = (unsigned long)nr_pages << PAGE_SHIFT;
Please change the size's type to ssize_t, then:
     size = nr_pages << PAGE_SHIFT;
...
if (size < TMC_ETR_PERF_MIN_BUF_SIZE)
size = TMC_ETR_PERF_MIN_BUF_SIZE;
size = min_t(ssize_t, size, TMC_ETR_PERF_MIN_BUF_SIZE);

Sure, I will do that.

...

...
/*
* Else switch to configured size for this ETR
* and scale down until we hit the minimum limit.
* Try to allocate the required size for this ETR, if failed scale
* down until we hit the minimum limit.
*/
size = drvdata->size; do { etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL); if (!IS_ERR(etr_buf))
	goto done;
	return etr_buf;
size /= 2; } while (size >= TMC_ETR_PERF_MIN_BUF_SIZE);
Do we really need to scale down buffer size for failure cases? I would like a straightforward code:
     etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL);
     if (IS_ERR_OR_NULL(etr_buf))
             return etr_buf;
Just a side topic, we know tmc_alloc_etr_buf() should not return NULL pointer. For a sanity check, the callers (alloc_etr_buf(), tmc_etr_get_sysfs_buffer(), etc) should valid a buffer pointer with IS_ERR_OR_NULL() rather than IS_ERR(). This can be a separate patch.

A new patch will be added to achieve this.

Thank you for your comments. Junhao.

...

Thanks, Leo

...
return ERR_PTR(-ENOMEM);

-done:

return etr_buf; }

static struct etr_buf * -- 2.33.0

.

hejunhao

21 Jul 21 Jul

6:36 a.m.

New subject: [PATCH v2 3/3] coresight: tmc: Decouple the perf buffer allocation from sysfs mode

On 2025/7/3 1:08, Leo Yan wrote:

...

...
/*
* Else switch to configured size for this ETR
* and scale down until we hit the minimum limit.
* Try to allocate the required size for this ETR, if failed scale
* down until we hit the minimum limit.
*/
size = drvdata->size; do { etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL); if (!IS_ERR(etr_buf))
	goto done;
	return etr_buf;
size /= 2; } while (size >= TMC_ETR_PERF_MIN_BUF_SIZE);
Do we really need to scale down buffer size for failure cases? I would like a straightforward code:
     etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL);
     if (IS_ERR_OR_NULL(etr_buf))
             return etr_buf;

I conducted test on my platform, and the logs are as follows:

[root@localhost tmc_etr0]# cat buf_modes_available auto flat [root@localhost etm]# perf record -e cs_etm// -C 0 -m,8M -- stress -c 1 -t 10 failed to mmap with 22 (Invalid argument) [root@localhost etm]# perf record -e cs_etm// -C 0 -m,4M -- stress -c 1 -t 10 stress: info: [1727601] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd stress: info: [1727601] successful run completed in 10s [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 9.358 MB perf.data ] [root@localhost etm]# perf record -e cs_etm// -C 0 -- stress -c 1 -t 10 stress: info: [1727717] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd stress: info: [1727717] successful run completed in 10s [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 9.380 MB perf.data ]

If the user-specified mmap pages exceed the maximum number of pages supported by ETR, the perf process will return a failure. If the user has a method to obtain the maximum number of pages supported by ETR, I believe that scaling down the buffer size for failure cases is unnecessary.

Best regards, Junhao.

...

Just a side topic, we know tmc_alloc_etr_buf() should not return NULL pointer. For a sanity check, the callers (alloc_etr_buf(), tmc_etr_get_sysfs_buffer(), etc) should valid a buffer pointer with IS_ERR_OR_NULL() rather than IS_ERR(). This can be a separate patch.

300

days inactive

331

days old

coresight@lists.linaro.org

10 comments

participants

tags (0)

participants (3)

hejunhao
Junhao He
Leo Yan