Hi,
Thanks for the report, see my comments inline.
On 14/11/2024 15:26, James Clark wrote:
On 14/11/2024 2:51 pm, Yicong Yang wrote:
On 2024/11/14 18:30, James Clark wrote:
On 14/11/2024 8:16 am, Yicong Yang wrote:
From: Yicong Yang <yangyicong@hisilicon.com>
Enabling the trace with the steps below crashes the kernel with a NULL pointer dereference:

  echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
  echo 1 > /sys/bus/coresight/devices/etm0/enable_source
  echo 0x400000 > /sys/bus/coresight/devices/tmc_etr0/buffer_size
  echo 1 > /sys/bus/coresight/devices/etm2/enable_source
  dd if=/dev/tmc_etr0 of=test_etm_sysfs_etr_030.data
The call trace will be like:

  WARNING: CPU: 39 PID: 8586 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1123 __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
  [...]
  Call trace:
   __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
   tmc_read_prepare_etr+0xc0/0xd0 [coresight_tmc]
   tmc_open+0x60/0xa0 [coresight_tmc]
   misc_open+0x11c/0x170
   chrdev_open+0xcc/0x2b0
   do_dentry_open+0x140/0x4e0
   vfs_open+0x34/0xf8
   path_openat+0x2b0/0xf58
   do_filp_open+0x8c/0x148
   do_sys_openat2+0xb8/0xe8
   __arm64_sys_openat+0x70/0xc0
   el0_svc_common.constprop.0+0x64/0x148
   do_el0_svc+0x24/0x38
   el0_svc+0x40/0x140
   el0t_64_sync_handler+0xc0/0xc8
   el0t_64_sync+0x1a4/0x1a8
  ---[ end trace 0000000000000000 ]---
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
  [...]
  Call trace:
   tmc_etr_get_sysfs_trace+0x10/0x80 [coresight_tmc]
   vfs_read+0xcc/0x310
   ksys_read+0x74/0x108
   __arm64_sys_read+0x24/0x38
   el0_svc_common.constprop.0+0x64/0x148
   do_el0_svc+0x24/0x38
   el0_svc+0x40/0x140
Because the buffer size has changed, the buffer is reallocated in tmc_etr_get_sysfs_buffer() when the second source is enabled. At the end of the trace, tmc_etr_sync_sysfs_buf() resets drvdata->sysfs_buf, which triggers the later NULL pointer dereference when the data is read out.
But it doesn't make sense to change the buffer size when it's already in use. So block such behavior.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
 drivers/hwtracing/coresight/coresight-tmc-core.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
index 475fa4bb6813..9660af63e9bc 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-core.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
@@ -319,6 +319,11 @@ static ssize_t buffer_size_store(struct device *dev,
 	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
 		return -EPERM;

+	/* Don't change the buffer size if it's in use */
+	guard(spinlock)(&drvdata->spinlock);
+	if (coresight_get_mode(drvdata->csdev) != CS_MODE_DISABLED)
Could we do something like this below ?
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index a48bb85d0e7f..863a645fa88a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1178,7 +1178,9 @@ static struct etr_buf *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
 	 */
 	spin_lock_irqsave(&drvdata->spinlock, flags);
 	sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
-	if (!sysfs_buf || (sysfs_buf->size != drvdata->size)) {
+	if (!sysfs_buf ||
+	    ((sysfs_buf->size != drvdata->size) &&
+	     coresight_get_mode(csdev) != CS_MODE_SYSFS))
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);

 		/* Allocate memory with the locks released */
i.e., do not allocate a new buffer while sysfs mode is active. The new size then takes effect when the next session starts.
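To make the intent of that condition easier to check, here is a small userspace model of it. This is only a sketch: `enum model_mode`, `struct model_sink` and `needs_new_sysfs_buf()` are hypothetical stand-ins, not the driver's types, and the spinlock handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's mode values. */
enum model_mode { MODEL_DISABLED, MODEL_SYSFS, MODEL_PERF };

/* Hypothetical, reduced view of the sink state that the check looks at. */
struct model_sink {
	size_t configured_size;	/* models drvdata->size */
	size_t buf_size;	/* models sysfs_buf->size, valid if has_buf */
	bool has_buf;		/* models drvdata->sysfs_buf != NULL */
	enum model_mode mode;	/* models coresight_get_mode(csdev) */
};

/*
 * Return true when a fresh sysfs buffer should be allocated: either none
 * exists yet, or the configured size changed while no sysfs session is
 * active. An active sysfs session keeps its current buffer.
 */
static bool needs_new_sysfs_buf(const struct model_sink *s)
{
	assert(s != NULL);

	if (!s->has_buf)
		return true;

	return s->buf_size != s->configured_size && s->mode != MODEL_SYSFS;
}
```

With a buffer present and the mode set to MODEL_SYSFS, a changed `configured_size` does not request a new buffer; once the mode is MODEL_DISABLED again, the same mismatch does.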
Size isn't used in perf mode is it? So it can be -EBUSY only when mode == CS_MODE_SYSFS.
alloc_etr_buf() on the perf path reads drvdata->size; I'm not sure whether it matters if the user changes it through sysfs in the meantime. I will test this and check whether any other places use size on the perf path.
That was there to make sure the user can allocate a bigger buffer (of the AUX size vs sysfs configured size) and possibly collect more trace (i.e., in multiple aux buffers). But looks like that is not useful, given we can only ever collect to one AUX (the last one turning ETR off).
So we could remove that check.
Suzuki
Hmmm I assumed that Perf mode completely ignored anything from sysfs mode. I see that alloc_etr_buf() does sometimes use the sysfs value. I don't really see why that's necessary because that means it sometimes ignores the buffer size from the perf command line depending on what's in sysfs, but the modes should be mutually exclusive.
Unless we fix that then I think you do need to use the device spinlock. But I think we should tidy up alloc_etr_buf() to only try to allocate from the Perf size down to TMC_ETR_PERF_MIN_BUF_SIZE, ignoring drvdata->size. Then the behavior is less surprising to users and also anyone reading the code. And rename it to alloc_etr_buf_perf().
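Roughly, the tidied-up fallback could look like the userspace sketch below. It is not the driver code: `MODEL_PERF_MIN_BUF_SIZE` is an assumed value standing in for TMC_ETR_PERF_MIN_BUF_SIZE, and `model_alloc_fn`, `model_capped_alloc()` and `model_alloc_etr_buf_perf()` are hypothetical names used only so the loop can be exercised outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed value for illustration; the driver defines its own minimum. */
#define MODEL_PERF_MIN_BUF_SIZE ((size_t)1 << 20)

/* Hypothetical allocator hook so the fallback loop can be tested. */
typedef void *(*model_alloc_fn)(size_t size);

/* Example allocator that simulates memory pressure: fails above 4 MiB. */
static void *model_capped_alloc(size_t size)
{
	return size > ((size_t)4 << 20) ? NULL : malloc(size);
}

/*
 * Try the size requested by perf first, then halve it until an allocation
 * succeeds or the size drops below the minimum. The sysfs-configured size
 * is deliberately not consulted. On success *out_size holds the size used.
 */
static void *model_alloc_etr_buf_perf(size_t perf_size, model_alloc_fn alloc,
				      size_t *out_size)
{
	assert(alloc != NULL && out_size != NULL);

	for (size_t size = perf_size; size >= MODEL_PERF_MIN_BUF_SIZE; size /= 2) {
		void *buf = alloc(size);

		if (buf) {
			*out_size = size;
			return buf;
		}
	}

	return NULL;
}
```

The point of the sketch is that the only inputs are the perf-requested size and the minimum, so nothing written to sysfs can change what a perf session gets.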
Unless Suzuki knows of a reason it was done that way to begin with? I checked the commit message but it just says that it was like that but not why.
+		return -EBUSY;
+
 	ret = kstrtoul(buf, 0, &val);
 	if (ret)
 		return ret;
Looks ok to me. Although for consistency it might be worth changing to guard(mutex)(&coresight_mutex) because this is about sysfs mode only and other usages of mode and comments point to coresight_mutex. Using the device's spinlock will technically work but it did make me go and double check the code. And there are other cases of reading the mode like this:
OK, I wanted to also serialize the use of drvdata->size. But as you mentioned, using coresight_mutex is enough and will be consistent with other places.
static ssize_t enable_source_show(struct device *dev,
				  struct device_attribute *attr, char *buf)
{
	struct coresight_device *csdev = to_coresight_device(dev);

	guard(mutex)(&coresight_mutex);
	return scnprintf(buf, PAGE_SIZE, "%u\n",
			 coresight_get_mode(csdev) == CS_MODE_SYSFS);
}
Mode can change to CS_MODE_PERF while inside coresight_mutex but the device would end up not being enabled for sysfs, so it's still ok to update the sysfs size value in that case.
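As a rough illustration of the behaviour being agreed on here, the following userspace model rejects a size update only while the sink is in sysfs mode and takes a single global lock around the check. It is a sketch under assumptions: `sink_mutex` is a pthread mutex standing in for coresight_mutex, `enum sink_mode`, `struct sink_model` and `sink_buffer_size_store()` are hypothetical names, and the parsing is simplified compared with the real store handler.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's mode values. */
enum sink_mode { SINK_DISABLED, SINK_SYSFS, SINK_PERF };

/* Hypothetical, reduced model of the sink: configured size plus mode. */
struct sink_model {
	unsigned long size;
	enum sink_mode mode;
};

/* Models the global coresight_mutex with a userspace mutex. */
static pthread_mutex_t sink_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Parse a new size and store it unless a sysfs session is active.
 * Returns 0 on success, -EINVAL for unparsable input and -EBUSY while
 * the sink is in sysfs mode. Perf mode does not block the update.
 */
static int sink_buffer_size_store(struct sink_model *sink, const char *buf)
{
	char *end;
	unsigned long val;
	int ret = 0;

	assert(sink != NULL && buf != NULL);

	errno = 0;
	val = strtoul(buf, &end, 0);
	if (errno || end == buf)
		return -EINVAL;

	pthread_mutex_lock(&sink_mutex);
	if (sink->mode == SINK_SYSFS)
		ret = -EBUSY;
	else
		sink->size = val;
	pthread_mutex_unlock(&sink_mutex);

	return ret;
}
```

In this model a write of "0x400000" fails with -EBUSY while the mode is SINK_SYSFS and leaves the stored size untouched, while the same write succeeds in SINK_PERF or SINK_DISABLED.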
With that change:
Reviewed-by: James Clark <james.clark@linaro.org>
Thanks.