If a profiling program runs in a non-root PID namespace, if CoreSight driver enables PID tracing (with contextID), it can lead to mismatching issue between the context ID traced in hardware (from the root namespace) and the PIDs gathered by profiling tool (e.g. perf) in its non-root namespace.
CoreSight driver has tried to address this issue for the contextID related interfaces under sysfs, but it misses to prevent user to set VMID (virtual contextID) for kernel runs in EL2 with VHE; furthermore, it misses to handle the case when the profiling tool runs in the non-root PID namespace.
For this reason, this patch series is to correct contextID tracing for non-root namespace. After applied this patchset, patch 02 doesn't permit users to access virtual contextID via sysfs nodes in the non-root PID namespace, patch 03 and 04 stop to trace PID packet for non-root PID namespace.
This patch is dependent on the patchset "pid: Introduce helper task_is_in_root_ns()" [1].
[1] https://lore.kernel.org/lkml/20211208083320.472503-1-leo.yan@linaro.org/
Leo Yan (4): coresight: etm4x: Add lock for reading virtual context ID comparator coresight: etm4x: Don't use virtual contextID for non-root PID namespace coresight: etm4x: Don't trace PID for non-root PID namespace coresight: etm3x: Don't trace PID for non-root PID namespace
.../coresight/coresight-etm3x-core.c | 4 +++ .../coresight/coresight-etm4x-core.c | 10 +++++-- .../coresight/coresight-etm4x-sysfs.c | 30 +++++++++++++++++++ 3 files changed, 42 insertions(+), 2 deletions(-)
Updates to the values and the index are protected via the spinlock. Ensure we use the same lock to read the value safely.
Signed-off-by: Leo Yan leo.yan@linaro.org Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-etm4x-sysfs.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c index 10ef2a29006e..2f3b4eef8261 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c @@ -2111,7 +2111,9 @@ static ssize_t vmid_val_show(struct device *dev, struct etmv4_drvdata *drvdata = dev_get_drvdata(dev->parent); struct etmv4_config *config = &drvdata->config;
+ spin_lock(&drvdata->spinlock); val = (unsigned long)config->vmid_val[config->vmid_idx]; + spin_unlock(&drvdata->spinlock); return scnprintf(buf, PAGE_SIZE, "%#lx\n", val); }
As commented in the function ctxid_pid_store(), it can cause the PID values mismatching between context ID tracing and PID allocated in a non-root namespace.
For this reason, when a process runs in non-root PID namespace, the driver doesn't allow PID tracing and returns failure when access contextID related sysfs nodes.
VMID works for virtual contextID when the kernel runs in EL2 mode with VHE; on the other hand, the driver doesn't prevent users from accessing it when programs run in the non-root namespace. Thus this can lead to same issues with contextID described above.
This patch imposes the checking on VMID related sysfs knobs and returns failure if current process runs in non-root PID namespace.
Signed-off-by: Leo Yan leo.yan@linaro.org --- .../coresight/coresight-etm4x-sysfs.c | 28 +++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c index 2f3b4eef8261..a00c0d1bbd85 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c @@ -2111,6 +2111,13 @@ static ssize_t vmid_val_show(struct device *dev, struct etmv4_drvdata *drvdata = dev_get_drvdata(dev->parent); struct etmv4_config *config = &drvdata->config;
+ /* + * Don't use virtual contextID tracing if coming from a PID namespace. + * See comment in ctxid_pid_store(). + */ + if (!task_is_in_init_pid_ns(current)) + return -EINVAL; + spin_lock(&drvdata->spinlock); val = (unsigned long)config->vmid_val[config->vmid_idx]; spin_unlock(&drvdata->spinlock); @@ -2125,6 +2132,13 @@ static ssize_t vmid_val_store(struct device *dev, struct etmv4_drvdata *drvdata = dev_get_drvdata(dev->parent); struct etmv4_config *config = &drvdata->config;
+ /* + * Don't use virtual contextID tracing if coming from a PID namespace. + * See comment in ctxid_pid_store(). + */ + if (!task_is_in_init_pid_ns(current)) + return -EINVAL; + /* * only implemented when vmid tracing is enabled, i.e. at least one * vmid comparator is implemented and at least 8 bit vmid size @@ -2148,6 +2162,13 @@ static ssize_t vmid_masks_show(struct device *dev, struct etmv4_drvdata *drvdata = dev_get_drvdata(dev->parent); struct etmv4_config *config = &drvdata->config;
+ /* + * Don't use virtual contextID tracing if coming from a PID namespace. + * See comment in ctxid_pid_store(). + */ + if (!task_is_in_init_pid_ns(current)) + return -EINVAL; + spin_lock(&drvdata->spinlock); val1 = config->vmid_mask0; val2 = config->vmid_mask1; @@ -2165,6 +2186,13 @@ static ssize_t vmid_masks_store(struct device *dev, struct etmv4_config *config = &drvdata->config; int nr_inputs;
+ /* + * Don't use virtual contextID tracing if coming from a PID namespace. + * See comment in ctxid_pid_store(). + */ + if (!task_is_in_init_pid_ns(current)) + return -EINVAL; + /* * only implemented when vmid tracing is enabled, i.e. at least one * vmid comparator is implemented and at least 8 bit vmid size
When runs in perf mode, the driver always enables the PID tracing. This can lead confusion when the profiling session runs in non-root PID namespace, whereas it records the PIDs from the root PID namespace.
To avoid confusion for PID tracing, when runs in perf mode, this patch changes to only enable PID tracing for root PID namespace.
As result, after apply this patch, the perf tool reports PID as '-1' for all samples:
# unshare --fork --pid perf record -e cs_etm// -m 64K,64K -a \ -o perf_test.data -- uname # perf report -i perf_test.data --itrace=Zi1000i --stdio
# Total Lost Samples: 0 # # Samples: 94 of event 'instructions' # Event count (approx.): 94000 # # Overhead Command Shared Object Symbol # ........ ....... ................. .............................. # 68.09% :-1 [kernel.kallsyms] [k] __sched_text_end 3.19% :-1 [kernel.kallsyms] [k] hrtimer_interrupt 2.13% :-1 [kernel.kallsyms] [k] __bitmap_and 2.13% :-1 [kernel.kallsyms] [k] trace_vbprintk 1.06% :-1 [kernel.kallsyms] [k] __fget_files 1.06% :-1 [kernel.kallsyms] [k] __schedule 1.06% :-1 [kernel.kallsyms] [k] __softirqentry_text_start 1.06% :-1 [kernel.kallsyms] [k] __update_load_avg_cfs_rq 1.06% :-1 [kernel.kallsyms] [k] __update_load_avg_se 1.06% :-1 [kernel.kallsyms] [k] arch_counter_get_cntpct 1.06% :-1 [kernel.kallsyms] [k] check_and_switch_context 1.06% :-1 [kernel.kallsyms] [k] format_decode 1.06% :-1 [kernel.kallsyms] [k] handle_percpu_devid_irq 1.06% :-1 [kernel.kallsyms] [k] irq_enter_rcu 1.06% :-1 [kernel.kallsyms] [k] irqtime_account_irq 1.06% :-1 [kernel.kallsyms] [k] ktime_get 1.06% :-1 [kernel.kallsyms] [k] ktime_get_coarse_real_ts64 1.06% :-1 [kernel.kallsyms] [k] memmove 1.06% :-1 [kernel.kallsyms] [k] perf_ioctl 1.06% :-1 [kernel.kallsyms] [k] perf_output_begin 1.06% :-1 [kernel.kallsyms] [k] perf_output_copy 1.06% :-1 [kernel.kallsyms] [k] profile_tick 1.06% :-1 [kernel.kallsyms] [k] sched_clock 1.06% :-1 [kernel.kallsyms] [k] timerqueue_add 1.06% :-1 [kernel.kallsyms] [k] trace_save_cmdline 1.06% :-1 [kernel.kallsyms] [k] update_load_avg 1.06% :-1 [kernel.kallsyms] [k] vbin_printf
Signed-off-by: Leo Yan leo.yan@linaro.org --- drivers/hwtracing/coresight/coresight-etm4x-core.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 86a313857b58..f3eda536267c 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -656,7 +656,9 @@ static int etm4_parse_event_config(struct coresight_device *csdev, config->cfg |= BIT(11); }
- if (attr->config & BIT(ETM_OPT_CTXTID)) + /* Only trace contextID when runs in root PID namespace */ + if ((attr->config & BIT(ETM_OPT_CTXTID)) && + task_is_in_init_pid_ns(current)) /* bit[6], Context ID tracing bit */ config->cfg |= BIT(ETM4_CFG_BIT_CTXTID);
@@ -670,7 +672,11 @@ static int etm4_parse_event_config(struct coresight_device *csdev, ret = -EINVAL; goto out; } - config->cfg |= BIT(ETM4_CFG_BIT_VMID) | BIT(ETM4_CFG_BIT_VMID_OPT); + + /* Only trace virtual contextID when runs in root PID namespace */ + if (task_is_in_init_pid_ns(current)) + config->cfg |= BIT(ETM4_CFG_BIT_VMID) | + BIT(ETM4_CFG_BIT_VMID_OPT); }
/* return stack - enable if selected and supported */
ETMv3 driver enables PID tracing by directly using perf config from userspace, this means the tracer will capture PID packets from root namespace but the profiling session runs in non-root PID namespace. Finally, the recorded packets can mislead perf reporting with the mismatched PID values.
This patch changes to only enable PID tracing for root PID namespace. Note, the hardware supports VMID tracing from ETMv3.5, but the driver never enables VMID trace, this patch doesn't handle VMID trace (bit 30 in ETMCR register) particularly.
Signed-off-by: Leo Yan leo.yan@linaro.org --- drivers/hwtracing/coresight/coresight-etm3x-core.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c index cf64ce73a741..7d413ba8b823 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c @@ -340,6 +340,10 @@ static int etm_parse_event_config(struct etm_drvdata *drvdata,
config->ctrl = attr->config;
+ /* Don't trace contextID when runs in non-root PID namespace */ + if (!task_is_in_init_pid_ns(current)) + config->ctrl &= ~ETMCR_CTXID_SIZE; + /* * Possible to have cores with PTM (supports ret stack) and ETM * (never has ret stack) on the same SoC. So if we have a request
On 13/12/2021 12:13, Leo Yan wrote:
If a profiling program runs in a non-root PID namespace, if CoreSight driver enables PID tracing (with contextID), it can lead to mismatching issue between the context ID traced in hardware (from the root namespace) and the PIDs gathered by profiling tool (e.g. perf) in its non-root namespace.
CoreSight driver has tried to address this issue for the contextID related interfaces under sysfs, but it misses to prevent user to set VMID (virtual contextID) for kernel runs in EL2 with VHE; furthermore, it misses to handle the case when the profiling tool runs in the non-root PID namespace.
For this reason, this patch series is to correct contextID tracing for non-root namespace. After applied this patchset, patch 02 doesn't permit users to access virtual contextID via sysfs nodes in the non-root PID namespace, patch 03 and 04 stop to trace PID packet for non-root PID namespace.
This patch is dependent on the patchset "pid: Introduce helper task_is_in_root_ns()" [1].
[1] https://lore.kernel.org/lkml/20211208083320.472503-1-leo.yan@linaro.org/
Leo Yan (4): coresight: etm4x: Add lock for reading virtual context ID comparator coresight: etm4x: Don't use virtual contextID for non-root PID namespace coresight: etm4x: Don't trace PID for non-root PID namespace coresight: etm3x: Don't trace PID for non-root PID namespace
For the series,
Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com