Hi,
On Tue, 23 Sept 2025 at 02:49, Jie Gan jie.gan@oss.qualcomm.com wrote:
On 9/23/2025 1:31 AM, Carl Worth wrote:
Jie Gan jie.gan@oss.qualcomm.com writes:
From: Carl Worth carl@os.amperecomputing.com
The handle is essential for retrieving the AUX_EVENT of each CPU and is required in perf mode. It has been added to the coresight_path so that dependent devices can access it from the path when needed.
I'd still like to have the original command I used to trigger the bug in the commit message. I really like having reproduction steps captured in commit messages when I look back at commits in the future. So, that was:
perf record -e cs_etm//k -C 0-9 dd if=/dev/zero of=/dev/null
Sure, I’ll include your commit message in the formal patch series, I think it's V3 since you have submitted two versions, if you're okay with me sending it out.
/**
- struct coresight_path - data needed by enable/disable path
- @path_list: path from source to sink.
- @trace_id: trace_id of the whole path.
- @path_list: path from source to sink.
- @trace_id: trace_id of the whole path.
*/
- struct perf_output_handle: handle of the aux_event.
Fixing to "@handle" was mentioned in another comment already.
Something about the above still feels a little off to me. It feels like we're throwing new data into a structure just because it happens to be conveniently at hand for the code paths we're needing, and not because it really _belongs_ there.
This data is perf specific - not path generic; so I agree that this structure should go elsewhere.
I would suggest in the csdev (coresight_device) structure itself. We already have some sink specific data in here e.g. perf_sink_id_map.
This could then be set/clear in the functions coresight-etm-perf.c file, where there is a significant amount of code dealing with the perf handle and ensuring it is valid and in scope.
This can then be set only when appropriate - for source / sink devices and only when in perf mode, and avoid the need to pass the handle around as API call parameters.
Regards
Mike.
The core idea behind coresight_path is that it can hold all the data potentially needed by any device along the path.
For example with the path ETM->Link->ETR->CATU:
All the mentioned devices operate by forming a path, for which the system constructs a coresight_path. This 'path' is then passed to each device along the route, allowing any device to directly access the required data from coresight_path instead of receiving it as a separate argument.
Imagine a device that requires more than two or three arguments, and you want to pass them through coresight_enable_path or similar functions...
For certain coresight_path instances, we may not need the handle or other parameters. Since these values are initialized, it's acceptable to leave them as NULL or 0.
Or, maybe it's the right place for it, and the cause of my concern is that "path" is an overly-narrow name in struct coresight_path?
It defines the direction of data flow—serving as the path for trace data.
Thanks, Jie
But if a renaming of this structure would improve the code, I'd also be fine with that happening in a subsequent commit, so I won't try to hold up the current series based on that.
-Carl
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK
On 24/09/2025 11:21, Mike Leach wrote:
Hi,
On Tue, 23 Sept 2025 at 02:49, Jie Gan jie.gan@oss.qualcomm.com wrote:
On 9/23/2025 1:31 AM, Carl Worth wrote:
Jie Gan jie.gan@oss.qualcomm.com writes:
From: Carl Worth carl@os.amperecomputing.com
The handle is essential for retrieving the AUX_EVENT of each CPU and is required in perf mode. It has been added to the coresight_path so that dependent devices can access it from the path when needed.
I'd still like to have the original command I used to trigger the bug in the commit message. I really like having reproduction steps captured in commit messages when I look back at commits in the future. So, that was:
perf record -e cs_etm//k -C 0-9 dd if=/dev/zero of=/dev/null
Sure, I’ll include your commit message in the formal patch series, I think it's V3 since you have submitted two versions, if you're okay with me sending it out.
/** * struct coresight_path - data needed by enable/disable path
- @path_list: path from source to sink.
- @trace_id: trace_id of the whole path.
- @path_list: path from source to sink.
- @trace_id: trace_id of the whole path.
- struct perf_output_handle: handle of the aux_event. */
Fixing to "@handle" was mentioned in another comment already.
Something about the above still feels a little off to me. It feels like we're throwing new data into a structure just because it happens to be conveniently at hand for the code paths we're needing, and not because it really _belongs_ there.
This data is perf specific - not path generic; so I agree that this structure should go elsewhere.
I would suggest in the csdev (coresight_device) structure itself. We already have some sink specific data in here e.g. perf_sink_id_map.
This could then be set/clear in the functions coresight-etm-perf.c file, where there is a significant amount of code dealing with the perf handle and ensuring it is valid and in scope.
This can then be set only when appropriate - for source / sink devices and only when in perf mode, and avoid the need to pass the handle around as API call parameters.
I think this data is specific to the session we are enabling the device(s) in. e.g., we keep the trace-id in the path. So, I don't mind having this in the path structure. Instead of modifying csdev with additional locking from "etm-perf" it is always cleaner to handle this in the path.
Suzuki
Regards
Mike.
The core idea behind coresight_path is that it can hold all the data potentially needed by any device along the path.
For example with the path ETM->Link->ETR->CATU:
All the mentioned devices operate by forming a path, for which the system constructs a coresight_path. This 'path' is then passed to each device along the route, allowing any device to directly access the required data from coresight_path instead of receiving it as a separate argument.
Imagine a device that requires more than two or three arguments, and you want to pass them through coresight_enable_path or similar functions...
For certain coresight_path instances, we may not need the handle or other parameters. Since these values are initialized, it's acceptable to leave them as NULL or 0.
Or, maybe it's the right place for it, and the cause of my concern is that "path" is an overly-narrow name in struct coresight_path?
It defines the direction of data flow—serving as the path for trace data.
Thanks, Jie
But if a renaming of this structure would improve the code, I'd also be fine with that happening in a subsequent commit, so I won't try to hold up the current series based on that.
-Carl
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK
On 24/09/2025 5:42 pm, Suzuki K Poulose wrote:
On 24/09/2025 11:21, Mike Leach wrote:
Hi,
On Tue, 23 Sept 2025 at 02:49, Jie Gan jie.gan@oss.qualcomm.com wrote:
On 9/23/2025 1:31 AM, Carl Worth wrote:
Jie Gan jie.gan@oss.qualcomm.com writes:
From: Carl Worth carl@os.amperecomputing.com
The handle is essential for retrieving the AUX_EVENT of each CPU and is required in perf mode. It has been added to the coresight_path so that dependent devices can access it from the path when needed.
I'd still like to have the original command I used to trigger the bug in the commit message. I really like having reproduction steps captured in commit messages when I look back at commits in the future. So, that was:
perf record -e cs_etm//k -C 0-9 dd if=/dev/zero of=/dev/null
Sure, I’ll include your commit message in the formal patch series, I think it's V3 since you have submitted two versions, if you're okay with me sending it out.
/** * struct coresight_path - data needed by enable/disable path
- @path_list: path from source to sink.
- @trace_id: trace_id of the whole path.
- @path_list: path from source to sink.
- @trace_id: trace_id of the whole path.
- struct perf_output_handle: handle of the aux_event.
*/
Fixing to "@handle" was mentioned in another comment already.
Something about the above still feels a little off to me. It feels like we're throwing new data into a structure just because it happens to be conveniently at hand for the code paths we're needing, and not because it really _belongs_ there.
This data is perf specific - not path generic; so I agree that this structure should go elsewhere.
I would suggest in the csdev (coresight_device) structure itself. We already have some sink specific data in here e.g. perf_sink_id_map.
This could then be set/clear in the functions coresight-etm-perf.c file, where there is a significant amount of code dealing with the perf handle and ensuring it is valid and in scope.
This can then be set only when appropriate - for source / sink devices and only when in perf mode, and avoid the need to pass the handle around as API call parameters.
I think this data is specific to the session we are enabling the device(s) in. e.g., we keep the trace-id in the path. So, I don't mind having this in the path structure. Instead of modifying csdev with additional locking from "etm-perf" it is always cleaner to handle this in the path.
Suzuki
Yeah, and perf_sink_id_map only "needs" to be in the csdev because it controls sharing IDs between multiple paths which can't be accomplished by storing it in the path.
This one is just a pointer to the perf handle which really does belong to the session rather than the device. This makes it more of a path thing than a csdev thing. Maybe we can rename path to be more like "session", which also happens to contain a path. But I think path is fine for now.
However in this case handle is per-cpu data that is only accessed on the same cpu in tmc_etr_get_buffer(). Assigning it in etm_event_start() just copies the same per-cpu variable into a non per-cpu place that eventually gets accessed on the same cpu anyway.
If we exported it then it can be used directly without worrying where to store it:
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 17afa0f4cdee..4c33f442c80b 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -42,12 +42,8 @@ static bool etm_perf_up; * the ETM. Thus the event_data for the session must be part of the ETM context * to make sure we can disable the trace path. */ -struct etm_ctxt { - struct perf_output_handle handle; - struct etm_event_data *event_data; -}; - -static DEFINE_PER_CPU(struct etm_ctxt, etm_ctxt); +DEFINE_PER_CPU(struct etm_ctxt, etm_ctxt); +EXPORT_SYMBOL_GPL(etm_ctxt); static DEFINE_PER_CPU(struct coresight_device *, csdev_src);
/* diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index fd896ac07942..b834e8bef2a5 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -14,6 +14,7 @@
extern struct mutex coresight_mutex; extern const struct device_type coresight_dev_type[]; +DECLARE_PER_CPU(struct etm_ctxt, etm_ctxt);
/* * Coresight management registers (0xf00-0xfcc) @@ -49,6 +50,11 @@ extern const struct device_type coresight_dev_type[]; #define ETM_MODE_EXCL_HOST BIT(32) #define ETM_MODE_EXCL_GUEST BIT(33)
+struct etm_ctxt { + struct perf_output_handle handle; + struct etm_event_data *event_data; +}; + struct cs_pair_attribute { struct device_attribute attr; u32 lo_off; diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index bf08f6117a7f..7026994b02b3 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1328,8 +1328,9 @@ struct etr_buf *tmc_etr_get_buffer(struct coresight_device *csdev, enum cs_mode mode, struct coresight_path *path) { - struct perf_output_handle *handle = path->handle; struct etr_perf_buffer *etr_perf; + struct etm_ctxt *ctxt = this_cpu_ptr(&etm_ctxt); + struct perf_output_handle *handle = &ctxt->handle;
switch (mode) { case CS_MODE_SYSFS:
Regards
Mike.
The core idea behind coresight_path is that it can hold all the data potentially needed by any device along the path.
For example with the path ETM->Link->ETR->CATU:
All the mentioned devices operate by forming a path, for which the system constructs a coresight_path. This 'path' is then passed to each device along the route, allowing any device to directly access the required data from coresight_path instead of receiving it as a separate argument.
Imagine a device that requires more than two or three arguments, and you want to pass them through coresight_enable_path or similar functions...
For certain coresight_path instances, we may not need the handle or other parameters. Since these values are initialized, it's acceptable to leave them as NULL or 0.
Or, maybe it's the right place for it, and the cause of my concern is that "path" is an overly-narrow name in struct coresight_path?
It defines the direction of data flow—serving as the path for trace data.
Thanks, Jie
But if a renaming of this structure would improve the code, I'd also be fine with that happening in a subsequent commit, so I won't try to hold up the current series based on that.
-Carl
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK
On Thu, Sep 25, 2025 at 11:10:51AM +0100, James Clark wrote:
[...]
In short, I prefer to store perf handle in coresight path, as Jie has done in this series.
I will give details below, sorry for a long replying.
[...]
This one is just a pointer to the perf handle which really does belong to the session rather than the device. This makes it more of a path thing than a csdev thing. Maybe we can rename path to be more like "session", which also happens to contain a path. But I think path is fine for now.
Yes, renaming 'coresight_path' (to like 'coresight_runtime_ctx') better reflects the structure’s purpose.
The point is we can take chance to separate runtime parameters from device and driver instances.
- 'coresight_path': runtime parameters for an active session - 'coresight_device': a device instance registered on the bus - driver data: after probe, the driver maintains driver-specific attributes (e.g., the ETMv4 driver keeps mode and filters in struct etmv4_drvdata)
These structures have different lifetimes. For example, coresight_path is valid only during a session; otherwise, it should be cleared.
From a lifecycle perspective, storing the perf handle in coresight_path makes sense, since both are valid only for the duration of a session (the perf handle isn't used in sysfs mode, in which case we can simply leave it unset).
Furthermore, the perf handle is not just a handle; it lets us easily retrieve private event data (see perf_get_aux()).
However in this case handle is per-cpu data that is only accessed on the same cpu in tmc_etr_get_buffer(). Assigning it in etm_event_start() just copies the same per-cpu variable into a non per-cpu place that eventually gets accessed on the same cpu anyway.
If we exported it then it can be used directly without worrying where to store it:
We need to be careful for exporting data structures across modules, as it makes them harder to manage.
In fact, we already share 'etm_event_data' for ETR driver for the same purpose, and seems to me, it is redendant to export another structure 'etm_ctxt' to just make it convenient for obtaining a buffer config pointer.
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 17afa0f4cdee..4c33f442c80b 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -42,12 +42,8 @@ static bool etm_perf_up;
- the ETM. Thus the event_data for the session must be part of the ETM
context
- to make sure we can disable the trace path.
*/ -struct etm_ctxt {
struct perf_output_handle handle;
struct etm_event_data *event_data;
-};
-static DEFINE_PER_CPU(struct etm_ctxt, etm_ctxt); +DEFINE_PER_CPU(struct etm_ctxt, etm_ctxt); +EXPORT_SYMBOL_GPL(etm_ctxt);
I'd suggest a different approach: drop the etm_ctxt structure and instead define a per-CPU pointer to etm_event_data:
static DEFINE_PER_CPU(struct etm_event_data *, etm_ev_ctx);
As mentioned above, if perf handle is maintained in coresight_path, we can easily retrieve the etm_event_data via the perf handle.
A more aggressive refactoring would remove etm_ctxt from coresight-etm-perf.c entirely, relying on the perf event to manage private context data. For now, we keep it only to validate whether ETM is enabled (see multiple place to validate ctxt->event_data).
Thanks, Leo
On Thu, Sep 25, 2025 at 11:10:51AM +0100, James Clark wrote:
I would suggest in the csdev (coresight_device) structure itself. We already have some sink specific data in here e.g. perf_sink_id_map.
[...]
I think this data is specific to the session we are enabling the device(s) in. e.g., we keep the trace-id in the path. So, I don't mind having this in the path structure. Instead of modifying csdev with additional locking from "etm-perf" it is always cleaner to handle this in the path.
Yeah, and perf_sink_id_map only "needs" to be in the csdev because it controls sharing IDs between multiple paths which can't be accomplished by storing it in the path.
This is a bit off-topic: do we really need to maintain an id_map in every sink device, or could we simply use a global id_map?
I might miss some info; anyway, consolidating trace IDs is a low priority for me and not critical to this thread. But this might be benefit for later refactoring.
Thanks, Leo
On Thu, 25 Sept 2025 at 17:04, Leo Yan leo.yan@arm.com wrote:
On Thu, Sep 25, 2025 at 11:10:51AM +0100, James Clark wrote:
I would suggest in the csdev (coresight_device) structure itself. We already have some sink specific data in here e.g. perf_sink_id_map.
[...]
I think this data is specific to the session we are enabling the device(s) in. e.g., we keep the trace-id in the path. So, I don't mind having this in the path structure. Instead of modifying csdev with additional locking from "etm-perf" it is always cleaner to handle this in the path.
Yeah, and perf_sink_id_map only "needs" to be in the csdev because it controls sharing IDs between multiple paths which can't be accomplished by storing it in the path.
This is a bit off-topic: do we really need to maintain an id_map in every sink device, or could we simply use a global id_map?
I might miss some info; anyway, consolidating trace IDs is a low priority for me and not critical to this thread. But this might be benefit for later refactoring.
Thanks, Leo
There is a limit to the number of IDs that can be used - using ID map per sink means that IDs can be re-used in different sinks.
This was added to overcome this problem in large multi-core systems where there are more PEs than Available IDs.
Mike