Previously the sink had to be specified, but now it auto selects one by default. Including a sink in the examples causes issues when copy pasting the command because it might not work if that sink isn't present. Remove the sink from all the basic examples and create a new section specifically about overriding the default one.
Make the text a bit more concise now that it's in the advanced section, and similarly for removing the old kernel advice.
Signed-off-by: James Clark <james.clark@linaro.org>
---
 Documentation/trace/coresight/coresight.rst | 41 ++++++++-----------
 .../userspace-api/perf_ring_buffer.rst | 4 +-
 2 files changed, 18 insertions(+), 27 deletions(-)

diff --git a/Documentation/trace/coresight/coresight.rst b/Documentation/trace/coresight/coresight.rst
index d4f93d6a2d63..806699871b80 100644
--- a/Documentation/trace/coresight/coresight.rst
+++ b/Documentation/trace/coresight/coresight.rst
@@ -462,44 +462,35 @@ queried by the perf command line tool:

         cs_etm// [Kernel PMU event]

-    linaro@linaro-nano:~$
-
 Regardless of the number of tracers available in a system (usually equal to the
 amount of processor cores), the "cs_etm" PMU will be listed only once.

 A Coresight PMU works the same way as any other PMU, i.e the name of the PMU is
-listed along with configuration options within forward slashes '/'. Since a
-Coresight system will typically have more than one sink, the name of the sink to
-work with needs to be specified as an event option.
-On newer kernels the available sinks are listed in sysFS under
+provided along with configuration options within forward slashes '/' (see
+`Config option formats`_).
+
+Advanced Perf framework usage
+-----------------------------
+
+Sink selection
+~~~~~~~~~~~~~~
+
+An appropriate sink will be selected automatically for use with Perf, but since
+there will typically be more than one sink, the name of the sink to use may be
+specified as a special config option prefixed with '@'.
+
+The available sinks are listed in sysFS under
 ($SYSFS)/bus/event_source/devices/cs_etm/sinks/::

     root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls
     tmc_etf0 tmc_etr0 tpiu0

-On older kernels, this may need to be found from the list of coresight devices,
-available under ($SYSFS)/bus/coresight/devices/::
-
-    root:~# ls /sys/bus/coresight/devices/
-    etm0 etm1 etm2 etm3 etm4 etm5 funnel0
-    funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0
     root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program

-As mentioned above in section "Device Naming scheme", the names of the devices could
-look different from what is used in the example above. One must use the device names
-as it appears under the sysFS.
-
-The syntax within the forward slashes '/' is important. The '@' character
-tells the parser that a sink is about to be specified and that this is the sink
-to use for the trace session.
-
 More information on the above and other example on how to use Coresight with
 the perf tools can be found in the "HOWTO.md" file of the openCSD gitHub
 repository [#third]_.

-Advanced perf framework usage
------------------------------
-
 AutoFDO analysis using the perf tools
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -508,7 +499,7 @@ perf can be used to record and analyze trace of programs.
 Execution can be recorded using 'perf record' with the cs_etm event,
 specifying the name of the sink to record to, e.g::

-    perf record -e cs_etm/@tmc_etr0/u --per-thread
+    perf record -e cs_etm//u --per-thread

 The 'perf report' and 'perf script' commands can be used to analyze execution,
 synthesizing instruction and branch events from the instruction trace.
@@ -572,7 +563,7 @@ sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tuto
     Bubble sorting array of 30000 elements
     5910 ms

-    $ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort
+    $ perf record -e cs_etm//u --per-thread taskset -c 2 ./sort
     Bubble sorting array of 30000 elements
     12543 ms
     [ perf record: Woken up 35 times to write data ]
diff --git a/Documentation/userspace-api/perf_ring_buffer.rst b/Documentation/userspace-api/perf_ring_buffer.rst
index bde9d8cbc106..dc71544532ce 100644
--- a/Documentation/userspace-api/perf_ring_buffer.rst
+++ b/Documentation/userspace-api/perf_ring_buffer.rst
@@ -627,7 +627,7 @@ regular ring buffer.
 AUX events and AUX trace data are two different things. Let's see an
 example::

-    perf record -a -e cycles -e cs_etm/@tmc_etr0/ -- sleep 2
+    perf record -a -e cycles -e cs_etm// -- sleep 2

 The above command enables two events: one is the event *cycles* from PMU
 and another is the AUX event *cs_etm* from Arm CoreSight, both are saved
@@ -766,7 +766,7 @@ only record AUX trace data at a specific time point which users are interested
 in. E.g. below gives an example of how to take snapshots with 1 second
 interval with Arm CoreSight::

-    perf record -e cs_etm/@tmc_etr0/u -S -a program &
+    perf record -e cs_etm//u -S -a program &
     PERFPID=$!
     while true; do
         kill -USR2 $PERFPID
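The commit message above notes that a hard-coded sink breaks copy-pasted commands when that sink is not present. As a rough illustration only, the following shell sketch names a sink with the '@' option if it is listed under the sysFS sinks directory mentioned in the patch, and otherwise leaves the config empty so that the default sink is selected. The function name pick_cs_etm_event, its arguments and the SINKS_DIR variable are hypothetical and not part of perf or the kernel; the sketch also assumes sysFS is mounted at /sys.

```shell
#!/bin/sh
# Sketch: build a cs_etm event string, naming a sink only if sysfs lists it.
# SINKS_DIR and pick_cs_etm_event are illustrative names, not a perf feature.
SINKS_DIR="${SINKS_DIR:-/sys/bus/event_source/devices/cs_etm/sinks}"

pick_cs_etm_event() {
    sink="$1"                  # preferred sink, e.g. tmc_etr0 (may be empty)
    dir="${2:-$SINKS_DIR}"     # directory listing the available sinks
    if [ -n "$sink" ] && [ -e "$dir/$sink" ]; then
        # The sink exists, so override the default with '@<sink>'.
        printf 'cs_etm/@%s/u\n' "$sink"
    else
        # Sink missing or not requested: rely on automatic sink selection.
        printf 'cs_etm//u\n'
    fi
}

# Example invocation (not executed by this sketch):
#   perf record -e "$(pick_cs_etm_event tmc_etr0)" --per-thread program
```

Falling back to the empty config keeps the command usable on systems where the requested sink does not exist, which is the same reasoning the patch applies to the basic examples.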
Hi James,
I thought I'd mention this issue with multicore self-hosted trace. The perf command line syntax does not allow a sink "type" to be specified (e.g. @tmc_etf or @tmc_etr). For multicore, it doesn't make sense to specify a processor mapped sink as would be the case for single core trace. A sink "type" should be allowed to avoid the auto select default. In our case, the default is the ETF sink.
Thanks, Steve C.
On 12/10/2024 6:49 AM, James Clark wrote:
Previously the sink had to be specified, but now it auto selects one by default. Including a sink in the examples causes issues when copy pasting the command because it might not work if that sink isn't present. Remove the sink from all the basic examples and create a new section specifically about overriding the default one.
Make the text a bit more concise now that it's in the advanced section, and similarly for removing the old kernel advice.
Signed-off-by: James Clark <james.clark@linaro.org>
---
 Documentation/trace/coresight/coresight.rst | 41 ++++++++-----------
 .../userspace-api/perf_ring_buffer.rst | 4 +-
 2 files changed, 18 insertions(+), 27 deletions(-)

diff --git a/Documentation/trace/coresight/coresight.rst b/Documentation/trace/coresight/coresight.rst
index d4f93d6a2d63..806699871b80 100644
--- a/Documentation/trace/coresight/coresight.rst
+++ b/Documentation/trace/coresight/coresight.rst
@@ -462,44 +462,35 @@ queried by the perf command line tool:

         cs_etm// [Kernel PMU event]

-    linaro@linaro-nano:~$
-
 Regardless of the number of tracers available in a system (usually equal to the
 amount of processor cores), the "cs_etm" PMU will be listed only once.

 A Coresight PMU works the same way as any other PMU, i.e the name of the PMU is
-listed along with configuration options within forward slashes '/'. Since a
-Coresight system will typically have more than one sink, the name of the sink to
-work with needs to be specified as an event option.
-On newer kernels the available sinks are listed in sysFS under
+provided along with configuration options within forward slashes '/' (see
+`Config option formats`_).
+
+Advanced Perf framework usage
+-----------------------------
+
+Sink selection
+~~~~~~~~~~~~~~
+
+An appropriate sink will be selected automatically for use with Perf, but since
+there will typically be more than one sink, the name of the sink to use may be
+specified as a special config option prefixed with '@'.
+
+The available sinks are listed in sysFS under
 ($SYSFS)/bus/event_source/devices/cs_etm/sinks/::

     root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls
     tmc_etf0 tmc_etr0 tpiu0

-On older kernels, this may need to be found from the list of coresight devices,
-available under ($SYSFS)/bus/coresight/devices/::
-
-    root:~# ls /sys/bus/coresight/devices/
-    etm0 etm1 etm2 etm3 etm4 etm5 funnel0
-    funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0
     root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program

-As mentioned above in section "Device Naming scheme", the names of the devices could
-look different from what is used in the example above. One must use the device names
-as it appears under the sysFS.
-
-The syntax within the forward slashes '/' is important. The '@' character
-tells the parser that a sink is about to be specified and that this is the sink
-to use for the trace session.
-
 More information on the above and other example on how to use Coresight with
 the perf tools can be found in the "HOWTO.md" file of the openCSD gitHub
 repository [#third]_.

-Advanced perf framework usage
------------------------------
-
 AutoFDO analysis using the perf tools
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -508,7 +499,7 @@ perf can be used to record and analyze trace of programs.
 Execution can be recorded using 'perf record' with the cs_etm event,
 specifying the name of the sink to record to, e.g::

-    perf record -e cs_etm/@tmc_etr0/u --per-thread
+    perf record -e cs_etm//u --per-thread

 The 'perf report' and 'perf script' commands can be used to analyze execution,
 synthesizing instruction and branch events from the instruction trace.
@@ -572,7 +563,7 @@ sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tuto
     Bubble sorting array of 30000 elements
     5910 ms

-    $ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort
+    $ perf record -e cs_etm//u --per-thread taskset -c 2 ./sort
     Bubble sorting array of 30000 elements
     12543 ms
     [ perf record: Woken up 35 times to write data ]
diff --git a/Documentation/userspace-api/perf_ring_buffer.rst b/Documentation/userspace-api/perf_ring_buffer.rst
index bde9d8cbc106..dc71544532ce 100644
--- a/Documentation/userspace-api/perf_ring_buffer.rst
+++ b/Documentation/userspace-api/perf_ring_buffer.rst
@@ -627,7 +627,7 @@ regular ring buffer.
 AUX events and AUX trace data are two different things. Let's see an
 example::

-    perf record -a -e cycles -e cs_etm/@tmc_etr0/ -- sleep 2
+    perf record -a -e cycles -e cs_etm// -- sleep 2

 The above command enables two events: one is the event *cycles* from PMU
 and another is the AUX event *cs_etm* from Arm CoreSight, both are saved
@@ -766,7 +766,7 @@ only record AUX trace data at a specific time point which users are interested
 in. E.g. below gives an example of how to take snapshots with 1 second
 interval with Arm CoreSight::

-    perf record -e cs_etm/@tmc_etr0/u -S -a program &
+    perf record -e cs_etm//u -S -a program &
     PERFPID=$!
     while true; do
         kill -USR2 $PERFPID
On 11/12/2024 6:01 pm, Steve Clevenger wrote:
Hi James,
I thought I'd mention this issue with multicore self-hosted trace. The perf command line syntax does not allow a sink "type" to be specified (e.g. @tmc_etf or @tmc_etr). For multicore, it doesn't make sense to specify a processor mapped sink as would be the case for single core trace. A sink "type" should be allowed to avoid the auto select default. In our case, the default is the ETF sink.
Thanks, Steve C.
I'm sure it would be possible to add support for this, but I'm wondering if the real issue is that the default selection logic is wrong? Are you saying the default you get is ETF but you want ETR? And there is both for each ETM? The default selection logic isn't easy to summarize but it should prefer ETR (sysmem) over ETF (link sink), see coresight_find_sink().
It's probably better to fix that rather than add a new sink selection feature. Maybe if you shared a diagram of your coresight architecture it would help.
Thanks James
On 12/12/2024 7:27 AM, James Clark wrote:
I'm sure it would be possible to add support for this, but I'm wondering if the real issue is that the default selection logic is wrong? Are you saying the default you get is ETF but you want ETR? And there is both for each ETM? The default selection logic isn't easy to summarize but it should prefer ETR (sysmem) over ETF (link sink), see coresight_find_sink().
It's probably better to fix that rather than add a new sink selection feature. Maybe if you shared a diagram of your coresight architecture it would help.
Thanks James
Hi James,
It appears the default sink selection is ETF for multicore trace. In any case, for the Arm® CoreSight Base System Architecture STC Level compliance, I need to be able to specify the sink type.
The Ampere CoreSight hierarchy is described to the ACPI as follows:
+-----------------+
|                 |
|       ETM       |
|                 |
+--------+--------+
         |
         |
+--------+--------+
|                 |
|       ETF       |
|                 |
+--------+--------+
         |
         |
+--------+--------+
|                 |
|       ETR       |
|                 |
+--------+--------+
         |
         |
+--------+--------+
|                 |
|      CATU       |
|                 |
+--------+--------+
Steve C.
On 12/12/2024 7:38 pm, Steve Clevenger wrote:
Hi James,
It appears the default sink selection is ETF for multicore trace. In any case, for the Arm® CoreSight Base System Architecture STC Level compliance, I need to be able to specify the sink type.
Yep, it makes sense to add support for selecting it then. I will put it on the list, but I'm not sure about the priority. I think looking into why the default isn't working is more important for now.
The Ampere CoreSight hierarchy is described to the ACPI as follows:
+-----------------+
|                 |
|       ETM       |
|                 |
+--------+--------+
         |
         |
+--------+--------+
|                 |
|       ETF       |
|                 |
+--------+--------+
         |
         |
+--------+--------+
|                 |
|       ETR       |
|                 |
+--------+--------+
         |
         |
+--------+--------+
|                 |
|      CATU       |
|                 |
+--------+--------+
Steve C.
I recreated this in the test here: https://lore.kernel.org/linux-kernel/20241217171132.834943-1-james.clark@lin...
But it looks like it correctly selects ETR rather than ETF, so I'm not sure what the difference is between your setup and that. If you can have a look at that test and compare it that would be very helpful.
Thanks James