-----Original Message-----
From: Mike Leach mike.leach@linaro.org
Sent: 23 March 2022 10:08
To: Suzuki Poulose Suzuki.Poulose@arm.com
Cc: coresight@lists.linaro.org; linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org; peterz@infradead.org; mingo@redhat.com; acme@kernel.org; linux-perf-users@vger.kernel.org; James Clark James.Clark@arm.com
Subject: Re: [PATCH 00/10] coresight: Add new API to allocate trace source ID values
Hi Suzuki,
On Tue, 22 Mar 2022 at 18:52, Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi Mike
On 22/03/2022 14:27, Mike Leach wrote:
Hi Suzuki
On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose suzuki.poulose@arm.com wrote:
On 22/03/2022 11:38, Mike Leach wrote:
HI Suzuki,
On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose suzuki.poulose@arm.com wrote:
- Cc: James Clark
Hi Mike,
On 08/03/2022 20:49, Mike Leach wrote:
The current method for allocating trace source ID values to sources is to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10). The STM is allocated ID 0x1.

This fixed algorithm is used in both the CoreSight driver code, and by perf when writing the trace metadata in the AUXTRACE_INFO record.

The method needs replacing as currently:
1. It is inefficient in using available IDs.
2. It does not scale to larger systems with many cores; the algorithm has no limits, so it will generate invalid trace IDs for cpu number > 44.
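For reference, the fixed scheme described above amounts to something like the following minimal sketch (the macro names are illustrative, and the validity check assumes the architecturally reserved trace IDs are 0x00 and 0x70-0x7F):

    #include <stdbool.h>

    /* Illustrative names only - the legacy, fixed trace ID scheme. */
    #define LEGACY_STM_TRACE_ID       0x1
    #define LEGACY_CPU_TRACE_ID(cpu)  (0x10 + (cpu) * 2)

    /*
     * Trace IDs are 7 bits; 0x00 and the top of the range are reserved,
     * so the fixed formula runs out of usable IDs on large core counts.
     */
    static bool legacy_cpu_trace_id_is_valid(int cpu)
    {
            int id = LEGACY_CPU_TRACE_ID(cpu);

            return id > 0 && id < 0x70;
    }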
Thanks for addressing this issue.
Additionally, requirements to allocate additional system IDs on some systems have been seen.

This patch set introduces an API that allows the allocation of trace IDs in a dynamic manner.

Architecturally reserved IDs are never allocated, and the system is limited to allocating only valid IDs.

Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use the new API.

perf handling is changed so that the ID associated with the CPU is read from sysfs. The ID allocator is notified when perf events start and stop, so CPU based IDs are kept constant throughout any perf session.

For the ETMx.x devices, IDs are allocated on certain events:
a) When using sysfs, an ID will be allocated on hardware enable, and freed when the sysfs reset is written.
b) When using perf, an ID is allocated on hardware enable, and freed on hardware disable.

For both cases the ID is allocated when sysfs is read to get the current trace ID. This ensures that consistent decode metadata can be extracted from the system where this read occurs before device enable.
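As a rough sketch, an allocator API of the kind described here could take a shape like the one below (the function names are illustrative placeholders, not necessarily those used in the patches):

    /* Illustrative interface only - not necessarily the exact API in this set. */
    int  trace_id_get_cpu_id(int cpu);     /* allocate (or look up) the ID for a CPU source   */
    void trace_id_put_cpu_id(int cpu);     /* release the CPU's ID when the source is done    */
    int  trace_id_get_system_id(void);     /* allocate an ID for a system source such as STM  */
    void trace_id_put_system_id(int id);   /* release a system source ID                      */
    void trace_id_perf_start(void);        /* a perf session is starting: pin CPU IDs         */
    void trace_id_perf_stop(void);         /* last perf session stopped: IDs may be recycled  */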
Note: This patchset breaks backward compatibility for perf record. Because the method for generating the AUXTRACE_INFO metadata has changed, using an older perf record will result in metadata that does not match the trace IDs used in the recorded trace data. This mismatch will cause subsequent decode to fail. Older versions of perf will still be able to decode data generated by the updated system.
I have some concerns over this and the future plans for the dynamic allocation per sink, i.e. we are breaking/modifying perf now to accommodate the dynamic nature of the trace ID of a given CPU/ETM.
I don't believe we have a choice for this - we cannot retain what is an essentially broken allocation mechanism.
I completely agree and I am happy with the current step by step approach of moving to a dynamic allocation scheme. Apologies, this wasn't conveyed appropriately.
The proposed approach of exposing this via sysfs may (I am not sure if this would be the case) break with the trace-id per sink change, as different sinks could assign different trace-ids to the same CPU.
If a path exists between a CPU and a sink, the current framework, as far as I can tell, would not allow a new path to be set up between the CPU and another sink.
e.g., if we have concurrent perf sessions, in the future with sink based allocation:
perf record -e cs_etm/@sink1/... payload1
perf record -e cs_etm/@sink2/... payload2
perf record -e cs_etm// ... payload3
The trace id allocated for the first session for CPU0 *could* be different from that of the second or the third.
If these sessions run concurrently then the same Trace ID will be used for CPU0 in all the sessions. We ensure this by notifying the allocator that a cs_etm session is starting or stopping, and keeping a refcount.
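A minimal sketch of how that pinning could work, assuming a single global ID map and a session refcount (illustrative only, not the patch code):

    #include <linux/spinlock.h>
    #include <linux/string.h>
    #include <linux/threads.h>
    #include <linux/types.h>

    static DEFINE_SPINLOCK(id_map_lock);
    static int perf_session_count;      /* number of cs_etm sessions currently running */
    static u8  cpu_trace_id[NR_CPUS];   /* 0 = no trace ID currently allocated */

    void trace_id_perf_start(void)
    {
            spin_lock(&id_map_lock);
            perf_session_count++;
            spin_unlock(&id_map_lock);
    }

    void trace_id_perf_stop(void)
    {
            spin_lock(&id_map_lock);
            /*
             * Recycle CPU IDs only once the last concurrent session has stopped,
             * so that every overlapping session sees the same ID for a given CPU.
             */
            if (--perf_session_count == 0)
                    memset(cpu_trace_id, 0, sizeof(cpu_trace_id));
            spin_unlock(&id_map_lock);
    }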
The scheme is fine now, with a global trace-id map. But with per-sink allocation, this could cause problems.
e.g., there could be a situation where:
trace_id[CPU0][sink0] == trace_id[CPU1][sink1]
So if we have a session where both CPU0 and CPU1 trace to a common sink, we get the trace mixed with no way of splitting it, as perf will read the trace-id for CPU0 from sink0 and that for CPU1 from sink1.
I think we need to consider the CoreSight hardware topology here.
Any CPUx that can trace to a sink reachable by another CPUy must always get trace IDs from the same pool as CPUy.
Consider the options for multi-sink topologies:

CPU0->funnel0->ETF->ETR
CPU1--+^
Clearly, in-line sinks can never have simultaneous independent sessions - the session into the ETR traces through the ETF.
Now we could have replicators / programmable replicators -
ATB->Prog replicator->ETR0
                  +->ETR1
however programmable replicators use trace ID for filtering - this is effectively a single sink on the input, so once again the Trace IDs must come from the same pool.
Now, we could have independent per cluster / socket topology:

Cluster 0
CPU0->funnel0->ETF0->ETR0
CPU1--+^

Cluster 1
CPU2->funnel1->ETF1->ETR1
CPU3--+^
Here cluster 0 & 1 can have independent sets of trace IDs as their respective cores can never trace to the same sink.
Finally we have the ETE+TRBE 1:1 type topologies. These could actually not bother allocating any trace ID when in 1:1 mode, which should probably be a follow-on incremental addition to this initial set.
So, my conclusion when I was considering all this is that "per-sink" trace Id allocation is in fact "per unique trace path set" allocation.
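As a purely illustrative sketch (none of these names exist today), a "per unique trace path set" map could look something like one ID pool shared by every sink whose input paths overlap:

    #include <linux/bitmap.h>
    #include <linux/list.h>

    #define TRACE_ID_MAX    0x70    /* IDs 0x70-0x7F are architecturally reserved */

    /*
     * One ID pool per set of sinks whose input paths overlap: every CPU that
     * can reach any sink in the set allocates its trace ID from this pool.
     */
    struct trace_id_pool {
            DECLARE_BITMAP(used_ids, TRACE_ID_MAX);
            struct list_head sinks; /* sinks sharing this pool */
            struct list_head node;  /* entry in the global list of pools */
    };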
The id pools can't always be defined statically in all situations. E.g. if you have 128 CPUs each with their own ETR, and also replicated into a funnel network leading to a common ETR.
This topology supports (at least) two distinct modes: (a) all CPUs enabled for tracing to their own ETRs; (b) a selection of CPUs (up to some limit) combined together into the common ETR. Both are valid dynamic configurations. Perf might have a preference on which one to use, but both are valid.
But there's no static ID pool that works for both. The pool for (b) has to allocate to some arbitrary selection of the 128 CPUs, from only around 120 numbers. The pool has to take account of which CPUs are selected.
So your comment "any CPU that can trace..." has to be interpreted as "any CPU that can trace in the currently configured dynamic topology" rather than "any CPU that can be dynamically configured to trace..." - is that what you meant?
Alternatively, we could just declare that such systems are too complicated to support, and say that we wouldn't support the use of a global sink that (statically) was reachable by 128 CPUs.
Al
So my point is, we are changing the ABI for perf to grab the TraceID with your patches. And clearly this approach could break easily when we extend to a sink-based idmap. So, let's make the ABI change for perf scalable and bulletproof (as far as we can) by exposing this information via the perf RECORD. That way any future changes in the scheme won't affect perf, as long as it has reliable information within each "record".
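One hypothetical shape for such a record, just to make the idea concrete (the record name and layout below are made up for illustration and are not an existing perf ABI):

    #include <linux/perf_event.h>
    #include <linux/types.h>

    /*
     * Hypothetical: a per-session record carrying the hardware trace ID a CPU's
     * source is emitting with, so the decoder no longer depends on a fixed
     * formula or on the AUXTRACE_INFO snapshot alone.
     */
    struct perf_record_cs_trace_id {
            struct perf_event_header header;        /* would carry the new record type */
            __u64                    hw_id;         /* trace ID in use for this source */
            /* cpu/pid/tid would come from the usual sample_id fields appended by perf */
    };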
My point is, let us fix this once and for all, so that we don't need to change this again. I understand this involves more work in the perf tool. I believe that is for the better.
Thoughts ?
My preference is the incremental approach. Fix the trace ID allocation issues that partners are having now, then update to the perf record approach in a separate follow-up patchset. Then, when we start to see systems that require it, update to using the per-unique-path trace ID pools.
Regards
Mike
Suzuki
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
_______________________________________________
CoreSight mailing list -- coresight@lists.linaro.org
To unsubscribe send an email to coresight-leave@lists.linaro.org