On Thu, 14 Jun 2018 at 15:42, Al Grant Al.Grant@arm.com wrote:
Open question:
At this time the implementation supports tracing a single CPU, since the only HW we have exhibits an N:1 source/sink topology. The HW itself does support collecting traces from more than one source, but using the feature in this way could be very confusing and mislead users.
For example the following:
# perf record -e cs_etm/20070000.etr/ -C 2,3 application1
would end up tracing everything that is happening on CPUs 2 and 3 for as long as application1 is executing. Because the HW doesn't give us an interrupt when buffers are full, traces from one CPU could easily clobber traces from the other, giving the impression that nothing was executed on the latter.
You only really get that impression if you expect it to behave like ordinary perf events traced into a per-CPU ring buffer. It's just a question of remembering that for ETM the trace usually goes into a shared buffer and older trace _of an inactive CPU_ might be lost.
Two _active_ CPUs will interleave into the buffer, and being able to get trace of two communicating processes on separate CPUs is very useful. The benefits of being able to trace simultaneously active CPUs outweigh the small risk of someone misinterpreting trace when one CPU happens to be inactive.
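To make the shared-buffer behaviour concrete, here is a minimal sketch in Python. It is only a toy model, not how the ETR hardware or the driver works: the sink is assumed to be a fixed-size circular buffer that silently drops its oldest entries, and the function name, packet counts and capacity are all made up for illustration.

```python
from collections import deque


def simulate_shared_sink(writes, capacity):
    """Toy model of an N:1 source/sink topology.

    `writes` is a chronological list of (cpu, n_packets) bursts.
    All CPUs write into one circular buffer of `capacity` packets;
    when it is full the oldest packets are overwritten without notice.
    Returns a dict mapping cpu -> number of packets still in the buffer.
    """
    sink = deque(maxlen=capacity)  # hypothetical stand-in for the shared sink
    for cpu, n_packets in writes:
        for _ in range(n_packets):
            sink.append(cpu)

    surviving = {}
    for cpu in sink:
        surviving[cpu] = surviving.get(cpu, 0) + 1
    return surviving


# CPU 2 runs briefly and goes idle; CPU 3 keeps running and overwrites it.
idle_case = simulate_shared_sink([(2, 4), (3, 20)], capacity=8)

# Both CPUs stay active, so their trace interleaves in the buffer.
active_case = simulate_shared_sink([(2, 2), (3, 2)] * 5, capacity=8)

print(idle_case)
print(active_case)
```

In the first case only CPU 3 is left in the buffer, which is the "nothing ran on CPU 2" impression; in the second case both CPUs are still represented.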
That's the kind of discussion I had with myself when working on the feature. I decided to make things work with the simplest option, i.e. a single CPU, and then add support for multiple CPUs should we deem it desirable. I've always been of the opinion that if the HW supports it, we should use it, so I'm good with extending the feature. But doing so will require a fair amount of adjustments that need to be carefully considered.
For example calling the following
# perf record -e cs_etm/20070000.etr/ -C 2,3 application1
and
# perf record -e cs_etm/20070000.etr/ -C 2 application1
# perf record -e cs_etm/20070000.etr/ -C 3 application1
look exactly the same to the kernel. That is, both examples create two new events and there is currently no way to correlate them. In the first example we need to allow both events to use the same sink while preventing it in the latter.
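A rough way to picture the problem, as a Python sketch rather than anything taken from the kernel or perf code: assume each event the kernel sees is reduced to a (sink, cpu) pair. The TraceEvent type and events_for() helper are hypothetical names used only for this illustration, and a real perf event carries many more attributes.

```python
from typing import List, NamedTuple


class TraceEvent(NamedTuple):
    """Hypothetical, heavily simplified view of one per-CPU trace event."""
    sink: str
    cpu: int


def events_for(sink: str, cpus: List[int]) -> List[TraceEvent]:
    """Model one 'perf record -C <cpus>' invocation: one event per CPU."""
    return [TraceEvent(sink, cpu) for cpu in cpus]


# One session tracing CPUs 2 and 3 together.
one_session = events_for("20070000.etr", [2, 3])

# Two independent sessions, one per CPU.
two_sessions = events_for("20070000.etr", [2]) + events_for("20070000.etr", [3])

# Nothing in the events themselves says which session they came from.
indistinguishable = one_session == two_sessions
print(indistinguishable)
```

Under this simplification the two cases compare equal, so something extra would be needed to tell events of the same session apart from events of unrelated sessions.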
Thanks for the comments, Mathieu
Al
So this would work:
# perf record -e cs_etm/20070000.etr/ -C 3 application1
I am open to discussion on the topic should someone think of something.
As with the cleanup set this code has been uploaded here [1].
Thanks, Mathieu
[1] https://git.linaro.org/people/mathieu.poirier/coresight.git perf-opencsd-master-cpu-wide-support
Mathieu Poirier (12):
  perf tools: Add defines for CONTEXTID configuration
  perf tools: Configure contextID tracing in CPU-wide mode
  perf tools: Configure timestsamp generation in CPU-wide mode
  perf tools: Configure SWITCH_EVENTS in CPU-wide mode
  perf tools: Add handling of itrace start events
  perf tools: Add handling of switch-CPU-wide events
  perf tools: Linking PE contextID with perf thread mechanic
  perf tools: Allocate decoder tree as needed
  perf tools: Make cs_etm__dump_event() work with CPU-wide scenarios
  perf tools: Add notion of time to the decoding code
  perf tools: Make function cs_etm_decoder__clear_buffer() public
  perf tools: Add support for CPU-wide trace scenarios
 include/linux/coresight-pmu.h                   |   2 +
 tools/include/linux/coresight-pmu.h             |   2 +
 tools/perf/arch/arm/util/cs-etm.c               | 174 ++++++++++--
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 140 +++++++++-
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h |   4 +-
 tools/perf/util/cs-etm.c                        | 334 ++++++++++++++++++++++--
 tools/perf/util/cs-etm.h                        |  17 ++
 7 files changed, 623 insertions(+), 50 deletions(-)
-- 2.7.4
CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight