Hello,
I'm trying to design a way to use CoreSight to measure application execution time at the granularity of specific instruction ranges. I have an idea of how this might be achieved and I'd like to hear your opinion.
Much of the inspiration comes from this patch set by Leo Yan, especially the disassembly script:
https://lists.linaro.org/pipermail/coresight/2018-May/001325.html
From analyzing it, I learned that perf-script can parse the AUXTRACE section of perf.data and turn some of the trace elements into branch samples, which show how the IP moved around. This information is available to the built-in Python interpreter, so we can script it to get the assembly from the program image.
If I understand perf-script in its current shape correctly, it ignores all non-branching events (everything that is not an ATOM, EXCEPTION or TRACE_ON packet). In particular, timestamps are lost in the process. I'd like to modify perf-script to generate samples for such timing events, so that later I can have them interleaved with the assembly instructions, calculate deltas, and be able to tell either:
- how much time and/or how many CPU cycles have been spent between two arbitrary instructions (ideally), or
- which instructions have been executed between timestamp T and T+1 (this seems to be more in line with how timestamping in CS works, I think)
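To make the second option more concrete, here is a rough sketch of the post-processing I have in mind. It is plain Python and does not use perf at all; the ("ts", value) and ("branch", from_ip, to_ip) record shapes are made up for illustration, since perf-script does not emit timestamp samples today.

```python
# Hypothetical record shapes (not produced by perf-script today):
#   ("ts", timestamp_value)
#   ("branch", from_ip, to_ip)

def group_by_timestamp(events):
    """Group branch records into windows bounded by consecutive timestamps.

    Returns a list of (t_start, t_end, branches) tuples, where branches is
    a list of (from_ip, to_ip) pairs seen between the two timestamps.
    Branches before the first timestamp or after the last one are dropped,
    because they have no bound on one side.
    """
    windows = []
    current_ts = None
    pending = []
    for ev in events:
        if ev[0] == "ts":
            if current_ts is not None:
                # Close the window opened by the previous timestamp.
                windows.append((current_ts, ev[1], pending))
            current_ts = ev[1]
            pending = []
        elif ev[0] == "branch" and current_ts is not None:
            pending.append((ev[1], ev[2]))
    return windows


def window_deltas(windows):
    """Timestamp delta (in timestamp units) for each window."""
    return [t_end - t_start for t_start, t_end, _ in windows]
```

Each window then says "these branches happened somewhere between T and T+1", and the delta gives an upper bound on the time they took together.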
A brief analysis of tools/perf/util/cs-etm.c and cs-etm-decoder/cs-etm-decoder.c suggests that timestamp events are not turned into packets, but merely recorded as a packet queue parameter (I'm not sure why this is needed, though). The cycacc event is not processed further at all, besides being decoded to plaintext later by OpenCSD. I think it may be worth giving both of them a dedicated `enum cs_etm_sample_type` value and packet generator functions.
Then, I think, it should be possible to generate samples (not sure of what type, though; perhaps not 'branch' this time) for timestamp/cycacc packets, analogously to what has been done for TRACE_ON:
https://lists.linaro.org/pipermail/coresight/2018-May/001327.html
and then expose them in the Python interface.
I'd be grateful for any opinions on this idea, especially on the usefulness of such a feature for a general audience, as well as on any possible compatibility issues. If you are aware of another approach to correlating timestamps with branch samples, that would also be very welcome.
I hope the idea is not completely pointless. I'm still making my way through the perf subsystem, so I might have missed some crucial details.
Thank you for your time,
Wojciech