On 19/09/2023 23:45, Besar Wicaksono wrote:
The decoder creation for raw trace uses metadata from the first CPU. On per-process/per-thread traces, the first CPU is CPU0. If CPU0 trace is not enabled, its metadata will be marked unused and the decoder is not created. Perf report dump skips the decoding part because the decoder is missing.
Hi Besar,
It's not just per-process trace, the bug is also in per-cpu mode but it means that the metadata from CPU 0 is used for every decoder which is wrong. Although your change also fixes this issue.
To fix this, use metadata of the CPU associated with sample object.
Signed-off-by: Besar Wicaksono bwicaksono@nvidia.com
tools/perf/util/cs-etm.c | 130 +++++++++++++++++++++++---------------- 1 file changed, 77 insertions(+), 53 deletions(-)
[...]
- if (!formatted) {
/*
* There is only one decoder when unformatted. Use metadata of
* sample AUX cpu.
*/
t_param = t_params;
metadata = get_cpu_data(etm, sample_cpu);
if (!metadata) {
}pr_err("CS_ETM: invalid sample CPU: %d\n", sample_cpu); return -EINVAL;
Apart from Mike's comments, this looks ok. Thanks for fixing this it has been on our list for a while.
One issue with calling get_cpu_data() with the sample CPU ID is that it won't work with old files that don't have the CPU sample flag set. Mike added the sample flag fairly recently, and I don't think that was a breaking change for old files at that time. It should be easy to avoid that by still returning the metadata from CPU 0 when CPU = -1 (Which isn't correct but is 99% likely to work).
I checked the Coresight tests and they're all passing, at least on a system without ETE. If you could make sure they're all passing for you as well that would be great:
sudo ./perf test coresight
I think they currently only work from an in source build, if you get stuck there.
Thanks James