On 15/04/2021 17:33, Leo Yan wrote:
Hi James,
On Thu, Apr 15, 2021 at 03:51:46PM +0300, James Clark wrote:
[...]
For the original perf data file recorded with the "--per-thread" option, the decoder runs into the condition for "etm->timeless_decoding", and the file doesn't contain ETM timestamps.
Afterwards, the injected perf data file also lacks ETM timestamps and hits the condition "etm->timeless_decoding".
So I am confused about why the original perf data can be processed properly while the injected perf data file cannot.
Hi Leo,
My patch only deals with per-cpu mode. With per-thread mode everything is already working because _none_ of the events have timestamps; timestamps are not enabled by default:
	/* In per-cpu case, always need the time of mmap events etc */
	if (!perf_cpu_map__empty(cpus))
		evsel__set_sample_bit(tracking_evsel, TIME);
When none of the events have timestamps, I think perf doesn't use the ordering code in ordered-events.c. So when the inject file is opened, the events are read in file order.
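To make the per-cpu vs per-thread difference concrete, here is a minimal standalone sketch, not the perf code itself. struct sketch_attr and both helper functions are hypothetical names invented for illustration; the only thing taken from the kernel is the bit position of PERF_SAMPLE_TIME (bit 2 of sample_type).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit value as PERF_SAMPLE_TIME in the perf_event uapi header. */
#define SKETCH_SAMPLE_TIME (1ULL << 2)

/* Hypothetical stand-in for the tracking event's attributes. */
struct sketch_attr {
	uint64_t sample_type;
};

/*
 * Mirrors the snippet above: only a non-empty CPU map (per-cpu mode)
 * requests timestamps; per-thread mode leaves the bit clear.
 */
static void sketch_config_tracking(struct sketch_attr *attr, bool cpu_map_empty)
{
	assert(attr != NULL);
	if (!cpu_map_empty)
		attr->sample_type |= SKETCH_SAMPLE_TIME;
}

/* True when events recorded with these attributes carry a timestamp. */
static bool sketch_has_timestamps(const struct sketch_attr *attr)
{
	assert(attr != NULL);
	return (attr->sample_type & SKETCH_SAMPLE_TIME) != 0;
}
```

Calling sketch_config_tracking() with cpu_map_empty set to true models --per-thread: the TIME bit stays clear, so none of the events carry timestamps.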
The explanation makes sense to me. One thought: if the original file doesn't use ordered events, is it possible for the injected file to not use ordered events as well?
Yes, if you inject on a file with no timestamps and then open it, then the function queue_event() in ordered-events.c is not hit.
If you create a file based on one with timestamps, then the queue_event() function is hit even on the injected file.
The relevant bit of code is here:
	if (tool->ordered_events) {
		u64 timestamp = -1ULL;

		ret = evlist__parse_sample_timestamp(evlist, event, &timestamp);
		if (ret && ret != -1)
			return ret;

		ret = perf_session__queue_event(session, event, timestamp, file_offset);
		if (ret != -ETIME)
			return ret;
	}

	return perf_session__deliver_event(session, event, tool, file_offset);
If tool->ordered_events is set AND the timestamp for the sample parses to something other than zero or -1, i.e. this check does not fire:

	if (!timestamp || timestamp == ~0ULL)
		return -ETIME;

then the event is added to the queue; otherwise it goes straight through to perf_session__deliver_event().

The ordering can be disabled manually with tool->ordered_events and --disable-order, and is also disabled with --dump-raw-trace.
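The queue-or-deliver decision described above can be modelled in isolation. This is only a sketch under simplifying assumptions: enum event_path and both functions are hypothetical names, and the real queueing step can fail in other ways that are ignored here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical label for where an event ends up. */
enum event_path { EVENT_QUEUED, EVENT_DELIVERED };

/* Mirrors the quoted check: 0 and ~0ULL are not usable for ordering. */
static int sketch_queue_event(uint64_t timestamp)
{
	if (!timestamp || timestamp == ~0ULL)
		return -ETIME;
	return 0;
}

/*
 * Events are queued only when ordering is enabled and the timestamp
 * is usable; everything else is delivered immediately.
 */
static enum event_path sketch_process_event(bool ordered_events,
					    uint64_t timestamp)
{
	if (ordered_events) {
		int ret = sketch_queue_event(timestamp);

		/* In this sketch the queue step succeeds or reports -ETIME. */
		assert(ret == 0 || ret == -ETIME);
		if (ret != -ETIME)
			return EVENT_QUEUED;
	}
	return EVENT_DELIVERED;
}
```

With ordering enabled, a timestamp such as 381165621595220 takes the queued path, while 0 or ~0ULL falls through to immediate delivery; with ordering disabled every event is delivered immediately.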
It seems like processing the file only really works when all events are unordered but in the right order, or ordered with the right timestamps set.
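To illustrate why mixing the two cases goes wrong, here is a small simulation of delivery order. It is a sketch, not perf's implementation: struct sketch_event and the functions are hypothetical, and it assumes the queue is flushed only once, at the end of the file, whereas the real code may flush earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified event: just a label and a timestamp. */
struct sketch_event {
	const char *name;
	uint64_t timestamp;
};

static bool sketch_usable_timestamp(uint64_t timestamp)
{
	return timestamp != 0 && timestamp != ~0ULL;
}

static int sketch_cmp_timestamp(const void *a, const void *b)
{
	const struct sketch_event *ea = a, *eb = b;

	return (ea->timestamp > eb->timestamp) - (ea->timestamp < eb->timestamp);
}

/*
 * Writes the order in which in[0..n) would be delivered into out[0..n).
 * Events without a usable timestamp (or all events, when ordering is
 * off) are delivered immediately in file order. The rest are held back
 * and delivered sorted by timestamp at the end (qsort is not stable, so
 * ties may be reordered).
 */
static void sketch_delivery_order(const struct sketch_event *in, size_t n,
				  struct sketch_event *out, bool ordered)
{
	size_t pos = 0, first_queued, i;

	assert(in != NULL && out != NULL);

	for (i = 0; i < n; i++)
		if (!ordered || !sketch_usable_timestamp(in[i].timestamp))
			out[pos++] = in[i];

	first_queued = pos;
	for (i = 0; i < n; i++)
		if (ordered && sketch_usable_timestamp(in[i].timestamp))
			out[pos++] = in[i];

	qsort(out + first_queued, n - first_queued, sizeof(*out),
	      sketch_cmp_timestamp);
}
```

Given a COMM event with timestamp 100 followed in the file by a SAMPLE with timestamp 0, the ordered path in this model delivers the SAMPLE first, because the COMM is held in the queue. With ordering off, or with both events carrying sensible timestamps, the COMM is delivered first.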
Could you confirm that Intel PT works well in per-cpu mode with an injected file?
Yes, it seems like synthesised samples are assigned sensible timestamps.
	perf record -e intel_pt//u top
	perf inject -i perf.data -o perf-intel-per-cpu.inject.data --itrace=i100i --strip
	perf report -i perf-intel-per-cpu.inject.data -D
Results in the correct binary and DSO names and the SAMPLE timestamp is after the COMM:
	0 381165621595220 0x1200 [0x38]: PERF_RECORD_COMM exec: top:20173/20173
	...
	2 381165622169297 0x13b0 [0x38]: PERF_RECORD_SAMPLE(IP, 0x2): 20173/20173: 0x7fdaa14abf53 period: 100 addr: 0
	... thread: top:20173
	...... dso: /lib/x86_64-linux-gnu/ld-2.27.so
Per-thread also works, but no samples or events have timestamps.
So it's not really about --per-thread vs per-cpu mode, it's actually about whether PERF_SAMPLE_TIME is set, which is set as a by-product of per-cpu mode.
I hope I understood your question properly.
Thanks for the info, and sorry if I missed anything you had already explained.
Leo