Hi Adrian,
On Wed, Nov 18, 2020 at 05:57:09PM +0200, Adrian Hunter wrote:
[...]
The follow diagram depicts the flow for how the AUX event is arriving ahead than MMAP2 event:
T1: T3: T4:
perf process Open PMU device Perf is scheduled out; invoke perf_aux_output_end() and generate AUX event ^ ^ ^ | | | CPU0 ---------------------------------------------------> (T) \ \ Forked process is placed on another CPU V CPU1 ---------------------------------------------------> (T) | | V V T2: T5: Invoke execvp() for profiled Record MMAP2 event program
I don't understand. An MMAP at T5 cannot affect data recorded before T5 (e.g. T4). Are you saying the timestamps from different CPUs are not in time order? Ot are you saying the data is actually recorded after T5?
Here refers to the later one.
I observed the PMU device is opened and closed for multiple times, so the AUX events and MMAP2 events are interleave with each other. Below is the dumping with debugging option "--debug ordered-events=1" (and I added code for printing out event type value).
When perf tool starts to decode the trace data for the first AUX event (its timestamp is 160.767623) and try to synthesize samples, it doesn't give chance for handling the sequential MMAP2 events (ts: 160.770223, 160.770451, 160.773111).
[ 160.765623] queue_event nr_events 1 [ 160.765623] new event on queue: type=12 [ 160.765714] queue_event nr_events 2 [ 160.765714] new event on queue: type=3 [ 160.765822] queue_event nr_events 3 [ 160.765822] new event on queue: type=10 ==> MMAP2 event [ 160.766180] queue_event nr_events 4 [ 160.766180] new event on queue: type=14 [ 160.767623] queue_event nr_events 5 [ 160.767623] new event on queue: type=11 ==> AUX event [ 160.769710] queue_event nr_events 6 [ 160.769710] new event on queue: type=12 [ 160.769762] queue_event nr_events 7 [ 160.769762] new event on queue: type=14 [ 160.769810] queue_event nr_events 8 [ 160.769810] new event on queue: type=14 [ 160.769943] queue_event nr_events 9 [ 160.769943] new event on queue: type=11 ==> AUX event [ 160.770028] queue_event nr_events 10 [ 160.770028] new event on queue: type=12 [ 160.770073] queue_event nr_events 11 [ 160.770073] new event on queue: type=14 [ 160.770223] queue_event nr_events 12 [ 160.770223] new event on queue: type=10 ==> MMAP2 event [ 160.770451] queue_event nr_events 13 [ 160.770451] new event on queue: type=10 ==> MMAP2 event [ 160.770628] queue_event nr_events 14 [ 160.770628] new event on queue: type=14 [ 160.771650] queue_event nr_events 15 [ 160.771650] new event on queue: type=11 ==> AUX event [ 160.771752] queue_event nr_events 16 [ 160.771752] new event on queue: type=12 [ 160.771798] queue_event nr_events 17 [ 160.771798] new event on queue: type=14 [ 160.773111] queue_event nr_events 18 [ 160.773111] new event on queue: type=10 ==> MMAP2 event [ 160.782613] queue_event nr_events 19 [ 160.782613] new event on queue: type=14
Thanks for the reviewing,
Leo
When you say AUX event, do you mean PERF_RECORD_AUX or PERF_RECORD_AUXTRACE?
The AUX event is PERF_RECORD_AUX.
The kernel creates PERF_RECORD_AUX and the data range aux_offset/aux_size should be for things prior to the sample time.
perf tool creates PERF_RECORD_AUXTRACE when copying data out of the aux area. PERF_RECORD_AUXTRACE does not have a timestamp and can contain data corresponding to several PERF_RECORD_AUX. Data is queued based on PERF_RECORD_AUXTRACE because that is what the data is attached to i.e. one buffer can contain data from several PERF_RECORD_AUX events.
If you are processing data based on PERF_RECORD_AUX timestamp then you need to pay attention to the offset. PERF_RECORD_AUX gives you aux_offset/aux_size and auxtrace_buffer (which may contain data from several PERF_RECORD_AUX) gives you offset/size.
Yeah, as you suggested, in this case the key point is for the first PERF_RECORD_AUX event handling, in theory it should contain very short trace data and exit the decoding quickly. The issue is the first PERF_RECORD_AUX event will decode all AUX trace data; and near the end of events handling, the tool finally has chance to process MMAP2 events.
So this RFC patch is not the right way, I will dig more for PERF_RECORD_AUX event handling for cs-etm.
Thanks for the detailed explanation and suggetions.
Leo