Hi Mike,
Thanks for the suggestions.
On Wed, Dec 09, 2020 at 04:38:24PM +0000, Mike Leach wrote:
[...]
Otherwise, as Adrian Hunter suggested: "If you are processing data based on PERF_RECORD_AUX timestamp then you need to pay attention to the offset. PERF_RECORD_AUX gives you aux_offset/aux_size and auxtrace_buffer (which may contain data from several PERF_RECORD_AUX) gives you offset/size.", so should we extract aux_offset and aux_size from each PERF_RECORD_AUX event and decode no more than aux_size bytes of trace data per PERF_RECORD_AUX event?
Yes - I think this has to happen in order for accurate decode to occur.
This means we need to add additional checking for buffer length in the functions cs_etm__process_queues() and cs_etm__run_decoder() to avoid processing any trace data not belonging to the event.
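To make the length check concrete, here is a minimal sketch of how the decodable range could be clipped. The struct and function names (aux_range, trace_buf, clip_buf_to_aux) are hypothetical stand-ins and not existing perf code; it also assumes the offsets do not wrap around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the offset/size carried by one PERF_RECORD_AUX event. */
struct aux_range {
	uint64_t aux_offset;
	uint64_t aux_size;
};

/* Hypothetical: the offset/size of one auxtrace_buffer. */
struct trace_buf {
	uint64_t offset;
	uint64_t size;
};

/*
 * Intersect the buffer with the AUX record's range. On success, *start is
 * the byte index into the buffer and *len is the number of bytes the
 * decoder may consume for this event. Returns false if nothing overlaps.
 */
static bool clip_buf_to_aux(const struct trace_buf *buf,
			    const struct aux_range *aux,
			    uint64_t *start, uint64_t *len)
{
	uint64_t lo, hi, aux_end, buf_end;

	assert(buf && aux && start && len);

	aux_end = aux->aux_offset + aux->aux_size;
	buf_end = buf->offset + buf->size;

	lo = aux->aux_offset > buf->offset ? aux->aux_offset : buf->offset;
	hi = aux_end < buf_end ? aux_end : buf_end;

	if (lo >= hi)
		return false;

	*start = lo - buf->offset;
	*len = hi - lo;
	return true;
}
```

The idea would be that cs_etm__run_decoder() is only handed the len bytes starting at start, so it never runs past the data belonging to the current event.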
So, although the issues I am seeing are not directly related to perf snapshots, I do believe that both problems require a re-thinking of the handling of the perf records.
a) PERF_RECORD_AUX events appear to be directly related to PERF_RECORD_AUXTRACE buffers by index, offset and size. This should allow the correct parsing of a single PERF_RECORD_AUXTRACE buffer into a set of hw capture buffers according to the offset and size of the AUX records. This will allow us to restart the decoder appropriately at the start of each hw buffer.
b) Even if the arch timer is not running the coresight timestamp, or the coresight timestamp is not enabled, will it be correct to associate the arch timestamp in the AUX record with that in the MMAP2 records, and therefore have all the information needed for correct decode of the block of trace in the AUXTRACE record associated with the AUX record?
There are several concepts here:
- The timestamp in PERF_RECORD_AUX event;
- The timestamp in PERF_RECORD_MMAP2 event;
- The timestamp which is contained in the CoreSight trace data.
The first two timestamps come from kernel system time; the third comes from the decoded CoreSight packets and contains only the raw counter value.
So at the current stage we only know the timestamps of the PERF_RECORD_AUX and PERF_RECORD_MMAP2 events. If the PERF_RECORD_AUX event arrives ahead of the PERF_RECORD_MMAP2 event, we can simply handle the PERF_RECORD_AUX event and skip synthesizing samples, since the DSO has not been loaded yet. But this is inaccurate: the CoreSight trace data can generate many samples, so some samples might be earlier and others later than the PERF_RECORD_MMAP2 event.
If we cannot convert the CoreSight timestamp to kernel time, we must drop the CoreSight trace data that precedes the PERF_RECORD_MMAP2 events. This is still better than the current implementation, which decodes the trace data in one go and has no chance to handle PERF_RECORD_MMAP2 events.
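As a rough illustration of that trade-off, the sketch below replays records in kernel-time order and drops any AUX chunk seen before the first MMAP2 event. Everything here (rec, replay_stats, replay_in_time_order) is hypothetical and heavily simplified: a real implementation would look up addresses in the thread's maps rather than just counting MMAP2 events.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the two record types of interest. */
enum rec_type { REC_MMAP2, REC_AUX };

struct rec {
	enum rec_type type;
	uint64_t time;		/* kernel timestamp of the record */
};

struct replay_stats {
	size_t decoded;		/* AUX chunks decoded with maps available */
	size_t dropped;		/* AUX chunks seen before any MMAP2 */
};

/*
 * Walk the records in timestamp order. An AUX chunk is decoded only if at
 * least one MMAP2 event has already been handled; otherwise it is dropped.
 */
static struct replay_stats replay_in_time_order(const struct rec *recs,
						size_t n)
{
	struct replay_stats st = { 0, 0 };
	size_t maps_known = 0;
	uint64_t last = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* The caller must pass records sorted by time. */
		assert(recs[i].time >= last);
		last = recs[i].time;

		if (recs[i].type == REC_MMAP2)
			maps_known++;
		else if (maps_known)
			st.decoded++;
		else
			st.dropped++;
	}
	return st;
}
```

With this ordering only the chunks before the first MMAP2 event are lost, whereas decoding everything in one go never gets to interleave the MMAP2 events at all.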
c) At present the handler in cs-etm for the AUX records returns immediately when dumping trace. This leads to errors in dumping the AUXTRACE record for the reasons given above. Again, this suggests that we need to split the AUXTRACE record up according to the size / offset parameters in the AUX records in the dump trace case.
d) Correctly splitting the AUXTRACE buffer using AUX records will allow us to remove the barrier packets in the ETR / ETB drivers. This is at best a workaround for a problem that may have been shown to reside elsewhere.
In short I think that for CoreSight we need to correlate and associate the AUX and AUXTRACE records to correctly dump the trace packets, and add in the correlation of the correct MMAP2 events for full trace decode.
Could you let me know the latest status of this work? If you have made progress, I am very glad to test and review the related changes. Otherwise, I will try some experiments for this (after we agree on the approach).
Thanks,
Leo