On Mon, 20 May 2019 at 12:06, Mathieu Poirier <mathieu.poirier@linaro.org> wrote:
Hi Wojciech,
On Fri, May 10, 2019 at 01:54:25PM +0000, Wojciech Żmuda wrote:
From: Wojciech Zmuda <wzmuda@n7space.com>
Populate the time field in branch and instruction samples generated from RANGE, EXCEPTION and EXCEPTION_RET packets. The time value is based on TIMESTAMP packets that may appear in the trace.
The time value may be accessed through the perf script built-in script engine via the sample dict 'time' field, or printed with 'perf script -F time' (the latter only in CPU-wide mode at the moment).
Signed-off-by: Wojciech Zmuda <wzmuda@n7space.com>
 tools/perf/util/cs-etm.c | 36 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/cs-etm.h |  1 +
 2 files changed, 37 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 8496190ad553..838726748cd7 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -204,6 +204,7 @@ static void cs_etm__clear_packet_queue(struct cs_etm_packet_queue *queue)
 		queue->packet_buffer[i].exception_number = UINT32_MAX;
 		queue->packet_buffer[i].trace_chan_id = UINT8_MAX;
 		queue->packet_buffer[i].cpu = INT_MIN;
+		queue->packet_buffer[i].timestamp = 0;
 	}
 }
@@ -1098,6 +1099,7 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
 	sample.flags = tidq->prev_packet->flags;
 	sample.insn_len = 1;
 	sample.cpumode = event->sample.header.misc;
+	sample.time = tidq->packet->timestamp;

 	if (etm->synth_opts.last_branch) {
 		cs_etm__copy_last_branch_rb(etmq, tidq);
@@ -1157,6 +1159,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq,
 	sample.cpu = tidq->packet->cpu;
 	sample.flags = tidq->prev_packet->flags;
 	sample.cpumode = event->sample.header.misc;
+	sample.time = tidq->packet->timestamp;

 	/*
 	 * perf report cannot handle events without a branch stack
@@ -1894,6 +1897,37 @@ static int cs_etm__decode_data_block(struct cs_etm_queue *etmq)
 	return ret;
 }
+static void
+cs_etm__timestamp_traceid_queue_packet(struct cs_etm_traceid_queue *tidq)
+{
+	/*
+	 * At this point, the packet queue's timestamp and next_timestamp
+	 * fields represent the start and stop timestamp values for the range
+	 * packets buffered in the queue. Since timestamp packets come once
+	 * for every N range packets, we cannot directly assign timestamp
+	 * values to specific range packets. Some estimation is done to tell
+	 * at what approximate timestamp execution of each instruction range
+	 * started.
+	 */
+	if (!tidq->prev_packet->timestamp) {
+		/* First range packet in this queue */
+		tidq->packet->timestamp = tidq->packet_queue.timestamp;
+	} else {
+		/*
+		 * Similarly to what is done in
+		 * cs_etm_decoder__do_{soft|hard}_timestamp(), adjust the
+		 * timestamp value by the number of executed instructions, so
+		 * that packets representing ranges of instructions executed
+		 * later have a higher timestamp value than those executed
+		 * earlier, in line with the number of instructions executed
+		 * in the previous range. Currently, the assumption is that
+		 * the execution time of each instruction equals one
+		 * timestamp cycle.
+		 *
+		 * TODO: find a better approximation.
+		 */
+		tidq->packet->timestamp = tidq->prev_packet->timestamp +
+					  tidq->prev_packet->instr_count;
+	}
+}
This will work but, as you said in the cover letter, only in CPU-wide trace scenarios.
To me, the first thing to do is to confine time management to cs-etm-decoder.c so that the front end (i.e. cs-etm.c) only has to use the time. I also think each range packet should be timestamped as it arrives. To do so, two things need to happen (a rough sketch follows the list):
- Fix function cs_etm_decoder__do_soft_timestamp(). On line 282 [1], the timestamp of
  the packet should be set equal to the timestamp of the queue. Line 287 and below
  should be moved to cs_etm_decoder__buffer_range(), exactly where the call to
  cs_etm_decoder__do_soft_timestamp() currently sits. The reworked
  cs_etm_decoder__do_soft_timestamp() should then be moved to line 436. That way we
  timestamp all packets and keep the current code that triggers the front end's
  attention in CPU-wide scenarios.
The above addresses the case where we've already seen a timestamp.
- Fix function cs_etm_decoder__do_hard_timestamp() to timestamp all range
  packets already in the packet_queue when the first timestamp is encountered.
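To make the idea concrete, here is a rough, untested sketch of both cases. It is deliberately standalone: the types and function names below are simplified placeholders that only loosely mirror the real cs-etm-decoder.c structures, so treat every name and signature as an assumption rather than the actual perf code.

/*
 * Sketch: stamp each range packet as it is buffered in the decoder.
 * All types/names are made up for illustration only.
 */
#include <stdint.h>
#include <stdio.h>

#define SKETCH_QUEUE_SZ 8

struct sketch_packet {
	uint64_t timestamp;	/* estimated start time of the range */
	uint32_t instr_count;	/* instructions executed in the range */
};

struct sketch_packet_queue {
	uint64_t timestamp;		/* time of the last TIMESTAMP element */
	uint64_t next_timestamp;	/* running estimate for the next range */
	uint32_t head;			/* number of buffered range packets */
	struct sketch_packet buf[SKETCH_QUEUE_SZ];
};

/*
 * Case 1: a timestamp has already been seen.  Called where the range packet
 * is buffered, so every packet leaves the decoder with a time value.  The
 * estimate advances by the instruction count, i.e. the same "one timestamp
 * cycle per instruction" approximation used in the patch.
 */
static void sketch_soft_timestamp(struct sketch_packet_queue *q,
				  struct sketch_packet *p)
{
	if (!q->timestamp)
		return;	/* nothing seen yet, handled by the hard case */

	p->timestamp = q->next_timestamp;
	q->next_timestamp += p->instr_count;
}

/*
 * Case 2: the first TIMESTAMP element arrives after range packets have
 * already been buffered.  Back-fill them by walking the queue backwards,
 * since those ranges executed before the timestamp was emitted.
 */
static void sketch_hard_timestamp(struct sketch_packet_queue *q, uint64_t ts)
{
	uint64_t est = ts;
	uint32_t i = q->head;

	while (i--) {
		est -= q->buf[i].instr_count;
		q->buf[i].timestamp = est;
	}

	q->timestamp = ts;
	q->next_timestamp = ts;
}

int main(void)
{
	struct sketch_packet_queue q = { 0 };
	struct sketch_packet p = { .instr_count = 3 };

	/* Two ranges buffered before any TIMESTAMP element is seen... */
	q.buf[q.head++] = (struct sketch_packet){ .instr_count = 10 };
	q.buf[q.head++] = (struct sketch_packet){ .instr_count = 5 };

	/* ...then the first timestamp (1000) arrives and back-fills them. */
	sketch_hard_timestamp(&q, 1000);

	/* A later range is stamped as it is buffered. */
	sketch_soft_timestamp(&q, &p);

	printf("%llu %llu %llu\n",
	       (unsigned long long)q.buf[0].timestamp,	/* 985 */
	       (unsigned long long)q.buf[1].timestamp,	/* 995 */
	       (unsigned long long)p.timestamp);	/* 1000 */

	return 0;
}

Whether walking backwards from the timestamp element is the right approximation for the hardware is debatable, but it matches the intuition that the buffered ranges executed before the timestamp was generated; the details belong with the TODO already noted in your patch.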
That should cover all bases. To answer your question from the cover letter: timeless_decoding == per-thread.
I forgot... This might also be of interest to you: https://lkml.org/lkml/2019/5/20/494
Mathieu
[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git/tree/tools/perf/...
 static int cs_etm__process_traceid_queue(struct cs_etm_queue *etmq,
 					 struct cs_etm_traceid_queue *tidq)
 {
@@ -1926,6 +1960,7 @@ static int cs_etm__process_traceid_queue(struct cs_etm_queue *etmq,
 		switch (tidq->packet->sample_type) {
 		case CS_ETM_RANGE:
+			cs_etm__timestamp_traceid_queue_packet(tidq);
 			/*
 			 * If the packet contains an instruction
 			 * range, generate instruction sequence
@@ -1935,6 +1970,7 @@ static int cs_etm__process_traceid_queue(struct cs_etm_queue *etmq,
 			break;
 		case CS_ETM_EXCEPTION:
 		case CS_ETM_EXCEPTION_RET:
+			cs_etm__timestamp_traceid_queue_packet(tidq);
 			/*
 			 * If the exception packet is coming,
 			 * make sure the previous instruction
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 1f6ec7babe70..a14b0220eaa7 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -122,6 +122,7 @@ struct cs_etm_packet {
 	enum cs_etm_isa isa;
 	u64 start_addr;
 	u64 end_addr;
+	u64 timestamp;
 	u32 instr_count;
 	u32 last_instr_type;
 	u32 last_instr_subtype;
--
2.11.0