From: Wojciech Zmuda wzmuda@n7space.com
This patchset adds time notion to perf instruction and branch samples to allow coarse time measurement of code blocks execution.
The simplest verification is visibility of the time field in 'perf script' output:
root@zynq:~# perf record -e cs_etm/timestamp,@fe970000.etr/u -a sleep 1 Couldn't synthesize bpf events. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.262 MB perf.data ]
root@zynq:~# perf script --ns -F cpu,comm,time perf [002] 9546.053455325: perf [002] 9546.053455340: perf [002] 9546.053455344: (...) sleep [003] 9546.060163742: sleep [003] 9546.060163754: sleep [003] 9546.060163766: (...) ntpd [001] 9546.389083194: ntpd [001] 9546.389083400: ntpd [001] 9546.389086319: (...)
The step above works only if trace has been collected in CPU-wide mode because of some perf event flags mismatch I'm working on fixing.
In general, timestamps in subsequent samples are monotonically increasing. Exceptions are: - discontinuities in trace. From my understanding, we can't timestamp discontinuities reasonably, since after decoder synchronizes back after trace loss, it needs to wait for another timestamp packet. Thus, time value of such samples stays at 0x0. - cases when a lot of non-branching instructions is executed between two subsequent timestamps. Since we use approximation of timestamps for subsequent samples by increasing the timestamp value by instruction count, it is possible that we go past the value of the next hardware timestamp.
Another way to access these values is to use the perf script engine, which I used for validation of the feature. The script below calculates timestamp differences of two consecutive branches sharing the same branch address. This is a simple example of execution time fluctuation detector.
from __future__ import print_function
import os import sys
sys.path.append(os.environ['PERF_EXEC_PATH'] + \ '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
from perf_trace_context import *
target_start_addr = int('4005e4', 16) # 0x4005e4 is func() from listing below
branch = dict() branch['from'] = 0 branches = []
def process_event(s): global branch global branches
sample = s['sample']
branch['cpu'] = sample['cpu']
if not branch['from']: branch['from'] = sample['addr'] branch['ts'] = sample['time'] return
branch['to'] = sample['ip'] if not branch['to']: branch['from'] = 0 branch['ts'] = 0 return
if branch['from'] and branch['to']: branches.append(branch.copy()) branch['from'] = 0 return
def trace_end(): global branches
count = 0 timestamp_start = 0
print("Got {0} samples:".format(len(branches)))
for b in branches: if b['from'] == target_start_addr: if not timestamp_start: timestamp_start = b['ts'] continue
print("[{0}]: ts diff = 0x{1:x} - 0x{2:x} = {3:d}".format(count, b['ts'], timestamp_start, b['ts'] - timestamp_start)) count = count + 1 timestamp_start = b['ts']
The following function was traced:
static int func(int cnt) { volatile int x = 0; static int i;
x += cnt + 0xdeadbeefcafeb00b; (...) /* repeats ~100 times */
if (i++ % 3 == 0) // Every third execution is longer usleep(1000);
return x; }
root@zynq:~# perf record -m,16K -e cs_etm/timestamp,@fe970000.etr/u \ --filter 'filter func @./program \ --per-thread ./program Couldn't synthesize bpf events. CTRL+C me when you find appropriate. ^C[ perf record: Woken up 12 times to write data ] [ perf record: Captured and wrote 0.135 MB perf.data ]
root@zynq:~# perf script -s exectime.py Got 2469 samples: [0]: ts diff = 0x92f2752e512 - 0x92f274a7ae9 = 551465 [1]: ts diff = 0x92f2752e694 - 0x92f2752e512 = 386 [2]: ts diff = 0x92f2752e817 - 0x92f2752e694 = 387 [3]: ts diff = 0x92f275bef12 - 0x92f2752e817 = 591611 [4]: ts diff = 0x92f275bf093 - 0x92f275bef12 = 385 [5]: ts diff = 0x92f275bf211 - 0x92f275bf093 = 382 [6]: ts diff = 0x92f276451d7 - 0x92f275bf211 = 548806 [7]: ts diff = 0x92f2764535a - 0x92f276451d7 = 387 [8]: ts diff = 0x92f276454d7 - 0x92f2764535a = 381 [9]: ts diff = 0x92f276cb256 - 0x92f276454d7 = 548223 [10]: ts diff = 0x92f276cb3d9 - 0x92f276cb256 = 387 [11]: ts diff = 0x92f276cb556 - 0x92f276cb3d9 = 381 (...)
At the listing above it is visible that every third execution of the function lasted longer than the other two. It is a naive example and could be enhanced to point to the area that caused the disruption by examining events 'in the middle' of the traced code range.
Applies cleanly on Mathieu's 5.1-rc3-cpu-wide-v3 branch.
Changes for V3: - fix checkpatch warnings - don't timestamp discontinuity packets with hardware timestamp, as it is for range packets only - fix timestamping packets on full queue in per-thread mode - simplify code in timestamping last branch samples before discontinuity and in timestamping pending packets
Changes for V2: - move packet timestamping logic to decoder. Front end only uses this information to timestamp samples (as suggested by Mathieu). - leave original behaviour of CPU-wide mode, where decoder is stopped and front end is triggered about pending queue with timestamp packet. At the same time, always adjust next and current timestamp in both CPU-wide and per-thread modes (as suggested by Mathieu). - when timestamp packet is encountered, timestamp range and discontinuity packets waiting in the queue, that are not yet consumed by the front end (as suggested by Mathieu). - don't timestamp exceptions, since they are not turned into branch nor instruction samples. - fix timestamping of the last branch sample before discontinuity appears (as suggested by Leo).
Wojciech Zmuda (1): perf cs-etm: Set time value for samples
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 70 ++++++++++++++++++------- tools/perf/util/cs-etm.c | 3 ++ tools/perf/util/cs-etm.h | 1 + 3 files changed, 55 insertions(+), 19 deletions(-)
From: Wojciech Zmuda wzmuda@n7space.com
Populate the 'time' field in branches and instruction samples generated from RANGE and DISCONTINUITY packets. The time value is based on TIMESTAMP packets that may appear in the trace.
The time value may be accessed through perf-script built-in script engine in the sample dict 'time' field or printed in 'perf script -F time' output (the latter only in CPU-wide mode at the moment).
Signed-off-by: Wojciech Zmuda wzmuda@n7space.com --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 70 ++++++++++++++++++------- tools/perf/util/cs-etm.c | 3 ++ tools/perf/util/cs-etm.h | 1 + 3 files changed, 55 insertions(+), 19 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 33e975c8d11b..a4d470eccbcf 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -269,25 +269,44 @@ cs_etm_decoder__create_etm_packet_printer(struct cs_etm_trace_params *t_params, trace_config); }
-static ocsd_datapath_resp_t -cs_etm_decoder__do_soft_timestamp(struct cs_etm_queue *etmq, - struct cs_etm_packet_queue *packet_queue, - const uint8_t trace_chan_id) +static void +cs_etm_decoder__timestamp_pending_packets(struct cs_etm_packet_queue *queue) +{ + struct cs_etm_packet *packet; + u64 next_timestamp; + u32 count, head; + + count = queue->packet_count; + if (count == 0) + return; + + head = queue->head; + next_timestamp = queue->timestamp; + while (count--) { + head = (head + 1) & (CS_ETM_PACKET_MAX_BUFFER - 1); + packet = &queue->packet_buffer[head]; + + if (packet->sample_type != CS_ETM_RANGE) + continue; + + packet->timestamp = next_timestamp; + /* TODO: better approximation of instruction range duration */ + next_timestamp += packet->instr_count; + } +} + +static void +cs_etm_decoder__do_soft_timestamp(struct cs_etm_packet_queue *packet_queue) { /* No timestamp packet has been received, nothing to do */ if (!packet_queue->timestamp) - return OCSD_RESP_CONT; + return;
packet_queue->timestamp = packet_queue->next_timestamp;
/* Estimate the timestamp for the next range packet */ packet_queue->next_timestamp += packet_queue->instr_count; packet_queue->instr_count = 0; - - /* Tell the front end which traceid_queue needs attention */ - cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id); - - return OCSD_RESP_WAIT; }
static ocsd_datapath_resp_t @@ -324,6 +343,9 @@ cs_etm_decoder__do_hard_timestamp(struct cs_etm_queue *etmq, packet_queue->next_timestamp = elem->timestamp; packet_queue->instr_count = 0;
+ /* Timestamp range packets that got queued up to this point */ + cs_etm_decoder__timestamp_pending_packets(packet_queue); + /* Tell the front end which traceid_queue needs attention */ cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
@@ -372,6 +394,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue, packet_queue->packet_buffer[et].flags = 0; packet_queue->packet_buffer[et].exception_number = UINT32_MAX; packet_queue->packet_buffer[et].trace_chan_id = trace_chan_id; + packet_queue->packet_buffer[et].timestamp = 0;
if (packet_queue->packet_count == CS_ETM_PACKET_MAX_BUFFER - 1) return OCSD_RESP_WAIT; @@ -435,10 +458,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, }
packet->last_instr_size = elem->last_instr_sz; - - /* per-thread scenario, no need to generate a timestamp */ - if (cs_etm__etmq_is_timeless(etmq)) - goto out; + packet_queue->instr_count += elem->num_instr_range; + cs_etm_decoder__do_soft_timestamp(packet_queue); + packet->timestamp = packet_queue->timestamp;
/* * The packet queue is full and we haven't seen a timestamp (had we @@ -448,10 +470,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, if (ret == OCSD_RESP_WAIT) goto out;
- packet_queue->instr_count += elem->num_instr_range; /* Tell the front end we have a new timestamp to process */ - ret = cs_etm_decoder__do_soft_timestamp(etmq, packet_queue, - trace_chan_id); + if (!cs_etm__etmq_is_timeless(etmq)) + cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id); out: return ret; } @@ -460,13 +481,24 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue, const uint8_t trace_chan_id) { + int ret; + struct cs_etm_packet *packet; + + ret = cs_etm_decoder__buffer_packet(queue, trace_chan_id, + CS_ETM_DISCONTINUITY); + + packet = &queue->packet_buffer[queue->tail]; + packet->timestamp = queue->timestamp; + /* * Something happened and who knows when we'll get new traces so * reset time statistics. + * Record the last observed timestamp, so the last branch sample + * before discontinuity can be marked with it. */ cs_etm_decoder__reset_timestamp(queue); - return cs_etm_decoder__buffer_packet(queue, trace_chan_id, - CS_ETM_DISCONTINUITY); + + return ret; }
static ocsd_datapath_resp_t diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8496190ad553..84d615ae3a1a 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -204,6 +204,7 @@ static void cs_etm__clear_packet_queue(struct cs_etm_packet_queue *queue) queue->packet_buffer[i].exception_number = UINT32_MAX; queue->packet_buffer[i].trace_chan_id = UINT8_MAX; queue->packet_buffer[i].cpu = INT_MIN; + queue->packet_buffer[i].timestamp = 0; } }
@@ -1098,6 +1099,7 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, sample.flags = tidq->prev_packet->flags; sample.insn_len = 1; sample.cpumode = event->sample.header.misc; + sample.time = tidq->packet->timestamp;
if (etm->synth_opts.last_branch) { cs_etm__copy_last_branch_rb(etmq, tidq); @@ -1157,6 +1159,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq, sample.cpu = tidq->packet->cpu; sample.flags = tidq->prev_packet->flags; sample.cpumode = event->sample.header.misc; + sample.time = tidq->packet->timestamp;
/* * perf report cannot handle events without a branch stack diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 1f6ec7babe70..a14b0220eaa7 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -122,6 +122,7 @@ struct cs_etm_packet { enum cs_etm_isa isa; u64 start_addr; u64 end_addr; + u64 timestamp; u32 instr_count; u32 last_instr_type; u32 last_instr_subtype;
On Fri, Jun 07, 2019 at 05:06:42PM +0000, Wojciech Żmuda wrote:
From: Wojciech Zmuda wzmuda@n7space.com
Populate the 'time' field in branches and instruction samples generated from RANGE and DISCONTINUITY packets. The time value is based on TIMESTAMP packets that may appear in the trace.
The time value may be accessed through perf-script built-in script engine in the sample dict 'time' field or printed in 'perf script -F time' output (the latter only in CPU-wide mode at the moment).
Signed-off-by: Wojciech Zmuda wzmuda@n7space.com
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 70 ++++++++++++++++++------- tools/perf/util/cs-etm.c | 3 ++ tools/perf/util/cs-etm.h | 1 + 3 files changed, 55 insertions(+), 19 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 33e975c8d11b..a4d470eccbcf 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -269,25 +269,44 @@ cs_etm_decoder__create_etm_packet_printer(struct cs_etm_trace_params *t_params, trace_config); } -static ocsd_datapath_resp_t -cs_etm_decoder__do_soft_timestamp(struct cs_etm_queue *etmq,
struct cs_etm_packet_queue *packet_queue,
const uint8_t trace_chan_id)
+static void +cs_etm_decoder__timestamp_pending_packets(struct cs_etm_packet_queue *queue) +{
- struct cs_etm_packet *packet;
- u64 next_timestamp;
- u32 count, head;
- count = queue->packet_count;
- if (count == 0)
return;
- head = queue->head;
- next_timestamp = queue->timestamp;
- while (count--) {
head = (head + 1) & (CS_ETM_PACKET_MAX_BUFFER - 1);
packet = &queue->packet_buffer[head];
if (packet->sample_type != CS_ETM_RANGE)
continue;
packet->timestamp = next_timestamp;
/* TODO: better approximation of instruction range duration */
next_timestamp += packet->instr_count;
- }
+}
+static void +cs_etm_decoder__do_soft_timestamp(struct cs_etm_packet_queue *packet_queue) { /* No timestamp packet has been received, nothing to do */ if (!packet_queue->timestamp)
return OCSD_RESP_CONT;
return;
packet_queue->timestamp = packet_queue->next_timestamp; /* Estimate the timestamp for the next range packet */ packet_queue->next_timestamp += packet_queue->instr_count; packet_queue->instr_count = 0;
- /* Tell the front end which traceid_queue needs attention */
- cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
- return OCSD_RESP_WAIT;
} static ocsd_datapath_resp_t @@ -324,6 +343,9 @@ cs_etm_decoder__do_hard_timestamp(struct cs_etm_queue *etmq, packet_queue->next_timestamp = elem->timestamp; packet_queue->instr_count = 0;
- /* Timestamp range packets that got queued up to this point */
- cs_etm_decoder__timestamp_pending_packets(packet_queue);
- /* Tell the front end which traceid_queue needs attention */ cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
@@ -372,6 +394,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue, packet_queue->packet_buffer[et].flags = 0; packet_queue->packet_buffer[et].exception_number = UINT32_MAX; packet_queue->packet_buffer[et].trace_chan_id = trace_chan_id;
- packet_queue->packet_buffer[et].timestamp = 0;
if (packet_queue->packet_count == CS_ETM_PACKET_MAX_BUFFER - 1) return OCSD_RESP_WAIT; @@ -435,10 +458,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, } packet->last_instr_size = elem->last_instr_sz;
- /* per-thread scenario, no need to generate a timestamp */
- if (cs_etm__etmq_is_timeless(etmq))
goto out;
- packet_queue->instr_count += elem->num_instr_range;
- cs_etm_decoder__do_soft_timestamp(packet_queue);
- packet->timestamp = packet_queue->timestamp;
/* * The packet queue is full and we haven't seen a timestamp (had we @@ -448,10 +470,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, if (ret == OCSD_RESP_WAIT) goto out;
- packet_queue->instr_count += elem->num_instr_range; /* Tell the front end we have a new timestamp to process */
- ret = cs_etm_decoder__do_soft_timestamp(etmq, packet_queue,
trace_chan_id);
- if (!cs_etm__etmq_is_timeless(etmq))
cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
out: return ret; } @@ -460,13 +481,24 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue, const uint8_t trace_chan_id) {
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
- packet = &queue->packet_buffer[queue->tail];
- packet->timestamp = queue->timestamp;
- /*
- Something happened and who knows when we'll get new traces so
- reset time statistics.
* Record the last observed timestamp, so the last branch sample
* before discontinuity can be marked with it.
Am I missing something or this comment should be added above, just before you deal with the packet timestamp?
Otherwise I think this set is working properly but see my comment on the cover letter.
*/
cs_etm_decoder__reset_timestamp(queue);
- return cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
- return ret;
} static ocsd_datapath_resp_t diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8496190ad553..84d615ae3a1a 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -204,6 +204,7 @@ static void cs_etm__clear_packet_queue(struct cs_etm_packet_queue *queue) queue->packet_buffer[i].exception_number = UINT32_MAX; queue->packet_buffer[i].trace_chan_id = UINT8_MAX; queue->packet_buffer[i].cpu = INT_MIN;
}queue->packet_buffer[i].timestamp = 0;
} @@ -1098,6 +1099,7 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, sample.flags = tidq->prev_packet->flags; sample.insn_len = 1; sample.cpumode = event->sample.header.misc;
- sample.time = tidq->packet->timestamp;
if (etm->synth_opts.last_branch) { cs_etm__copy_last_branch_rb(etmq, tidq); @@ -1157,6 +1159,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq, sample.cpu = tidq->packet->cpu; sample.flags = tidq->prev_packet->flags; sample.cpumode = event->sample.header.misc;
- sample.time = tidq->packet->timestamp;
/* * perf report cannot handle events without a branch stack diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 1f6ec7babe70..a14b0220eaa7 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -122,6 +122,7 @@ struct cs_etm_packet { enum cs_etm_isa isa; u64 start_addr; u64 end_addr;
- u64 timestamp; u32 instr_count; u32 last_instr_type; u32 last_instr_subtype;
-- 2.11.0
Hi Mathieu,
-----Original Message----- From: Mathieu Poirier mathieu.poirier@linaro.org Sent: Monday, June 10, 2019 8:02 PM To: Wojciech Żmuda wzmuda@n7space.com Cc: coresight@lists.linaro.org; leo.yan@linaro.org; Michal Mosdorf mmosdorf@n7space.com; Michał Kurowski mkurowski@n7space.com Subject: Re: [RFC PATCH v3 1/1] perf cs-etm: Set time value for samples
On Fri, Jun 07, 2019 at 05:06:42PM +0000, Wojciech Żmuda wrote:
From: Wojciech Zmuda wzmuda@n7space.com
Populate the 'time' field in branches and instruction samples generated from RANGE and DISCONTINUITY packets. The time value is based on TIMESTAMP packets that may appear in the trace.
The time value may be accessed through perf-script built-in script engine in the sample dict 'time' field or printed in 'perf script -F
time'
output (the latter only in CPU-wide mode at the moment).
Signed-off-by: Wojciech Zmuda wzmuda@n7space.com
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 70
++++++++++++++++++-------
tools/perf/util/cs-etm.c | 3 ++ tools/perf/util/cs-etm.h | 1 + 3 files changed, 55 insertions(+), 19 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 33e975c8d11b..a4d470eccbcf 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -269,25 +269,44 @@ cs_etm_decoder__create_etm_packet_printer(struct
cs_etm_trace_params *t_params,
trace_config);
}
-static ocsd_datapath_resp_t -cs_etm_decoder__do_soft_timestamp(struct cs_etm_queue *etmq,
struct cs_etm_packet_queue *packet_queue,
const uint8_t trace_chan_id)
+static void +cs_etm_decoder__timestamp_pending_packets(struct cs_etm_packet_queue +*queue) {
- struct cs_etm_packet *packet;
- u64 next_timestamp;
- u32 count, head;
- count = queue->packet_count;
- if (count == 0)
return;
- head = queue->head;
- next_timestamp = queue->timestamp;
- while (count--) {
head = (head + 1) & (CS_ETM_PACKET_MAX_BUFFER - 1);
packet = &queue->packet_buffer[head];
if (packet->sample_type != CS_ETM_RANGE)
continue;
packet->timestamp = next_timestamp;
/* TODO: better approximation of instruction range duration */
next_timestamp += packet->instr_count;
- }
+}
+static void +cs_etm_decoder__do_soft_timestamp(struct cs_etm_packet_queue +*packet_queue) { /* No timestamp packet has been received, nothing to do */ if (!packet_queue->timestamp)
return OCSD_RESP_CONT;
return;
packet_queue->timestamp = packet_queue->next_timestamp;
/* Estimate the timestamp for the next range packet */ packet_queue->next_timestamp += packet_queue->instr_count; packet_queue->instr_count = 0;
- /* Tell the front end which traceid_queue needs attention */
- cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
- return OCSD_RESP_WAIT;
}
static ocsd_datapath_resp_t @@ -324,6 +343,9 @@ cs_etm_decoder__do_hard_timestamp(struct
cs_etm_queue *etmq,
packet_queue->next_timestamp = elem->timestamp; packet_queue->instr_count = 0;
- /* Timestamp range packets that got queued up to this point */
- cs_etm_decoder__timestamp_pending_packets(packet_queue);
- /* Tell the front end which traceid_queue needs attention */ cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
@@ -372,6 +394,7 @@ cs_etm_decoder__buffer_packet(struct
cs_etm_packet_queue *packet_queue,
packet_queue->packet_buffer[et].flags = 0; packet_queue->packet_buffer[et].exception_number = UINT32_MAX; packet_queue->packet_buffer[et].trace_chan_id = trace_chan_id;
packet_queue->packet_buffer[et].timestamp = 0;
if (packet_queue->packet_count == CS_ETM_PACKET_MAX_BUFFER - 1) return OCSD_RESP_WAIT;
@@ -435,10 +458,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue
*etmq,
}
packet->last_instr_size = elem->last_instr_sz;
- /* per-thread scenario, no need to generate a timestamp */
- if (cs_etm__etmq_is_timeless(etmq))
goto out;
packet_queue->instr_count += elem->num_instr_range;
cs_etm_decoder__do_soft_timestamp(packet_queue);
packet->timestamp = packet_queue->timestamp;
/*
- The packet queue is full and we haven't seen a timestamp (had we
@@ -448,10 +470,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue
*etmq,
if (ret == OCSD_RESP_WAIT) goto out;
- packet_queue->instr_count += elem->num_instr_range; /* Tell the front end we have a new timestamp to process */
- ret = cs_etm_decoder__do_soft_timestamp(etmq, packet_queue,
trace_chan_id);
- if (!cs_etm__etmq_is_timeless(etmq))
cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
out: return ret; } @@ -460,13 +481,24 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue, const uint8_t trace_chan_id) {
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
- packet = &queue->packet_buffer[queue->tail];
- packet->timestamp = queue->timestamp;
- /*
- Something happened and who knows when we'll get new traces so
- reset time statistics.
* Record the last observed timestamp, so the last branch sample
* before discontinuity can be marked with it.
Am I missing something or this comment should be added above, just before you deal with the packet timestamp?
Of course. I'll fix this in v4.
Otherwise I think this set is working properly but see my comment on the cover letter.
Thanks for taking a look at this. At the moment I have a couple of items on my plate, but as soon as I'm done with them, I'll get back to this set and fix the issue mentioned in cover letter.
Best regards, Wojciech
*/
cs_etm_decoder__reset_timestamp(queue);
- return cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
- return ret;
}
static ocsd_datapath_resp_t diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8496190ad553..84d615ae3a1a 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -204,6 +204,7 @@ static void cs_etm__clear_packet_queue(struct
cs_etm_packet_queue *queue)
queue->packet_buffer[i].exception_number = UINT32_MAX; queue->packet_buffer[i].trace_chan_id = UINT8_MAX; queue->packet_buffer[i].cpu = INT_MIN;
}queue->packet_buffer[i].timestamp = 0;
}
@@ -1098,6 +1099,7 @@ static int cs_etm__synth_instruction_sample(struct
cs_etm_queue *etmq,
sample.flags = tidq->prev_packet->flags; sample.insn_len = 1; sample.cpumode = event->sample.header.misc;
sample.time = tidq->packet->timestamp;
if (etm->synth_opts.last_branch) { cs_etm__copy_last_branch_rb(etmq, tidq); @@ -1157,6 +1159,7 @@
static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq, sample.cpu = tidq->packet->cpu; sample.flags = tidq->prev_packet->flags; sample.cpumode = event->sample.header.misc;
sample.time = tidq->packet->timestamp;
/*
- perf report cannot handle events without a branch stack diff
--git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 1f6ec7babe70..a14b0220eaa7 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -122,6 +122,7 @@ struct cs_etm_packet { enum cs_etm_isa isa; u64 start_addr; u64 end_addr;
- u64 timestamp; u32 instr_count; u32 last_instr_type; u32 last_instr_subtype;
-- 2.11.0
Hi Wojciech,
On Fri, Jun 07, 2019 at 05:06:42PM +0000, Wojciech Żmuda wrote:
From: Wojciech Zmuda wzmuda@n7space.com
Populate the 'time' field in branches and instruction samples generated from RANGE and DISCONTINUITY packets. The time value is based on TIMESTAMP packets that may appear in the trace.
The time value may be accessed through perf-script built-in script engine in the sample dict 'time' field or printed in 'perf script -F time' output (the latter only in CPU-wide mode at the moment).
Signed-off-by: Wojciech Zmuda wzmuda@n7space.com
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 70 ++++++++++++++++++------- tools/perf/util/cs-etm.c | 3 ++ tools/perf/util/cs-etm.h | 1 + 3 files changed, 55 insertions(+), 19 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 33e975c8d11b..a4d470eccbcf 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -269,25 +269,44 @@ cs_etm_decoder__create_etm_packet_printer(struct cs_etm_trace_params *t_params, trace_config); } -static ocsd_datapath_resp_t -cs_etm_decoder__do_soft_timestamp(struct cs_etm_queue *etmq,
struct cs_etm_packet_queue *packet_queue,
const uint8_t trace_chan_id)
+static void +cs_etm_decoder__timestamp_pending_packets(struct cs_etm_packet_queue *queue) +{
- struct cs_etm_packet *packet;
- u64 next_timestamp;
- u32 count, head;
- count = queue->packet_count;
- if (count == 0)
return;
- head = queue->head;
- next_timestamp = queue->timestamp;
- while (count--) {
head = (head + 1) & (CS_ETM_PACKET_MAX_BUFFER - 1);
packet = &queue->packet_buffer[head];
if (packet->sample_type != CS_ETM_RANGE)
continue;
packet->timestamp = next_timestamp;
/* TODO: better approximation of instruction range duration */
next_timestamp += packet->instr_count;
- }
+}
+static void +cs_etm_decoder__do_soft_timestamp(struct cs_etm_packet_queue *packet_queue) { /* No timestamp packet has been received, nothing to do */ if (!packet_queue->timestamp)
return OCSD_RESP_CONT;
return;
packet_queue->timestamp = packet_queue->next_timestamp; /* Estimate the timestamp for the next range packet */ packet_queue->next_timestamp += packet_queue->instr_count; packet_queue->instr_count = 0;
- /* Tell the front end which traceid_queue needs attention */
- cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
- return OCSD_RESP_WAIT;
} static ocsd_datapath_resp_t @@ -324,6 +343,9 @@ cs_etm_decoder__do_hard_timestamp(struct cs_etm_queue *etmq, packet_queue->next_timestamp = elem->timestamp; packet_queue->instr_count = 0;
- /* Timestamp range packets that got queued up to this point */
- cs_etm_decoder__timestamp_pending_packets(packet_queue);
- /* Tell the front end which traceid_queue needs attention */ cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
@@ -372,6 +394,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue, packet_queue->packet_buffer[et].flags = 0; packet_queue->packet_buffer[et].exception_number = UINT32_MAX; packet_queue->packet_buffer[et].trace_chan_id = trace_chan_id;
- packet_queue->packet_buffer[et].timestamp = 0;
if (packet_queue->packet_count == CS_ETM_PACKET_MAX_BUFFER - 1) return OCSD_RESP_WAIT; @@ -435,10 +458,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, } packet->last_instr_size = elem->last_instr_sz;
- /* per-thread scenario, no need to generate a timestamp */
- if (cs_etm__etmq_is_timeless(etmq))
goto out;
- packet_queue->instr_count += elem->num_instr_range;
- cs_etm_decoder__do_soft_timestamp(packet_queue);
- packet->timestamp = packet_queue->timestamp;
/* * The packet queue is full and we haven't seen a timestamp (had we @@ -448,10 +470,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, if (ret == OCSD_RESP_WAIT) goto out;
- packet_queue->instr_count += elem->num_instr_range; /* Tell the front end we have a new timestamp to process */
- ret = cs_etm_decoder__do_soft_timestamp(etmq, packet_queue,
trace_chan_id);
- if (!cs_etm__etmq_is_timeless(etmq))
cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
out: return ret; } @@ -460,13 +481,24 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue, const uint8_t trace_chan_id) {
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
- packet = &queue->packet_buffer[queue->tail];
- packet->timestamp = queue->timestamp;
- /*
- Something happened and who knows when we'll get new traces so
- reset time statistics.
* Record the last observed timestamp, so the last branch sample
*/ cs_etm_decoder__reset_timestamp(queue);* before discontinuity can be marked with it.
- return cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
- return ret;
} static ocsd_datapath_resp_t diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8496190ad553..84d615ae3a1a 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -204,6 +204,7 @@ static void cs_etm__clear_packet_queue(struct cs_etm_packet_queue *queue) queue->packet_buffer[i].exception_number = UINT32_MAX; queue->packet_buffer[i].trace_chan_id = UINT8_MAX; queue->packet_buffer[i].cpu = INT_MIN;
}queue->packet_buffer[i].timestamp = 0;
} @@ -1098,6 +1099,7 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, sample.flags = tidq->prev_packet->flags; sample.insn_len = 1; sample.cpumode = event->sample.header.misc;
- sample.time = tidq->packet->timestamp;
if (etm->synth_opts.last_branch) { cs_etm__copy_last_branch_rb(etmq, tidq); @@ -1157,6 +1159,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq, sample.cpu = tidq->packet->cpu; sample.flags = tidq->prev_packet->flags; sample.cpumode = event->sample.header.misc;
- sample.time = tidq->packet->timestamp;
I think here we should use 'tidq->prev_packet->timstamp', since this sample is generated for the previous packet finishing and jump to the new address in the next packet.
If the timestamp value means the previous packet finish time but not its start time, it's more suitable to use the previous packet's timestamp.
@Mike, @Mathieu, could you double confirm for this?
I tested this patch on DB410c board, so get below result; seems to me we need to convert the raw timestamp to readable time format (e.g. to raw counter value to seconds format). I know before we have some discussion for this, should we firstly get ready for passing timestamp counter from CoreSight driver to perf firstly?
perf 10802 [001] 0.476703: 1 branches:u: 0 [unknown] ([unknown]) => ffffa68e8c9c __GI___ioctl+0xc (/lib/aarch64-linux-gnu/libc-2.27.so) perf 10802 [001] 0.476703: 1 branches:u: ffffa68e8ca4 __GI___ioctl+0x14 (/lib/aarch64-linux-gnu/libc-2.27.so) => aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad70 perf_evsel__enable+0x50 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ad84 perf_evsel__enable+0x64 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8add0 perf_evsel__enable+0xb0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ade0 perf_evsel__enable+0xc0 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad60 perf_evsel__enable+0x40 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ad68 perf_evsel__enable+0x48 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad88 perf_evsel__enable+0x68 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ada0 perf_evsel__enable+0x80 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9be77e0 ioctl@plt+0x0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9be77ec ioctl@plt+0xc (/mnt/linux-kernel/linux/tools/perf/perf) => ffffa68e8c90 __GI___ioctl+0x0 (/lib/aarch64-linux-gnu/libc-2.27.so) perf 10802 [001] 0.476703: 1 branches:u: ffffa68e8c98 __GI___ioctl+0x8 (/lib/aarch64-linux-gnu/libc-2.27.so) => 0 [unknown] ([unknown]) perf 10802 [001] 0.476711: 1 branches:u: 0 [unknown] ([unknown]) => ffffa68e8c9c __GI___ioctl+0xc (/lib/aarch64-linux-gnu/libc-2.27.so) perf 10802 [001] 0.476711: 1 branches:u: ffffa68e8ca4 __GI___ioctl+0x14 (/lib/aarch64-linux-gnu/libc-2.27.so) => aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad70 perf_evsel__enable+0x50 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ad84 perf_evsel__enable+0x64 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8add0 perf_evsel__enable+0xb0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ade0 perf_evsel__enable+0xc0 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad60 perf_evsel__enable+0x40 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ad68 perf_evsel__enable+0x48 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad88 perf_evsel__enable+0x68 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ada0 perf_evsel__enable+0x80 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9be77e0 ioctl@plt+0x0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9be77ec ioctl@plt+0xc (/mnt/linux-kernel/linux/tools/perf/perf) => ffffa68e8c90 __GI___ioctl+0x0 (/lib/aarch64-linux-gnu/libc-cs_etm__etmq_get_traceid_queue: enter
Thanks, Leo Yan
/* * perf report cannot handle events without a branch stack diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 1f6ec7babe70..a14b0220eaa7 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -122,6 +122,7 @@ struct cs_etm_packet { enum cs_etm_isa isa; u64 start_addr; u64 end_addr;
- u64 timestamp; u32 instr_count; u32 last_instr_type; u32 last_instr_subtype;
-- 2.11.0
On Fri, 21 Jun 2019 at 03:04, Leo Yan leo.yan@linaro.org wrote:
Hi Wojciech,
On Fri, Jun 07, 2019 at 05:06:42PM +0000, Wojciech Żmuda wrote:
From: Wojciech Zmuda wzmuda@n7space.com
Populate the 'time' field in branches and instruction samples generated from RANGE and DISCONTINUITY packets. The time value is based on TIMESTAMP packets that may appear in the trace.
The time value may be accessed through perf-script built-in script engine in the sample dict 'time' field or printed in 'perf script -F time' output (the latter only in CPU-wide mode at the moment).
Signed-off-by: Wojciech Zmuda wzmuda@n7space.com
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 70 ++++++++++++++++++------- tools/perf/util/cs-etm.c | 3 ++ tools/perf/util/cs-etm.h | 1 + 3 files changed, 55 insertions(+), 19 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 33e975c8d11b..a4d470eccbcf 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -269,25 +269,44 @@ cs_etm_decoder__create_etm_packet_printer(struct cs_etm_trace_params *t_params, trace_config); }
-static ocsd_datapath_resp_t -cs_etm_decoder__do_soft_timestamp(struct cs_etm_queue *etmq,
struct cs_etm_packet_queue *packet_queue,
const uint8_t trace_chan_id)
+static void +cs_etm_decoder__timestamp_pending_packets(struct cs_etm_packet_queue *queue) +{
struct cs_etm_packet *packet;
u64 next_timestamp;
u32 count, head;
count = queue->packet_count;
if (count == 0)
return;
head = queue->head;
next_timestamp = queue->timestamp;
while (count--) {
head = (head + 1) & (CS_ETM_PACKET_MAX_BUFFER - 1);
packet = &queue->packet_buffer[head];
if (packet->sample_type != CS_ETM_RANGE)
continue;
packet->timestamp = next_timestamp;
/* TODO: better approximation of instruction range duration */
next_timestamp += packet->instr_count;
}
+}
+static void +cs_etm_decoder__do_soft_timestamp(struct cs_etm_packet_queue *packet_queue) { /* No timestamp packet has been received, nothing to do */ if (!packet_queue->timestamp)
return OCSD_RESP_CONT;
return; packet_queue->timestamp = packet_queue->next_timestamp; /* Estimate the timestamp for the next range packet */ packet_queue->next_timestamp += packet_queue->instr_count; packet_queue->instr_count = 0;
/* Tell the front end which traceid_queue needs attention */
cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
return OCSD_RESP_WAIT;
}
static ocsd_datapath_resp_t @@ -324,6 +343,9 @@ cs_etm_decoder__do_hard_timestamp(struct cs_etm_queue *etmq, packet_queue->next_timestamp = elem->timestamp; packet_queue->instr_count = 0;
/* Timestamp range packets that got queued up to this point */
cs_etm_decoder__timestamp_pending_packets(packet_queue);
/* Tell the front end which traceid_queue needs attention */ cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
@@ -372,6 +394,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue, packet_queue->packet_buffer[et].flags = 0; packet_queue->packet_buffer[et].exception_number = UINT32_MAX; packet_queue->packet_buffer[et].trace_chan_id = trace_chan_id;
packet_queue->packet_buffer[et].timestamp = 0; if (packet_queue->packet_count == CS_ETM_PACKET_MAX_BUFFER - 1) return OCSD_RESP_WAIT;
@@ -435,10 +458,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, }
packet->last_instr_size = elem->last_instr_sz;
/* per-thread scenario, no need to generate a timestamp */
if (cs_etm__etmq_is_timeless(etmq))
goto out;
packet_queue->instr_count += elem->num_instr_range;
cs_etm_decoder__do_soft_timestamp(packet_queue);
packet->timestamp = packet_queue->timestamp; /* * The packet queue is full and we haven't seen a timestamp (had we
@@ -448,10 +470,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, if (ret == OCSD_RESP_WAIT) goto out;
packet_queue->instr_count += elem->num_instr_range; /* Tell the front end we have a new timestamp to process */
ret = cs_etm_decoder__do_soft_timestamp(etmq, packet_queue,
trace_chan_id);
if (!cs_etm__etmq_is_timeless(etmq))
cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
out: return ret; } @@ -460,13 +481,24 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue, const uint8_t trace_chan_id) {
int ret;
struct cs_etm_packet *packet;
ret = cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
packet = &queue->packet_buffer[queue->tail];
packet->timestamp = queue->timestamp;
/* * Something happened and who knows when we'll get new traces so * reset time statistics.
* Record the last observed timestamp, so the last branch sample
* before discontinuity can be marked with it. */ cs_etm_decoder__reset_timestamp(queue);
return cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
return ret;
}
static ocsd_datapath_resp_t diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8496190ad553..84d615ae3a1a 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -204,6 +204,7 @@ static void cs_etm__clear_packet_queue(struct cs_etm_packet_queue *queue) queue->packet_buffer[i].exception_number = UINT32_MAX; queue->packet_buffer[i].trace_chan_id = UINT8_MAX; queue->packet_buffer[i].cpu = INT_MIN;
queue->packet_buffer[i].timestamp = 0; }
}
@@ -1098,6 +1099,7 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, sample.flags = tidq->prev_packet->flags; sample.insn_len = 1; sample.cpumode = event->sample.header.misc;
sample.time = tidq->packet->timestamp; if (etm->synth_opts.last_branch) { cs_etm__copy_last_branch_rb(etmq, tidq);
@@ -1157,6 +1159,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq, sample.cpu = tidq->packet->cpu; sample.flags = tidq->prev_packet->flags; sample.cpumode = event->sample.header.misc;
sample.time = tidq->packet->timestamp;
I think here we should use 'tidq->prev_packet->timstamp', since this sample is generated for the previous packet finishing and jump to the new address in the next packet.
I think Leo has a point here.
If the timestamp value means the previous packet finish time but not its start time, it's more suitable to use the previous packet's timestamp.
To the front end (not the decoding code) the timestamp means the time this packet started executing at.
@Mike, @Mathieu, could you double confirm for this?
I tested this patch on DB410c board, so get below result; seems to me we need to convert the raw timestamp to readable time format (e.g. to raw counter value to seconds format). I know before we have some discussion for this, should we firstly get ready for passing timestamp counter from CoreSight driver to perf firstly?
At this time we can't transform the raw timestamp to a readable format, at least not in a way that can be used in a generic way on all platforms. That is because the frequency of the clock used for timestamps can vary from one platform to another without the possibility for a driver to query it. We would have to push that information to the DT/ACPI (assuming vendors would be willing to make it public) and carry it forward in the private AUX header for CS, the same kind of thing we currently do for session configuration specifics. It is possible and I am not against it, but it would demands a lot for work.
perf 10802 [001] 0.476703: 1 branches:u: 0 [unknown] ([unknown]) => ffffa68e8c9c __GI___ioctl+0xc (/lib/aarch64-linux-gnu/libc-2.27.so) perf 10802 [001] 0.476703: 1 branches:u: ffffa68e8ca4 __GI___ioctl+0x14 (/lib/aarch64-linux-gnu/libc-2.27.so) => aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad70 perf_evsel__enable+0x50 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ad84 perf_evsel__enable+0x64 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8add0 perf_evsel__enable+0xb0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ade0 perf_evsel__enable+0xc0 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad60 perf_evsel__enable+0x40 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ad68 perf_evsel__enable+0x48 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad88 perf_evsel__enable+0x68 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ada0 perf_evsel__enable+0x80 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9be77e0 ioctl@plt+0x0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9be77ec ioctl@plt+0xc (/mnt/linux-kernel/linux/tools/perf/perf) => ffffa68e8c90 __GI___ioctl+0x0 (/lib/aarch64-linux-gnu/libc-2.27.so) perf 10802 [001] 0.476703: 1 branches:u: ffffa68e8c98 __GI___ioctl+0x8 (/lib/aarch64-linux-gnu/libc-2.27.so) => 0 [unknown] ([unknown]) perf 10802 [001] 0.476711: 1 branches:u: 0 [unknown] ([unknown]) => ffffa68e8c9c __GI___ioctl+0xc (/lib/aarch64-linux-gnu/libc-2.27.so) perf 10802 [001] 0.476711: 1 branches:u: ffffa68e8ca4 __GI___ioctl+0x14 (/lib/aarch64-linux-gnu/libc-2.27.so) => aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad70 perf_evsel__enable+0x50 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ad84 perf_evsel__enable+0x64 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8add0 perf_evsel__enable+0xb0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ade0 perf_evsel__enable+0xc0 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad60 perf_evsel__enable+0x40 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ad68 perf_evsel__enable+0x48 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad88 perf_evsel__enable+0x68 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ada0 perf_evsel__enable+0x80 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9be77e0 ioctl@plt+0x0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9be77ec ioctl@plt+0xc (/mnt/linux-kernel/linux/tools/perf/perf) => ffffa68e8c90 __GI___ioctl+0x0 (/lib/aarch64-linux-gnu/libc-cs_etm__etmq_get_traceid_queue: enter
Thanks, Leo Yan
/* * perf report cannot handle events without a branch stack
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 1f6ec7babe70..a14b0220eaa7 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -122,6 +122,7 @@ struct cs_etm_packet { enum cs_etm_isa isa; u64 start_addr; u64 end_addr;
u64 timestamp; u32 instr_count; u32 last_instr_type; u32 last_instr_subtype;
-- 2.11.0
Hi,
On Fri, 21 Jun 2019 at 17:14, Mathieu Poirier mathieu.poirier@linaro.org wrote:
On Fri, 21 Jun 2019 at 03:04, Leo Yan leo.yan@linaro.org wrote:
Hi Wojciech,
On Fri, Jun 07, 2019 at 05:06:42PM +0000, Wojciech Żmuda wrote:
From: Wojciech Zmuda wzmuda@n7space.com
Populate the 'time' field in branches and instruction samples generated from RANGE and DISCONTINUITY packets. The time value is based on TIMESTAMP packets that may appear in the trace.
The time value may be accessed through perf-script built-in script engine in the sample dict 'time' field or printed in 'perf script -F time' output (the latter only in CPU-wide mode at the moment).
Signed-off-by: Wojciech Zmuda wzmuda@n7space.com
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 70 ++++++++++++++++++------- tools/perf/util/cs-etm.c | 3 ++ tools/perf/util/cs-etm.h | 1 + 3 files changed, 55 insertions(+), 19 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 33e975c8d11b..a4d470eccbcf 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -269,25 +269,44 @@ cs_etm_decoder__create_etm_packet_printer(struct cs_etm_trace_params *t_params, trace_config); }
-static ocsd_datapath_resp_t -cs_etm_decoder__do_soft_timestamp(struct cs_etm_queue *etmq,
struct cs_etm_packet_queue *packet_queue,
const uint8_t trace_chan_id)
+static void +cs_etm_decoder__timestamp_pending_packets(struct cs_etm_packet_queue *queue) +{
struct cs_etm_packet *packet;
u64 next_timestamp;
u32 count, head;
count = queue->packet_count;
if (count == 0)
return;
head = queue->head;
next_timestamp = queue->timestamp;
while (count--) {
head = (head + 1) & (CS_ETM_PACKET_MAX_BUFFER - 1);
packet = &queue->packet_buffer[head];
if (packet->sample_type != CS_ETM_RANGE)
continue;
packet->timestamp = next_timestamp;
/* TODO: better approximation of instruction range duration */
next_timestamp += packet->instr_count;
}
+}
+static void +cs_etm_decoder__do_soft_timestamp(struct cs_etm_packet_queue *packet_queue) { /* No timestamp packet has been received, nothing to do */ if (!packet_queue->timestamp)
return OCSD_RESP_CONT;
return; packet_queue->timestamp = packet_queue->next_timestamp; /* Estimate the timestamp for the next range packet */ packet_queue->next_timestamp += packet_queue->instr_count; packet_queue->instr_count = 0;
/* Tell the front end which traceid_queue needs attention */
cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
return OCSD_RESP_WAIT;
}
static ocsd_datapath_resp_t @@ -324,6 +343,9 @@ cs_etm_decoder__do_hard_timestamp(struct cs_etm_queue *etmq, packet_queue->next_timestamp = elem->timestamp; packet_queue->instr_count = 0;
/* Timestamp range packets that got queued up to this point */
cs_etm_decoder__timestamp_pending_packets(packet_queue);
/* Tell the front end which traceid_queue needs attention */ cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
@@ -372,6 +394,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue, packet_queue->packet_buffer[et].flags = 0; packet_queue->packet_buffer[et].exception_number = UINT32_MAX; packet_queue->packet_buffer[et].trace_chan_id = trace_chan_id;
packet_queue->packet_buffer[et].timestamp = 0; if (packet_queue->packet_count == CS_ETM_PACKET_MAX_BUFFER - 1) return OCSD_RESP_WAIT;
@@ -435,10 +458,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, }
packet->last_instr_size = elem->last_instr_sz;
/* per-thread scenario, no need to generate a timestamp */
if (cs_etm__etmq_is_timeless(etmq))
goto out;
packet_queue->instr_count += elem->num_instr_range;
cs_etm_decoder__do_soft_timestamp(packet_queue);
packet->timestamp = packet_queue->timestamp; /* * The packet queue is full and we haven't seen a timestamp (had we
@@ -448,10 +470,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, if (ret == OCSD_RESP_WAIT) goto out;
packet_queue->instr_count += elem->num_instr_range; /* Tell the front end we have a new timestamp to process */
ret = cs_etm_decoder__do_soft_timestamp(etmq, packet_queue,
trace_chan_id);
if (!cs_etm__etmq_is_timeless(etmq))
cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
out: return ret; } @@ -460,13 +481,24 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue, const uint8_t trace_chan_id) {
int ret;
struct cs_etm_packet *packet;
ret = cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
packet = &queue->packet_buffer[queue->tail];
packet->timestamp = queue->timestamp;
/* * Something happened and who knows when we'll get new traces so * reset time statistics.
* Record the last observed timestamp, so the last branch sample
* before discontinuity can be marked with it. */ cs_etm_decoder__reset_timestamp(queue);
return cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
return ret;
}
static ocsd_datapath_resp_t diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8496190ad553..84d615ae3a1a 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -204,6 +204,7 @@ static void cs_etm__clear_packet_queue(struct cs_etm_packet_queue *queue) queue->packet_buffer[i].exception_number = UINT32_MAX; queue->packet_buffer[i].trace_chan_id = UINT8_MAX; queue->packet_buffer[i].cpu = INT_MIN;
queue->packet_buffer[i].timestamp = 0; }
}
@@ -1098,6 +1099,7 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, sample.flags = tidq->prev_packet->flags; sample.insn_len = 1; sample.cpumode = event->sample.header.misc;
sample.time = tidq->packet->timestamp; if (etm->synth_opts.last_branch) { cs_etm__copy_last_branch_rb(etmq, tidq);
@@ -1157,6 +1159,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq, sample.cpu = tidq->packet->cpu; sample.flags = tidq->prev_packet->flags; sample.cpumode = event->sample.header.misc;
sample.time = tidq->packet->timestamp;
I think here we should use 'tidq->prev_packet->timstamp', since this sample is generated for the previous packet finishing and jump to the new address in the next packet.
I think Leo has a point here.
If the timestamp value means the previous packet finish time but not its start time, it's more suitable to use the previous packet's timestamp.
To the front end (not the decoding code) the timestamp means the time this packet started executing at.
@Mike, @Mathieu, could you double confirm for this?
You are correct - the ordering of the timestamps in the decode output follows that of the ordering in trace. A timestamp is trace is the time the previous waypoint was traced -this typically being the branch instruction that forms the end of a trace range output by the decoder. Thus the timestamp is associated with the end of the previous traced element - normally a trace range, but could be an exception element.
I tested this patch on DB410c board, so get below result; seems to me we need to convert the raw timestamp to readable time format (e.g. to raw counter value to seconds format). I know before we have some discussion for this, should we firstly get ready for passing timestamp counter from CoreSight driver to perf firstly?
At this time we can't transform the raw timestamp to a readable format, at least not in a way that can be used in a generic way on all platforms. That is because the frequency of the clock used for timestamps can vary from one platform to another without the possibility for a driver to query it. We would have to push that information to the DT/ACPI (assuming vendors would be willing to make it public) and carry it forward in the private AUX header for CS, the same kind of thing we currently do for session configuration specifics. It is possible and I am not against it, but it would demands a lot for work.
At present we cannot know if the clock driving the core or timestamp counter has changed during trace, so even having this information may not result in a 100% reliable conversion.
Regards
Mike
perf 10802 [001] 0.476703: 1 branches:u: 0 [unknown] ([unknown]) => ffffa68e8c9c __GI___ioctl+0xc (/lib/aarch64-linux-gnu/libc-2.27.so) perf 10802 [001] 0.476703: 1 branches:u: ffffa68e8ca4 __GI___ioctl+0x14 (/lib/aarch64-linux-gnu/libc-2.27.so) => aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad70 perf_evsel__enable+0x50 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ad84 perf_evsel__enable+0x64 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8add0 perf_evsel__enable+0xb0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ade0 perf_evsel__enable+0xc0 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad60 perf_evsel__enable+0x40 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ad68 perf_evsel__enable+0x48 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad88 perf_evsel__enable+0x68 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9c8ada0 perf_evsel__enable+0x80 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9be77e0 ioctl@plt+0x0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476703: 1 branches:u: aaaae9be77ec ioctl@plt+0xc (/mnt/linux-kernel/linux/tools/perf/perf) => ffffa68e8c90 __GI___ioctl+0x0 (/lib/aarch64-linux-gnu/libc-2.27.so) perf 10802 [001] 0.476703: 1 branches:u: ffffa68e8c98 __GI___ioctl+0x8 (/lib/aarch64-linux-gnu/libc-2.27.so) => 0 [unknown] ([unknown]) perf 10802 [001] 0.476711: 1 branches:u: 0 [unknown] ([unknown]) => ffffa68e8c9c __GI___ioctl+0xc (/lib/aarch64-linux-gnu/libc-2.27.so) perf 10802 [001] 0.476711: 1 branches:u: ffffa68e8ca4 __GI___ioctl+0x14 (/lib/aarch64-linux-gnu/libc-2.27.so) => aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ada4 perf_evsel__enable+0x84 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad70 perf_evsel__enable+0x50 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ad84 perf_evsel__enable+0x64 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8add0 perf_evsel__enable+0xb0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ade0 perf_evsel__enable+0xc0 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad60 perf_evsel__enable+0x40 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ad68 perf_evsel__enable+0x48 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9c8ad88 perf_evsel__enable+0x68 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9c8ada0 perf_evsel__enable+0x80 (/mnt/linux-kernel/linux/tools/perf/perf) => aaaae9be77e0 ioctl@plt+0x0 (/mnt/linux-kernel/linux/tools/perf/perf) perf 10802 [001] 0.476711: 1 branches:u: aaaae9be77ec ioctl@plt+0xc (/mnt/linux-kernel/linux/tools/perf/perf) => ffffa68e8c90 __GI___ioctl+0x0 (/lib/aarch64-linux-gnu/libc-cs_etm__etmq_get_traceid_queue: enter
Thanks, Leo Yan
/* * perf report cannot handle events without a branch stack
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 1f6ec7babe70..a14b0220eaa7 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -122,6 +122,7 @@ struct cs_etm_packet { enum cs_etm_isa isa; u64 start_addr; u64 end_addr;
u64 timestamp; u32 instr_count; u32 last_instr_type; u32 last_instr_subtype;
-- 2.11.0
CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight
Hi Wojciech,
On Fri, Jun 07, 2019 at 05:06:40PM +0000, Wojciech Żmuda wrote:
From: Wojciech Zmuda wzmuda@n7space.com
This patchset adds time notion to perf instruction and branch samples to allow coarse time measurement of code blocks execution.
The simplest verification is visibility of the time field in 'perf script' output:
root@zynq:~# perf record -e cs_etm/timestamp,@fe970000.etr/u -a sleep 1 Couldn't synthesize bpf events. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.262 MB perf.data ]
root@zynq:~# perf script --ns -F cpu,comm,time perf [002] 9546.053455325: perf [002] 9546.053455340: perf [002] 9546.053455344: (...) sleep [003] 9546.060163742: sleep [003] 9546.060163754: sleep [003] 9546.060163766: (...) ntpd [001] 9546.389083194: ntpd [001] 9546.389083400: ntpd [001] 9546.389086319: (...)
The step above works only if trace has been collected in CPU-wide mode because of some perf event flags mismatch I'm working on fixing.
Even if I had no comment on this set (see other patch) I couldn't apply it before the problems you report above are fixed.
In general, timestamps in subsequent samples are monotonically increasing. Exceptions are:
- discontinuities in trace. From my understanding, we can't timestamp discontinuities reasonably, since after decoder synchronizes back after trace loss, it needs to wait for another timestamp packet. Thus, time value of such samples stays at 0x0.
- cases when a lot of non-branching instructions is executed between two subsequent timestamps. Since we use approximation of timestamps for subsequent samples by increasing the timestamp value by instruction count, it is possible that we go past the value of the next hardware timestamp.
Another way to access these values is to use the perf script engine, which I used for validation of the feature. The script below calculates timestamp differences of two consecutive branches sharing the same branch address. This is a simple example of execution time fluctuation detector.
from __future__ import print_function
import os import sys
sys.path.append(os.environ['PERF_EXEC_PATH'] + \ '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
from perf_trace_context import *
target_start_addr = int('4005e4', 16) # 0x4005e4 is func() from listing below
branch = dict() branch['from'] = 0 branches = []
def process_event(s): global branch global branches
sample = s['sample'] branch['cpu'] = sample['cpu'] if not branch['from']: branch['from'] = sample['addr'] branch['ts'] = sample['time'] return branch['to'] = sample['ip'] if not branch['to']: branch['from'] = 0 branch['ts'] = 0 return if branch['from'] and branch['to']: branches.append(branch.copy()) branch['from'] = 0 return
def trace_end(): global branches
count = 0 timestamp_start = 0 print("Got {0} samples:".format(len(branches))) for b in branches: if b['from'] == target_start_addr: if not timestamp_start: timestamp_start = b['ts'] continue print("[{0}]: ts diff = 0x{1:x} - 0x{2:x} = {3:d}".format(count, b['ts'], timestamp_start, b['ts'] - timestamp_start)) count = count + 1 timestamp_start = b['ts']
The following function was traced:
static int func(int cnt) { volatile int x = 0; static int i;
x += cnt + 0xdeadbeefcafeb00b; (...) /* repeats ~100 times */ if (i++ % 3 == 0) // Every third execution is longer usleep(1000); return x;
}
root@zynq:~# perf record -m,16K -e cs_etm/timestamp,@fe970000.etr/u \ --filter 'filter func @./program \ --per-thread ./program Couldn't synthesize bpf events. CTRL+C me when you find appropriate. ^C[ perf record: Woken up 12 times to write data ] [ perf record: Captured and wrote 0.135 MB perf.data ]
root@zynq:~# perf script -s exectime.py Got 2469 samples: [0]: ts diff = 0x92f2752e512 - 0x92f274a7ae9 = 551465 [1]: ts diff = 0x92f2752e694 - 0x92f2752e512 = 386 [2]: ts diff = 0x92f2752e817 - 0x92f2752e694 = 387 [3]: ts diff = 0x92f275bef12 - 0x92f2752e817 = 591611 [4]: ts diff = 0x92f275bf093 - 0x92f275bef12 = 385 [5]: ts diff = 0x92f275bf211 - 0x92f275bf093 = 382 [6]: ts diff = 0x92f276451d7 - 0x92f275bf211 = 548806 [7]: ts diff = 0x92f2764535a - 0x92f276451d7 = 387 [8]: ts diff = 0x92f276454d7 - 0x92f2764535a = 381 [9]: ts diff = 0x92f276cb256 - 0x92f276454d7 = 548223 [10]: ts diff = 0x92f276cb3d9 - 0x92f276cb256 = 387 [11]: ts diff = 0x92f276cb556 - 0x92f276cb3d9 = 381 (...)
At the listing above it is visible that every third execution of the function lasted longer than the other two. It is a naive example and could be enhanced to point to the area that caused the disruption by examining events 'in the middle' of the traced code range.
Applies cleanly on Mathieu's 5.1-rc3-cpu-wide-v3 branch.
Changes for V3:
- fix checkpatch warnings
I really appreciate that.
Thanks, Mathieu
- don't timestamp discontinuity packets with hardware timestamp, as it is for range packets only
- fix timestamping packets on full queue in per-thread mode
- simplify code in timestamping last branch samples before discontinuity and in timestamping pending packets
Changes for V2:
- move packet timestamping logic to decoder. Front end only uses this information to timestamp samples (as suggested by Mathieu).
- leave original behaviour of CPU-wide mode, where decoder is stopped and front end is triggered about pending queue with timestamp packet. At the same time, always adjust next and current timestamp in both CPU-wide and per-thread modes (as suggested by Mathieu).
- when timestamp packet is encountered, timestamp range and discontinuity packets waiting in the queue, that are not yet consumed by the front end (as suggested by Mathieu).
- don't timestamp exceptions, since they are not turned into branch nor instruction samples.
- fix timestamping of the last branch sample before discontinuity appears (as suggested by Leo).
Wojciech Zmuda (1): perf cs-etm: Set time value for samples
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 70 ++++++++++++++++++------- tools/perf/util/cs-etm.c | 3 ++ tools/perf/util/cs-etm.h | 1 + 3 files changed, 55 insertions(+), 19 deletions(-)
-- 2.11.0