This patch series is to support for using 'perf script' for CoreSight trace disassembler, for this purpose this patch series adds a new python script to parse CoreSight tracing event and use command 'objdump' for disassembled lines, finally this can generate readable program execution flow for reviewing tracing data.
Patches 0001 ~ 0003 are to generate samples for the start packet, CS_ETM_TRACE_ON packet and exception packets.
Patch 0004 is to introduce invalid address macro.
Patch 0005 is to add python script for trace disassembler.
Patch 0006 is to add doc to explain python script usage and give example for it.
This patch series has been rebased on acme git tree [1] with the latest commit e9175538c04f ("perf script python: Add addr into perf sample dict") and tested on Hikey (ARM64 octa CA53 cores).
In this version the script has no dependency on ARM64 platform and is expected to support ARM32 platform, but I am lacking ARM32 platform for testing on it, so firstly upstream to support ARM64 platform.
This patch series is firstly to support 'per-thread' recording tracing data, and it has been verified for kernel panic kdump tracing data.
Please note, this patch series (v4) is ONLY used for discussion for packet handling, after we get solid result I will send to LKML for reviewing and merging into mainline kernel.
Changes from v3: * Split packet handling for three patches, one is for start tracing packet, one is for CS_ETM_TRACE_ON packet and the last one patch is for exception packet; * Introduce invalid address macro.
Changes from v2: * Synced with Rob for handling CS_ETM_TRACE_ON packet, so refined 0001 patch according to dicussion; * Minor cleanup and fixes in 0003 patch for python script: remove 'svc' checking.
Changes from v1: * According to Mike and Rob suggestion, add the fixing to generate samples for the start packet and exception packets. * Simplify the python script to remove the exception prediction algorithm, we can rely on the sane exception packets for disassembler.
Leo Yan (6): perf cs-etm: Fix start tracing packet handling perf cs-etm: Generate branch sample for CS_ETM_TRACE_ON packet perf cs-etm: Generate branch sample for exception packet perf cs-etm: Introduce invalid address macro perf script python: Add script for CoreSight trace disassembler coresight: Document for CoreSight trace disassembler
Documentation/trace/coresight.txt | 52 +++++ tools/perf/scripts/python/arm-cs-trace-disasm.py | 235 +++++++++++++++++++++++ tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 19 +- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 11 +- tools/perf/util/cs-etm.c | 101 ++++++++-- 5 files changed, 390 insertions(+), 28 deletions(-) create mode 100644 tools/perf/scripts/python/arm-cs-trace-disasm.py
Usually the start tracing packet is a CS_ETM_TRACE_ON packet, this packet is passed to cs_etm__flush(); cs_etm__flush() will check the condition 'prev_packet->sample_type == CS_ETM_RANGE' but 'prev_packet' is allocated by zalloc() so 'prev_packet->sample_type' is zero in initialization and this condition is false. So cs_etm__flush() will directly bail out without handling the start tracing packet.
This patch is to introduce a new sample type CS_ETM_EMPTY, which is used to indicate the packet is an empty packet. cs_etm__flush() will swap packets when it finds the previous packet is empty, so this can record the start tracing packet into 'etmq->prev_packet'.
Another minor change in cs_etm__flush() is to check the condition 'etmq->prev_packet->sample_type == CS_ETM_TRACE_ON', if the previous packet is also a CS_ETM_TRACE_ON packet, the function will skip for contiguous CS_ETM_TRACE_ON packet.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 1 + tools/perf/util/cs-etm.c | 26 ++++++++++++++++++++++--- 2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 743f5f4..612b575 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -23,6 +23,7 @@ struct cs_etm_buffer { };
enum cs_etm_sample_type { + CS_ETM_EMPTY = 0, CS_ETM_RANGE = 1 << 0, CS_ETM_TRACE_ON = 1 << 1, }; diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 822ba91..67564c1 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -924,9 +924,18 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) int err = 0; struct cs_etm_packet *tmp;
- if (etmq->etm->synth_opts.last_branch && - etmq->prev_packet && - etmq->prev_packet->sample_type == CS_ETM_RANGE) { + if (!etmq->prev_packet) + return 0; + + /* Skip for contiguous CS_ETM_TRACE_ON packet */ + if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) + return 0; + + /* Handle start tracing packet */ + if (etmq->prev_packet->sample_type == CS_ETM_EMPTY) + goto swap_packet; + + if (etmq->etm->synth_opts.last_branch) { /* * Generate a last branch event for the branches left in the * circular buffer at the end of the trace. @@ -941,6 +950,10 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) etmq->period_instructions); etmq->period_instructions = 0;
+ } + +swap_packet: + if (etmq->etm->synth_opts.last_branch) { /* * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for * the next incoming packet. @@ -1020,6 +1033,13 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) */ cs_etm__flush(etmq); break; + case CS_ETM_EMPTY: + /* + * Should not receive empty packet, + * report error. + */ + pr_err("CS ETM Trace: empty packet\n"); + return -EINVAL; default: break; }
If one CS_ETM_TRACE_ON packet is inserted, we miss to handle it for branch samples. CS_ETM_TRACE_ON packet itself can give the info that there have a discontinuity in the trace, furthermore we also miss to generate branch sample for packets before and after CS_ETM_TRACE_ON packet.
This patch is to add branch sample handling if CS_ETM_TRACE_ON packet is inserted in the middle of CS_ETM_RANGE packets:
If there has one CS_ETM_TRACE_ON packet is coming, we firstly generate branch sample in the function cs_etm__flush(), this can save complete info for the previous CS_ETM_RANGE packet just before CS_ETM_TRACE_ON packet. We also generate branch sample for the new CS_ETM_RANGE packet after CS_ETM_TRACE_ON packet, this have two purposes, the first one purpose is to save the info for the new CS_ETM_RANGE packet, the second purpose is to save CS_ETM_TRACE_ON packet info so we can have hint for a discontinuity in the trace.
For CS_ETM_TRACE_ON packet, its fields 'packet->start_addr' and 'packet->end_addr' equal to 0xdeadbeefdeadbeefUL which are emitted in the decoder layer as dummy value. This patch is to convert these values to zeros for more readable; this is accomplished by functions cs_etm__last_executed_instr() and cs_etm__first_executed_instr(). The later one is a new function introduced by this patch.
Reviewed-by: Robert Walker robert.walker@arm.com Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/cs-etm.c | 52 ++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 42 insertions(+), 10 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 67564c1..8ec2f2c 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -494,6 +494,10 @@ static inline void cs_etm__reset_last_branch_rb(struct cs_etm_queue *etmq)
static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) { + /* Returns 0 for the CS_ETM_TRACE_ON packet */ + if (packet->sample_type == CS_ETM_TRACE_ON) + return 0; + /* * The packet records the execution range with an exclusive end address * @@ -505,6 +509,15 @@ static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) return packet->end_addr - A64_INSTR_SIZE; }
+static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) +{ + /* Returns 0 for the CS_ETM_TRACE_ON packet */ + if (packet->sample_type == CS_ETM_TRACE_ON) + return 0; + + return packet->start_addr; +} + static inline u64 cs_etm__instr_count(const struct cs_etm_packet *packet) { /* @@ -546,7 +559,7 @@ static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq)
be = &bs->entries[etmq->last_branch_pos]; be->from = cs_etm__last_executed_instr(etmq->prev_packet); - be->to = etmq->packet->start_addr; + be->to = cs_etm__first_executed_instr(etmq->packet); /* No support for mispredict */ be->flags.mispred = 0; be->flags.predicted = 1; @@ -701,7 +714,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq) sample.ip = cs_etm__last_executed_instr(etmq->prev_packet); sample.pid = etmq->pid; sample.tid = etmq->tid; - sample.addr = etmq->packet->start_addr; + sample.addr = cs_etm__first_executed_instr(etmq->packet); sample.id = etmq->etm->branches_id; sample.stream_id = etmq->etm->branches_id; sample.period = 1; @@ -897,13 +910,23 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) etmq->period_instructions = instrs_over; }
- if (etm->sample_branches && - etmq->prev_packet && - etmq->prev_packet->sample_type == CS_ETM_RANGE && - etmq->prev_packet->last_instr_taken_branch) { - ret = cs_etm__synth_branch_sample(etmq); - if (ret) - return ret; + if (etm->sample_branches && etmq->prev_packet) { + bool generate_sample = false; + + /* Generate sample for tracing on packet */ + if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) + generate_sample = true; + + /* Generate sample for normal branch packet */ + if (etmq->prev_packet->sample_type == CS_ETM_RANGE && + etmq->prev_packet->last_instr_taken_branch) + generate_sample = true; + + if (generate_sample) { + ret = cs_etm__synth_branch_sample(etmq); + if (ret) + return ret; + } }
if (etm->sample_branches || etm->synth_opts.last_branch) { @@ -922,6 +945,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) static int cs_etm__flush(struct cs_etm_queue *etmq) { int err = 0; + struct cs_etm_auxtrace *etm = etmq->etm; struct cs_etm_packet *tmp;
if (!etmq->prev_packet) @@ -948,12 +972,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) err = cs_etm__synth_instruction_sample( etmq, addr, etmq->period_instructions); + if (err) + return err; etmq->period_instructions = 0;
}
+ if (etm->sample_branches) { + err = cs_etm__synth_branch_sample(etmq); + if (err) + return err; + } + swap_packet: - if (etmq->etm->synth_opts.last_branch) { + if (etm->sample_branches || etm->synth_opts.last_branch) { /* * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for * the next incoming packet.
On Thu, May 31, 2018 at 01:22:06PM +0800, Leo Yan wrote:
If one CS_ETM_TRACE_ON packet is inserted, we miss to handle it for branch samples. CS_ETM_TRACE_ON packet itself can give the info that there have a discontinuity in the trace, furthermore we also miss to generate branch sample for packets before and after CS_ETM_TRACE_ON packet.
This patch is to add branch sample handling if CS_ETM_TRACE_ON packet is inserted in the middle of CS_ETM_RANGE packets:
If there has one CS_ETM_TRACE_ON packet is coming, we firstly generate branch sample in the function cs_etm__flush(), this can save complete info for the previous CS_ETM_RANGE packet just before CS_ETM_TRACE_ON packet. We also generate branch sample for the new CS_ETM_RANGE packet after CS_ETM_TRACE_ON packet, this have two purposes, the first one purpose is to save the info for the new CS_ETM_RANGE packet, the second purpose is to save CS_ETM_TRACE_ON packet info so we can have hint for a discontinuity in the trace.
For CS_ETM_TRACE_ON packet, its fields 'packet->start_addr' and 'packet->end_addr' equal to 0xdeadbeefdeadbeefUL which are emitted in the decoder layer as dummy value. This patch is to convert these values to zeros for more readable; this is accomplished by functions cs_etm__last_executed_instr() and cs_etm__first_executed_instr(). The later one is a new function introduced by this patch.
Reviewed-by: Robert Walker robert.walker@arm.com Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm.c | 52 ++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 42 insertions(+), 10 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 67564c1..8ec2f2c 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -494,6 +494,10 @@ static inline void cs_etm__reset_last_branch_rb(struct cs_etm_queue *etmq) static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) {
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
return 0;
- /*
- The packet records the execution range with an exclusive end address
@@ -505,6 +509,15 @@ static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) return packet->end_addr - A64_INSTR_SIZE; } +static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) +{
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
return 0;
- return packet->start_addr;
+}
static inline u64 cs_etm__instr_count(const struct cs_etm_packet *packet) { /* @@ -546,7 +559,7 @@ static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq) be = &bs->entries[etmq->last_branch_pos]; be->from = cs_etm__last_executed_instr(etmq->prev_packet);
- be->to = etmq->packet->start_addr;
- be->to = cs_etm__first_executed_instr(etmq->packet); /* No support for mispredict */ be->flags.mispred = 0; be->flags.predicted = 1;
@@ -701,7 +714,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq) sample.ip = cs_etm__last_executed_instr(etmq->prev_packet); sample.pid = etmq->pid; sample.tid = etmq->tid;
- sample.addr = etmq->packet->start_addr;
- sample.addr = cs_etm__first_executed_instr(etmq->packet); sample.id = etmq->etm->branches_id; sample.stream_id = etmq->etm->branches_id; sample.period = 1;
@@ -897,13 +910,23 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) etmq->period_instructions = instrs_over; }
- if (etm->sample_branches &&
etmq->prev_packet &&
etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch) {
ret = cs_etm__synth_branch_sample(etmq);
if (ret)
return ret;
- if (etm->sample_branches && etmq->prev_packet) {
bool generate_sample = false;
/* Generate sample for tracing on packet */
if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
generate_sample = true;
/* Generate sample for normal branch packet */
if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
generate_sample = true;
if (generate_sample) {
ret = cs_etm__synth_branch_sample(etmq);
if (ret)
return ret;
}}
if (etm->sample_branches || etm->synth_opts.last_branch) { @@ -922,6 +945,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) static int cs_etm__flush(struct cs_etm_queue *etmq) { int err = 0;
- struct cs_etm_auxtrace *etm = etmq->etm; struct cs_etm_packet *tmp;
if (!etmq->prev_packet) @@ -948,12 +972,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) err = cs_etm__synth_instruction_sample( etmq, addr, etmq->period_instructions);
if (err)
return err;
To me this is a bug fix and should be in a separate patch.
etmq->period_instructions = 0;
}
- if (etm->sample_branches) {
err = cs_etm__synth_branch_sample(etmq);
if (err)
return err;
- }
So is this - we should generate a branch sample when receiving a TRACE_ON packet.
swap_packet:
- if (etmq->etm->synth_opts.last_branch) {
- if (etm->sample_branches || etm->synth_opts.last_branch) { /*
- Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for
- the next incoming packet.
-- 2.7.4
Hi Mathieu,
Oa Thu, May 31, 2018 at 11:07:11AM -0600, Mathieu Poirier wrote:
On Thu, May 31, 2018 at 01:22:06PM +0800, Leo Yan wrote:
If one CS_ETM_TRACE_ON packet is inserted, we miss to handle it for branch samples. CS_ETM_TRACE_ON packet itself can give the info that there have a discontinuity in the trace, furthermore we also miss to generate branch sample for packets before and after CS_ETM_TRACE_ON packet.
This patch is to add branch sample handling if CS_ETM_TRACE_ON packet is inserted in the middle of CS_ETM_RANGE packets:
If there has one CS_ETM_TRACE_ON packet is coming, we firstly generate branch sample in the function cs_etm__flush(), this can save complete info for the previous CS_ETM_RANGE packet just before CS_ETM_TRACE_ON packet. We also generate branch sample for the new CS_ETM_RANGE packet after CS_ETM_TRACE_ON packet, this have two purposes, the first one purpose is to save the info for the new CS_ETM_RANGE packet, the second purpose is to save CS_ETM_TRACE_ON packet info so we can have hint for a discontinuity in the trace.
For CS_ETM_TRACE_ON packet, its fields 'packet->start_addr' and 'packet->end_addr' equal to 0xdeadbeefdeadbeefUL which are emitted in the decoder layer as dummy value. This patch is to convert these values to zeros for more readable; this is accomplished by functions cs_etm__last_executed_instr() and cs_etm__first_executed_instr(). The later one is a new function introduced by this patch.
Reviewed-by: Robert Walker robert.walker@arm.com Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm.c | 52 ++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 42 insertions(+), 10 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 67564c1..8ec2f2c 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -494,6 +494,10 @@ static inline void cs_etm__reset_last_branch_rb(struct cs_etm_queue *etmq) static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) {
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
return 0;
- /*
- The packet records the execution range with an exclusive end address
@@ -505,6 +509,15 @@ static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) return packet->end_addr - A64_INSTR_SIZE; } +static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) +{
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
return 0;
- return packet->start_addr;
+}
static inline u64 cs_etm__instr_count(const struct cs_etm_packet *packet) { /* @@ -546,7 +559,7 @@ static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq) be = &bs->entries[etmq->last_branch_pos]; be->from = cs_etm__last_executed_instr(etmq->prev_packet);
- be->to = etmq->packet->start_addr;
- be->to = cs_etm__first_executed_instr(etmq->packet); /* No support for mispredict */ be->flags.mispred = 0; be->flags.predicted = 1;
@@ -701,7 +714,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq) sample.ip = cs_etm__last_executed_instr(etmq->prev_packet); sample.pid = etmq->pid; sample.tid = etmq->tid;
- sample.addr = etmq->packet->start_addr;
- sample.addr = cs_etm__first_executed_instr(etmq->packet); sample.id = etmq->etm->branches_id; sample.stream_id = etmq->etm->branches_id; sample.period = 1;
@@ -897,13 +910,23 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) etmq->period_instructions = instrs_over; }
- if (etm->sample_branches &&
etmq->prev_packet &&
etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch) {
ret = cs_etm__synth_branch_sample(etmq);
if (ret)
return ret;
- if (etm->sample_branches && etmq->prev_packet) {
bool generate_sample = false;
/* Generate sample for tracing on packet */
if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
generate_sample = true;
/* Generate sample for normal branch packet */
if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
generate_sample = true;
if (generate_sample) {
ret = cs_etm__synth_branch_sample(etmq);
if (ret)
return ret;
}}
if (etm->sample_branches || etm->synth_opts.last_branch) { @@ -922,6 +945,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) static int cs_etm__flush(struct cs_etm_queue *etmq) { int err = 0;
- struct cs_etm_auxtrace *etm = etmq->etm; struct cs_etm_packet *tmp;
if (!etmq->prev_packet) @@ -948,12 +972,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) err = cs_etm__synth_instruction_sample( etmq, addr, etmq->period_instructions);
if (err)
return err;
To me this is a bug fix and should be in a separate patch.
etmq->period_instructions = 0;
}
- if (etm->sample_branches) {
err = cs_etm__synth_branch_sample(etmq);
if (err)
return err;
- }
So is this - we should generate a branch sample when receiving a TRACE_ON packet.
Have splitted into a dedicated patch for 'generate a branch sample when receiving a TRACE_ON packet.' For more clear logic in the patch series, I tried to split into 3 patches:
- Support dummy address in CS_ETM_TRACE_ON packet; - Fix patch for generating a branch sample when receiving a CS_ETM_TRACE_ON packet; - Generate branch sample for CS_ETM_TRACE_ON packet
Please let me know if it's doable or any better idea?
Thanks, Leo Yan
swap_packet:
- if (etmq->etm->synth_opts.last_branch) {
- if (etm->sample_branches || etm->synth_opts.last_branch) { /*
- Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for
- the next incoming packet.
-- 2.7.4
On 1 June 2018 at 04:05, Leo Yan leo.yan@linaro.org wrote:
Hi Mathieu,
Oa Thu, May 31, 2018 at 11:07:11AM -0600, Mathieu Poirier wrote:
On Thu, May 31, 2018 at 01:22:06PM +0800, Leo Yan wrote:
If one CS_ETM_TRACE_ON packet is inserted, we miss to handle it for branch samples. CS_ETM_TRACE_ON packet itself can give the info that there have a discontinuity in the trace, furthermore we also miss to generate branch sample for packets before and after CS_ETM_TRACE_ON packet.
This patch is to add branch sample handling if CS_ETM_TRACE_ON packet is inserted in the middle of CS_ETM_RANGE packets:
If there has one CS_ETM_TRACE_ON packet is coming, we firstly generate branch sample in the function cs_etm__flush(), this can save complete info for the previous CS_ETM_RANGE packet just before CS_ETM_TRACE_ON packet. We also generate branch sample for the new CS_ETM_RANGE packet after CS_ETM_TRACE_ON packet, this have two purposes, the first one purpose is to save the info for the new CS_ETM_RANGE packet, the second purpose is to save CS_ETM_TRACE_ON packet info so we can have hint for a discontinuity in the trace.
For CS_ETM_TRACE_ON packet, its fields 'packet->start_addr' and 'packet->end_addr' equal to 0xdeadbeefdeadbeefUL which are emitted in the decoder layer as dummy value. This patch is to convert these values to zeros for more readable; this is accomplished by functions cs_etm__last_executed_instr() and cs_etm__first_executed_instr(). The later one is a new function introduced by this patch.
Reviewed-by: Robert Walker robert.walker@arm.com Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm.c | 52 ++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 42 insertions(+), 10 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 67564c1..8ec2f2c 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -494,6 +494,10 @@ static inline void cs_etm__reset_last_branch_rb(struct cs_etm_queue *etmq)
static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) {
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
return 0;
- /*
- The packet records the execution range with an exclusive end address
@@ -505,6 +509,15 @@ static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) return packet->end_addr - A64_INSTR_SIZE; }
+static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) +{
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
return 0;
- return packet->start_addr;
+}
static inline u64 cs_etm__instr_count(const struct cs_etm_packet *packet) { /* @@ -546,7 +559,7 @@ static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq)
be = &bs->entries[etmq->last_branch_pos]; be->from = cs_etm__last_executed_instr(etmq->prev_packet);
- be->to = etmq->packet->start_addr;
- be->to = cs_etm__first_executed_instr(etmq->packet); /* No support for mispredict */ be->flags.mispred = 0; be->flags.predicted = 1;
@@ -701,7 +714,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq) sample.ip = cs_etm__last_executed_instr(etmq->prev_packet); sample.pid = etmq->pid; sample.tid = etmq->tid;
- sample.addr = etmq->packet->start_addr;
- sample.addr = cs_etm__first_executed_instr(etmq->packet); sample.id = etmq->etm->branches_id; sample.stream_id = etmq->etm->branches_id; sample.period = 1;
@@ -897,13 +910,23 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) etmq->period_instructions = instrs_over; }
- if (etm->sample_branches &&
etmq->prev_packet &&
etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch) {
ret = cs_etm__synth_branch_sample(etmq);
if (ret)
return ret;
if (etm->sample_branches && etmq->prev_packet) {
bool generate_sample = false;
/* Generate sample for tracing on packet */
if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
generate_sample = true;
/* Generate sample for normal branch packet */
if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
generate_sample = true;
if (generate_sample) {
ret = cs_etm__synth_branch_sample(etmq);
if (ret)
return ret;
}
}
if (etm->sample_branches || etm->synth_opts.last_branch) {
@@ -922,6 +945,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) static int cs_etm__flush(struct cs_etm_queue *etmq) { int err = 0;
struct cs_etm_auxtrace *etm = etmq->etm; struct cs_etm_packet *tmp;
if (!etmq->prev_packet)
@@ -948,12 +972,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) err = cs_etm__synth_instruction_sample( etmq, addr, etmq->period_instructions);
if (err)
return err;
To me this is a bug fix and should be in a separate patch.
etmq->period_instructions = 0; }
- if (etm->sample_branches) {
err = cs_etm__synth_branch_sample(etmq);
if (err)
return err;
- }
So is this - we should generate a branch sample when receiving a TRACE_ON packet.
Have splitted into a dedicated patch for 'generate a branch sample when receiving a TRACE_ON packet.' For more clear logic in the patch series, I tried to split into 3 patches:
- Support dummy address in CS_ETM_TRACE_ON packet;
- Fix patch for generating a branch sample when receiving a CS_ETM_TRACE_ON packet;
- Generate branch sample for CS_ETM_TRACE_ON packet
Please let me know if it's doable or any better idea?
I'll review your patches next week and will advise then.
Mathieu
Thanks, Leo Yan
swap_packet:
- if (etmq->etm->synth_opts.last_branch) {
- if (etm->sample_branches || etm->synth_opts.last_branch) { /* * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for * the next incoming packet.
-- 2.7.4
On Fri, Jun 01, 2018 at 03:20:00PM -0600, Mathieu Poirier wrote:
[...]
static int cs_etm__flush(struct cs_etm_queue *etmq) { int err = 0;
struct cs_etm_auxtrace *etm = etmq->etm; struct cs_etm_packet *tmp;
if (!etmq->prev_packet)
@@ -948,12 +972,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) err = cs_etm__synth_instruction_sample( etmq, addr, etmq->period_instructions);
if (err)
return err;
To me this is a bug fix and should be in a separate patch.
etmq->period_instructions = 0; }
- if (etm->sample_branches) {
err = cs_etm__synth_branch_sample(etmq);
if (err)
return err;
- }
So is this - we should generate a branch sample when receiving a TRACE_ON packet.
Have splitted into a dedicated patch for 'generate a branch sample when receiving a TRACE_ON packet.' For more clear logic in the patch series, I tried to split into 3 patches:
- Support dummy address in CS_ETM_TRACE_ON packet;
- Fix patch for generating a branch sample when receiving a CS_ETM_TRACE_ON packet;
- Generate branch sample for CS_ETM_TRACE_ON packet
Please let me know if it's doable or any better idea?
I'll review your patches next week and will advise then.
Sure. Thanks, Mathieu.
Every exception packet actually appears as two duplicated elements in the decoder layer, the first element has 'elem_type' = OCSD_GEN_TRC_ELEM_INSTR_RANGE and the 'elem_type' in the second element is OCSD_GEN_TRC_ELEM_EXCEPTION or OCSD_GEN_TRC_ELEM_EXCEPTION_RET, which present for exception entry and exit respectively. The decoder set packet fields 'packet->exc' and 'packet->exc_ret' to indicate the exception packets according to the second element; but exception packets don't have dedicated sample type and shares the same sample type CS_ETM_RANGE with normal instruction packets.
As result, the exception packets are taken as normal instruction packets and they will be processed for branch sample only when 'packet->last_instr_taken_branch' is true, otherwise they will be omitted.
To process exception packets properly, this patch introduce two new sample type: CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET; for these two kind packets, they will be handled by cs_etm__sample(). Except we need to consider branch taken packets for normal CS_ETM_RANGE packet with 'last_instr_taken_branch' is true, we also make below three kind packets as branch taken packets:
CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET packets; Instruction range packet before exception packets; Instruction range packet after exception packets.
This patch is to generate branch samples and update last branch stack for all packets as mentioned above. After exception packets have their own sample type, the packet fields 'packet->exc' and 'packet->exc_ret' aren't needed anymore, so remove them.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 11 ++++----- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 10 ++++---- tools/perf/util/cs-etm.c | 31 +++++++++++++++++++++---- 3 files changed, 36 insertions(+), 16 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 4d5fc37..e009e01 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -264,8 +264,6 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder) decoder->packet_buffer[i].start_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[i].end_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[i].last_instr_taken_branch = false; - decoder->packet_buffer[i].exc = false; - decoder->packet_buffer[i].exc_ret = false; decoder->packet_buffer[i].cpu = INT_MIN; } } @@ -292,8 +290,6 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder, decoder->packet_count++;
decoder->packet_buffer[et].sample_type = sample_type; - decoder->packet_buffer[et].exc = false; - decoder->packet_buffer[et].exc_ret = false; decoder->packet_buffer[et].cpu = *((int *)inode->priv); decoder->packet_buffer[et].start_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[et].end_addr = 0xdeadbeefdeadbeefUL; @@ -353,6 +349,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( { ocsd_datapath_resp_t resp = OCSD_RESP_CONT; struct cs_etm_decoder *decoder = (struct cs_etm_decoder *) context; + struct cs_etm_packet *packet;
switch (elem->elem_type) { case OCSD_GEN_TRC_ELEM_UNKNOWN: @@ -370,10 +367,12 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION: - decoder->packet_buffer[decoder->tail].exc = true; + packet = &decoder->packet_buffer[decoder->tail]; + packet->sample_type = CS_ETM_EXCEPTION; break; case OCSD_GEN_TRC_ELEM_EXCEPTION_RET: - decoder->packet_buffer[decoder->tail].exc_ret = true; + packet = &decoder->packet_buffer[decoder->tail]; + packet->sample_type = CS_ETM_EXCEPTION_RET; break; case OCSD_GEN_TRC_ELEM_PE_CONTEXT: case OCSD_GEN_TRC_ELEM_EO_TRACE: diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 612b575..cf31a9c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -23,9 +23,11 @@ struct cs_etm_buffer { };
enum cs_etm_sample_type { - CS_ETM_EMPTY = 0, - CS_ETM_RANGE = 1 << 0, - CS_ETM_TRACE_ON = 1 << 1, + CS_ETM_EMPTY = 0, + CS_ETM_RANGE = 1 << 0, + CS_ETM_TRACE_ON = 1 << 1, + CS_ETM_EXCEPTION = 1 << 2, + CS_ETM_EXCEPTION_RET = 1 << 3, };
struct cs_etm_packet { @@ -33,8 +35,6 @@ struct cs_etm_packet { u64 start_addr; u64 end_addr; u8 last_instr_taken_branch; - u8 exc; - u8 exc_ret; int cpu; };
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8ec2f2c..6453a1b 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -862,6 +862,27 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm, return 0; }
+static bool cs_etm__is_taken_branch(struct cs_etm_packet *prev_packet, + struct cs_etm_packet *packet) +{ + /* The branch is taken for normal range packet with taken branch flag */ + if (prev_packet->sample_type == CS_ETM_RANGE && + prev_packet->last_instr_taken_branch) + return true; + + /* The branch is taken if previous packet is exception packet */ + if (prev_packet->sample_type == CS_ETM_EXCEPTION || + prev_packet->sample_type == CS_ETM_EXCEPTION_RET) + return true; + + /* The branch is taken for an intervening exception packet */ + if (packet->sample_type == CS_ETM_EXCEPTION || + packet->sample_type == CS_ETM_EXCEPTION_RET) + return true; + + return false; +} + static int cs_etm__sample(struct cs_etm_queue *etmq) { struct cs_etm_auxtrace *etm = etmq->etm; @@ -878,8 +899,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) */ if (etm->synth_opts.last_branch && etmq->prev_packet && - etmq->prev_packet->sample_type == CS_ETM_RANGE && - etmq->prev_packet->last_instr_taken_branch) + cs_etm__is_taken_branch(etmq->prev_packet, etmq->packet)) cs_etm__update_last_branch_rb(etmq);
if (etm->sample_instructions && @@ -917,9 +937,8 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) generate_sample = true;
- /* Generate sample for normal branch packet */ - if (etmq->prev_packet->sample_type == CS_ETM_RANGE && - etmq->prev_packet->last_instr_taken_branch) + /* Generate sample for exception and normal branch packet */ + if (cs_etm__is_taken_branch(etmq->prev_packet, etmq->packet)) generate_sample = true;
if (generate_sample) { @@ -1051,6 +1070,8 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
switch (etmq->packet->sample_type) { case CS_ETM_RANGE: + case CS_ETM_EXCEPTION: + case CS_ETM_EXCEPTION_RET: /* * If the packet contains an instruction * range, generate instruction sequence
On Thu, May 31, 2018 at 01:22:07PM +0800, Leo Yan wrote:
Every exception packet actually appears as two duplicated elements in the decoder layer, the first element has 'elem_type' = OCSD_GEN_TRC_ELEM_INSTR_RANGE and the 'elem_type' in the second element is OCSD_GEN_TRC_ELEM_EXCEPTION or OCSD_GEN_TRC_ELEM_EXCEPTION_RET, which present for exception entry and exit respectively. The decoder set packet fields 'packet->exc' and 'packet->exc_ret' to indicate the exception packets according to the second element; but exception packets don't have dedicated sample type and shares the same sample type CS_ETM_RANGE with normal instruction packets.
As result, the exception packets are taken as normal instruction packets and they will be processed for branch sample only when 'packet->last_instr_taken_branch' is true, otherwise they will be omitted.
To process exception packets properly, this patch introduce two new sample type: CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET; for these two kind packets, they will be handled by cs_etm__sample(). Except we need to consider branch taken packets for normal CS_ETM_RANGE packet with 'last_instr_taken_branch' is true, we also make below three kind packets as branch taken packets:
CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET packets; Instruction range packet before exception packets; Instruction range packet after exception packets.
This patch is to generate branch samples and update last branch stack for all packets as mentioned above. After exception packets have their own sample type, the packet fields 'packet->exc' and 'packet->exc_ret' aren't needed anymore, so remove them.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 11 ++++----- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 10 ++++---- tools/perf/util/cs-etm.c | 31 +++++++++++++++++++++---- 3 files changed, 36 insertions(+), 16 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 4d5fc37..e009e01 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -264,8 +264,6 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder) decoder->packet_buffer[i].start_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[i].end_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[i].last_instr_taken_branch = false;
decoder->packet_buffer[i].exc = false;
decoder->packet_buffer[i].cpu = INT_MIN; }decoder->packet_buffer[i].exc_ret = false;
} @@ -292,8 +290,6 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder, decoder->packet_count++; decoder->packet_buffer[et].sample_type = sample_type;
- decoder->packet_buffer[et].exc = false;
- decoder->packet_buffer[et].exc_ret = false; decoder->packet_buffer[et].cpu = *((int *)inode->priv); decoder->packet_buffer[et].start_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[et].end_addr = 0xdeadbeefdeadbeefUL;
@@ -353,6 +349,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( { ocsd_datapath_resp_t resp = OCSD_RESP_CONT; struct cs_etm_decoder *decoder = (struct cs_etm_decoder *) context;
- struct cs_etm_packet *packet;
switch (elem->elem_type) { case OCSD_GEN_TRC_ELEM_UNKNOWN: @@ -370,10 +367,12 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION:
decoder->packet_buffer[decoder->tail].exc = true;
packet = &decoder->packet_buffer[decoder->tail];
break; case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:packet->sample_type = CS_ETM_EXCEPTION;
decoder->packet_buffer[decoder->tail].exc_ret = true;
packet = &decoder->packet_buffer[decoder->tail];
packet->sample_type = CS_ETM_EXCEPTION_RET;
Am I missing something the above won't work? ->tail needs to be incremented before being used. A such what this is doing is clobbering the type of the previous packet.
You need to insert a new packet in the decoder packet queue using function cs_etm_decoder__buffer_packet() and then set the sample type.
break;
case OCSD_GEN_TRC_ELEM_PE_CONTEXT: case OCSD_GEN_TRC_ELEM_EO_TRACE: diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 612b575..cf31a9c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -23,9 +23,11 @@ struct cs_etm_buffer { }; enum cs_etm_sample_type {
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_EXCEPTION = 1 << 2,
- CS_ETM_EXCEPTION_RET = 1 << 3,
};
Thank you for that.
struct cs_etm_packet { @@ -33,8 +35,6 @@ struct cs_etm_packet { u64 start_addr; u64 end_addr; u8 last_instr_taken_branch;
- u8 exc;
- u8 exc_ret; int cpu;
};
Ok
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8ec2f2c..6453a1b 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -862,6 +862,27 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm, return 0; } +static bool cs_etm__is_taken_branch(struct cs_etm_packet *prev_packet,
struct cs_etm_packet *packet)
+{
- /* The branch is taken for normal range packet with taken branch flag */
- if (prev_packet->sample_type == CS_ETM_RANGE &&
prev_packet->last_instr_taken_branch)
return true;
- /* The branch is taken if previous packet is exception packet */
- if (prev_packet->sample_type == CS_ETM_EXCEPTION ||
prev_packet->sample_type == CS_ETM_EXCEPTION_RET)
return true;
- /* The branch is taken for an intervening exception packet */
- if (packet->sample_type == CS_ETM_EXCEPTION ||
packet->sample_type == CS_ETM_EXCEPTION_RET)
return true;
- return false;
+}
static int cs_etm__sample(struct cs_etm_queue *etmq) { struct cs_etm_auxtrace *etm = etmq->etm; @@ -878,8 +899,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) */ if (etm->synth_opts.last_branch && etmq->prev_packet &&
etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
cs_etm__update_last_branch_rb(etmq);cs_etm__is_taken_branch(etmq->prev_packet, etmq->packet))
if (etm->sample_instructions && @@ -917,9 +937,8 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) generate_sample = true;
/* Generate sample for normal branch packet */
if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
/* Generate sample for exception and normal branch packet */
if (cs_etm__is_taken_branch(etmq->prev_packet, etmq->packet)) generate_sample = true;
if (generate_sample) { @@ -1051,6 +1070,8 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) switch (etmq->packet->sample_type) { case CS_ETM_RANGE:
case CS_ETM_EXCEPTION:
case CS_ETM_EXCEPTION_RET:
I think this is too much coupling and will invariably lead to errors in the code. Function cs_etm__sample() works well for range packets. I suggest spinning off another function (or two) to handle exception packets. That way we have a neat state machine and handling functions are kept simple.
/* * If the packet contains an instruction * range, generate instruction sequence
-- 2.7.4
On Thu, May 31, 2018 at 11:23:13AM -0600, Mathieu Poirier wrote:
On Thu, May 31, 2018 at 01:22:07PM +0800, Leo Yan wrote:
Every exception packet actually appears as two duplicated elements in the decoder layer, the first element has 'elem_type' = OCSD_GEN_TRC_ELEM_INSTR_RANGE and the 'elem_type' in the second element is OCSD_GEN_TRC_ELEM_EXCEPTION or OCSD_GEN_TRC_ELEM_EXCEPTION_RET, which present for exception entry and exit respectively. The decoder set packet fields 'packet->exc' and 'packet->exc_ret' to indicate the exception packets according to the second element; but exception packets don't have dedicated sample type and shares the same sample type CS_ETM_RANGE with normal instruction packets.
As result, the exception packets are taken as normal instruction packets and they will be processed for branch sample only when 'packet->last_instr_taken_branch' is true, otherwise they will be omitted.
To process exception packets properly, this patch introduce two new sample type: CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET; for these two kind packets, they will be handled by cs_etm__sample(). Except we need to consider branch taken packets for normal CS_ETM_RANGE packet with 'last_instr_taken_branch' is true, we also make below three kind packets as branch taken packets:
CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET packets; Instruction range packet before exception packets; Instruction range packet after exception packets.
This patch is to generate branch samples and update last branch stack for all packets as mentioned above. After exception packets have their own sample type, the packet fields 'packet->exc' and 'packet->exc_ret' aren't needed anymore, so remove them.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 11 ++++----- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 10 ++++---- tools/perf/util/cs-etm.c | 31 +++++++++++++++++++++---- 3 files changed, 36 insertions(+), 16 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 4d5fc37..e009e01 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -264,8 +264,6 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder) decoder->packet_buffer[i].start_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[i].end_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[i].last_instr_taken_branch = false;
decoder->packet_buffer[i].exc = false;
decoder->packet_buffer[i].cpu = INT_MIN; }decoder->packet_buffer[i].exc_ret = false;
} @@ -292,8 +290,6 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder, decoder->packet_count++; decoder->packet_buffer[et].sample_type = sample_type;
- decoder->packet_buffer[et].exc = false;
- decoder->packet_buffer[et].exc_ret = false; decoder->packet_buffer[et].cpu = *((int *)inode->priv); decoder->packet_buffer[et].start_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[et].end_addr = 0xdeadbeefdeadbeefUL;
@@ -353,6 +349,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( { ocsd_datapath_resp_t resp = OCSD_RESP_CONT; struct cs_etm_decoder *decoder = (struct cs_etm_decoder *) context;
- struct cs_etm_packet *packet;
switch (elem->elem_type) { case OCSD_GEN_TRC_ELEM_UNKNOWN: @@ -370,10 +367,12 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION:
decoder->packet_buffer[decoder->tail].exc = true;
packet = &decoder->packet_buffer[decoder->tail];
break; case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:packet->sample_type = CS_ETM_EXCEPTION;
decoder->packet_buffer[decoder->tail].exc_ret = true;
packet = &decoder->packet_buffer[decoder->tail];
packet->sample_type = CS_ETM_EXCEPTION_RET;
Am I missing something the above won't work? ->tail needs to be incremented before being used. A such what this is doing is clobbering the type of the previous packet.
Here the logic is a bit tricky. The elements OCSD_GEN_TRC_ELEM_EXCEPTION/OCSD_GEN_TRC_ELEM_EXCEPTION_RET don't include any useful address range info, so I think this is the main reason now we take the exception elements as an 'attribution' and attach it to the previous instruction range element.
You need to insert a new packet in the decoder packet queue using function cs_etm_decoder__buffer_packet() and then set the sample type.
I will write one new patch by following this method.
break;
case OCSD_GEN_TRC_ELEM_PE_CONTEXT: case OCSD_GEN_TRC_ELEM_EO_TRACE: diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 612b575..cf31a9c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -23,9 +23,11 @@ struct cs_etm_buffer { }; enum cs_etm_sample_type {
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_EXCEPTION = 1 << 2,
- CS_ETM_EXCEPTION_RET = 1 << 3,
};
Thank you for that.
struct cs_etm_packet { @@ -33,8 +35,6 @@ struct cs_etm_packet { u64 start_addr; u64 end_addr; u8 last_instr_taken_branch;
- u8 exc;
- u8 exc_ret; int cpu;
};
Ok
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8ec2f2c..6453a1b 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -862,6 +862,27 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm, return 0; } +static bool cs_etm__is_taken_branch(struct cs_etm_packet *prev_packet,
struct cs_etm_packet *packet)
+{
- /* The branch is taken for normal range packet with taken branch flag */
- if (prev_packet->sample_type == CS_ETM_RANGE &&
prev_packet->last_instr_taken_branch)
return true;
- /* The branch is taken if previous packet is exception packet */
- if (prev_packet->sample_type == CS_ETM_EXCEPTION ||
prev_packet->sample_type == CS_ETM_EXCEPTION_RET)
return true;
- /* The branch is taken for an intervening exception packet */
- if (packet->sample_type == CS_ETM_EXCEPTION ||
packet->sample_type == CS_ETM_EXCEPTION_RET)
return true;
- return false;
+}
static int cs_etm__sample(struct cs_etm_queue *etmq) { struct cs_etm_auxtrace *etm = etmq->etm; @@ -878,8 +899,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) */ if (etm->synth_opts.last_branch && etmq->prev_packet &&
etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
cs_etm__update_last_branch_rb(etmq);cs_etm__is_taken_branch(etmq->prev_packet, etmq->packet))
if (etm->sample_instructions && @@ -917,9 +937,8 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) generate_sample = true;
/* Generate sample for normal branch packet */
if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
/* Generate sample for exception and normal branch packet */
if (cs_etm__is_taken_branch(etmq->prev_packet, etmq->packet)) generate_sample = true;
if (generate_sample) { @@ -1051,6 +1070,8 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) switch (etmq->packet->sample_type) { case CS_ETM_RANGE:
case CS_ETM_EXCEPTION:
case CS_ETM_EXCEPTION_RET:
I think this is too much coupling and will invariably lead to errors in the code. Function cs_etm__sample() works well for range packets. I suggest spinning off another function (or two) to handle exception packets. That way we have a neat state machine and handling functions are kept simple.
Sure, will do this and let's see if it's doable.
/* * If the packet contains an instruction * range, generate instruction sequence
-- 2.7.4
On 31 May 2018 at 19:57, Leo Yan leo.yan@linaro.org wrote:
On Thu, May 31, 2018 at 11:23:13AM -0600, Mathieu Poirier wrote:
On Thu, May 31, 2018 at 01:22:07PM +0800, Leo Yan wrote:
Every exception packet actually appears as two duplicated elements in the decoder layer, the first element has 'elem_type' = OCSD_GEN_TRC_ELEM_INSTR_RANGE and the 'elem_type' in the second element is OCSD_GEN_TRC_ELEM_EXCEPTION or OCSD_GEN_TRC_ELEM_EXCEPTION_RET, which present for exception entry and exit respectively. The decoder set packet fields 'packet->exc' and 'packet->exc_ret' to indicate the exception packets according to the second element; but exception packets don't have dedicated sample type and shares the same sample type CS_ETM_RANGE with normal instruction packets.
As result, the exception packets are taken as normal instruction packets and they will be processed for branch sample only when 'packet->last_instr_taken_branch' is true, otherwise they will be omitted.
To process exception packets properly, this patch introduce two new sample type: CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET; for these two kind packets, they will be handled by cs_etm__sample(). Except we need to consider branch taken packets for normal CS_ETM_RANGE packet with 'last_instr_taken_branch' is true, we also make below three kind packets as branch taken packets:
CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET packets; Instruction range packet before exception packets; Instruction range packet after exception packets.
This patch is to generate branch samples and update last branch stack for all packets as mentioned above. After exception packets have their own sample type, the packet fields 'packet->exc' and 'packet->exc_ret' aren't needed anymore, so remove them.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 11 ++++----- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 10 ++++---- tools/perf/util/cs-etm.c | 31 +++++++++++++++++++++---- 3 files changed, 36 insertions(+), 16 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 4d5fc37..e009e01 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -264,8 +264,6 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder) decoder->packet_buffer[i].start_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[i].end_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[i].last_instr_taken_branch = false;
decoder->packet_buffer[i].exc = false;
}decoder->packet_buffer[i].exc_ret = false; decoder->packet_buffer[i].cpu = INT_MIN;
} @@ -292,8 +290,6 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder, decoder->packet_count++;
decoder->packet_buffer[et].sample_type = sample_type;
- decoder->packet_buffer[et].exc = false;
- decoder->packet_buffer[et].exc_ret = false; decoder->packet_buffer[et].cpu = *((int *)inode->priv); decoder->packet_buffer[et].start_addr = 0xdeadbeefdeadbeefUL; decoder->packet_buffer[et].end_addr = 0xdeadbeefdeadbeefUL;
@@ -353,6 +349,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( { ocsd_datapath_resp_t resp = OCSD_RESP_CONT; struct cs_etm_decoder *decoder = (struct cs_etm_decoder *) context;
struct cs_etm_packet *packet;
switch (elem->elem_type) { case OCSD_GEN_TRC_ELEM_UNKNOWN:
@@ -370,10 +367,12 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION:
decoder->packet_buffer[decoder->tail].exc = true;
packet = &decoder->packet_buffer[decoder->tail];
case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:packet->sample_type = CS_ETM_EXCEPTION; break;
decoder->packet_buffer[decoder->tail].exc_ret = true;
packet = &decoder->packet_buffer[decoder->tail];
packet->sample_type = CS_ETM_EXCEPTION_RET;
Am I missing something the above won't work? ->tail needs to be incremented before being used. A such what this is doing is clobbering the type of the previous packet.
Here the logic is a bit tricky. The elements OCSD_GEN_TRC_ELEM_EXCEPTION/OCSD_GEN_TRC_ELEM_EXCEPTION_RET don't include any useful address range info, so I think this is the main reason now we take the exception elements as an 'attribution' and attach it to the previous instruction range element.
You are correct only for the exc/exc_ret part. The sample_type gets clobbered leading to very difficult (and fragile) code to manage in the front end.
You need to insert a new packet in the decoder packet queue using function cs_etm_decoder__buffer_packet() and then set the sample type.
I will write one new patch by following this method.
Yes please.
break; case OCSD_GEN_TRC_ELEM_PE_CONTEXT: case OCSD_GEN_TRC_ELEM_EO_TRACE:
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 612b575..cf31a9c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -23,9 +23,11 @@ struct cs_etm_buffer { };
enum cs_etm_sample_type {
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_EXCEPTION = 1 << 2,
- CS_ETM_EXCEPTION_RET = 1 << 3,
};
Thank you for that.
struct cs_etm_packet { @@ -33,8 +35,6 @@ struct cs_etm_packet { u64 start_addr; u64 end_addr; u8 last_instr_taken_branch;
- u8 exc;
- u8 exc_ret; int cpu;
};
Ok
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 8ec2f2c..6453a1b 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -862,6 +862,27 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm, return 0; }
+static bool cs_etm__is_taken_branch(struct cs_etm_packet *prev_packet,
struct cs_etm_packet *packet)
+{
- /* The branch is taken for normal range packet with taken branch flag */
- if (prev_packet->sample_type == CS_ETM_RANGE &&
prev_packet->last_instr_taken_branch)
return true;
- /* The branch is taken if previous packet is exception packet */
- if (prev_packet->sample_type == CS_ETM_EXCEPTION ||
prev_packet->sample_type == CS_ETM_EXCEPTION_RET)
return true;
- /* The branch is taken for an intervening exception packet */
- if (packet->sample_type == CS_ETM_EXCEPTION ||
packet->sample_type == CS_ETM_EXCEPTION_RET)
return true;
- return false;
+}
static int cs_etm__sample(struct cs_etm_queue *etmq) { struct cs_etm_auxtrace *etm = etmq->etm; @@ -878,8 +899,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) */ if (etm->synth_opts.last_branch && etmq->prev_packet &&
etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
cs_etm__is_taken_branch(etmq->prev_packet, etmq->packet)) cs_etm__update_last_branch_rb(etmq);
if (etm->sample_instructions &&
@@ -917,9 +937,8 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) generate_sample = true;
/* Generate sample for normal branch packet */
if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
/* Generate sample for exception and normal branch packet */
if (cs_etm__is_taken_branch(etmq->prev_packet, etmq->packet)) generate_sample = true; if (generate_sample) {
@@ -1051,6 +1070,8 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
switch (etmq->packet->sample_type) { case CS_ETM_RANGE:
case CS_ETM_EXCEPTION:
case CS_ETM_EXCEPTION_RET:
I think this is too much coupling and will invariably lead to errors in the code. Function cs_etm__sample() works well for range packets. I suggest spinning off another function (or two) to handle exception packets. That way we have a neat state machine and handling functions are kept simple.
Sure, will do this and let's see if it's doable.
/* * If the packet contains an instruction * range, generate instruction sequence
-- 2.7.4
This patch introduces invalid address macro and uses it to replace dummy value '0xdeadbeefdeadbeefUL'.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 8 ++++---- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 2 ++ 2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index e009e01..e833f68 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -261,8 +261,8 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder) decoder->tail = 0; decoder->packet_count = 0; for (i = 0; i < MAX_BUFFER; i++) { - decoder->packet_buffer[i].start_addr = 0xdeadbeefdeadbeefUL; - decoder->packet_buffer[i].end_addr = 0xdeadbeefdeadbeefUL; + decoder->packet_buffer[i].start_addr = CS_ETM_INVAL_ADDR; + decoder->packet_buffer[i].end_addr = CS_ETM_INVAL_ADDR; decoder->packet_buffer[i].last_instr_taken_branch = false; decoder->packet_buffer[i].cpu = INT_MIN; } @@ -291,8 +291,8 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
decoder->packet_buffer[et].sample_type = sample_type; decoder->packet_buffer[et].cpu = *((int *)inode->priv); - decoder->packet_buffer[et].start_addr = 0xdeadbeefdeadbeefUL; - decoder->packet_buffer[et].end_addr = 0xdeadbeefdeadbeefUL; + decoder->packet_buffer[et].start_addr = CS_ETM_INVAL_ADDR; + decoder->packet_buffer[et].end_addr = CS_ETM_INVAL_ADDR;
if (decoder->packet_count == MAX_BUFFER - 1) return OCSD_RESP_WAIT; diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index cf31a9c..cb57756 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -13,6 +13,8 @@ #include <linux/types.h> #include <stdio.h>
+#define CS_ETM_INVAL_ADDR 0xdeadbeefdeadbeefUL + struct cs_etm_decoder;
struct cs_etm_buffer {
This commit adds python script to parse CoreSight tracing event and use command 'objdump' for disassembled lines, finally we can generate readable program execution flow for reviewing tracing data.
The script receives CoreSight tracing packet with below format:
+------------+------------+------------+ packet(n): | addr | ip | cpu | +------------+------------+------------+ packet(n+1): | addr | ip | cpu | +------------+------------+------------+
packet::ip is the last address of current branch instruction and packet::addr presents the start address of the next coming branch instruction. So for one branch instruction which starts in packet(n), its execution flow starts from packet(n)::addr and it stops at packet(n+1)::ip. As results we need to combine the two continuous packets to generate the instruction range, this is the rationale for the script implementation:
[ sample(n)::addr .. sample(n+1)::ip ]
Credits to Tor Jeremiassen who have written the script skeleton and provides the ideas for reading symbol file according to build-id, creating memory map for dso and basic packet handling. Mathieu Poirier contributed fixes for build-id and memory map bugs. The detailed development history for this script you can find from [1]. Based on Tor and Mathieu work, the script is updated samples handling for the corrected sample format. Another minor enhancement is to support for without build-id case, the script can parse kernel symbols with option '-k' for vmlinux file path.
[1] https://github.com/Linaro/perf-opencsd/commits/perf-opencsd-v4.15/tools/perf...
Co-authored-by: Tor Jeremiassen tor@ti.com Co-authored-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/scripts/python/arm-cs-trace-disasm.py | 235 +++++++++++++++++++++++ 1 file changed, 235 insertions(+) create mode 100644 tools/perf/scripts/python/arm-cs-trace-disasm.py
diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py new file mode 100644 index 0000000..1239ab4 --- /dev/null +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py @@ -0,0 +1,235 @@ +# arm-cs-trace-disasm.py: ARM CoreSight Trace Dump With Disassember +# SPDX-License-Identifier: GPL-2.0 +# +# Tor Jeremiassen tor@ti.com is original author who wrote script +# skeleton, Mathieu Poirier mathieu.poirier@linaro.org contributed +# fixes for build-id and memory map; Leo Yan leo.yan@linaro.org +# updated the packet parsing with new samples format. + +import os +import sys +import re +from subprocess import * +from optparse import OptionParser, make_option + +# Command line parsing + +option_list = [ + # formatting options for the bottom entry of the stack + make_option("-k", "--vmlinux", dest="vmlinux_name", + help="Set path to vmlinux file"), + make_option("-d", "--objdump", dest="objdump_name", + help="Set path to objdump executable file"), + make_option("-v", "--verbose", dest="verbose", + action="store_true", default=False, + help="Enable debugging log") +] + +parser = OptionParser(option_list=option_list) +(options, args) = parser.parse_args() + +if (options.objdump_name == None): + sys.exit("No objdump executable file specified - use -d or --objdump option") + +# Initialize global dicts and regular expression + +build_ids = dict() +mmaps = dict() +disasm_cache = dict() +cpu_data = dict() +disasm_re = re.compile("^\s*([0-9a-fA-F]+):") +disasm_func_re = re.compile("^\s*([0-9a-fA-F]+)\s<.*>:") +cache_size = 32*1024 +prev_cpu = -1 + +def parse_buildid(): + global build_ids + + buildid_regex = "([a-fA-f0-9]+)[ \t]([^ \n]+)" + buildid_re = re.compile(buildid_regex) + + results = check_output(["perf", "buildid-list"]).split('\n'); + for line in results: + m = buildid_re.search(line) + if (m == None): + continue; + + id_name = m.group(2) + id_num = m.group(1) + + if (id_name == "[kernel.kallsyms]") : + append = "/kallsyms" + elif (id_name == "[vdso]") : + append = "/vdso" + else: + append = "/elf" + + build_ids[id_name] = os.environ['PERF_BUILDID_DIR'] + \ + "/" + id_name + "/" + id_num + append; + # Replace duplicate slash chars to single slash char + build_ids[id_name] = build_ids[id_name].replace('//', '/', 1) + + if ((options.vmlinux_name == None) and ("[kernel.kallsyms]" in build_ids)): + print "kallsyms cannot be used to dump assembler" + + # Set vmlinux path to replace kallsyms file, if without buildid we still + # can use vmlinux to prase kernel symbols + if ((options.vmlinux_name != None)): + build_ids['[kernel.kallsyms]'] = options.vmlinux_name; + +def parse_mmap(): + global mmaps + + # Check mmap for PERF_RECORD_MMAP and PERF_RECORD_MMAP2 + mmap_regex = "PERF_RECORD_MMAP.* -?[0-9]+/[0-9]+: [(0x[0-9a-fA-F]+)((0x[0-9a-fA-F]+)).*:\s.*\s(\S*)" + mmap_re = re.compile(mmap_regex) + + results = check_output("perf script --show-mmap-events | fgrep PERF_RECORD_MMAP", shell=True).split('\n') + for line in results: + m = mmap_re.search(line) + if (m != None): + if (m.group(3) == '[kernel.kallsyms]_text'): + dso = '[kernel.kallsyms]' + else: + dso = m.group(3) + + start = int(m.group(1),0) + end = int(m.group(1),0) + int(m.group(2),0) + mmaps[dso] = [start, end] + +def find_dso_mmap(addr): + global mmaps + + for key, value in mmaps.items(): + if (addr >= value[0] and addr < value[1]): + return key + + return None + +def read_disam(dso, start_addr, stop_addr): + global mmaps + global build_ids + + addr_range = start_addr + ":" + stop_addr; + + # Don't let the cache get too big, clear it when it hits max size + if (len(disasm_cache) > cache_size): + disasm_cache.clear(); + + try: + disasm_output = disasm_cache[addr_range]; + except: + try: + fname = build_ids[dso]; + except KeyError: + sys.exit("cannot find symbol file for " + dso) + + disasm = [ options.objdump_name, "-d", "-z", + "--start-address="+start_addr, + "--stop-address="+stop_addr, fname ] + + disasm_output = check_output(disasm).split('\n') + disasm_cache[addr_range] = disasm_output; + + return disasm_output + +def dump_disam(dso, start_addr, stop_addr): + for line in read_disam(dso, start_addr, stop_addr): + m = disasm_func_re.search(line) + if (m != None): + print "\t",line + continue + + m = disasm_re.search(line) + if (m == None): + continue; + + print "\t",line + +def dump_packet(sample): + print "Packet = { cpu: 0x%d addr: 0x%x phys_addr: 0x%x ip: 0x%x " \ + "pid: %d tid: %d period: %d time: %d }" % \ + (sample['cpu'], sample['addr'], sample['phys_addr'], \ + sample['ip'], sample['pid'], sample['tid'], \ + sample['period'], sample['time']) + +def trace_begin(): + print 'ARM CoreSight Trace Data Assembler Dump' + parse_buildid() + parse_mmap() + +def trace_end(): + print 'End' + +def trace_unhandled(event_name, context, event_fields_dict): + print ' '.join(['%s=%s'%(k,str(v))for k,v in sorted(event_fields_dict.items())]) + +def process_event(param_dict): + global cache_size + global options + global prev_cpu + + sample = param_dict["sample"] + + if (options.verbose == True): + dump_packet(sample) + + # If period doesn't equal to 1, this packet is for instruction sample + # packet, we need drop this synthetic packet. + if (sample['period'] != 1): + print "Skip synthetic instruction sample" + return + + cpu = format(sample['cpu'], "d"); + + # Initialize CPU data if it's empty, and directly return back + # if this is the first tracing event for this CPU. + if (cpu_data.get(str(cpu) + 'addr') == None): + cpu_data[str(cpu) + 'addr'] = format(sample['addr'], "#x") + prev_cpu = cpu + return + + # The format for packet is: + # + # +------------+------------+------------+ + # sample_prev: | addr | ip | cpu | + # +------------+------------+------------+ + # sample_next: | addr | ip | cpu | + # +------------+------------+------------+ + # + # We need to combine the two continuous packets to get the instruction + # range for sample_prev::cpu: + # + # [ sample_prev::addr .. sample_next::ip ] + # + # For this purose, sample_prev::addr is stored into cpu_data structure + # and read back for 'start_addr' when the new packet comes, and we need + # to use sample_next::ip to calculate 'stop_addr', plusing extra 4 for + # 'stop_addr' is for the sake of objdump so the final assembler dump can + # include last instruction for sample_next::ip. + + start_addr = cpu_data[str(prev_cpu) + 'addr'] + stop_addr = format(sample['ip'] + 4, "#x") + + # Record for previous sample packet + cpu_data[str(cpu) + 'addr'] = format(sample['addr'], "#x") + prev_cpu = cpu + + # Handle CS_ETM_TRACE_ON packet if start_addr=0 and stop_addr=4 + if (int(start_addr, 0) == 0 and int(stop_addr, 0) == 4): + print "CPU%s: CS_ETM_TRACE_ON packet is inserted" % cpu + return + + # Sanity checking dso for start_addr and stop_addr + prev_dso = find_dso_mmap(int(start_addr, 0)) + next_dso = find_dso_mmap(int(stop_addr, 0)) + + # If cannot find dso so cannot dump assembler, bail out + if (prev_dso == None or next_dso == None): + print "Address range [ %s .. %s ]: failed to find dso" % (start_addr, stop_addr) + return + elif (prev_dso != next_dso): + print "Address range [ %s .. %s ]: isn't in same dso" % (start_addr, stop_addr) + return + + dump_disam(prev_dso, start_addr, stop_addr)
This commit documents CoreSight trace disassembler usage and gives example for it.
Signed-off-by: Leo Yan leo.yan@linaro.org --- Documentation/trace/coresight.txt | 52 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+)
diff --git a/Documentation/trace/coresight.txt b/Documentation/trace/coresight.txt index 6f0120c..b8f2359 100644 --- a/Documentation/trace/coresight.txt +++ b/Documentation/trace/coresight.txt @@ -381,3 +381,55 @@ sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tuto $ taskset -c 2 ./sort_autofdo Bubble sorting array of 30000 elements 5806 ms + + +Tracing data disassembler +------------------------- + +'perf script' supports to use script to parse tracing packet and rely on +'objdump' for disassembled lines, this can convert tracing data to readable +program execution flow for easily reviewing tracing data. + +The CoreSight trace disassembler is located in the folder: +tools/perf/scripts/python/arm-cs-trace-disasm.py. This script support below +options: + + -d, --objdump: Set path to objdump executable, this option is + mandatory. + -k, --vmlinux: Set path to vmlinux file. + -v, --verbose: Enable debugging log, after enable this option the + script dumps every event data. + +Below is one example for using python script to dump CoreSight trace +disassembler: + + $ perf script -s arm-cs-trace-disasm.py -i perf.data \ + -F cpu,event,ip,addr,sym -- -d objdump -k ./vmlinux > cs-disasm.log + +Below is one example for the disassembler log: + +ARM CoreSight Trace Data Assembler Dump + ffff000008a5f2dc <etm4_enable_hw+0x344>: + ffff000008a5f2dc: 340000a0 cbz w0, ffff000008a5f2f0 <etm4_enable_hw+0x358> + ffff000008a5f2f0 <etm4_enable_hw+0x358>: + ffff000008a5f2f0: f9400260 ldr x0, [x19] + ffff000008a5f2f4: d5033f9f dsb sy + ffff000008a5f2f8: 913ec000 add x0, x0, #0xfb0 + ffff000008a5f2fc: b900001f str wzr, [x0] + ffff000008a5f300: f9400bf3 ldr x19, [sp, #16] + ffff000008a5f304: a8c27bfd ldp x29, x30, [sp], #32 + ffff000008a5f308: d65f03c0 ret + ffff000008a5fa18 <etm4_enable+0x1b0>: + ffff000008a5fa18: 14000025 b ffff000008a5faac <etm4_enable+0x244> + ffff000008a5faac <etm4_enable+0x244>: + ffff000008a5faac: b9406261 ldr w1, [x19, #96] + ffff000008a5fab0: 52800015 mov w21, #0x0 // #0 + ffff000008a5fab4: f901ca61 str x1, [x19, #912] + ffff000008a5fab8: 2a1503e0 mov w0, w21 + ffff000008a5fabc: 3940e261 ldrb w1, [x19, #56] + ffff000008a5fac0: f901ce61 str x1, [x19, #920] + ffff000008a5fac4: a94153f3 ldp x19, x20, [sp, #16] + ffff000008a5fac8: a9425bf5 ldp x21, x22, [sp, #32] + ffff000008a5facc: a94363f7 ldp x23, x24, [sp, #48] + ffff000008a5fad0: a8c47bfd ldp x29, x30, [sp], #64 + ffff000008a5fad4: d65f03c0 ret