This patch series adds support for sample flags, so that perf can print sample flags for branch instructions.
Patch 0001 saves the last branch information in the packet structure; this includes the instruction type, subtype and condition flag, which together identify which branch instruction was executed. It passes this information from the decoder layer to cs-etm.c, so that cs-etm.c serves as the central place to set sample flags.
Patch 0002 sets sample flags for the instruction range packet.
Patch 0003 sets sample flags for the trace discontinuity packet.
Patch 0004 adds the exception number to the packet, in preparation for patch 0005, which sets sample flags for the exception packet; patch 0005 supports ETMv3 and ETMv4 exception packets together.
The patch series is based on OpenCSD v0.10.0 and later versions (the latest version is v0.10.1 at the time of writing).
This patch series applies on top of acme's perf core branch [1], with latest commit aaab25f03e9e ("perf trace: Allow selecting use the use of the ordered_events code"), and depends on the patch series 'perf cs-etm: Correct packets handling' [2].
After applying the dependency patches and this patch series, we can verify sample flags with the command below:
# perf script -F,-time,+flags,+ip,+sym,+dso,+addr,+symoff -k vmlinux
Changes from v1:
* Addressed Mathieu's suggestion to split one big patch into 3 small patches for setting sample flags: one for the instruction range packet, one for the discontinuity packet and one for the exception packet.
* Added support for the ETMv3 exception packet.
* Followed Mathieu's suggestion to move all sample flags handling from the decoder layer to cs-etm.c, so that it has enough information to set flags based on trace context in a single place.
Changes from v1:
* Moved the exception packet handling patches into the patch series 'perf cs-etm: Correct packets handling'.
* Added sample flags fixup for the TRACE_OFF packet.
* Created a new function which is used to maintain the flags fixup.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf/c...
[2] https://lkml.org/lkml/2018/12/11/73
Leo Yan (5):
  perf cs-etm: Add last instruction information in packet
  perf cs-etm: Set sample flags for instruction range packet
  perf cs-etm: Set sample flags for trace discontinuity
  perf cs-etm: Add exception number in exception packet
  perf cs-etm: Set sample flags for exception packet
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c |  29 ++-
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h |   5 +
 tools/perf/util/cs-etm.c                        | 227 +++++++++++++++++++++++-
 tools/perf/util/cs-etm.h                        |  36 ++++
 4 files changed, 291 insertions(+), 6 deletions(-)
The decoder provides information about the last instruction, which can be used for trace analysis; specifically, it tells us what kind of branch instruction has been executed. The information is mainly contained in three element fields:
last_i_type: the significant type for waypoint calculation; it indicates that the last instruction is one of: immediate branch instruction, indirect branch instruction, instruction barrier (ISB), or data barrier (DSB/DMB).
last_i_subtype: the instruction sub type; it can be branch with link, ARMv8 return instruction, ARMv8 eret instruction (return from exception), or an ARMv7 instruction which could imply return (e.g. MOV PC, LR; POP { ,PC}).
last_instr_cond: indicates whether the last instruction was conditional.
But these three fields are not saved into the cs_etm_packet struct, so the cs-etm layer doesn't have this information and cannot generate sample flags for branch instructions.
This patch adds three corresponding new fields to the cs_etm_packet struct and saves the related values into the packet structure, in preparation for supporting sample flags.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 9 +++++++++
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 3 +++
 2 files changed, 12 insertions(+)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index 8c15557..8a19310 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -290,6 +290,9 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder)
 		decoder->packet_buffer[i].instr_count = 0;
 		decoder->packet_buffer[i].last_instr_taken_branch = false;
 		decoder->packet_buffer[i].last_instr_size = 0;
+		decoder->packet_buffer[i].last_instr_type = 0;
+		decoder->packet_buffer[i].last_instr_subtype = 0;
+		decoder->packet_buffer[i].last_instr_cond = 0;
 		decoder->packet_buffer[i].cpu = INT_MIN;
 	}
 }
@@ -323,6 +326,9 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
 	decoder->packet_buffer[et].instr_count = 0;
 	decoder->packet_buffer[et].last_instr_taken_branch = false;
 	decoder->packet_buffer[et].last_instr_size = 0;
+	decoder->packet_buffer[et].last_instr_type = 0;
+	decoder->packet_buffer[et].last_instr_subtype = 0;
+	decoder->packet_buffer[et].last_instr_cond = 0;
 
 	if (decoder->packet_count == MAX_BUFFER - 1)
 		return OCSD_RESP_WAIT;
@@ -366,6 +372,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder,
 	packet->start_addr = elem->st_addr;
 	packet->end_addr = elem->en_addr;
 	packet->instr_count = elem->num_instr_range;
+	packet->last_instr_type = elem->last_i_type;
+	packet->last_instr_subtype = elem->last_i_subtype;
+	packet->last_instr_cond = elem->last_instr_cond;
 
 	switch (elem->last_i_type) {
 	case OCSD_INSTR_BR:
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
index a6407d4..0ffa7c5 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -45,6 +45,9 @@ struct cs_etm_packet {
 	u32 instr_count;
 	u8 last_instr_taken_branch;
 	u8 last_instr_size;
+	u32 last_instr_type;
+	u32 last_instr_subtype;
+	u8 last_instr_cond;
 	int cpu;
 };
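As a rough, standalone illustration of what this patch does (not the actual perf code), the sketch below resets and then copies the three last-instruction fields from a decoder element into a packet. The types demo_elem and demo_packet and the enum values are hypothetical stand-ins for the OpenCSD trace element and for struct cs_etm_packet; only the field names follow the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OpenCSD instruction type/subtype enums. */
enum demo_instr_type { DEMO_INSTR_OTHER, DEMO_INSTR_BR, DEMO_INSTR_BR_INDIRECT };
enum demo_instr_subtype { DEMO_S_INSTR_NONE, DEMO_S_INSTR_BR_LINK };

/* Simplified decoder element: only the three fields discussed above. */
struct demo_elem {
	enum demo_instr_type last_i_type;
	enum demo_instr_subtype last_i_subtype;
	int last_instr_cond;
};

/* Simplified packet: only the three new fields. */
struct demo_packet {
	uint32_t last_instr_type;
	uint32_t last_instr_subtype;
	uint8_t last_instr_cond;
};

/* Reset the fields, as done when a packet buffer slot is cleared or reused. */
static void demo_clear_last_instr(struct demo_packet *packet)
{
	assert(packet);
	packet->last_instr_type = 0;
	packet->last_instr_subtype = 0;
	packet->last_instr_cond = 0;
}

/* Copy the decoder's view of the last instruction into the packet. */
static void demo_save_last_instr(struct demo_packet *packet,
				 const struct demo_elem *elem)
{
	assert(packet && elem);
	packet->last_instr_type = elem->last_i_type;
	packet->last_instr_subtype = elem->last_i_subtype;
	packet->last_instr_cond = elem->last_instr_cond ? 1 : 0;
}
```

With the values kept in the packet, a later stage can decide on the branch kind without calling back into the decoder.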
On Tue, Dec 11, 2018 at 11:01:07PM +0800, Leo Yan wrote:
> Decoder provides last instruction related information, these information can be used for trace analysis; specifically we can get to know what kind of branch instruction has been executed, mainly the information are contained in three element fields:
> last_i_type: this is significant type for waypoint calculation, it indicates the last instruction is one of immediate branch instruction, indirect branch instruction, instruction barrier (ISB), or data barrier (DSB/DMB).
> last_i_subtype: this is used for instruction sub type, it can be branch with link, ARMv8 return instruction, ARMv8 eret instruction (return from exception), or ARMv7 instruction which could imply return (e.g. MOV PC, LR; POP { ,PC}).
> last_instr_cond: it indicates if the last instruction was conditional.
> But these there fields are not saved into cs_etm_packet struct, thus
s/there/three
> cs-etm layer don't know related information and cannot generate sample flags for branch instructions.
> This patch add corresponding three new fields in cs_etm_packet struct and save related value into the packet structure, it is preparation for supporting sample flags.
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 9 +++++++++
>  tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 3 +++
>  2 files changed, 12 insertions(+)
> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> index 8c15557..8a19310 100644
> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> @@ -290,6 +290,9 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder)
>  		decoder->packet_buffer[i].instr_count = 0;
>  		decoder->packet_buffer[i].last_instr_taken_branch = false;
>  		decoder->packet_buffer[i].last_instr_size = 0;
> +		decoder->packet_buffer[i].last_instr_type = 0;
> +		decoder->packet_buffer[i].last_instr_subtype = 0;
> +		decoder->packet_buffer[i].last_instr_cond = 0;
>  		decoder->packet_buffer[i].cpu = INT_MIN;
>  	}
>  }
> @@ -323,6 +326,9 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
>  	decoder->packet_buffer[et].instr_count = 0;
>  	decoder->packet_buffer[et].last_instr_taken_branch = false;
>  	decoder->packet_buffer[et].last_instr_size = 0;
> +	decoder->packet_buffer[et].last_instr_type = 0;
> +	decoder->packet_buffer[et].last_instr_subtype = 0;
> +	decoder->packet_buffer[et].last_instr_cond = 0;
>  
>  	if (decoder->packet_count == MAX_BUFFER - 1)
>  		return OCSD_RESP_WAIT;
> @@ -366,6 +372,9 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder,
>  	packet->start_addr = elem->st_addr;
>  	packet->end_addr = elem->en_addr;
>  	packet->instr_count = elem->num_instr_range;
> +	packet->last_instr_type = elem->last_i_type;
> +	packet->last_instr_subtype = elem->last_i_subtype;
> +	packet->last_instr_cond = elem->last_instr_cond;
>  
>  	switch (elem->last_i_type) {
>  	case OCSD_INSTR_BR:
> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> index a6407d4..0ffa7c5 100644
> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> @@ -45,6 +45,9 @@ struct cs_etm_packet {
>  	u32 instr_count;
>  	u8 last_instr_taken_branch;
>  	u8 last_instr_size;
> +	u32 last_instr_type;
> +	u32 last_instr_subtype;
> +	u8 last_instr_cond;
>  	int cpu;
If you insert this block after @instr_count and before @last_instr_taken_branch all the types will be grouped together.
>  };
> --
> 2.7.4
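To make the suggested field order concrete, here is a small sketch comparing the order as posted with the order after moving the new block up. Both structs are simplified, hypothetical copies that carry only the members visible in the hunk; the real struct cs_etm_packet has more fields. The review comment is about keeping like types together; the possible reduction in padding shown by the helper is only an observation that holds on common ABIs, not something claimed in the review.

```c
#include <stddef.h>
#include <stdint.h>

/* Member order as posted in the patch (simplified, hypothetical copy). */
struct demo_packet_as_posted {
	uint32_t instr_count;
	uint8_t  last_instr_taken_branch;
	uint8_t  last_instr_size;
	uint32_t last_instr_type;
	uint32_t last_instr_subtype;
	uint8_t  last_instr_cond;
	int      cpu;
};

/* New block moved between instr_count and last_instr_taken_branch. */
struct demo_packet_grouped {
	uint32_t instr_count;
	uint32_t last_instr_type;
	uint32_t last_instr_subtype;
	uint8_t  last_instr_cond;
	uint8_t  last_instr_taken_branch;
	uint8_t  last_instr_size;
	int      cpu;
};

/* Bytes saved by grouping; the exact value depends on the ABI. */
static size_t demo_padding_saved(void)
{
	return sizeof(struct demo_packet_as_posted) -
	       sizeof(struct demo_packet_grouped);
}
```

In the grouped layout the three u32 members are adjacent and the three u8 members are adjacent, which is what "grouped together" refers to.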
The perf sample data contains flags to indicate which type of branch instruction the hardware trace data belongs to, so that perf can print a human readable string. Arm CoreSight ETM sample data does not set these flags; they are always zero, so the perf tool skips printing the string for instruction types.
This patch sets branch instruction flags for the instruction range packet, mainly based on three fields which have been introduced in the cs_etm_packet struct:
cs_etm_packet::last_instr_type
cs_etm_packet::last_instr_subtype
cs_etm_packet::last_instr_cond
Below is the detailed implementation for setting sample flags for branch instructions:
If the instruction is an immediate branch without link or return flag, we consider it a function-internal branch. In fact an immediate branch can also be used to invoke a function entry, but usually this is only done in assembly code to directly call a symbol without expecting to return; after reviewing normal kernel functions and user space programs, both of them very seldom use an immediate branch for a function call. On the other hand, deciding whether an immediate branch is a jump within a function or a function call would require relying on the start address of the next packet and checking the symbol offset of that address, which would introduce much complexity in the implementation. So for this version we simply consider an immediate branch as a function-internal branch. Moreover, we rely on 'last_instr_cond' to decide whether the branch instruction is a conditional branch.
If the instruction is an immediate branch with link, it is the 'BL' instruction, which is used for function calls.
If the instruction is an indirect branch with link, e.g. BLR, we treat it as a function call.
If the instruction is an indirect branch with subtype OCSD_S_INSTR_V7_IMPLIED_RET, the decoder hints at a function return for the A32/T32 instructions below; set this branch flag as function return (thanks to Al for the suggestion).
BX R14
MOV PC, LR
POP {…, PC}
LDR PC, [SP], #offset
If the instruction is an indirect branch without link, it corresponds to the 'BR' instruction; this instruction is usually used in dynamically linked libraries as shown below, so we treat it as a return instruction.
0000000000000680 <.plt>:
 680:	a9bf7bf0	stp	x16, x30, [sp, #-16]!
 684:	90000090	adrp	x16, 10000 <__FRAME_END__+0xf630>
 688:	f947fe11	ldr	x17, [x16, #4088]
 68c:	913fe210	add	x16, x16, #0xff8
 690:	d61f0220	br	x17
For function return, ARMv8 introduces a dedicated instruction 'ret', which has the subtype OCSD_S_INSTR_V8_RET.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h |  1 +
 tools/perf/util/cs-etm.c                        | 81 ++++++++++++++++++++++++-
 2 files changed, 80 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
index 0ffa7c5..516a0fb 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -49,6 +49,7 @@ struct cs_etm_packet {
 	u32 last_instr_subtype;
 	u8 last_instr_cond;
 	int cpu;
+	u32 flags;
 };
 
 struct cs_etm_queue;
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 27a374d..3ad0b87 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -13,6 +13,7 @@
 #include <linux/types.h>
 
 #include <stdlib.h>
+#include <opencsd/ocsd_if_types.h>
 
 #include "auxtrace.h"
 #include "color.h"
@@ -719,7 +720,7 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
 	sample.stream_id = etmq->etm->instructions_id;
 	sample.period = period;
 	sample.cpu = etmq->packet->cpu;
-	sample.flags = 0;
+	sample.flags = etmq->prev_packet->flags;
 	sample.insn_len = 1;
 	sample.cpumode = event->sample.header.misc;
 
@@ -778,7 +779,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq)
 	sample.stream_id = etmq->etm->branches_id;
 	sample.period = 1;
 	sample.cpu = etmq->packet->cpu;
-	sample.flags = 0;
+	sample.flags = etmq->prev_packet->flags;
 	sample.cpumode = event->sample.header.misc;
 
 	/*
@@ -1107,6 +1108,80 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq)
 	return 0;
 }
 
+static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_packet *packet = etmq->packet;
+
+	packet->flags = 0;
+
+	switch (packet->sample_type) {
+	case CS_ETM_RANGE:
+		/*
+		 * Immediate branch instruction without neither link nor
+		 * return flag, it's normal branch instruction within
+		 * the function.
+		 */
+		if (packet->last_instr_type == OCSD_INSTR_BR &&
+		    packet->last_instr_subtype == OCSD_S_INSTR_NONE) {
+			packet->flags = PERF_IP_FLAG_BRANCH;
+
+			if (packet->last_instr_cond)
+				packet->flags |= PERF_IP_FLAG_CONDITIONAL;
+		}
+
+		/*
+		 * Immediate branch instruction with link (e.g. BL), this is
+		 * branch instruction for function call.
+		 */
+		if (packet->last_instr_type == OCSD_INSTR_BR &&
+		    packet->last_instr_subtype == OCSD_S_INSTR_BR_LINK)
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_CALL;
+
+		/*
+		 * Indirect branch instruction with link (e.g. BLR), this is
+		 * branch instruction for function call.
+		 */
+		if (packet->last_instr_type == OCSD_INSTR_BR_INDIRECT &&
+		    packet->last_instr_subtype == OCSD_S_INSTR_BR_LINK)
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_CALL;
+
+		/*
+		 * Indirect branch instruction with subtype of
+		 * OCSD_S_INSTR_V7_IMPLIED_RET, this is explicit hint for
+		 * function return for A32/T32.
+		 */
+		if (packet->last_instr_type == OCSD_INSTR_BR_INDIRECT &&
+		    packet->last_instr_subtype == OCSD_S_INSTR_V7_IMPLIED_RET)
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_RETURN;
+
+		/*
+		 * Indirect branch instruction without link (e.g. BR), usually
+		 * this is used for function return, especially for functions
+		 * within dynamic link lib.
+		 */
+		if (packet->last_instr_type == OCSD_INSTR_BR_INDIRECT &&
+		    packet->last_instr_subtype == OCSD_S_INSTR_NONE)
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_RETURN;
+
+		/* Return instruction for function return. */
+		if (packet->last_instr_type == OCSD_INSTR_BR_INDIRECT &&
+		    packet->last_instr_subtype == OCSD_S_INSTR_V8_RET)
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_RETURN;
+		break;
+	case CS_ETM_DISCONTINUITY:
+	case CS_ETM_EXCEPTION:
+	case CS_ETM_EXCEPTION_RET:
+	case CS_ETM_EMPTY:
+	default:
+		break;
+	}
+}
+
 static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
 {
 	struct cs_etm_auxtrace *etm = etmq->etm;
@@ -1158,6 +1233,8 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
 				 */
 				break;
 
+			cs_etm__set_sample_flags(etmq);
+
 			switch (etmq->packet->sample_type) {
 			case CS_ETM_RANGE:
 				/*
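For readers who want to experiment with the decision rules outside of perf, the following standalone sketch restates the range-packet classification from cs_etm__set_sample_flags() as a pure function. Everything prefixed with cls_ or CLS_ is hypothetical: the enum members only mirror the names of the OpenCSD types and subtypes used in the patch, and the flag bit values are chosen for illustration rather than copied from perf's headers.

```c
#include <stdint.h>

/* Hypothetical mirrors of the OpenCSD instruction types and subtypes. */
enum cls_type { CLS_INSTR_OTHER, CLS_INSTR_BR, CLS_INSTR_BR_INDIRECT };
enum cls_subtype {
	CLS_S_NONE,
	CLS_S_BR_LINK,
	CLS_S_V8_RET,
	CLS_S_V8_ERET,
	CLS_S_V7_IMPLIED_RET
};

/* Illustrative flag bits; the real PERF_IP_FLAG_* values live in perf. */
#define CLS_FLAG_BRANCH		(1u << 0)
#define CLS_FLAG_CALL		(1u << 1)
#define CLS_FLAG_RETURN		(1u << 2)
#define CLS_FLAG_CONDITIONAL	(1u << 3)

/* Return the sample flags for a range packet's last instruction. */
static uint32_t cls_range_flags(enum cls_type type, enum cls_subtype subtype,
				int cond)
{
	if (type == CLS_INSTR_BR) {
		/* Immediate branch: plain jump (maybe conditional) or BL. */
		if (subtype == CLS_S_NONE)
			return CLS_FLAG_BRANCH |
			       (cond ? CLS_FLAG_CONDITIONAL : 0u);
		if (subtype == CLS_S_BR_LINK)
			return CLS_FLAG_BRANCH | CLS_FLAG_CALL;
		return 0;
	}

	if (type == CLS_INSTR_BR_INDIRECT) {
		switch (subtype) {
		case CLS_S_BR_LINK:		/* e.g. BLR */
			return CLS_FLAG_BRANCH | CLS_FLAG_CALL;
		case CLS_S_V7_IMPLIED_RET:	/* e.g. BX R14 */
		case CLS_S_NONE:		/* e.g. BR */
		case CLS_S_V8_RET:		/* RET */
			return CLS_FLAG_BRANCH | CLS_FLAG_RETURN;
		default:
			return 0;
		}
	}

	/* Barriers and other instructions carry no branch flags here. */
	return 0;
}
```

Like the patch, the sketch leaves the ERET subtype without flags, and the conditional bit is only applied to immediate branches without link.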
On Tue, Dec 11, 2018 at 11:01:08PM +0800, Leo Yan wrote:
> The perf sample data contains flags to indicate the hardware trace data is belonging to which type branch instruction, thus this can be used to print out the human readable string. Arm CoreSight ETM sample data is missed to set flags and it is always set to zeros, this results in perf tool skips to print string for instruction types.
> This patch is to set branch instruction flags for instruction range packet,
Remove everything between here and your SOB. It is too long and you already added valuable comments to function cs_etm__set_sample_flags().
> mainly based on three fields which have been introduced in cs_etm_packet struct:
> cs_etm_packet::last_instr_type
> cs_etm_packet::last_instr_subtype
> cs_etm_packet::last_instr_cond
> Below is detailed implementation for set sample flags for branch instructions:
> If the instruction is immediate branch but without link and return flag, we consider it as function internal branch; in fact the immediate branch also can be used to invoke the function entry, usually this is only used in assembly code to directly call a symbol and don't expect to return back; after reviewing kernel normal functions and user space programs, both of them are very seldom to use immediate branch for function call. On the other hand, if we want to decide the immediate branch is for function branch jumping or for function calling, we need to rely on the start address of next packet and check the symbol offset for the start address, this will introduce much complexity in the implementation. So for this version we simply consider immediate branch as function internal branch. Moreover, we rely on 'last_instr_cond' to decide if the branch instruction is a conditional branch or not.
> If the instruction is immediate branch with link, it's instruction 'BL' and which is used for function call.
> If the instruction is indirect branch with link, e.g BLR, we think it's a function call.
> If the instruction is indirect branch and with subtype OCSD_S_INSTR_V7_IMPLIED_RET, the decoders gives the hint the function return for below cases related with A32/T32 instruction; set this branch flag as function return (Thanks for Al's suggestion).
> BX R14
> MOV PC, LR
> POP {…, PC}
> LDR PC, [SP], #offset
> If the instruction is indirect branch without link, this is corresponding to instruction 'BR', this instruction usually is used for dynamic link lib with below usage; so we think it's a return instruction.
> 0000000000000680 <.plt>:
>  680:	a9bf7bf0	stp	x16, x30, [sp, #-16]!
>  684:	90000090	adrp	x16, 10000 <__FRAME_END__+0xf630>
>  688:	f947fe11	ldr	x17, [x16, #4088]
>  68c:	913fe210	add	x16, x16, #0xff8
>  690:	d61f0220	br	x17
> For function return, ARMv8 introduces a dedicated instruction 'ret', which has flag of OCSD_S_INSTR_V8_RET.
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/cs-etm-decoder/cs-etm-decoder.h |  1 +
>  tools/perf/util/cs-etm.c                        | 81 ++++++++++++++++++++++++-
>  2 files changed, 80 insertions(+), 2 deletions(-)
> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> index 0ffa7c5..516a0fb 100644
> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> @@ -49,6 +49,7 @@ struct cs_etm_packet {
>  	u32 last_instr_subtype;
>  	u8 last_instr_cond;
>  	int cpu;
> +	u32 flags;
Add this just below @last_instr_subtype so that types are grouped together.
>  };
>  
>  struct cs_etm_queue;
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 27a374d..3ad0b87 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -13,6 +13,7 @@
>  #include <linux/types.h>
>  
>  #include <stdlib.h>
> +#include <opencsd/ocsd_if_types.h>
>  
>  #include "auxtrace.h"
>  #include "color.h"
> @@ -719,7 +720,7 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
>  	sample.stream_id = etmq->etm->instructions_id;
>  	sample.period = period;
>  	sample.cpu = etmq->packet->cpu;
> -	sample.flags = 0;
> +	sample.flags = etmq->prev_packet->flags;
>  	sample.insn_len = 1;
>  	sample.cpumode = event->sample.header.misc;
>  
> @@ -778,7 +779,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq)
>  	sample.stream_id = etmq->etm->branches_id;
>  	sample.period = 1;
>  	sample.cpu = etmq->packet->cpu;
> -	sample.flags = 0;
> +	sample.flags = etmq->prev_packet->flags;
>  	sample.cpumode = event->sample.header.misc;
>  
>  	/*
> @@ -1107,6 +1108,80 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq)
>  	return 0;
>  }
>  
> +static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
> +{
> +	struct cs_etm_packet *packet = etmq->packet;
> +
> +	packet->flags = 0;
> +
> +	switch (packet->sample_type) {
> +	case CS_ETM_RANGE:
> +		/*
> +		 * Immediate branch instruction without neither link nor
> +		 * return flag, it's normal branch instruction within
> +		 * the function.
> +		 */
> +		if (packet->last_instr_type == OCSD_INSTR_BR &&
> +		    packet->last_instr_subtype == OCSD_S_INSTR_NONE) {
> +			packet->flags = PERF_IP_FLAG_BRANCH;
> +
> +			if (packet->last_instr_cond)
> +				packet->flags |= PERF_IP_FLAG_CONDITIONAL;
> +		}
> +
> +		/*
> +		 * Immediate branch instruction with link (e.g. BL), this is
> +		 * branch instruction for function call.
> +		 */
> +		if (packet->last_instr_type == OCSD_INSTR_BR &&
> +		    packet->last_instr_subtype == OCSD_S_INSTR_BR_LINK)
> +			packet->flags = PERF_IP_FLAG_BRANCH |
> +					PERF_IP_FLAG_CALL;
> +
> +		/*
> +		 * Indirect branch instruction with link (e.g. BLR), this is
> +		 * branch instruction for function call.
> +		 */
> +		if (packet->last_instr_type == OCSD_INSTR_BR_INDIRECT &&
> +		    packet->last_instr_subtype == OCSD_S_INSTR_BR_LINK)
> +			packet->flags = PERF_IP_FLAG_BRANCH |
> +					PERF_IP_FLAG_CALL;
> +
> +		/*
> +		 * Indirect branch instruction with subtype of
> +		 * OCSD_S_INSTR_V7_IMPLIED_RET, this is explicit hint for
> +		 * function return for A32/T32.
> +		 */
> +		if (packet->last_instr_type == OCSD_INSTR_BR_INDIRECT &&
> +		    packet->last_instr_subtype == OCSD_S_INSTR_V7_IMPLIED_RET)
> +			packet->flags = PERF_IP_FLAG_BRANCH |
> +					PERF_IP_FLAG_RETURN;
> +
> +		/*
> +		 * Indirect branch instruction without link (e.g. BR), usually
> +		 * this is used for function return, especially for functions
> +		 * within dynamic link lib.
> +		 */
> +		if (packet->last_instr_type == OCSD_INSTR_BR_INDIRECT &&
> +		    packet->last_instr_subtype == OCSD_S_INSTR_NONE)
> +			packet->flags = PERF_IP_FLAG_BRANCH |
> +					PERF_IP_FLAG_RETURN;
> +
> +		/* Return instruction for function return. */
> +		if (packet->last_instr_type == OCSD_INSTR_BR_INDIRECT &&
> +		    packet->last_instr_subtype == OCSD_S_INSTR_V8_RET)
> +			packet->flags = PERF_IP_FLAG_BRANCH |
> +					PERF_IP_FLAG_RETURN;
Many thanks for the comments - it is really clear and helps understand what you are doing.
> +		break;
> +	case CS_ETM_DISCONTINUITY:
> +	case CS_ETM_EXCEPTION:
> +	case CS_ETM_EXCEPTION_RET:
> +	case CS_ETM_EMPTY:
> +	default:
> +		break;
> +	}
> +}
> +
>  static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
>  {
>  	struct cs_etm_auxtrace *etm = etmq->etm;
> @@ -1158,6 +1233,8 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
>  				 */
>  				break;
>  
Intuitively one would be tempted to call cs_etm__set_sample_flags() after the switch statement but that won't work since packet addresses are swapped in cs_etm__sample(). As such I think it is worth adding a comment here stressing the importance of doing flag processing before the switch() statement.
> +			cs_etm__set_sample_flags(etmq);
> +
>  			switch (etmq->packet->sample_type) {
>  			case CS_ETM_RANGE:
>  				/*
> --
> 2.7.4
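The ordering concern raised in this review can be reproduced with a toy model. The sketch below assumes, as the comment above states, that the sampling step swaps the current and previous packet pointers; all ord_ names are hypothetical and none of this is the real cs-etm.c code. It processes one packet either with the flags set before the sample/swap step or after it, so the two outcomes can be compared.

```c
#include <stdint.h>

struct ord_packet {
	uint32_t flags;
};

struct ord_queue {
	struct ord_packet *packet;	/* packet just decoded */
	struct ord_packet *prev_packet;	/* packet decoded before it */
};

/* Toy stand-in for the flag setting step: writes to the current packet. */
static void ord_set_flags(struct ord_queue *q, uint32_t flags)
{
	q->packet->flags = flags;
}

/* Toy stand-in for the sampling step: swaps packet and prev_packet. */
static void ord_sample_and_swap(struct ord_queue *q)
{
	struct ord_packet *tmp = q->packet;

	q->packet = q->prev_packet;
	q->prev_packet = tmp;
}

/* Process one packet; flags_first selects the order under discussion. */
static void ord_process(struct ord_queue *q, uint32_t flags, int flags_first)
{
	if (flags_first) {
		ord_set_flags(q, flags);
		ord_sample_and_swap(q);
	} else {
		ord_sample_and_swap(q);
		/* q->packet now points at the stale buffer. */
		ord_set_flags(q, flags);
	}
}
```

With flags_first set, the freshly decoded packet ends up as prev_packet carrying its flags, which is what the sample synthesis reads on the next iteration. With the other order, the flags land in the stale buffer that is about to be overwritten by the next decoded packet.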
Hi Mathieu,
On Thu, Dec 13, 2018 at 01:28:34PM -0700, Mathieu Poirier wrote:
> On Tue, Dec 11, 2018 at 11:01:08PM +0800, Leo Yan wrote:
> > The perf sample data contains flags to indicate the hardware trace data is belonging to which type branch instruction, thus this can be used to print out the human readable string. Arm CoreSight ETM sample data is missed to set flags and it is always set to zeros, this results in perf tool skips to print string for instruction types.
> > This patch is to set branch instruction flags for instruction range packet,
> Remove everything between here and your SOB. It is too long and you already added valuable comments to function cs_etm__set_sample_flags().
Will do.
> > diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> > index 0ffa7c5..516a0fb 100644
> > --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> > +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> > @@ -49,6 +49,7 @@ struct cs_etm_packet {
> >  	u32 last_instr_subtype;
> >  	u8 last_instr_cond;
> >  	int cpu;
> > +	u32 flags;
> Add this just below @last_instr_subtype so that types are grouped together.
Will do it.
> Many thanks for the comments - it is really clear and helps understand what you are doing.
Yeah, writing them also helped me get a clear idea.
> Intuitively one would be tempted to call cs_etm__set_sample_flags() after the switch statement but that won't work since packet addresses are swapped in cs_etm__sample(). As such I think it is worth adding a comment here stressing the importance of doing flag processing before the switch() statement.
Indeed. Will add a comment for this.
The comments on the other patches are also acked; I will follow up on them in the next series. Thanks a lot for the review and suggestions!
Leo Yan
In the middle of a trace stream, tracing might be interrupted so that the trace data is not continuous; the trace stream is first ended for the previous trace block and then restarted for the next block.
To show that the trace has been restarted, this patch sets sample flags for trace discontinuity:
- If a discontinuity packet arrives, append flag PERF_IP_FLAG_TRACE_END to the previous packet to indicate that the trace has ended;
- If an instruction packet follows a discontinuity packet, this instruction packet is the first packet after restarting the trace. So set flag PERF_IP_FLAG_TRACE_BEGIN on the discontinuity packet; this flag will be used to generate the sample when connecting with the subsequent instruction packet.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 3ad0b87..bc8a4bc 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1111,6 +1111,7 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq)
 static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 {
 	struct cs_etm_packet *packet = etmq->packet;
+	struct cs_etm_packet *prev_packet = etmq->prev_packet;

 	packet->flags = 0;

@@ -1172,8 +1173,26 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 		    packet->last_instr_subtype == OCSD_S_INSTR_V8_RET)
 			packet->flags = PERF_IP_FLAG_BRANCH |
 					PERF_IP_FLAG_RETURN;
+
+		/*
+		 * Decoder might insert a discontinuity in the middle of
+		 * instruction packets, fixup prev_packet with flag
+		 * PERF_IP_FLAG_TRACE_BEGIN to indicate restarting trace.
+		 */
+		if (prev_packet->sample_type == CS_ETM_DISCONTINUITY)
+			prev_packet->flags |= PERF_IP_FLAG_BRANCH |
+					      PERF_IP_FLAG_TRACE_BEGIN;
 		break;
 	case CS_ETM_DISCONTINUITY:
+		/*
+		 * The trace is discontinuous, if the previous packet is
+		 * instruction packet, set flag PERF_IP_FLAG_TRACE_END
+		 * for previous packet.
+		 */
+		if (prev_packet->sample_type == CS_ETM_RANGE)
+			prev_packet->flags |= PERF_IP_FLAG_BRANCH |
+					      PERF_IP_FLAG_TRACE_END;
+		break;
 	case CS_ETM_EXCEPTION:
 	case CS_ETM_EXCEPTION_RET:
 	case CS_ETM_EMPTY:
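The two fix-up rules for trace discontinuity can be tried out in isolation with a sketch like the one below. The toy_* names and the flag bit values are placeholders chosen for illustration and are not the definitions used by perf; only the rule itself (mark the previous range packet as trace end, mark the previous discontinuity packet as trace begin) is taken from the patch.

```c
#include <stdint.h>

/* Hypothetical packet types and flag bits, for illustration only. */
enum toy_type { TOY_EMPTY, TOY_RANGE, TOY_DISCONTINUITY };

#define TOY_FLAG_BRANCH		(1u << 0)
#define TOY_FLAG_TRACE_BEGIN	(1u << 1)
#define TOY_FLAG_TRACE_END	(1u << 2)

struct toy_packet {
	enum toy_type type;
	uint32_t flags;
};

/*
 * Apply the fix-up to the previous packet, depending on the type of
 * the packet that has just arrived.
 */
static void toy_fixup_discontinuity(const struct toy_packet *packet,
				    struct toy_packet *prev)
{
	switch (packet->type) {
	case TOY_RANGE:
		/* First range after a discontinuity: the trace restarts. */
		if (prev->type == TOY_DISCONTINUITY)
			prev->flags |= TOY_FLAG_BRANCH | TOY_FLAG_TRACE_BEGIN;
		break;
	case TOY_DISCONTINUITY:
		/* Discontinuity after a range: the trace has ended. */
		if (prev->type == TOY_RANGE)
			prev->flags |= TOY_FLAG_BRANCH | TOY_FLAG_TRACE_END;
		break;
	default:
		break;
	}
}
```

Feeding the sequence range, discontinuity, range through this helper marks the first range packet with the trace-end bit and the discontinuity packet with the trace-begin bit.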
On Tue, Dec 11, 2018 at 11:01:09PM +0800, Leo Yan wrote:
In the middle of trace stream, it might be interrupted thus the trace data is not discontinuous, the trace stream firstly is ended for
s/discontinuous/continuous
previous trace block and restarted for next block.
To show that the trace has been restarted, this patch sets sample flags for trace discontinuity:
- If a discontinuity packet arrives, append flag PERF_IP_FLAG_TRACE_END to the previous packet to indicate that the trace has ended;
- If an instruction packet follows a discontinuity packet, that instruction packet is the first packet of the restarted trace. So set flag PERF_IP_FLAG_TRACE_BEGIN on the discontinuity packet; this flag is used to generate a sample when the packet is connected with the following instruction packet.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 3ad0b87..bc8a4bc 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1111,6 +1111,7 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq)
 static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 {
 	struct cs_etm_packet *packet = etmq->packet;
+	struct cs_etm_packet *prev_packet = etmq->prev_packet;

 	packet->flags = 0;

@@ -1172,8 +1173,26 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 		    packet->last_instr_subtype == OCSD_S_INSTR_V8_RET)
 			packet->flags = PERF_IP_FLAG_BRANCH |
 					PERF_IP_FLAG_RETURN;
+
+		/*
+		 * Decoder might insert a discontinuity in the middle of
+		 * instruction packets, fixup prev_packet with flag
+		 * PERF_IP_FLAG_TRACE_BEGIN to indicate restarting trace.
+		 */
+		if (prev_packet->sample_type == CS_ETM_DISCONTINUITY)
+			prev_packet->flags |= PERF_IP_FLAG_BRANCH |
+					      PERF_IP_FLAG_TRACE_BEGIN;
 		break;
 	case CS_ETM_DISCONTINUITY:
+		/*
+		 * The trace is discontinuous, if the previous packet is
+		 * instruction packet, set flag PERF_IP_FLAG_TRACE_END
+		 * for previous packet.
+		 */
+		if (prev_packet->sample_type == CS_ETM_RANGE)
+			prev_packet->flags |= PERF_IP_FLAG_BRANCH |
+					      PERF_IP_FLAG_TRACE_END;
+		break;
 	case CS_ETM_EXCEPTION:
 	case CS_ETM_EXCEPTION_RET:
 	case CS_ETM_EMPTY:
-- 2.7.4
When an exception packet comes, it contains the exception number; the exception number indicates the exception type, so from it we can know if the exception is taken for an interrupt, a system call or another trap, etc. There is one limitation for the exception return packet: the decoder cannot deliver the exception number correctly for it, so we must reuse the exception number delivered by the previous exception packet to know the type of the exception return.
This patch simply adds a new 'exc_num' field in the cs_etm_packet struct, which records the exception number for an exception packet. The decision for exception return is handled later in cs-etm.c, which has more context about the packet flow.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 20 ++++++++++++++++----
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h |  1 +
 2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index 8a19310..19e7fd4 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -293,6 +293,7 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder)
 		decoder->packet_buffer[i].last_instr_type = 0;
 		decoder->packet_buffer[i].last_instr_subtype = 0;
 		decoder->packet_buffer[i].last_instr_cond = 0;
+		decoder->packet_buffer[i].exc_num = UINT32_MAX;
 		decoder->packet_buffer[i].cpu = INT_MIN;
 	}
 }
@@ -329,6 +330,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
 	decoder->packet_buffer[et].last_instr_type = 0;
 	decoder->packet_buffer[et].last_instr_subtype = 0;
 	decoder->packet_buffer[et].last_instr_cond = 0;
+	decoder->packet_buffer[et].exc_num = UINT32_MAX;

 	if (decoder->packet_count == MAX_BUFFER - 1)
 		return OCSD_RESP_WAIT;
@@ -404,10 +406,20 @@ cs_etm_decoder__buffer_discontinuity(struct cs_etm_decoder *decoder,

 static ocsd_datapath_resp_t
 cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
+				 const ocsd_generic_trace_elem *elem,
 				 const uint8_t trace_chan_id)
-{
-	return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
-					     CS_ETM_EXCEPTION);
+{	int ret = 0;
+	struct cs_etm_packet *packet;
+
+	ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
+					    CS_ETM_EXCEPTION);
+	if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
+		return ret;
+
+	packet = &decoder->packet_buffer[decoder->tail];
+	packet->exc_num = elem->exception_number;
+
+	return ret;
 }

 static ocsd_datapath_resp_t
@@ -441,7 +453,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
 						    trace_chan_id);
 		break;
 	case OCSD_GEN_TRC_ELEM_EXCEPTION:
-		resp = cs_etm_decoder__buffer_exception(decoder,
+		resp = cs_etm_decoder__buffer_exception(decoder, elem,
 							trace_chan_id);
 		break;
 	case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
index 516a0fb..26ae961 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -48,6 +48,7 @@ struct cs_etm_packet {
 	u32 last_instr_type;
 	u32 last_instr_subtype;
 	u8 last_instr_cond;
+	u32 exc_num;
 	int cpu;
 	u32 flags;
 };
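The pattern used by this patch (every buffered packet starts with an invalid exception number, and only exception packets overwrite it) can be sketched outside perf roughly as follows. The ring buffer, its size and all toy_* names are assumptions made for the example; the real decoder uses its own buffer layout and OpenCSD response codes rather than plain integers.

```c
#include <stdint.h>
#include <string.h>

#define TOY_MAX_BUFFER 4	/* power of two so the index wraps with a mask */

enum toy_type { TOY_RANGE, TOY_EXCEPTION };

struct toy_packet {
	enum toy_type type;
	uint32_t exc_num;	/* UINT32_MAX means "no exception number" */
};

struct toy_decoder {
	struct toy_packet buf[TOY_MAX_BUFFER];
	unsigned int tail;	/* index of the most recently queued packet */
	unsigned int count;
};

static void toy_init(struct toy_decoder *d)
{
	unsigned int i;

	memset(d, 0, sizeof(*d));
	for (i = 0; i < TOY_MAX_BUFFER; i++)
		d->buf[i].exc_num = UINT32_MAX;
}

/* Queue a packet of any type; the exception number starts out invalid. */
static int toy_buffer_packet(struct toy_decoder *d, enum toy_type type)
{
	if (d->count >= TOY_MAX_BUFFER)
		return -1;

	d->tail = (d->tail + 1) & (TOY_MAX_BUFFER - 1);
	d->count++;
	d->buf[d->tail].type = type;
	d->buf[d->tail].exc_num = UINT32_MAX;
	return 0;
}

/* Queue an exception packet and record the number reported for it. */
static int toy_buffer_exception(struct toy_decoder *d,
				uint32_t exception_number)
{
	int ret = toy_buffer_packet(d, TOY_EXCEPTION);

	if (ret)
		return ret;

	d->buf[d->tail].exc_num = exception_number;
	return 0;
}
```

Using UINT32_MAX as the "unset" value keeps zero available as a real exception number, which matters because both exception-number enumerations later in this series start at zero.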
On Tue, Dec 11, 2018 at 11:01:10PM +0800, Leo Yan wrote:
When an exception packet comes, it contains the exception number; the exception number indicates the exception type, so from it we can know if the exception is taken for an interrupt, a system call or another trap, etc. There is one limitation for the exception return packet: the decoder cannot deliver the exception number correctly for it, so we must reuse the exception number delivered by the previous exception packet to know the type of the exception return.
This patch simply adds a new 'exc_num' field in the cs_etm_packet struct, which records the exception number for an exception packet. The decision for exception return is handled later in cs-etm.c, which has more context about the packet flow.
Remove the last sentence, it doesn't add anything to the context of the patch.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 20 ++++++++++++++++----
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h |  1 +
 2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index 8a19310..19e7fd4 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -293,6 +293,7 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder)
 		decoder->packet_buffer[i].last_instr_type = 0;
 		decoder->packet_buffer[i].last_instr_subtype = 0;
 		decoder->packet_buffer[i].last_instr_cond = 0;
+		decoder->packet_buffer[i].exc_num = UINT32_MAX;
 		decoder->packet_buffer[i].cpu = INT_MIN;
 	}
 }
@@ -329,6 +330,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
 	decoder->packet_buffer[et].last_instr_type = 0;
 	decoder->packet_buffer[et].last_instr_subtype = 0;
 	decoder->packet_buffer[et].last_instr_cond = 0;
+	decoder->packet_buffer[et].exc_num = UINT32_MAX;

 	if (decoder->packet_count == MAX_BUFFER - 1)
 		return OCSD_RESP_WAIT;
@@ -404,10 +406,20 @@ cs_etm_decoder__buffer_discontinuity(struct cs_etm_decoder *decoder,

 static ocsd_datapath_resp_t
 cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
+				 const ocsd_generic_trace_elem *elem,
 				 const uint8_t trace_chan_id)
-{
-	return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
-					     CS_ETM_EXCEPTION);
+{	int ret = 0;
+	struct cs_etm_packet *packet;
+
+	ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
+					    CS_ETM_EXCEPTION);
+	if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
+		return ret;
+
+	packet = &decoder->packet_buffer[decoder->tail];
+	packet->exc_num = elem->exception_number;
+
+	return ret;
 }

 static ocsd_datapath_resp_t
@@ -441,7 +453,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
 						    trace_chan_id);
 		break;
 	case OCSD_GEN_TRC_ELEM_EXCEPTION:
-		resp = cs_etm_decoder__buffer_exception(decoder,
+		resp = cs_etm_decoder__buffer_exception(decoder, elem,
 							trace_chan_id);
 		break;
 	case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
index 516a0fb..26ae961 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -48,6 +48,7 @@ struct cs_etm_packet {
 	u32 last_instr_type;
 	u32 last_instr_subtype;
 	u8 last_instr_cond;
+	u32 exc_num;
Why not simply call it "exception_number" like in the library? Otherwise someone might be misled into thinking you mean "execution" number (I certainly did before reading the changelog).
 	int cpu;
 	u32 flags;
 };
-- 2.7.4
On Tue, Dec 11, 2018 at 11:01:10PM +0800, Leo Yan wrote:
When an exception packet comes, it contains the exception number; the exception number indicates the exception type, so from it we can know if the exception is taken for an interrupt, a system call or another trap, etc. There is one limitation for the exception return packet: the decoder cannot deliver the exception number correctly for it, so we must reuse the exception number delivered by the previous exception packet to know the type of the exception return.
This patch simply adds a new 'exc_num' field in the cs_etm_packet struct, which records the exception number for an exception packet. The decision for exception return is handled later in cs-etm.c, which has more context about the packet flow.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 20 ++++++++++++++++----
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h |  1 +
 2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index 8a19310..19e7fd4 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -293,6 +293,7 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder)
 		decoder->packet_buffer[i].last_instr_type = 0;
 		decoder->packet_buffer[i].last_instr_subtype = 0;
 		decoder->packet_buffer[i].last_instr_cond = 0;
+		decoder->packet_buffer[i].exc_num = UINT32_MAX;
 		decoder->packet_buffer[i].cpu = INT_MIN;
 	}
 }
@@ -329,6 +330,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
 	decoder->packet_buffer[et].last_instr_type = 0;
 	decoder->packet_buffer[et].last_instr_subtype = 0;
 	decoder->packet_buffer[et].last_instr_cond = 0;
+	decoder->packet_buffer[et].exc_num = UINT32_MAX;

 	if (decoder->packet_count == MAX_BUFFER - 1)
 		return OCSD_RESP_WAIT;
@@ -404,10 +406,20 @@ cs_etm_decoder__buffer_discontinuity(struct cs_etm_decoder *decoder,

 static ocsd_datapath_resp_t
 cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
+				 const ocsd_generic_trace_elem *elem,
 				 const uint8_t trace_chan_id)
-{
-	return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
-					     CS_ETM_EXCEPTION);
+{	int ret = 0;
+	struct cs_etm_packet *packet;
+
+	ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
+					    CS_ETM_EXCEPTION);
+	if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
+		return ret;
+
+	packet = &decoder->packet_buffer[decoder->tail];
+	packet->exc_num = elem->exception_number;
+
+	return ret;
 }

 static ocsd_datapath_resp_t
@@ -441,7 +453,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
 						    trace_chan_id);
 		break;
 	case OCSD_GEN_TRC_ELEM_EXCEPTION:
-		resp = cs_etm_decoder__buffer_exception(decoder,
+		resp = cs_etm_decoder__buffer_exception(decoder, elem,
 							trace_chan_id);
 		break;
 	case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
index 516a0fb..26ae961 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -48,6 +48,7 @@ struct cs_etm_packet {
 	u32 last_instr_type;
 	u32 last_instr_subtype;
 	u8 last_instr_cond;
+	u32 exc_num;
Move this up one line.
 	int cpu;
 	u32 flags;
 };
-- 2.7.4
Exception taken and exception return are typical instruction-jump flows, but they need to be handled with exception packets. This patch sets sample flags for the exception packet and the exception return packet.
Since the exception packet contains the exception number, this patch divides exceptions into three classes according to that number:
The first class of exceptions is caused by external logic such as the bus, the interrupt controller, the debug module, or a PE reset or halt; this corresponds to flags "bcyi", which are defined in the perf-script.txt documentation;
The second class is for system calls; it is set as "bcs", following the definition in the documentation;
The third class is for CPU traps, data and instruction prefetch aborts, and alignment aborts; these exceptions are usually synchronous for the CPU, so they are set as "bci".
As the exception return packet doesn't contain a valid exception number, the exception number is recorded in the cs_etm_queue struct from the previous exception packet, and this value is reused by the exception return packet. To avoid using a stale exception number, the exception number is reset to UINT_MAX after handling an exception return packet or when a discontinuity packet arrives.
Neither the exception packet nor the exception return packet is a standalone packet that can be used to generate samples; they must be associated with instruction range packets for sample generation. The sample flags of the previous instruction range packet are assigned from the exception packet or exception return packet that follows it.
The decoder defines different exception numbers for ETMv3 and ETMv4, hence this patch first needs to determine the ETM version from the metadata magic number and then use the corresponding exception numbers for that ETM version.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/cs-etm.h |  36 ++++++++++++++
 2 files changed, 163 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index bc8a4bc..0c917b1 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -73,6 +73,7 @@ struct cs_etm_queue {
 	u64 timestamp;
 	u64 offset;
 	u64 period_instructions;
+	u32 exc_num;
 	struct branch_stack *last_branch;
 	struct branch_stack *last_branch_rb;
 	size_t last_branch_pos;
@@ -1108,6 +1109,73 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq)
 	return 0;
 }

+static bool cs_etm__is_syscall(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_SVC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_CALL)
+			return true;
+
+	return false;
+}
+
+static bool cs_etm__is_async_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV3_EXC_ASYNC_DATA_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_PE_RESET ||
+		    etmq->exc_num == CS_ETMV3_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV3_EXC_FIQ)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_RESET ||
+		    etmq->exc_num == CS_ETMV4_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV4_EXC_SYSTEM_ERROR ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV4_EXC_FIQ)
+			return true;
+
+	return false;
+}
+
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
+		    etmq->exc_num == CS_ETMV3_EXC_HYP ||
+		    etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
+		    etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
+		    etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
+		    etmq->exc_num == CS_ETMV3_EXC_GENERIC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_TRAP ||
+		    etmq->exc_num == CS_ETMV4_EXC_ALIGNMENT ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_FAULT ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_FAULT)
+			return true;
+
+	return false;
+}
+
 static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 {
 	struct cs_etm_packet *packet = etmq->packet;
@@ -1192,9 +1260,67 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 		if (prev_packet->sample_type == CS_ETM_RANGE)
 			prev_packet->flags |= PERF_IP_FLAG_BRANCH |
 					      PERF_IP_FLAG_TRACE_END;
+
+		etmq->exc_num = UINT_MAX;
 		break;
 	case CS_ETM_EXCEPTION:
+		etmq->exc_num = packet->exc_num;
+
+		/* The exception is for system call. */
+		if (cs_etm__is_syscall(etmq))
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_CALL |
+					PERF_IP_FLAG_SYSCALLRET;
+		/*
+		 * The exceptions are triggered by external signals from bus,
+		 * interrupt controller, debug module, PE reset or halt.
+		 */
+		else if (cs_etm__is_async_exception(etmq))
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_CALL |
+					PERF_IP_FLAG_ASYNC |
+					PERF_IP_FLAG_INTERRUPT;
+		/*
+		 * Otherwise, exception is caused by trap, instruction &
+		 * data fault, or alignment errors.
+		 */
+		else if (cs_etm__is_sync_exception(etmq))
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_CALL |
+					PERF_IP_FLAG_INTERRUPT;
+
+		/*
+		 * When the exception packet is inserted, since exception
+		 * packet is not used standalone for generating samples
+		 * and it's affiliation to the previous instruction range
+		 * packet; so set previous range packet flags to tell perf
+		 * it is an exception taken branch.
+		 */
+		if (prev_packet->sample_type == CS_ETM_RANGE)
+			prev_packet->flags = packet->flags;
+		break;
 	case CS_ETM_EXCEPTION_RET:
+		if (cs_etm__is_syscall(etmq))
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_RETURN |
+					PERF_IP_FLAG_SYSCALLRET;
+		else
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_RETURN |
+					PERF_IP_FLAG_INTERRUPT;
+
+		/*
+		 * When the exception return packet is inserted, since
+		 * exception return packet is not used standalone for
+		 * generating samples and it's affiliation to the previous
+		 * instruction range packet; so set previous range packet
+		 * flags to tell perf it is an exception return branch.
+		 */
+		if (prev_packet->sample_type == CS_ETM_RANGE)
+			prev_packet->flags = packet->flags;
+
+		etmq->exc_num = UINT_MAX;
+		break;
 	case CS_ETM_EMPTY:
 	default:
 		break;
@@ -1553,6 +1679,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 				err = -ENOMEM;
 				goto err_free_metadata;
 			}
+
 			for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
 				metadata[j][k] = ptr[i + k];

diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 37f8d48..fa46ff6 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -39,6 +39,26 @@ enum {
 	CS_ETM_PRIV_MAX,
 };

+/* ETMv3 exception number */
+enum {
+	CS_ETMV3_EXC_NONE,
+	CS_ETMV3_EXC_DEBUG_HALT,
+	CS_ETMV3_EXC_SMC,
+	CS_ETMV3_EXC_HYP,
+	CS_ETMV3_EXC_ASYNC_DATA_ABORT,
+	CS_ETMV3_EXC_JAZELLE,
+	CS_ETMV3_EXC_RESERVED1,
+	CS_ETMV3_EXC_RESERVED2,
+	CS_ETMV3_EXC_PE_RESET,
+	CS_ETMV3_EXC_UNDEFINED_INSTR,
+	CS_ETMV3_EXC_SVC,
+	CS_ETMV3_EXC_PREFETCH_ABORT,
+	CS_ETMV3_EXC_DATA_FAULT,
+	CS_ETMV3_EXC_GENERIC,
+	CS_ETMV3_EXC_IRQ,
+	CS_ETMV3_EXC_FIQ,
+};
+
 /* ETMv4 metadata */
 enum {
 	/* Dynamic, configurable parameters */
@@ -53,6 +73,22 @@ enum {
 	CS_ETMV4_PRIV_MAX,
 };

+/* ETMv4 exception number */
+enum {
+	CS_ETMV4_EXC_RESET,
+	CS_ETMV4_EXC_DEBUG_HALT,
+	CS_ETMV4_EXC_CALL,
+	CS_ETMV4_EXC_TRAP,
+	CS_ETMV4_EXC_SYSTEM_ERROR,
+	CS_ETMV4_EXC_INST_DEBUG,
+	CS_ETMV4_EXC_DATA_DEBUG,
+	CS_ETMV4_EXC_ALIGNMENT,
+	CS_ETMV4_EXC_INST_FAULT,
+	CS_ETMV4_EXC_DATA_FAULT,
+	CS_ETMV4_EXC_IRQ,
+	CS_ETMV4_EXC_FIQ,
+};
+
 /* RB tree for quick conversion between traceID and CPUs */
 struct intlist *traceid_list;
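How the saved exception number is reused by the exception return packet, and why the ETM version has to be checked first, can be seen in a reduced sketch. Everything named toy_* is hypothetical; the numeric values 10 and 2 simply follow from the positions of CS_ETMV3_EXC_SVC and CS_ETMV4_EXC_CALL in the enumerations above, and the two flag bits are placeholders rather than the perf definitions.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_etm_version { TOY_ETMV3, TOY_ETMV4 };

/* Positions taken from the enums in the patch (counting from zero). */
enum { TOY_V3_EXC_SVC = 10 };
enum { TOY_V4_EXC_CALL = 2, TOY_V4_EXC_IRQ = 10 };

/* Placeholder flag bits, not the PERF_IP_FLAG_* values. */
#define TOY_FLAG_SYSCALLRET	0x1u
#define TOY_FLAG_INTERRUPT	0x2u

struct toy_queue {
	enum toy_etm_version version;
	uint32_t exc_num;	/* UINT_MAX means "nothing recorded" */
};

/* The same number means different things on ETMv3 and ETMv4. */
static bool toy_is_syscall(const struct toy_queue *q)
{
	if (q->version == TOY_ETMV3)
		return q->exc_num == TOY_V3_EXC_SVC;

	return q->exc_num == TOY_V4_EXC_CALL;
}

/* Exception packet: remember the number for the matching return. */
static void toy_on_exception(struct toy_queue *q, uint32_t exc_num)
{
	q->exc_num = exc_num;
}

/* Exception return packet: reuse the saved number, then invalidate it. */
static uint32_t toy_on_exception_ret(struct toy_queue *q)
{
	uint32_t flags = toy_is_syscall(q) ? TOY_FLAG_SYSCALLRET
					   : TOY_FLAG_INTERRUPT;

	q->exc_num = UINT_MAX;
	return flags;
}

/* Discontinuity packet: the saved number can no longer be trusted. */
static void toy_on_discontinuity(struct toy_queue *q)
{
	q->exc_num = UINT_MAX;
}
```

In this sketch exception number 10 is a system call on ETMv3 but an IRQ on ETMv4, so skipping the version check would classify the return incorrectly on one of the two.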
Hi Mathieu, Mike,
On Tue, Dec 11, 2018 at 11:01:11PM +0800, Leo Yan wrote:
Exception taken and exception return are typical instruction-jump flows, but they need to be handled with exception packets. This patch sets sample flags for the exception packet and the exception return packet.
Since the exception packet contains the exception number, this patch divides exceptions into three classes according to that number:
The first class of exceptions is caused by external logic such as the bus, the interrupt controller, the debug module, or a PE reset or halt; this corresponds to flags "bcyi", which are defined in the perf-script.txt documentation;
The second class is for system calls; it is set as "bcs", following the definition in the documentation;
The third class is for CPU traps, data and instruction prefetch aborts, and alignment aborts; these exceptions are usually synchronous for the CPU, so they are set as "bci".
As the exception return packet doesn't contain a valid exception number, the exception number is recorded in the cs_etm_queue struct from the previous exception packet, and this value is reused by the exception return packet. To avoid using a stale exception number, the exception number is reset to UINT_MAX after handling an exception return packet or when a discontinuity packet arrives.
Neither the exception packet nor the exception return packet is a standalone packet that can be used to generate samples; they must be associated with instruction range packets for sample generation. The sample flags of the previous instruction range packet are assigned from the exception packet or exception return packet that follows it.
This patch series is sent to you and the CoreSight mailing list for internal review before sending it to LKML. I still have several questions I want to check with you first, so please see the comments below.
Comments and suggestions are also very welcome.
The decoder defines different exception numbers for ETMv3 and ETMv4, hence this patch first needs to determine the ETM version from the metadata magic number and then use the corresponding exception numbers for that ETM version.
Should I divide this into two patches, one for ETMv3 and another for ETMv4?
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/cs-etm.h |  36 ++++++++++++++
 2 files changed, 163 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index bc8a4bc..0c917b1 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -73,6 +73,7 @@ struct cs_etm_queue {
 	u64 timestamp;
 	u64 offset;
 	u64 period_instructions;
+	u32 exc_num;
 	struct branch_stack *last_branch;
 	struct branch_stack *last_branch_rb;
 	size_t last_branch_pos;
@@ -1108,6 +1109,73 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq)
 	return 0;
 }

+static bool cs_etm__is_syscall(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_SVC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_CALL)
+			return true;
+
+	return false;
+}
+
+static bool cs_etm__is_async_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV3_EXC_ASYNC_DATA_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_PE_RESET ||
+		    etmq->exc_num == CS_ETMV3_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV3_EXC_FIQ)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_RESET ||
+		    etmq->exc_num == CS_ETMV4_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV4_EXC_SYSTEM_ERROR ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV4_EXC_FIQ)
+			return true;
+
+	return false;
+}
+
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
+		    etmq->exc_num == CS_ETMV3_EXC_HYP ||
+		    etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
+		    etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
+		    etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
+		    etmq->exc_num == CS_ETMV3_EXC_GENERIC)
+			return true;
Mike, could you help confirm what's the exception type for CS_ETMV3_EXC_JAZELLE/CS_ETMV3_EXC_UNDEFINED_INSTR/CS_ETMV3_EXC_GENERIC?
Are they synchronous exceptions?
Thanks, Leo Yan
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_TRAP ||
+		    etmq->exc_num == CS_ETMV4_EXC_ALIGNMENT ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_FAULT ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_FAULT)
+			return true;
+
+	return false;
+}
+
 static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 {
 	struct cs_etm_packet *packet = etmq->packet;
@@ -1192,9 +1260,67 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 		if (prev_packet->sample_type == CS_ETM_RANGE)
 			prev_packet->flags |= PERF_IP_FLAG_BRANCH |
 					      PERF_IP_FLAG_TRACE_END;
+
+		etmq->exc_num = UINT_MAX;
 		break;
 	case CS_ETM_EXCEPTION:
+		etmq->exc_num = packet->exc_num;
+
+		/* The exception is for system call. */
+		if (cs_etm__is_syscall(etmq))
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_CALL |
+					PERF_IP_FLAG_SYSCALLRET;
+		/*
+		 * The exceptions are triggered by external signals from bus,
+		 * interrupt controller, debug module, PE reset or halt.
+		 */
+		else if (cs_etm__is_async_exception(etmq))
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_CALL |
+					PERF_IP_FLAG_ASYNC |
+					PERF_IP_FLAG_INTERRUPT;
+		/*
+		 * Otherwise, exception is caused by trap, instruction &
+		 * data fault, or alignment errors.
+		 */
+		else if (cs_etm__is_sync_exception(etmq))
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_CALL |
+					PERF_IP_FLAG_INTERRUPT;
+
+		/*
+		 * When the exception packet is inserted, since exception
+		 * packet is not used standalone for generating samples
+		 * and it's affiliation to the previous instruction range
+		 * packet; so set previous range packet flags to tell perf
+		 * it is an exception taken branch.
+		 */
+		if (prev_packet->sample_type == CS_ETM_RANGE)
+			prev_packet->flags = packet->flags;
+		break;
 	case CS_ETM_EXCEPTION_RET:
+		if (cs_etm__is_syscall(etmq))
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_RETURN |
+					PERF_IP_FLAG_SYSCALLRET;
+		else
+			packet->flags = PERF_IP_FLAG_BRANCH |
+					PERF_IP_FLAG_RETURN |
+					PERF_IP_FLAG_INTERRUPT;
+
+		/*
+		 * When the exception return packet is inserted, since
+		 * exception return packet is not used standalone for
+		 * generating samples and it's affiliation to the previous
+		 * instruction range packet; so set previous range packet
+		 * flags to tell perf it is an exception return branch.
+		 */
+		if (prev_packet->sample_type == CS_ETM_RANGE)
+			prev_packet->flags = packet->flags;
+
+		etmq->exc_num = UINT_MAX;
+		break;
 	case CS_ETM_EMPTY:
 	default:
 		break;
@@ -1553,6 +1679,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 				err = -ENOMEM;
 				goto err_free_metadata;
 			}
+
 			for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
 				metadata[j][k] = ptr[i + k];

diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 37f8d48..fa46ff6 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -39,6 +39,26 @@ enum {
 	CS_ETM_PRIV_MAX,
 };

+/* ETMv3 exception number */
+enum {
+	CS_ETMV3_EXC_NONE,
+	CS_ETMV3_EXC_DEBUG_HALT,
+	CS_ETMV3_EXC_SMC,
+	CS_ETMV3_EXC_HYP,
+	CS_ETMV3_EXC_ASYNC_DATA_ABORT,
+	CS_ETMV3_EXC_JAZELLE,
+	CS_ETMV3_EXC_RESERVED1,
+	CS_ETMV3_EXC_RESERVED2,
+	CS_ETMV3_EXC_PE_RESET,
+	CS_ETMV3_EXC_UNDEFINED_INSTR,
+	CS_ETMV3_EXC_SVC,
+	CS_ETMV3_EXC_PREFETCH_ABORT,
+	CS_ETMV3_EXC_DATA_FAULT,
+	CS_ETMV3_EXC_GENERIC,
+	CS_ETMV3_EXC_IRQ,
+	CS_ETMV3_EXC_FIQ,
+};
+
 /* ETMv4 metadata */
 enum {
 	/* Dynamic, configurable parameters */
@@ -53,6 +73,22 @@ enum {
 	CS_ETMV4_PRIV_MAX,
 };

+/* ETMv4 exception number */
+enum {
+	CS_ETMV4_EXC_RESET,
+	CS_ETMV4_EXC_DEBUG_HALT,
+	CS_ETMV4_EXC_CALL,
+	CS_ETMV4_EXC_TRAP,
+	CS_ETMV4_EXC_SYSTEM_ERROR,
+	CS_ETMV4_EXC_INST_DEBUG,
+	CS_ETMV4_EXC_DATA_DEBUG,
+	CS_ETMV4_EXC_ALIGNMENT,
+	CS_ETMV4_EXC_INST_FAULT,
+	CS_ETMV4_EXC_DATA_FAULT,
+	CS_ETMV4_EXC_IRQ,
+	CS_ETMV4_EXC_FIQ,
+};
+
 /* RB tree for quick conversion between traceID and CPUs */
 struct intlist *traceid_list;
-- 2.7.4
Hi Leo,
On Tue, 11 Dec 2018 at 15:08, leo.yan@linaro.org wrote:
Hi Mathieu, Mike,
On Tue, Dec 11, 2018 at 11:01:11PM +0800, Leo Yan wrote:
Exception taken and exception return are typical instruction-jump flows, but they need to be handled with exception packets. This patch sets sample flags for the exception packet and the exception return packet.
Since the exception packet contains the exception number, this patch divides exceptions into three classes according to that number:
The first class of exceptions is caused by external logic such as the bus, the interrupt controller, the debug module, or a PE reset or halt; this corresponds to flags "bcyi", which are defined in the perf-script.txt documentation;
The second class is for system calls; it is set as "bcs", following the definition in the documentation;
The third class is for CPU traps, data and instruction prefetch aborts, and alignment aborts; these exceptions are usually synchronous for the CPU, so they are set as "bci".
As the exception return packet doesn't contain a valid exception number, the exception number is recorded in the cs_etm_queue struct from the previous exception packet, and this value is reused by the exception return packet. To avoid using a stale exception number, the exception number is reset to UINT_MAX after handling an exception return packet or when a discontinuity packet arrives.
Neither the exception packet nor the exception return packet is a standalone packet that can be used to generate samples; they must be associated with instruction range packets for sample generation. The sample flags of the previous instruction range packet are assigned from the exception packet or exception return packet that follows it.
This patch series is sent to you and the CoreSight mailing list for internal review before sending it to LKML. I still have several questions I want to check with you first, so please see the comments below.
Comments and suggestions are also very welcome.
The decoder defines different exception numbers for ETMv3 and ETMv4, hence this patch first needs to determine the ETM version from the metadata magic number and then use the corresponding exception numbers for that ETM version.
Should I divide this into two patches, one for ETMv3 and another for ETMv4?
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/cs-etm.h |  36 ++++++++++++++
 2 files changed, 163 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index bc8a4bc..0c917b1 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -73,6 +73,7 @@ struct cs_etm_queue { u64 timestamp; u64 offset; u64 period_instructions;
u32 exc_num; struct branch_stack *last_branch; struct branch_stack *last_branch_rb; size_t last_branch_pos;
@@ -1108,6 +1109,73 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq) return 0; }
+static bool cs_etm__is_syscall(struct cs_etm_queue *etmq) +{
struct cs_etm_auxtrace *etm = etmq->etm;
int cpu = etmq->packet->cpu;
if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
if (etmq->exc_num == CS_ETMV3_EXC_SVC)
return true;
if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
if (etmq->exc_num == CS_ETMV4_EXC_CALL)
return true;
return false;
+}
+static bool cs_etm__is_async_exception(struct cs_etm_queue *etmq) +{
struct cs_etm_auxtrace *etm = etmq->etm;
int cpu = etmq->packet->cpu;
if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
if (etmq->exc_num == CS_ETMV3_EXC_DEBUG_HALT ||
etmq->exc_num == CS_ETMV3_EXC_ASYNC_DATA_ABORT ||
etmq->exc_num == CS_ETMV3_EXC_PE_RESET ||
etmq->exc_num == CS_ETMV3_EXC_IRQ ||
etmq->exc_num == CS_ETMV3_EXC_FIQ)
return true;
if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
if (etmq->exc_num == CS_ETMV4_EXC_RESET ||
etmq->exc_num == CS_ETMV4_EXC_DEBUG_HALT ||
etmq->exc_num == CS_ETMV4_EXC_SYSTEM_ERROR ||
etmq->exc_num == CS_ETMV4_EXC_INST_DEBUG ||
etmq->exc_num == CS_ETMV4_EXC_DATA_DEBUG ||
etmq->exc_num == CS_ETMV4_EXC_IRQ ||
etmq->exc_num == CS_ETMV4_EXC_FIQ)
return true;
return false;
+}
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq) +{
struct cs_etm_auxtrace *etm = etmq->etm;
int cpu = etmq->packet->cpu;
if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
etmq->exc_num == CS_ETMV3_EXC_HYP ||
etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
etmq->exc_num == CS_ETMV3_EXC_GENERIC)
return true;
Mike, could you help confirm what's the exception type for CS_ETMV3_EXC_JAZELLE/CS_ETMV3_EXC_UNDEFINED_INSTR/CS_ETMV3_EXC_GENERIC?
Are they synchronous exceptions?
Thanks, Leo Yan
if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
if (etmq->exc_num == CS_ETMV4_EXC_TRAP ||
etmq->exc_num == CS_ETMV4_EXC_ALIGNMENT ||
etmq->exc_num == CS_ETMV4_EXC_INST_FAULT ||
etmq->exc_num == CS_ETMV4_EXC_DATA_FAULT)
return true;
return false;
+}
static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq) { struct cs_etm_packet *packet = etmq->packet; @@ -1192,9 +1260,67 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq) if (prev_packet->sample_type == CS_ETM_RANGE) prev_packet->flags |= PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END;
etmq->exc_num = UINT_MAX; break; case CS_ETM_EXCEPTION:
etmq->exc_num = packet->exc_num;
/* The exception is for system call. */
if (cs_etm__is_syscall(etmq))
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_CALL |
PERF_IP_FLAG_SYSCALLRET;
/*
* The exceptions are triggered by external signals from bus,
* interrupt controller, debug module, PE reset or halt.
*/
else if (cs_etm__is_async_exception(etmq))
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_CALL |
PERF_IP_FLAG_ASYNC |
PERF_IP_FLAG_INTERRUPT;
/*
* Otherwise, exception is caused by trap, instruction &
* data fault, or alignment errors.
*/
else if (cs_etm__is_sync_exception(etmq))
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_CALL |
PERF_IP_FLAG_INTERRUPT;
/*
* When the exception packet is inserted, since exception
* packet is not used standalone for generating samples
* and it's affiliation to the previous instruction range
* packet; so set previous range packet flags to tell perf
* it is an exception taken branch.
*/
if (prev_packet->sample_type == CS_ETM_RANGE)
prev_packet->flags = packet->flags;
break; case CS_ETM_EXCEPTION_RET:
if (cs_etm__is_syscall(etmq))
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_RETURN |
PERF_IP_FLAG_SYSCALLRET;
else
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_RETURN |
PERF_IP_FLAG_INTERRUPT;
/*
* When the exception return packet is inserted, since
* exception return packet is not used standalone for
* generating samples and it's affiliation to the previous
* instruction range packet; so set previous range packet
* flags to tell perf it is an exception return branch.
*/
if (prev_packet->sample_type == CS_ETM_RANGE)
prev_packet->flags = packet->flags;
etmq->exc_num = UINT_MAX;
break; case CS_ETM_EMPTY: default: break;
@@ -1553,6 +1679,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event, err = -ENOMEM; goto err_free_metadata; }
for (k = 0; k < CS_ETMV4_PRIV_MAX; k++) metadata[j][k] = ptr[i + k];
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 37f8d48..fa46ff6 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -39,6 +39,26 @@ enum { CS_ETM_PRIV_MAX, };
+/* ETMv3 exception number */ +enum {
CS_ETMV3_EXC_NONE,
CS_ETMV3_EXC_DEBUG_HALT,
CS_ETMV3_EXC_SMC,
CS_ETMV3_EXC_HYP,
CS_ETMV3_EXC_ASYNC_DATA_ABORT,
CS_ETMV3_EXC_JAZELLE,
CS_ETMV3_EXC_RESERVED1,
CS_ETMV3_EXC_RESERVED2,
CS_ETMV3_EXC_PE_RESET,
CS_ETMV3_EXC_UNDEFINED_INSTR,
CS_ETMV3_EXC_SVC,
CS_ETMV3_EXC_PREFETCH_ABORT,
CS_ETMV3_EXC_DATA_FAULT,
CS_ETMV3_EXC_GENERIC,
CS_ETMV3_EXC_IRQ,
CS_ETMV3_EXC_FIQ,
+};
/* ETMv4 metadata */ enum { /* Dynamic, configurable parameters */ @@ -53,6 +73,22 @@ enum { CS_ETMV4_PRIV_MAX, };
+/* ETMv4 exception number */ +enum {
CS_ETMV4_EXC_RESET,
CS_ETMV4_EXC_DEBUG_HALT,
CS_ETMV4_EXC_CALL,
CS_ETMV4_EXC_TRAP,
CS_ETMV4_EXC_SYSTEM_ERROR,
CS_ETMV4_EXC_INST_DEBUG,
CS_ETMV4_EXC_DATA_DEBUG,
CS_ETMV4_EXC_ALIGNMENT,
CS_ETMV4_EXC_INST_FAULT,
CS_ETMV4_EXC_DATA_FAULT,
CS_ETMV4_EXC_IRQ,
CS_ETMV4_EXC_FIQ,
+};
This enum needs correcting to cover the reserved values so that the enum values match up for the actual exception number value. Additionally there are implementation defined exceptions after FIQ that could occur on custom devices.
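The effect of a reserved value on implicit enum numbering can be shown with a small sketch. The encoding here is made up for illustration (the DEMO_* names, the reserved value 2 and the implementation defined range 4..7 are all assumptions); the real numbers have to come from the ETM architecture specification.

```c
#include <assert.h>

/*
 * Implicit numbering: the compiler assigns 0, 1, 2. If the hardware
 * reserves the value 2, DEMO_IMPLICIT_C no longer matches the number
 * that the trace reports.
 */
enum demo_exc_implicit {
	DEMO_IMPLICIT_A,
	DEMO_IMPLICIT_B,
	DEMO_IMPLICIT_C,
};

/* Explicit numbering keeps each name tied to its hardware value. */
enum demo_exc_explicit {
	DEMO_EXC_A		= 0,
	DEMO_EXC_B		= 1,
	/* 2 is reserved in this made-up encoding */
	DEMO_EXC_C		= 3,
	DEMO_EXC_IMPDEF_FIRST	= 4,	/* start of implementation defined range */
	DEMO_EXC_MAX		= 7,	/* largest value the field can hold */
};

/* Values after the last named exception are implementation defined. */
static int demo_exc_is_impdef(unsigned int n)
{
	return n >= DEMO_EXC_IMPDEF_FIRST && n <= DEMO_EXC_MAX;
}
```

Assigning every value explicitly also documents the gaps, so a later reader can tell a reserved number from a forgotten one.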
Regards
Mike
/* RB tree for quick conversion between traceID and CPUs */ struct intlist *traceid_list;
-- 2.7.4
On Wed, Dec 12, 2018 at 12:48:54PM +0000, Mike Leach wrote:
[...]
+/* ETMv4 exception number */ +enum {
CS_ETMV4_EXC_RESET,
CS_ETMV4_EXC_DEBUG_HALT,
CS_ETMV4_EXC_CALL,
CS_ETMV4_EXC_TRAP,
CS_ETMV4_EXC_SYSTEM_ERROR,
CS_ETMV4_EXC_INST_DEBUG,
CS_ETMV4_EXC_DATA_DEBUG,
CS_ETMV4_EXC_ALIGNMENT,
CS_ETMV4_EXC_INST_FAULT,
CS_ETMV4_EXC_DATA_FAULT,
CS_ETMV4_EXC_IRQ,
CS_ETMV4_EXC_FIQ,
+};
This enum needs correcting to cover the reserved values so that the enum values match up for the actual exception number value.
Oops, thanks for pointing this out, good catch!
Will fix it.
Additionally there are implementation defined exceptions after FIQ that could occur on custom devices.
As I saw in another email, ETMv4 has a 5 bit exception number, so exception numbers 16 ~ 31 are also valid values. If so, we just assume they are normal exceptions (not async exceptions).
If you disagree with this, please let me know.
Thanks for reviewing. Leo Yan
Hi Mike,
On Wed, Dec 12, 2018 at 12:48:54PM +0000, Mike Leach wrote:
[...]
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq) +{
struct cs_etm_auxtrace *etm = etmq->etm;
int cpu = etmq->packet->cpu;
if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
etmq->exc_num == CS_ETMV3_EXC_HYP ||
etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
etmq->exc_num == CS_ETMV3_EXC_GENERIC)
return true;
Mike, could you help confirm what's the exception type for CS_ETMV3_EXC_JAZELLE/CS_ETMV3_EXC_UNDEFINED_INSTR/CS_ETMV3_EXC_GENERIC?
Are they synchronous exceptions?
Could you help confirm this part? Thanks!
[...]
Thanks, Leo Yan
Hi Leo,
Sorry missed this...
On Wed, 12 Dec 2018 at 14:42, leo.yan@linaro.org wrote:
Hi Mike,
On Wed, Dec 12, 2018 at 12:48:54PM +0000, Mike Leach wrote:
[...]
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq) +{
struct cs_etm_auxtrace *etm = etmq->etm;
int cpu = etmq->packet->cpu;
if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
etmq->exc_num == CS_ETMV3_EXC_HYP ||
etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
etmq->exc_num == CS_ETMV3_EXC_GENERIC)
return true;
Mike, could you help confirm what's the exception type for CS_ETMV3_EXC_JAZELLE/CS_ETMV3_EXC_UNDEFINED_INSTR/CS_ETMV3_EXC_GENERIC?
Are they synchronous exceptions?
JAZELLE (more accurately renamed THUMBEE in PTM) and undefined instruction are synchronous, as they occur as a result of an attempt to execute instructions. The generic exception is used for implementation defined exceptions, which could be either type.
Mike.
Could you help to confirm for this part, thanks!
[...]
Thanks, Leo Yan
Hi Mike,
On Wed, Dec 12, 2018 at 03:29:18PM +0000, Mike Leach wrote:
Hi Leo,
Sorry missed this...
No worry.
On Wed, 12 Dec 2018 at 14:42, leo.yan@linaro.org wrote:
Hi Mike,
On Wed, Dec 12, 2018 at 12:48:54PM +0000, Mike Leach wrote:
[...]
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq) +{
struct cs_etm_auxtrace *etm = etmq->etm;
int cpu = etmq->packet->cpu;
if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
etmq->exc_num == CS_ETMV3_EXC_HYP ||
etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
etmq->exc_num == CS_ETMV3_EXC_GENERIC)
return true;
Mike, could you help confirm what's the exception type for CS_ETMV3_EXC_JAZELLE/CS_ETMV3_EXC_UNDEFINED_INSTR/CS_ETMV3_EXC_GENERIC?
Are they synchronous exceptions?
JAZELLE - more accurately renamed THUMBEE in PTM,
I did a search; JAZELLE was used in earlier architectures (ARMv5/v6).
Anyway, I will change it to CS_ETMV3_EXC_JAZELLE_THUMBEE for more accuracy, but if you prefer the simpler name CS_ETMV3_EXC_THUMBEE please let me know.
and undefined instruction are synchronous, as they occur as a result of an attempt to execute instructions. The generic exception is used for implementation defined exceptions, which could be either type.
In the first version I will use the default exception type; later we can consider adding an extra flag to indicate "implementation defined".
Thanks for suggestions.
Thanks, Leo Yan
On Tue, Dec 11, 2018 at 11:01:11PM +0800, Leo Yan wrote:
[...]
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 37f8d48..fa46ff6 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -39,6 +39,26 @@ enum { CS_ETM_PRIV_MAX, }; +/* ETMv3 exception number */ +enum {
- CS_ETMV3_EXC_NONE,
- CS_ETMV3_EXC_DEBUG_HALT,
- CS_ETMV3_EXC_SMC,
- CS_ETMV3_EXC_HYP,
- CS_ETMV3_EXC_ASYNC_DATA_ABORT,
- CS_ETMV3_EXC_JAZELLE,
- CS_ETMV3_EXC_RESERVED1,
- CS_ETMV3_EXC_RESERVED2,
- CS_ETMV3_EXC_PE_RESET,
- CS_ETMV3_EXC_UNDEFINED_INSTR,
- CS_ETMV3_EXC_SVC,
- CS_ETMV3_EXC_PREFETCH_ABORT,
- CS_ETMV3_EXC_DATA_FAULT,
- CS_ETMV3_EXC_GENERIC,
- CS_ETMV3_EXC_IRQ,
- CS_ETMV3_EXC_FIQ,
+};
/* ETMv4 metadata */ enum { /* Dynamic, configurable parameters */ @@ -53,6 +73,22 @@ enum { CS_ETMV4_PRIV_MAX, }; +/* ETMv4 exception number */ +enum {
- CS_ETMV4_EXC_RESET,
- CS_ETMV4_EXC_DEBUG_HALT,
- CS_ETMV4_EXC_CALL,
- CS_ETMV4_EXC_TRAP,
- CS_ETMV4_EXC_SYSTEM_ERROR,
- CS_ETMV4_EXC_INST_DEBUG,
- CS_ETMV4_EXC_DATA_DEBUG,
- CS_ETMV4_EXC_ALIGNMENT,
- CS_ETMV4_EXC_INST_FAULT,
- CS_ETMV4_EXC_DATA_FAULT,
- CS_ETMV4_EXC_IRQ,
- CS_ETMV4_EXC_FIQ,
+};
I personally think that, as a best practice, OpenCSD should define the exception numbers and use them when it outputs exception events; perf cs-etm works as OpenCSD's consumer and would directly include the OpenCSD header file for these definitions.
I am sorry I didn't sync on this earlier to clear the obstacles before upstreaming this patch series. I accept Mike's suggestion to define the exception numbers in the perf header cs-etm.h rather than in OpenCSD; I just want to confirm that all of us agree with this?
Thanks, Leo Yan
/* RB tree for quick conversion between traceID and CPUs */ struct intlist *traceid_list; -- 2.7.4
Hi,
On Tue, 11 Dec 2018 at 15:20, leo.yan@linaro.org wrote:
On Tue, Dec 11, 2018 at 11:01:11PM +0800, Leo Yan wrote:
[...]
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 37f8d48..fa46ff6 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -39,6 +39,26 @@ enum { CS_ETM_PRIV_MAX, };
+/* ETMv3 exception number */ +enum {
CS_ETMV3_EXC_NONE,
CS_ETMV3_EXC_DEBUG_HALT,
CS_ETMV3_EXC_SMC,
CS_ETMV3_EXC_HYP,
CS_ETMV3_EXC_ASYNC_DATA_ABORT,
CS_ETMV3_EXC_JAZELLE,
CS_ETMV3_EXC_RESERVED1,
CS_ETMV3_EXC_RESERVED2,
CS_ETMV3_EXC_PE_RESET,
CS_ETMV3_EXC_UNDEFINED_INSTR,
CS_ETMV3_EXC_SVC,
CS_ETMV3_EXC_PREFETCH_ABORT,
CS_ETMV3_EXC_DATA_FAULT,
CS_ETMV3_EXC_GENERIC,
CS_ETMV3_EXC_IRQ,
CS_ETMV3_EXC_FIQ,
+};
/* ETMv4 metadata */ enum { /* Dynamic, configurable parameters */ @@ -53,6 +73,22 @@ enum { CS_ETMV4_PRIV_MAX, };
+/* ETMv4 exception number */ +enum {
CS_ETMV4_EXC_RESET,
CS_ETMV4_EXC_DEBUG_HALT,
CS_ETMV4_EXC_CALL,
CS_ETMV4_EXC_TRAP,
CS_ETMV4_EXC_SYSTEM_ERROR,
CS_ETMV4_EXC_INST_DEBUG,
CS_ETMV4_EXC_DATA_DEBUG,
CS_ETMV4_EXC_ALIGNMENT,
CS_ETMV4_EXC_INST_FAULT,
CS_ETMV4_EXC_DATA_FAULT,
CS_ETMV4_EXC_IRQ,
CS_ETMV4_EXC_FIQ,
+};
I personally think that, as a best practice, OpenCSD should define the exception numbers and use them when it outputs exception events; perf cs-etm works as OpenCSD's consumer and would directly include the OpenCSD header file for these definitions.
I am sorry I didn't sync on this earlier to clear the obstacles before upstreaming this patch series. I accept Mike's suggestion to define the exception numbers in the perf header cs-etm.h rather than in OpenCSD; I just want to confirm that all of us agree with this?
What appears to be required by this client (perf / cs-etm) is a mapping of the number value onto an exception 'type' enumeration for ease of reference in client code. But this interpretation is complex and both architecture and trace protocol specific. The A profile cores typically running Linux define a 4 bit exception number in ETMv3 and a 5 bit exception number in ETMv4, and as noted above the mapping of types differs according to protocol: the reset type has a different number in ETMv3 and ETMv4, and different types are reported - the ETMv4 CALL covers all the SVC, SMC cases reported in ETMv3.
However, if we look at M class exceptions, then these are different again, with a 10 bit exception number - and lots of new 'types' not defined above.
Any OpenCSD solution must take in all these variants and could not be practically confined to a simple enumerated type - so would require a combination of factors, including perhaps some generic typing supplemented by a value.
It is my view therefore that passing the raw exception number allows a client to interpret the raw data according to requirements.
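One way to picture a client interpreting the raw number is to key the interpretation on the trace protocol. The sketch below only encodes the field widths quoted above (4, 5 and 10 bits); enum trace_protocol, exc_num_bits() and exc_num_in_range() are hypothetical names and not part of OpenCSD or perf.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical protocol tag that a client would derive from its metadata. */
enum trace_protocol {
	PROTO_ETMV3,
	PROTO_ETMV4,
	PROTO_M_PROFILE,
};

/* Width of the exception number field, as quoted in the discussion. */
static unsigned int exc_num_bits(enum trace_protocol p)
{
	switch (p) {
	case PROTO_ETMV3:
		return 4;
	case PROTO_ETMV4:
		return 5;
	case PROTO_M_PROFILE:
		return 10;
	}
	return 0;	/* unknown protocol */
}

/* A raw number is only meaningful if it fits the field of its protocol. */
static int exc_num_in_range(enum trace_protocol p, uint32_t raw)
{
	unsigned int bits = exc_num_bits(p);

	return bits != 0 && raw < (1u << bits);
}
```

The same raw value can then be mapped to a type by a per-protocol table on the client side, leaving the decoder free of any fixed enumeration.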
Regards
Mike
Thanks, Leo Yan
/* RB tree for quick conversion between traceID and CPUs */ struct intlist *traceid_list;
-- 2.7.4
On Wed, Dec 12, 2018 at 12:24:35PM +0000, Mike Leach wrote:
[...]
+/* ETMv3 exception number */ +enum {
CS_ETMV3_EXC_NONE,
CS_ETMV3_EXC_DEBUG_HALT,
CS_ETMV3_EXC_SMC,
CS_ETMV3_EXC_HYP,
CS_ETMV3_EXC_ASYNC_DATA_ABORT,
CS_ETMV3_EXC_JAZELLE,
CS_ETMV3_EXC_RESERVED1,
CS_ETMV3_EXC_RESERVED2,
CS_ETMV3_EXC_PE_RESET,
CS_ETMV3_EXC_UNDEFINED_INSTR,
CS_ETMV3_EXC_SVC,
CS_ETMV3_EXC_PREFETCH_ABORT,
CS_ETMV3_EXC_DATA_FAULT,
CS_ETMV3_EXC_GENERIC,
CS_ETMV3_EXC_IRQ,
CS_ETMV3_EXC_FIQ,
+};
/* ETMv4 metadata */ enum { /* Dynamic, configurable parameters */ @@ -53,6 +73,22 @@ enum { CS_ETMV4_PRIV_MAX, };
+/* ETMv4 exception number */ +enum {
CS_ETMV4_EXC_RESET,
CS_ETMV4_EXC_DEBUG_HALT,
CS_ETMV4_EXC_CALL,
CS_ETMV4_EXC_TRAP,
CS_ETMV4_EXC_SYSTEM_ERROR,
CS_ETMV4_EXC_INST_DEBUG,
CS_ETMV4_EXC_DATA_DEBUG,
CS_ETMV4_EXC_ALIGNMENT,
CS_ETMV4_EXC_INST_FAULT,
CS_ETMV4_EXC_DATA_FAULT,
CS_ETMV4_EXC_IRQ,
CS_ETMV4_EXC_FIQ,
+};
I personally think that, as a best practice, OpenCSD should define the exception numbers and use them when it outputs exception events; perf cs-etm works as OpenCSD's consumer and would directly include the OpenCSD header file for these definitions.
I am sorry I didn't sync on this earlier to clear the obstacles before upstreaming this patch series. I accept Mike's suggestion to define the exception numbers in the perf header cs-etm.h rather than in OpenCSD; I just want to confirm that all of us agree with this?
What appears to be required by this client (perf / cs-etm) is a mapping of the number value onto an exception 'type' enumeration for ease of reference in client code. But this interpretation is complex and both architecture and trace protocol specific. The A profile cores typically running Linux define a 4 bit exception number in ETMv3 and a 5 bit exception number in ETMv4, and as noted above the mapping of types differs according to protocol: the reset type has a different number in ETMv3 and ETMv4, and different types are reported - the ETMv4 CALL covers all the SVC, SMC cases reported in ETMv3.
Yeah, I noticed this when I wrote the enum definitions for ETMv3 and ETMv4.
However, if we look at M class exceptions, then these are different again, with a 10 bit exception number - and lots of new 'types' not defined above.
Yeah, when I read the string array for exceptions I found that M class has its own, different definitions.
Any OpenCSD solution must take in all these variants and could not be practically confined to a simple enumerated type - so would require a combination of factors, including perhaps some generic typing supplemented by a value.
It is my view therefore that passing the raw exception number allows a client to interpret the raw data according to requirements.
I understand now; it seems to me that OpenCSD just passes the number through to Linux perf and doesn't use the number itself, hence Linux perf needs to know the meaning of the number by itself.
This is reasonable to me, so I don't have a strong opinion now and want to get feedback from Mathieu before moving forward :)
Thanks, Leo Yan
On Tue, Dec 11, 2018 at 11:01:11PM +0800, Leo Yan wrote:
Taking an exception and returning from it are typical instruction flow changes, but they must be handled through exception packets. This patch sets sample flags for the exception packet and the exception return packet.
Since the exception packet contains the exception number, this patch uses that number to divide exceptions into three classes:
The first class is caused by external logic such as the bus, interrupt controller, debug module, or PE reset or halt; it corresponds to the flags "bcyi" defined in the perf-script.txt documentation;
The second class is the system call; it is set as "bcs", following the definition in the documentation;
The third class covers CPU traps, data and instruction prefetch aborts, and alignment aborts; these exceptions are usually synchronous for the CPU, so they are set as "bci".
As the exception return packet doesn't contain a valid exception number, the exception number from the previous exception packet is recorded in the cs_etm_queue struct and reused by the exception return packet. To avoid using a stale exception number, it is reset to UINT_MAX after the exception return packet is handled or when a discontinuity packet arrives.
Neither the exception packet nor the exception return packet is a standalone packet that can be used to generate samples; they must be associated with instruction range packets for sample generation. The sample flags of the previous instruction range packet are therefore set from the exception packet or exception return packet that follows it.
The decoder defines different exception numbers for ETMv3 and ETMv4, so this patch first determines the ETM version from the metadata magic number and then uses the exception numbers for that ETM version.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++ tools/perf/util/cs-etm.h | 36 ++++++++++++++ 2 files changed, 163 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index bc8a4bc..0c917b1 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -73,6 +73,7 @@ struct cs_etm_queue { u64 timestamp; u64 offset; u64 period_instructions;
- u32 exc_num;
Same comment as before - this can easily be confused for execution number.
struct branch_stack *last_branch; struct branch_stack *last_branch_rb; size_t last_branch_pos; @@ -1108,6 +1109,73 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq) return 0; } +static bool cs_etm__is_syscall(struct cs_etm_queue *etmq) +{
- struct cs_etm_auxtrace *etm = etmq->etm;
- int cpu = etmq->packet->cpu;
- if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
I think this works because you are in --per-thread mode where all the CPUs in the system are taken into account in the metadata. But say you have a 6 CPU system and hotplug out CPU 0-2. Running your code would result in a segmentation fault.
If you want to go forward with this you will have to introduce an RB tree like I did for the CPU/traceid combination, or find something else I haven't thought about yet.
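The difference between indexing by CPU number and looking the CPU up can be sketched as follows. This is only an illustration of the idea, using a sorted array and bsearch() in place of the RB tree; struct cpu_meta, cpu_meta_find() and the magic value in the usage note are hypothetical and not taken from perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-CPU metadata record; only online CPUs have one. */
struct cpu_meta {
	int cpu;
	uint64_t magic;
};

/* bsearch() comparator: the key is an int CPU number. */
static int cpu_meta_cmp(const void *key, const void *elem)
{
	int cpu = *(const int *)key;
	const struct cpu_meta *m = elem;

	return (cpu > m->cpu) - (cpu < m->cpu);
}

/*
 * Look a CPU up by number instead of using it as an array index.
 * The table must be sorted by cpu. Returns NULL when the CPU has no
 * metadata, for example because it was offline during the trace.
 */
static const struct cpu_meta *cpu_meta_find(const struct cpu_meta *table,
					    size_t n, int cpu)
{
	if (!table || n == 0)
		return NULL;
	return bsearch(&cpu, table, n, sizeof(*table), cpu_meta_cmp);
}
```

For a 6 CPU system with CPUs 0-2 hotplugged out, the table would hold entries for CPUs 3, 4 and 5 only; a lookup for CPU 0 returns NULL, where a direct metadata[cpu] access would read outside the array.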
if (etmq->exc_num == CS_ETMV3_EXC_SVC)
return true;
- if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
if (etmq->exc_num == CS_ETMV4_EXC_CALL)
return true;
- return false;
+}
+static bool cs_etm__is_async_exception(struct cs_etm_queue *etmq) +{
- struct cs_etm_auxtrace *etm = etmq->etm;
- int cpu = etmq->packet->cpu;
- if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
if (etmq->exc_num == CS_ETMV3_EXC_DEBUG_HALT ||
etmq->exc_num == CS_ETMV3_EXC_ASYNC_DATA_ABORT ||
etmq->exc_num == CS_ETMV3_EXC_PE_RESET ||
etmq->exc_num == CS_ETMV3_EXC_IRQ ||
etmq->exc_num == CS_ETMV3_EXC_FIQ)
return true;
- if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
if (etmq->exc_num == CS_ETMV4_EXC_RESET ||
etmq->exc_num == CS_ETMV4_EXC_DEBUG_HALT ||
etmq->exc_num == CS_ETMV4_EXC_SYSTEM_ERROR ||
etmq->exc_num == CS_ETMV4_EXC_INST_DEBUG ||
etmq->exc_num == CS_ETMV4_EXC_DATA_DEBUG ||
etmq->exc_num == CS_ETMV4_EXC_IRQ ||
etmq->exc_num == CS_ETMV4_EXC_FIQ)
return true;
- return false;
+}
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq) +{
- struct cs_etm_auxtrace *etm = etmq->etm;
- int cpu = etmq->packet->cpu;
- if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
etmq->exc_num == CS_ETMV3_EXC_HYP ||
etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
etmq->exc_num == CS_ETMV3_EXC_GENERIC)
return true;
- if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
if (etmq->exc_num == CS_ETMV4_EXC_TRAP ||
etmq->exc_num == CS_ETMV4_EXC_ALIGNMENT ||
etmq->exc_num == CS_ETMV4_EXC_INST_FAULT ||
etmq->exc_num == CS_ETMV4_EXC_DATA_FAULT)
return true;
- return false;
+}
static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq) { struct cs_etm_packet *packet = etmq->packet; @@ -1192,9 +1260,67 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq) if (prev_packet->sample_type == CS_ETM_RANGE) prev_packet->flags |= PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END;
etmq->exc_num = UINT_MAX; break; case CS_ETM_EXCEPTION:
etmq->exc_num = packet->exc_num;
I thought that due to filtering we can't do things like that, i.e. guarantee that exception and exception_ret can be correlated. Moreover this won't work for CPU-wide scenarios where packets from various processors can be present in the same queue. Here the exception from one processor would trample the exception from another processor.
To me the only thing we can do is treat exception like discontinuity but I have been plagued by a serious head cold for days now and I could be missing something. Apologies if that is the case.
Mathieu
/* The exception is for system call. */
if (cs_etm__is_syscall(etmq))
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_CALL |
PERF_IP_FLAG_SYSCALLRET;
/*
* The exceptions are triggered by external signals from bus,
* interrupt controller, debug module, PE reset or halt.
*/
else if (cs_etm__is_async_exception(etmq))
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_CALL |
PERF_IP_FLAG_ASYNC |
PERF_IP_FLAG_INTERRUPT;
/*
* Otherwise, exception is caused by trap, instruction &
* data fault, or alignment errors.
*/
else if (cs_etm__is_sync_exception(etmq))
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_CALL |
PERF_IP_FLAG_INTERRUPT;
/*
* When the exception packet is inserted, since exception
* packet is not used standalone for generating samples
* and it's affiliation to the previous instruction range
* packet; so set previous range packet flags to tell perf
* it is an exception taken branch.
*/
if (prev_packet->sample_type == CS_ETM_RANGE)
prev_packet->flags = packet->flags;
break; case CS_ETM_EXCEPTION_RET:
if (cs_etm__is_syscall(etmq))
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_RETURN |
PERF_IP_FLAG_SYSCALLRET;
else
packet->flags = PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_RETURN |
PERF_IP_FLAG_INTERRUPT;
/*
* When the exception return packet is inserted, since
* exception return packet is not used standalone for
* generating samples and it's affiliation to the previous
* instruction range packet; so set previous range packet
* flags to tell perf it is an exception return branch.
*/
if (prev_packet->sample_type == CS_ETM_RANGE)
prev_packet->flags = packet->flags;
etmq->exc_num = UINT_MAX;
break; case CS_ETM_EMPTY: default: break;
@@ -1553,6 +1679,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event, err = -ENOMEM; goto err_free_metadata; }
for (k = 0; k < CS_ETMV4_PRIV_MAX; k++) metadata[j][k] = ptr[i + k];
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 37f8d48..fa46ff6 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -39,6 +39,26 @@ enum { CS_ETM_PRIV_MAX, }; +/* ETMv3 exception number */ +enum {
- CS_ETMV3_EXC_NONE,
- CS_ETMV3_EXC_DEBUG_HALT,
- CS_ETMV3_EXC_SMC,
- CS_ETMV3_EXC_HYP,
- CS_ETMV3_EXC_ASYNC_DATA_ABORT,
- CS_ETMV3_EXC_JAZELLE,
- CS_ETMV3_EXC_RESERVED1,
- CS_ETMV3_EXC_RESERVED2,
- CS_ETMV3_EXC_PE_RESET,
- CS_ETMV3_EXC_UNDEFINED_INSTR,
- CS_ETMV3_EXC_SVC,
- CS_ETMV3_EXC_PREFETCH_ABORT,
- CS_ETMV3_EXC_DATA_FAULT,
- CS_ETMV3_EXC_GENERIC,
- CS_ETMV3_EXC_IRQ,
- CS_ETMV3_EXC_FIQ,
+};
/* ETMv4 metadata */ enum { /* Dynamic, configurable parameters */ @@ -53,6 +73,22 @@ enum { CS_ETMV4_PRIV_MAX, }; +/* ETMv4 exception number */ +enum {
- CS_ETMV4_EXC_RESET,
- CS_ETMV4_EXC_DEBUG_HALT,
- CS_ETMV4_EXC_CALL,
- CS_ETMV4_EXC_TRAP,
- CS_ETMV4_EXC_SYSTEM_ERROR,
- CS_ETMV4_EXC_INST_DEBUG,
- CS_ETMV4_EXC_DATA_DEBUG,
- CS_ETMV4_EXC_ALIGNMENT,
- CS_ETMV4_EXC_INST_FAULT,
- CS_ETMV4_EXC_DATA_FAULT,
- CS_ETMV4_EXC_IRQ,
- CS_ETMV4_EXC_FIQ,
+};
/* RB tree for quick conversion between traceID and CPUs */ struct intlist *traceid_list; -- 2.7.4
On Thu, Dec 13, 2018 at 05:36:35PM -0700, Mathieu Poirier wrote:
On Tue, Dec 11, 2018 at 11:01:11PM +0800, Leo Yan wrote:
Taking an exception and returning from it are typical instruction flow changes, but they must be handled through exception packets. This patch sets sample flags for the exception packet and the exception return packet.
Since the exception packet contains the exception number, this patch uses that number to divide exceptions into three classes:
The first class is caused by external logic such as the bus, interrupt controller, debug module, or PE reset or halt; it corresponds to the flags "bcyi" defined in the perf-script.txt documentation;
The second class is the system call; it is set as "bcs", following the definition in the documentation;
The third class covers CPU traps, data and instruction prefetch aborts, and alignment aborts; these exceptions are usually synchronous for the CPU, so they are set as "bci".
As the exception return packet doesn't contain a valid exception number, the exception number from the previous exception packet is recorded in the cs_etm_queue struct and reused by the exception return packet. To avoid using a stale exception number, it is reset to UINT_MAX after the exception return packet is handled or when a discontinuity packet arrives.
Neither the exception packet nor the exception return packet is a standalone packet that can be used to generate samples; they must be associated with instruction range packets for sample generation. The sample flags of the previous instruction range packet are therefore set from the exception packet or exception return packet that follows it.
The decoder defines different exception numbers for ETMv3 and ETMv4, so this patch first determines the ETM version from the metadata magic number and then uses the exception numbers for that ETM version.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++ tools/perf/util/cs-etm.h | 36 ++++++++++++++ 2 files changed, 163 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index bc8a4bc..0c917b1 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -73,6 +73,7 @@ struct cs_etm_queue { u64 timestamp; u64 offset; u64 period_instructions;
- u32 exc_num;
Same comment as before - this can easily be confused for execution number.
Will fix it.
struct branch_stack *last_branch; struct branch_stack *last_branch_rb; size_t last_branch_pos; @@ -1108,6 +1109,73 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq) return 0; } +static bool cs_etm__is_syscall(struct cs_etm_queue *etmq) +{
- struct cs_etm_auxtrace *etm = etmq->etm;
- int cpu = etmq->packet->cpu;
- if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
I think this works because you are in --per-thread mode where all the CPUs in the system are taken into account in the metadata. But say you have a 6 CPU system and hotplug out CPU 0-2. Running your code would result in a segmentation fault.
Indeed. I did a quick try and can see a segmentation fault after hotplugging out CPUs 0-2.
If you want to go forward with this you will have to introduce an RB tree like I did for the CPU/traceid combination, or find something else I haven't thought about yet.
I have looked at the traceid code and got some sense of this. I will try to fix it this way.
if (etmq->exc_num == CS_ETMV3_EXC_SVC)
return true;
- if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
if (etmq->exc_num == CS_ETMV4_EXC_CALL)
return true;
- return false;
+}
+static bool cs_etm__is_async_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV3_EXC_ASYNC_DATA_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_PE_RESET ||
+		    etmq->exc_num == CS_ETMV3_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV3_EXC_FIQ)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_RESET ||
+		    etmq->exc_num == CS_ETMV4_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV4_EXC_SYSTEM_ERROR ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV4_EXC_FIQ)
+			return true;
+
+	return false;
+}
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
+		    etmq->exc_num == CS_ETMV3_EXC_HYP ||
+		    etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
+		    etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
+		    etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
+		    etmq->exc_num == CS_ETMV3_EXC_GENERIC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_TRAP ||
+		    etmq->exc_num == CS_ETMV4_EXC_ALIGNMENT ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_FAULT ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_FAULT)
+			return true;
+
+	return false;
+}
 static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 {
 	struct cs_etm_packet *packet = etmq->packet;
@@ -1192,9 +1260,67 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 		if (prev_packet->sample_type == CS_ETM_RANGE)
 			prev_packet->flags |= PERF_IP_FLAG_BRANCH |
 					      PERF_IP_FLAG_TRACE_END;
+
+		etmq->exc_num = UINT_MAX;
 		break;
 	case CS_ETM_EXCEPTION:
+		etmq->exc_num = packet->exc_num;
I thought that due to filtering we can't do things like that, i.e guarantee that exception and exception_ret can be correlated.
Actually I have thought a bit about the exception and exception_ret unpaired issue :). As you can see, both the discontinuity and exception_ret packets set the exception number to UINT_MAX. This avoids using the old exception number after a trace discontinuity.
Moreover this won't work for CPU wide scenarios where packets from various processors can be present in the same queue. Here the exception from one processor would trample the exception from another processor.
Yeah, I am not clear on this point, so let me check a couple of things with you.
Firstly, for CPU wide trace, should every CPU have its own cs_etm_queue struct, so that we can use every CPU's cs_etm_queue::exception_num to maintain the exception number?
If for CPU wide trace all CPUs share only one cs_etm_queue, can we add a new array cs_etm_queue::exception_num[] to maintain the exception number for every CPU?
To me the only thing we can do is treat exception like discontinuity but I have been plagued by a serious head cold for days now and I could be missing something. Apologies if that is the case.
No rush, so far the info is enough for me to cook a new patch series; we will move forward based on the new patch set.
Thanks a lot for reviewing.
[...]
Thanks, Leo Yan
On Thu, 13 Dec 2018 at 20:12, leo.yan@linaro.org wrote:
On Thu, Dec 13, 2018 at 05:36:35PM -0700, Mathieu Poirier wrote:
On Tue, Dec 11, 2018 at 11:01:11PM +0800, Leo Yan wrote:
Exception entry and exception return are typical instruction jump flows, but they need to be handled with exception packets. This patch sets sample flags for the exception packet and the exception return packet.
Since the exception packet contains the exception number, this patch divides exception types into three classes according to that number:
The first class is caused by external logic such as the bus, interrupt controller, debug module, or PE reset or halt; this corresponds to the flags "bcyi" defined in the documentation perf-script.txt;
The second class is the system call; this is set as "bcs", following the definition in the documentation;
The third class covers CPU traps, data and instruction prefetch aborts, and alignment aborts; these exceptions are usually synchronous for the CPU, so they are set as "bci".
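The three classes above map onto flag combinations roughly as in the following sketch. It is only an illustration: the `SK_` names and bit values are hypothetical stand-ins chosen for this example, not the perf definitions, and the letters in the comments are the ones the commit message quotes from perf-script.txt.

```c
#include <stdint.h>

/* Illustrative flag bits; hypothetical stand-ins, not the perf values. */
enum {
	SK_FLAG_BRANCH     = 1u << 0,	/* 'b' */
	SK_FLAG_CALL       = 1u << 1,	/* 'c' */
	SK_FLAG_SYSCALLRET = 1u << 2,	/* 's' */
	SK_FLAG_ASYNC      = 1u << 3,	/* 'y' */
	SK_FLAG_INTERRUPT  = 1u << 4,	/* 'i' */
};

/* The three exception classes described in the commit message. */
enum sk_exc_class {
	SK_EXC_ASYNC,	/* external logic: IRQ, FIQ, reset, debug halt */
	SK_EXC_SYSCALL,	/* system call */
	SK_EXC_SYNC,	/* traps, aborts, alignment faults */
	SK_EXC_UNKNOWN,	/* no valid exception number available */
};

/* Return the flag combination for a class, or 0 when it is unknown. */
static uint32_t sk_exception_flags(enum sk_exc_class cls)
{
	switch (cls) {
	case SK_EXC_ASYNC:	/* "bcyi" */
		return SK_FLAG_BRANCH | SK_FLAG_CALL |
		       SK_FLAG_ASYNC | SK_FLAG_INTERRUPT;
	case SK_EXC_SYSCALL:	/* "bcs" */
		return SK_FLAG_BRANCH | SK_FLAG_CALL | SK_FLAG_SYSCALLRET;
	case SK_EXC_SYNC:	/* "bci" */
		return SK_FLAG_BRANCH | SK_FLAG_CALL | SK_FLAG_INTERRUPT;
	default:
		return 0;
	}
}
```

Keeping the class decision separate from the flag assignment means the ETMv3/ETMv4 specific exception numbers only have to be interpreted in one place.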
As the exception return packet doesn't contain a valid exception number, the exception number from the previous exception packet is recorded in the cs_etm_queue struct and reused by the exception return packet. To avoid using a stale exception number, it is reset to UINT_MAX after handling an exception return packet or when a discontinuity packet arrives.
Neither the exception packet nor the exception return packet is a standalone packet that can be used to generate samples; they must be associated with instruction range packets for sample generation. The sample flags of the previous instruction range packet are assigned according to the exception packet or exception return packet that follows it.
The decoder defines different exception numbers for ETMv3 and ETMv4, hence this patch first determines the ETM version from the metadata magic number and then uses the corresponding exception numbers for that version.
Signed-off-by: Leo Yan leo.yan@linaro.org
 tools/perf/util/cs-etm.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/cs-etm.h |  36 ++++++++++++++
 2 files changed, 163 insertions(+)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index bc8a4bc..0c917b1 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -73,6 +73,7 @@ struct cs_etm_queue {
 	u64 timestamp;
 	u64 offset;
 	u64 period_instructions;
+	u32 exc_num;
Same comment as before - this can easily be confused for execution number.
Will fix it.
 	struct branch_stack *last_branch;
 	struct branch_stack *last_branch_rb;
 	size_t last_branch_pos;
@@ -1108,6 +1109,73 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq)
 	return 0;
 }
 
+static bool cs_etm__is_syscall(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
I think this works because you are in --per-thread mode where all the CPUs in the system are taken into account in the metadata. But say you have a 6 CPU system and hotplug out CPU 0-2. Running your code would result in a segmentation fault.
Indeed. I did a quick test and can see a segmentation fault after hotplugging out CPUs 0-2.
If you want to go forward with this you will have to introduce an RB tree like I did for the CPU/traceid combination, or find something else I haven't thought about yet.
I have looked at the traceid code and have a sense of how this works. I will try to fix it this way.
+		if (etmq->exc_num == CS_ETMV3_EXC_SVC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_CALL)
+			return true;
+
+	return false;
+}
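The hotplug problem raised above comes from using the CPU number directly as an index into the metadata array. A minimal sketch of the alternative, looking the entry up by CPU number, could look like the code below. It uses a sorted array with `bsearch()` rather than the RB tree suggested in the review, purely to keep the example short; `struct sk_cpu_meta`, `sk_find_meta()` and the magic values are hypothetical names and data for illustration and do not exist in perf.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One metadata record per traced CPU; hypothetical layout. */
struct sk_cpu_meta {
	int cpu;	/* CPU number the record belongs to */
	uint64_t magic;	/* ETM version magic recorded for that CPU */
};

/* bsearch() comparator: key is an int CPU number. */
static int sk_cmp_cpu(const void *key, const void *elem)
{
	int cpu = *(const int *)key;
	const struct sk_cpu_meta *m = elem;

	return (cpu > m->cpu) - (cpu < m->cpu);
}

/*
 * Look up the record for @cpu in a table sorted by CPU number.
 * Returns NULL when the CPU has no record, e.g. because it was
 * hotplugged out when the trace was taken.
 */
static const struct sk_cpu_meta *
sk_find_meta(const struct sk_cpu_meta *tbl, size_t n, int cpu)
{
	return bsearch(&cpu, tbl, n, sizeof(*tbl), sk_cmp_cpu);
}
```

With CPUs 0-2 offline the table holds only entries for CPUs 3-5, so a lookup for CPU 4 finds the second entry while a lookup for CPU 0 returns NULL instead of reading past the array.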
+static bool cs_etm__is_async_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV3_EXC_ASYNC_DATA_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_PE_RESET ||
+		    etmq->exc_num == CS_ETMV3_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV3_EXC_FIQ)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_RESET ||
+		    etmq->exc_num == CS_ETMV4_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV4_EXC_SYSTEM_ERROR ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV4_EXC_FIQ)
+			return true;
+
+	return false;
+}
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
+		    etmq->exc_num == CS_ETMV3_EXC_HYP ||
+		    etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
+		    etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
+		    etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
+		    etmq->exc_num == CS_ETMV3_EXC_GENERIC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_TRAP ||
+		    etmq->exc_num == CS_ETMV4_EXC_ALIGNMENT ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_FAULT ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_FAULT)
+			return true;
+
+	return false;
+}
 static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 {
 	struct cs_etm_packet *packet = etmq->packet;
@@ -1192,9 +1260,67 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 		if (prev_packet->sample_type == CS_ETM_RANGE)
 			prev_packet->flags |= PERF_IP_FLAG_BRANCH |
 					      PERF_IP_FLAG_TRACE_END;
+
+		etmq->exc_num = UINT_MAX;
 		break;
 	case CS_ETM_EXCEPTION:
+		etmq->exc_num = packet->exc_num;
I thought that due to filtering we can't do things like that, i.e guarantee that exception and exception_ret can be correlated.
Actually I have thought a bit about the exception and exception_ret unpaired issue :). As you can see, both the discontinuity and exception_ret packets set the exception number to UINT_MAX. This avoids using the old exception number after a trace discontinuity.
Indeed, but I don't think your code would work for nested interrupts. Also what happens if we don't get an exception due to filtering and only end up with an exception return... At that point the code defaults to an interrupt, which may or may not be valid. To me the solution seems to be very brittle and I have little confidence in it. I _think_ Intel looks at userspace/kernel space addresses to decide what is happening, perhaps taking a look at what they've done is worth it.
Moreover this won't work for CPU wide scenarios where packets from various processors can be present in the same queue. Here the exception from one processor would trample the exception from another processor.
Yeah, I am not clear on this point, so let me check a couple of things with you.
Firstly, for CPU wide trace, should every CPU have its own cs_etm_queue struct, so that we can use every CPU's cs_etm_queue::exception_num to maintain the exception number?
Your reasoning is correct in a world where we have a 1:1 ratio between source and sinks. But for now this isn't the case and traces from different CPUs are aggregated in the same sink.
If for CPU wide trace all CPUs share only one cs_etm_queue, can we add a new array cs_etm_queue::exception_num[] to maintain the exception number for every CPU?
Right, something like that. But before we get excited about CPU wide scenarios we need to find a solid solution for --per-thread. Once that is in place we can think about extending for multiple CPUs.
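For reference, the per-CPU array idea discussed here might be sketched as follows. This is a rough illustration only: `struct sk_exc_state`, the `sk_exc_*()` helpers and `SK_MAX_CPUS` are hypothetical names, `SK_EXC_NONE` plays the role that UINT_MAX plays in the patch, and the sketch does not address the nested-exception or filtering concerns raised earlier in this thread.

```c
#include <stdint.h>

#define SK_MAX_CPUS	8		/* arbitrary bound for the sketch */
#define SK_EXC_NONE	UINT32_MAX	/* "no valid exception number" */

/* Last exception number seen on each CPU sharing one queue. */
struct sk_exc_state {
	uint32_t exc_num[SK_MAX_CPUS];
};

static void sk_exc_init(struct sk_exc_state *st)
{
	for (int i = 0; i < SK_MAX_CPUS; i++)
		st->exc_num[i] = SK_EXC_NONE;
}

/* Exception packet: remember the number for this CPU only. */
static void sk_exc_record(struct sk_exc_state *st, int cpu, uint32_t num)
{
	if (cpu >= 0 && cpu < SK_MAX_CPUS)
		st->exc_num[cpu] = num;
}

/* Discontinuity packet: drop whatever was recorded for this CPU. */
static void sk_exc_discontinuity(struct sk_exc_state *st, int cpu)
{
	sk_exc_record(st, cpu, SK_EXC_NONE);
}

/*
 * Exception return packet: hand back the recorded number and reset it,
 * so a later return cannot reuse a stale value.
 */
static uint32_t sk_exc_consume(struct sk_exc_state *st, int cpu)
{
	uint32_t num;

	if (cpu < 0 || cpu >= SK_MAX_CPUS)
		return SK_EXC_NONE;

	num = st->exc_num[cpu];
	st->exc_num[cpu] = SK_EXC_NONE;
	return num;
}
```

Because each CPU has its own slot, an exception packet from one processor no longer overwrites the number recorded for another processor in the same queue.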
To me the only thing we can do is treat exception like discontinuity but I have been plagued by a serious head cold for days now and I could be missing something. Apologies if that is the case.
No rush, so far the info is enough for me to cook a new patch series; we will move forward based on the new patch set.
Thanks a lot for reviewing.
[...]
Thanks, Leo Yan