On Thu, Dec 13, 2018 at 05:36:35PM -0700, Mathieu Poirier wrote:
On Tue, Dec 11, 2018 at 11:01:11PM +0800, Leo Yan wrote:
Taking an exception and returning from it are typical instruction-flow changes, but they have to be handled through exception packets. This patch sets sample flags for exception packets and exception return packets.
Since the exception packet contains the exception number, this patch uses that number to divide exceptions into three classes:
The first class is caused by external logic such as the bus, the interrupt controller, the debug module, or a PE reset or halt; it corresponds to the flags "bcyi" defined in the perf-script.txt documentation;
The second class is system calls, which are set as "bcs" following the definition in the documentation;
The third class is CPU traps, data and instruction prefetch aborts, and alignment aborts; these exceptions are usually synchronous for the CPU, so they are set as "bci".
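As a rough illustration of the three classes described above, the following sketch maps each class to the flag letters named in the commit message. The enum and the function name are hypothetical and not part of the patch, which uses separate predicate helpers on the exception number instead.

```c
/*
 * Hypothetical classification used only in this sketch; the patch
 * derives the class from the exception number and the ETM version.
 */
enum exc_class {
	EXC_CLASS_ASYNC,	/* external logic: bus, IRQ, debug, reset/halt */
	EXC_CLASS_SYSCALL,	/* system call */
	EXC_CLASS_SYNC,		/* trap, data/prefetch abort, alignment */
	EXC_CLASS_UNKNOWN,	/* no valid exception number */
};

/* Return the perf-script flag letters the commit message assigns. */
static const char *exc_class_flags(enum exc_class c)
{
	switch (c) {
	case EXC_CLASS_ASYNC:
		return "bcyi";
	case EXC_CLASS_SYSCALL:
		return "bcs";
	case EXC_CLASS_SYNC:
		return "bci";
	default:
		return "";
	}
}
```

An unknown class yields an empty string here, which is just a choice for the sketch; how the real code treats an unclassified exception is not stated in this thread.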
As the exception return packet doesn't contain a valid exception number, the exception number is recorded in the cs_etm_queue struct from the previous exception packet and reused by the exception return packet. To avoid using a stale exception number, it is reset to UINT_MAX after handling an exception return packet or when a discontinuity packet arrives.
Neither an exception packet nor an exception return packet is a standalone packet that can be used to generate samples; they must be associated with instruction range packets for sample generation. The sample flags of the previous instruction range packet are set according to the exception packet or exception return packet that follows it.
The decoder defines different exception numbers for ETMv3 and ETMv4, so this patch first determines the ETM version from the metadata magic number and then uses the exception numbers for that version.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
 tools/perf/util/cs-etm.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/cs-etm.h |  36 ++++++++++++++
 2 files changed, 163 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index bc8a4bc..0c917b1 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -73,6 +73,7 @@ struct cs_etm_queue {
 	u64 timestamp;
 	u64 offset;
 	u64 period_instructions;
+	u32 exc_num;
Same comment as before - this can easily be confused for execution number.
Will fix it.
 	struct branch_stack *last_branch;
 	struct branch_stack *last_branch_rb;
 	size_t last_branch_pos;
@@ -1108,6 +1109,73 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq)
 	return 0;
 }

+static bool cs_etm__is_syscall(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
I think this works because you are in --per-thread mode where all the CPUs in the system are taken into account in the metadata. But say you have a 6 CPU system and hotplug out CPU 0-2. Running your code would result in a segmentation fault.
Indeed. I gave it a quick try and can see a segmentation fault after hotplugging out CPUs 0-2.
If you want to go forward with this you will have to introduce an RB tree like I did for the CPU/traceid combination, or find something else I haven't thought about yet.
I have looked at the traceid code and have a sense of how it works. I will try to fix it this way.
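To make the hotplug problem concrete: indexing metadata[] directly by CPU number assumes an entry exists for every CPU from 0 upwards, which no longer holds once CPUs 0-2 are offline. A minimal sketch of looking the entry up by CPU number instead is shown below. It uses a sorted array with bsearch() rather than the RB tree suggested above, purely to keep the example short; struct cpu_meta and cpu_meta_find() are hypothetical names, not perf code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: one entry per CPU that actually has metadata. */
struct cpu_meta {
	int cpu;	/* logical CPU number recorded with the trace */
	uint64_t magic;	/* stand-in for the CS_ETM_MAGIC metadata field */
};

/* bsearch() comparator: the key is a CPU number, elem is a table entry. */
static int cpu_meta_cmp(const void *key, const void *elem)
{
	int cpu = *(const int *)key;
	const struct cpu_meta *m = elem;

	return (cpu > m->cpu) - (cpu < m->cpu);
}

/*
 * The table must be sorted by cpu. Returns NULL when the CPU has no
 * metadata, so the caller can bail out instead of reading out of bounds.
 */
static const struct cpu_meta *cpu_meta_find(const struct cpu_meta *table,
					    size_t n, int cpu)
{
	return bsearch(&cpu, table, n, sizeof(*table), cpu_meta_cmp);
}
```

With only CPUs 3-5 present, a lookup for CPU 0 returns NULL rather than dereferencing a missing entry, which is the property the helpers in the patch would need.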
+		if (etmq->exc_num == CS_ETMV3_EXC_SVC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_CALL)
+			return true;
+
+	return false;
+}
+
+static bool cs_etm__is_async_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV3_EXC_ASYNC_DATA_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_PE_RESET ||
+		    etmq->exc_num == CS_ETMV3_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV3_EXC_FIQ)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_RESET ||
+		    etmq->exc_num == CS_ETMV4_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV4_EXC_SYSTEM_ERROR ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV4_EXC_FIQ)
+			return true;
+
+	return false;
+}
+
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
+		    etmq->exc_num == CS_ETMV3_EXC_HYP ||
+		    etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
+		    etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
+		    etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
+		    etmq->exc_num == CS_ETMV3_EXC_GENERIC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_TRAP ||
+		    etmq->exc_num == CS_ETMV4_EXC_ALIGNMENT ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_FAULT ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_FAULT)
+			return true;
+
+	return false;
+}
+
 static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 {
 	struct cs_etm_packet *packet = etmq->packet;
@@ -1192,9 +1260,67 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 		if (prev_packet->sample_type == CS_ETM_RANGE)
 			prev_packet->flags |= PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END;
+
+		etmq->exc_num = UINT_MAX;
 		break;
 	case CS_ETM_EXCEPTION:
+		etmq->exc_num = packet->exc_num;
I thought that due to filtering we can't do things like that, i.e. guarantee that exception and exception_ret can be correlated.
Actually, I have given some thought to the unpaired exception and exception_ret issue :). As you can see, both the discontinuity and exception_ret packets set the exception number to UINT_MAX. This avoids using an old exception number after a trace discontinuity.
Moreover this won't work for CPU wide scenarios where packets from various processors can be present in the same queue. Here the exception from one processor would trample the exception from another processor.
Yeah, I am not clear on this point, so let me check a couple of things with you.
Firstly, for CPU-wide trace, does every CPU have its own cs_etm_queue struct, so that we can use each CPU's cs_etm_queue::exception_num to maintain the exception number?
If all CPUs share only one cs_etm_queue in CPU-wide trace, can we add a new array cs_etm_queue::exception_num[] to maintain the exception number for every CPU?
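To make the second question concrete, here is a minimal sketch of per-CPU exception tracking inside a single shared queue. Everything in it is an assumption for illustration: struct exc_tracker, the helper names and the fixed SKETCH_MAX_CPUS bound are hypothetical, and a real implementation would still need a lookup that tolerates hotplugged CPUs, as discussed earlier in this thread.

```c
#include <stdint.h>

#define SKETCH_MAX_CPUS	8		/* hypothetical bound for this sketch */
#define EXC_NUM_INVALID	UINT32_MAX	/* plays the role of UINT_MAX in the patch */

/* One slot per CPU so that one CPU's exception cannot trample another's. */
struct exc_tracker {
	uint32_t exc_num[SKETCH_MAX_CPUS];
};

static int exc_cpu_valid(int cpu)
{
	return cpu >= 0 && cpu < SKETCH_MAX_CPUS;
}

static void exc_tracker_init(struct exc_tracker *t)
{
	for (int i = 0; i < SKETCH_MAX_CPUS; i++)
		t->exc_num[i] = EXC_NUM_INVALID;
}

/* Exception packet: remember the number for this CPU only. */
static void exc_tracker_record(struct exc_tracker *t, int cpu, uint32_t num)
{
	if (exc_cpu_valid(cpu))
		t->exc_num[cpu] = num;
}

/* Discontinuity packet: drop whatever was pending for this CPU. */
static void exc_tracker_discontinuity(struct exc_tracker *t, int cpu)
{
	if (exc_cpu_valid(cpu))
		t->exc_num[cpu] = EXC_NUM_INVALID;
}

/* Exception return packet: hand back the pending number and reset it. */
static uint32_t exc_tracker_consume(struct exc_tracker *t, int cpu)
{
	uint32_t num;

	if (!exc_cpu_valid(cpu))
		return EXC_NUM_INVALID;

	num = t->exc_num[cpu];
	t->exc_num[cpu] = EXC_NUM_INVALID;
	return num;
}
```

This only addresses the trampling between CPUs. It does not solve the filtering concern raised above, where an exception return may be seen without its matching exception; in that case the consume step simply returns the invalid value.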
To me the only thing we can do is treat exception like discontinuity but I have been plagued by a serious head cold for days now and I could be missing something. Apologies if that is the case.
No rush. The information so far is enough for me to prepare a new patch series; we can move forward based on the new patch set.
Thanks a lot for reviewing.
[...]
Thanks, Leo Yan