Re: [PATCH v3 5/5] perf cs-etm: Set sample flags for exception packet

14 Dec 2018

On Thu, Dec 13, 2018 at 05:36:35PM -0700, Mathieu Poirier wrote:
...
On Tue, Dec 11, 2018 at 11:01:11PM +0800, Leo Yan wrote:
...
Taking an exception and returning from it are typical instruction jump
flows, but they must be handled through exception packets. This patch
sets sample flags for the exception packet and the exception return
packet.
Since the exception packet contains the exception number, this patch
uses that number to divide exceptions into three classes:
- The first class is caused by external logic such as the bus,
  interrupt controller, debug module, or PE reset or halt; it
  corresponds to the flags "bcyi" defined in the perf-script.txt
  documentation;
- The second class is system calls, which are flagged as "bcs"
  following the definition in the documentation;
- The third class covers CPU traps, data and instruction prefetch
  aborts, and alignment aborts; these exceptions are usually
  synchronous to the CPU, so they are flagged as "bci".
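The three classes above can be sketched as a mapping from exception
class to flag bits. This is only an illustration: the IP_FLAG_* macros
are local stand-ins for the PERF_IP_FLAG_* bits (their numeric values
here are an assumption), and enum exc_class, exc_class_to_flags() and
flags_to_str() are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Local stand-ins for the PERF_IP_FLAG_* bits used by perf; the numeric
 * values are assumed for illustration only.
 */
#define IP_FLAG_BRANCH     (1u << 0)
#define IP_FLAG_CALL       (1u << 1)
#define IP_FLAG_SYSCALLRET (1u << 4)
#define IP_FLAG_ASYNC      (1u << 5)
#define IP_FLAG_INTERRUPT  (1u << 6)

/* Hypothetical classification result, not a type from the patch. */
enum exc_class { EXC_ASYNC, EXC_SYSCALL, EXC_SYNC, EXC_UNKNOWN };

/* Map an exception class to the flag combination named in the commit log. */
static unsigned int exc_class_to_flags(enum exc_class cls)
{
	switch (cls) {
	case EXC_ASYNC:		/* "bcyi" */
		return IP_FLAG_BRANCH | IP_FLAG_CALL |
		       IP_FLAG_ASYNC | IP_FLAG_INTERRUPT;
	case EXC_SYSCALL:	/* "bcs" */
		return IP_FLAG_BRANCH | IP_FLAG_CALL | IP_FLAG_SYSCALLRET;
	case EXC_SYNC:		/* "bci" */
		return IP_FLAG_BRANCH | IP_FLAG_CALL | IP_FLAG_INTERRUPT;
	default:
		return 0;
	}
}

/* Render the flag bits with the letters listed in perf-script.txt. */
static void flags_to_str(unsigned int flags, char *buf, size_t len)
{
	static const struct { unsigned int bit; char ch; } map[] = {
		{ IP_FLAG_BRANCH, 'b' }, { IP_FLAG_CALL, 'c' },
		{ IP_FLAG_SYSCALLRET, 's' }, { IP_FLAG_ASYNC, 'y' },
		{ IP_FLAG_INTERRUPT, 'i' },
	};
	size_t i, pos = 0;

	/* Room for every letter plus the terminating NUL. */
	assert(len > sizeof(map) / sizeof(map[0]));
	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++)
		if (flags & map[i].bit)
			buf[pos++] = map[i].ch;
	buf[pos] = '\0';
}
```

Rendering exc_class_to_flags(EXC_ASYNC) through flags_to_str() gives
the string "bcyi", which matches the first class in the list.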
As the exception return packet doesn't contain a valid exception number,
the exception number from the previous exception packet is recorded in
the cs_etm_queue struct and reused by the exception return packet.  To
avoid using a stale exception number, it is reset to UINT_MAX after the
exception return packet is handled or when a discontinuity packet
arrives.
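The record/reuse/reset rule in the paragraph above can be sketched as a
small state machine. This is a simplified model under stated
assumptions, not the patch's code: enum pkt_type, struct queue_state
and queue_state_handle() are hypothetical names, and only the single
exception number is modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet kinds; the real decoder uses CS_ETM_* sample types. */
enum pkt_type { PKT_RANGE, PKT_EXCEPTION, PKT_EXCEPTION_RET, PKT_DISCONTINUITY };

/* Hypothetical stand-in for the one field of interest in cs_etm_queue. */
struct queue_state {
	unsigned int exc_num;	/* UINT_MAX means "no valid exception recorded" */
};

static void queue_state_init(struct queue_state *q)
{
	q->exc_num = UINT_MAX;
}

/*
 * Process one packet. Returns true and stores the exception number in
 * *out when the packet can be attributed to a known exception.
 */
static bool queue_state_handle(struct queue_state *q, enum pkt_type type,
			       unsigned int pkt_exc_num, unsigned int *out)
{
	assert(q != NULL && out != NULL);

	switch (type) {
	case PKT_EXCEPTION:
		/* Record the number for the matching exception return. */
		q->exc_num = pkt_exc_num;
		*out = pkt_exc_num;
		return true;
	case PKT_EXCEPTION_RET: {
		unsigned int saved = q->exc_num;

		/* The number is consumed once, then invalidated. */
		q->exc_num = UINT_MAX;
		if (saved == UINT_MAX)
			return false;
		*out = saved;
		return true;
	}
	case PKT_DISCONTINUITY:
		/* Trace was interrupted: anything recorded is now stale. */
		q->exc_num = UINT_MAX;
		return false;
	default:
		return false;
	}
}
```

With this model, an exception return that follows a discontinuity is
reported as unattributed instead of inheriting a number recorded before
the gap.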
Neither the exception packet nor the exception return packet is a
standalone packet that can be used to generate samples; they must be
associated with instruction range packets for sample generation.  The
sample flags of the preceding instruction range packet are set according
to the exception packet or exception return packet that follows it.
The decoder defines different exception numbers for ETMv3 and ETMv4,
so this patch first determines the ETM version from the metadata magic
number and then uses the exception numbers for that ETM version.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/cs-etm.h |  36 ++++++++++++++
 2 files changed, 163 insertions(+)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index bc8a4bc..0c917b1 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -73,6 +73,7 @@ struct cs_etm_queue {
   u64 timestamp;
   u64 offset;
   u64 period_instructions;
+	u32 exc_num;

Same comment as before - this can easily be confused for execution number.
Will fix it.
...
...
   struct branch_stack *last_branch;
   struct branch_stack *last_branch_rb;
   size_t last_branch_pos;
@@ -1108,6 +1109,73 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq)
   return 0;
 }
 
+static bool cs_etm__is_syscall(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)

I think this works because you are in --per-thread mode where all the CPUs in
the system are taken into account in the metadata.  But say you have a 6 CPU
system and hotplug out CPU 0-2.  Running your code would result in a
segmentation fault.
Indeed.  I gave it a quick try and saw a segmentation fault after
hotplugging out CPUs 0-2.
...
If you want to go forward with this you will have to introduce an RB tree like I
did for the CPU/traceid combination, or find something else I haven't thought
about yet.
I have looked at the traceid code and have a sense of how it works.  I
will try to fix it this way.
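To illustrate why a keyed lookup avoids the crash: if metadata holds
one entry per online CPU, indexing it directly by CPU number reads out
of bounds once low-numbered CPUs are offline. The sketch below swaps
the RB tree for a sorted array with bsearch() to stay short; struct
cpu_meta and the cpu_meta_*() helpers are hypothetical and are not part
of perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-CPU record; perf's real metadata is an array of u64. */
struct cpu_meta {
	int cpu;	/* logical CPU number the entry belongs to */
	uint64_t magic;	/* stand-in for the CS_ETM_MAGIC word */
};

static int cpu_meta_cmp(const void *a, const void *b)
{
	const struct cpu_meta *ma = a, *mb = b;

	return (ma->cpu > mb->cpu) - (ma->cpu < mb->cpu);
}

/* Sort once, after the metadata has been parsed. */
static void cpu_meta_sort(struct cpu_meta *tbl, size_t n)
{
	assert(tbl != NULL || n == 0);
	if (n > 1)
		qsort(tbl, n, sizeof(*tbl), cpu_meta_cmp);
}

/*
 * Look up the entry for a CPU number. Returns NULL when that CPU has no
 * metadata, instead of indexing past the end of the table.
 */
static const struct cpu_meta *cpu_meta_find(const struct cpu_meta *tbl,
					    size_t n, int cpu)
{
	struct cpu_meta key = { .cpu = cpu, .magic = 0 };

	if (!tbl || n == 0)
		return NULL;
	return bsearch(&key, tbl, n, sizeof(*tbl), cpu_meta_cmp);
}
```

For a 6 CPU system with CPUs 0-2 offline the table would hold entries
for CPUs 3, 4 and 5 only; cpu_meta_find() then returns NULL for CPU 0
and a valid entry for CPU 4, where a direct tbl[4] access would be out
of range.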
...
...

+		if (etmq->exc_num == CS_ETMV3_EXC_SVC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_CALL)
+			return true;
+
+	return false;
+}
+
+static bool cs_etm__is_async_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV3_EXC_ASYNC_DATA_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_PE_RESET ||
+		    etmq->exc_num == CS_ETMV3_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV3_EXC_FIQ)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_RESET ||
+		    etmq->exc_num == CS_ETMV4_EXC_DEBUG_HALT ||
+		    etmq->exc_num == CS_ETMV4_EXC_SYSTEM_ERROR ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_DEBUG ||
+		    etmq->exc_num == CS_ETMV4_EXC_IRQ ||
+		    etmq->exc_num == CS_ETMV4_EXC_FIQ)
+			return true;
+
+	return false;
+}
+
+static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	int cpu = etmq->packet->cpu;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv3_magic)
+		if (etmq->exc_num == CS_ETMV3_EXC_SMC ||
+		    etmq->exc_num == CS_ETMV3_EXC_HYP ||
+		    etmq->exc_num == CS_ETMV3_EXC_JAZELLE ||
+		    etmq->exc_num == CS_ETMV3_EXC_UNDEFINED_INSTR ||
+		    etmq->exc_num == CS_ETMV3_EXC_PREFETCH_ABORT ||
+		    etmq->exc_num == CS_ETMV3_EXC_DATA_FAULT ||
+		    etmq->exc_num == CS_ETMV3_EXC_GENERIC)
+			return true;
+
+	if (etm->metadata[cpu][CS_ETM_MAGIC] == __perf_cs_etmv4_magic)
+		if (etmq->exc_num == CS_ETMV4_EXC_TRAP ||
+		    etmq->exc_num == CS_ETMV4_EXC_ALIGNMENT ||
+		    etmq->exc_num == CS_ETMV4_EXC_INST_FAULT ||
+		    etmq->exc_num == CS_ETMV4_EXC_DATA_FAULT)
+			return true;
+
+	return false;
+}
+
 static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 {
   struct cs_etm_packet *packet = etmq->packet;
@@ -1192,9 +1260,67 @@ static void cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
 		if (prev_packet->sample_type == CS_ETM_RANGE)
 			prev_packet->flags |= PERF_IP_FLAG_BRANCH |
 					      PERF_IP_FLAG_TRACE_END;
+
+		etmq->exc_num = UINT_MAX;
 		break;
 	case CS_ETM_EXCEPTION:
+		etmq->exc_num = packet->exc_num;

I thought that due to filtering we can't do things like that, i.e. guarantee that
exception and exception_ret can be correlated.
Actually, I have given some thought to the issue of unpaired exception
and exception_ret packets :).  As you can see, both the discontinuity
and exception_ret packets reset the exception number to UINT_MAX.  This
avoids using an old exception number after a trace discontinuity.
...
Moreover this won't work for CPU
wide scenarios where packets from various processors can be present in the same
queue.  Here the exception from one processor would trample the exception from
another processor.
Yeah, I am not clear on this point, so let me check a couple of things
with you.
Firstly, for CPU wide trace, does every CPU have its own cs_etm_queue
struct, so that we can use each CPU's cs_etm_queue::exception_num to
maintain the exception number?
If all CPUs share a single cs_etm_queue in CPU wide trace, can we add
a new array cs_etm_queue::exception_num[] to maintain the exception
number for every CPU?
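The second option in the question, one exception number per CPU inside
a shared queue, could look roughly like the sketch below. It is only a
model of the idea being asked about: struct exc_tracker and its helpers
are hypothetical names, and nr_cpus is assumed to be the highest
possible CPU number plus one so that indexing by CPU stays in bounds.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical per-CPU store; a stand-in for cs_etm_queue::exception_num[]. */
struct exc_tracker {
	unsigned int *exc_num;	/* one slot per CPU, UINT_MAX when empty */
	int nr_cpus;
};

static int exc_tracker_init(struct exc_tracker *t, int nr_cpus)
{
	int i;

	assert(t != NULL);
	if (nr_cpus <= 0)
		return -1;
	t->exc_num = malloc((size_t)nr_cpus * sizeof(*t->exc_num));
	if (!t->exc_num)
		return -1;
	t->nr_cpus = nr_cpus;
	for (i = 0; i < nr_cpus; i++)
		t->exc_num[i] = UINT_MAX;
	return 0;
}

/* Record the exception number seen on one CPU; other CPUs are untouched. */
static void exc_tracker_record(struct exc_tracker *t, int cpu, unsigned int num)
{
	if (cpu >= 0 && cpu < t->nr_cpus)
		t->exc_num[cpu] = num;
}

/* Return the number recorded for this CPU and reset its slot. */
static unsigned int exc_tracker_take(struct exc_tracker *t, int cpu)
{
	unsigned int num;

	if (cpu < 0 || cpu >= t->nr_cpus)
		return UINT_MAX;
	num = t->exc_num[cpu];
	t->exc_num[cpu] = UINT_MAX;
	return num;
}

static void exc_tracker_free(struct exc_tracker *t)
{
	free(t->exc_num);
	t->exc_num = NULL;
	t->nr_cpus = 0;
}
```

Because each CPU has its own slot, an exception recorded for CPU 1 does
not overwrite the one still pending on CPU 0. It does not address the
filtering concern, where the matching exception packet may never have
been traced at all.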
...
To me the only thing we can do is treat exception like discontinuity but I have
been plagued by a serious head cold for days now and I could be missing
something.  Apologies if that is the case.
No rush; the information so far is enough for me to prepare a new patch
series, and we can move forward based on the new patch set.
Thanks a lot for reviewing.
[...]
Thanks,
Leo Yan

    
