[PATCH 0/9] perf tools: Cleanup for CoreSight trace decoding code

List overview All Threads
Download

newer

older

Managed Service Providers (MSPs)...

OpenCSD ITM support

Mathieu Poirier

14 Jun 2018 14 Jun '18

7:41 p.m.

This set serves no other purpose than doing preparatory work before support for CPU-wide trace scenarios is introduced. It cleans up a lot of code and reorganise things so that what is currently being used for per-thread decoding can be used as well for CPU-wide sessions.

It is based on Leo's latest submission to the CS mailing list. I am sending this here to circumvent the constraints imposed by the merge window and will go public next week when the first release candidate for the 4.18 cycle has been minted.

Leo and Rob, you guys are now familiar with the perf tools code and as such would appreciate your thougths on this.

The code has been uploaded here[1].

Regards, Mathieu

[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git perf-opencsd-master-cpu-wide-support

Mathieu Poirier (9): perf tools: Remove unused structure field "state" perf tools: Introducing cs_etm_decoder__init_dparams() perf tools: Introducing function cs_etm__init_trace_params() perf tools: Cleaning up function cs_etm__alloc_queue() perf tools: Rethink kernel address initialisation perf tools: Make cs_etm__run_decoder() queue independent perf tools: Modularize main decoder function perf tools: Modularize main packet processing loop perf tools: Modularize auxtrace_buffer fetch function

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 13 +- tools/perf/util/cs-etm.c | 339 ++++++++++++++---------- 2 files changed, 210 insertions(+), 142 deletions(-)

-- 2.7.4

Show replies by date

Mathieu Poirier

14 Jun 14 Jun

7:41 p.m.

New subject: [PATCH 1/9] perf tools: Remove unused structure field "state"

Field "state" in structure cs_etm_queue belongs to a bygone era and no longer servers a purpose.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/util/cs-etm.c | 1 - 1 file changed, 1 deletion(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index cbfbe34c51a0..492160393a57 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -71,7 +71,6 @@ struct cs_etm_queue { struct thread *thread; struct cs_etm_decoder *decoder; struct auxtrace_buffer *buffer; - const struct cs_etm_state *state; union perf_event *event_buf; unsigned int queue_nr; pid_t pid, tid;

-- 2.7.4

Mathieu Poirier

7:41 p.m.

New subject: [PATCH 2/9] perf tools: Introducing cs_etm_decoder__init_dparams()

Introducing function cs_etm_decoder__init_dparams() to avoid repeating code at two different places.

No change of functionatlity is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 3 +- tools/perf/util/cs-etm.c | 41 +++++++++++++++++-------- 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 108dc9dbb146..faa8bafca3c6 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -82,9 +82,10 @@ enum { CS_ETM_PROTO_ETMV4d, };

-enum { +enum cs_etm_decoder_operation { CS_ETM_OPERATION_PRINT = 1, CS_ETM_OPERATION_DECODE, + CS_ETM_OPERATION_MAX, };

int cs_etm_decoder__process_data_block(struct cs_etm_decoder *decoder, diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 492160393a57..093f2b6a93bb 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -103,6 +103,28 @@ static void cs_etm__packet_dump(const char *pkt_string) fflush(stdout); }

+static int cs_etm__init_decoder_params(struct cs_etm_decoder_params *d_params, + struct cs_etm_queue *etmq, + enum cs_etm_decoder_operation mode) +{ + int ret = -EINVAL; + + if (!(mode < CS_ETM_OPERATION_MAX)) + goto out; + + d_params->packet_printer = cs_etm__packet_dump; + d_params->operation = mode; + d_params->data = etmq; + d_params->formatted = true; + d_params->fsyncs = false; + d_params->hsyncs = false; + d_params->frame_aligned = true; + + ret = 0; +out: + return ret; +} + static void cs_etm__dump_event(struct cs_etm_auxtrace *etm, struct auxtrace_buffer *buffer) { @@ -133,12 +155,9 @@ static void cs_etm__dump_event(struct cs_etm_auxtrace *etm, }

/* Set decoder parameters to simply print the trace packets */ - d_params.packet_printer = cs_etm__packet_dump; - d_params.operation = CS_ETM_OPERATION_PRINT; - d_params.formatted = true; - d_params.fsyncs = false; - d_params.hsyncs = false; - d_params.frame_aligned = true; + if (cs_etm__init_decoder_params(&d_params, NULL, + CS_ETM_OPERATION_PRINT)) + return;

decoder = cs_etm_decoder__new(etm->num_cpu, &d_params, t_params);

@@ -355,13 +374,9 @@ static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, }

/* Set decoder parameters to simply print the trace packets */ - d_params.packet_printer = cs_etm__packet_dump; - d_params.operation = CS_ETM_OPERATION_DECODE; - d_params.formatted = true; - d_params.fsyncs = false; - d_params.hsyncs = false; - d_params.frame_aligned = true; - d_params.data = etmq; + if (cs_etm__init_decoder_params(&d_params, etmq, + CS_ETM_OPERATION_DECODE)) + goto out_free;

etmq->decoder = cs_etm_decoder__new(etm->num_cpu, &d_params, t_params);

-- 2.7.4

Kim Phillips

11:44 p.m.

New subject: [PATCH 2/9] perf tools: Introducing cs_etm_decoder__init_dparams()

On Thu, 14 Jun 2018 13:41:31 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...

Introducing function cs_etm_decoder__init_dparams() to avoid repeating code at two different places.

No change of functionatlity is introduced by this patch.

functionality

...

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 3 +- tools/perf/util/cs-etm.c | 41 +++++++++++++++++-------- 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 108dc9dbb146..faa8bafca3c6 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -82,9 +82,10 @@ enum { CS_ETM_PROTO_ETMV4d, }; -enum { +enum cs_etm_decoder_operation { CS_ETM_OPERATION_PRINT = 1, CS_ETM_OPERATION_DECODE,

CS_ETM_OPERATION_MAX,

What does adding a _MAX buy us here? The compiler ought to do the enum check.

Same comment applies for the next patch in this series.

Kikm

Mathieu Poirier

15 Jun 15 Jun

3:52 p.m.

New subject: [PATCH 2/9] perf tools: Introducing cs_etm_decoder__init_dparams()

On Thu, 14 Jun 2018 at 17:44, Kim Phillips kim.phillips@arm.com wrote:

...

On Thu, 14 Jun 2018 13:41:31 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
Introducing function cs_etm_decoder__init_dparams() to avoid repeating code at two different places.

No change of functionatlity is introduced by this patch.
           functionality
...
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 3 +- tools/perf/util/cs-etm.c | 41 +++++++++++++++++-------- 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 108dc9dbb146..faa8bafca3c6 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -82,9 +82,10 @@ enum { CS_ETM_PROTO_ETMV4d, };

-enum { +enum cs_etm_decoder_operation { CS_ETM_OPERATION_PRINT = 1, CS_ETM_OPERATION_DECODE,
CS_ETM_OPERATION_MAX,
What does adding a _MAX buy us here? The compiler ought to do the enum check.

The compiler doesn't do the check.

...

Same comment applies for the next patch in this series.

Kikm

Mathieu Poirier

14 Jun 14 Jun

7:41 p.m.

New subject: [PATCH 3/9] perf tools: Introducing function cs_etm__init_trace_params()

Introducing function cs_etm__init_trace_params() to avoid repeating code at two different places.

No change of functionatlity is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 3 +- tools/perf/util/cs-etm.c | 66 +++++++++++++++---------- 2 files changed, 42 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index faa8bafca3c6..226f5db37d34 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -76,10 +76,11 @@ struct cs_etm_decoder_params { * The following enums are indexed starting with 1 to align with the * open source coresight trace decoder library. */ -enum { +enum cs_etm_decoder_protocol { CS_ETM_PROTO_ETMV3 = 1, CS_ETM_PROTO_ETMV4i, CS_ETM_PROTO_ETMV4d, + CS_ETM_PROTO_MAX, };

enum cs_etm_decoder_operation { diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 093f2b6a93bb..7803b28ec9f0 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -103,6 +103,37 @@ static void cs_etm__packet_dump(const char *pkt_string) fflush(stdout); }

+static void cs_etm__set_trace_param(struct cs_etm_trace_params *t_params, + struct cs_etm_auxtrace *etm, int idx, + enum cs_etm_decoder_protocol protocol) +{ + t_params->protocol = protocol; + t_params->etmv4.reg_idr0 = etm->metadata[idx][CS_ETMV4_TRCIDR0]; + t_params->etmv4.reg_idr1 = etm->metadata[idx][CS_ETMV4_TRCIDR1]; + t_params->etmv4.reg_idr2 = etm->metadata[idx][CS_ETMV4_TRCIDR2]; + t_params->etmv4.reg_idr8 = etm->metadata[idx][CS_ETMV4_TRCIDR8]; + t_params->etmv4.reg_configr = etm->metadata[idx][CS_ETMV4_TRCCONFIGR]; + t_params->etmv4.reg_traceidr = etm->metadata[idx][CS_ETMV4_TRCTRACEIDR]; +} + +static int cs_etm__init_trace_params(struct cs_etm_trace_params *t_params, + struct cs_etm_auxtrace *etm, + enum cs_etm_decoder_protocol protocol) +{ + int i, ret = 0; + + if (!(protocol < CS_ETM_PROTO_MAX)) { + ret = -EINVAL; + goto out; + } + + for (i = 0; i < etm->num_cpu; i++) + cs_etm__set_trace_param(&t_params[i], etm, i, protocol); + +out: + return ret; +} + static int cs_etm__init_decoder_params(struct cs_etm_decoder_params *d_params, struct cs_etm_queue *etmq, enum cs_etm_decoder_operation mode) @@ -128,7 +159,7 @@ static int cs_etm__init_decoder_params(struct cs_etm_decoder_params *d_params, static void cs_etm__dump_event(struct cs_etm_auxtrace *etm, struct auxtrace_buffer *buffer) { - int i, ret; + int ret; const char *color = PERF_COLOR_BLUE; struct cs_etm_decoder_params d_params; struct cs_etm_trace_params *t_params; @@ -142,17 +173,11 @@ static void cs_etm__dump_event(struct cs_etm_auxtrace *etm,

/* Use metadata to fill in trace parameters for trace decoder */ t_params = zalloc(sizeof(*t_params) * etm->num_cpu); - for (i = 0; i < etm->num_cpu; i++) { - t_params[i].protocol = CS_ETM_PROTO_ETMV4i; - t_params[i].etmv4.reg_idr0 = etm->metadata[i][CS_ETMV4_TRCIDR0]; - t_params[i].etmv4.reg_idr1 = etm->metadata[i][CS_ETMV4_TRCIDR1]; - t_params[i].etmv4.reg_idr2 = etm->metadata[i][CS_ETMV4_TRCIDR2]; - t_params[i].etmv4.reg_idr8 = etm->metadata[i][CS_ETMV4_TRCIDR8]; - t_params[i].etmv4.reg_configr = - etm->metadata[i][CS_ETMV4_TRCCONFIGR]; - t_params[i].etmv4.reg_traceidr = - etm->metadata[i][CS_ETMV4_TRCTRACEIDR]; - } + if (!t_params) + return; + + if (cs_etm__init_trace_params(t_params, etm, CS_ETM_PROTO_ETMV4i)) + return;

/* Set decoder parameters to simply print the trace packets */ if (cs_etm__init_decoder_params(&d_params, NULL, @@ -312,7 +337,6 @@ static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u64 address, static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, unsigned int queue_nr) { - int i; struct cs_etm_decoder_params d_params; struct cs_etm_trace_params *t_params; struct cs_etm_queue *etmq; @@ -357,23 +381,13 @@ static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm,

/* Use metadata to fill in trace parameters for trace decoder */ t_params = zalloc(sizeof(*t_params) * etm->num_cpu); - if (!t_params) goto out_free;

- for (i = 0; i < etm->num_cpu; i++) { - t_params[i].protocol = CS_ETM_PROTO_ETMV4i; - t_params[i].etmv4.reg_idr0 = etm->metadata[i][CS_ETMV4_TRCIDR0]; - t_params[i].etmv4.reg_idr1 = etm->metadata[i][CS_ETMV4_TRCIDR1]; - t_params[i].etmv4.reg_idr2 = etm->metadata[i][CS_ETMV4_TRCIDR2]; - t_params[i].etmv4.reg_idr8 = etm->metadata[i][CS_ETMV4_TRCIDR8]; - t_params[i].etmv4.reg_configr = - etm->metadata[i][CS_ETMV4_TRCCONFIGR]; - t_params[i].etmv4.reg_traceidr = - etm->metadata[i][CS_ETMV4_TRCTRACEIDR]; - } + if (cs_etm__init_trace_params(t_params, etm, CS_ETM_PROTO_ETMV4i)) + goto out_free;

- /* Set decoder parameters to simply print the trace packets */ + /* Set decoder parameters to decode trace packets */ if (cs_etm__init_decoder_params(&d_params, etmq, CS_ETM_OPERATION_DECODE)) goto out_free;

-- 2.7.4

Mathieu Poirier

7:41 p.m.

New subject: [PATCH 4/9] perf tools: Cleaning up function cs_etm__alloc_queue()

Function cs_etm__alloc_queue() should be concerned with only the allocation of memory for the etmq and accompanying decoder. Everything else should be done in the calling function.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/util/cs-etm.c | 36 ++++++++++++++++-------------------- 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 7803b28ec9f0..0ada0674e0fe 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -334,8 +334,7 @@ static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u64 address, return len; }

-static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, - unsigned int queue_nr) +static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm) { struct cs_etm_decoder_params d_params; struct cs_etm_trace_params *t_params; @@ -373,12 +372,6 @@ static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, if (!etmq->event_buf) goto out_free;

- etmq->etm = etm; - etmq->queue_nr = queue_nr; - etmq->pid = -1; - etmq->tid = -1; - etmq->cpu = -1; - /* Use metadata to fill in trace parameters for trace decoder */ t_params = zalloc(sizeof(*t_params) * etm->num_cpu); if (!t_params) @@ -408,9 +401,6 @@ static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, cs_etm__mem_access)) goto out_free_decoder;

- etmq->offset = 0; - etmq->period_instructions = 0; - return etmq;

out_free_decoder: @@ -430,24 +420,30 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, unsigned int queue_nr) { + int ret = 0; struct cs_etm_queue *etmq = queue->priv;

if (list_empty(&queue->head) || etmq) - return 0; + goto out;

- etmq = cs_etm__alloc_queue(etm, queue_nr); + etmq = cs_etm__alloc_queue(etm);

- if (!etmq) - return -ENOMEM; + if (!etmq) { + ret = -ENOMEM; + goto out; + }

queue->priv = etmq; - - if (queue->cpu != -1) - etmq->cpu = queue->cpu; - + etmq->etm = etm; + etmq->queue_nr = queue_nr; + etmq->cpu = queue->cpu; etmq->tid = queue->tid; + etmq->pid = -1; + etmq->offset = 0; + etmq->period_instructions = 0;

- return 0; +out: + return ret; }

static int cs_etm__setup_queues(struct cs_etm_auxtrace *etm)

-- 2.7.4

Kim Phillips

11:53 p.m.

New subject: [PATCH 4/9] perf tools: Cleaning up function cs_etm__alloc_queue()

On Thu, 14 Jun 2018 13:41:33 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...

@@ -430,24 +420,30 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, unsigned int queue_nr) {

int ret = 0; struct cs_etm_queue *etmq = queue->priv;

if (list_empty(&queue->head) || etmq)
return 0;
goto out;

What does 'goto out' buy us here? Having the return in-place is easier/clearer to read.

...

etmq = cs_etm__alloc_queue(etm, queue_nr);

etmq = cs_etm__alloc_queue(etm);
if (!etmq)
return -ENOMEM;
if (!etmq) {
ret = -ENOMEM;
goto out;
}

same here.

...

queue->priv = etmq;
if (queue->cpu != -1)
etmq->cpu = queue->cpu;
etmq->etm = etm;

etmq->queue_nr = queue_nr;

etmq->cpu = queue->cpu;

the != -1 condition above is being removed and not replaced, yet there's no explanation in the commit text. Should it be a part of a different patch?

Kim

Mathieu Poirier

15 Jun 15 Jun

4:03 p.m.

New subject: [PATCH 4/9] perf tools: Cleaning up function cs_etm__alloc_queue()

On Thu, 14 Jun 2018 at 17:53, Kim Phillips kim.phillips@arm.com wrote:

...

On Thu, 14 Jun 2018 13:41:33 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
@@ -430,24 +420,30 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, unsigned int queue_nr) {
int ret = 0;
struct cs_etm_queue *etmq = queue->priv;

if (list_empty(&queue->head) || etmq)
        return 0;
        goto out;
What does 'goto out' buy us here? Having the return in-place is easier/clearer to read.

It streamlines the exit path.

...

etmq = cs_etm__alloc_queue(etm, queue_nr);

```
etmq = cs_etm__alloc_queue(etm);
```

```
if (!etmq)
```
```
        return -ENOMEM;
```

```
if (!etmq) {
```
```
        ret = -ENOMEM;
```
```
        goto out;
```
```
}
```

same here.

...

  queue->priv = etmq;

```
if (queue->cpu != -1)
```
```
        etmq->cpu = queue->cpu;
```

```
etmq->etm = etm;
```
```
etmq->queue_nr = queue_nr;
```
```
etmq->cpu = queue->cpu;
```

the != -1 condition above is being removed and not replaced, yet there's no explanation in the commit text. Should it be a part of a different patch?

Variable queue->cpu is either -1 (per-thread) or hold the value of a CPU (CPU-wide). So instead of invariably setting etmq->cpu to -1 in cs_etm__alloc_queue() and checking if it is worth changing that in cs_etm__setup_queue() as the current code does, directly doing etmq->cpu = queue->cpu yields exactly the same result.

...

Kim

Kim Phillips

9:06 p.m.

New subject: [PATCH 4/9] perf tools: Cleaning up function cs_etm__alloc_queue()

On Fri, 15 Jun 2018 10:03:43 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...

On Thu, 14 Jun 2018 at 17:53, Kim Phillips kim.phillips@arm.com wrote:

...
On Thu, 14 Jun 2018 13:41:33 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
@@ -430,24 +420,30 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, unsigned int queue_nr) {
int ret = 0;
struct cs_etm_queue *etmq = queue->priv;

if (list_empty(&queue->head) || etmq)
        return 0;
        goto out;
What does 'goto out' buy us here? Having the return in-place is easier/clearer to read.
It streamlines the exit path.

How? It unnecessarily redirects it and sends the reader hunting.

This is how the function body looks after this patch:

{ int ret = 0; struct cs_etm_queue *etmq = queue->priv;

if (list_empty(&queue->head) || etmq) goto out;

etmq = cs_etm__alloc_queue(etm);

if (!etmq) { ret = -ENOMEM; goto out; }

queue->priv = etmq; etmq->etm = etm; etmq->queue_nr = queue_nr; etmq->cpu = queue->cpu; etmq->tid = queue->tid; etmq->pid = -1; etmq->offset = 0; etmq->period_instructions = 0;

out: return ret; }

And this is how it would look whilst continuing to directly return from either of the exit paths:

{ struct cs_etm_queue *etmq = queue->priv;

if (list_empty(&queue->head) || etmq) return 0;

etmq = cs_etm__alloc_queue(etm); if (!etmq) return -ENOMEM;

queue->priv = etmq; etmq->etm = etm; etmq->queue_nr = queue_nr; etmq->cpu = queue->cpu; etmq->tid = queue->tid; etmq->pid = -1; etmq->offset = 0; etmq->period_instructions = 0;

return 0; }

The latter is easier to follow, and doesn't need a 'ret' variable, any gotos, or any labels at all.

...

...
...
etmq = cs_etm__alloc_queue(etm, queue_nr);
etmq = cs_etm__alloc_queue(etm);
if (!etmq)
        return -ENOMEM;
if (!etmq) {
        ret = -ENOMEM;
        goto out;
}
same here.

...
  queue->priv = etmq;
if (queue->cpu != -1)
        etmq->cpu = queue->cpu;
etmq->etm = etm;
etmq->queue_nr = queue_nr;
etmq->cpu = queue->cpu;
the != -1 condition above is being removed and not replaced, yet there's no explanation in the commit text. Should it be a part of a different patch?
Variable queue->cpu is either -1 (per-thread) or hold the value of a CPU (CPU-wide). So instead of invariably setting etmq->cpu to -1 in cs_etm__alloc_queue() and checking if it is worth changing that in cs_etm__setup_queue() as the current code does, directly doing etmq->cpu = queue->cpu yields exactly the same result.

OK, it would be nice to have that explanation in the commit text.

Thanks,

Kim

Mathieu Poirier

9:41 p.m.

New subject: [PATCH 4/9] perf tools: Cleaning up function cs_etm__alloc_queue()

On Fri, 15 Jun 2018 at 15:06, Kim Phillips kim.phillips@arm.com wrote:

...

On Fri, 15 Jun 2018 10:03:43 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
On Thu, 14 Jun 2018 at 17:53, Kim Phillips kim.phillips@arm.com wrote:

...
On Thu, 14 Jun 2018 13:41:33 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
@@ -430,24 +420,30 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, unsigned int queue_nr) {
int ret = 0;
struct cs_etm_queue *etmq = queue->priv;

if (list_empty(&queue->head) || etmq)
        return 0;
        goto out;
What does 'goto out' buy us here? Having the return in-place is easier/clearer to read.
It streamlines the exit path.
How? It unnecessarily redirects it and sends the reader hunting.

This is how the function body looks after this patch:

{ int ret = 0; struct cs_etm_queue *etmq = queue->priv;
    if (list_empty(&queue->head) || etmq)
            goto out;

    etmq = cs_etm__alloc_queue(etm);

    if (!etmq) {
            ret = -ENOMEM;
            goto out;
    }

    queue->priv = etmq;
    etmq->etm = etm;
    etmq->queue_nr = queue_nr;
    etmq->cpu = queue->cpu;
    etmq->tid = queue->tid;
    etmq->pid = -1;
    etmq->offset = 0;
    etmq->period_instructions = 0;
out: return ret; }

The real gain from this approach becomes clearer when timestamp initialisation (second patchset) gets thrown in the mix.

...

And this is how it would look whilst continuing to directly return from either of the exit paths:

{ struct cs_etm_queue *etmq = queue->priv;
    if (list_empty(&queue->head) || etmq)
            return 0;

    etmq = cs_etm__alloc_queue(etm);
    if (!etmq)
            return -ENOMEM;

    queue->priv = etmq;
    etmq->etm = etm;
    etmq->queue_nr = queue_nr;
    etmq->cpu = queue->cpu;
    etmq->tid = queue->tid;
    etmq->pid = -1;
    etmq->offset = 0;
    etmq->period_instructions = 0;

    return 0;
}

The latter is easier to follow, and doesn't need a 'ret' variable, any gotos, or any labels at all.

...
...
...
etmq = cs_etm__alloc_queue(etm, queue_nr);
etmq = cs_etm__alloc_queue(etm);
if (!etmq)
        return -ENOMEM;
if (!etmq) {
        ret = -ENOMEM;
        goto out;
}
same here.

...
  queue->priv = etmq;
if (queue->cpu != -1)
        etmq->cpu = queue->cpu;
etmq->etm = etm;
etmq->queue_nr = queue_nr;
etmq->cpu = queue->cpu;
the != -1 condition above is being removed and not replaced, yet there's no explanation in the commit text. Should it be a part of a different patch?
Variable queue->cpu is either -1 (per-thread) or hold the value of a CPU (CPU-wide). So instead of invariably setting etmq->cpu to -1 in cs_etm__alloc_queue() and checking if it is worth changing that in cs_etm__setup_queue() as the current code does, directly doing etmq->cpu = queue->cpu yields exactly the same result.
OK, it would be nice to have that explanation in the commit text.

Thanks,

Kim

Kim Phillips

9:59 p.m.

New subject: [PATCH 4/9] perf tools: Cleaning up function cs_etm__alloc_queue()

On Fri, 15 Jun 2018 15:41:52 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...

On Fri, 15 Jun 2018 at 15:06, Kim Phillips kim.phillips@arm.com wrote:

...
On Fri, 15 Jun 2018 10:03:43 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
On Thu, 14 Jun 2018 at 17:53, Kim Phillips kim.phillips@arm.com wrote:

...
On Thu, 14 Jun 2018 13:41:33 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
@@ -430,24 +420,30 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, unsigned int queue_nr) {
int ret = 0;
struct cs_etm_queue *etmq = queue->priv;

if (list_empty(&queue->head) || etmq)
        return 0;
        goto out;
What does 'goto out' buy us here? Having the return in-place is easier/clearer to read.
It streamlines the exit path.
How? It unnecessarily redirects it and sends the reader hunting.

This is how the function body looks after this patch:

{ int ret = 0; struct cs_etm_queue *etmq = queue->priv;
    if (list_empty(&queue->head) || etmq)
            goto out;

    etmq = cs_etm__alloc_queue(etm);

    if (!etmq) {
            ret = -ENOMEM;
            goto out;
    }

    queue->priv = etmq;
    etmq->etm = etm;
    etmq->queue_nr = queue_nr;
    etmq->cpu = queue->cpu;
    etmq->tid = queue->tid;
    etmq->pid = -1;
    etmq->offset = 0;
    etmq->period_instructions = 0;
out: return ret; }
The real gain from this approach becomes clearer when timestamp initialisation (second patchset) gets thrown in the mix.

It should be introduced there then.

Thanks,

Kim

Mathieu Poirier

10:13 p.m.

New subject: [PATCH 4/9] perf tools: Cleaning up function cs_etm__alloc_queue()

On Fri, 15 Jun 2018 at 15:59, Kim Phillips kim.phillips@arm.com wrote:

...

On Fri, 15 Jun 2018 15:41:52 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
On Fri, 15 Jun 2018 at 15:06, Kim Phillips kim.phillips@arm.com wrote:

...
On Fri, 15 Jun 2018 10:03:43 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
On Thu, 14 Jun 2018 at 17:53, Kim Phillips kim.phillips@arm.com wrote:

...
On Thu, 14 Jun 2018 13:41:33 -0600 Mathieu Poirier mathieu.poirier@linaro.org wrote:

...
@@ -430,24 +420,30 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, unsigned int queue_nr) {
int ret = 0;
struct cs_etm_queue *etmq = queue->priv;

if (list_empty(&queue->head) || etmq)
        return 0;
        goto out;
What does 'goto out' buy us here? Having the return in-place is easier/clearer to read.
It streamlines the exit path.
How? It unnecessarily redirects it and sends the reader hunting.

This is how the function body looks after this patch:

{ int ret = 0; struct cs_etm_queue *etmq = queue->priv;
    if (list_empty(&queue->head) || etmq)
            goto out;

    etmq = cs_etm__alloc_queue(etm);

    if (!etmq) {
            ret = -ENOMEM;
            goto out;
    }

    queue->priv = etmq;
    etmq->etm = etm;
    etmq->queue_nr = queue_nr;
    etmq->cpu = queue->cpu;
    etmq->tid = queue->tid;
    etmq->pid = -1;
    etmq->offset = 0;
    etmq->period_instructions = 0;
out: return ret; }
The real gain from this approach becomes clearer when timestamp initialisation (second patchset) gets thrown in the mix.
It should be introduced there then.

I do not agree. The goal of the patch is to re-write the alloc/setup function, which is exactly what I'm doing.

...

Thanks,

Kim

Mathieu Poirier

14 Jun 14 Jun

7:41 p.m.

New subject: [PATCH 5/9] perf tools: Rethink kernel address initialisation

Moving initialisation of the kernel start address to function cs_etm__setup_queues(), considered to be the common denominator for queue initialisation. That way we don't have to repeat the same code at different places.

No change of functionatlity is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/util/cs-etm.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 0ada0674e0fe..1b3360959ab1 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -451,6 +451,9 @@ static int cs_etm__setup_queues(struct cs_etm_auxtrace *etm) unsigned int i; int ret;

+ if (!etm->kernel_start) + etm->kernel_start = machine__kernel_start(etm->machine); + for (i = 0; i < etm->queues.nr_queues; i++) { ret = cs_etm__setup_queue(etm, &etm->queues.queue_array[i], i); if (ret) @@ -1025,14 +1028,10 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)

static int cs_etm__run_decoder(struct cs_etm_queue *etmq) { - struct cs_etm_auxtrace *etm = etmq->etm; struct cs_etm_buffer buffer; size_t buffer_used, processed; int err = 0;

- if (!etm->kernel_start) - etm->kernel_start = machine__kernel_start(etm->machine); - /* Go through each buffer in the queue and decode them one by one */ while (1) { buffer_used = 0;

-- 2.7.4

Mathieu Poirier

7:41 p.m.

New subject: [PATCH 6/9] perf tools: Make cs_etm__run_decoder() queue independent

This patch makes decoding of auxtrace buffer centered around a struct cs_etm_queue. This eliminates surperflous variables and is a precursor for work that simplifies the main decoder loop.

No change in functionality is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ---- tools/perf/util/cs-etm.c | 52 ++++++++++++------------- 2 files changed, 26 insertions(+), 33 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 226f5db37d34..b2b6c8b5ba95 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -17,13 +17,6 @@

struct cs_etm_decoder;

-struct cs_etm_buffer { - const unsigned char *buf; - size_t len; - u64 offset; - u64 ref_timestamp; -}; - enum cs_etm_sample_type { CS_ETM_EMPTY = 0, CS_ETM_RANGE = 1 << 0, diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 1b3360959ab1..7ed705f950a7 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -84,6 +84,8 @@ struct cs_etm_queue { size_t last_branch_pos; struct cs_etm_packet *prev_packet; struct cs_etm_packet *packet; + const unsigned char *buf; + size_t buf_len, buf_used; };

static int cs_etm__update_queues(struct cs_etm_auxtrace *etm); @@ -610,7 +612,7 @@ static int cs_etm__inject_event(union perf_event *event,

static int -cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) +cs_etm__get_trace(struct cs_etm_queue *etmq) { struct auxtrace_buffer *aux_buffer = etmq->buffer; struct auxtrace_buffer *old_buffer = aux_buffer; @@ -624,7 +626,7 @@ cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) if (!aux_buffer) { if (old_buffer) auxtrace_buffer__drop_data(old_buffer); - buff->len = 0; + etmq->buf_len = 0; return 0; }

@@ -644,13 +646,11 @@ cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) if (old_buffer) auxtrace_buffer__drop_data(old_buffer);

- buff->offset = aux_buffer->offset; - buff->len = aux_buffer->size; - buff->buf = aux_buffer->data; + etmq->buf_used = 0; + etmq->buf_len = aux_buffer->size; + etmq->buf = aux_buffer->data;

- buff->ref_timestamp = aux_buffer->reference; - - return buff->len; + return etmq->buf_len; }

static void cs_etm__set_pid_tid_cpu(struct cs_etm_auxtrace *etm, @@ -1028,24 +1028,23 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)

static int cs_etm__run_decoder(struct cs_etm_queue *etmq) { - struct cs_etm_buffer buffer; - size_t buffer_used, processed; + size_t processed; int err = 0;

/* Go through each buffer in the queue and decode them one by one */ while (1) { - buffer_used = 0; - memset(&buffer, 0, sizeof(buffer)); - err = cs_etm__get_trace(&buffer, etmq); - if (err <= 0) - return err; - /* - * We cannot assume consecutive blocks in the data file are - * contiguous, reset the decoder to force re-sync. - */ - err = cs_etm_decoder__reset(etmq->decoder); - if (err != 0) - return err; + if (!etmq->buf_len) { + err = cs_etm__get_trace(etmq); + if (err <= 0) + return err; + /* + * We cannot assume consecutive blocks in the data file + * are contiguous, reset the decoder to force re-sync. + */ + err = cs_etm_decoder__reset(etmq->decoder); + if (err != 0) + return err; + }

/* Run trace decoder until buffer consumed or end of trace */ do { @@ -1053,14 +1052,15 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) err = cs_etm_decoder__process_data_block( etmq->decoder, etmq->offset, - &buffer.buf[buffer_used], - buffer.len - buffer_used, + &etmq->buf[etmq->buf_used], + etmq->buf_len, &processed); if (err) return err;

etmq->offset += processed; - buffer_used += processed; + etmq->buf_used += processed; + etmq->buf_len -= processed;

/* Process each packet in this chunk */ while (1) { @@ -1100,7 +1100,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) break; } } - } while (buffer.len > buffer_used); + } while (etmq->buf_len);

if (err == 0) /* Flush any remaining branch stack entries */

-- 2.7.4

Leo Yan

18 Jun 18 Jun

2:06 p.m.

New subject: [PATCH 6/9] perf tools: Make cs_etm__run_decoder() queue independent

On Thu, Jun 14, 2018 at 01:41:35PM -0600, Mathieu Poirier wrote:

...

This patch makes decoding of auxtrace buffer centered around a struct cs_etm_queue. This eliminates surperflous variables and is a precursor for work that simplifies the main decoder loop.

No change in functionality is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ---- tools/perf/util/cs-etm.c | 52 ++++++++++++------------- 2 files changed, 26 insertions(+), 33 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 226f5db37d34..b2b6c8b5ba95 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -17,13 +17,6 @@ struct cs_etm_decoder; -struct cs_etm_buffer {

const unsigned char *buf;

size_t len;

u64 offset;

u64 ref_timestamp;

-};

enum cs_etm_sample_type { CS_ETM_EMPTY = 0, CS_ETM_RANGE = 1 << 0, diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 1b3360959ab1..7ed705f950a7 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -84,6 +84,8 @@ struct cs_etm_queue { size_t last_branch_pos; struct cs_etm_packet *prev_packet; struct cs_etm_packet *packet;

const unsigned char *buf;

size_t buf_len, buf_used;

}; static int cs_etm__update_queues(struct cs_etm_auxtrace *etm); @@ -610,7 +612,7 @@ static int cs_etm__inject_event(union perf_event *event, static int -cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) +cs_etm__get_trace(struct cs_etm_queue *etmq) { struct auxtrace_buffer *aux_buffer = etmq->buffer; struct auxtrace_buffer *old_buffer = aux_buffer; @@ -624,7 +626,7 @@ cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) if (!aux_buffer) { if (old_buffer) auxtrace_buffer__drop_data(old_buffer);
buff->len = 0;
etmq->buf_len = 0;
return 0; }
@@ -644,13 +646,11 @@ cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) if (old_buffer) auxtrace_buffer__drop_data(old_buffer);

buff->offset = aux_buffer->offset;

buff->len = aux_buffer->size;

buff->buf = aux_buffer->data;

etmq->buf_used = 0;

etmq->buf_len = aux_buffer->size;

etmq->buf = aux_buffer->data;

buff->ref_timestamp = aux_buffer->reference;

return buff->len;

return etmq->buf_len;

} static void cs_etm__set_pid_tid_cpu(struct cs_etm_auxtrace *etm, @@ -1028,24 +1028,23 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) static int cs_etm__run_decoder(struct cs_etm_queue *etmq) {

struct cs_etm_buffer buffer;

size_t buffer_used, processed;

size_t processed; int err = 0;

/* Go through each buffer in the queue and decode them one by one */ while (1) {
buffer_used = 0;
memset(&buffer, 0, sizeof(buffer));
err = cs_etm__get_trace(&buffer, etmq);
if (err <= 0)
	return err;
/*
 * We cannot assume consecutive blocks in the data file are
 * contiguous, reset the decoder to force re-sync.
 */
err = cs_etm_decoder__reset(etmq->decoder);
if (err != 0)
	return err;
if (!etmq->buf_len) {
	err = cs_etm__get_trace(etmq);
	if (err <= 0)
		return err;
	/*
	 * We cannot assume consecutive blocks in the data file
	 * are contiguous, reset the decoder to force re-sync.
	 */
	err = cs_etm_decoder__reset(etmq->decoder);
	if (err != 0)
		return err;
}
/* Run trace decoder until buffer consumed or end of trace */ do { @@ -1053,14 +1052,15 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) err = cs_etm_decoder__process_data_block( etmq->decoder, etmq->offset,
		&buffer.buf[buffer_used],
		buffer.len - buffer_used,
		&etmq->buf[etmq->buf_used],
		etmq->buf_len,
	&processed);
if (err)
	return err;
etmq->offset += processed;
	buffer_used += processed;
	etmq->buf_used += processed;
	etmq->buf_len -= processed;
/* Process each packet in this chunk */ while (1) { @@ -1100,7 +1100,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) break; } }
} while (buffer.len > buffer_used);
} while (etmq->buf_len);

Seems to me, it's more safe to check 'etmq->buf_len > 0'. If buffer size isn't alignment with last processed size, it's possible to introduce infinite loop.

...

if (err == 0) /* Flush any remaining branch stack entries */ -- 2.7.4

CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight

Mathieu Poirier

2:30 p.m.

New subject: [PATCH 6/9] perf tools: Make cs_etm__run_decoder() queue independent

On Mon, 18 Jun 2018 at 08:06, Leo Yan leo.yan@linaro.org wrote:

...

On Thu, Jun 14, 2018 at 01:41:35PM -0600, Mathieu Poirier wrote:

...
This patch makes decoding of auxtrace buffer centered around a struct cs_etm_queue. This eliminates surperflous variables and is a precursor for work that simplifies the main decoder loop.

No change in functionality is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ---- tools/perf/util/cs-etm.c | 52 ++++++++++++------------- 2 files changed, 26 insertions(+), 33 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 226f5db37d34..b2b6c8b5ba95 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -17,13 +17,6 @@

struct cs_etm_decoder;

-struct cs_etm_buffer {
const unsigned char *buf;
size_t len;
u64 offset;
u64 ref_timestamp;
-};

enum cs_etm_sample_type { CS_ETM_EMPTY = 0, CS_ETM_RANGE = 1 << 0, diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 1b3360959ab1..7ed705f950a7 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -84,6 +84,8 @@ struct cs_etm_queue { size_t last_branch_pos; struct cs_etm_packet *prev_packet; struct cs_etm_packet *packet;
const unsigned char *buf;
size_t buf_len, buf_used;
};

static int cs_etm__update_queues(struct cs_etm_auxtrace *etm); @@ -610,7 +612,7 @@ static int cs_etm__inject_event(union perf_event *event,

static int -cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) +cs_etm__get_trace(struct cs_etm_queue *etmq) { struct auxtrace_buffer *aux_buffer = etmq->buffer; struct auxtrace_buffer *old_buffer = aux_buffer; @@ -624,7 +626,7 @@ cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) if (!aux_buffer) { if (old_buffer) auxtrace_buffer__drop_data(old_buffer);
        buff->len = 0;
        etmq->buf_len = 0;
        return 0;
}
@@ -644,13 +646,11 @@ cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) if (old_buffer) auxtrace_buffer__drop_data(old_buffer);
buff->offset = aux_buffer->offset;
buff->len = aux_buffer->size;
buff->buf = aux_buffer->data;
etmq->buf_used = 0;
etmq->buf_len = aux_buffer->size;
etmq->buf = aux_buffer->data;
buff->ref_timestamp = aux_buffer->reference;
return buff->len;
return etmq->buf_len;
}

static void cs_etm__set_pid_tid_cpu(struct cs_etm_auxtrace *etm, @@ -1028,24 +1028,23 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)

static int cs_etm__run_decoder(struct cs_etm_queue *etmq) {
struct cs_etm_buffer buffer;
size_t buffer_used, processed;
size_t processed;
int err = 0;

/* Go through each buffer in the queue and decode them one by one */
while (1) {
        buffer_used = 0;
        memset(&buffer, 0, sizeof(buffer));
        err = cs_etm__get_trace(&buffer, etmq);
        if (err <= 0)
                return err;
        /*
         * We cannot assume consecutive blocks in the data file are
         * contiguous, reset the decoder to force re-sync.
         */
        err = cs_etm_decoder__reset(etmq->decoder);
        if (err != 0)
                return err;
        if (!etmq->buf_len) {
                err = cs_etm__get_trace(etmq);
                if (err <= 0)
                        return err;
                /*
                 * We cannot assume consecutive blocks in the data file
                 * are contiguous, reset the decoder to force re-sync.
                 */
                err = cs_etm_decoder__reset(etmq->decoder);
                if (err != 0)
                        return err;
        }

        /* Run trace decoder until buffer consumed or end of trace */
        do {
@@ -1053,14 +1052,15 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) err = cs_etm_decoder__process_data_block( etmq->decoder, etmq->offset,
                        &buffer.buf[buffer_used],
                        buffer.len - buffer_used,
                        &etmq->buf[etmq->buf_used],
                        etmq->buf_len,
                        &processed);
                if (err)
                        return err;

                etmq->offset += processed;
                buffer_used += processed;
                etmq->buf_used += processed;
                etmq->buf_len -= processed;

                /* Process each packet in this chunk */
                while (1) {
@@ -1100,7 +1100,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) break; } }
        } while (buffer.len > buffer_used);
        } while (etmq->buf_len);
Seems to me, it's more safe to check 'etmq->buf_len > 0'. If buffer size isn't alignment with last processed size, it's possible to introduce infinite loop.

Good point - I'll see if that condition can happen and will take your advise if it is the case.

Thanks, Mathieu

...

...
          if (err == 0)
                  /* Flush any remaining branch stack entries */
-- 2.7.4

CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight

Mathieu Poirier

25 Jun 25 Jun

5:42 p.m.

New subject: [PATCH 6/9] perf tools: Make cs_etm__run_decoder() queue independent

On Mon, 18 Jun 2018 at 08:06, Leo Yan leo.yan@linaro.org wrote:

...

On Thu, Jun 14, 2018 at 01:41:35PM -0600, Mathieu Poirier wrote:

...
This patch makes decoding of auxtrace buffer centered around a struct cs_etm_queue. This eliminates surperflous variables and is a precursor for work that simplifies the main decoder loop.

No change in functionality is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ---- tools/perf/util/cs-etm.c | 52 ++++++++++++------------- 2 files changed, 26 insertions(+), 33 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 226f5db37d34..b2b6c8b5ba95 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -17,13 +17,6 @@

struct cs_etm_decoder;

-struct cs_etm_buffer {
const unsigned char *buf;
size_t len;
u64 offset;
u64 ref_timestamp;
-};

enum cs_etm_sample_type { CS_ETM_EMPTY = 0, CS_ETM_RANGE = 1 << 0, diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 1b3360959ab1..7ed705f950a7 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -84,6 +84,8 @@ struct cs_etm_queue { size_t last_branch_pos; struct cs_etm_packet *prev_packet; struct cs_etm_packet *packet;
const unsigned char *buf;
size_t buf_len, buf_used;
};

static int cs_etm__update_queues(struct cs_etm_auxtrace *etm); @@ -610,7 +612,7 @@ static int cs_etm__inject_event(union perf_event *event,

static int -cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) +cs_etm__get_trace(struct cs_etm_queue *etmq) { struct auxtrace_buffer *aux_buffer = etmq->buffer; struct auxtrace_buffer *old_buffer = aux_buffer; @@ -624,7 +626,7 @@ cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) if (!aux_buffer) { if (old_buffer) auxtrace_buffer__drop_data(old_buffer);
        buff->len = 0;
        etmq->buf_len = 0;
        return 0;
}
@@ -644,13 +646,11 @@ cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) if (old_buffer) auxtrace_buffer__drop_data(old_buffer);
buff->offset = aux_buffer->offset;
buff->len = aux_buffer->size;
buff->buf = aux_buffer->data;
etmq->buf_used = 0;
etmq->buf_len = aux_buffer->size;
etmq->buf = aux_buffer->data;
buff->ref_timestamp = aux_buffer->reference;
return buff->len;
return etmq->buf_len;
}

static void cs_etm__set_pid_tid_cpu(struct cs_etm_auxtrace *etm, @@ -1028,24 +1028,23 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)

static int cs_etm__run_decoder(struct cs_etm_queue *etmq) {
struct cs_etm_buffer buffer;
size_t buffer_used, processed;
size_t processed;
int err = 0;

/* Go through each buffer in the queue and decode them one by one */
while (1) {
        buffer_used = 0;
        memset(&buffer, 0, sizeof(buffer));
        err = cs_etm__get_trace(&buffer, etmq);
        if (err <= 0)
                return err;
        /*
         * We cannot assume consecutive blocks in the data file are
         * contiguous, reset the decoder to force re-sync.
         */
        err = cs_etm_decoder__reset(etmq->decoder);
        if (err != 0)
                return err;
        if (!etmq->buf_len) {
                err = cs_etm__get_trace(etmq);
                if (err <= 0)
                        return err;
                /*
                 * We cannot assume consecutive blocks in the data file
                 * are contiguous, reset the decoder to force re-sync.
                 */
                err = cs_etm_decoder__reset(etmq->decoder);
                if (err != 0)
                        return err;
        }

        /* Run trace decoder until buffer consumed or end of trace */
        do {
@@ -1053,14 +1052,15 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) err = cs_etm_decoder__process_data_block( etmq->decoder, etmq->offset,
                        &buffer.buf[buffer_used],
                        buffer.len - buffer_used,
                        &etmq->buf[etmq->buf_used],
                        etmq->buf_len,
                        &processed);
                if (err)
                        return err;

                etmq->offset += processed;
                buffer_used += processed;
                etmq->buf_used += processed;
                etmq->buf_len -= processed;

                /* Process each packet in this chunk */
                while (1) {
@@ -1100,7 +1100,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) break; } }
        } while (buffer.len > buffer_used);
        } while (etmq->buf_len);
Seems to me, it's more safe to check 'etmq->buf_len > 0'. If buffer size isn't alignment with last processed size, it's possible to introduce infinite loop.

You are correct, if things go off the rails this won't return. I'll fix it.

Thanks, Mathieu

...

...
          if (err == 0)
                  /* Flush any remaining branch stack entries */
-- 2.7.4

CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight

Mathieu Poirier

14 Jun 14 Jun

7:41 p.m.

New subject: [PATCH 7/9] perf tools: Modularize main decoder function

Making the main decoder block modular so that it can be called from different decoding context (timeless vs. non-timeless), avoiding to repeat code.

No change in functionality is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/util/cs-etm.c | 41 +++++++++++++++++++++++++++++------------ 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 7ed705f950a7..dcf86417f66b 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -1026,9 +1026,36 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) return err; }

+static int cs_etm__decode_data_block(struct cs_etm_queue *etmq) +{ + int ret = 0; + size_t processed = 0; + + /* + * Packets are decoded and added to the decoder's packet queue + * until the decoder packet processing callback has requested that + * processing stops or there is nothing left in the buffer. Normal + * operations that stop processing are a timestamp packet or a full + * decoder buffer queue. + */ + ret = cs_etm_decoder__process_data_block(etmq->decoder, + etmq->offset, + &etmq->buf[etmq->buf_used], + etmq->buf_len, + &processed); + if (ret) + goto out; + + etmq->offset += processed; + etmq->buf_used += processed; + etmq->buf_len -= processed; + +out: + return ret; +} + static int cs_etm__run_decoder(struct cs_etm_queue *etmq) { - size_t processed; int err = 0;

/* Go through each buffer in the queue and decode them one by one */ @@ -1048,20 +1075,10 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)

/* Run trace decoder until buffer consumed or end of trace */ do { - processed = 0; - err = cs_etm_decoder__process_data_block( - etmq->decoder, - etmq->offset, - &etmq->buf[etmq->buf_used], - etmq->buf_len, - &processed); + err = cs_etm__decode_data_block(etmq); if (err) return err;

- etmq->offset += processed; - etmq->buf_used += processed; - etmq->buf_len -= processed; - /* Process each packet in this chunk */ while (1) { err = cs_etm_decoder__get_packet(etmq->decoder,

-- 2.7.4

Mathieu Poirier

7:41 p.m.

New subject: [PATCH 8/9] perf tools: Modularize main packet processing loop

Making the main packet processing loop modular so that it can be called from different decoding context (timeless vs. non-timless), avoiding to repeat code.

No change in functionality is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/util/cs-etm.c | 91 ++++++++++++++++++++++++++++-------------------- 1 file changed, 53 insertions(+), 38 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index dcf86417f66b..bfc73b3a4f23 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -1054,6 +1054,52 @@ static int cs_etm__decode_data_block(struct cs_etm_queue *etmq) return ret; }

+static int cs_etm__process_decoder_queue(struct cs_etm_queue *etmq) +{ + int ret; + + /* Process each packet in this chunk */ + while (1) { + ret = cs_etm_decoder__get_packet(etmq->decoder, + etmq->packet); + if (ret <= 0) + /* + * Stop processing this chunk on + * end of data or error + */ + break; + switch (etmq->packet->sample_type) { + case CS_ETM_RANGE: + /* + * If the packet contains an instruction + * range, generate instruction sequence + * events. + */ + cs_etm__sample(etmq); + break; + case CS_ETM_TRACE_ON: + /* + * Discontinuity in trace, flush + * previous branch stack + */ + cs_etm__flush(etmq); + break; + case CS_ETM_EMPTY: + /* + * Should not receive empty packet, + * report error. + */ + pr_err("CS ETM Trace: empty packet\n"); + ret = -EINVAL; + break; + default: + break; + } + } + + return ret; +} + static int cs_etm__run_decoder(struct cs_etm_queue *etmq) { int err = 0; @@ -1079,44 +1125,13 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) if (err) return err;

- /* Process each packet in this chunk */ - while (1) { - err = cs_etm_decoder__get_packet(etmq->decoder, - etmq->packet); - if (err <= 0) - /* - * Stop processing this chunk on - * end of data or error - */ - break; - - switch (etmq->packet->sample_type) { - case CS_ETM_RANGE: - /* - * If the packet contains an instruction - * range, generate instruction sequence - * events. - */ - cs_etm__sample(etmq); - break; - case CS_ETM_TRACE_ON: - /* - * Discontinuity in trace, flush - * previous branch stack - */ - cs_etm__flush(etmq); - break; - case CS_ETM_EMPTY: - /* - * Should not receive empty packet, - * report error. - */ - pr_err("CS ETM Trace: empty packet\n"); - return -EINVAL; - default: - break; - } - } + /* + * Process each packet in this chunk, nothing to do if + * an error occurs other than hoping the next one will + * be better. + */ + err = cs_etm__process_decoder_queue(etmq); + } while (etmq->buf_len);

if (err == 0)

-- 2.7.4

Mathieu Poirier

7:41 p.m.

New subject: [PATCH 9/9] perf tools: Modularize auxtrace_buffer fetch function

Making the auxtrace_buffer fetch function modular so that it can be called from different decoding context (timeless vs. non-timeless), avoiding to repeat code.

No change in functionality is introduced by this patch.

Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/util/cs-etm.c | 42 ++++++++++++++++++++++++++++++------------ 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index bfc73b3a4f23..28b05efcefbc 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -1026,6 +1026,33 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) return err; }

+/* + * cs_etm__get_data_block: Fetch a block from the auxtrace_buffer queue + * if need be. + * Returns: < 0 if error + * = 0 if no more auxtrace_buffer to read + * > 0 if the current buffer isn't empty yet + */ +static int cs_etm__get_data_block(struct cs_etm_queue *etmq) +{ + int ret; + + if (!etmq->buf_len) { + ret = cs_etm__get_trace(etmq); + if (ret <= 0) + return ret; + /* + * We cannot assume consecutive blocks in the data file + * are contiguous, reset the decoder to force re-sync. + */ + ret = cs_etm_decoder__reset(etmq->decoder); + if (ret) + return ret; + } + + return etmq->buf_len; +} + static int cs_etm__decode_data_block(struct cs_etm_queue *etmq) { int ret = 0; @@ -1106,18 +1133,9 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)

/* Go through each buffer in the queue and decode them one by one */ while (1) { - if (!etmq->buf_len) { - err = cs_etm__get_trace(etmq); - if (err <= 0) - return err; - /* - * We cannot assume consecutive blocks in the data file - * are contiguous, reset the decoder to force re-sync. - */ - err = cs_etm_decoder__reset(etmq->decoder); - if (err != 0) - return err; - } + err = cs_etm__get_data_block(etmq); + if (err <= 0) + return err;

/* Run trace decoder until buffer consumed or end of trace */ do {

-- 2.7.4

Robert Walker

18 Jun 18 Jun

10:23 a.m.

Hi Mathieu,

On 14/06/18 20:41, Mathieu Poirier wrote:

...

This set serves no other purpose than doing preparatory work before support for CPU-wide trace scenarios is introduced. It cleans up a lot of code and reorganise things so that what is currently being used for per-thread decoding can be used as well for CPU-wide sessions.

It is based on Leo's latest submission to the CS mailing list. I am sending this here to circumvent the constraints imposed by the merge window and will go public next week when the first release candidate for the 4.18 cycle has been minted.

Leo and Rob, you guys are now familiar with the perf tools code and as such would appreciate your thougths on this.

I've had a quick read through of this and the CPU-wide set and it looks good to me, but I'm off on vacation for a couple of weeks so won't be able to have a detailed look until I return.

Regards

Rob

...

The code has been uploaded here[1].

Regards, Mathieu

[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git perf-opencsd-master-cpu-wide-support

Mathieu Poirier (9): perf tools: Remove unused structure field "state" perf tools: Introducing cs_etm_decoder__init_dparams() perf tools: Introducing function cs_etm__init_trace_params() perf tools: Cleaning up function cs_etm__alloc_queue() perf tools: Rethink kernel address initialisation perf tools: Make cs_etm__run_decoder() queue independent perf tools: Modularize main decoder function perf tools: Modularize main packet processing loop perf tools: Modularize auxtrace_buffer fetch function

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 13 +- tools/perf/util/cs-etm.c | 339 ++++++++++++++---------- 2 files changed, 210 insertions(+), 142 deletions(-)

Leo Yan

20 Jun 20 Jun

6:43 a.m.

On Thu, Jun 14, 2018 at 01:41:29PM -0600, Mathieu Poirier wrote:

...

This set serves no other purpose than doing preparatory work before support for CPU-wide trace scenarios is introduced. It cleans up a lot of code and reorganise things so that what is currently being used for per-thread decoding can be used as well for CPU-wide sessions.

It is based on Leo's latest submission to the CS mailing list. I am sending this here to circumvent the constraints imposed by the merge window and will go public next week when the first release candidate for the 4.18 cycle has been minted.

Leo and Rob, you guys are now familiar with the perf tools code and as such would appreciate your thougths on this.

The code has been uploaded here[1].

I have verified this patch series on Hikey620 board with below commands:

perf record -e cs_etm/@f6402000.etf/ --per-thread uname perf script

I also connect this patch series with the python decoder script, it also can pass the testing with below command:

perf script -s arm-cs-trace-disasm.py -i perf.data \ -F cpu,event,ip,addr,sym -- -d objdump -k ./vmlinux

So you could add my test tag for this patch series:

Tested-by: Leo Yan leo.yan@linaro.org

...

Regards, Mathieu

[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git perf-opencsd-master-cpu-wide-support

Mathieu Poirier (9): perf tools: Remove unused structure field "state" perf tools: Introducing cs_etm_decoder__init_dparams() perf tools: Introducing function cs_etm__init_trace_params() perf tools: Cleaning up function cs_etm__alloc_queue() perf tools: Rethink kernel address initialisation perf tools: Make cs_etm__run_decoder() queue independent perf tools: Modularize main decoder function perf tools: Modularize main packet processing loop perf tools: Modularize auxtrace_buffer fetch function

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 13 +- tools/perf/util/cs-etm.c | 339 ++++++++++++++---------- 2 files changed, 210 insertions(+), 142 deletions(-)

-- 2.7.4

CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight

Mathieu Poirier

2:46 p.m.

On Wed, 20 Jun 2018 at 00:43, Leo Yan leo.yan@linaro.org wrote:

...

On Thu, Jun 14, 2018 at 01:41:29PM -0600, Mathieu Poirier wrote:

...
This set serves no other purpose than doing preparatory work before support for CPU-wide trace scenarios is introduced. It cleans up a lot of code and reorganise things so that what is currently being used for per-thread decoding can be used as well for CPU-wide sessions.

It is based on Leo's latest submission to the CS mailing list. I am sending this here to circumvent the constraints imposed by the merge window and will go public next week when the first release candidate for the 4.18 cycle has been minted.

Leo and Rob, you guys are now familiar with the perf tools code and as such would appreciate your thougths on this.

The code has been uploaded here[1].

I have verified this patch series on Hikey620 board with below commands:

perf record -e cs_etm/@f6402000.etf/ --per-thread uname perf script

I also connect this patch series with the python decoder script, it also can pass the testing with below command:

perf script -s arm-cs-trace-disasm.py -i perf.data \ -F cpu,event,ip,addr,sym -- -d objdump -k ./vmlinux

So you could add my test tag for this patch series:

Tested-by: Leo Yan leo.yan@linaro.org

Perfect, thanks for looking at this. Internal tags are always viewed as suspicious from the community when a patchset is released on the mailing list. As such I will send this set without your tag and you will have to reply it one more time.

...

...
Regards, Mathieu

[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git perf-opencsd-master-cpu-wide-support

Mathieu Poirier (9): perf tools: Remove unused structure field "state" perf tools: Introducing cs_etm_decoder__init_dparams() perf tools: Introducing function cs_etm__init_trace_params() perf tools: Cleaning up function cs_etm__alloc_queue() perf tools: Rethink kernel address initialisation perf tools: Make cs_etm__run_decoder() queue independent perf tools: Modularize main decoder function perf tools: Modularize main packet processing loop perf tools: Modularize auxtrace_buffer fetch function

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 13 +- tools/perf/util/cs-etm.c | 339 ++++++++++++++---------- 2 files changed, 210 insertions(+), 142 deletions(-)

-- 2.7.4

CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight

leo.yan＠linaro.org

4:19 p.m.

On Wed, Jun 20, 2018 at 08:46:19AM -0600, Mathieu Poirier wrote:

...

On Wed, 20 Jun 2018 at 00:43, Leo Yan leo.yan@linaro.org wrote:

...
On Thu, Jun 14, 2018 at 01:41:29PM -0600, Mathieu Poirier wrote:

...
This set serves no other purpose than doing preparatory work before support for CPU-wide trace scenarios is introduced. It cleans up a lot of code and reorganise things so that what is currently being used for per-thread decoding can be used as well for CPU-wide sessions.

It is based on Leo's latest submission to the CS mailing list. I am sending this here to circumvent the constraints imposed by the merge window and will go public next week when the first release candidate for the 4.18 cycle has been minted.

Leo and Rob, you guys are now familiar with the perf tools code and as such would appreciate your thougths on this.

The code has been uploaded here[1].

I have verified this patch series on Hikey620 board with below commands:

perf record -e cs_etm/@f6402000.etf/ --per-thread uname perf script

I also connect this patch series with the python decoder script, it also can pass the testing with below command:

perf script -s arm-cs-trace-disasm.py -i perf.data \ -F cpu,event,ip,addr,sym -- -d objdump -k ./vmlinux

So you could add my test tag for this patch series:

Tested-by: Leo Yan leo.yan@linaro.org

Perfect, thanks for looking at this. Internal tags are always viewed as suspicious from the community when a patchset is released on the mailing list. As such I will send this set without your tag and you will have to reply it one more time.

Sure, will do this.

...

...
...
Regards, Mathieu

[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git perf-opencsd-master-cpu-wide-support

Mathieu Poirier (9): perf tools: Remove unused structure field "state" perf tools: Introducing cs_etm_decoder__init_dparams() perf tools: Introducing function cs_etm__init_trace_params() perf tools: Cleaning up function cs_etm__alloc_queue() perf tools: Rethink kernel address initialisation perf tools: Make cs_etm__run_decoder() queue independent perf tools: Modularize main decoder function perf tools: Modularize main packet processing loop perf tools: Modularize auxtrace_buffer fetch function

tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 13 +- tools/perf/util/cs-etm.c | 339 ++++++++++++++---------- 2 files changed, 210 insertions(+), 142 deletions(-)

-- 2.7.4

CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight

2581

days inactive

2592

days old

coresight@lists.linaro.org

24 comments

participants

tags (0)

participants (5)

Kim Phillips
Leo Yan
leo.yan＠linaro.org
Mathieu Poirier
Robert Walker