Hi Leo,
On Mon, 11 Jan 2021 at 15:06, Leo Yan leo.yan@linaro.org wrote:
Hi Mike,
On Mon, Jan 11, 2021 at 12:09:12PM +0000, Mike Leach wrote:
Hi Leo,
I think there is an issue here in that your modification assumes that all cpus in the system are of the same ETM type. The original routine allowed for differing ETM types, and thus differing cpu ETM field lengths between ETMv4 and ETMv3: the field size was chosen only after the relevant magic number for the cpu ETM was read.
You have replaced two different sizes with a single calculated size.
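To illustrate the point about per-cpu sizes, here is a minimal standalone sketch (not the perf code itself) of walking a buffer in which each cpu block's length is chosen from its own magic number. All DEMO_* names are hypothetical, and the magic values and field counts are placeholders assumed for the example rather than taken from cs-etm.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only; assumed, not copied from perf. */
#define DEMO_ETMV3_MAGIC  0x3030303030303030ULL
#define DEMO_ETMV4_MAGIC  0x4040404040404040ULL
#define DEMO_ETMV3_FIELDS 6	/* assumed per-cpu block length for ETMv3 */
#define DEMO_ETMV4_FIELDS 9	/* assumed per-cpu block length for ETMv4 */

/*
 * Return the number of 64-bit fields in the per-cpu block that starts
 * with the given magic, or 0 if the magic is not recognised.
 */
static size_t demo_block_len(uint64_t magic)
{
	if (magic == DEMO_ETMV3_MAGIC)
		return DEMO_ETMV3_FIELDS;
	if (magic == DEMO_ETMV4_MAGIC)
		return DEMO_ETMV4_FIELDS;
	return 0;
}

/*
 * Walk a buffer of concatenated per-cpu blocks and count them.
 * The size of each block is decided only after its magic is read, so
 * ETMv3 and ETMv4 blocks can be mixed in one buffer.
 * Returns the number of blocks, or -1 on an unknown magic or a
 * truncated block.
 */
static int demo_count_cpu_blocks(const uint64_t *buf, size_t nr_words)
{
	size_t i = 0;
	int nr_cpus = 0;

	while (i < nr_words) {
		size_t len = demo_block_len(buf[i]);

		if (len == 0 || len > nr_words - i)
			return -1;
		i += len;
		nr_cpus++;
	}
	return nr_cpus;
}
```

With a single calculated size for every cpu, the second block of a mixed ETMv3/ETMv4 buffer would be read from the wrong offset, which is the case the per-magic lookup avoids.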
Thanks for pointing this out.
Moving forwards we are seeing the newer FEAT_ETE protocol drivers appearing on the list, which will ultimately need a new metadata structure.
We have had discussions within ARM regarding the changing of the format to be more self describing - which should probably be opened out to the CS mailing list.
I think we have two options here. One option is to use __perf_cs_etmv3_magic/__perf_cs_etmv4_magic as an indicator for the start of the next metadata array: when copying the metadata, always check the next item in the buffer, and if it is __perf_cs_etmv3_magic or __perf_cs_etmv4_magic, break out of the loop and start copying the metadata array for the next CPU. The suggested change is pasted below.
Another option is that I drop patches 03,05/07 from the series and leave the backward compatibility fix for a separate patch series using the self describing method. In particular, if you think the first option will cause trouble for enabling self describing later, then I am happy to drop patches 03,05.
What do you think?
Thanks, Leo
---8<---
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index a2a369e2fbb6..edaec57362f0 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2558,12 +2558,19 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 				err = -ENOMEM;
 				goto err_free_metadata;
 			}
-			for (k = 0; k < CS_ETM_PRIV_MAX; k++)
+			for (k = 0; k < CS_ETM_PRIV_MAX; k++) {
 				metadata[j][k] = ptr[i + k];
+				if (ptr[i + k + 1] == __perf_cs_etmv3_magic ||
+				    ptr[i + k + 1] == __perf_cs_etmv4_magic) {
+					k++;
+					break;
+				}
+			}
 
 			/* The traceID is our handle */
 			idx = metadata[j][CS_ETM_ETMTRACEIDR];
-			i += CS_ETM_PRIV_MAX;
+			i += k;
 		} else if (ptr[i] == __perf_cs_etmv4_magic) {
 			metadata[j] = zalloc(sizeof(*metadata[j]) * CS_ETMV4_PRIV_MAX);
@@ -2571,12 +2578,19 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 				err = -ENOMEM;
 				goto err_free_metadata;
 			}
-			for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
+			for (k = 0; k < CS_ETMV4_PRIV_MAX; k++) {
 				metadata[j][k] = ptr[i + k];
+				if (ptr[i + k + 1] == __perf_cs_etmv3_magic ||
+				    ptr[i + k + 1] == __perf_cs_etmv4_magic) {
+					k++;
+					break;
+				}
+			}
 
 			/* The traceID is our handle */
 			idx = metadata[j][CS_ETMV4_TRCTRACEIDR];
-			i += CS_ETMV4_PRIV_MAX;
+			i += k;
 		}
 
 		/* Get an RB node for this CPU */
That would be a spot fix for the read/copy case, but it will not fix the print routine, which will still bail out on older versions of the format (when using perf report --dump).
The "self describing" format I have been looking at will add an NR_PARAMS value to the common block in the CPU metadata parameter list, increment the header version to '1' and update the format writer to use the version 1 format while having the reader understand both v0 and v1 formats.
i.e. in the perf cs-etm.h I add:

/*
 * Update the version for new format.
 *
 * New version 1 format adds a param count to the per cpu metadata.
 * This allows easy adding of new metadata parameters.
 * Requires that new params always added after current ones.
 * Also allows client reader to handle file versions that are different by
 * checking the number of params in the file vs the number expected.
 */
#define CS_HEADER_CURRENT_VERSION 1

/* Beginning of header common to both ETMv3 and V4 */
enum {
	CS_ETM_MAGIC,
	CS_ETM_CPU,
	CS_ETM_NR_PARAMS, /* number of parameters to follow in this block */
};
where in version 1, NR_PARAMS indicates the total number of params that follow. New parameters can then be added to the metadata enums and the tool will automatically adjust; it will handle v0 files, plus older and newer files that have differing numbers of parameters, as long as the parameters are only ever added to the end of the list.
I have been working on a patch for this today, which took a little longer than expected as it was more complex than I anticipated (the printing routines for the --dump command!).
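As a rough illustration of how a reader could use the count, here is a minimal standalone sketch, not the patch itself. It handles only a version 1 block; a v0 block has no count and would still need the fixed per-magic size. The DEMO_* names and the figure of four expected params are hypothetical, chosen only for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Common header of a v1 per-cpu block (mirrors the enum above). */
enum { DEMO_MAGIC, DEMO_CPU, DEMO_NR_PARAMS, DEMO_COMMON_LEN };

#define DEMO_EXPECTED_PARAMS 4	/* hypothetical: params this reader knows */

/*
 * Copy the params of one v1 per-cpu block into out[].
 * A file with fewer params than expected leaves the tail of out[] zeroed;
 * a file with more params has the extras skipped. This relies on new
 * params only ever being appended to the end of the list.
 * Returns the number of 64-bit words the block occupies, or 0 if the
 * block does not fit in the avail words remaining.
 */
static size_t demo_read_v1_block(const uint64_t *blk, size_t avail,
				 uint64_t out[DEMO_EXPECTED_PARAMS])
{
	uint64_t nr_in_file;
	size_t nr_copy;

	if (avail < DEMO_COMMON_LEN)
		return 0;
	nr_in_file = blk[DEMO_NR_PARAMS];
	if (nr_in_file > avail - DEMO_COMMON_LEN)
		return 0;

	nr_copy = nr_in_file < DEMO_EXPECTED_PARAMS ?
		  (size_t)nr_in_file : DEMO_EXPECTED_PARAMS;
	memset(out, 0, sizeof(uint64_t) * DEMO_EXPECTED_PARAMS);
	memcpy(out, blk + DEMO_COMMON_LEN, sizeof(uint64_t) * nr_copy);

	/* Advance by what the file says, not by what we expected. */
	return DEMO_COMMON_LEN + (size_t)nr_in_file;
}
```

The return value is what lets the caller step to the next cpu block correctly even when the file was written by an older or newer tool.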
I will post this tomorrow when tested - and if we agree it works it could be rolled into your set - it would make adding the PID parameter easier, and ensure that this new format is available for the upcoming developments.
Regards
Mike
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK