Hi Al,
On Mon, Jun 01, 2020 at 04:00:02PM +0000, Al Grant wrote:
(Hi - posted for review. There was a recent ABI breaking change (see commit reference below), after which perf inject on ETM creates corrupt files. The simple fix ends up with it using the new ABI. I have an alternative patch set that emulates the old ABI - for the benefit of tools like autofdo that can't yet consume the new ABI - but it's much messier.)
From: Al Grant al.grant@arm.com
Commit 42bbabed09ce6208026648a71a45b4394c74585a ("perf tools: Add hw_idx in struct branch_stack") changed the format of branch stacks in perf samples. When samples use this new format, a flag must be set in the corresponding event. Synthesized branch stacks generated from CoreSight ETM trace were using the new format, but not setting the event attribute, leading to consumers seeing corrupt data. This patch fixes the issue by setting the event attribute to indicate use of the new format.
Fixes: 42bbabed09ce ("perf tools: Add hw_idx in struct branch_stack") Signed-off-by: Al Grant al.grant@arm.com Acked-by: Andrea Brunato andrea.brunato@arm.com
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 62d2f9b9ce1b..71a056e29675 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -1332,7 +1332,13 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm, }
if (etm->synth_opts.last_branch)
{ attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
/* We don't use the hardware index, but the sample generation
code uses the new format branch_stack with this field,
so the event attributes must indicate that it's present. */
attr.branch_sample_type |= PERF_SAMPLE_BRANCH_HW_INDEX;
}
For this patch itself, it looks good to me. But the change seems a workaround rather than directly fixing issues.
Essentially, this issue is caused by the data structure definition is mess for branch record, thus, I prefer to use below change (I don't test it, so just want to use the change to demonstrate the basic idea):
Thanks, Leo
---8<---
diff --git a/kernel/events/core.c b/kernel/events/core.c index e296c5c59c6f..f00a49293787 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -6831,6 +6831,8 @@ void perf_output_sample(struct perf_output_handle *handle, perf_output_put(handle, data->br_stack->nr); if (perf_sample_save_hw_index(event)) perf_output_put(handle, data->br_stack->hw_idx); + else + perf_output_put(handle, -1); perf_output_copy(handle, data->br_stack->entries, size); } else { /* diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h index 17b2ccc61094..111bfb9d6635 100644 --- a/tools/perf/util/branch.h +++ b/tools/perf/util/branch.h @@ -50,23 +50,11 @@ struct branch_stack { };
/* - * The hw_idx is only available when PERF_SAMPLE_BRANCH_HW_INDEX is applied. - * Otherwise, the output format of a sample with branch stack is - * struct branch_stack { - * u64 nr; - * struct branch_entry entries[0]; - * } - * Check whether the hw_idx is available, - * and return the corresponding pointer of entries[0]. + * Return the corresponding pointer of entries[0]. */ static inline struct branch_entry *perf_sample__branch_entries(struct perf_sample *sample) { - u64 *entry = (u64 *)sample->branch_stack; - - entry++; - if (sample->no_hw_idx) - return (struct branch_entry *)entry; - return (struct branch_entry *)(++entry); + return sample->branch_stack->entries; }
struct branch_type_stat { diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 96e5171dce41..88a919ee8f7c 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -2159,8 +2159,10 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event, return -EFAULT;
sz = data->branch_stack->nr * sizeof(struct branch_entry); + /* +hw_idx */ + sz += sizeof(u64); if (evsel__has_branch_hw_idx(evsel)) - sz += sizeof(u64); + data->no_hw_idx = false; else data->no_hw_idx = true; OVERFLOW_CHECK(array, sz, max_size);