6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kan Liang <kan.liang@linux.intel.com>
[ Upstream commit 3ef44458071a19e5b5832cdfe6f75273aa521b6e ]
The --total-cycles option may output wrong information when used with --stdio.
For example:
  # perf record -e "{cycles,instructions}",cache-misses -b sleep 1
  # perf report --total-cycles --stdio
The total cycles output of {cycles,instructions} and cache-misses are almost the same.
  # Samples: 938 of events 'anon group { cycles, instructions }'
  # Event count (approx.): 938
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  [Program Block Range]
  # ...............  ..............  ...........  ..........  ..................................................>
  #
             11.19%            2.6K        0.10%          21  [perf_iterate_ctx+48 -> >
              5.79%            1.4K        0.45%          97  [__intel_pmu_enable_all.constprop.0+80 -> __intel_>
              5.11%            1.2K        0.33%          71  [native_write_msr+0 ->>

  # Samples: 293 of event 'cache-misses'
  # Event count (approx.): 293
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  [Program Block Range]
  # ...............  ..............  ...........  ..........  ..................................................>
  #
             11.19%            2.6K        0.13%          21  [perf_iterate_ctx+48 -> >
              5.79%            1.4K        0.59%          97  [__intel_pmu_enable_all.constprop.0+80 -> __intel_>
              5.11%            1.2K        0.43%          71  [native_write_msr+0 ->>
When symbol_conf.event_group is set, 'perf report' should only report the block information of the leader event in a group.
However, the current implementation retrieves the next event's block information, rather than the next group leader's block information.
Make sure the index is updated even if the event is skipped.
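The indexing problem can be illustrated with a small standalone sketch. It is not the perf code: struct demo_event and demo_pick_slots() are hypothetical names, and the sketch assumes one block report per event, stored in event order. It records which report slot each printed group leader would read, once with the index advanced only when a report is printed (the pre-patch behaviour) and once with the index advanced for every event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an evsel: only the group-leader flag matters. */
struct demo_event {
	const char *name;
	bool is_leader;
};

/*
 * Walk the events, skip non-leaders, and store in slots[] the block-report
 * index that each printed leader reads. Returns the number of leaders.
 *
 * fixed == false: the index advances only when a report is printed.
 * fixed == true:  the index advances for every event, skipped or not,
 *                 and the report is read at i - 1.
 */
static size_t demo_pick_slots(const struct demo_event *ev, size_t nr,
			      bool fixed, size_t *slots)
{
	size_t i = 0, printed = 0;

	assert(ev != NULL && slots != NULL);

	for (size_t pos = 0; pos < nr; pos++) {
		if (fixed)
			i++;
		if (!ev[pos].is_leader)
			continue;
		slots[printed++] = fixed ? i - 1 : i++;
	}
	return printed;
}
```

For the events cycles (leader), instructions (member), cache-misses (leader), the pre-patch variant reads slots 0 and 1, so cache-misses is shown with the slot that belongs to instructions. The fixed variant reads slots 0 and 2.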
With the patch,
  # Samples: 293 of event 'cache-misses'
  # Event count (approx.): 293
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  [Program Block Range]
  # ...............  ..............  ...........  ..........  ..................................................>
  #
             37.98%            9.0K        4.05%         299  [perf_event_addr_filters_exec+0 -> perf_event_a>
             11.19%            2.6K        0.28%          21  [perf_iterate_ctx+48 -> >
              5.79%            1.4K        1.32%          97  [__intel_pmu_enable_all.constprop.0+80 -> __intel_>
Fixes: 6f7164fa231a5f36 ("perf report: Sort by sampled cycles percent per block for stdio")
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/perf/builtin-report.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 212760e4dd166..cd2f3f1a75633 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -562,6 +562,7 @@ static int evlist__tty_browse_hists(struct evlist *evlist, struct report *rep, c
 		struct hists *hists = evsel__hists(pos);
 		const char *evname = evsel__name(pos);
 
+		i++;
 		if (symbol_conf.event_group && !evsel__is_group_leader(pos))
 			continue;
 
@@ -571,7 +572,7 @@ static int evlist__tty_browse_hists(struct evlist *evlist, struct report *rep, c
 		hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
 
 		if (rep->total_cycles_mode) {
-			report__browse_block_hists(&rep->block_reports[i++].hist,
+			report__browse_block_hists(&rep->block_reports[i - 1].hist,
 						   rep->min_percent, pos, NULL);
 			continue;
 		}