This patch series adds support for thread stack and callchain.
Patch 01 refactors the instruction size calculation in preparation for patch 02.
Patch 02 adds thread stack support; with this patch applied, the perf script tool can use the option '-F,+callindent'.
Patch 03 adds a branch filter so that the perf tool displays only function calls and returns when call indentation or callchain related options are enabled.
Patch 04 synthesizes the callchain for instruction samples.
Patch 05 handles instruction samples synchronously with the thread stack, which fixes an error in callchain generation.
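
For readers unfamiliar with the thread stack idea, the following is a minimal, standalone sketch of the general technique and not the code from this series. All names (demo_thread_stack, demo_stack_event, demo_callchain) are hypothetical. It assumes that a call pushes the return address (branch source plus instruction size, which is presumably why the instruction size is needed), that a return pops back to the matching entry, and that a callchain is read from the stack top down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_STACK_MAX 64

/* Hypothetical, simplified stand-in for a per-thread call stack. */
struct demo_thread_stack {
	uint64_t ret_addr[DEMO_STACK_MAX];
	size_t depth;
};

enum demo_branch_type {
	DEMO_BRANCH_CALL,
	DEMO_BRANCH_RETURN,
	DEMO_BRANCH_OTHER,	/* ignored, like a branch filter would do */
};

/* Feed one branch into the stack: push on call, pop on return. */
static void demo_stack_event(struct demo_thread_stack *ts,
			     enum demo_branch_type type,
			     uint64_t from_ip, uint64_t to_ip,
			     size_t insn_size)
{
	assert(ts);

	if (type == DEMO_BRANCH_CALL) {
		/* The return address is the instruction after the call. */
		if (ts->depth < DEMO_STACK_MAX)
			ts->ret_addr[ts->depth++] = from_ip + insn_size;
	} else if (type == DEMO_BRANCH_RETURN) {
		/* Unwind to the entry whose return address is the target. */
		size_t i = ts->depth;

		while (i > 0) {
			if (ts->ret_addr[--i] == to_ip) {
				ts->depth = i;
				return;
			}
		}
		/* No match: leave the stack unchanged. */
	}
}

/* Build a callchain: sample address first, then callers, innermost first. */
static size_t demo_callchain(const struct demo_thread_stack *ts,
			     uint64_t sample_ip, uint64_t *chain, size_t max)
{
	size_t n = 0;

	assert(ts && chain);

	if (max == 0)
		return 0;

	chain[n++] = sample_ip;
	for (size_t i = ts->depth; i > 0 && n < max; i--)
		chain[n++] = ts->ret_addr[i - 1];

	return n;
}
```

With 4-byte instructions, feeding calls from 0x1000 and then 0x2010 into this sketch and sampling at 0x3004 gives the chain 0x3004, 0x2014, 0x1004. A later return to 0x2014 drops the stack depth back to 1.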
This patch set has been tested on 96boards Hikey620.
Test for option '-F,+callindent':
Before:
# perf script -F,+callindent
main 2808 1 branches: coresight_test1 ffff8634f5c8 coresight_test1+0x3c (/root/coresight_test/libcstest.so)
main 2808 1 branches: printf@plt aaaaba8d37ec main+0x28 (/root/coresight_test/main)
main 2808 1 branches: printf@plt aaaaba8d36bc printf@plt+0xc (/root/coresight_test/main)
main 2808 1 branches: _init aaaaba8d3650 _init+0x30 (/root/coresight_test/main)
main 2808 1 branches: _dl_fixup ffff86373b4c _dl_runtime_resolve+0x40 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: _dl_lookup_symbol_x ffff8636e078 _dl_fixup+0xb8 (/lib/aarch64-linux-gnu/ld-2.28.so)
[...]
After:
# perf script -F,+callindent
main 2808 1 branches: coresight_test1@plt aaaaba8d37d8 main+0x14 (/root/coresight_test/main)
main 2808 1 branches: _dl_fixup ffff86373b4c _dl_runtime_resolve+0x40 (/lib/aarch64-linux-gnu/ld-2.28.s
main 2808 1 branches: _dl_lookup_symbol_x ffff8636e078 _dl_fixup+0xb8 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: do_lookup_x ffff8636a49c _dl_lookup_symbol_x+0x104 (/lib/aarch64-linux-gnu/ld-2.28.
main 2808 1 branches: check_match ffff86369bf0 do_lookup_x+0x238 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: strcmp ffff86369888 check_match+0x70 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: printf@plt aaaaba8d37ec main+0x28 (/root/coresight_test/main)
main 2808 1 branches: _dl_fixup ffff86373b4c _dl_runtime_resolve+0x40 (/lib/aarch64-linux-gnu/ld-2.28.s
main 2808 1 branches: _dl_lookup_symbol_x ffff8636e078 _dl_fixup+0xb8 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: do_lookup_x ffff8636a49c _dl_lookup_symbol_x+0x104 (/lib/aarch64-linux-gnu/ld-2.28.
main 2808 1 branches: _dl_name_match_p ffff86369af0 do_lookup_x+0x138 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: strcmp ffff8636f7f0 _dl_name_match_p+0x18 (/lib/aarch64-linux-gnu/ld-2.28.so)
[...]
Test for option '--itrace=g':
Before:
# perf script --itrace=g16l64i100
main 1579 100 instructions: ffff0000102137f0 group_sched_in+0xb0 ([kernel.kallsyms])
main 1579 100 instructions: ffff000010213b78 flexible_sched_in+0xf0 ([kernel.kallsyms])
main 1579 100 instructions: ffff0000102135ac event_sched_in.isra.57+0x74 ([kernel.kallsyms])
main 1579 100 instructions: ffff000010219344 perf_swevent_add+0x6c ([kernel.kallsyms])
main 1579 100 instructions: ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
[...]
After:
# perf script --itrace=g16l64i100
main 1579 100 instructions:
        ffff000010213b78 flexible_sched_in+0xf0 ([kernel.kallsyms])
        ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

main 1579 100 instructions:
        ffff0000102135ac event_sched_in.isra.57+0x74 ([kernel.kallsyms])
        ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
        ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
        ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

main 1579 100 instructions:
        ffff000010219344 perf_swevent_add+0x6c ([kernel.kallsyms])
        ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
        ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
        ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
        ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

[...]
Changes from v1:
* Added comments for task thread handling (Mathieu).
* Split patch 02 into two patches: one for thread stack support and one for callchain support (Mathieu).
* Added a new patch to support branch filter.
Leo Yan (5):
  perf cs-etm: Refactor instruction size handling
  perf cs-etm: Support thread stack
  perf cs-etm: Support branch filter
  perf cs-etm: Support callchain for instruction sample
  perf cs-etm: Correct callchain for instruction sample
 tools/perf/util/cs-etm.c | 141 ++++++++++++++++++++++++++++++++-------
 1 file changed, 118 insertions(+), 23 deletions(-)