Changes in V5:
- In symbol-elf.c, branch to exit_close label if open file.
- In trace_event_python.c, correct indentation. set_sym_in_dict
call parameter "map_pgoff" renamed as "addr_map_pgoff" to
match local naming.
Changes in V4:
- In trace-event-python.c, fixed perf-tools-next merge problem.
Changes in V3:
- Rebased to linux-perf-tools branch.
- Squash symbol-elf.c and symbol.h into same commit.
- In map.c, merge dso__is_pie() call into existing if statement.
- In arm-cs-trace-disasm.py, remove debug artifacts.
Changes in V2:
- In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
checks per Leo Yan review.
- Updated mailing list distribution
Steve Clevenger (4):
Add dso__is_pie call to identify ELF PIE
Force MAPPING_TYPE__IDENTIY for PIE
Add map pgoff to python dictionary based on MAPPING_TYPE
Adjust objdump start/end range per map pgoff parameter
.../scripts/python/arm-cs-trace-disasm.py | 9 ++-
tools/perf/util/map.c | 4 +-
.../scripting-engines/trace-event-python.c | 13 +++-
tools/perf/util/symbol-elf.c | 61 +++++++++++++++++++
tools/perf/util/symbol.h | 1 +
5 files changed, 80 insertions(+), 8 deletions(-)
--
2.25.1
On 27/08/2024 5:44 pm, Leo Yan wrote:
> This commit dumps metadata with version 2. It uses two string arrays
> metadata_hdr_fmts and metadata_per_cpu_fmts as string formats for the
> header and per CPU data respectively, and the arm_spe_print_info()
> function is enhanced to support dumping metadata with the version 2
> format.
>
> After:
>
> 0 0 0x4a8 [0x170]: PERF_RECORD_AUXTRACE_INFO type: 4
> PMU Type :13
> Version :2
> Num of CPUs :8
> CPU # :0
> MIDR :0x410fd801
> Bound PMU Type :-1
> Min Interval :0
> Load Data Source :0
> CPU # :1
> MIDR :0x410fd801
> Bound PMU Type :-1
> Min Interval :0
> Load Data Source :0
> CPU # :2
> MIDR :0x410fd870
> Bound PMU Type :13
> Min Interval :1024
> Load Data Source :1
> CPU # :3
> MIDR :0x410fd870
> Bound PMU Type :13
> Min Interval :1024
> Load Data Source :1
> CPU # :4
> MIDR :0x410fd870
> Bound PMU Type :13
> Min Interval :1024
> Load Data Source :1
> CPU # :5
> MIDR :0x410fd870
> Bound PMU Type :13
> Min Interval :1024
> Load Data Source :1
> CPU # :6
> MIDR :0x410fd850
> Bound PMU Type :14
> Min Interval :1024
> Load Data Source :1
> CPU # :7
> MIDR :0x410fd850
> Bound PMU Type :14
> Min Interval :1024
> Load Data Source :1
>
> Signed-off-by: Leo Yan <leo.yan(a)arm.com>
> ---
> tools/perf/util/arm-spe.c | 43 ++++++++++++++++++++++++++++++++++-----
> 1 file changed, 38 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 87cf06db765b..be34d4c4306a 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -1067,16 +1067,49 @@ static bool arm_spe_evsel_is_auxtrace(struct perf_session *session __maybe_unuse
> return strstarts(evsel->name, ARM_SPE_PMU_NAME);
> }
>
> -static const char * const arm_spe_info_fmts[] = {
> - [ARM_SPE_PMU_TYPE] = " PMU Type %"PRId64"\n",
> +static const char * const metadata_hdr_fmts[] = {
> + [ARM_SPE_PMU_TYPE] = " PMU Type :%"PRId64"\n",
> + [ARM_SPE_HEADER_VERSION] = " Version :%"PRId64"\n",
> + [ARM_SPE_CPU_NUM] = " Num of CPUs :%"PRId64"\n",
> };
>
> -static void arm_spe_print_info(__u64 *arr)
> +static const char * const metadata_per_cpu_fmts[] = {
> + [ARM_SPE_CPU] = " CPU # :%"PRId64"\n",
> + [ARM_SPE_CPU_MIDR] = " MIDR :0x%"PRIx64"\n",
> + [ARM_SPE_CPU_PMU_TYPE] = " Bound PMU Type :%"PRId64"\n",
> + [ARM_SPE_CAP_MIN_IVAL] = " Min Interval :%"PRId64"\n",
> + [ARM_SPE_CAP_LDS] = " Load Data Source :%"PRId64"\n",
> +};
> +
> +static void arm_spe_print_info(struct arm_spe *spe, __u64 *arr)
> {
> + unsigned int i, cpu, header_size, cpu_num, per_cpu_size;
> +
> if (!dump_trace)
> return;
>
> - fprintf(stdout, arm_spe_info_fmts[ARM_SPE_PMU_TYPE], arr[ARM_SPE_PMU_TYPE]);
> + if (spe->metadata_ver == 1) {
> + cpu_num = 0;
> + header_size = ARM_SPE_AUXTRACE_V1_PRIV_MAX;
> + per_cpu_size = 0;
> + } else if (spe->metadata_ver == 2) {
Assuming future version updates are backwards compatible and only add
new info this should be spe->metadata_ver >= 2, otherwise version bumps
end up causing errors when files get passed around.
I know there are arguments about what should and shouldn't be supported
when opening new files on old perfs, but in this case it's easy to only
add new info to the aux header and leave the old stuff intact.
> + cpu_num = arr[ARM_SPE_CPU_NUM];
> + header_size = ARM_SPE_AUXTRACE_V2_PRIV_MAX;
> + per_cpu_size = ARM_SPE_AUXTRACE_V2_PRIV_PER_CPU_MAX;
I think for coresight we also save the size of each per-cpu block rather
than use a constant, that way new items can be appended without breaking
readers.
That kind of leads to another point that this mechanism is mostly
duplicated from coresight. It saves a main header version, then per-cpu
groups of variable size with named elements. I'm not saying we should
definitely try to share the code, but it's worth keeping in mind.
> + } else {
> + pr_err("Cannot support metadata ver: %ld\n", spe->metadata_ver);
> + return;
> + }
> +
> + for (i = 0; i < header_size; i++)
> + fprintf(stdout, metadata_hdr_fmts[i], arr[i]);
> +
> + arr += header_size;
> + for (cpu = 0; cpu < cpu_num; cpu++) {
> + for (i = 0; i < per_cpu_size; i++)
> + fprintf(stdout, metadata_per_cpu_fmts[i], arr[i]);
> + arr += per_cpu_size;
> + }
> }
>
> static void arm_spe_set_event_name(struct evlist *evlist, u64 id,
> @@ -1383,7 +1416,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
> spe->auxtrace.evsel_is_auxtrace = arm_spe_evsel_is_auxtrace;
> session->auxtrace = &spe->auxtrace;
>
> - arm_spe_print_info(&auxtrace_info->priv[0]);
> + arm_spe_print_info(spe, &auxtrace_info->priv[0]);
>
> if (dump_trace)
> return 0;
Changes in V4:
- In trace-event-python.c, fixed perf-tools-next merge problem.
Changes in V3:
- Rebased to linux-perf-tools branch.
- Squash symbol-elf.c and symbol.h into same commit.
- In map.c, merge dso__is_pie() call into existing if statement.
- In arm-cs-trace-disasm.py, remove debug artifacts.
Changes in V2:
- In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
checks per Leo Yan review.
- Updated mailing list distribution
Steve Clevenger (4):
Add dso__is_pie call to identify ELF PIE
Force MAPPING_TYPE__IDENTIY for PIE
Add map pgoff to python dictionary based on MAPPING_TYPE
Adjust objdump start/end range per map pgoff parameter
.../scripts/python/arm-cs-trace-disasm.py | 9 ++-
tools/perf/util/map.c | 4 +-
.../scripting-engines/trace-event-python.c | 13 +++-
tools/perf/util/symbol-elf.c | 60 +++++++++++++++++++
tools/perf/util/symbol.h | 1 +
5 files changed, 79 insertions(+), 8 deletions(-)
--
2.25.1
Changes in V3:
- Rebased to linux-perf-tools branch.
- Squash symbol-elf.c and symbol.h into same commit.
- In map.c, merge dso__is_pie() call into existing if statement.
- In arm-cs-trace-disasm.py, remove debug artifacts.
Changes in V2:
- In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
checks per Leo Yan review.
- Updated mailing list distribution
Fedora 37 distributed shared binary and executable mapped files show a
zero text section offset. Starting with the Fedora 38 distribution, the
shared binary and executable mapped files show a non-zero text section
offset for some binaries. The text offset parameter is never passed into
the arm-cs-trace-disasm.py script to allow the script to adjust the
start/end address range passed to objdump. This adjustment is required
to correctly offset into the dso text section. Not doing so results in
an incorrect user instruction trace display for Fedora 38 (and later)
user trace output.
Steve Clevenger (4):
Add dso__is_pie call to identify ELF PIE
Force MAPPING_TYPE__IDENTIY for PIE
Add map pgoff to python dictionary based on MAPPING_TYPE
Adjust objdump start/end range per map pgoff parameter
.../scripts/python/arm-cs-trace-disasm.py | 9 ++-
tools/perf/util/map.c | 4 +-
.../scripting-engines/trace-event-python.c | 13 +++-
tools/perf/util/symbol-elf.c | 60 +++++++++++++++++++
tools/perf/util/symbol.h | 1 +
5 files changed, 79 insertions(+), 8 deletions(-)
--
2.25.1
On 8/27/2024 1:23 PM, Leo Yan wrote:
> On 8/26/2024 10:35 PM, Steve Clevenger wrote:
>>
>> Changes in V2:
>> - Updated mailing list distribution
>>
>> Extract map_pgoff parameter from the dictionary, and adjust start/end
>> range passed to objdump based on the value.
>>
>> The start_addr/stop_addr address checks are changed to print a warning
>> only if verbose == True. This script repeatedly sees a zero value passed
>> in for
>> start_addr = cpu_data[str(cpu) + 'addr']
>>
>> These zero values are not a new problem. The start_addr/stop_addr warning
>> clutters the instruction trace output, hence this change.
>>
>> Signed-off-by: Steve Clevenger <scclevenger(a)os.amperecomputing.com>
>> ---
>> .../perf/scripts/python/arm-cs-trace-disasm.py | 17 +++++++++++++++--
>> 1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> index d973c2baed1c..6bf806078f9a 100755
>> --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> @@ -187,6 +187,7 @@ def process_event(param_dict):
>> dso_start = get_optional(param_dict, "dso_map_start")
>> dso_end = get_optional(param_dict, "dso_map_end")
>> symbol = get_optional(param_dict, "symbol")
>> + map_pgoff = get_optional(param_dict, "map_pgoff")
>>
>> cpu = sample["cpu"]
>> ip = sample["ip"]
>> @@ -250,13 +251,25 @@ def process_event(param_dict):
>> return
>>
>> if (start_addr < int(dso_start) or start_addr > int(dso_end)):
>> - print("Start address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (start_addr, int(dso_start), int(dso_end), dso))
>> + if (options.verbose == True):
>> + print("Start address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (start_addr, int(dso_start), int(dso_end), dso))
>> return
>>
>> if (stop_addr < int(dso_start) or stop_addr > int(dso_end)):
>> - print("Stop address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (stop_addr, int(dso_start), int(dso_end), dso))
>> + if (options.verbose == True):
>> + print("Stop address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (stop_addr, int(dso_start), int(dso_end), dso))
>> return
>>
>> + if map_pgoff != None and map_pgoff != '[unknown]':
>> + if (dso == "[kernel.kallsyms]"):
>> + dso_vm_start = 0
>> + map_pgoff = '[unknown]'
>> + else:
>> + dso_vm_start = int(dso_start)
>> + start_addr += map_pgoff
>> + stop_addr += map_pgoff
>> + map_pgoff = '[unknown]'
>
> Every sample has its own field `map_pgoff`. So I am confused why we need the
> sentence: map_pgoff = '[unknown]'. And you can see with above change, we will
> set dso_vm_start delicately with below code section.
>
> How about a simple change like below?
>
> @@ -267,7 +268,7 @@ def process_event(param_dict):
>
> dso_fname = get_dso_file_path(dso, dso_bid)
> if path.exists(dso_fname):
> - print_disam(dso_fname, dso_vm_start, start_addr, stop_addr)
> + print_disam(dso_fname, dso_vm_start, start_addr + map_pgoff, stop_addr +
> map_pgoff)
>
> If this change does not work, I am curious if anything we missed to handle in
> the perf's C code (frontend).
>
> Thanks,
> Leo
Leo,
Sorry, the map_pgoff = '[unknown]' instances are actually leftover debug
artifacts used for print type debug. I resorted to print debug after I
didn't see success attempting mixed-mode debug with Visual Studio Code
on AARCH64. If you have a known working launch configuration, please
share it. See V3 of the patch series. I've rebased to the
perf-tools-next branch.
Steve
>
>> if (options.objdump_name != None):
>> # It doesn't need to decrease virtual memory offset for disassembly
>> # for kernel dso and executable file dso, so in this case we set
>> --
>> 2.25.1
>>
On 8/27/2024 12:55 PM, Leo Yan wrote:
> On 8/26/2024 10:35 PM, Steve Clevenger wrote:
>>
>> Changes in V2:
>> - Updated mailing list distribution
>>
>> Use dso__is_pie() to check whether the DSO file is a Position
>> Independent Executable (PIE). If PIE, change the MAPPING_TYPE to
>> MAPPING_TYPE__IDENTITY so a zero map pgoff (text offset) is passed
>> into the script.
>>
>>
>> Signed-off-by: Steve Clevenger <scclevenger(a)os.amperecomputing.com>
>> ---
>> tools/perf/util/map.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
>> index e1d14936a60d..df7c06fc373e 100644
>> --- a/tools/perf/util/map.c
>> +++ b/tools/perf/util/map.c
>> @@ -171,8 +171,11 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>> assert(!dso__kernel(dso));
>> map__init(result, start, start + len, pgoff, dso);
>>
>> + if (map->pgoff && !no_dso)
>> + no_dso = dso__is_pie(dso); // PIE check
>
> Here the logic is whatever the `pgoff` is (zero or non-zero), we need to
> always set the mapping type as MAPPING_TYPE__IDENTITY. It is no need to check
> pgoff and is no matter with 'no_dso'. So we can simply check dso__is_pie() and
> set IDENTITY flag for the true case.
>
> if (dso__is_pie(dso)) {
> map__set_mapping_type(map, MAPPING_TYPE__IDENTITY);
> } else if (anon || no_dso) {
> ...
> }
>
> BTW, please rebase the series on the top of the perf-tools-next branch [1], I
> saw this patch has merging conflict on it.
>
> Thanks,
> Leo
Hi Leo,
I combined the dso__is_pie() call onto the end of the existing 'if'
statement you now show as an 'else if'. Please see V3 of the patch series.
Steve
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
> branch: perf-tools-next
>> +
>> if (anon || no_dso) {
>> - map->mapping_type = MAPPING_TYPE__IDENTITY;
>> + map__set_mapping_type(map, MAPPING_TYPE__IDENTITY);
>>
>> /*
>> * Set memory without DSO as loaded. All map__find_*
>> --
>> 2.25.1
>>
Changes in V2:
- Updated mailing list distribution
Fedora 37 distributed shared binary and executable mapped files show a
zero text section offset. Starting with the Fedora 38 distribution, the
shared binary and executable mapped files show a non-zero text section
offset for some binaries. The text offset parameter is never passed into
the arm-cs-trace-disasm.py script to allow the script to adjust the
start/end address range passed to objdump. This adjustment is required
to correctly offset into the dso text section. Not doing so results in
an incorrect user instruction trace display for Fedora 38 (and later)
user trace output.
Steve Clevenger (5):
Add dso__is_pie call to identify ELF PIE
Add dso__is_pie prototype
Force MAPPING_TYPE__IDENTIY for PIE
Add map pgoff to python dictionary based on MAPPING_TYPE
Adjust objdump start/end range per map pgoff parameter
.../scripts/python/arm-cs-trace-disasm.py | 17 +++++-
tools/perf/util/map.c | 5 +-
.../scripting-engines/trace-event-python.c | 13 +++-
tools/perf/util/symbol-elf.c | 60 +++++++++++++++++++
tools/perf/util/symbol.h | 1 +
5 files changed, 90 insertions(+), 6 deletions(-)
--
2.25.1