A set of changes that came out of the issues reported here [1].
* First 2 patches fix a decode bug in Perf and add support for new consistency checks in OpenCSD * The remaining ones make the disassembly script easier to test and use. This also involves adding a new Python binding to Perf to get a config value (perf_config_get())
[1]: https://lore.kernel.org/linux-arm-kernel/20240719092619.274730-1-gankulkarni...
Changes since V1: * Keep the flush function for discontinuities * Still remove the flush when the buffer fills, but now add cs_etm__end_block() for the end trace. That way we won't drop the last branch stack if the instruction sample period wasn't hit at the very end.
James Clark (7): perf cs-etm: Don't flush when packet_queue fills up perf cs-etm: Use new OpenCSD consistency checks perf scripting python: Add function to get a config value perf scripts python cs-etm: Update to use argparse perf scripts python cs-etm: Improve arguments perf scripts python cs-etm: Add start and stop arguments perf test: cs-etm: Test Coresight disassembly script
.../perf/Documentation/perf-script-python.txt | 2 +- .../scripts/python/Perf-Trace-Util/Context.c | 11 ++ .../scripts/python/arm-cs-trace-disasm.py | 109 +++++++++++++----- .../tests/shell/test_arm_coresight_disasm.sh | 63 ++++++++++ tools/perf/util/config.c | 22 ++++ tools/perf/util/config.h | 1 + .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 +- tools/perf/util/cs-etm.c | 25 ++-- 8 files changed, 205 insertions(+), 35 deletions(-) create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
cs_etm__flush(), like cs_etm__sample() is an operation that generates a sample and then swaps the current with the previous packet. Calling flush after processing the queues results in two swaps which corrupts the next sample. Therefore it wasn't appropriate to call flush here so remove it.
Flushing is still done on a discontinuity to explicitly clear the last branch buffer, but when the packet_queue fills up before reaching a timestamp, that's not a discontinuity and the call to cs_etm__process_traceid_queue() already generated samples and drained the buffers correctly.
This is visible by looking for a branch that has the same target as the previous branch and the following source is before the address of the last target, which is impossible as execution would have had to have gone backwards:
ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94 (packet_queue fills here before a timestamp, resulting in a flush and branch target ffff80008011cadc is duplicated.) ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff80008011cadc update_sg_lb_stats+0x94 ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
After removing the flush the correct branch target is used for the second sample, and ffff8000801117c4 is no longer before the previous address:
ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94 ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff8000801117a0 cpu_util+0x0 ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
Make sure that a final branch stack is output at the end of the trace by calling cs_etm__end_block(). This is already done for both the timeless decode paths.
Fixes: 21fe8dc1191a ("perf cs-etm: Add support for CPU-wide trace scenarios") Reported-by: Ganapatrao Kulkarni gankulkarni@os.amperecomputing.com Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecom... Signed-off-by: James Clark james.clark@linaro.org --- tools/perf/util/cs-etm.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 90f32f327b9b..242788ac9625 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -2490,12 +2490,6 @@ static void cs_etm__clear_all_traceid_queues(struct cs_etm_queue *etmq)
/* Ignore return value */ cs_etm__process_traceid_queue(etmq, tidq); - - /* - * Generate an instruction sample with the remaining - * branchstack entries. - */ - cs_etm__flush(etmq, tidq); } }
@@ -2638,7 +2632,7 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
while (1) { if (!etm->heap.heap_cnt) - goto out; + break;
/* Take the entry at the top of the min heap */ cs_queue_nr = etm->heap.heap_array[0].queue_nr; @@ -2721,6 +2715,23 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm) ret = auxtrace_heap__add(&etm->heap, cs_queue_nr, cs_timestamp); }
+ for (i = 0; i < etm->queues.nr_queues; i++) { + struct int_node *inode; + + etmq = etm->queues.queue_array[i].priv; + if (!etmq) + continue; + + intlist__for_each_entry(inode, etmq->traceid_queues_list) { + int idx = (int)(intptr_t)inode->priv; + + /* Flush any remaining branch stack entries */ + tidq = etmq->traceid_queues[idx]; + ret = cs_etm__end_block(etmq, tidq); + if (ret) + return ret; + } + } out: return ret; }
Previously when the incorrect binary was used for decode, Perf would silently continue to generate incorrect samples. With OpenCSD 1.5.4 we can enable consistency checks that do a best effort to detect a mismatch in the image. When one is detected a warning is printed and sample generation stops until the trace resynchronizes with a good part of the image.
Reported-by: Ganapatrao Kulkarni gankulkarni@os.amperecomputing.com Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecom... Signed-off-by: James Clark james.clark@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index b78ef0262135..b85a8837bddc 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -685,9 +685,14 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params, }
if (d_params->operation == CS_ETM_OPERATION_DECODE) { + int decode_flags = OCSD_CREATE_FLG_FULL_DECODER; +#ifdef OCSD_OPFLG_N_UNCOND_DIR_BR_CHK + decode_flags |= OCSD_OPFLG_N_UNCOND_DIR_BR_CHK | OCSD_OPFLG_CHK_RANGE_CONTINUE | + ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK; +#endif if (ocsd_dt_create_decoder(decoder->dcd_tree, decoder->decoder_name, - OCSD_CREATE_FLG_FULL_DECODER, + decode_flags, trace_config, &csid)) return -1;
This can be used to get config values like which objdump Perf uses for disassembly.
Signed-off-by: James Clark james.clark@linaro.org --- .../perf/Documentation/perf-script-python.txt | 2 +- .../scripts/python/Perf-Trace-Util/Context.c | 11 ++++++++++ tools/perf/util/config.c | 22 +++++++++++++++++++ tools/perf/util/config.h | 1 + 4 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Documentation/perf-script-python.txt b/tools/perf/Documentation/perf-script-python.txt index 13e37e9385ee..27a1cac6fe76 100644 --- a/tools/perf/Documentation/perf-script-python.txt +++ b/tools/perf/Documentation/perf-script-python.txt @@ -624,7 +624,7 @@ as perf_trace_context.perf_script_context . perf_set_itrace_options(context, itrace_options) - set --itrace options if they have not been set already perf_sample_srcline(context) - returns source_file_name, line_number perf_sample_srccode(context) - returns source_file_name, line_number, source_line - + perf_config_get(config_name) - returns the value of the named config item, or None if unset
Util.py Module ~~~~~~~~~~~~~~ diff --git a/tools/perf/scripts/python/Perf-Trace-Util/Context.c b/tools/perf/scripts/python/Perf-Trace-Util/Context.c index 3954bd1587ce..01f54d6724a5 100644 --- a/tools/perf/scripts/python/Perf-Trace-Util/Context.c +++ b/tools/perf/scripts/python/Perf-Trace-Util/Context.c @@ -12,6 +12,7 @@ #define PY_SSIZE_T_CLEAN
#include <Python.h> +#include "../../../util/config.h" #include "../../../util/trace-event.h" #include "../../../util/event.h" #include "../../../util/symbol.h" @@ -182,6 +183,15 @@ static PyObject *perf_sample_srccode(PyObject *obj, PyObject *args) return perf_sample_src(obj, args, true); }
+static PyObject *__perf_config_get(PyObject *obj, PyObject *args) +{ + const char *config_name; + + if (!PyArg_ParseTuple(args, "s", &config_name)) + return NULL; + return Py_BuildValue("s", perf_config_get(config_name)); +} + static PyMethodDef ContextMethods[] = { #ifdef HAVE_LIBTRACEEVENT { "common_pc", perf_trace_context_common_pc, METH_VARARGS, @@ -199,6 +209,7 @@ static PyMethodDef ContextMethods[] = { METH_VARARGS, "Get source file name and line number."}, { "perf_sample_srccode", perf_sample_srccode, METH_VARARGS, "Get source file name, line number and line."}, + { "perf_config_get", __perf_config_get, METH_VARARGS, "Get perf config entry"}, { NULL, NULL, 0, NULL} };
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c index 7a650de0db83..68f9407ca74b 100644 --- a/tools/perf/util/config.c +++ b/tools/perf/util/config.c @@ -912,6 +912,7 @@ void set_buildid_dir(const char *dir) struct perf_config_scan_data { const char *name; const char *fmt; + const char *value; va_list args; int ret; }; @@ -939,3 +940,24 @@ int perf_config_scan(const char *name, const char *fmt, ...)
return d.ret; } + +static int perf_config_get_cb(const char *var, const char *value, void *data) +{ + struct perf_config_scan_data *d = data; + + if (!strcmp(var, d->name)) + d->value = value; + + return 0; +} + +const char *perf_config_get(const char *name) +{ + struct perf_config_scan_data d = { + .name = name, + .value = NULL, + }; + + perf_config(perf_config_get_cb, &d); + return d.value; +} diff --git a/tools/perf/util/config.h b/tools/perf/util/config.h index 2e5e808928a5..9971313d61c1 100644 --- a/tools/perf/util/config.h +++ b/tools/perf/util/config.h @@ -30,6 +30,7 @@ typedef int (*config_fn_t)(const char *, const char *, void *); int perf_default_config(const char *, const char *, void *); int perf_config(config_fn_t fn, void *); int perf_config_scan(const char *name, const char *fmt, ...) __scanf(2, 3); +const char *perf_config_get(const char *name); int perf_config_set(struct perf_config_set *set, config_fn_t fn, void *data); int perf_config_int(int *dest, const char *, const char *);
optparse is deprecated and less flexible than argparse so update it.
Signed-off-by: James Clark james.clark@linaro.org --- .../scripts/python/arm-cs-trace-disasm.py | 28 +++++++------------ 1 file changed, 10 insertions(+), 18 deletions(-)
diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py index 7aff02d84ffb..45f682a8b34d 100755 --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py @@ -11,7 +11,7 @@ import os from os import path import re from subprocess import * -from optparse import OptionParser, make_option +import argparse
from perf_trace_context import perf_set_itrace_options, \ perf_sample_insn, perf_sample_srccode @@ -28,19 +28,11 @@ from perf_trace_context import perf_set_itrace_options, \ # perf script -s scripts/python/arm-cs-trace-disasm.py
# Command line parsing. -option_list = [ - # formatting options for the bottom entry of the stack - make_option("-k", "--vmlinux", dest="vmlinux_name", - help="Set path to vmlinux file"), - make_option("-d", "--objdump", dest="objdump_name", - help="Set path to objdump executable file"), - make_option("-v", "--verbose", dest="verbose", - action="store_true", default=False, - help="Enable debugging log") -] - -parser = OptionParser(option_list=option_list) -(options, args) = parser.parse_args() +args = argparse.ArgumentParser() +args.add_argument("-k", "--vmlinux", help="Set path to vmlinux file") +args.add_argument("-d", "--objdump", help="Set path to objdump executable file"), +args.add_argument("-v", "--verbose", action="store_true", help="Enable debugging log") +options = args.parse_args()
# Initialize global dicts and regular expression disasm_cache = dict() @@ -65,8 +57,8 @@ def get_offset(perf_dict, field):
def get_dso_file_path(dso_name, dso_build_id): if (dso_name == "[kernel.kallsyms]" or dso_name == "vmlinux"): - if (options.vmlinux_name): - return options.vmlinux_name; + if (options.vmlinux): + return options.vmlinux; else: return dso_name
@@ -92,7 +84,7 @@ def read_disam(dso_fname, dso_start, start_addr, stop_addr): else: start_addr = start_addr - dso_start; stop_addr = stop_addr - dso_start; - disasm = [ options.objdump_name, "-d", "-z", + disasm = [ options.objdump, "-d", "-z", "--start-address="+format(start_addr,"#x"), "--stop-address="+format(stop_addr,"#x") ] disasm += [ dso_fname ] @@ -256,7 +248,7 @@ def process_event(param_dict): print("Stop address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (stop_addr, int(dso_start), int(dso_end), dso)) return
- if (options.objdump_name != None): + if (options.objdump != None): # It doesn't need to decrease virtual memory offset for disassembly # for kernel dso and executable file dso, so in this case we set # vm_start to zero.
Make vmlinux detection automatic and use Perf's default objdump when -d is specified. This will make it easier for a test to use the script without having to provide arguments. And similarly for users.
Signed-off-by: James Clark james.clark@linaro.org --- .../scripts/python/arm-cs-trace-disasm.py | 63 ++++++++++++++++--- 1 file changed, 55 insertions(+), 8 deletions(-)
diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py index 45f682a8b34d..02e957d037ea 100755 --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py @@ -12,25 +12,48 @@ from os import path import re from subprocess import * import argparse +import platform
-from perf_trace_context import perf_set_itrace_options, \ - perf_sample_insn, perf_sample_srccode +from perf_trace_context import perf_sample_srccode, perf_config_get
# Below are some example commands for using this script. +# Note a --kcore recording is required for accurate decode +# due to the alternatives patching mechanism. However this +# script only supports reading vmlinux for disassembly dump, +# meaning that any patched instructions will appear +# as unpatched, but the instruction ranges themselves will +# be correct. In addition to this, source line info comes +# from Perf, and when using kcore there is no debug info. The +# following lists the supported features in each mode: +# +# +-----------+-----------------+------------------+------------------+ +# | Recording | Accurate decode | Source line dump | Disassembly dump | +# +-----------+-----------------+------------------+------------------+ +# | --kcore | yes | no | yes | +# | normal | no | yes | yes | +# +-----------+-----------------+------------------+------------------+ +# +# Output disassembly with objdump and auto detect vmlinux +# (when running on same machine.) +# perf script -s scripts/python/arm-cs-trace-disasm.py -d # -# Output disassembly with objdump: -# perf script -s scripts/python/arm-cs-trace-disasm.py \ -# -- -d objdump -k path/to/vmlinux # Output disassembly with llvm-objdump: # perf script -s scripts/python/arm-cs-trace-disasm.py \ # -- -d llvm-objdump-11 -k path/to/vmlinux +# # Output only source line and symbols: # perf script -s scripts/python/arm-cs-trace-disasm.py
+def default_objdump(): + config = perf_config_get("annotate.objdump") + return config if config else "objdump" + # Command line parsing. args = argparse.ArgumentParser() -args.add_argument("-k", "--vmlinux", help="Set path to vmlinux file") -args.add_argument("-d", "--objdump", help="Set path to objdump executable file"), +args.add_argument("-k", "--vmlinux", + help="Set path to vmlinux file. Omit to autodetect if running on same machine") +args.add_argument("-d", "--objdump", nargs="?", const=default_objdump(), + help="Show disassembly. Can also be used to change the objdump path"), args.add_argument("-v", "--verbose", action="store_true", help="Enable debugging log") options = args.parse_args()
@@ -45,6 +68,17 @@ glb_source_file_name = None glb_line_number = None glb_dso = None
+kver = platform.release() +vmlinux_paths = [ + f"/usr/lib/debug/boot/vmlinux-{kver}.debug", + f"/usr/lib/debug/lib/modules/{kver}/vmlinux", + f"/lib/modules/{kver}/build/vmlinux", + f"/usr/lib/debug/boot/vmlinux-{kver}", + f"/boot/vmlinux-{kver}", + f"/boot/vmlinux", + f"vmlinux" +] + def get_optional(perf_dict, field): if field in perf_dict: return perf_dict[field] @@ -55,12 +89,25 @@ def get_offset(perf_dict, field): return "+%#x" % perf_dict[field] return ""
+def find_vmlinux(): + if hasattr(find_vmlinux, "path"): + return find_vmlinux.path + + for v in vmlinux_paths: + if os.access(v, os.R_OK): + find_vmlinux.path = v + break + else: + find_vmlinux.path = None + + return find_vmlinux.path + def get_dso_file_path(dso_name, dso_build_id): if (dso_name == "[kernel.kallsyms]" or dso_name == "vmlinux"): if (options.vmlinux): return options.vmlinux; else: - return dso_name + return find_vmlinux() if find_vmlinux() else dso_name
if (dso_name == "[vdso]") : append = "/vdso"
Make it possible to only disassemble a range of timestamps or sample indexes. This will be used by the test to limit the runtime, but it's also useful for users.
Signed-off-by: James Clark james.clark@linaro.org --- .../scripts/python/arm-cs-trace-disasm.py | 22 +++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py index 02e957d037ea..a097995d8e7b 100755 --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py @@ -55,6 +55,11 @@ args.add_argument("-k", "--vmlinux", args.add_argument("-d", "--objdump", nargs="?", const=default_objdump(), help="Show disassembly. Can also be used to change the objdump path"), args.add_argument("-v", "--verbose", action="store_true", help="Enable debugging log") +args.add_argument("--start-time", type=int, help="Time of sample to start from") +args.add_argument("--stop-time", type=int, help="Time of sample to stop at") +args.add_argument("--start-sample", type=int, help="Index of sample to start from") +args.add_argument("--stop-sample", type=int, help="Index of sample to stop at") + options = args.parse_args()
# Initialize global dicts and regular expression @@ -63,6 +68,7 @@ cpu_data = dict() disasm_re = re.compile(r"^\s*([0-9a-fA-F]+):") disasm_func_re = re.compile(r"^\s*([0-9a-fA-F]+)\s.*:") cache_size = 64*1024 +sample_idx = -1
glb_source_file_name = None glb_line_number = None @@ -151,10 +157,10 @@ def print_disam(dso_fname, dso_start, start_addr, stop_addr):
def print_sample(sample): print("Sample = { cpu: %04d addr: 0x%016x phys_addr: 0x%016x ip: 0x%016x " \ - "pid: %d tid: %d period: %d time: %d }" % \ + "pid: %d tid: %d period: %d time: %d index: %d}" % \ (sample['cpu'], sample['addr'], sample['phys_addr'], \ sample['ip'], sample['pid'], sample['tid'], \ - sample['period'], sample['time'])) + sample['period'], sample['time'], sample_idx))
def trace_begin(): print('ARM CoreSight Trace Data Assembler Dump') @@ -216,6 +222,7 @@ def print_srccode(comm, param_dict, sample, symbol, dso): def process_event(param_dict): global cache_size global options + global sample_idx
sample = param_dict["sample"] comm = param_dict["comm"] @@ -231,6 +238,17 @@ def process_event(param_dict): ip = sample["ip"] addr = sample["addr"]
+ sample_idx += 1 + + if (options.start_time and sample["time"] < options.start_time): + return + if (options.stop_time and sample["time"] > options.stop_time): + exit(0) + if (options.start_sample and sample_idx < options.start_sample): + return + if (options.stop_sample and sample_idx > options.stop_sample): + exit(0) + if (options.verbose == True): print("Event type: %s" % name) print_sample(sample)
Run a few samples through the disassembly script and check to see that at least one branch instruction is printed.
Signed-off-by: James Clark james.clark@linaro.org --- .../tests/shell/test_arm_coresight_disasm.sh | 63 +++++++++++++++++++ 1 file changed, 63 insertions(+) create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
diff --git a/tools/perf/tests/shell/test_arm_coresight_disasm.sh b/tools/perf/tests/shell/test_arm_coresight_disasm.sh new file mode 100755 index 000000000000..6d004bf29f80 --- /dev/null +++ b/tools/perf/tests/shell/test_arm_coresight_disasm.sh @@ -0,0 +1,63 @@ +#!/bin/sh +# Check Arm CoreSight disassembly script completes without errors +# SPDX-License-Identifier: GPL-2.0 + +# The disassembly script reconstructs ranges of instructions and gives these to objdump to +# decode. objdump doesn't like ranges that go backwards, but these are a good indication +# that decoding has gone wrong either in OpenCSD, Perf or in the range reconstruction in +# the script. Test all 3 parts are working correctly by running the script. + +skip_if_no_cs_etm_event() { + perf list | grep -q 'cs_etm//' && return 0 + + # cs_etm event doesn't exist + return 2 +} + +skip_if_no_cs_etm_event || exit 2 + +# Assume an error unless we reach the very end +set -e +glb_err=1 + +perfdata_dir=$(mktemp -d /tmp/__perf_test.perf.data.XXXXX) +perfdata=${perfdata_dir}/perf.data +file=$(mktemp /tmp/temporary_file.XXXXX) + +cleanup_files() +{ + set +e + rm -rf ${perfdata_dir} + rm -f ${file} + trap - EXIT TERM INT + exit $glb_err +} + +trap cleanup_files EXIT TERM INT + +# Ranges start and end on branches, so check for some likely branch instructions +sep="\s|\s" +branch_search="\sbl${sep}b${sep}b.ne${sep}b.eq${sep}cbz\s" + +## Test kernel ## +if [ -e /proc/kcore ]; then + echo "Testing kernel disassembly" + perf record -o ${perfdata} -e cs_etm//k --kcore -- touch $file > /dev/null 2>&1 + perf script -i ${perfdata} -s python:tools/perf/scripts/python/arm-cs-trace-disasm.py -- \ + -d --stop-sample=30 2> /dev/null > ${file} + grep -q -e ${branch_search} ${file} + echo "Found kernel branches" +else + # kcore is required for correct kernel decode due to runtime code patching + echo "No kcore, skipping kernel test" +fi + +## Test user ## +echo "Testing userspace disassembly" +perf record -o ${perfdata} -e cs_etm//u -- touch $file > /dev/null 2>&1 +perf script -i ${perfdata} -s python:tools/perf/scripts/python/arm-cs-trace-disasm.py -- \ + -d --stop-sample=30 2> /dev/null > ${file} +grep -q -e ${branch_search} ${file} +echo "Found userspace branches" + +glb_err=0
On Thu, Sep 12, 2024 at 04:11:31PM +0100, James Clark wrote:
A set of changes that came out of the issues reported here [1].
- First 2 patches fix a decode bug in Perf and add support for new consistency checks in OpenCSD
- The remaining ones make the disassembly script easier to test and use. This also involves adding a new Python binding to Perf to get a config value (perf_config_get())
Looks ok from a quick look, but I can't test it, so since there are reporters for problems that are being fixed, it would be great to have a Tested-by: from the reporters and probably from someone with access to the hardware where this can be tested.
- Arnaldo
Changes since V1:
- Keep the flush function for discontinuities
- Still remove the flush when the buffer fills, but now add cs_etm__end_block() for the end trace. That way we won't drop the last branch stack if the instruction sample period wasn't hit at the very end.
James Clark (7): perf cs-etm: Don't flush when packet_queue fills up perf cs-etm: Use new OpenCSD consistency checks perf scripting python: Add function to get a config value perf scripts python cs-etm: Update to use argparse perf scripts python cs-etm: Improve arguments perf scripts python cs-etm: Add start and stop arguments perf test: cs-etm: Test Coresight disassembly script
.../perf/Documentation/perf-script-python.txt | 2 +- .../scripts/python/Perf-Trace-Util/Context.c | 11 ++ .../scripts/python/arm-cs-trace-disasm.py | 109 +++++++++++++----- .../tests/shell/test_arm_coresight_disasm.sh | 63 ++++++++++ tools/perf/util/config.c | 22 ++++ tools/perf/util/config.h | 1 + .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 +- tools/perf/util/cs-etm.c | 25 ++-- 8 files changed, 205 insertions(+), 35 deletions(-) create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
-- 2.34.1
On 12/09/2024 20:23, Arnaldo Carvalho de Melo wrote:
On Thu, Sep 12, 2024 at 04:11:31PM +0100, James Clark wrote:
A set of changes that came out of the issues reported here [1].
- First 2 patches fix a decode bug in Perf and add support for new consistency checks in OpenCSD
- The remaining ones make the disassembly script easier to test and use. This also involves adding a new Python binding to Perf to get a config value (perf_config_get())
Looks ok from a quick look, but I can't test it, so since there are reporters for problems that are being fixed, it would be great to have a Tested-by: from the reporters and probably from someone with access to the hardware where this can be tested.
- Arnaldo
I pinged Ganapat, hopefully he has some time to test it. But we can wait a bit and maybe find someone else if not.