A set of changes that came out of the issues reported here [1].
* First 2 patches fix a decode bug in Perf and add support for new
consistency checks in OpenCSD
* The remaining ones make the disassembly script easier to test
and use. This also involves adding a new Python binding to
Perf to get a config value (perf_config_get())
[1]: https://lore.kernel.org/linux-arm-kernel/20240719092619.274730-1-gankulkarn…
Changes since V2:
* Check validity of start stop arguments
* Make test work if Perf was installed
* Document that start and stop time are monotonic clock values
Changes since V1:
* Keep the flush function for discontinuities
* Still remove the flush when the buffer fills, but now add
cs_etm__end_block() for the end trace. That way we won't drop
the last branch stack if the instruction sample period wasn't
hit at the very end.
James Clark (7):
perf cs-etm: Don't flush when packet_queue fills up
perf cs-etm: Use new OpenCSD consistency checks
perf scripting python: Add function to get a config value
perf scripts python cs-etm: Update to use argparse
perf scripts python cs-etm: Improve arguments
perf scripts python cs-etm: Add start and stop arguments
perf test: cs-etm: Test Coresight disassembly script
.../perf/Documentation/perf-script-python.txt | 2 +-
.../scripts/python/Perf-Trace-Util/Context.c | 11 ++
.../scripts/python/arm-cs-trace-disasm.py | 127 ++++++++++++++----
.../tests/shell/test_arm_coresight_disasm.sh | 65 +++++++++
tools/perf/util/config.c | 22 +++
tools/perf/util/config.h | 1 +
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 +-
tools/perf/util/cs-etm.c | 25 +++-
8 files changed, 225 insertions(+), 35 deletions(-)
create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
--
2.34.1
On 13/09/2024 14:35, Leo Yan wrote:
>
>
> On 9/12/24 16:11, James Clark wrote:
>>
>> Run a few samples through the disassembly script and check to see that
>> at least one branch instruction is printed.
>>
>> Signed-off-by: James Clark <james.clark(a)linaro.org>
>> ---
>> .../tests/shell/test_arm_coresight_disasm.sh | 63 +++++++++++++++++++
>> 1 file changed, 63 insertions(+)
>> create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
>>
>> diff --git a/tools/perf/tests/shell/test_arm_coresight_disasm.sh
>> b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
>> new file mode 100755
>> index 000000000000..6d004bf29f80
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
>> @@ -0,0 +1,63 @@
>> +#!/bin/sh
>> +# Check Arm CoreSight disassembly script completes without errors
>> +# SPDX-License-Identifier: GPL-2.0
>> +
>> +# The disassembly script reconstructs ranges of instructions and
>> gives these to objdump to
>> +# decode. objdump doesn't like ranges that go backwards, but these
>> are a good indication
>> +# that decoding has gone wrong either in OpenCSD, Perf or in the
>> range reconstruction in
>> +# the script. Test all 3 parts are working correctly by running the
>> script.
>> +
>> +skip_if_no_cs_etm_event() {
>> + perf list | grep -q 'cs_etm//' && return 0
>> +
>> + # cs_etm event doesn't exist
>> + return 2
>> +}
>> +
>> +skip_if_no_cs_etm_event || exit 2
>> +
>> +# Assume an error unless we reach the very end
>> +set -e
>> +glb_err=1
>> +
>> +perfdata_dir=$(mktemp -d /tmp/__perf_test.perf.data.XXXXX)
>> +perfdata=${perfdata_dir}/perf.data
>> +file=$(mktemp /tmp/temporary_file.XXXXX)
>> +
>> +cleanup_files()
>> +{
>> + set +e
>> + rm -rf ${perfdata_dir}
>> + rm -f ${file}
>> + trap - EXIT TERM INT
>> + exit $glb_err
>> +}
>> +
>> +trap cleanup_files EXIT TERM INT
>> +
>> +# Ranges start and end on branches, so check for some likely branch
>> instructions
>> +sep="\s\|\s"
>> +branch_search="\sbl${sep}b${sep}b.ne${sep}b.eq${sep}cbz\s"
>> +
>> +## Test kernel ##
>> +if [ -e /proc/kcore ]; then
>> + echo "Testing kernel disassembly"
>> + perf record -o ${perfdata} -e cs_etm//k --kcore -- touch $file
>> > /dev/null 2>&1
>> + perf script -i ${perfdata} -s
>> python:tools/perf/scripts/python/arm-cs-trace-disasm.py -- \
>> + -d --stop-sample=30 2> /dev/null > ${file}
>
> This is fine for self test. But for a CI test in a distro, will it fail to
> find script with prefix 'tools/perf/...'?
>
> Thanks,
> Leo
>
Nice catch, it should be this:
# Relative path works whether it's installed or running from repo
script_path=$(dirname "$0")/../../scripts/python/arm-cs-trace\
-disasm.py
This patch series is rebased on coresight-next-v6.12.
Changelog from v9:
* Add common helper function of_tmc_get_reserved_resource_by_name
for better code reuse
* Reserved buffer validity and crashdata validity has been separated to
avoid interdependence
* New fields added to crash metadata: version, ffcr, ffsr, mode
* Version checks added for metadata validation
* Special file /dev/crash_tmc_xxx would be available only when
crash metadata is valid
* Removed READ_CRASHDATA mode meant for special casing crashdata reads.
Instead, dedicated read function added for crashdata reads from reserved
buffer which is common for both ETR and ETF sinks as well.
* Documentation added to Documentation/tracing/coresight/panic.rst
Changelog from v8:
* Added missing exit path on error in __tmc_probe.
* Few whitespace fixes, checkpatch fixes.
* With perf sessions honouring stop_on_flush sysfs attribute,
removed redundant variable stop_on_flush_en.
Changelog from v7:
* Fixed breakage on perf test -vvvv "arm coresight".
No issues seen with and without "resrv" buffer mode
* Moved the crashdev registration into a separate function.
* Removed redundant variable in tmc_etr_setup_crashdata_buf
* Avoided a redundant memcpy in tmc_panic_sync_etf.
* Tested kernel panic with trace session started uisng perf.
Please see the title "Perf based testing" below for details.
For this, stop_on_flush sysfs attribute is taken into
consideration while starting perf sessions as well.
Changelog from v6:
* Added special device files for reading crashdata, so that
read_prevboot mode flag is removed.
* Added new sysfs TMC device attribute, stop_on_flush.
Stop on flush trigger event is disabled by default.
User need to explicitly enable this from sysfs for panic stop
to work.
* Address parameter for panicstop ETM configuration is
chosen as kernel "panic" address by default.
* Added missing tmc_wait_for_tmcready during panic handling
* Few other misc code rearrangements.
Changelog from v5:
* Fixed issues reported by CONFIG_DEBUG_ATOMIC_SLEEP
* Fixed a memory leak while reading data from /dev/tmc_etrx in
READ_PREVBOOT mode
* Tested reading trace data from crashdump kernel
Changelog from v4:
* Device tree binding
- Description is made more explicit on the usage of reserved memory
region
- Mismatch in memory region names in dts binding and driver fixed
- Removed "mem" suffix from the memory region names
* Rename "struct tmc_register_snapshot" -> "struct tmc_crash_metadata",
since it contains more than register snapshot.
Related variables are named accordingly.
* Rename struct tmc_drvdata members
resrv_buf -> crash_tbuf
metadata -> crash_mdata
* Size field in metadata refers to RSZ register and hence indicates the
size in 32 bit words. ETR metadata follows this convention, the same
has been extended to ETF metadata as well.
* Added crc32 for more robust metadata and tracedata validation.
* Added/modified dev_dbg messages during metadata validation
* Fixed a typo in patch 5 commit description
Changelog from v3:
* Converted the Coresight ETM driver change to a named configuration.
RFC tag has been removed with this change.
* Fixed yaml issues reported by "make dt_binding_check"
* Added names for reserved memory regions 0 and 1
* Added prevalidation checks for metadata processing
* Fixed a regression introduced in RFC v3
- TMC Status register was getting saved wrongly
* Reverted memremap attribute changes from _WB to _WC to match
with the dma map attributes
* Introduced reserved buffer mode specific .sync op.
This fixes a possible crash when reserved buffer mode was used in
normal trace capture, due to unwanted dma maintenance operations.
*** SUBJECT HERE ***
*** BLURB HERE ***
Linu Cherian (8):
dt-bindings: arm: coresight-tmc: Add "memory-region" property
coresight: tmc-etr: Add support to use reserved trace memory
coresight: core: Add provision for panic callbacks
coresight: tmc: Enable panic sync handling
coresight: tmc: Add support for reading crash data
coresight: tmc: Stop trace capture on FlIn
coresight: config: Add preloaded configuration
Documentation: coresight: Panic support
.../bindings/arm/arm,coresight-tmc.yaml | 26 ++
Documentation/trace/coresight/panic.rst | 356 ++++++++++++++++++
drivers/hwtracing/coresight/Makefile | 2 +-
.../coresight/coresight-cfg-preload.c | 2 +
.../coresight/coresight-cfg-preload.h | 2 +
.../hwtracing/coresight/coresight-cfg-pstop.c | 83 ++++
drivers/hwtracing/coresight/coresight-core.c | 42 +++
.../hwtracing/coresight/coresight-tmc-core.c | 296 ++++++++++++++-
.../hwtracing/coresight/coresight-tmc-etf.c | 124 +++++-
.../hwtracing/coresight/coresight-tmc-etr.c | 231 +++++++++++-
drivers/hwtracing/coresight/coresight-tmc.h | 106 ++++++
include/linux/coresight.h | 24 ++
12 files changed, 1282 insertions(+), 12 deletions(-)
create mode 100644 Documentation/trace/coresight/panic.rst
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-pstop.c
--
2.34.1
On 13/09/2024 12:54, Leo Yan wrote:
> On 9/12/24 16:11, James Clark wrote:>
>>
>> Previously when the incorrect binary was used for decode, Perf would
>> silently continue to generate incorrect samples. With OpenCSD 1.5.4 we
>> can enable consistency checks that do a best effort to detect a mismatch
>> in the image. When one is detected a warning is printed and sample
>> generation stops until the trace resynchronizes with a good part of the
>> image.
>>
>> Reported-by: Ganapatrao Kulkarni <gankulkarni(a)os.amperecomputing.com>
>> Closes:
>> https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.ampereco…
>> Signed-off-by: James Clark <james.clark(a)linaro.org>
>> ---
>> tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> index b78ef0262135..b85a8837bddc 100644
>> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> @@ -685,9 +685,14 @@ cs_etm_decoder__create_etm_decoder(struct
>> cs_etm_decoder_params *d_params,
>> }
>>
>> if (d_params->operation == CS_ETM_OPERATION_DECODE) {
>> + int decode_flags = OCSD_CREATE_FLG_FULL_DECODER;
>> +#ifdef OCSD_OPFLG_N_UNCOND_DIR_BR_CHK
>> + decode_flags |= OCSD_OPFLG_N_UNCOND_DIR_BR_CHK |
>> OCSD_OPFLG_CHK_RANGE_CONTINUE |
>> + ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK;
>> +#endif
>
> Looks good to me.
>
> Just one question: should the flag ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK be set
> according to ETM version? E.g. it should be only set for ETMv4 or this is
> fine for ETE as well.
>
> Thanks,
> Leo
>
I asked Mike the same question about ETMv3 and he said none of the flags
overlap and it was ok to always pass them. So I assume the same applies
to ETE as well.
Change since V3:
1. Use ^ete-[0-9]+$ for the pattern of node name -- comments from Rob Herring
Change since V2:
1. Change the name in binding as 'ete'.
Change since V1:
1. Remove the pattern match of ETE node name.
2. Update the tmc-etr node name in DT.
Mao Jinlong (2):
dt-bindings: arm: coresight: Update the pattern of ete node name
arm64: dts: qcom: sm8450: Add coresight nodes
.../arm/arm,embedded-trace-extension.yaml | 6 +-
arch/arm64/boot/dts/qcom/sm8450.dtsi | 726 ++++++++++++++++++
2 files changed, 729 insertions(+), 3 deletions(-)
--
2.46.0
A set of changes that came out of the issues reported here [1].
* First 2 patches fix a decode bug in Perf and add support for new
consistency checks in OpenCSD
* The remaining ones make the disassembly script easier to test
and use. This also involves adding a new Python binding to
Perf to get a config value (perf_config_get())
[1]: https://lore.kernel.org/linux-arm-kernel/20240719092619.274730-1-gankulkarn…
Changes since V1:
* Keep the flush function for discontinuities
* Still remove the flush when the buffer fills, but now add
cs_etm__end_block() for the end trace. That way we won't drop
the last branch stack if the instruction sample period wasn't
hit at the very end.
James Clark (7):
perf cs-etm: Don't flush when packet_queue fills up
perf cs-etm: Use new OpenCSD consistency checks
perf scripting python: Add function to get a config value
perf scripts python cs-etm: Update to use argparse
perf scripts python cs-etm: Improve arguments
perf scripts python cs-etm: Add start and stop arguments
perf test: cs-etm: Test Coresight disassembly script
.../perf/Documentation/perf-script-python.txt | 2 +-
.../scripts/python/Perf-Trace-Util/Context.c | 11 ++
.../scripts/python/arm-cs-trace-disasm.py | 109 +++++++++++++-----
.../tests/shell/test_arm_coresight_disasm.sh | 63 ++++++++++
tools/perf/util/config.c | 22 ++++
tools/perf/util/config.h | 1 +
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 +-
tools/perf/util/cs-etm.c | 25 ++--
8 files changed, 205 insertions(+), 35 deletions(-)
create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
--
2.34.1
On 11/09/2024 09:14, Leo Yan wrote:
>
>
> On 9/10/2024 9:28 PM, Leo Yan wrote:
>> On 9/5/2024 11:50 AM, James Clark wrote:
>>
>> [...]
>>
>>> cs_etm__flush(), like cs_etm__sample() is an operation that generates a
>>> sample and then swaps the current with the previous packet. Calling
>>> flush after processing the queues results in two swaps which corrupts
>>> the next sample. Therefore it wasn't appropriate to call flush here so
>>> remove it.
>>
>> In the cs_etm__sample(), if the period is not overflow, it is not necessarily
>> to generate instruction samples and copy back stack entries. This is why we
>> want to call cs_etm__flush() to make sure the last packet can be recorded
>> properly for instruction sample with back stacks.
>>
>> We also need to take account into the case for the end of the session - in
>> this case we need to generate samples for the last packet for complete info.
>>
>> I am wandering should we remove the cs_etm__packet_swap() from cs_etm__sample()?
>
> Sorry for typo. I meant to remove the cs_etm__packet_swap() from cs_etm__flush().
>
> Thanks,
> Leo
Turns out there was already cs_etm__end_block() for the end of the
session, but it was only called for the timeless modes. I added it for
timestamped mode too in V2.
I also kept the existing flush() function for discontinuities. I changed
my mind that the differences to cs_etm__sample() weren't relevant.
So I think we still need to keep the swap in flush() because otherwise
the next sample won't start from the right place.
Some HW has static trace id which cannot be changed via
software programming. For this case, configure the trace id
in device tree with "arm,static-trace-id = <xxx>", and
call coresight_trace_id_get_static_system_id with the trace id value
in device probe function. The id will be reserved for the HW
all the time if the device is probed.
Changes since V3:
1. Adda new API function
int coresight_trace_id_get_system_static_id(int trace_id).
2. Use the term "static trace id" for these devices where
the hardware sets a non-programmable trace ID.
Changes since V2:
1. Change "trace-id" to "arm,trace-id".
2. Add trace id flag for getting preferred id or ODD id.
Changes since V1:
1. Add argument to coresight_trace_id_get_system_id for preferred id
instead of adding new function coresight_trace_id_reserve_system_id.
2. Add constraint to trace-id in dt-binding file.
Mao Jinlong (3):
dt-bindings: arm: Add arm,trace-id for coresight dummy source
coresight: Add support to get static id for system trace sources
coresight: dummy: Add static trace id support for dummy source
.../sysfs-bus-coresight-devices-dummy-source | 15 +++++
.../arm/arm,coresight-dummy-source.yaml | 6 ++
drivers/hwtracing/coresight/coresight-dummy.c | 59 +++++++++++++++++--
.../hwtracing/coresight/coresight-platform.c | 26 ++++++++
.../hwtracing/coresight/coresight-trace-id.c | 38 ++++++++----
.../hwtracing/coresight/coresight-trace-id.h | 9 +++
include/linux/coresight.h | 1 +
7 files changed, 140 insertions(+), 14 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-dummy-source
--
2.46.0
Changes in V8:
- in arm-cs-trace-disasm.py, ensure map_pgoff is not converted to
string.
- Remove map_pgoff integer conversion in dso not found print
message.
Changes in V7:
- In arm-cs-trace-disasm.py, fix print message core dump resulting
from mixed type arithmetic.
- Modify CS_ETM_TRACE_ON filter to filter zero start_addr. The
CS_ETM_TRACE_ON message is changed to print only in verbose mode.
- Removed verbose mode only notification for start_addr/stop_addr
outside of dso address range.
Changes in V6:
- In arm-cs-trace-disasm.py, zero map_pgoff for kernel files. Add
map_pgoff to start/end address for dso not found message.
- Added "Reviewed-by" trailer for patches 1-3 previously reviewed
by Leo Yan in V4 and V5.
Changes in V5:
- In symbol-elf.c, branch to exit_close label if open file.
- In trace_event_python.c, correct indentation. set_sym_in_dict
call parameter "map_pgoff" renamed as "addr_map_pgoff" to
match local naming.
Changes in V4:
- In trace-event-python.c, fixed perf-tools-next merge problem.
Changes in V3:
- Rebased to linux-perf-tools branch.
- Squash symbol-elf.c and symbol.h into same commit.
- In map.c, merge dso__is_pie() call into existing if statement.
- In arm-cs-trace-disasm.py, remove debug artifacts.
Changes in V2:
- In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
checks per Leo Yan review.
- Updated mailing list distribution.
Steve Clevenger (4):
Add dso__is_pie call to identify ELF PIE
Force MAPPING_TYPE__IDENTIY for PIE
Add map pgoff to python dictionary based on MAPPING_TYPE
Adjust objdump start/end range per map pgoff parameter
.../scripts/python/arm-cs-trace-disasm.py | 17 ++++--
tools/perf/util/map.c | 4 +-
.../scripting-engines/trace-event-python.c | 13 +++-
tools/perf/util/symbol-elf.c | 61 +++++++++++++++++++
tools/perf/util/symbol.h | 1 +
5 files changed, 86 insertions(+), 10 deletions(-)
--
2.44.0