Changes since V4:
1. Use ^ete(-[0-9]+)?$ as the node name pattern -- comments from Krzysztof Kozlowski <krzk(a)kernel.org>
2. Update commit message -- comments from Rob Herring <robh(a)kernel.org>
Changes since V3:
1. Use ^ete-[0-9]+$ as the node name pattern -- comments from Rob Herring
Changes since V2:
1. Change the name in the binding to 'ete'.
Changes since V1:
1. Remove the pattern match on the ETE node name.
2. Update the tmc-etr node name in DT.
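As a rough illustration of how the two node name patterns in the changelog differ, the sketch below runs both against a few candidate names. It uses Python's re module rather than the dt-schema tooling, on the assumption that patterns this simple behave the same way in both; the sample names are made up for the example.

```python
import re

# Pattern listed under "since V4": the numeric suffix is optional.
OPTIONAL_SUFFIX = re.compile(r"^ete(-[0-9]+)?$")
# Pattern listed under "since V3": the numeric suffix is required.
REQUIRED_SUFFIX = re.compile(r"^ete-[0-9]+$")

# Hypothetical node names, chosen only to exercise the patterns.
names = ["ete", "ete-0", "ete-12", "ete-", "ete0", "etm-0"]

for name in names:
    optional = bool(OPTIONAL_SUFFIX.search(name))
    required = bool(REQUIRED_SUFFIX.search(name))
    print(f"{name!r}: optional-suffix={optional} required-suffix={required}")
```

The only name the two patterns disagree on is a bare "ete", which the newer pattern accepts and the older one rejects.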
Mao Jinlong (2):
dt-bindings: arm: coresight: Update the pattern of ete node name
arm64: dts: qcom: sm8450: Add coresight nodes
.../arm/arm,embedded-trace-extension.yaml | 6 +-
arch/arm64/boot/dts/qcom/sm8450.dtsi | 726 ++++++++++++++++++
2 files changed, 729 insertions(+), 3 deletions(-)
--
2.46.0
On 25/09/2024 12:39 am, Ilkka Koskinen wrote:
> If one builds perf with DEBUG=1, captures data on multiple CPUs and
> finally runs 'perf report -C <cpu>' for only one of the cpus, assert()
> aborts the program. This happens because there are empty queues with
> format set. This patch changes the condition to abort only if a queue
> is not empty and if the format is unset.
>
> $ make -C tools/perf DEBUG=1 CORESIGHT=1 CSLIBS=/usr/lib CSINCLUDES=/usr/include install
> $ perf record -o kcore --kcore -e cs_etm/timestamp/k -s -C 0-1 dd if=/dev/zero of=/dev/null bs=1M count=1
> $ perf report --input kcore/data --vmlinux=/home/ikoskine/projects/linux/vmlinux -C 1
> Aborted (core dumped)
>
> Fixes: 57880a7966be ("perf: cs-etm: Allocate queues for all CPUs")
> Signed-off-by: Ilkka Koskinen <ilkka(a)os.amperecomputing.com>
> ---
> tools/perf/util/cs-etm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 90f32f327b9b..40f047baef81 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -3323,7 +3323,7 @@ static int cs_etm__create_decoders(struct cs_etm_auxtrace *etm)
> * Don't create decoders for empty queues, mainly because
> * etmq->format is unknown for empty queues.
> */
> - assert(empty == (etmq->format == UNSET));
> + assert(empty || etmq->format != UNSET);
> if (empty)
> continue;
>
Oops, I didn't realize you could filter on CPU in report mode. Thanks for
the fix. Adding a test to the end of test_arm_coresight.sh might be
quite useful. Either way:
Reviewed-by: James Clark <james.clark(a)linaro.org>
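For anyone comparing the two assert conditions in the patch above, a small truth table may help. This is only a sketch in Python, not perf code: format_unset stands in for (etmq->format == UNSET), and the function names are made up for the example.

```python
from itertools import product


def old_check(empty, format_unset):
    # Old condition: assert(empty == (etmq->format == UNSET))
    return empty == format_unset


def new_check(empty, format_unset):
    # New condition: assert(empty || etmq->format != UNSET)
    return empty or not format_unset


# Evaluate both conditions for every combination of the two inputs.
rows = [
    (empty, format_unset, old_check(empty, format_unset), new_check(empty, format_unset))
    for empty, format_unset in product([False, True], repeat=2)
]

for empty, format_unset, old, new in rows:
    print(f"empty={empty!s:5} format_unset={format_unset!s:5} old={old!s:5} new={new!s:5}")
```

The two conditions differ in exactly one row: an empty queue whose format is set, which is the case described in the commit message. A non-empty queue with an unset format still fails under both.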
On 18/09/2024 12:23 pm, Ganapatrao Kulkarni wrote:
>
> Hi James,
>
> On 16-09-2024 07:27 pm, James Clark wrote:
>> A set of changes that came out of the issues reported here [1].
>>
>> * First 2 patches fix a decode bug in Perf and add support for new
>> consistency checks in OpenCSD
>> * The remaining ones make the disassembly script easier to test
>> and use. This also involves adding a new Python binding to
>> Perf to get a config value (perf_config_get())
>>
>> [1]: https://lore.kernel.org/linux-arm-kernel/20240719092619.274730-1-gankulkarn…
>>
>
> Tried this series with the commands below and the issue reported in [1]
> is not seen.
>
> record:
> timeout 8s ./perf record -e cs_etm// -C 1 -o kcore --kcore dd if=/dev/zero of=/dev/null
>
> decode:
> ./perf script -i ./kcore -s scripts/python/arm-cs-trace-disasm.py -- -d objdump -k kcore/kcore_dir/kcore
>
> ./perf script -i ./kcore -s scripts/python/arm-cs-trace-disasm.py -F cpu,event,ip,addr,sym -- -d objdump -k kcore/kcore_dir/kcore
>
> Feel free to add for 1/7 and 2/7.
> Tested-by: Ganapatrao Kulkarni <gankulkarni(a)os.amperecomputing.com>
>
Thanks for testing!
A set of changes that came out of the issues reported here [1].
* First 2 patches fix a decode bug in Perf and add support for new
consistency checks in OpenCSD
* The remaining ones make the disassembly script easier to test
and use. This also involves adding a new Python binding to
Perf to get a config value (perf_config_get())
[1]: https://lore.kernel.org/linux-arm-kernel/20240719092619.274730-1-gankulkarn…
Changes since V1:
* Keep the flush function for discontinuities
* Still remove the flush when the buffer fills, but now add
cs_etm__end_block() for the end trace. That way we won't drop
the last branch stack if the instruction sample period wasn't
hit at the very end.
James Clark (7):
perf cs-etm: Don't flush when packet_queue fills up
perf cs-etm: Use new OpenCSD consistency checks
perf scripting python: Add function to get a config value
perf scripts python cs-etm: Update to use argparse
perf scripts python cs-etm: Improve arguments
perf scripts python cs-etm: Add start and stop arguments
perf test: cs-etm: Test Coresight disassembly script
.../perf/Documentation/perf-script-python.txt | 2 +-
.../scripts/python/Perf-Trace-Util/Context.c | 11 ++
.../scripts/python/arm-cs-trace-disasm.py | 109 +++++++++++++-----
.../tests/shell/test_arm_coresight_disasm.sh | 63 ++++++++++
tools/perf/util/config.c | 22 ++++
tools/perf/util/config.h | 1 +
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 +-
tools/perf/util/cs-etm.c | 25 ++--
8 files changed, 205 insertions(+), 35 deletions(-)
create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
--
2.34.1
A set of changes that came out of the issues reported here [1].
* First 2 patches fix a decode bug in Perf and add support for new
consistency checks in OpenCSD
* The remaining ones make the disassembly script easier to test
and use. This also involves adding a new Python binding to
Perf to get a config value (perf_config_get())
[1]: https://lore.kernel.org/linux-arm-kernel/20240719092619.274730-1-gankulkarn…
Changes since V2:
* Check validity of start and stop arguments
* Make test work if Perf was installed
* Document that start and stop time are monotonic clock values
Changes since V1:
* Keep the flush function for discontinuities
* Still remove the flush when the buffer fills, but now add
cs_etm__end_block() for the end trace. That way we won't drop
the last branch stack if the instruction sample period wasn't
hit at the very end.
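To give a feel for what checking the start and stop arguments could look like with argparse, here is a minimal sketch. It is not the code from arm-cs-trace-disasm.py: only --stop-sample appears elsewhere in this thread, so --start-sample, the parse_range() helper and the error wording are assumptions made for the example.

```python
import argparse


def parse_range(argv):
    """Parse a sample range and reject one where start is after stop."""
    parser = argparse.ArgumentParser(description="Sample range sketch")
    # --start-sample is an assumed counterpart to --stop-sample.
    parser.add_argument("--start-sample", type=int, default=None,
                        help="first sample index to process")
    parser.add_argument("--stop-sample", type=int, default=None,
                        help="last sample index to process")
    args = parser.parse_args(argv)

    if (args.start_sample is not None and args.stop_sample is not None
            and args.start_sample > args.stop_sample):
        # parser.error() prints a usage message and exits with status 2.
        parser.error("--start-sample must not be greater than --stop-sample")
    return args


if __name__ == "__main__":
    print(parse_range(["--start-sample=5", "--stop-sample=30"]))
```

Using parser.error() keeps the failure path consistent with how argparse reports its own parsing errors.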
James Clark (7):
perf cs-etm: Don't flush when packet_queue fills up
perf cs-etm: Use new OpenCSD consistency checks
perf scripting python: Add function to get a config value
perf scripts python cs-etm: Update to use argparse
perf scripts python cs-etm: Improve arguments
perf scripts python cs-etm: Add start and stop arguments
perf test: cs-etm: Test Coresight disassembly script
.../perf/Documentation/perf-script-python.txt | 2 +-
.../scripts/python/Perf-Trace-Util/Context.c | 11 ++
.../scripts/python/arm-cs-trace-disasm.py | 127 ++++++++++++++----
.../tests/shell/test_arm_coresight_disasm.sh | 65 +++++++++
tools/perf/util/config.c | 22 +++
tools/perf/util/config.h | 1 +
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 +-
tools/perf/util/cs-etm.c | 25 +++-
8 files changed, 225 insertions(+), 35 deletions(-)
create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
--
2.34.1
On 13/09/2024 14:35, Leo Yan wrote:
>
>
> On 9/12/24 16:11, James Clark wrote:
>>
>> Run a few samples through the disassembly script and check to see that
>> at least one branch instruction is printed.
>>
>> Signed-off-by: James Clark <james.clark(a)linaro.org>
>> ---
>> .../tests/shell/test_arm_coresight_disasm.sh | 63 +++++++++++++++++++
>> 1 file changed, 63 insertions(+)
>> create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
>>
>> diff --git a/tools/perf/tests/shell/test_arm_coresight_disasm.sh b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
>> new file mode 100755
>> index 000000000000..6d004bf29f80
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
>> @@ -0,0 +1,63 @@
>> +#!/bin/sh
>> +# Check Arm CoreSight disassembly script completes without errors
>> +# SPDX-License-Identifier: GPL-2.0
>> +
>> +# The disassembly script reconstructs ranges of instructions and gives these to objdump to
>> +# decode. objdump doesn't like ranges that go backwards, but these are a good indication
>> +# that decoding has gone wrong either in OpenCSD, Perf or in the range reconstruction in
>> +# the script. Test all 3 parts are working correctly by running the script.
>> +
>> +skip_if_no_cs_etm_event() {
>> + perf list | grep -q 'cs_etm//' && return 0
>> +
>> + # cs_etm event doesn't exist
>> + return 2
>> +}
>> +
>> +skip_if_no_cs_etm_event || exit 2
>> +
>> +# Assume an error unless we reach the very end
>> +set -e
>> +glb_err=1
>> +
>> +perfdata_dir=$(mktemp -d /tmp/__perf_test.perf.data.XXXXX)
>> +perfdata=${perfdata_dir}/perf.data
>> +file=$(mktemp /tmp/temporary_file.XXXXX)
>> +
>> +cleanup_files()
>> +{
>> + set +e
>> + rm -rf ${perfdata_dir}
>> + rm -f ${file}
>> + trap - EXIT TERM INT
>> + exit $glb_err
>> +}
>> +
>> +trap cleanup_files EXIT TERM INT
>> +
>> +# Ranges start and end on branches, so check for some likely branch instructions
>> +sep="\s\|\s"
>> +branch_search="\sbl${sep}b${sep}b.ne${sep}b.eq${sep}cbz\s"
>> +
>> +## Test kernel ##
>> +if [ -e /proc/kcore ]; then
>> + echo "Testing kernel disassembly"
>> + perf record -o ${perfdata} -e cs_etm//k --kcore -- touch $file > /dev/null 2>&1
>> + perf script -i ${perfdata} -s python:tools/perf/scripts/python/arm-cs-trace-disasm.py -- \
>> + -d --stop-sample=30 2> /dev/null > ${file}
>
> This is fine for self test. But for a CI test in a distro, will it fail to
> find script with prefix 'tools/perf/...'?
>
> Thanks,
> Leo
>
Nice catch, it should be this:
# Relative path works whether it's installed or running from repo
script_path=$(dirname "$0")/../../scripts/python/arm-cs-trace-disasm.py
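To show why a path relative to the test file works in both layouts, here is the same resolution written out in Python. It is a sketch only: the script_path() helper is made up for the example, the install prefix is a hypothetical location, and the result is normalized for readability, which the shell snippet above does not do.

```python
import posixpath

# Location of the disassembly script relative to the test's directory,
# taken from the shell snippet above.
REL_SCRIPT = "../../scripts/python/arm-cs-trace-disasm.py"


def script_path(test_file):
    """Resolve the script relative to the directory holding the test file."""
    test_dir = posixpath.dirname(test_file)
    return posixpath.normpath(posixpath.join(test_dir, REL_SCRIPT))


# Running from a source tree.
print(script_path("tools/perf/tests/shell/test_arm_coresight_disasm.sh"))

# Hypothetical install prefix, used only for illustration.
print(script_path("/usr/libexec/perf-core/tests/shell/test_arm_coresight_disasm.sh"))
```

As long as the tests/shell and scripts/python directories keep the same position relative to each other, the result is valid wherever the tree lives.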
On 13/09/2024 12:54, Leo Yan wrote:
> On 9/12/24 16:11, James Clark wrote:
>>
>> Previously when the incorrect binary was used for decode, Perf would
>> silently continue to generate incorrect samples. With OpenCSD 1.5.4 we
>> can enable consistency checks that do a best effort to detect a mismatch
>> in the image. When one is detected a warning is printed and sample
>> generation stops until the trace resynchronizes with a good part of the
>> image.
>>
>> Reported-by: Ganapatrao Kulkarni <gankulkarni(a)os.amperecomputing.com>
>> Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.ampereco…
>> Signed-off-by: James Clark <james.clark(a)linaro.org>
>> ---
>> tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> index b78ef0262135..b85a8837bddc 100644
>> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> @@ -685,9 +685,14 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params,
>> }
>>
>> if (d_params->operation == CS_ETM_OPERATION_DECODE) {
>> + int decode_flags = OCSD_CREATE_FLG_FULL_DECODER;
>> +#ifdef OCSD_OPFLG_N_UNCOND_DIR_BR_CHK
>> + decode_flags |= OCSD_OPFLG_N_UNCOND_DIR_BR_CHK | OCSD_OPFLG_CHK_RANGE_CONTINUE |
>> + ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK;
>> +#endif
>
> Looks good to me.
>
> Just one question: should the flag ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK be set
> according to the ETM version? E.g. should it only be set for ETMv4, or is
> it fine for ETE as well?
>
> Thanks,
> Leo
>
I asked Mike the same question about ETMv3 and he said none of the flags
overlap and it was ok to always pass them. So I assume the same applies
to ETE as well.
Changes since V3:
1. Use ^ete-[0-9]+$ as the node name pattern -- comments from Rob Herring
Changes since V2:
1. Change the name in the binding to 'ete'.
Changes since V1:
1. Remove the pattern match on the ETE node name.
2. Update the tmc-etr node name in DT.
Mao Jinlong (2):
dt-bindings: arm: coresight: Update the pattern of ete node name
arm64: dts: qcom: sm8450: Add coresight nodes
.../arm/arm,embedded-trace-extension.yaml | 6 +-
arch/arm64/boot/dts/qcom/sm8450.dtsi | 726 ++++++++++++++++++
2 files changed, 729 insertions(+), 3 deletions(-)
--
2.46.0