Hello,
I am a graduate student at Virginia Commonwealth University working on
execution and data monitoring of my application running on an STM32F4 board.
I have a runtime monitor on an FPGA that would analyse the trace
information.
But, to transfer the instruction traces from the ETM and the data traces to
the FPGA, I would need a trace decoder on the FPGA. This trace decoder would
decode the ETM traces from the STM32 board for the FPGA monitor. I want to
do monitoring at runtime. I came across the OpenCSD GitHub repository. Do
you think this could fit well for our application? Can OpenCSD be
implemented on an FPGA to decode ETM traces?
Thank you for any inputs on this.
regards,
Smitha
Hi Jeremy,
Please CC the coresight mailing list when asking questions.
On Thu, 6 Jun 2019 at 02:55, Student - Ng Yi Zher Jeremy
<jeremy_ng(a)mymail.sutd.edu.sg> wrote:
>
> Dear Sir,
>
> I have been looking at the documentations for Coresight to understand how I may be able to set parameters and options to tracing units through sysFS.
>
> Looking at the documentations for etm4x and tmc (https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-coresight-de… and https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-coresight-de… respectively), I understand that the special files that I have access to read from registers directly are not writeable. However, in the coresight documentations,
What special files are you referring to?
> (https://static.docs.arm.com/ihi0064/f/etm_v4_4_architecture_specification_I…
> and http://infocenter.arm.com/help/topic/com.arm.doc.ddi0461b/DDI0461B_tmc_r0p1…
> respectively), some of these registers are actually writeable.
> Particularly, TRCCONFIGR in ETM drivers and MODE register in ETF
> drivers are RW accessible. However, when I try to write to these
> addresses directly from /dev/mem (or rather, mmap), I often get bus
> errors (even for those that claims to be readable).
Register TRCCONFIGR has been set as RO because, from sysfs, there was
no use case to make it otherwise. That can be altered if you need to
use some of the functionality in that register. Simply get back to me
with the one you're looking for and we can discuss how it will be made
available.
The ETF's MODE register does not need to be configured by users - the
framework will place the ETF in the correct mode based on its role in
the trace session. If the ETF's "enable_sink" entry is selected, the
ETF is used as a sink and will be configured in circular buffer mode.
If another sink is selected and the ETF is part of the path from a
source to that sink, the framework will configure it in HW FIFO mode.
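The selection rule above can be sketched as follows. This is only an illustration of the decision, not the driver code: the function and device names are hypothetical, and the enum values are assumed to mirror the TMC MODE register encoding.

```python
from enum import IntEnum


class TmcMode(IntEnum):
    # Assumed to mirror the TMC MODE register encoding.
    CIRCULAR_BUFFER = 0
    SOFTWARE_FIFO = 1
    HARDWARE_FIFO = 2


def etf_mode(etf, selected_sink, path):
    """Return the mode the framework would pick for `etf`.

    etf           -- name of the ETF device (hypothetical naming)
    selected_sink -- name of the device whose enable_sink was set
    path          -- device names on the path from the source to the sink
    Returns None when the ETF takes no part in the trace session.
    """
    if selected_sink == etf:
        # The ETF itself is the sink: it captures trace in its own RAM.
        return TmcMode.CIRCULAR_BUFFER
    if etf in path:
        # The ETF only sits between a source and another sink.
        return TmcMode.HARDWARE_FIFO
    return None


print(etf_mode("etf0", "etf0", ["etm0", "funnel0", "etf0"]))
print(etf_mode("etf0", "etr0", ["etm0", "funnel0", "etf0", "etr0"]))
```

In both cases the mode follows from the role of the ETF in the session, which is why the MODE register is not exposed for writing.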
>
> I am using Hikey960 device on AOSP Android version R, Linux kernel 4.9. $ uname -a returns Linux localhost 4.9.176-12953-g7c09ed7b46a4-dirty #13 SMP PREEMPT Tue Jun 4 10:16:26 +08 2019 aarch64
>
> Hikey960 have 2 CPUs with 4 processors each: Cortex-A53 and Cortex-A73. Both have ETM4.0 r4 chips installed (this was derived from TRCIDR1, which yields 0x4100f404 when read).
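As a side note, the TRCIDR1 value quoted above can be decoded with a few shifts and masks. The sketch below assumes the ETMv4 field layout (DESIGNER in bits [31:24], TRCARCHMAJ in [11:8], TRCARCHMIN in [7:4], REVISION in [3:0]); the function name is made up for illustration.

```python
def decode_trcidr1(value):
    """Split a TRCIDR1 value into the fields assumed from the ETMv4 layout."""
    return {
        "designer": (value >> 24) & 0xFF,   # 0x41 is the Arm designer code
        "arch_major": (value >> 8) & 0xF,   # TRCARCHMAJ
        "arch_minor": (value >> 4) & 0xF,   # TRCARCHMIN
        "revision": value & 0xF,            # REVISION
    }


fields = decode_trcidr1(0x4100F404)
print("ETMv{arch_major}.{arch_minor} r{revision}, designer 0x{designer:02x}".format(**fields))
# prints "ETMv4.0 r4, designer 0x41"
```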
>
> It will be a great help if you can assist me or point me to any link for reference.
I can't guide you to anything specific without a question.
Thanks,
Mathieu
>
> I look forward to your reply!
>
> Yours Sincerely,
> Jeremy
>
> This email may contain confidential and/or proprietary information that is exempt from disclosure under applicable law and is intended for receipt and use solely by the addressee(s) named above. If you are not the intended recipient, you are notified that any use, dissemination, distribution, or copying of this email, or any attachment, is strictly prohibited. Please delete the email immediately and inform the sender. Thank You
This patchset adds support for CoreSight CPU-wide trace scenarios. More
specifically it extends the work that was done for per thread scenarios to
handle more than a single trace ID. It also temporally correlates traces
based on timestamps generated by the tracers so that rendering by the perf
mechanic is ordered.
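To illustrate what "temporally correlating" several trace ID queues means, here is a minimal sketch. It is not the code from this set: the data layout (one timestamp-sorted list of packets per trace ID) and the function name are assumptions made for the example.

```python
import heapq


def merge_by_timestamp(queues):
    """Merge per-trace-ID packet lists into one timestamp-ordered stream.

    queues -- dict mapping a trace ID to a list of (timestamp, payload)
              tuples, each list already sorted by timestamp (assumption).
    Returns a list of (timestamp, trace_id, payload) tuples.
    """
    streams = (
        [(ts, trace_id, payload) for ts, payload in packets]
        for trace_id, packets in queues.items()
    )
    # heapq.merge only ever compares the head of each stream, which is the
    # same idea as picking the queue with the lowest pending timestamp.
    return list(heapq.merge(*streams, key=lambda item: item[0]))


queues = {
    0x10: [(100, "range A"), (300, "range C")],
    0x12: [(200, "range B"), (400, "range D")],
}
for ts, trace_id, payload in merge_by_timestamp(queues):
    print(ts, hex(trace_id), payload)
```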
Everything is based on Arnaldo's perf/core branch (46d4c9a05285). I will
send another revision when it is rebased to a 5.2 rc candidate.
Before this set:
# root@juno:/home/linaro# perf record -e cs_etm/@20070000.etr/ -C 2,3 sleep 1
failed to mmap with 12 (Cannot allocate memory)
After this set:
# root@juno:/home/linaro# perf record -e cs_etm/@20070000.etr/ -C 2,3 sleep 1
[ perf record: Captured and wrote 1.352 MB perf.data ]
Regards,
Mathieu
Changes for V2:
* Fixed error condition in function cs_etm_set_option() (Leo)
* Fixed changelog spelling error (Leo).
* Moved from calloc() to malloc() in cs_etm__etmq_get_traceid_queue()
* Got rid of CS_ETM_PACKET_QUEUE_NR macro
* Fixed indentation problem in function cs_etm__process_traceid_queue() (Leo).
Mathieu Poirier (17):
perf tools: Configure contextID tracing in CPU-wide mode
perf tools: Configure timestsamp generation in CPU-wide mode
perf tools: Configure SWITCH_EVENTS in CPU-wide mode
perf tools: Add handling of itrace start events
perf tools: Add handling of switch-CPU-wide events
perf tools: Refactor error path in cs_etm_decoder__new()
perf tools: Move packet queue out of decoder structure
perf tools: Fix indentation in function
cs_etm__process_decoder_queue()
perf tools: Introduce the concept of trace ID queues
perf tools: Get rid of unused cpu in struct cs_etm_queue
perf tools: Move thread to traceid_queue
perf tools: Move tid/pid to traceid_queue
perf tools: Use traceID aware memory callback API
perf tools: Add support for multiple traceID queues
perf tools: Linking PE contextID with perf thread mechanic
perf tools: Add notion of time to decoding code
perf tools: Add support for CPU-wide trace scenarios
tools/perf/Makefile.config | 3 +
tools/perf/arch/arm/util/cs-etm.c | 186 ++-
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 269 +++--
.../perf/util/cs-etm-decoder/cs-etm-decoder.h | 39 +-
tools/perf/util/cs-etm.c | 1026 +++++++++++++----
tools/perf/util/cs-etm.h | 103 ++
6 files changed, 1252 insertions(+), 374 deletions(-)
--
2.17.1
Update the documentation to reflect the new naming scheme with
latest changes.
Reported-by: Leo Yan <leo.yan(a)linaro.org>
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
---
Documentation/trace/coresight.txt | 34 +++++++++++++++++++---------------
1 file changed, 19 insertions(+), 15 deletions(-)
diff --git a/Documentation/trace/coresight.txt b/Documentation/trace/coresight.txt
index efbc832..7b427cf 100644
--- a/Documentation/trace/coresight.txt
+++ b/Documentation/trace/coresight.txt
@@ -326,16 +326,20 @@ amount of processor cores), the "cs_etm" PMU will be listed only once.
A Coresight PMU works the same way as any other PMU, i.e the name of the PMU is
listed along with configuration options within forward slashes '/'. Since a
Coresight system will typically have more than one sink, the name of the sink to
-work with needs to be specified as an event option. Names for sink to choose
-from are listed in sysFS under ($SYSFS)/bus/coresight/devices:
+work with needs to be specified as an event option.
+On newer kernels the available sinks are listed in sysFS under:
+($SYSFS)/bus/event_source/devices/cs_etm/sinks/
- root@linaro-nano:~# ls /sys/bus/coresight/devices/
- 20010000.etf 20040000.funnel 20100000.stm 22040000.etm
- 22140000.etm 230c0000.funnel 23240000.etm 20030000.tpiu
- 20070000.etr 20120000.replicator 220c0000.funnel
- 23040000.etm 23140000.etm 23340000.etm
+ root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls
+ tmc_etf0 tmc_etr0 tpiu0
- root@linaro-nano:~# perf record -e cs_etm/@20070000.etr/u --per-thread program
+On older kernels, this may need to be found from the list of coresight devices,
+available under ($SYSFS)/bus/coresight/devices/:
+
+ root@localhost:/sys/bus/coresight/devices# ls
+ etm0 etm1 etm2 etm3 etm4 etm5 funnel0 funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0
+
+ root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program
The syntax within the forward slashes '/' is important. The '@' character
tells the parser that a sink is about to be specified and that this is the sink
@@ -352,7 +356,7 @@ perf can be used to record and analyze trace of programs.
Execution can be recorded using 'perf record' with the cs_etm event,
specifying the name of the sink to record to, e.g:
- perf record -e cs_etm/@20070000.etr/u --per-thread
+ perf record -e cs_etm/@tmc_etr0/u --per-thread
The 'perf report' and 'perf script' commands can be used to analyze execution,
synthesizing instruction and branch events from the instruction trace.
@@ -381,7 +385,7 @@ sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tuto
Bubble sorting array of 30000 elements
5910 ms
- $ perf record -e cs_etm/@20070000.etr/u --per-thread taskset -c 2 ./sort
+ $ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort
Bubble sorting array of 30000 elements
12543 ms
[ perf record: Woken up 35 times to write data ]
@@ -405,7 +409,7 @@ than the program flow through the code.
As with any other CoreSight component, specifics about the STM tracer can be
found in sysfs with more information on each entry being found in [1]:
-root@genericarmv8:~# ls /sys/bus/coresight/devices/20100000.stm
+root@genericarmv8:~# ls /sys/bus/coresight/devices/stm0
enable_source hwevent_select port_enable subsystem uevent
hwevent_enable mgmt port_select traceid
root@genericarmv8:~#
@@ -413,14 +417,14 @@ root@genericarmv8:~#
Like any other source a sink needs to be identified and the STM enabled before
being used:
-root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/20010000.etf/enable_sink
-root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/20100000.stm/enable_source
+root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
+root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/stm0/enable_source
From there user space applications can request and use channels using the devfs
interface provided for that purpose by the generic STM API:
-root@genericarmv8:~# ls -l /dev/20100000.stm
-crw------- 1 root root 10, 61 Jan 3 18:11 /dev/20100000.stm
+root@genericarmv8:~# ls -l /dev/stm0
+crw------- 1 root root 10, 61 Jan 3 18:11 /dev/stm0
root@genericarmv8:~#
Details on how to use the generic STM API can be found here [2].
--
2.7.4
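For anyone scripting against the two sysfs locations described in the documentation update above, a minimal sketch follows. The two directory paths come from the patch text; the helper names and the name-prefix filter used for the older layout are assumptions made for illustration and are not part of the patch.

```python
import os

NEW_SINKS_DIR = "/sys/bus/event_source/devices/cs_etm/sinks"
OLD_DEVICES_DIR = "/sys/bus/coresight/devices"

# Assumption: sinks can be recognised by these name prefixes in the
# older listing, which contains every CoreSight device.
SINK_PREFIXES = ("tmc_etf", "tmc_etr", "tpiu", "etb")


def list_sinks(new_dir=NEW_SINKS_DIR, old_dir=OLD_DEVICES_DIR):
    """Return sink names, preferring the cs_etm PMU 'sinks' directory."""
    if os.path.isdir(new_dir):
        return sorted(os.listdir(new_dir))
    if os.path.isdir(old_dir):
        return sorted(
            name for name in os.listdir(old_dir)
            if name.startswith(SINK_PREFIXES)
        )
    return []


def sink_event(sink, modifiers="u"):
    """Build the perf event string; '@' introduces the sink name."""
    return "cs_etm/@{0}/{1}".format(sink, modifiers)


for sink in list_sinks():
    print("perf record -e {0} --per-thread program".format(sink_event(sink)))
```

On a machine without CoreSight support both directories are missing and the loop prints nothing.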
CTIs are defined in the device tree and associated with other CoreSight
devices. The core CoreSight code has been modified to enable the registration
of the CTI devices on the same bus as the other CoreSight components,
but as these are not actually trace generation / capture devices, they
are not part of the Coresight path when generating trace.
However, the definition of the standard CoreSight device has been extended
to include a reference to an associated CTI device, and the enable / disable
trace path operations will auto enable/disable any associated CTI devices at
the same time.
Programming is at present via sysfs - a full API is provided to utilise the
hardware capabilities. As CTI devices are unprogrammed by default, the auto
enable described above will have no effect until explicit programming takes
place.
A set of device tree bindings specific to the CTI topology has been defined.
Documentation has been updated to describe both the CTI hardware, its use and
programming in sysfs, and the new dts bindings required.
Tested on DB410 board, 5.1-rc5
Changes since v1:
1) Significant restructuring of the source code. Adds cti-sysfs file and
cti device tree file. Patches are now split per feature rather than per
source file.
2) CPU type power event handling for hotplug moved to CoreSight core,
with generic registration interface provided for all CPU bound CS devices
to use.
3) CTI signal interconnection details in sysfs are now generated dynamically
from connection lists in the driver. This fixes the issue with multi-line
sysfs output in the previous version.
4) Full device tree bindings for DB410 and Juno provided (to the extent
that CTI information is available).
5) The AMBA driver update for UCI IDs is now upstream so is no longer
included in this set.
Mike Leach (13):
drivers: coresight: cti: Initial CoreSight CTI Driver
drivers: coresight: cti: Adds sysfs functionality to CTI driver.
drivers: coresight: cti: Add device tree support for v8 arch CTI
drivers: coresight: cti: Add device tree support for impdef CTI.
drivers: coresight: cti: Enable CTI associated with devices.
drivers: coresight: cti: Add connection information to sysfs
drivers: coresight: cti: Add CoreSight cpu power notifications.
devicetree: bindings: Documentation for CTI bindings.
devicetree: bindings: Add header file with CTI trigger signal type
constants.
drivers: dts: Add CTI options for qcom msm8916
drivers: dts: Juno platform - add CTI entries to device tree.
docs: coresight: Update documentation for CoreSight to cover CTI.
docs: sysfs: coresight: Add sysfs documentation for CTI
.../testing/sysfs-bus-coresight-devices-cti | 225 +++
.../bindings/arm/coresight-ect-cti.txt | 203 +++
.../devicetree/bindings/arm/coresight.txt | 7 +
Documentation/trace/coresight.txt | 139 ++
arch/arm64/boot/dts/arm/juno-base.dtsi | 149 +-
arch/arm64/boot/dts/arm/juno-cs-r1r2.dtsi | 31 +-
arch/arm64/boot/dts/arm/juno-r1.dts | 25 +
arch/arm64/boot/dts/arm/juno-r2.dts | 25 +
arch/arm64/boot/dts/arm/juno.dts | 25 +
arch/arm64/boot/dts/qcom/msm8916.dtsi | 102 +-
drivers/hwtracing/coresight/Kconfig | 13 +
drivers/hwtracing/coresight/Makefile | 4 +
.../hwtracing/coresight/coresight-cti-sysfs.c | 1250 +++++++++++++++++
drivers/hwtracing/coresight/coresight-cti.c | 853 +++++++++++
drivers/hwtracing/coresight/coresight-cti.h | 280 ++++
drivers/hwtracing/coresight/coresight-priv.h | 37 +
drivers/hwtracing/coresight/coresight.c | 185 ++-
.../hwtracing/coresight/of_coresight-cti.c | 447 ++++++
include/dt-bindings/arm/coresight-cti-dt.h | 36 +
include/linux/coresight.h | 30 +
20 files changed, 4056 insertions(+), 10 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-cti
create mode 100644 Documentation/devicetree/bindings/arm/coresight-ect-cti.txt
create mode 100644 drivers/hwtracing/coresight/coresight-cti-sysfs.c
create mode 100644 drivers/hwtracing/coresight/coresight-cti.c
create mode 100644 drivers/hwtracing/coresight/coresight-cti.h
create mode 100644 drivers/hwtracing/coresight/of_coresight-cti.c
create mode 100644 include/dt-bindings/arm/coresight-cti-dt.h
--
2.20.1
We have a few places where we call smp_processor_id() from preemptible
contexts during the perf buffer handling. We do this to figure out the
NUMA node for the allocation in case the event is not CPU bound. Use
numa_node_id() instead in such cases to avoid a splat.
Changes since V2:
- Use NUMA_NO_NODE instead of numa_node_id() for event->cpu == -1. (Robin Murphy)
Suzuki K Poulose (4):
coresight: tmc-etr: Do not call smp_processor_id() from preemptible
coresight: tmc-etr: alloc_perf_buf: Do not call smp_processor_id from
preemptible
coresight: tmc-etf: Do not call smp_processor_id from preemptible
coresight: etb10: Do not call smp_processor_id from preemptible
drivers/hwtracing/coresight/coresight-etb10.c | 6 ++----
drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++----
drivers/hwtracing/coresight/coresight-tmc-etr.c | 13 ++++---------
3 files changed, 8 insertions(+), 17 deletions(-)
--
2.7.4
This series adds the support for CoreSight devices on ACPI based
platforms. The device connections are encoded as _DSD graph property[0],
with CoreSight specific extensions to indicate the direction of data
flow as described in [1]. Components attached to CPUs are listed
as child devices of the corresponding CPU, removing explicit links
to the CPU like we do in the DT.
The majority of the series cleans up the driver and prepares the subsystem
for platform agnostic firmware probing, naming scheme, searching etc.
We introduce platform independent helpers to parse the platform supplied
information. Thus we rename the platform handling code from:
of_coresight.c => coresight-platform.c
The CoreSight driver creates shadow devices that appear on the Coresight
bus, in addition to the real devices (e.g, AMBA bus devices). The names
of these devices match those of the real devices. This makes the device
names a bit cryptic on ACPI platforms. So this series also introduces a generic
platform agnostic device naming scheme for the shadow Coresight devices.
Towards this we also make changes to the way we lookup devices to resolve
the connections, as we can't use the names to identify the devices. So,
we use the "fwnode_handle" of the real device for the device lookups.
Towards that we clean up the drivers to keep track of the "CoreSight"
device rather than the "real" device. However, all real operations,
like DMA allocation, Power management etc. must be performed on
the real device which is the parent of the shadow device.
Finally we add the support for parsing the ACPI platform data. The power
management support is missing in the ACPI (and this is not specific to
CoreSight). The firmware must ensure that the respective power domains
are turned on.
Applies on v5.2-rc1
Tested on a Juno-r0 board with ACPI bindings patch (Patch 31/30) added on
top of [2]. You would need to make sure that the debug power domain is
turned on before the Linux kernel boots. (e.g, connect the DS-5 to the
Juno board while at UEFI). arm32 code is only compile tested.
[0] ACPI Device Graphs using _DSD (Not available online yet, approved but
awaiting publish and eventually should be linked at).
https://uefi.org/sites/default/files/resources/_DSD-implementation-guide-to…
[1] https://developer.arm.com/docs/den0067/latest/acpi-for-coresighttm-10-platf…
[2] https://github.com/tianocore/edk2-platforms.git
Changes since v3:
- Add tags from Mathieu
Changes since v2:
- Fix the symlink name for ETM devices under cs_etm PMU (Patch by Mathieu)
- Drop patches merged already in the tree.
- Add the tags from Mathieu
- More documentation with examples of ACPI graph in ACPI bindings support.
- Fix ETM4 error return path (Mathieu)
- Drop the patches exposing device links via sysfs, to be posted as separate
series.
- Drop the generic helper for device search by fwnode for a better cleanup
later.
- Split the ACPI bindings support patch for AMBA and platform devices.
- Return integer error for <platform>_get_platform_data() helpers.
- Fix comment about the return code for acpi_get_coresight_cpu().
- Ensure we don't have devices part of multiple graphs (Mathieu).
Changes since v1:
[ http://lists.infradead.org/pipermail/linux-arm-kernel/2019-March/639963.html ]
- Dropped the replicator driver merge changes as they were pulled already.
- Cleanups for Power management in the drivers.
- Reuse platform description for connection information. Also introduce
routines to clean up the platform description to make sure we drop
the references (fwnode_handle).
- Add RFC patches for exposing the device-links via sysfs.
- Drop tracking the device in favour of coresight_device.
- Name etb10 as "etb"
- Fix other comments in v1.
- Use a generic helper for searching with fwnode_handle rather than adding
one for CoreSight.
Mathieu Poirier (1):
coresight: Use coresight device names for sinks in PMU attribute
Suzuki K Poulose (29):
coresight: funnel: Clean up device book keeping
coresight: replicator: Cleanup device tracking
coresight: tmc: Clean up device specific data
coresight: catu: Cleanup device specific data
coresight: tpiu: Clean up device specific data
coresight: stm: Cleanup device specific data
coresight: etm: Clean up device specific data
coresight: etb10: Clean up device specific data
coresight: Rename of_coresight to coresight-platform
coresight: etm3x: Rearrange cp14 access detection
coresight: stm: Rearrange probing the stimulus area
coresight: tmc-etr: Rearrange probing default buffer size
coresight: platform: Make memory allocation helper generic
coresight: Make sure device uses DT for obsolete compatible check
coresight: Introduce generic platform data helper
coresight: Make device to CPU mapping generic
coresight: Remove cpu field from platform data
coresight: Remove name from platform description
coresight: Cleanup coresight_remove_conns
coresight: Reuse platform data structure for connection tracking
coresight: Rearrange platform data probing
coresight: Add support for releasing platform specific data
coresight: platform: Use fwnode handle for device search
coresight: Use fwnode handle instead of device names
coresight: Use platform agnostic names
coresight: stm: ACPI support for parsing stimulus base
coresight: Support for ACPI bindings
coresight: acpi: Support for AMBA components
coresight: acpi: Support for platform devices
drivers/acpi/acpi_amba.c | 9 +
drivers/hwtracing/coresight/Makefile | 3 +-
drivers/hwtracing/coresight/coresight-catu.c | 40 +-
drivers/hwtracing/coresight/coresight-catu.h | 1 -
drivers/hwtracing/coresight/coresight-cpu-debug.c | 3 +-
drivers/hwtracing/coresight/coresight-etb10.c | 51 +-
drivers/hwtracing/coresight/coresight-etm-perf.c | 8 +-
drivers/hwtracing/coresight/coresight-etm.h | 6 +-
.../hwtracing/coresight/coresight-etm3x-sysfs.c | 12 +-
drivers/hwtracing/coresight/coresight-etm3x.c | 45 +-
drivers/hwtracing/coresight/coresight-etm4x.c | 37 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 2 -
drivers/hwtracing/coresight/coresight-funnel.c | 35 +-
drivers/hwtracing/coresight/coresight-platform.c | 810 +++++++++++++++++++++
drivers/hwtracing/coresight/coresight-priv.h | 4 +
drivers/hwtracing/coresight/coresight-replicator.c | 42 +-
drivers/hwtracing/coresight/coresight-stm.c | 118 ++-
drivers/hwtracing/coresight/coresight-tmc-etf.c | 9 +-
drivers/hwtracing/coresight/coresight-tmc-etr.c | 44 +-
drivers/hwtracing/coresight/coresight-tmc.c | 96 +--
drivers/hwtracing/coresight/coresight-tmc.h | 2 -
drivers/hwtracing/coresight/coresight-tpiu.c | 24 +-
drivers/hwtracing/coresight/coresight.c | 164 ++++-
drivers/hwtracing/coresight/of_coresight.c | 297 --------
include/linux/coresight.h | 61 +-
25 files changed, 1332 insertions(+), 591 deletions(-)
create mode 100644 drivers/hwtracing/coresight/coresight-platform.c
delete mode 100644 drivers/hwtracing/coresight/of_coresight.c
ACPI bindings for Juno-r0 (applies on [2] above)
Suzuki K Poulose (1):
edk2-platform: juno: Update ACPI CoreSight Bindings
Platform/ARM/JunoPkg/AcpiTables/Dsdt.asl | 241 +++++++++++++++++++++++++++++++
1 file changed, 241 insertions(+)
--
2.7.4
From: Wojciech Zmuda <wzmuda(a)n7space.com>
This patchset adds time notion to perf instruction and branch samples to allow
coarse time measurement of code blocks execution.
The simplest verification is visibility of the time field in 'perf script' output:
root@zynq:~# perf record -e cs_etm/timestamp,@fe970000.etr/u -a sleep 1
Couldn't synthesize bpf events.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.262 MB perf.data ]
root@zynq:~# perf script --ns -F cpu,comm,time
perf [002] 9546.053455325:
perf [002] 9546.053455340:
perf [002] 9546.053455344:
(...)
sleep [003] 9546.060163742:
sleep [003] 9546.060163754:
sleep [003] 9546.060163766:
(...)
ntpd [001] 9546.389083194:
ntpd [001] 9546.389083400:
ntpd [001] 9546.389086319:
(...)
The step above works only if trace has been collected in CPU-wide mode because of some
perf event flags mismatch I'm working on fixing.
Timestamps in subsequent samples are monotonically increasing. The only exceptions
are discontinuities in the trace. From my understanding, we can't timestamp discontinuities
reasonably: after the decoder resynchronizes following trace loss, it needs to wait for
another timestamp packet. Thus, the time value of such samples stays at 0x0.
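The monotonicity claim can be checked mechanically on 'perf script --ns -F cpu,comm,time' output. The sketch below is only an assumption-laden helper, not part of the patch: it assumes the "comm [cpu] sec.nsec:" line shape shown earlier, a comm without spaces, and that a sample at a discontinuity carries a time of exactly 0.

```python
import re

# Matches lines such as "perf [002] 9546.053455325:"
LINE_RE = re.compile(r"^\s*(\S+)\s+\[(\d+)\]\s+(\d+)\.(\d{9}):")


def parse_times(lines):
    """Return (cpu, time_in_ns) for every line that matches LINE_RE."""
    samples = []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            cpu = int(m.group(2))
            ns = int(m.group(3)) * 10**9 + int(m.group(4))
            samples.append((cpu, ns))
    return samples


def non_monotonic(samples):
    """Return indices of samples older than the previous non-zero sample."""
    bad = []
    last = 0
    for i, (_cpu, ns) in enumerate(samples):
        if ns == 0:
            continue  # discontinuity: no timestamp available, skip it
        if ns < last:
            bad.append(i)
        last = ns
    return bad


output = [
    "perf [002] 9546.053455325:",
    "perf [002] 9546.053455340:",
    "sleep [003] 9546.060163742:",
]
print(non_monotonic(parse_times(output)))
```

An empty list means every timestamped sample is at least as recent as the one before it.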
Another way to access these values is to use the perf script engine, which I used to validate
the feature. The script below calculates the timestamp difference between two consecutive branches
sharing the same branch address. This is a simple example of an execution time fluctuation detector.
from __future__ import print_function
import os
import sys

sys.path.append(os.environ['PERF_EXEC_PATH'] + \
    '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')

from perf_trace_context import *

target_start_addr = int('4005e4', 16) # 0x4005e4 is func() from listing below

branch = dict()
branch['from'] = 0
branches = []

def process_event(s):
    global branch
    global branches

    sample = s['sample']

    branch['cpu'] = sample['cpu']
    if not branch['from']:
        branch['from'] = sample['addr']
        branch['ts'] = sample['time']
        return

    branch['to'] = sample['ip']
    if not branch['to']:
        branch['from'] = 0
        branch['ts'] = 0
        return

    if branch['from'] and branch['to']:
        branches.append(branch.copy())
        branch['from'] = 0
        return

def trace_end():
    global branches

    count = 0
    timestamp_start = 0

    print("Got {0} samples:".format(len(branches)))
    for b in branches:
        if b['from'] == target_start_addr:
            if not timestamp_start:
                timestamp_start = b['ts']
                continue

            print("[{0}]: ts diff = 0x{1:x} - 0x{2:x} = {3:d}".format(count,
                b['ts'], timestamp_start, b['ts'] - timestamp_start))
            count = count + 1
            timestamp_start = b['ts']
The following function was traced:
static int func(int cnt)
{
        volatile int x = 0;
        static int i;

        x += cnt + 0xdeadbeefcafeb00b;
        (...) /* repeats ~100 times */

        if (i++ % 3 == 0) // Every third execution is longer
                usleep(1000);

        return x;
}
root@zynq:~# perf record -m,16K -e cs_etm/timestamp,@fe970000.etr/u \
--filter 'filter func @./program' \
--per-thread ./program
Couldn't synthesize bpf events.
CTRL+C me when you find appropriate.
^C[ perf record: Woken up 12 times to write data ]
[ perf record: Captured and wrote 0.135 MB perf.data ]
root@zynq:~# perf script -s exectime.py
Got 2469 samples:
[0]: ts diff = 0x92f2752e512 - 0x92f274a7ae9 = 551465
[1]: ts diff = 0x92f2752e694 - 0x92f2752e512 = 386
[2]: ts diff = 0x92f2752e817 - 0x92f2752e694 = 387
[3]: ts diff = 0x92f275bef12 - 0x92f2752e817 = 591611
[4]: ts diff = 0x92f275bf093 - 0x92f275bef12 = 385
[5]: ts diff = 0x92f275bf211 - 0x92f275bf093 = 382
[6]: ts diff = 0x92f276451d7 - 0x92f275bf211 = 548806
[7]: ts diff = 0x92f2764535a - 0x92f276451d7 = 387
[8]: ts diff = 0x92f276454d7 - 0x92f2764535a = 381
[9]: ts diff = 0x92f276cb256 - 0x92f276454d7 = 548223
[10]: ts diff = 0x92f276cb3d9 - 0x92f276cb256 = 387
[11]: ts diff = 0x92f276cb556 - 0x92f276cb3d9 = 381
(...)
The listing above shows that every third execution of the function lasted longer
than the other two. It is a naive example and could be enhanced to point to the area that
caused the disruption by examining events 'in the middle' of the traced code range.
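As a small follow-up to the listing, the slow executions can be picked out automatically. The sketch below is not part of the patch; it simply compares each timestamp difference against the median, and the factor of 10 is an arbitrary threshold chosen for the example.

```python
from statistics import median


def slow_runs(diffs, factor=10):
    """Return indices whose diff exceeds `factor` times the median diff."""
    typical = median(diffs)
    return [i for i, d in enumerate(diffs) if d > factor * typical]


# Timestamp differences copied from entries [0]..[11] of the listing.
diffs = [551465, 386, 387, 591611, 385, 382,
         548806, 387, 381, 548223, 387, 381]
print(slow_runs(diffs))
# prints [0, 3, 6, 9]
```

The flagged indices are three apart, matching the 'i++ % 3 == 0' condition in the traced function.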
Applies cleanly on Mathieu's 5.1-rc3-cpu-wide-v3 branch.
Changes for V2:
- move packet timestamping logic to decoder. Front end only uses this information
to timestamp samples (as suggested by Mathieu).
- leave original behaviour of CPU-wide mode, where decoder is stopped
and front end is triggered about pending queue with timestamp packet.
At the same time, always adjust next and current timestamp in both CPU-wide
and per-thread modes (as suggested by Mathieu).
- when a timestamp packet is encountered, timestamp the range and discontinuity packets
waiting in the queue that are not yet consumed by the front end (as suggested by Mathieu).
- don't timestamp exceptions, since they are not turned into branch nor instruction
samples.
- fix timestamping of the last branch sample before discontinuity appears (as suggested by Leo).
Wojciech Zmuda (1):
perf cs-etm: Set time value for samples
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 70 ++++++++++++++++++++-----
tools/perf/util/cs-etm.c | 3 ++
tools/perf/util/cs-etm.h | 1 +
3 files changed, 61 insertions(+), 13 deletions(-)
--
2.11.0