Hello,
I want to gather trace data of closed source binaries using CoreSight
ETMv4 on a Hikey620. I want to know the source and destination address
for all taken jumps of the traced program, like in the output of "perf
script". It would be great if I could get feedback on how to achieve this.
I'm not sure where to turn to with such a broad CoreSight problem, so
I'm sorry if you are not the right ones to turn to, but I'd be happy for
any help or advice you might have.
My main problem is that I don't know which approaches are promising to
try. Below I describe two ideas that I tried but where I got stuck after
a while. Are they any good for my use case? If yes, then how can I solve
the respective problems that have come up, or where can I look to solve
them? If not, are there maybe better ways to approach this, which I've
overlooked until now?
After hearing a presentation from Mathieu Poirier, I thought sysfs was
the (only) way to go. However, the decoded trace seems to show only the
jump address, instead of both the source and destination addresses, and
I did not find a register to change that.
Also, the trace gathered seems to lose some of the branch addresses.
Inserting a sleep instruction after each regular instruction in my
test program fixed that. But since tracing should also work for closed
source binaries and has to be fast, this is probably not an option.
Then I tried to copy the way "perf record" traces and to extract the
relevant code parts. But then I realized that perf record doesn't use
sysfs, apart from enabling the sink in "util/cs-etm.c" (which apparently
is not used, and not even deactivated afterward).
So there is another way to gather trace, maybe by interacting with the
CoreSight driver directly. But looking into the "perf report" source
code I couldn't find it yet.
Thanks and regards,
Dominik
The sparse tool complains as follows:
drivers/hwtracing/coresight/coresight-core.c:26:1: warning:
symbol '__pcpu_scope_csdev_sink' was not declared. Should it be static?
As csdev_sink is not used outside of coresight-core.c after the
introduction of coresight_[set|get]_percpu_sink() helpers, this
change marks it static.
Reported-by: Hulk Robot <hulkci(a)huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1(a)huawei.com>
---
v1 -> v2: remove fixes tag and description.
---
drivers/hwtracing/coresight/coresight-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index 3e779e1619ed..6c68d34d956e 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -23,7 +23,7 @@
#include "coresight-priv.h"
static DEFINE_MUTEX(coresight_mutex);
-DEFINE_PER_CPU(struct coresight_device *, csdev_sink);
+static DEFINE_PER_CPU(struct coresight_device *, csdev_sink);
/**
* struct coresight_node - elements of a path, from source to sink
In case of error, the function devm_kasprintf() returns a NULL
pointer, not ERR_PTR(). The IS_ERR() test in the return value
check should be replaced with a NULL test.
Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
Reported-by: Hulk Robot <hulkci(a)huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1(a)huawei.com>
---
drivers/hwtracing/coresight/coresight-trbe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 5ce239875c98..176868496879 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -871,7 +871,7 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
dev = &cpudata->drvdata->pdev->dev;
desc.name = devm_kasprintf(dev, GFP_KERNEL, "trbe%d", cpu);
- if (IS_ERR(desc.name))
+ if (!desc.name)
goto cpu_clear;
desc.type = CORESIGHT_DEV_TYPE_SINK;
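For readers wondering why the IS_ERR() check could never fire here, below is a rough user-space sketch of the error-pointer encoding. It is not kernel code: the model_* names and the two alloc_failed_* helpers are made up for illustration, and the only assumption carried over from the kernel is that error pointers occupy the top 4095 values of the address space. NULL lies outside that range, so an IS_ERR()-style test lets it through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only: error codes live in the last 4095 addresses. */
#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer (models ERR_PTR()). */
static inline void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the error range (models IS_ERR()). */
static inline bool model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* The check as it was before the patch: misses a NULL return. */
static inline bool alloc_failed_buggy(const char *name)
{
	return model_is_err(name);
}

/* The check after the patch: catches a NULL return. */
static inline bool alloc_failed_fixed(const char *name)
{
	return !name;
}

/* Small self-check showing the difference between the two tests. */
static inline void model_selfcheck(void)
{
	assert(!alloc_failed_buggy(NULL));	/* failure goes unnoticed */
	assert(alloc_failed_fixed(NULL));	/* failure is detected */
}
```

With the old check, a failed allocation would leave desc.name as NULL and the code would carry on as if the name had been formatted.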
The sparse tool complains as follows:
drivers/hwtracing/coresight/coresight-core.c:26:1: warning:
symbol '__pcpu_scope_csdev_sink' was not declared. Should it be static?
This symbol is not used outside of coresight-core.c, so this
commit marks it static.
Fixes: 2cd87a7b293d ("coresight: core: Add support for dedicated percpu sinks")
Reported-by: Hulk Robot <hulkci(a)huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1(a)huawei.com>
---
drivers/hwtracing/coresight/coresight-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index 3e779e1619ed..6c68d34d956e 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -23,7 +23,7 @@
#include "coresight-priv.h"
static DEFINE_MUTEX(coresight_mutex);
-DEFINE_PER_CPU(struct coresight_device *, csdev_sink);
+static DEFINE_PER_CPU(struct coresight_device *, csdev_sink);
/**
* struct coresight_node - elements of a path, from source to sink
This series enables future IP trace features Embedded Trace Extension
(ETE) and Trace Buffer Extension (TRBE). This series applies on
kvmarm/fixes tree. A standalone tree is also available here [0].
The queued patches (almost there) are included in this posting for
the sake of constructing a tree from the posting.
ETE is the PE (CPU) trace unit for CPUs, implementing future
architecture extensions. ETE overlaps with the ETMv4 architecture, with
additions to support the newer architecture features and some restrictions
on the supported features w.r.t ETMv4. The ETE support is added by extending
the ETMv4 driver to recognise the ETE and handle the features as exposed by
the TRCIDRx registers. ETE only supports system instruction access from the
host CPU. The ETE could be integrated with a TRBE (see below), or with
the legacy CoreSight trace bus (e.g., ETRs). Thus the ETE follows the same
firmware description as the ETMs and requires a node per instance.
Trace Buffer Extension (TRBE) implements a per CPU trace buffer, which
is accessible via the system registers and can be combined with the ETE to
provide a 1x1 configuration of source & sink. TRBE is being represented
here as a CoreSight sink. Primary reason is that the ETE source could
work with other traditional CoreSight sink devices. As TRBE captures the
trace data which is produced by ETE, it cannot work alone.
The TRBE representation here has some distinct deviations from a
traditional CoreSight sink device. The CoreSight path between ETE and TRBE is
not built during boot from the respective DT or ACPI entries.
Unlike traditional sinks, TRBE can generate interrupts to signal,
among other things, that the buffer has filled. The interrupt is a PPI and
should be communicated from the platform. The DT or ACPI entry representing TRBE
should have the PPI number for a given platform. During a perf session, the
TRBE IRQ handler should capture trace for the perf auxiliary buffer before
restarting the TRBE. The system registers used here to configure ETE and TRBE
are documented at the link below.
https://developer.arm.com/docs/ddi0601/g/aarch64-system-registers.
[0] https://gitlab.arm.com/linux-arm/linux-skp/-/tree/coresight/ete/v6/
Changes since v6:
- Rebased to kvmarm/fixes tree (which has some patches queued that this
series depends on)
- Fixed a sparse warning in the TRBE driver (reported by kernel test robot
<lkp(a)intel.com>)
- Add explicit undef handler for TRFCR_EL1 (Marc Zyngier)
- Check for the SPE support early, kvm_arch_vcpu_load(), instead of
every single time in guest_enter() (Patch 06)
Changes since V4:
https://lkml.kernel.org/r/20210225193543.2920532-1-suzuki.poulose@arm.com
- Split the documentation patches from the TRBE driver
- Drop the patches already queued for v5.12.
- Mark the buffer TRUNCATED in case of a WRAP event
- Fix error code for vmap failure
- Fix build break on arm32 for per-cpu sink patch
- Address comments on ETE dts bindings.
- Make ete_sysreg_read/write static (kernel test robot)
- Merged ete sysreg definition patch with ete support, to avoid
a "defined but unused warning" on build in part of the series.
- Add new bindings to MAINTAINERS
Changes since V3:
https://lore.kernel.org/linux-arm-kernel/1611737738-1493-1-git-send-email-a…
- ETE and TRBE changes have been captured in the respective patches
- Better support for nVHE
- Re-ordered and split the patches to keep the changes separate
for the generic/arm64 tree from CoreSight driver specific changes.
- Fixes for KVM handling of Trace/SPE
Changes since V2:
https://lore.kernel.org/linux-arm-kernel/1610511498-4058-1-git-send-email-a…
- Rebased on coresight/next
- Changed DT bindings for ETE
- Included additional patches for arm64 nvhe, perf aux buffer flags etc
- TRBE changes have been captured in the respective patches
Changes since V1:
https://lore.kernel.org/linux-arm-kernel/1608717823-18387-1-git-send-email-…
- Converted both ETE and TRBE DT bindings into Yaml
- TRBE changes have been captured in the respective patches
Changes since RFC:
- There are not many ETE changes from Suzuki apart from splitting of the
ETE DTS patch
- TRBE changes have been captured in the respective patches
RFC:
https://lore.kernel.org/linux-arm-kernel/1605012309-24812-1-git-send-email-…
Cc: Will Deacon <will(a)kernel.org>
Cc: Marc Zyngier <maz(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: Linu Cherian <lcherian(a)marvell.com>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
Anshuman Khandual (5):
arm64: Add TRBE definitions
coresight: core: Add support for dedicated percpu sinks
coresight: sink: Add TRBE driver
Documentation: coresight: trbe: Sysfs ABI description
Documentation: trace: Add documentation for TRBE
Suzuki K Poulose (15):
perf: aux: Add flags for the buffer format
perf: aux: Add CoreSight PMU buffer formats
arm64: Add support for trace synchronization barrier
kvm: arm64: Handle access to TRFCR_EL1
kvm: arm64: Move SPE availability check to VCPU load
arm64: kvm: Enable access to TRBE support for host
coresight: etm4x: Move ETM to prohibited region for disable
coresight: etm-perf: Allow an event to use different sinks
coresight: Do not scan for graph if none is present
coresight: etm4x: Add support for PE OS lock
coresight: ete: Add support for ETE sysreg access
coresight: ete: Add support for ETE tracing
dts: bindings: Document device tree bindings for ETE
coresight: etm-perf: Handle stale output handles
dts: bindings: Document device tree bindings for Arm TRBE
.../testing/sysfs-bus-coresight-devices-trbe | 14 +
.../devicetree/bindings/arm/ete.yaml | 75 ++
.../devicetree/bindings/arm/trbe.yaml | 49 +
.../trace/coresight/coresight-trbe.rst | 38 +
MAINTAINERS | 2 +
arch/arm64/include/asm/barrier.h | 1 +
arch/arm64/include/asm/el2_setup.h | 13 +
arch/arm64/include/asm/kvm_arm.h | 2 +
arch/arm64/include/asm/kvm_host.h | 8 +
arch/arm64/include/asm/sysreg.h | 50 +
arch/arm64/kernel/hyp-stub.S | 3 +-
arch/arm64/kvm/arm.c | 2 +
arch/arm64/kvm/debug.c | 36 +-
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 56 +-
arch/arm64/kvm/hyp/nvhe/switch.c | 1 +
arch/arm64/kvm/sys_regs.c | 1 +
drivers/hwtracing/coresight/Kconfig | 24 +-
drivers/hwtracing/coresight/Makefile | 1 +
drivers/hwtracing/coresight/coresight-core.c | 29 +-
.../hwtracing/coresight/coresight-etm-perf.c | 119 +-
.../coresight/coresight-etm4x-core.c | 161 ++-
.../coresight/coresight-etm4x-sysfs.c | 19 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 83 +-
.../hwtracing/coresight/coresight-platform.c | 6 +
drivers/hwtracing/coresight/coresight-priv.h | 3 +
drivers/hwtracing/coresight/coresight-trbe.c | 1157 +++++++++++++++++
drivers/hwtracing/coresight/coresight-trbe.h | 152 +++
include/linux/coresight.h | 13 +
include/uapi/linux/perf_event.h | 13 +-
29 files changed, 2054 insertions(+), 77 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
create mode 100644 Documentation/devicetree/bindings/arm/trbe.yaml
create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
--
2.24.1
This patchset introduces initial concepts in CoreSight system
configuration management support, to allow more detailed and complex
programming to be applied to CoreSight systems during trace capture.
Configurations consist of 2 elements:-
1) Features - programming combinations for devices, applied to a class of
device on the system (all ETMv4), or individual devices.
2) Configurations - a set of programmed features used when the named
configuration is selected.
Features and configurations are declared as a data table, a set of register,
resource and parameter requirements. Features and configurations are loaded
into the system by the virtual cs_syscfg device. This then matches features
to any registered devices and loads the feature into them.
Individual device classes that support features and configurations register
with cs_syscfg.
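To make the "data table" idea above a little more concrete, here is a minimal user-space sketch. Every name in it (dev_class, feature_desc, config_desc, count_loadable_features) is hypothetical and chosen for illustration only; the real descriptors in this series carry register, resource and parameter information that is left out here. The sketch only shows a configuration referring to features, and a feature being matched against a device class before it is loaded.

```c
#include <stddef.h>

/* Hypothetical device classes a feature can be matched against. */
enum dev_class {
	DEV_CLASS_ETM4 = 1 << 0,
	DEV_CLASS_CTI  = 1 << 1,
};

/* Hypothetical feature descriptor: a name plus the classes it applies to. */
struct feature_desc {
	const char *name;
	unsigned int match_classes;
};

/* Hypothetical configuration descriptor: a named set of features. */
struct config_desc {
	const char *name;
	const struct feature_desc *const *feats;
	size_t nr_feats;
};

/* The one preloaded feature and configuration named in this cover letter. */
static const struct feature_desc strobing_feat = {
	"strobing", DEV_CLASS_ETM4,
};
static const struct feature_desc *const autofdo_feats[] = { &strobing_feat };
static const struct config_desc autofdo_cfg = {
	"autofdo", autofdo_feats, 1,
};

/* Count the features of a configuration that match a registering device. */
static inline size_t count_loadable_features(const struct config_desc *cfg,
					     unsigned int dev_class)
{
	size_t i, n = 0;

	for (i = 0; i < cfg->nr_feats; i++)
		if (cfg->feats[i]->match_classes & dev_class)
			n++;
	return n;
}
```

In this model an ETMv4 device registering would pick up the strobing feature, while a device of another class would pick up nothing from the autofdo configuration.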
Once loaded a configuration can be enabled for a specific trace run.
Configurations are registered with the perf cs_etm event as entries in
cs_etm/events. These can be selected on the perf command line as follows:-
perf record -e cs_etm/<config_name>/ ...
This patch set has one pre-loaded configuration and feature.
A named "strobing" feature is provided for ETMv4.
A named "autofdo" configuration is provided. This configuration enables
strobing on any ETM in use.
Thus the command:
perf record -e cs_etm/autofdo/ ...
will trace the supplied application while enabling the "autofdo" configuration
on each ETM as it is enabled by perf. This in turn will enable strobing for
the ETM - with default parameters. Parameters can be adjusted using configfs.
The sink used in the trace run will be automatically selected.
A configuration can supply up to 15 sets of preset parameter values, which will
be substituted for the parameter values of any feature used in the configuration.
Preset values are selected as follows:
perf record -e cs_etm/autofdo,preset=1/ ...
(valid presets 1-N, where N is the number supplied in the configuration, not
exceeding 15. preset=0 is the same as not selecting a preset.)
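The preset numbering rule above can be sketched as a small range check. This is only an illustration of the rule as stated in this cover letter; the function names and the CFG_MAX_PRESETS macro are hypothetical and are not taken from the patches.

```c
#include <stdbool.h>

/* Upper bound on presets per configuration, as stated above. */
#define CFG_MAX_PRESETS 15

/*
 * A preset selection is acceptable when it is 0 (no preset) or lies in
 * 1..nr_presets, where nr_presets itself may not exceed 15.
 */
static inline bool preset_is_valid(unsigned int preset, unsigned int nr_presets)
{
	if (nr_presets > CFG_MAX_PRESETS)
		return false;
	return preset <= nr_presets;
}

/*
 * Map a valid, non-zero preset number to a zero-based table index.
 * Returns -1 when no preset should be applied (preset == 0 or invalid).
 */
static inline int preset_to_index(unsigned int preset, unsigned int nr_presets)
{
	if (preset == 0 || !preset_is_valid(preset, nr_presets))
		return -1;
	return (int)preset - 1;
}
```

For a configuration that supplies three presets, preset=1 would map to the first table entry and preset=4 would be rejected.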
Applies to coresight/next (5.12-rc2 base)
Changes since v4: (based on comments from Mathieu and Suzuki).
No large functional changes - primarily code improvements and naming schema.
1) Updated entire set to ensure a consistent naming scheme was used for
variables and struct members that refer to the key objects in the system.
Suffixes _desc used for all references to feature and configuration descriptors,
suffix _csdev used for all references to loaded features and configs in the csdev
instances. (Mathieu & Suzuki).
2) Dropped the 'configurations' sub dir in cs_etm perf directories as superfluous
with the configfs containing the same information. (Mathieu).
3) Simplified perf handling code (Suzuki)
4) Multiple simplifications and improvements for code readability (Mathieu
and Suzuki)
Changes since v3: (Primarily based on comments from Mathieu)
1) Locking mechanisms simplified.
2) Removed the possibility to enable features independently from
configurations. Only configurations can be enabled now. Simplifies programming
logic.
3) Configuration now uses an activate->enable mechanism. This means that perf
will activate a selected configuration at the start of a session (during
setup_aux), and disable it at the end of a session (around free_aux).
The active configuration and associated features will be programmed into the
CoreSight device instances when they are enabled. This locks the configuration
into the system while in use. Parameters cannot be altered while this is
in place. This mechanism will be extended in future for dynamic load / unload
of configurations to prevent removal while in use.
4) Removed the custom bus / driver as unnecessary. A single device is
registered to own perf fs elements and configfs.
5) Various other minor issues addressed.
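The locking effect of the activate->enable mechanism in item 3 above can be pictured with the following user-space sketch. It is a simplified model under stated assumptions, not the code in this series: the cfg_state structure and the cfg_* helpers are hypothetical, and a plain counter stands in for whatever reference tracking the drivers use.

```c
#include <errno.h>

/* Hypothetical model of one configuration with a single parameter. */
struct cfg_state {
	int active_cnt;		/* number of sessions holding the config active */
	unsigned int param;	/* a parameter that may be tuned while idle */
};

/* Session start (modelled on the setup_aux step): pin the configuration. */
static inline void cfg_activate(struct cfg_state *cfg)
{
	cfg->active_cnt++;
}

/* Session end (modelled on the free_aux step): release the configuration. */
static inline void cfg_deactivate(struct cfg_state *cfg)
{
	if (cfg->active_cnt > 0)
		cfg->active_cnt--;
}

/* Parameter updates are refused while any session has the config active. */
static inline int cfg_set_param(struct cfg_state *cfg, unsigned int value)
{
	if (cfg->active_cnt > 0)
		return -EBUSY;
	cfg->param = value;
	return 0;
}
```

In this model a parameter write succeeds before activation, fails with -EBUSY during the session, and succeeds again once the session has ended.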
Changes since v2:
1) Added documentation file.
2) Altered cs_syscfg driver to no longer be coresight_device based, and moved
to its own custom bus to remove it from the main coresight bus. (Mathieu)
3) Added configfs support to inspect and control loaded configurations and
features. Allows listing of preset values (Yabin Cui)
4) Dropped sysfs support for adjusting feature parameters on the per device
basis, in favour of a single point adjustment in configfs that is pushed to all
device instances.
5) Altered how the config and preset command line options are handled in perf
and the drivers. (Mathieu and Suzuki).
6) Fixes for various issues and technical points (Mathieu, Yabin)
Changes since v1:
1) Moved preloaded configurations and features out of individual drivers.
2) Added cs_syscfg driver to manage configurations and features. Individual
drivers register with cs_syscfg indicating support for config, and provide
matching information that the system uses to load features into the drivers.
This allows individual drivers to be updated on an as needed basis - and
removes the need to consider devices that cannot benefit from configuration -
static replicators, funnels, tpiu.
3) Added perf selection of configurations.
4) Rebased onto the coresight module loading set.
To follow in future revisions / sets:-
a) load of additional config and features by loadable module.
b) load of additional config and features by configfs
c) enhanced resource management for ETMv4 and checking features have sufficient
resources to be enabled.
d) ECT and CTI support for configuration and features.
Mike Leach (10):
coresight: syscfg: Initial coresight system configuration
coresight: syscfg: Add registration and feature loading for cs devices
coresight: config: Add configuration and feature generic functions
coresight: etm-perf: update to handle configuration selection
coresight: syscfg: Add API to activate and enable configurations
coresight: etm-perf: Update to activate selected configuration
coresight: etm4x: Add complex configuration handlers to etmv4
coresight: config: Add preloaded configurations
coresight: syscfg: Add initial configfs support
coresight: docs: Add documentation for CoreSight config
.../trace/coresight/coresight-config.rst | 244 ++++++
Documentation/trace/coresight/coresight.rst | 16 +
drivers/hwtracing/coresight/Makefile | 7 +-
.../hwtracing/coresight/coresight-cfg-afdo.c | 149 ++++
.../coresight/coresight-cfg-preload.c | 27 +
.../coresight/coresight-cfg-preload.h | 11 +
.../hwtracing/coresight/coresight-config.c | 274 +++++++
.../hwtracing/coresight/coresight-config.h | 253 ++++++
drivers/hwtracing/coresight/coresight-core.c | 12 +-
.../hwtracing/coresight/coresight-etm-perf.c | 155 +++-
.../hwtracing/coresight/coresight-etm-perf.h | 12 +-
.../hwtracing/coresight/coresight-etm4x-cfg.c | 182 +++++
.../hwtracing/coresight/coresight-etm4x-cfg.h | 30 +
.../coresight/coresight-etm4x-core.c | 38 +-
.../coresight/coresight-etm4x-sysfs.c | 3 +
.../coresight/coresight-syscfg-configfs.c | 399 ++++++++++
.../coresight/coresight-syscfg-configfs.h | 45 ++
.../hwtracing/coresight/coresight-syscfg.c | 738 ++++++++++++++++++
.../hwtracing/coresight/coresight-syscfg.h | 81 ++
include/linux/coresight.h | 7 +
20 files changed, 2644 insertions(+), 39 deletions(-)
create mode 100644 Documentation/trace/coresight/coresight-config.rst
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-afdo.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.h
create mode 100644 drivers/hwtracing/coresight/coresight-config.c
create mode 100644 drivers/hwtracing/coresight/coresight-config.h
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.h
--
2.17.1
Hi Mathieu & Suzuki,
Here is a question about the amount of data traced by ETM device.
In high-CPU-utilization scenarios, the volume of ETM trace data may be very
large, but the limited buffer in the sink device that stores the trace
data cannot hold all of it. It seems that data loss in the sink
device is inevitable.
So how can we ensure the data we are interested in isn't lost, or what
can we do to reduce this kind of data loss?
Any response will be highly appreciated.
Thanks,
Qi