On production systems with ETMs enabled, it is preferred to
exclude kernel mode(NS EL1) tracing for security concerns and
support only userspace(NS EL0) tracing. So provide an option
via kconfig to exclude kernel mode tracing if it is required.
This config is disabled by default and would not affect the
current configuration which has both kernel and userspace
tracing enabled by default.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan(a)codeaurora.org>
---
drivers/hwtracing/coresight/Kconfig | 9 +++++++++
drivers/hwtracing/coresight/coresight-etm4x-core.c | 6 +++++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
index c1198245461d..52435de8824c 100644
--- a/drivers/hwtracing/coresight/Kconfig
+++ b/drivers/hwtracing/coresight/Kconfig
@@ -110,6 +110,15 @@ config CORESIGHT_SOURCE_ETM4X
To compile this driver as a module, choose M here: the
module will be called coresight-etm4x.
+config CORESIGHT_ETM4X_EXCL_KERN
+ bool "Coresight ETM 4.x exclude kernel mode tracing"
+ depends on CORESIGHT_SOURCE_ETM4X
+ help
+ This will exclude kernel mode(NS EL1) tracing if enabled. This option
+ will be useful to provide more flexible options on production systems
+ where only userspace(NS EL0) tracing might be preferred for security
+ reasons.
+
config CORESIGHT_STM
tristate "CoreSight System Trace Macrocell driver"
depends on (ARM && !(CPU_32v3 || CPU_32v4 || CPU_32v4T)) || ARM64
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index abd706b216ac..7e5669e5cd1f 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -832,6 +832,9 @@ static u64 etm4_get_ns_access_type(struct etmv4_config *config)
{
u64 access_type = 0;
+ if (IS_ENABLED(CONFIG_CORESIGHT_ETM4X_EXCL_KERN))
+ config->mode |= ETM_MODE_EXCL_KERN;
+
/*
* EXLEVEL_NS, bits[15:12]
* The Exception levels are:
@@ -849,7 +852,8 @@ static u64 etm4_get_ns_access_type(struct etmv4_config *config)
access_type = ETM_EXLEVEL_NS_HYP;
}
- if (config->mode & ETM_MODE_EXCL_USER)
+ if (config->mode & ETM_MODE_EXCL_USER &&
+ !IS_ENABLED(CONFIG_CORESIGHT_ETM4X_EXCL_KERN))
access_type |= ETM_EXLEVEL_NS_APP;
return access_type;
base-commit: 3477326277451000bc667dfcc4fd0774c039184c
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
This patchset introduces initial concepts in CoreSight complex system
configuration support.
Configurations consist of 2 elements:-
1) Features - programming combinations for devices, applied to a class of
device on the system (all ETMv4), or individual devices.
2) Configurations - a set of programmed features used when the named
configuration is selected.
Features and configurations are declared as a data table, a set of register,
resource and parameter requirements. Features and configurations are loaded
into the system by the virtual cs_syscfg device. This then matches features
to any registered devices and loads the feature into them.
Individual device classes that support feature and configuration register
with cs_syscfg.
Once loaded a configuration can be enabled for a specific trace run.
Configurations are registered with the perf cs_etm event as entries in
cs_etm/cs_config. These can be selected on the perf command line as follows:-
perf record -e cs_etm/<config_name>/ ...
This patch set has one pre-loaded configuration and feature.
A named "strobing" feature is provided for ETMv4.
A named "autofdo" configuration is provided. This configuration enables
strobing on any ETM in used.
Thus the command:
perf record -e cs_etm/autofdo/ ...
will trace the supplied application while enabling the "autofdo" configuation
on each ETM as it is enabled by perf. This in turn will enable strobing for
the ETM - with default parameters. Parameters can be adjusted using configfs.
The sink used in the trace run will be automatically selected.
A configuation can supply up to 15 of preset parameter values, which will
subsitute in parameter values for any feature used in the configuration.
Selection of preset values as follows
perf record -e cs_etm/autofdo,preset=1/ ...
(valid presets 1-N, where N is the number supplied in the configuration, not
exceeding 15. preset=0 is the same as not selecting a preset.)
Applies to coresight/next (5.10-rc1 base)
Changes since v2:
1) Added documentation file.
2) Altered cs_syscfg driver to no longer be coresight_device based, and moved
to its own custom bus to remove it from the main coresight bus. (Mathieu)
3) Added configfs support to inspect and control loaded configurations and
features. Allows listing of preset values (Yabin Cui)
4) Dropped sysfs support for adjusting feature parameters on the per device basis,
in favour of a single point adjustment in configfs that is pushed to all device
instances.
5) Altered how the config and preset command line options are handled in perf and
the drivers. (Mathieu and Suzuki).
6) Fixes for various issues and technical points (Mathieu, Yabin)
Changes since v1:
1) Moved preloaded configurations and features out of individual drivers.
2) Added cs_syscfg driver to manage configurations and features. Individual
drivers register with cs_syscfg indicating support for config, and provide
matching information that the system uses to load features into the drivers.
This allows individual drivers to be updated on an as needed basis - and
removes the need to consider devices that cannot benefit from configuration -
static replicators, funnels, tpiu.
3) Added perf selection of configuarations.
4) Rebased onto the coresight module loading set.
To follow in future revisions / sets:-
a) load of additional config and features by loadable module.
b) load of additional config and features by configfs
c) enhanced resource management for ETMv4 and checking features have sufficient
resources to be enabled.
d) ECT and CTI support for configuration and features.
Mike Leach (9):
coresight: syscfg: Initial coresight system configuration
coresight: syscfg: Add registration and feature loading for cs devices
coresight: config: Add configuration and feature generic functions
coresight: etm-perf: update to handle configuration selection
coresight: etm4x: Add complex configuration handlers to etmv4
coresight: config: Add preloaded configurations
coresight: syscfg: Add initial configfs support.
coresight: syscfg: Allow update of feature params from configfs
coresight: docs: Add documentation for CoreSight config.
.../trace/coresight/coresight-config.rst | 230 ++++++
Documentation/trace/coresight/coresight.rst | 16 +
drivers/hwtracing/coresight/Makefile | 6 +-
.../coresight/coresight-cfg-preload.c | 160 +++++
.../hwtracing/coresight/coresight-config.c | 392 +++++++++++
.../hwtracing/coresight/coresight-config.h | 311 +++++++++
drivers/hwtracing/coresight/coresight-core.c | 18 +-
.../hwtracing/coresight/coresight-etm-perf.c | 166 ++++-
.../hwtracing/coresight/coresight-etm-perf.h | 10 +-
.../hwtracing/coresight/coresight-etm4x-cfg.c | 181 +++++
.../hwtracing/coresight/coresight-etm4x-cfg.h | 29 +
.../coresight/coresight-etm4x-core.c | 36 +-
.../coresight/coresight-etm4x-sysfs.c | 3 +
.../coresight/coresight-syscfg-configfs.c | 421 +++++++++++
.../coresight/coresight-syscfg-configfs.h | 47 ++
.../hwtracing/coresight/coresight-syscfg.c | 656 ++++++++++++++++++
.../hwtracing/coresight/coresight-syscfg.h | 83 +++
include/linux/coresight.h | 7 +
18 files changed, 2743 insertions(+), 29 deletions(-)
create mode 100644 Documentation/trace/coresight/coresight-config.rst
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.c
create mode 100644 drivers/hwtracing/coresight/coresight-config.c
create mode 100644 drivers/hwtracing/coresight/coresight-config.h
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.h
--
2.17.1
Hi Liu,
On Wed, Aug 19, 2020 at 04:06:37PM +0800, Qi Liu wrote:
> When too much trace information is generated on-chip, the ETM will
> overflow, and cause data loss. This is a common phenomenon on ETM
> devices.
>
> But sometimes we do not want to lose performance trace data, so we
> suppress the speed of instructions sent from CPU core to ETM to
> avoid the overflow of ETM.
>
> Signed-off-by: Qi Liu <liuqi115(a)huawei.com>
> ---
>
> Changes since v1:
> - ETM on HiSilicon Hip09 platform supports backpressure, so does
> not need to modify core commit.
>
> drivers/hwtracing/coresight/coresight-etm4x.c | 43 +++++++++++++++++++++++++++
> 1 file changed, 43 insertions(+)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
> index 7797a57..7641f89 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x.c
> @@ -43,6 +43,10 @@ MODULE_PARM_DESC(boot_enable, "Enable tracing on boot");
> #define PARAM_PM_SAVE_NEVER 1 /* never save any state */
> #define PARAM_PM_SAVE_SELF_HOSTED 2 /* save self-hosted state only */
>
> +#define CORE_COMMIT_CLEAR 0x3000
> +#define CORE_COMMIT_SHIFT 12
> +#define HISI_ETM_AMBA_ID_V1 0x000b6d01
> +
> static int pm_save_enable = PARAM_PM_SAVE_FIRMWARE;
> module_param(pm_save_enable, int, 0444);
> MODULE_PARM_DESC(pm_save_enable,
> @@ -104,11 +108,40 @@ struct etm4_enable_arg {
> int rc;
> };
>
> +static void etm4_cpu_actlr1_cfg(void *info)
> +{
> + struct etm4_enable_arg *arg = (struct etm4_enable_arg *)info;
> + u64 val;
> +
> + asm volatile("mrs %0,s3_1_c15_c2_5" : "=r"(val));
> + val &= ~CORE_COMMIT_CLEAR;
> + val |= arg->rc << CORE_COMMIT_SHIFT;
> + asm volatile("msr s3_1_c15_c2_5,%0" : : "r"(val));
> +}
> +
> +static void etm4_config_core_commit(int cpu, int val)
> +{
> + struct etm4_enable_arg arg = {0};
> +
> + arg.rc = val;
> + smp_call_function_single(cpu, etm4_cpu_actlr1_cfg, &arg, 1);
Function etm4_enable/disable_hw() are already running on the CPU they are
supposed to so no need to call smp_call_function_single().
> +}
> +
> static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
> {
> int i, rc;
> + struct amba_device *adev;
> struct etmv4_config *config = &drvdata->config;
> struct device *etm_dev = &drvdata->csdev->dev;
> + struct device *dev = drvdata->csdev->dev.parent;
> +
> + adev = container_of(dev, struct amba_device, dev);
> + /*
> + * If ETM device is HiSilicon ETM device, reduce the
> + * core-commit to avoid ETM overflow.
> + */
> + if (adev->periphid == HISI_ETM_AMBA_ID_V1)
Do you have any documentation on this back pressure feature? I doubt this is
specific to Hip09 platform and as such would prefer to have a more generic
approach that works on any platform that supports it.
Anyone on the CS mailing list that knows what this is about?
Thanks,
Mathieu
> + etm4_config_core_commit(drvdata->cpu, 1);
>
> CS_UNLOCK(drvdata->base);
>
> @@ -472,10 +505,20 @@ static void etm4_disable_hw(void *info)
> {
> u32 control;
> struct etmv4_drvdata *drvdata = info;
> + struct device *dev = drvdata->csdev->dev.parent;
> struct etmv4_config *config = &drvdata->config;
> struct device *etm_dev = &drvdata->csdev->dev;
> + struct amba_device *adev;
> int i;
>
> + adev = container_of(dev, struct amba_device, dev);
> + /*
> + * If ETM device is HiSilicon ETM device, resume the
> + * core-commit after ETM trace is complete.
> + */
> + if (adev->periphid == HISI_ETM_AMBA_ID_V1)
> + etm4_config_core_commit(drvdata->cpu, 0);
> +
> CS_UNLOCK(drvdata->base);
>
> if (!drvdata->skip_power_up) {
> --
> 2.8.1
>
This patch series tries to fix the sysfs breakage on topologies
with per core sink.
Changes since v3:
- References to coresight_get_enabled_sink in perf interface
has been removed and marked deprecated as a new patch.
- To avoid changes to coresight_find_sink for ease of maintenance,
search function specific to sysfs usage has been added.
- Sysfs being the only user for coresight_get_enabled sink,
reset option is removed as well.
Changes since v2:
- Fixed checkpatch issue
Changes since v1:
- Misc fixes in commit message
Applies on https://git.linaro.org/kernel/coresight.git/log/?h=next
Linu Cherian (2):
coresight: etm: perf: Sink selection using sysfs is deprecated
coresight: Make sysFS functional on topologies with per core sink
.../hwtracing/coresight/coresight-etm-perf.c | 2 -
drivers/hwtracing/coresight/coresight-priv.h | 3 +-
drivers/hwtracing/coresight/coresight.c | 58 +++++++++----------
3 files changed, 29 insertions(+), 34 deletions(-)
base-commit: 17f17c8f02a35a746376c2ecd054386575835b8b
--
2.25.1
Good morning,
Is tracing a multi-threaded program a supported use case for perf cs-etm?
If yes, are there any flags that should be specified with perf?
Thanks,
Andrea
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
CoreSight ETMv4.4 obsoletes memory mapped access to ETM and
mandates the system instructions for registers.
This also implies that they may not be on the amba bus.
Right now all the CoreSight components are accessed via memory
map. Also, we have some common routines in coresight generic
code driver (e.g, CS_LOCK, claim/disclaim), which assume the
mmio. In order to preserve the generic algorithms at a single
place and to allow dynamic switch for ETMs, this series introduces
an abstraction layer for accessing a coresight device. It is
designed such that the mmio access are fast tracked (i.e, without
an indirect function call).
This will also help us to get rid of the driver+attribute specific
sysfs show/store routines and replace them with a single routine
to access a given register offset (which can be embedded in the
dev_ext_attribute). This is not currently implemented in the series,
but can be achieved.
Further we switch the generic routines to work with the abstraction.
With this in place, we refactor the etm4x code a bit to allow for
supporting the system instructions with very little new code. The
changes also switch to using the system instructions by default
even when we may have an MMIO.
We use TRCDEVARCH for the detection of the ETM component, which
is a standard register as per CoreSight architecture, rather than
the etm specific id register TRCIDR1. This is for making sure
that we are able to detect the ETM via system instructions accurately,
when the the trace unit could be anything (etm or a custom trace unit).
To keep the backward compatibility for any existing broken impelementation
which may not implement TRCDEVARCH, we fall back to TRCIDR1. Also
this covers us for the changes in the future architecture [0].
The series has been mildly tested on a model for system instructions.
I would really appreciate any testing on real hardware.
Applies on coresight/next.
[0] https://developer.arm.com/docs/ddi0601/g/aarch64-system-registers/trcidr1
Known issues:
Checkpatch failure for "coresight: etm4x: Add sysreg access helpers" :
ERROR: Macros with complex values should be enclosed in parentheses
#121: FILE: drivers/hwtracing/coresight/coresight-etm4x.h:153:
+#define CASE_READ(res, x) \
+ case (x): { (res) = read_etm4x_sysreg_const_offset((x)); break; }
I don't know how to fix that without breaking the build ! Suggestions
welcome.
Changes since V2:
- Several fixes to the ETM register accesses. Access a register
when it is present.
- Add support for TRCIDR3.NUMPROCS for v4.2+
- Drop OS lock detection. Use software lock only in case of mmio.
- Fix issues with the Exception level masks (Mike Leach)
- Fall back to using TRCIDR1 when TRCDEVARCH is not "present"
- Use a generic notion of ETM architecture (rather than using
the encoding as in registers)
- Fixed some checkpatch issues.
- Add support for Self Hosted tracing Arm v8.4 extensions. (Mike
Leach)
Originally written by Jonathan, refactored and cleaned up.
- Changed the dts compatible string to "arm,coresight-etm-sysreg"
(Mike Leach)
Changes since V1:
- Flip the switch for iomem from no_iomem to io_mem in csdev_access.
- Split patches for claim/disclaim and CS_LOCK/UNLOCK conversions.
- Move device access initialisation for etm4x to the target CPU
- Cleanup secure exception level mask handling.
- Switch to use TRCDEVARCH for ETM component discovery. This
is for making
- Check the availability of OS/Software Locks before using them.
Suzuki K Poulose (26):
coresight: etm4x: Fix accesses to TRCVMIDCTLR1
coresight: etm4x: Fix accesses to TRCCIDCTLR1
coresight: etm4x: Update TRCIDR3.NUMPROCS handling to match v4.2
coresight: etm4x: Fix accesses to TRCPROCSELR
coresight: etm4x: Handle TRCVIPCSSCTLR accesses
coresight: etm4x: Handle access to TRCSSPCICRn
coresight: Introduce device access abstraction
coresight: tpiu: Prepare for using coresight device access abstraction
coresight: Convert coresight_timeout to use access abstraction
coresight: Convert claim/disclaim operations to use access wrappers
coresight: etm4x: Always read the registers on the host CPU
coresight: etm4x: Convert all register accesses
coresight: etm4x: Add commentary on the registers
coresight: etm4x: Add sysreg access helpers
coresight: etm4x: Define DEVARCH register fields
coresight: etm4x: Check for Software Lock
coresight: etm4x: Cleanup secure exception level masks
coresight: etm4x: Clean up exception level masks
coresight: etm4x: Detect access early on the target CPU
coresight: etm4x: Handle ETM architecture version
coresight: etm4x: Use TRCDEVARCH for component discovery
coresight: etm4x: Add necessary synchronization for sysreg access
coresight: etm4x: Detect system instructions support
coresight: etm4x: Refactor probing routine
coresight: etm4x: Add support for sysreg only devices
dts: bindings: coresight: ETM system register access only units
.../devicetree/bindings/arm/coresight.txt | 5 +-
drivers/hwtracing/coresight/coresight-catu.c | 12 +-
drivers/hwtracing/coresight/coresight-core.c | 130 ++-
.../hwtracing/coresight/coresight-cti-core.c | 18 +-
drivers/hwtracing/coresight/coresight-etb10.c | 10 +-
.../coresight/coresight-etm3x-core.c | 9 +-
.../coresight/coresight-etm4x-core.c | 758 +++++++++++-------
.../coresight/coresight-etm4x-sysfs.c | 44 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 501 +++++++++++-
.../hwtracing/coresight/coresight-funnel.c | 7 +-
.../coresight/coresight-replicator.c | 17 +-
drivers/hwtracing/coresight/coresight-stm.c | 4 +-
.../hwtracing/coresight/coresight-tmc-core.c | 16 +-
.../hwtracing/coresight/coresight-tmc-etf.c | 10 +-
.../hwtracing/coresight/coresight-tmc-etr.c | 4 +-
drivers/hwtracing/coresight/coresight-tpiu.c | 31 +-
include/linux/coresight.h | 230 +++++-
17 files changed, 1376 insertions(+), 430 deletions(-)
--
2.24.1
This patch series adds support for thread stack and callchain; this patch
set depends on the instruction sample fix patch set [1].
This patch set get more complex, so before divide into small groups, I'd
like to use this patch set version to include all relevant patches, hope
this can give whole context for related code change.
Briefly, this patch can be divided into three parts, which also can be
reviewed separately for every part:
Patches 01, 02 are used to fix samples for one corner case is for
accessing the branch's target address and trigger an exception.
Essentially, an extra branch sample is added to reflect this
mediate branch between the previous branch and exception entry.
Patches 03, 04, 05, 06 are coming from patch v4, which are used to
support thread stack and callchain.
Patches 07, 08, 09 are used to fixup for exception entry and exit. This
is mainly used to fix two cases, one part is to fixup the thread stack
and callchain for the case when access branch target address and trigger
exception; another part is to fixup the thread stack for instruction
emulation (and other single step cases).
This patch set has been tested on Juno-r2 after applied on perf/core
branch with latest commit 85fc95d75970 ("perf maps: Add missing unlock
to maps__insert() error case"), and this patch set is also applied on
top of instruction sample fix patch set [1].
Test for option '-F,+callindent':
# perf script -F,+callindent
main 3258 1 branches: main ffffad684d20 __libc_start_main+0xe0 (/usr/lib/aarch64-linux-gnu/libc-2.28.so)
main 3258 1 branches: lib_loop_test@plt aaaae2c4d78c main+0x18 (/root/coresight_test/main)
main 3258 1 branches: _dl_fixup ffffad811b4c _dl_runtime_resolve+0x40 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
main 3258 1 branches: _dl_lookup_symbol_x ffffad80c078 _dl_fixup+0xb8 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
main 3258 1 branches: do_lookup_x ffffad80849c _dl_lookup_symbol_x+0x104 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
main 3258 1 branches: check_match ffffad807bf0 do_lookup_x+0x238 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
main 3258 1 branches: strcmp ffffad807888 check_match+0x70 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
main 3258 1 branches: lib_loop_test@plt aaaae2c4d78c main+0x18 (/root/coresight_test/main)
main 3258 1 branches: lib_loop_test@plt aaaae2c4d78c main+0x18 (/root/coresight_test/main)
main 3258 1 branches: lib_loop_test@plt aaaae2c4d78c main+0x18 (/root/coresight_test/main)
main 3258 1 branches: lib_loop_test@plt aaaae2c4d78c main+0x18 (/root/coresight_test/main)
[...]
Test for option '--itrace=g':
# perf script --itrace=g16l64i100
main 3258 100 instructions:
ffffad816a80 memcpy+0x70 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad809468 _dl_new_object+0xa8 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad801840 dl_main+0x778 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad81384c _dl_sysdep_start+0x36c (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800884 _dl_start_final+0xac (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800b00 _dl_start+0x200 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800048 _start+0x8 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
main 3258 100 instructions:
ffffad80952c _dl_new_object+0x16c (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad801840 dl_main+0x778 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad81384c _dl_sysdep_start+0x36c (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800884 _dl_start_final+0xac (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800b00 _dl_start+0x200 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800048 _start+0x8 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
main 3258 100 instructions:
ffffad8018dc dl_main+0x814 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad81384c _dl_sysdep_start+0x36c (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800884 _dl_start_final+0xac (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800b00 _dl_start+0x200 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800048 _start+0x8 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
main 3258 100 instructions:
ffff8000100878d0 el0_sync_handler+0x168 ([kernel.kallsyms])
ffff800010082d00 el0_sync+0x140 ([kernel.kallsyms])
ffffad801910 dl_main+0x848 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad81384c _dl_sysdep_start+0x36c (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800884 _dl_start_final+0xac (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800b00 _dl_start+0x200 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800048 _start+0x8 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
[...]
Changes from v4:
* Addressed Mike's suggestion for performance improvement for function
cs_etm__instr_addr() for quick calculation for non T32;
* Removed the patch 'perf cs-etm: Synchronize instruction sample with
the thread stack' (Mike);
* Fixed the issue for exception is taken for branch target address
accessing, for the branch sample and stack thread handling, the
related patches are 01, 02, 07;
* Fixed the stack thread handling for instruction emulation and single
step with patches 08, 09.
Changes from v3:
* Split out separate patch set for instruction samples fixing.
* Rebased on latest perf/core branch.
Changes from v2:
* Added patch 01 to fix the unsigned variable comparison to zero
(Suzuki).
* Refined commit logs.
Changes from v1:
* Added comments for task thread handling (Mathieu).
* Split patch 02 into two patches, one is for support thread stack and
another is for callchain support (Mathieu).
* Added a new patch to support branch filter.
[1] https://lkml.org/lkml/2020/2/18/1406
Leo Yan (9):
perf cs-etm: Defer to assign exception sample flag
perf cs-etm: Reflect branch prior to exception
perf cs-etm: Refactor instruction size handling
perf cs-etm: Support thread stack
perf cs-etm: Support branch filter
perf cs-etm: Support callchain for instruction sample
perf cs-etm: Fixup exception entry for thread stack
perf thread: Add helper to get top return address
perf cs-etm: Fixup exception exit for thread stack
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 1 +
tools/perf/util/cs-etm.c | 290 ++++++++++++++++--
tools/perf/util/thread-stack.c | 10 +
tools/perf/util/thread-stack.h | 1 +
4 files changed, 268 insertions(+), 34 deletions(-)
--
2.17.1
There was a report of NULL pointer dereference in ETF enable
path for perf CS mode with PID monitoring. It is almost 100%
reproducible when the process to monitor is something very
active such as chrome and with ETF as the sink and not ETR.
Currently in a bid to find the pid, the owner is dereferenced
via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
owner being NULL, we get a NULL pointer dereference.
Looking at the ETR and other places in the kernel, ETF and the
ETB are the only places trying to dereference the task(owner)
in tmc_enable_etf_sink_perf() which is also called from the
sched_in path as in the call trace. Owner(task) is NULL even
in the case of ETR in tmc_enable_etr_sink_perf(), but since we
cache the PID in alloc_buffer() callback and it is done as part
of etm_setup_aux() when allocating buffer for ETR sink, we never
dereference this NULL pointer and we are safe. So lets do the
same thing with ETF and ETB and cache the PID to which the
cs_buffer belongs in alloc_buffer() callback for ETF and ETB as
done for ETR. This will also remove the unnecessary function calls
(task_pid_nr()) in tmc_enable_etr_sink_perf() and etb_enable_perf().
In addition to this, add a check to validate event->owner before
dereferencing it in ETR, ETB and ETF to avoid any possible NULL
pointer dereference crashes in their corresponding alloc_buffer
callbacks and check for kernel events as well.
Easily reproducible running below:
perf record -e cs_etm/@tmc_etf0/ -N -p <pid>
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000548
Mem abort info:
ESR = 0x96000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
<snip>...
Call trace:
tmc_enable_etf_sink+0xe4/0x280
coresight_enable_path+0x168/0x1fc
etm_event_start+0x8c/0xf8
etm_event_add+0x38/0x54
event_sched_in+0x194/0x2ac
group_sched_in+0x54/0x12c
flexible_sched_in+0xd8/0x120
visit_groups_merge+0x100/0x16c
ctx_flexible_sched_in+0x50/0x74
ctx_sched_in+0xa4/0xa8
perf_event_sched_in+0x60/0x6c
perf_event_context_sched_in+0x98/0xe0
__perf_event_task_sched_in+0x5c/0xd8
finish_task_switch+0x184/0x1cc
schedule_tail+0x20/0xec
ret_from_fork+0x4/0x18
Sai Prakash Ranjan (4):
perf/core: Export is_kernel_event()
coresight: tmc-etf: Fix NULL ptr dereference in
tmc_enable_etf_sink_perf()
coresight: etb10: Fix possible NULL ptr dereference in
etb_enable_perf()
coresight: tmc-etr: Fix possible NULL ptr dereference in
get_perf_etr_buf_cpu_wide()
drivers/hwtracing/coresight/coresight-etb10.c | 8 +++++++-
drivers/hwtracing/coresight/coresight-priv.h | 2 ++
drivers/hwtracing/coresight/coresight-tmc-etf.c | 8 +++++++-
drivers/hwtracing/coresight/coresight-tmc-etr.c | 6 +++++-
include/linux/perf_event.h | 2 ++
kernel/events/core.c | 3 ++-
6 files changed, 25 insertions(+), 4 deletions(-)
base-commit: f4cb5e9daedf56671badc93ac7f364043aa33886
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
hi,
while testing the implementation in gdb of branch tracing on arm
processors using etm, I faced the the situation where a breakpoint was
set, was hit and then the execution of the program was continued. While
decoding generated traces, I got the address of the breakpoint
(0x400552) executed twice, and then the following address (0x400554)
also executed twice. the instruction at (0x400554) is a BL ( a function
call) and the second execution corrupts the function history.
here is a dump of generated trace elements
---------------------------------
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400552
end addr = 0x400554
instructions count = 1
last_i_type: OCSD_INSTR_OTHER
last_i_subtype: OCSD_S_INSTR_NONE
last instruction was executed
last instruction size: 2
---------------------------------
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400552
end addr = 0x400554
instructions count = 1
last_i_type: OCSD_INSTR_OTHER
last_i_subtype: OCSD_S_INSTR_NONE
last instruction was executed
last instruction size: 2
---------------------------------
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400554
end addr = 0x400558
instructions count = 1
last_i_type: OCSD_INSTR_BR
last_i_subtype: OCSD_S_INSTR_BR_LINK
last instruction was executed
last instruction size: 4
---------------------------------
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400554
end addr = 0x400558
instructions count = 1
last_i_type: OCSD_INSTR_BR
last_i_subtype: OCSD_S_INSTR_BR_LINK
last instruction was executed
last instruction size: 4
the explanation I have for this behavior is that :
-when setting the software breakpoint, the memory content of the
instruction (at 0x400552) was altered to the instruction BKPT,
-when the breakpoint was hit, the original opcode was set at (0x400552)
and a BKPT was set to the next instruction address (0x400554), then the
execution was continued
-when the second breakpoint (0x400554) was hit, the a BKPT opcode was
set at (0x400552) and the original opcode was set at (0x400554) then the
execution was continued
I am using the function "int target_read_code (CORE_ADDR memaddr,
gdb_byte *myaddr, ssize_t len)" to give program memory content to the
decoder. so the collected etm traces are correct, but, as memory was
altered in between, the decoder is "cheated".
I need to identify the re-execution of code due to breakpoint handling,
and roll back its impact on etm decoding.
is there a mean to get the actual content of program memory including
patched addresses?
is there a means of getting the history of patched addresses during the
debugging of a program?
what is the type and subtype of a BKPT instruction in a decoded trace
elements?
do you have any other idea for handling this situation?
I am attaching the source code of the program as well as the
disassembled binary. the code was compiled as an application running on
linux on an ARMv7 A (STM32MP157 SoC). the breakpoint was set at line 43
in the source code (line 238 in the disassembled code)
Kind Regards
Zied Guermazi
From: Linu Cherian <lcherian(a)marvell.com>
[ Upstream commit 6d578258b955fc8888e1bbd9a8fefe7b10065a84 ]
Coresight driver assumes sink is common across all the ETMs,
and tries to build a path between ETM and the first enabled
sink found using bus based search. This breaks sysFS usage
on implementations that has multiple per core sinks in
enabled state.
To fix this, coresight_get_enabled_sink API is updated to
do a connection based search starting from the given source,
instead of bus based search.
With sink selection using sysfs depecrated for perf interface,
provision for reset is removed as well in this API.
Signed-off-by: Linu Cherian <lcherian(a)marvell.com>
[Fixed indentation problem and removed obsolete comment]
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Link: https://lore.kernel.org/r/20200916191737.4001561-15-mathieu.poirier@linaro.…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/hwtracing/coresight/coresight-priv.h | 3 +-
drivers/hwtracing/coresight/coresight.c | 62 +++++++++-----------
2 files changed, 29 insertions(+), 36 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index f2dc625ea5856..5fe773c4d6cc5 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -148,7 +148,8 @@ static inline void coresight_write_reg_pair(void __iomem *addr, u64 val,
void coresight_disable_path(struct list_head *path);
int coresight_enable_path(struct list_head *path, u32 mode, void *sink_data);
struct coresight_device *coresight_get_sink(struct list_head *path);
-struct coresight_device *coresight_get_enabled_sink(bool reset);
+struct coresight_device *
+coresight_get_enabled_sink(struct coresight_device *source);
struct coresight_device *coresight_get_sink_by_id(u32 id);
struct coresight_device *
coresight_find_default_sink(struct coresight_device *csdev);
diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c
index e9c90f2de34ac..bb4f9e0a5438d 100644
--- a/drivers/hwtracing/coresight/coresight.c
+++ b/drivers/hwtracing/coresight/coresight.c
@@ -540,50 +540,46 @@ struct coresight_device *coresight_get_sink(struct list_head *path)
return csdev;
}
-static int coresight_enabled_sink(struct device *dev, const void *data)
+static struct coresight_device *
+coresight_find_enabled_sink(struct coresight_device *csdev)
{
- const bool *reset = data;
- struct coresight_device *csdev = to_coresight_device(dev);
+ int i;
+ struct coresight_device *sink;
if ((csdev->type == CORESIGHT_DEV_TYPE_SINK ||
csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) &&
- csdev->activated) {
- /*
- * Now that we have a handle on the sink for this session,
- * disable the sysFS "enable_sink" flag so that possible
- * concurrent perf session that wish to use another sink don't
- * trip on it. Doing so has no ramification for the current
- * session.
- */
- if (*reset)
- csdev->activated = false;
+ csdev->activated)
+ return csdev;
- return 1;
+ /*
+ * Recursively explore each port found on this element.
+ */
+ for (i = 0; i < csdev->pdata->nr_outport; i++) {
+ struct coresight_device *child_dev;
+
+ child_dev = csdev->pdata->conns[i].child_dev;
+ if (child_dev)
+ sink = coresight_find_enabled_sink(child_dev);
+ if (sink)
+ return sink;
}
- return 0;
+ return NULL;
}
/**
- * coresight_get_enabled_sink - returns the first enabled sink found on the bus
- * @deactivate: Whether the 'enable_sink' flag should be reset
- *
- * When operated from perf the deactivate parameter should be set to 'true'.
- * That way the "enabled_sink" flag of the sink that was selected can be reset,
- * allowing for other concurrent perf sessions to choose a different sink.
+ * coresight_get_enabled_sink - returns the first enabled sink using
+ * connection based search starting from the source reference
*
- * When operated from sysFS users have full control and as such the deactivate
- * parameter should be set to 'false', hence mandating users to explicitly
- * clear the flag.
+ * @source: Coresight source device reference
*/
-struct coresight_device *coresight_get_enabled_sink(bool deactivate)
+struct coresight_device *
+coresight_get_enabled_sink(struct coresight_device *source)
{
- struct device *dev = NULL;
-
- dev = bus_find_device(&coresight_bustype, NULL, &deactivate,
- coresight_enabled_sink);
+ if (!source)
+ return NULL;
- return dev ? to_coresight_device(dev) : NULL;
+ return coresight_find_enabled_sink(source);
}
static int coresight_sink_by_id(struct device *dev, const void *data)
@@ -988,11 +984,7 @@ int coresight_enable(struct coresight_device *csdev)
goto out;
}
- /*
- * Search for a valid sink for this session but don't reset the
- * "enable_sink" flag in sysFS. Users get to do that explicitly.
- */
- sink = coresight_get_enabled_sink(false);
+ sink = coresight_get_enabled_sink(csdev);
if (!sink) {
ret = -EINVAL;
goto out;
--
2.25.1