This patchset introduces initial concepts in CoreSight system
configuration management support, to allow more detailed and complex
programming to be applied to CoreSight systems during trace capture.
Configurations consist of 2 elements:-
1) Features - programming combinations for devices, applied to a class of
device on the system (all ETMv4), or individual devices.
2) Configurations - a set of programmed features used when the named
configuration is selected.
Features and configurations are declared as a data table, a set of register,
resource and parameter requirements. Features and configurations are loaded
into the system by the virtual cs_syscfg device. This then matches features
to any registered devices and loads the feature into them.
Individual device classes that support features and configurations register
with cs_syscfg.
Once loaded a configuration can be enabled for a specific trace run.
Configurations are registered with the perf cs_etm event as entries in
cs_etm/cs_config. These can be selected on the perf command line as follows:-
perf record -e cs_etm/<config_name>/ ...
This patch set has one pre-loaded configuration and feature.
A named "strobing" feature is provided for ETMv4.
A named "autofdo" configuration is provided. This configuration enables
strobing on any ETM in use.
Thus the command:
perf record -e cs_etm/autofdo/ ...
will trace the supplied application while enabling the "autofdo" configuration
on each ETM as it is enabled by perf. This in turn will enable strobing for
the ETM - with default parameters. Parameters can be adjusted using configfs.
The sink used in the trace run will be automatically selected.
A configuration can supply up to 15 sets of preset parameter values, which will
substitute in parameter values for any feature used in the configuration.
Preset values are selected as follows:
perf record -e cs_etm/autofdo,preset=1/ ...
(valid presets 1-N, where N is the number supplied in the configuration, not
exceeding 15. preset=0 is the same as not selecting a preset.)
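The preset rule above can be sketched in C as follows. This is only an illustration of the range check as described in this letter, not the code in the patches: struct cfg_desc, cfg_select_preset() and CFG_MAX_PRESETS are hypothetical names, and the choice of -EINVAL for an out-of-range preset is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define CFG_MAX_PRESETS 15	/* upper bound given in the cover letter */

/* Hypothetical descriptor; only the preset count matters here. */
struct cfg_desc {
	const char *name;
	int nr_presets;		/* N: presets supplied, 0..CFG_MAX_PRESETS */
};

/*
 * Translate the perf "preset=" value into a zero-based table index.
 * On success returns 0 and sets *index to -1 for "no preset" (preset=0)
 * or to preset - 1 for presets 1..N. Returns -EINVAL otherwise
 * (the error code is an assumption of this sketch).
 */
int cfg_select_preset(const struct cfg_desc *cfg, int preset, int *index)
{
	assert(cfg != NULL && index != NULL);

	if (cfg->nr_presets < 0 || cfg->nr_presets > CFG_MAX_PRESETS)
		return -EINVAL;
	if (preset < 0 || preset > cfg->nr_presets)
		return -EINVAL;

	*index = preset - 1;	/* preset 0 maps to -1, i.e. use defaults */
	return 0;
}
```

With a configuration that supplies 3 presets, preset=0 yields index -1 (defaults), preset=3 yields index 2, and preset=4 is rejected.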
Applies to coresight/next (5.11-rc2 base)
Changes since v3: (Primarily based on comments from Mathieu)
1) Locking mechanisms simplified.
2) Removed the possibility to enable features independently from
configurations. Only configurations can be enabled now. Simplifies programming
logic.
3) Configuration now uses an activate->enable mechanism. This means that perf
will activate a selected configuration at the start of a session (during
setup_aux), and disable at the end of a session (around free_aux)
The active configuration and associated features will be programmed into the
CoreSight device instances when they are enabled. This locks the configuration
into the system while in use. Parameters cannot be altered while this is
in place. This mechanism will be extended in future for dynamic load / unload
of configurations to prevent removal while in use.
4) Removed the custom bus / driver as un-necessary. A single device is
registered to own perf fs elements and configfs.
5) Various other minor issues addressed.
Changes since v2:
1) Added documentation file.
2) Altered cs_syscfg driver to no longer be coresight_device based, and moved
to its own custom bus to remove it from the main coresight bus. (Mathieu)
3) Added configfs support to inspect and control loaded configurations and
features. Allows listing of preset values (Yabin Cui)
4) Dropped sysfs support for adjusting feature parameters on the per device
basis, in favour of a single point adjustment in configfs that is pushed to all
device instances.
5) Altered how the config and preset command line options are handled in perf
and the drivers. (Mathieu and Suzuki).
6) Fixes for various issues and technical points (Mathieu, Yabin)
Changes since v1:
1) Moved preloaded configurations and features out of individual drivers.
2) Added cs_syscfg driver to manage configurations and features. Individual
drivers register with cs_syscfg indicating support for config, and provide
matching information that the system uses to load features into the drivers.
This allows individual drivers to be updated on an as needed basis - and
removes the need to consider devices that cannot benefit from configuration -
static replicators, funnels, tpiu.
3) Added perf selection of configurations.
4) Rebased onto the coresight module loading set.
To follow in future revisions / sets:-
a) load of additional config and features by loadable module.
b) load of additional config and features by configfs
c) enhanced resource management for ETMv4 and checking features have sufficient
resources to be enabled.
d) ECT and CTI support for configuration and features.
Mike Leach (10):
coresight: syscfg: Initial coresight system configuration
coresight: syscfg: Add registration and feature loading for cs devices
coresight: config: Add configuration and feature generic functions
coresight: etm-perf: update to handle configuration selection
coresight: syscfg: Add API to activate and enable configurations
coresight: etm-perf: Update to activate selected configuration
coresight: etm4x: Add complex configuration handlers to etmv4
coresight: config: Add preloaded configurations
coresight: syscfg: Add initial configfs support
coresight: docs: Add documentation for CoreSight config
.../trace/coresight/coresight-config.rst | 244 ++++++
Documentation/trace/coresight/coresight.rst | 16 +
drivers/hwtracing/coresight/Makefile | 7 +-
.../hwtracing/coresight/coresight-cfg-afdo.c | 154 ++++
.../coresight/coresight-cfg-preload.c | 25 +
.../coresight/coresight-cfg-preload.h | 11 +
.../hwtracing/coresight/coresight-config.c | 246 ++++++
.../hwtracing/coresight/coresight-config.h | 282 +++++++
drivers/hwtracing/coresight/coresight-core.c | 18 +-
.../hwtracing/coresight/coresight-etm-perf.c | 180 ++++-
.../hwtracing/coresight/coresight-etm-perf.h | 12 +-
.../hwtracing/coresight/coresight-etm4x-cfg.c | 184 +++++
.../hwtracing/coresight/coresight-etm4x-cfg.h | 29 +
.../coresight/coresight-etm4x-core.c | 38 +-
.../coresight/coresight-etm4x-sysfs.c | 3 +
.../coresight/coresight-syscfg-configfs.c | 399 +++++++++
.../coresight/coresight-syscfg-configfs.h | 45 ++
.../hwtracing/coresight/coresight-syscfg.c | 761 ++++++++++++++++++
.../hwtracing/coresight/coresight-syscfg.h | 90 +++
include/linux/coresight.h | 7 +
20 files changed, 2721 insertions(+), 30 deletions(-)
create mode 100644 Documentation/trace/coresight/coresight-config.rst
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-afdo.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.h
create mode 100644 drivers/hwtracing/coresight/coresight-config.c
create mode 100644 drivers/hwtracing/coresight/coresight-config.h
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.h
--
2.17.1
Hardware assisted tracing families such as ARM Coresight and Intel PT
provide rich tracing capabilities including instruction level
tracing and accurate timestamps which are very useful for profiling
but also pose a significant security risk. One such example of
security risk is when kernel mode tracing is not excluded and this
hardware assisted tracing can be used to analyze cryptographic code
execution. In this case, even the root user must not be able to infer
anything.
To explain it more clearly, here is the quote from a member of the
security team (credits: Mattias Nissler),
"Consider a system where disk contents are encrypted and the encryption
key is set up by the user when mounting the file system. From that point
on the encryption key resides in the kernel. It seems reasonable to
expect that the disk encryption key be protected from exfiltration even
if the system later suffers a root compromise (or even against insiders
that have root access), at least as long as the attacker doesn't
manage to compromise the kernel."
Here the idea is to protect such important information from all users
including root users since root privileges do not have to mean full
control over the kernel [1] and root compromise does not have to be
the end of the world.
Currently we can exclude kernel mode tracing via perf_event_paranoid
sysctl but it has the following limitations:
* It is applicable to all PMUs and not just the ones supporting
instruction tracing.
* No option to restrict kernel mode instruction tracing by the
root user.
* Not possible to restrict kernel mode instruction tracing when the
hardware assisted tracing IPs like ARM Coresight ETMs use an
additional interface via sysfs for tracing in addition to perf
interface.
So introduce a new config CONFIG_EXCLUDE_KERNEL_HW_ITRACE to exclude
kernel mode instruction tracing which will be generic and applicable
to all hardware tracing families and which can also be used with other
interfaces like sysfs in case of ETMs.
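A rough userspace sketch of the kind of check such a config implies is given below. It is not the code from the patches: struct trace_request, check_itrace_request() and the EXCLUDE_KERNEL_HW_ITRACE macro are hypothetical stand-ins, and whether the real implementation rejects the request or handles it differently is not stated in this letter. Rejecting with -EACCES is an assumption made only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Stand-in for CONFIG_EXCLUDE_KERNEL_HW_ITRACE. In the kernel a Kconfig
 * symbol would normally be tested with IS_ENABLED(); here it is a plain
 * macro so that the sketch builds in userspace.
 */
#ifndef EXCLUDE_KERNEL_HW_ITRACE
#define EXCLUDE_KERNEL_HW_ITRACE 1
#endif

/* Hypothetical summary of a trace request, from perf or from sysfs. */
struct trace_request {
	bool pmu_does_itrace;	/* device supports instruction tracing */
	bool exclude_kernel;	/* caller asked to leave kernel mode out */
};

/*
 * Returns 0 if the request may proceed, or -EACCES (assumed error code)
 * if it would trace kernel mode instructions while the config is set.
 * Note that there is no exemption for the root user.
 */
int check_itrace_request(const struct trace_request *req)
{
	assert(req != NULL);

	if (!EXCLUDE_KERNEL_HW_ITRACE)
		return 0;	/* restriction compiled out */
	if (!req->pmu_does_itrace)
		return 0;	/* other PMUs are unaffected */

	return req->exclude_kernel ? 0 : -EACCES;
}
```

The point of the sketch is that the same predicate can be called from both the perf path and a sysfs mode setter, which is what makes the config generic across interfaces.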
Patch 1 adds this new config and the support in perf core to exclude
kernel mode tracing for PMUs which support instruction mode tracing.
Patch 2 adds the perf evsel warning message when the perf tool users
attempt to perform a kernel mode instruction trace with the config
enabled to exclude the kernel mode tracing.
Patch 3 and Patch 4 add support for excluding kernel mode for
ARM Coresight ETM{4,3}XX sysfs mode using the newly introduced generic
config.
[1] https://lwn.net/Articles/796866/
Sai Prakash Ranjan (4):
perf/core: Add support to exclude kernel mode instruction tracing
perf evsel: Print warning for excluding kernel mode instruction
tracing
coresight: etm4x: Add support to exclude kernel mode tracing
coresight: etm3x: Add support to exclude kernel mode tracing
drivers/hwtracing/coresight/coresight-etm3x-core.c | 11 +++++++++++
.../hwtracing/coresight/coresight-etm3x-sysfs.c | 3 ++-
drivers/hwtracing/coresight/coresight-etm4x-core.c | 14 +++++++++++++-
.../hwtracing/coresight/coresight-etm4x-sysfs.c | 3 ++-
init/Kconfig | 12 ++++++++++++
kernel/events/core.c | 6 ++++++
tools/perf/util/evsel.c | 3 ++-
7 files changed, 48 insertions(+), 4 deletions(-)
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
This series enables future IP trace features Embedded Trace Extension (ETE)
and Trace Buffer Extension (TRBE). This series depends on the ETM system
register instruction support series [0] which is available here [1]. This
series, which applies on [1], is available here [2] for quick access.
ETE is the PE (CPU) trace unit for CPUs, implementing future architecture
extensions. ETE overlaps with the ETMv4 architecture, with additions to
support the newer architecture features and some restrictions on the
supported features w.r.t ETMv4. The ETE support is added by extending the
ETMv4 driver to recognise the ETE and handle the features as exposed by the
TRCIDRx registers. ETE only supports system instructions access from the
host CPU. The ETE could be integrated with a TRBE (see below), or with the
legacy CoreSight trace bus (e.g, ETRs). Thus the ETE follows same firmware
description as the ETMs and requires a node per instance.
Trace Buffer Extensions (TRBE) implements a per CPU trace buffer, which is
accessible via the system registers and can be combined with the ETE to
provide a 1x1 configuration of source & sink. TRBE is being represented
here as a CoreSight sink. Primary reason is that the ETE source could work
with other traditional CoreSight sink devices. As TRBE captures the trace
data which is produced by ETE, it cannot work alone.
The TRBE representation here has some distinct deviations from a traditional
CoreSight sink device. The CoreSight path between ETE and TRBE is not built
during boot by looking at the respective DT or ACPI entries.
Unlike traditional sinks, TRBE can generate interrupts to signal, among
other things, that the buffer has filled. The interrupt is a PPI and should be
communicated from the platform. The DT or ACPI entry representing TRBE should
have the PPI number for a given platform. During a perf session, the TRBE IRQ
handler should capture trace for the perf auxiliary buffer before restarting
it. The system registers used here to configure ETE and TRBE are
described at the link below.
https://developer.arm.com/docs/ddi0601/g/aarch64-system-registers.
Question:
- Should we implement sysfs based trace sessions for TRBE ?
[0] https://lore.kernel.org/linux-arm-kernel/20210110224850.1880240-1-suzuki.po…
[1] https://gitlab.arm.com/linux-arm/linux-skp/-/tree/coresight/etm/sysreg-v7
[2] https://gitlab.arm.com/linux-arm/linux-anshuman/-/tree/coresight/ete_trbe_v2
Changes in V2:
- Converted both ETE and TRBE DT bindings into Yaml
- TRBE changes have been captured in the respective patches
Changes in V1:
https://lore.kernel.org/linux-arm-kernel/1608717823-18387-1-git-send-email-…
- There are not many ETE changes from Suzuki apart from splitting of the ETE DTS patch
- TRBE changes have been captured in the respective patches
Changes in RFC:
https://lore.kernel.org/linux-arm-kernel/1605012309-24812-1-git-send-email-…
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: Linu Cherian <lcherian(a)marvell.com>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
Anshuman Khandual (4):
arm64: Add TRBE definitions
coresight: core: Add support for dedicated percpu sinks
coresight: etm-perf: Truncate the perf record if handle has no space
coresight: sink: Add TRBE driver
Suzuki K Poulose (7):
coresight: etm-perf: Allow an event to use different sinks
coresight: Do not scan for graph if none is present
coresight: etm4x: Add support for PE OS lock
coresight: ete: Add support for ETE sysreg access
coresight: ete: Add support for ETE tracing
dts: bindings: Document device tree bindings for ETE
dts: bindings: Document device tree bindings for Arm TRBE
Documentation/devicetree/bindings/arm/ete.yaml | 71 ++
Documentation/devicetree/bindings/arm/trbe.yaml | 46 +
Documentation/trace/coresight/coresight-trbe.rst | 39 +
arch/arm64/include/asm/sysreg.h | 51 ++
drivers/hwtracing/coresight/Kconfig | 21 +-
drivers/hwtracing/coresight/Makefile | 1 +
drivers/hwtracing/coresight/coresight-core.c | 14 +
drivers/hwtracing/coresight/coresight-etm-perf.c | 51 +-
drivers/hwtracing/coresight/coresight-etm4x-core.c | 138 ++-
.../hwtracing/coresight/coresight-etm4x-sysfs.c | 19 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 81 +-
drivers/hwtracing/coresight/coresight-platform.c | 6 +
drivers/hwtracing/coresight/coresight-trbe.c | 966 +++++++++++++++++++++
drivers/hwtracing/coresight/coresight-trbe.h | 216 +++++
include/linux/coresight.h | 12 +
15 files changed, 1683 insertions(+), 49 deletions(-)
create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
create mode 100644 Documentation/devicetree/bindings/arm/trbe.yaml
create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
--
2.7.4
This series enables future IP trace features Embedded Trace Extension (ETE)
and Trace Buffer Extension (TRBE). This series depends on the ETM system
register instruction support series [0] which is available here [1]. This
series, which applies on [1], is available here [2] for quick access.
ETE is the PE (CPU) trace unit for CPUs, implementing future architecture
extensions. ETE overlaps with the ETMv4 architecture, with additions to
support the newer architecture features and some restrictions on the
supported features w.r.t ETMv4. The ETE support is added by extending the
ETMv4 driver to recognise the ETE and handle the features as exposed by the
TRCIDRx registers. ETE only supports system instructions access from the
host CPU. The ETE could be integrated with a TRBE (see below), or with the
legacy CoreSight trace bus (e.g, ETRs). Thus the ETE follows same firmware
description as the ETMs and requires a node per instance.
Trace Buffer Extensions (TRBE) implements a per CPU trace buffer, which is
accessible via the system registers and can be combined with the ETE to
provide a 1x1 configuration of source & sink. TRBE is being represented
here as a CoreSight sink. Primary reason is that the ETE source could work
with other traditional CoreSight sink devices. As TRBE captures the trace
data which is produced by ETE, it cannot work alone.
The TRBE representation here has some distinct deviations from a traditional
CoreSight sink device. The CoreSight path between ETE and TRBE is not built
during boot by looking at the respective DT or ACPI entries.
Unlike traditional sinks, TRBE can generate interrupts to signal, among
other things, that the buffer has filled. The interrupt is a PPI and should be
communicated from the platform. The DT or ACPI entry representing TRBE should
have the PPI number for a given platform. During a perf session, the TRBE IRQ
handler should capture trace for the perf auxiliary buffer before restarting
it. The system registers used here to configure ETE and TRBE are
described at the link below.
https://developer.arm.com/docs/ddi0601/g/aarch64-system-registers.
Question:
- Should we implement sysfs based trace sessions for TRBE ?
[0] https://lore.kernel.org/linux-arm-kernel/20210110224850.1880240-1-suzuki.po…
[1] https://gitlab.arm.com/linux-arm/linux-skp/-/tree/coresight/etm/sysreg-v7
[2] https://gitlab.arm.com/linux-arm/linux-anshuman/-/tree/coresight/ete_trbe_v3
Changes in V3:
- Rebased on coresight/next
- Changed DT bindings for ETE
- Included additional patches for arm64 nvhe, perf aux buffer flags etc
- TRBE changes have been captured in the respective patches
Changes in V2:
https://lore.kernel.org/linux-arm-kernel/1610511498-4058-1-git-send-email-a…
- Converted both ETE and TRBE DT bindings into Yaml
- TRBE changes have been captured in the respective patches
Changes in V1:
https://lore.kernel.org/linux-arm-kernel/1608717823-18387-1-git-send-email-…
- There are not many ETE changes from Suzuki apart from splitting of the ETE DTS patch
- TRBE changes have been captured in the respective patches
Changes in RFC:
https://lore.kernel.org/linux-arm-kernel/1605012309-24812-1-git-send-email-…
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: Linu Cherian <lcherian(a)marvell.com>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
Anshuman Khandual (3):
coresight: core: Add support for dedicated percpu sinks
arm64: Add TRBE definitions
coresight: sink: Add TRBE driver
Suzuki K Poulose (11):
coresight: etm-perf: Allow an event to use different sinks
coresight: Do not scan for graph if none is present
coresight: etm4x: Add support for PE OS lock
coresight: ete: Add support for ETE sysreg access
coresight: ete: Add support for ETE tracing
dts: bindings: Document device tree bindings for ETE
coresight: etm-perf: Handle stale output handles
arm64: nvhe: Allow TRBE access at EL1
dts: bindings: Document device tree bindings for Arm TRBE
perf: aux: Add flags for the buffer format
coresight: etm-perf: Add support for trace buffer format
Documentation/devicetree/bindings/arm/ete.yaml | 74 ++
Documentation/devicetree/bindings/arm/trbe.yaml | 49 +
Documentation/trace/coresight/coresight-trbe.rst | 39 +
arch/arm64/include/asm/el2_setup.h | 19 +
arch/arm64/include/asm/kvm_arm.h | 2 +
arch/arm64/include/asm/sysreg.h | 51 +
drivers/hwtracing/coresight/Kconfig | 21 +-
drivers/hwtracing/coresight/Makefile | 1 +
drivers/hwtracing/coresight/coresight-core.c | 16 +-
drivers/hwtracing/coresight/coresight-etm-perf.c | 93 +-
drivers/hwtracing/coresight/coresight-etm4x-core.c | 138 ++-
.../hwtracing/coresight/coresight-etm4x-sysfs.c | 19 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 81 +-
drivers/hwtracing/coresight/coresight-platform.c | 6 +
drivers/hwtracing/coresight/coresight-trbe.c | 1025 ++++++++++++++++++++
drivers/hwtracing/coresight/coresight-trbe.h | 160 +++
include/linux/coresight.h | 12 +
include/uapi/linux/perf_event.h | 13 +-
18 files changed, 1759 insertions(+), 60 deletions(-)
create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
create mode 100644 Documentation/devicetree/bindings/arm/trbe.yaml
create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
--
2.7.4
CoreSight ETMv4.4 obsoletes memory mapped access to ETM and
mandates the system instructions for registers.
This also implies that they may not be on the amba bus.
Right now all the CoreSight components are accessed via memory
map. Also, we have some common routines in coresight generic
code driver (e.g, CS_LOCK, claim/disclaim), which assume the
mmio. In order to preserve the generic algorithms at a single
place and to allow dynamic switch for ETMs, this series introduces
an abstraction layer for accessing a coresight device. It is
designed such that the mmio access are fast tracked (i.e, without
an indirect function call).
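To make the idea concrete, here is a small userspace sketch of such an access layer. It is an illustration under stated assumptions rather than the structure added by the series: struct dev_access, dev_read32(), dev_write32() and the demo_sysreg_*() backend are hypothetical names, and a plain array stands in for both the mmio window and the system registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access descriptor, one per device. */
struct dev_access {
	bool io_mem;	/* true: registers are memory mapped */
	union {
		volatile uint32_t *base;	/* used when io_mem is true */
		struct {
			uint32_t (*read)(uint32_t offset);
			void (*write)(uint32_t val, uint32_t offset);
		} ops;				/* used otherwise */
	} u;
};

/* mmio is handled by a direct load; only other devices pay for a callback. */
uint32_t dev_read32(const struct dev_access *a, uint32_t offset)
{
	assert(a != NULL);

	if (a->io_mem)
		return a->u.base[offset / sizeof(uint32_t)];
	return a->u.ops.read(offset);
}

void dev_write32(const struct dev_access *a, uint32_t val, uint32_t offset)
{
	assert(a != NULL);

	if (a->io_mem)
		a->u.base[offset / sizeof(uint32_t)] = val;
	else
		a->u.ops.write(val, offset);
}

/* Stand-in backend for a device reached through system instructions. */
static uint32_t demo_regs[4];

uint32_t demo_sysreg_read(uint32_t offset)
{
	return demo_regs[(offset / sizeof(uint32_t)) % 4];
}

void demo_sysreg_write(uint32_t val, uint32_t offset)
{
	demo_regs[(offset / sizeof(uint32_t)) % 4] = val;
}
```

Generic routines (lock/unlock, claim/disclaim, timeouts) can then take a struct dev_access pointer and stay in one place, while the io_mem test keeps the common memory mapped case free of an indirect call.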
This will also help us to get rid of the driver+attribute specific
sysfs show/store routines and replace them with a single routine
to access a given register offset (which can be embedded in the
dev_ext_attribute). This is not currently implemented in the series,
but can be achieved.
Further we switch the generic routines to work with the abstraction.
With this in place, we refactor the etm4x code a bit to allow for
supporting the system instructions with very little new code.
We use TRCDEVARCH for the detection of the ETM component, which
is a standard register as per CoreSight architecture, rather than
the etm specific id register TRCIDR1. This is for making sure
that we are able to detect the ETM via system instructions accurately,
when the trace unit could be anything (etm or a custom trace unit).
To keep the backward compatibility for any existing broken
implementation which may not implement TRCDEVARCH, we fall back to TRCIDR1.
Also this covers us for the changes in the future architecture [0].
Also, v8.4 self-hosted tracing extensions (coupled with ETMv4.4) adds
new filtering registers for trace by exception level. So on a v8.4
system, with Trace Filtering support, without the appropriate
programming of the Trace filter registers (TRFCR_ELx), tracing
will not be enabled. This series also includes the TraceFiltering
support to cover the ETM-v4.4 support.
The series has been mildly tested on a model for system instructions.
I would really appreciate any testing on real hardware.
Applies on coresight/next. A tree is available here [1].
[0] https://developer.arm.com/docs/ddi0601/g/aarch64-system-registers/trcidr1
[1] https://gitlab.arm.com/linux-arm/linux-skp coresight/etm/sysreg-v7
Changes since v6:
- New patch: Patch9 : Prepare the sysfs attributes for
filtering by offset of the register
- New patch: Patch12: Hide ETM registers inaccessible
on the ETM (for system instructions based ETMs)
- Split the list of ETM registers to memory mapped only
and common registers (Patch 11)
- Fixed the alignment issues pointed by Mathieu
(Patch 3, 14, 24)
Changes since v5:
- Rebased on to coresight/next.
- Moved trcdevarch to mgmt/ in sysfs and updated the sysfs ABI
document (Mike Leach)
- New patch : Moved the etm4_check_arch_features to run on the CPU, since
the PID of the ETM has to be read on the CPU and is unavailable
otherwise.
Changes since v4:
- Fix typo in commit description for the patches 02 & 15
- Refactor the AMBA device "remove" call back for use with
platform_driver. (patch 21). Thus remove Review tag by Mathieu,
even though the changes are minimal.
- Added "remove" callback for platform_driver in patch 22, removed
Review tag by Mathieu
- Add 'U' suffix for constants in Patch 24 (Catalin)
- Fixed field extraction in Patch 25
Changes since v3:
- Device tree compatible changed to etm4x
- Use etm4x_** instead of generalizing etm_ in etm4x driver.
- Added v8.4 self hosted trace support patches, reworked
from Jonathan's series.
- Dropped queued patches.
- Expose TRCDEVARCH via trcidr, as this will be needed for
the userspace tools to determine the trace major/minor
arch versions.
- Remove csa argument to read()/write() (Mathieu)
- Fix secure exception mask calculation (Mathieu)
- Fix various coding style comments (Mathieu)
(See individual patches for change log)
Changes since V2:
- Several fixes to the ETM register accesses. Access a register
when it is present.
- Add support for TRCIDR3.NUMPROCS for v4.2+
- Drop OS lock detection. Use software lock only in case of mmio.
- Fix issues with the Exception level masks (Mike Leach)
- Fall back to using TRCIDR1 when TRCDEVARCH is not "present"
- Use a generic notion of ETM architecture (rather than using
the encoding as in registers)
- Fixed some checkpatch issues.
- Changed the dts compatible string to "arm,coresight-etm-sysreg"
(Mike Leach)
Changes since V1:
- Flip the switch for iomem from no_iomem to io_mem in csdev_access.
- Split patches for claim/disclaim and CS_LOCK/UNLOCK conversions.
- Move device access initialisation for etm4x to the target CPU
- Cleanup secure exception level mask handling.
- Switch to use TRCDEVARCH for ETM component discovery. This
is for making
- Check the availability of OS/Software Locks before using them.
Known issues:
Checkpatch failure for "coresight: etm4x: Add sysreg access helpers" :
ERROR: Macros with complex values should be enclosed in parentheses
#121: FILE: drivers/hwtracing/coresight/coresight-etm4x.h:153:
+#define CASE_READ(res, x) \
+ case (x): { (res) = read_etm4x_sysreg_const_offset((x)); break; }
I don't know a way to fix the warning without losing the code
readability, which I believe is crucial for such a construct.
Jonathan Zhou (2):
arm64: Add TRFCR_ELx definitions
coresight: Add support for v8.4 SelfHosted tracing
Suzuki K Poulose (26):
coresight: etm4x: Handle access to TRCSSPCICRn
coresight: etm4x: Skip accessing TRCPDCR in save/restore
coresight: Introduce device access abstraction
coresight: tpiu: Prepare for using coresight device access abstraction
coresight: Convert coresight_timeout to use access abstraction
coresight: Convert claim/disclaim operations to use access wrappers
coresight: etm4x: Always read the registers on the host CPU
coresight: etm4x: Convert all register accesses
coresight: etm4x: Make offset available for sysfs attributes
coresight: etm4x: Add commentary on the registers
coresight: etm4x: Add sysreg access helpers
coresight: etm4x: Hide sysfs attributes for unavailable registers
coresight: etm4x: Define DEVARCH register fields
coresight: etm4x: Check for Software Lock
coresight: etm4x: Cleanup secure exception level masks
coresight: etm4x: Clean up exception level masks
coresight: etm4x: Handle ETM architecture version
coresight: etm4x: Detect access early on the target CPU
coresight: etm4x: Use TRCDEVARCH for component discovery
coresight: etm4x: Expose trcdevarch via sysfs
coresight: etm4x: Add necessary synchronization for sysreg access
coresight: etm4x: Detect system instructions support
coresight: etm4x: Refactor probing routine
coresight: etm4x: Run arch feature detection on the CPU
coresight: etm4x: Add support for sysreg only devices
dts: bindings: coresight: ETM system register access only units
.../testing/sysfs-bus-coresight-devices-etm4x | 8 +
.../devicetree/bindings/arm/coresight.txt | 5 +-
arch/arm64/include/asm/sysreg.h | 11 +
drivers/hwtracing/coresight/coresight-catu.c | 12 +-
drivers/hwtracing/coresight/coresight-core.c | 122 ++-
.../hwtracing/coresight/coresight-cti-core.c | 18 +-
drivers/hwtracing/coresight/coresight-etb10.c | 10 +-
.../coresight/coresight-etm3x-core.c | 9 +-
.../coresight/coresight-etm4x-core.c | 805 ++++++++++++------
.../coresight/coresight-etm4x-sysfs.c | 187 ++--
drivers/hwtracing/coresight/coresight-etm4x.h | 505 ++++++++++-
.../hwtracing/coresight/coresight-funnel.c | 7 +-
.../coresight/coresight-replicator.c | 13 +-
drivers/hwtracing/coresight/coresight-stm.c | 4 +-
.../hwtracing/coresight/coresight-tmc-core.c | 16 +-
.../hwtracing/coresight/coresight-tmc-etf.c | 10 +-
.../hwtracing/coresight/coresight-tmc-etr.c | 4 +-
drivers/hwtracing/coresight/coresight-tpiu.c | 31 +-
include/linux/coresight.h | 220 ++++-
19 files changed, 1520 insertions(+), 477 deletions(-)
--
2.24.1
The current fixed metadata version format (version 0), means that adding
metadata parameter items renders files from a previous version of perf
unreadable. Per CPU parameters appear in a fixed order, but there is no
field to indicate the number of ETM parameters per CPU.
This patch updates the per CPU parameter blocks to include a NR_PARAMs
value which indicates the number of parameters in the block.
The header version is incremented to 1. Fixed ordering is retained,
new ETM parameters are added to the end of the list.
The reader code is updated to be able to read current version 0 files.
For version 1, the reader will read the number of parameters in the
per CPU block. This allows the reader to process older or newer files
that may have different numbers of parameters than in use at the
time perf was built.
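The reading scheme described above can be sketched roughly as below. This is not the reader in the patch: read_cpu_block() and all the constants are hypothetical, the block is simplified to a flat array of 64-bit words, and the counts (2 fixed parameters for version 0, 3 parameters known to the reader) are made-up values chosen only to show both the "older file" and "newer file" cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout constants, in units of 64-bit words. */
#define HDR_WORDS_V0	2	/* magic, cpu */
#define HDR_WORDS_V1	3	/* magic, cpu, nr_trc_params */
#define NR_PARAMS_V0	2	/* fixed count assumed for version 0 files */
#define NR_KNOWN	3	/* parameters this reader was built to understand */

/*
 * Read one per-CPU block starting at blk[0].
 * Copies the parameters this reader knows about into out[], zero-filling
 * any that an older file does not carry, and skips any extra parameters
 * appended by a newer writer. Returns the number of words consumed so
 * that the caller can step to the next CPU block.
 * No bounds checking against the total buffer size is done in this sketch.
 */
size_t read_cpu_block(const uint64_t *blk, int version, uint64_t out[NR_KNOWN])
{
	size_t hdr, nr, i;

	assert(blk != NULL && out != NULL);

	if (version == 0) {
		hdr = HDR_WORDS_V0;
		nr = NR_PARAMS_V0;		/* implied by the format */
	} else {
		hdr = HDR_WORDS_V1;
		nr = (size_t)blk[2];		/* NR_TRC_PARAMS from the file */
	}

	for (i = 0; i < NR_KNOWN; i++)
		out[i] = (i < nr) ? blk[hdr + i] : 0;

	return hdr + nr;
}
```

Because new parameters are only ever appended, indexing by position stays valid across versions, and the stored count is all the reader needs to find the start of the next block.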
Changes since v3
1. Fixed index bug (Leo)
Changes since v2
1. Add error path print to improve --dump logging
2. Replace some hardcoded values with enum consts (Mathieu).
Changes since v1 (from Review by Leo):
1. Split printing routine into sub functions per version
2. Renamed NR_PARAMs to NR_TRC_PARAMs to emphasise use as count of trace
related parameters, not total block parameter.
3. Misc other fixes.
Signed-off-by: Mike Leach <mike.leach(a)linaro.org>
---
tools/perf/arch/arm/util/cs-etm.c | 7 +-
tools/perf/util/cs-etm.c | 235 ++++++++++++++++++++++++------
tools/perf/util/cs-etm.h | 30 +++-
3 files changed, 223 insertions(+), 49 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index bd446aba64f7..b0470f2a955a 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -572,7 +572,7 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
struct auxtrace_record *itr,
struct perf_record_auxtrace_info *info)
{
- u32 increment;
+ u32 increment, nr_trc_params;
u64 magic;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
@@ -607,6 +607,7 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
/* How much space was used */
increment = CS_ETMV4_PRIV_MAX;
+ nr_trc_params = CS_ETMV4_PRIV_MAX - CS_ETMV4_TRCCONFIGR;
} else {
magic = __perf_cs_etmv3_magic;
/* Get configuration register */
@@ -624,11 +625,13 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
/* How much space was used */
increment = CS_ETM_PRIV_MAX;
+ nr_trc_params = CS_ETM_PRIV_MAX - CS_ETM_ETMCR;
}
/* Build generic header portion */
info->priv[*offset + CS_ETM_MAGIC] = magic;
info->priv[*offset + CS_ETM_CPU] = cpu;
+ info->priv[*offset + CS_ETM_NR_TRC_PARAMS] = nr_trc_params;
/* Where the next CPU entry should start from */
*offset += increment;
}
@@ -674,7 +677,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
/* First fill out the session header */
info->type = PERF_AUXTRACE_CS_ETM;
- info->priv[CS_HEADER_VERSION_0] = 0;
+ info->priv[CS_HEADER_VERSION] = CS_HEADER_CURRENT_VERSION;
info->priv[CS_PMU_TYPE_CPUS] = type << 32;
info->priv[CS_PMU_TYPE_CPUS] |= nr_cpu;
info->priv[CS_ETM_SNAPSHOT] = ptr->snapshot_mode;
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index a2a369e2fbb6..f9af3fe0ed83 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2435,7 +2435,7 @@ static bool cs_etm__is_timeless_decoding(struct cs_etm_auxtrace *etm)
}
static const char * const cs_etm_global_header_fmts[] = {
- [CS_HEADER_VERSION_0] = " Header version %llx\n",
+ [CS_HEADER_VERSION] = " Header version %llx\n",
[CS_PMU_TYPE_CPUS] = " PMU type/num cpus %llx\n",
[CS_ETM_SNAPSHOT] = " Snapshot %llx\n",
};
@@ -2443,6 +2443,7 @@ static const char * const cs_etm_global_header_fmts[] = {
static const char * const cs_etm_priv_fmts[] = {
[CS_ETM_MAGIC] = " Magic number %llx\n",
[CS_ETM_CPU] = " CPU %lld\n",
+ [CS_ETM_NR_TRC_PARAMS] = " NR_TRC_PARAMS %llx\n",
[CS_ETM_ETMCR] = " ETMCR %llx\n",
[CS_ETM_ETMTRACEIDR] = " ETMTRACEIDR %llx\n",
[CS_ETM_ETMCCER] = " ETMCCER %llx\n",
@@ -2452,6 +2453,7 @@ static const char * const cs_etm_priv_fmts[] = {
static const char * const cs_etmv4_priv_fmts[] = {
[CS_ETM_MAGIC] = " Magic number %llx\n",
[CS_ETM_CPU] = " CPU %lld\n",
+ [CS_ETM_NR_TRC_PARAMS] = " NR_TRC_PARAMS %llx\n",
[CS_ETMV4_TRCCONFIGR] = " TRCCONFIGR %llx\n",
[CS_ETMV4_TRCTRACEIDR] = " TRCTRACEIDR %llx\n",
[CS_ETMV4_TRCIDR0] = " TRCIDR0 %llx\n",
@@ -2461,26 +2463,167 @@ static const char * const cs_etmv4_priv_fmts[] = {
[CS_ETMV4_TRCAUTHSTATUS] = " TRCAUTHSTATUS %llx\n",
};
-static void cs_etm__print_auxtrace_info(__u64 *val, int num)
+static const char * const param_unk_fmt =
+ " Unknown parameter [%d] %llx\n";
+static const char * const magic_unk_fmt =
+ " Magic number Unknown %llx\n";
+
+static int cs_etm__print_cpu_metadata_v0(__u64 *val, int *offset)
{
- int i, j, cpu = 0;
+ int i = *offset, j, nr_params = 0, fmt_offset;
+ __u64 magic;
- for (i = 0; i < CS_HEADER_VERSION_0_MAX; i++)
- fprintf(stdout, cs_etm_global_header_fmts[i], val[i]);
+ /* check magic value */
+ magic = val[i + CS_ETM_MAGIC];
+ if ((magic != __perf_cs_etmv3_magic) &&
+ (magic != __perf_cs_etmv4_magic)) {
+ /* failure - note bad magic value */
+ fprintf(stdout, magic_unk_fmt, magic);
+ return -EINVAL;
+ }
+
+ /* print common header block */
+ fprintf(stdout, cs_etm_priv_fmts[CS_ETM_MAGIC], val[i++]);
+ fprintf(stdout, cs_etm_priv_fmts[CS_ETM_CPU], val[i++]);
+
+ if (magic == __perf_cs_etmv3_magic) {
+ nr_params = CS_ETM_NR_TRC_PARAMS_V0;
+ fmt_offset = CS_ETM_ETMCR;
+ /* after common block, offset format index past NR_PARAMS */
+ for (j = fmt_offset; j < nr_params + fmt_offset; j++, i++)
+ fprintf(stdout, cs_etm_priv_fmts[j], val[i]);
+ } else if (magic == __perf_cs_etmv4_magic) {
+ nr_params = CS_ETMV4_NR_TRC_PARAMS_V0;
+ fmt_offset = CS_ETMV4_TRCCONFIGR;
+ /* after common block, offset format index past NR_PARAMS */
+ for (j = fmt_offset; j < nr_params + fmt_offset; j++, i++)
+ fprintf(stdout, cs_etmv4_priv_fmts[j], val[i]);
+ }
+ *offset = i;
+ return 0;
+}
+
+static int cs_etm__print_cpu_metadata_v1(__u64 *val, int *offset)
+{
+ int i = *offset, j, total_params = 0;
+ __u64 magic;
+
+ magic = val[i + CS_ETM_MAGIC];
+ /* total params to print is NR_PARAMS + common block size for v1 */
+ total_params = val[i + CS_ETM_NR_TRC_PARAMS] + CS_ETM_COMMON_BLK_MAX_V1;
- for (i = CS_HEADER_VERSION_0_MAX; cpu < num; cpu++) {
- if (val[i] == __perf_cs_etmv3_magic)
- for (j = 0; j < CS_ETM_PRIV_MAX; j++, i++)
+ if (magic == __perf_cs_etmv3_magic) {
+ for (j = 0; j < total_params; j++, i++) {
+ /* if newer record - could be excess params */
+ if (j >= CS_ETM_PRIV_MAX)
+ fprintf(stdout, param_unk_fmt, j, val[i]);
+ else
fprintf(stdout, cs_etm_priv_fmts[j], val[i]);
- else if (val[i] == __perf_cs_etmv4_magic)
- for (j = 0; j < CS_ETMV4_PRIV_MAX; j++, i++)
+ }
+ } else if (magic == __perf_cs_etmv4_magic) {
+ for (j = 0; j < total_params; j++, i++) {
+ /* if newer record - could be excess params */
+ if (j >= CS_ETMV4_PRIV_MAX)
+ fprintf(stdout, param_unk_fmt, j, val[i]);
+ else
fprintf(stdout, cs_etmv4_priv_fmts[j], val[i]);
- else
- /* failure.. return */
+ }
+ } else {
+ /* failure - note bad magic value and error out */
+ fprintf(stdout, magic_unk_fmt, magic);
+ return -EINVAL;
+ }
+ *offset = i;
+ return 0;
+}
+
+static void cs_etm__print_auxtrace_info(__u64 *val, int num)
+{
+ int i, cpu = 0, version, err;
+
+ /* bail out early on bad header version */
+ version = val[0];
+ if (version > CS_HEADER_CURRENT_VERSION) {
+ /* failure.. return */
+ fprintf(stdout, " Unknown Header Version = %x, ", version);
+ fprintf(stdout, "Version supported <= %x\n", CS_HEADER_CURRENT_VERSION);
+ return;
+ }
+
+ for (i = 0; i < CS_HEADER_VERSION_MAX; i++)
+ fprintf(stdout, cs_etm_global_header_fmts[i], val[i]);
+
+ for (i = CS_HEADER_VERSION_MAX; cpu < num; cpu++) {
+ if (version == 0)
+ err = cs_etm__print_cpu_metadata_v0(val, &i);
+ else if (version == 1)
+ err = cs_etm__print_cpu_metadata_v1(val, &i);
+ if (err)
return;
}
}
+/*
+ * Read a single cpu parameter block from the auxtrace_info priv block.
+ *
+ * For version 1 there is a per cpu nr_params entry. If we are handling
+ * a version 1 file, then there may be fewer, the same, or more params
+ * indicated by this value than the compile time number we understand.
+ *
+ * For a version 0 info block, there are a fixed number, and we need to
+ * fill out the nr_param value in the metadata we create.
+ */
+static u64 *cs_etm__create_meta_blk(u64 *buff_in, int *buff_in_offset,
+ int out_blk_size, int nr_params_v0)
+{
+ u64 *metadata = NULL;
+ int hdr_version;
+ int nr_in_params, nr_out_params, nr_cmn_params;
+ int i, k;
+
+ metadata = zalloc(sizeof(*metadata) * out_blk_size);
+ if (!metadata)
+ return NULL;
+
+ /* read block current index & version */
+ i = *buff_in_offset;
+ hdr_version = buff_in[CS_HEADER_VERSION];
+
+ if (!hdr_version) {
+ /* read version 0 info block into a version 1 metadata block */
+ nr_in_params = nr_params_v0;
+ metadata[CS_ETM_MAGIC] = buff_in[i + CS_ETM_MAGIC];
+ metadata[CS_ETM_CPU] = buff_in[i + CS_ETM_CPU];
+ metadata[CS_ETM_NR_TRC_PARAMS] = nr_in_params;
+ /* remaining block params at offset +1 from source */
+ for (k = CS_ETM_ETMCR - 1; k < nr_in_params; k++)
+ metadata[k+1] = buff_in[i + k];
+ /* version 0 has 2 common params */
+ nr_cmn_params = 2;
+ } else {
+ /* read version 1 info block - input and output nr_params may differ */
+ /* version 1 has 3 common params */
+ nr_cmn_params = 3;
+ nr_in_params = buff_in[i + CS_ETM_NR_TRC_PARAMS];
+
+ /* if input has more params than output - skip excess */
+ nr_out_params = nr_in_params + nr_cmn_params;
+ if (nr_out_params > out_blk_size)
+ nr_out_params = out_blk_size;
+
+ for (k = CS_ETM_MAGIC; k < nr_out_params; k++)
+ metadata[k] = buff_in[i + k];
+
+ /* record the actual nr params we copied */
+ metadata[CS_ETM_NR_TRC_PARAMS] = nr_out_params - nr_cmn_params;
+ }
+
+ /* adjust in offset by number of in params used */
+ i += nr_in_params + nr_cmn_params;
+ *buff_in_offset = i;
+ return metadata;
+}
+
int cs_etm__process_auxtrace_info(union perf_event *event,
struct perf_session *session)
{
@@ -2492,11 +2635,12 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
int info_header_size;
int total_size = auxtrace_info->header.size;
int priv_size = 0;
- int num_cpu;
- int err = 0, idx = -1;
- int i, j, k;
+ int num_cpu, trcidr_idx;
+ int err = 0;
+ int i, j;
u64 *ptr, *hdr = NULL;
u64 **metadata = NULL;
+ u64 hdr_version;
/*
* sizeof(auxtrace_info_event::type) +
@@ -2512,16 +2656,21 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
/* First the global part */
ptr = (u64 *) auxtrace_info->priv;
- /* Look for version '0' of the header */
- if (ptr[0] != 0)
+ /* Look for version of the header */
+ hdr_version = ptr[0];
+ if (hdr_version > CS_HEADER_CURRENT_VERSION) {
+ /* print routine will print an error on bad version */
+ if (dump_trace)
+ cs_etm__print_auxtrace_info(auxtrace_info->priv, 0);
return -EINVAL;
+ }
- hdr = zalloc(sizeof(*hdr) * CS_HEADER_VERSION_0_MAX);
+ hdr = zalloc(sizeof(*hdr) * CS_HEADER_VERSION_MAX);
if (!hdr)
return -ENOMEM;
/* Extract header information - see cs-etm.h for format */
- for (i = 0; i < CS_HEADER_VERSION_0_MAX; i++)
+ for (i = 0; i < CS_HEADER_VERSION_MAX; i++)
hdr[i] = ptr[i];
num_cpu = hdr[CS_PMU_TYPE_CPUS] & 0xffffffff;
pmu_type = (unsigned int) ((hdr[CS_PMU_TYPE_CPUS] >> 32) &
@@ -2552,35 +2701,31 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
*/
for (j = 0; j < num_cpu; j++) {
if (ptr[i] == __perf_cs_etmv3_magic) {
- metadata[j] = zalloc(sizeof(*metadata[j]) *
- CS_ETM_PRIV_MAX);
- if (!metadata[j]) {
- err = -ENOMEM;
- goto err_free_metadata;
- }
- for (k = 0; k < CS_ETM_PRIV_MAX; k++)
- metadata[j][k] = ptr[i + k];
+ metadata[j] =
+ cs_etm__create_meta_blk(ptr, &i,
+ CS_ETM_PRIV_MAX,
+ CS_ETM_NR_TRC_PARAMS_V0);
/* The traceID is our handle */
- idx = metadata[j][CS_ETM_ETMTRACEIDR];
- i += CS_ETM_PRIV_MAX;
+ trcidr_idx = CS_ETM_ETMTRACEIDR;
+
} else if (ptr[i] == __perf_cs_etmv4_magic) {
- metadata[j] = zalloc(sizeof(*metadata[j]) *
- CS_ETMV4_PRIV_MAX);
- if (!metadata[j]) {
- err = -ENOMEM;
- goto err_free_metadata;
- }
- for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
- metadata[j][k] = ptr[i + k];
+ metadata[j] =
+ cs_etm__create_meta_blk(ptr, &i,
+ CS_ETMV4_PRIV_MAX,
+ CS_ETMV4_NR_TRC_PARAMS_V0);
/* The traceID is our handle */
- idx = metadata[j][CS_ETMV4_TRCTRACEIDR];
- i += CS_ETMV4_PRIV_MAX;
+ trcidr_idx = CS_ETMV4_TRCTRACEIDR;
+ }
+
+ if (!metadata[j]) {
+ err = -ENOMEM;
+ goto err_free_metadata;
}
/* Get an RB node for this CPU */
- inode = intlist__findnew(traceid_list, idx);
+ inode = intlist__findnew(traceid_list, metadata[j][trcidr_idx]);
/* Something went wrong, no need to continue */
if (!inode) {
@@ -2601,7 +2746,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
}
/*
- * Each of CS_HEADER_VERSION_0_MAX, CS_ETM_PRIV_MAX and
+ * Each of CS_HEADER_VERSION_MAX, CS_ETM_PRIV_MAX and
* CS_ETMV4_PRIV_MAX mark how many double words are in the
* global metadata, and each cpu's metadata respectively.
* The following tests if the correct number of double words was
@@ -2703,6 +2848,12 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
intlist__delete(traceid_list);
err_free_hdr:
zfree(&hdr);
-
+ /*
+ * At this point, as a minimum we have a valid header. Dump the rest of
+ * the info section - the print routines will error out on structural
+ * issues.
+ */
+ if (dump_trace)
+ cs_etm__print_auxtrace_info(auxtrace_info->priv, num_cpu);
return err;
}
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 4ad925d6d799..e153d02df0de 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -17,23 +17,37 @@ struct perf_session;
*/
enum {
/* Starting with 0x0 */
- CS_HEADER_VERSION_0,
+ CS_HEADER_VERSION,
/* PMU->type (32 bit), total # of CPUs (32 bit) */
CS_PMU_TYPE_CPUS,
CS_ETM_SNAPSHOT,
- CS_HEADER_VERSION_0_MAX,
+ CS_HEADER_VERSION_MAX,
};
+/*
+ * Update the version for new format.
+ *
+ * New version 1 format adds a param count to the per cpu metadata.
+ * This allows easy adding of new metadata parameters.
+ * Requires that new params always added after current ones.
+ * Also allows client reader to handle file versions that are different by
+ * checking the number of params in the file vs the number expected.
+ */
+#define CS_HEADER_CURRENT_VERSION 1
+
/* Beginning of header common to both ETMv3 and V4 */
enum {
CS_ETM_MAGIC,
CS_ETM_CPU,
+ /* Number of trace config params in following ETM specific block */
+ CS_ETM_NR_TRC_PARAMS,
+ CS_ETM_COMMON_BLK_MAX_V1,
};
/* ETMv3/PTM metadata */
enum {
/* Dynamic, configurable parameters */
- CS_ETM_ETMCR = CS_ETM_CPU + 1,
+ CS_ETM_ETMCR = CS_ETM_COMMON_BLK_MAX_V1,
CS_ETM_ETMTRACEIDR,
/* RO, taken from sysFS */
CS_ETM_ETMCCER,
@@ -41,10 +55,13 @@ enum {
CS_ETM_PRIV_MAX,
};
+/* define fixed version 0 length - allow new format reader to read old files. */
+#define CS_ETM_NR_TRC_PARAMS_V0 (CS_ETM_ETMIDR - CS_ETM_ETMCR + 1)
+
/* ETMv4 metadata */
enum {
/* Dynamic, configurable parameters */
- CS_ETMV4_TRCCONFIGR = CS_ETM_CPU + 1,
+ CS_ETMV4_TRCCONFIGR = CS_ETM_COMMON_BLK_MAX_V1,
CS_ETMV4_TRCTRACEIDR,
/* RO, taken from sysFS */
CS_ETMV4_TRCIDR0,
@@ -55,6 +72,9 @@ enum {
CS_ETMV4_PRIV_MAX,
};
+/* define fixed version 0 length - allow new format reader to read old files. */
+#define CS_ETMV4_NR_TRC_PARAMS_V0 (CS_ETMV4_TRCAUTHSTATUS - CS_ETMV4_TRCCONFIGR + 1)
+
/*
* ETMv3 exception encoding number:
* See Embedded Trace Macrocell spcification (ARM IHI 0014Q)
@@ -162,7 +182,7 @@ struct cs_etm_packet_queue {
#define BMVAL(val, lsb, msb) ((val & GENMASK(msb, lsb)) >> lsb)
-#define CS_ETM_HEADER_SIZE (CS_HEADER_VERSION_0_MAX * sizeof(u64))
+#define CS_ETM_HEADER_SIZE (CS_HEADER_VERSION_MAX * sizeof(u64))
#define __perf_cs_etmv3_magic 0x3030303030303030ULL
#define __perf_cs_etmv4_magic 0x4040404040404040ULL
--
2.17.1
The current fixed metadata format (version 0) means that adding
metadata parameter items renders files from a previous version of perf
unreadable. Per CPU parameters appear in a fixed order, but there is no
field to indicate the number of ETM parameters per CPU.
This patch updates the per CPU parameter blocks to include a NR_TRC_PARAMS
value which indicates the number of trace parameters in the block.
The header version is incremented to 1. Fixed ordering is retained;
new ETM parameters are added to the end of the list.
The reader code is updated so that it can still read version 0 files.
For version 1, the reader will read the number of parameters in the
per CPU block. This allows the reader to process older or newer files
that may have different numbers of parameters than in use at the
time perf was built.
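As an illustration of the reader behaviour described above, here is a
minimal standalone sketch. It is not the code in this patch: the names
(BLK_*, read_cpu_block) are hypothetical and only mirror the per CPU block
layout, and the caller is assumed to pass the fixed version 0 parameter
count. It shows a version 0 block being widened to the version 1 layout,
and a version 1 block with more parameters than the reader knows about
being clamped while the input offset still advances past the whole block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical indices mirroring the per CPU common block (illustration only). */
enum { BLK_MAGIC, BLK_CPU, BLK_NR_TRC_PARAMS, BLK_COMMON_MAX_V1 };
#define BLK_COMMON_MAX_V0 2	/* version 0 has only magic + cpu */

/*
 * Copy one per CPU block from 'in' into 'out', which always uses the
 * version 1 layout and holds 'out_size' 64-bit words.
 * 'nr_params_v0' is the fixed parameter count assumed for version 0 input.
 * Returns the number of input words consumed, so the caller can step to
 * the next CPU block.
 */
static size_t read_cpu_block(uint64_t version, const uint64_t *in,
			     size_t nr_params_v0,
			     uint64_t *out, size_t out_size)
{
	size_t nr_in, nr_cmn_in, nr_copy, k;

	assert(in && out && out_size >= BLK_COMMON_MAX_V1);

	if (version == 0) {
		/* no count in the file: use the fixed version 0 length */
		nr_cmn_in = BLK_COMMON_MAX_V0;
		nr_in = nr_params_v0;
	} else {
		/* version 1: the block carries its own parameter count */
		nr_cmn_in = BLK_COMMON_MAX_V1;
		nr_in = (size_t)in[BLK_NR_TRC_PARAMS];
	}

	for (k = 0; k < out_size; k++)
		out[k] = 0;

	out[BLK_MAGIC] = in[BLK_MAGIC];
	out[BLK_CPU] = in[BLK_CPU];

	/* if the file has more params than we understand, skip the excess */
	nr_copy = nr_in;
	if (nr_copy > out_size - BLK_COMMON_MAX_V1)
		nr_copy = out_size - BLK_COMMON_MAX_V1;

	for (k = 0; k < nr_copy; k++)
		out[BLK_COMMON_MAX_V1 + k] = in[nr_cmn_in + k];

	/* record how many params were actually copied */
	out[BLK_NR_TRC_PARAMS] = nr_copy;

	/* consume the whole input block, including any skipped params */
	return nr_cmn_in + nr_in;
}
```

The point of returning the consumed length separately from the copied
count is that a newer file can be walked block by block even when some of
its trailing parameters are unknown to the reader.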
Changes since v2:
1. Add error path print to improve --dump logging.
2. Replace some hardcoded values with enum consts (Mathieu).
Changes since v1 (from review by Leo):
1. Split printing routine into sub functions per version.
2. Renamed NR_PARAMs to NR_TRC_PARAMs to emphasise use as count of trace
related parameters, not total block parameters.
3. Misc other fixes.
Signed-off-by: Mike Leach <mike.leach(a)linaro.org>
---
tools/perf/arch/arm/util/cs-etm.c | 7 +-
tools/perf/util/cs-etm.c | 235 ++++++++++++++++++++++++------
tools/perf/util/cs-etm.h | 30 +++-
3 files changed, 223 insertions(+), 49 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index bd446aba64f7..b0470f2a955a 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -572,7 +572,7 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
struct auxtrace_record *itr,
struct perf_record_auxtrace_info *info)
{
- u32 increment;
+ u32 increment, nr_trc_params;
u64 magic;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
@@ -607,6 +607,7 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
/* How much space was used */
increment = CS_ETMV4_PRIV_MAX;
+ nr_trc_params = CS_ETMV4_PRIV_MAX - CS_ETMV4_TRCCONFIGR;
} else {
magic = __perf_cs_etmv3_magic;
/* Get configuration register */
@@ -624,11 +625,13 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
/* How much space was used */
increment = CS_ETM_PRIV_MAX;
+ nr_trc_params = CS_ETM_PRIV_MAX - CS_ETM_ETMCR;
}
/* Build generic header portion */
info->priv[*offset + CS_ETM_MAGIC] = magic;
info->priv[*offset + CS_ETM_CPU] = cpu;
+ info->priv[*offset + CS_ETM_NR_TRC_PARAMS] = nr_trc_params;
/* Where the next CPU entry should start from */
*offset += increment;
}
@@ -674,7 +677,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
/* First fill out the session header */
info->type = PERF_AUXTRACE_CS_ETM;
- info->priv[CS_HEADER_VERSION_0] = 0;
+ info->priv[CS_HEADER_VERSION] = CS_HEADER_CURRENT_VERSION;
info->priv[CS_PMU_TYPE_CPUS] = type << 32;
info->priv[CS_PMU_TYPE_CPUS] |= nr_cpu;
info->priv[CS_ETM_SNAPSHOT] = ptr->snapshot_mode;
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index a2a369e2fbb6..36241e90f337 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2435,7 +2435,7 @@ static bool cs_etm__is_timeless_decoding(struct cs_etm_auxtrace *etm)
}
static const char * const cs_etm_global_header_fmts[] = {
- [CS_HEADER_VERSION_0] = " Header version %llx\n",
+ [CS_HEADER_VERSION] = " Header version %llx\n",
[CS_PMU_TYPE_CPUS] = " PMU type/num cpus %llx\n",
[CS_ETM_SNAPSHOT] = " Snapshot %llx\n",
};
@@ -2443,6 +2443,7 @@ static const char * const cs_etm_global_header_fmts[] = {
static const char * const cs_etm_priv_fmts[] = {
[CS_ETM_MAGIC] = " Magic number %llx\n",
[CS_ETM_CPU] = " CPU %lld\n",
+ [CS_ETM_NR_TRC_PARAMS] = " NR_TRC_PARAMS %llx\n",
[CS_ETM_ETMCR] = " ETMCR %llx\n",
[CS_ETM_ETMTRACEIDR] = " ETMTRACEIDR %llx\n",
[CS_ETM_ETMCCER] = " ETMCCER %llx\n",
@@ -2452,6 +2453,7 @@ static const char * const cs_etm_priv_fmts[] = {
static const char * const cs_etmv4_priv_fmts[] = {
[CS_ETM_MAGIC] = " Magic number %llx\n",
[CS_ETM_CPU] = " CPU %lld\n",
+ [CS_ETM_NR_TRC_PARAMS] = " NR_TRC_PARAMS %llx\n",
[CS_ETMV4_TRCCONFIGR] = " TRCCONFIGR %llx\n",
[CS_ETMV4_TRCTRACEIDR] = " TRCTRACEIDR %llx\n",
[CS_ETMV4_TRCIDR0] = " TRCIDR0 %llx\n",
@@ -2461,26 +2463,167 @@ static const char * const cs_etmv4_priv_fmts[] = {
[CS_ETMV4_TRCAUTHSTATUS] = " TRCAUTHSTATUS %llx\n",
};
-static void cs_etm__print_auxtrace_info(__u64 *val, int num)
+static const char * const param_unk_fmt =
+ " Unknown parameter [%d] %llx\n";
+static const char * const magic_unk_fmt =
+ " Magic number Unknown %llx\n";
+
+static int cs_etm__print_cpu_metadata_v0(__u64 *val, int *offset)
{
- int i, j, cpu = 0;
+ int i = *offset, j, nr_params = 0, fmt_offset;
+ __u64 magic;
- for (i = 0; i < CS_HEADER_VERSION_0_MAX; i++)
- fprintf(stdout, cs_etm_global_header_fmts[i], val[i]);
+ /* check magic value */
+ magic = val[i + CS_ETM_MAGIC];
+ if ((magic != __perf_cs_etmv3_magic) &&
+ (magic != __perf_cs_etmv4_magic)) {
+ /* failure - note bad magic value */
+ fprintf(stdout, magic_unk_fmt, magic);
+ return -EINVAL;
+ }
+
+ /* print common header block */
+ fprintf(stdout, cs_etm_priv_fmts[CS_ETM_MAGIC], val[i++]);
+ fprintf(stdout, cs_etm_priv_fmts[CS_ETM_CPU], val[i++]);
+
+ if (magic == __perf_cs_etmv3_magic) {
+ nr_params = CS_ETM_NR_TRC_PARAMS_V0;
+ fmt_offset = CS_ETM_ETMCR;
+ /* after common block, offset format index past NR_PARAMS */
+ for (j = fmt_offset; j < nr_params + fmt_offset; j++, i++)
+ fprintf(stdout, cs_etm_priv_fmts[j], val[i]);
+ } else if (magic == __perf_cs_etmv4_magic) {
+ nr_params = CS_ETMV4_NR_TRC_PARAMS_V0;
+ fmt_offset = CS_ETMV4_TRCCONFIGR;
+ /* after common block, offset format index past NR_PARAMS */
+ for (j = fmt_offset; j < nr_params + fmt_offset; j++, i++)
+ fprintf(stdout, cs_etmv4_priv_fmts[j], val[i]);
+ }
+ *offset = i;
+ return 0;
+}
+
+static int cs_etm__print_cpu_metadata_v1(__u64 *val, int *offset)
+{
+ int i = *offset, j, total_params = 0;
+ __u64 magic;
+
+ magic = val[i + CS_ETM_MAGIC];
+ /* total params to print is NR_PARAMS + common block size for v1 */
+ total_params = val[i + CS_ETM_NR_TRC_PARAMS] + CS_ETM_COMMON_BLK_MAX_V1;
- for (i = CS_HEADER_VERSION_0_MAX; cpu < num; cpu++) {
- if (val[i] == __perf_cs_etmv3_magic)
- for (j = 0; j < CS_ETM_PRIV_MAX; j++, i++)
+ if (magic == __perf_cs_etmv3_magic) {
+ for (j = 0; j < total_params; j++, i++) {
+ /* if newer record - could be excess params */
+ if (j >= CS_ETM_PRIV_MAX)
+ fprintf(stdout, param_unk_fmt, j, val[i]);
+ else
fprintf(stdout, cs_etm_priv_fmts[j], val[i]);
- else if (val[i] == __perf_cs_etmv4_magic)
- for (j = 0; j < CS_ETMV4_PRIV_MAX; j++, i++)
+ }
+ } else if (magic == __perf_cs_etmv4_magic) {
+ for (j = 0; j < total_params; j++, i++) {
+ /* if newer record - could be excess params */
+ if (j >= CS_ETMV4_PRIV_MAX)
+ fprintf(stdout, param_unk_fmt, j, val[i]);
+ else
fprintf(stdout, cs_etmv4_priv_fmts[j], val[i]);
- else
- /* failure.. return */
+ }
+ } else {
+ /* failure - note bad magic value and error out */
+ fprintf(stdout, magic_unk_fmt, magic);
+ return -EINVAL;
+ }
+ *offset = i;
+ return 0;
+}
+
+static void cs_etm__print_auxtrace_info(__u64 *val, int num)
+{
+ int i, cpu = 0, version, err;
+
+ /* bail out early on bad header version */
+ version = val[0];
+ if (version > CS_HEADER_CURRENT_VERSION) {
+ /* failure.. return */
+ fprintf(stdout, " Unknown Header Version = %x, ", version);
+ fprintf(stdout, "Version supported <= %x\n", CS_HEADER_CURRENT_VERSION);
+ return;
+ }
+
+ for (i = 0; i < CS_HEADER_VERSION_MAX; i++)
+ fprintf(stdout, cs_etm_global_header_fmts[i], val[i]);
+
+ for (i = CS_HEADER_VERSION_MAX; cpu < num; cpu++) {
+ if (version == 0)
+ err = cs_etm__print_cpu_metadata_v0(val, &i);
+ else if (version == 1)
+ err = cs_etm__print_cpu_metadata_v1(val, &i);
+ if (err)
return;
}
}
+/*
+ * Read a single cpu parameter block from the auxtrace_info priv block.
+ *
+ * For version 1 there is a per cpu nr_params entry. If we are handling
+ * a version 1 file, then there may be fewer, the same, or more params
+ * indicated by this value than the compile time number we understand.
+ *
+ * For a version 0 info block, there are a fixed number, and we need to
+ * fill out the nr_param value in the metadata we create.
+ */
+static u64 *cs_etm__create_meta_blk(u64 *buff_in, int *buff_in_offset,
+ int out_blk_size, int nr_params_v0)
+{
+ u64 *metadata = NULL;
+ int hdr_version;
+ int nr_in_params, nr_out_params, nr_cmn_params;
+ int i, k;
+
+ metadata = zalloc(sizeof(*metadata) * out_blk_size);
+ if (!metadata)
+ return NULL;
+
+ /* read block current index & version */
+ i = *buff_in_offset;
+ hdr_version = buff_in[CS_HEADER_VERSION];
+
+ if (!hdr_version) {
+ /* read version 0 info block into a version 1 metadata block */
+ nr_in_params = nr_params_v0;
+ metadata[CS_ETM_MAGIC] = buff_in[i + CS_ETM_MAGIC];
+ metadata[CS_ETM_CPU] = buff_in[i + CS_ETM_CPU];
+ metadata[CS_ETM_NR_TRC_PARAMS] = nr_in_params;
+ /* remaining block params at offset +1 from source */
+ for (k = CS_ETM_ETMCR; k < nr_in_params; k++)
+ metadata[k+1] = buff_in[i + k];
+ /* version 0 has 2 common params */
+ nr_cmn_params = 2;
+ } else {
+ /* read version 1 info block - input and output nr_params may differ */
+ /* version 1 has 3 common params */
+ nr_cmn_params = 3;
+ nr_in_params = buff_in[i + CS_ETM_NR_TRC_PARAMS];
+
+ /* if input has more params than output - skip excess */
+ nr_out_params = nr_in_params + nr_cmn_params;
+ if (nr_out_params > out_blk_size)
+ nr_out_params = out_blk_size;
+
+ for (k = CS_ETM_MAGIC; k < nr_out_params; k++)
+ metadata[k] = buff_in[i + k];
+
+ /* record the actual nr params we copied */
+ metadata[CS_ETM_NR_TRC_PARAMS] = nr_out_params - nr_cmn_params;
+ }
+
+ /* adjust in offset by number of in params used */
+ i += nr_in_params + nr_cmn_params;
+ *buff_in_offset = i;
+ return metadata;
+}
+
int cs_etm__process_auxtrace_info(union perf_event *event,
struct perf_session *session)
{
@@ -2492,11 +2635,12 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
int info_header_size;
int total_size = auxtrace_info->header.size;
int priv_size = 0;
- int num_cpu;
- int err = 0, idx = -1;
- int i, j, k;
+ int num_cpu, trcidr_idx;
+ int err = 0;
+ int i, j;
u64 *ptr, *hdr = NULL;
u64 **metadata = NULL;
+ u64 hdr_version;
/*
* sizeof(auxtrace_info_event::type) +
@@ -2512,16 +2656,21 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
/* First the global part */
ptr = (u64 *) auxtrace_info->priv;
- /* Look for version '0' of the header */
- if (ptr[0] != 0)
+ /* Look for version of the header */
+ hdr_version = ptr[0];
+ if (hdr_version > CS_HEADER_CURRENT_VERSION) {
+ /* print routine will print an error on bad version */
+ if (dump_trace)
+ cs_etm__print_auxtrace_info(auxtrace_info->priv, 0);
return -EINVAL;
+ }
- hdr = zalloc(sizeof(*hdr) * CS_HEADER_VERSION_0_MAX);
+ hdr = zalloc(sizeof(*hdr) * CS_HEADER_VERSION_MAX);
if (!hdr)
return -ENOMEM;
/* Extract header information - see cs-etm.h for format */
- for (i = 0; i < CS_HEADER_VERSION_0_MAX; i++)
+ for (i = 0; i < CS_HEADER_VERSION_MAX; i++)
hdr[i] = ptr[i];
num_cpu = hdr[CS_PMU_TYPE_CPUS] & 0xffffffff;
pmu_type = (unsigned int) ((hdr[CS_PMU_TYPE_CPUS] >> 32) &
@@ -2552,35 +2701,31 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
*/
for (j = 0; j < num_cpu; j++) {
if (ptr[i] == __perf_cs_etmv3_magic) {
- metadata[j] = zalloc(sizeof(*metadata[j]) *
- CS_ETM_PRIV_MAX);
- if (!metadata[j]) {
- err = -ENOMEM;
- goto err_free_metadata;
- }
- for (k = 0; k < CS_ETM_PRIV_MAX; k++)
- metadata[j][k] = ptr[i + k];
+ metadata[j] =
+ cs_etm__create_meta_blk(ptr, &i,
+ CS_ETM_PRIV_MAX,
+ CS_ETM_NR_TRC_PARAMS_V0);
/* The traceID is our handle */
- idx = metadata[j][CS_ETM_ETMTRACEIDR];
- i += CS_ETM_PRIV_MAX;
+ trcidr_idx = CS_ETM_ETMTRACEIDR;
+
} else if (ptr[i] == __perf_cs_etmv4_magic) {
- metadata[j] = zalloc(sizeof(*metadata[j]) *
- CS_ETMV4_PRIV_MAX);
- if (!metadata[j]) {
- err = -ENOMEM;
- goto err_free_metadata;
- }
- for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
- metadata[j][k] = ptr[i + k];
+ metadata[j] =
+ cs_etm__create_meta_blk(ptr, &i,
+ CS_ETMV4_PRIV_MAX,
+ CS_ETMV4_NR_TRC_PARAMS_V0);
/* The traceID is our handle */
- idx = metadata[j][CS_ETMV4_TRCTRACEIDR];
- i += CS_ETMV4_PRIV_MAX;
+ trcidr_idx = CS_ETMV4_TRCTRACEIDR;
+ }
+
+ if (!metadata[j]) {
+ err = -ENOMEM;
+ goto err_free_metadata;
}
/* Get an RB node for this CPU */
- inode = intlist__findnew(traceid_list, idx);
+ inode = intlist__findnew(traceid_list, metadata[j][trcidr_idx]);
/* Something went wrong, no need to continue */
if (!inode) {
@@ -2601,7 +2746,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
}
/*
- * Each of CS_HEADER_VERSION_0_MAX, CS_ETM_PRIV_MAX and
+ * Each of CS_HEADER_VERSION_MAX, CS_ETM_PRIV_MAX and
* CS_ETMV4_PRIV_MAX mark how many double words are in the
* global metadata, and each cpu's metadata respectively.
* The following tests if the correct number of double words was
@@ -2703,6 +2848,12 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
intlist__delete(traceid_list);
err_free_hdr:
zfree(&hdr);
-
+ /*
+ * At this point, as a minimum we have a valid header. Dump the rest of
+ * the info section - the print routines will error out on structural
+ * issues.
+ */
+ if (dump_trace)
+ cs_etm__print_auxtrace_info(auxtrace_info->priv, num_cpu);
return err;
}
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 4ad925d6d799..e153d02df0de 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -17,23 +17,37 @@ struct perf_session;
*/
enum {
/* Starting with 0x0 */
- CS_HEADER_VERSION_0,
+ CS_HEADER_VERSION,
/* PMU->type (32 bit), total # of CPUs (32 bit) */
CS_PMU_TYPE_CPUS,
CS_ETM_SNAPSHOT,
- CS_HEADER_VERSION_0_MAX,
+ CS_HEADER_VERSION_MAX,
};
+/*
+ * Update the version for new format.
+ *
+ * New version 1 format adds a param count to the per cpu metadata.
+ * This allows easy adding of new metadata parameters.
+ * Requires that new params always added after current ones.
+ * Also allows client reader to handle file versions that are different by
+ * checking the number of params in the file vs the number expected.
+ */
+#define CS_HEADER_CURRENT_VERSION 1
+
/* Beginning of header common to both ETMv3 and V4 */
enum {
CS_ETM_MAGIC,
CS_ETM_CPU,
+ /* Number of trace config params in following ETM specific block */
+ CS_ETM_NR_TRC_PARAMS,
+ CS_ETM_COMMON_BLK_MAX_V1,
};
/* ETMv3/PTM metadata */
enum {
/* Dynamic, configurable parameters */
- CS_ETM_ETMCR = CS_ETM_CPU + 1,
+ CS_ETM_ETMCR = CS_ETM_COMMON_BLK_MAX_V1,
CS_ETM_ETMTRACEIDR,
/* RO, taken from sysFS */
CS_ETM_ETMCCER,
@@ -41,10 +55,13 @@ enum {
CS_ETM_PRIV_MAX,
};
+/* define fixed version 0 length - allow new format reader to read old files. */
+#define CS_ETM_NR_TRC_PARAMS_V0 (CS_ETM_ETMIDR - CS_ETM_ETMCR + 1)
+
/* ETMv4 metadata */
enum {
/* Dynamic, configurable parameters */
- CS_ETMV4_TRCCONFIGR = CS_ETM_CPU + 1,
+ CS_ETMV4_TRCCONFIGR = CS_ETM_COMMON_BLK_MAX_V1,
CS_ETMV4_TRCTRACEIDR,
/* RO, taken from sysFS */
CS_ETMV4_TRCIDR0,
@@ -55,6 +72,9 @@ enum {
CS_ETMV4_PRIV_MAX,
};
+/* define fixed version 0 length - allow new format reader to read old files. */
+#define CS_ETMV4_NR_TRC_PARAMS_V0 (CS_ETMV4_TRCAUTHSTATUS - CS_ETMV4_TRCCONFIGR + 1)
+
/*
* ETMv3 exception encoding number:
* See Embedded Trace Macrocell spcification (ARM IHI 0014Q)
@@ -162,7 +182,7 @@ struct cs_etm_packet_queue {
#define BMVAL(val, lsb, msb) ((val & GENMASK(msb, lsb)) >> lsb)
-#define CS_ETM_HEADER_SIZE (CS_HEADER_VERSION_0_MAX * sizeof(u64))
+#define CS_ETM_HEADER_SIZE (CS_HEADER_VERSION_MAX * sizeof(u64))
#define __perf_cs_etmv3_magic 0x3030303030303030ULL
#define __perf_cs_etmv4_magic 0x4040404040404040ULL
--
2.17.1