This series enables future IP trace features Embedded Trace Extension (ETE) and Trace Buffer Extension (TRBE). This series applies on v5.12-rc4 + some patches queued. A standalone tree is also available here [0]. The queued patches (almost there) are included in this posting for the sake of constructing a tree from the posting.
ETE is the PE (CPU) trace unit for CPUs, implementing future architecture extensions. ETE overlaps with the ETMv4 architecture, with additions to support the newer architecture features and some restrictions on the supported features w.r.t ETMv4. The ETE support is added by extending the ETMv4 driver to recognise the ETE and handle the features as exposed by the TRCIDRx registers. ETE only supports system instructions access from the host CPU. The ETE could be integrated with a TRBE (see below), or with the legacy CoreSight trace bus (e.g, ETRs). Thus the ETE follows same firmware description as the ETMs and requires a node per instance.
Trace Buffer Extensions (TRBE) implements a per CPU trace buffer, which is accessible via the system registers and can be combined with the ETE to provide a 1x1 configuration of source & sink. TRBE is being represented here as a CoreSight sink. Primary reason is that the ETE source could work with other traditional CoreSight sink devices. As TRBE captures the trace data which is produced by ETE, it cannot work alone.
TRBE representation here have some distinct deviations from a traditional CoreSight sink device. Coresight path between ETE and TRBE are not built during boot looking at respective DT or ACPI entries.
Unlike traditional sinks, TRBE can generate interrupts to signal including many other things, buffer got filled. The interrupt is a PPI and should be communicated from the platform. DT or ACPI entry representing TRBE should have the PPI number for a given platform. During perf session, the TRBE IRQ handler should capture trace for perf auxiliary buffer before restarting it back. System registers being used here to configure ETE and TRBE could be referred in the link below.
https://developer.arm.com/docs/ddi0601/g/aarch64-system-registers.
[0] https://gitlab.arm.com/linux-arm/linux-skp/-/tree/coresight/ete/v5/
Changes since V4:
https://lkml.kernel.org/r/20210225193543.2920532-1-suzuki.poulose@arm.com
- Split the documentation patches from the TRBE driver - Drop the patches already queued for v5.12. - Mark the buffer TRUNCATED in case of a WRAP event - Fix error code for vmap failure - Fix build break on arm32 for per-cpu sink patch - Address comments on ETE dts bindings. - Make ete_sysreg_read/write static (kernel test robot) - Merged ete sysreg definition patch with ete support, to avoid a "defined but unused warning" on build in part of the series. - Add new bindings to MAINTAINERS
Changes since V3:
https://lore.kernel.org/linux-arm-kernel/1611737738-1493-1-git-send-email-an...
- ETE and TRBE changes have been captured in the respective patches - Better support for nVHE - Re-ordered and splitted the patches to keep the changes separate for the generic/arm64 tree from CoreSight driver specific changes. - Fixes for KVM handling of Trace/SPE
Changes since V2:
https://lore.kernel.org/linux-arm-kernel/1610511498-4058-1-git-send-email-an...
- Rebased on coresight/next - Changed DT bindings for ETE - Included additional patches for arm64 nvhe, perf aux buffer flags etc - TRBE changes have been captured in the respective patches
Changes since V1:
https://lore.kernel.org/linux-arm-kernel/1608717823-18387-1-git-send-email-a...
- Converted both ETE and TRBE DT bindings into Yaml - TRBE changes have been captured in the respective patches
Changes since RFC:
- There are not much ETE changes from Suzuki apart from splitting of the ETE DTS patch - TRBE changes have been captured in the respective patches
RFC:
https://lore.kernel.org/linux-arm-kernel/1605012309-24812-1-git-send-email-a...
Cc: Will Deacon will@kernel.org Cc: Marc Zyngier maz@kernel.org Cc: Peter Zilstra peterz@infradead.org Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Suzuki K Poulose suzuki.poulose@arm.com Cc: Mike Leach mike.leach@linaro.org Cc: Linu Cherian lcherian@marvell.com Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org
Anshuman Khandual (5): arm64: Add TRBE definitions coresight: core: Add support for dedicated percpu sinks coresight: sink: Add TRBE driver Documentation: coresight: trbe: Sysfs ABI description Documentation: trace: Add documentation for TRBE
Suzuki K Poulose (14): [Queued] kvm: arm64: Hide system instruction access to Trace registers [Queued] kvm: arm64: Disable guest access to trace filter controls perf: aux: Add flags for the buffer format perf: aux: Add CoreSight PMU buffer formats arm64: Add support for trace synchronization barrier arm64: kvm: Enable access to TRBE support for host coresight: etm4x: Move ETM to prohibited region for disable coresight: etm-perf: Allow an event to use different sinks coresight: Do not scan for graph if none is present coresight: etm4x: Add support for PE OS lock coresight: ete: Add support for ETE tracing dts: bindings: Document device tree bindings for ETE coresight: etm-perf: Handle stale output handles dts: bindings: Document device tree bindings for Arm TRBE
.../testing/sysfs-bus-coresight-devices-trbe | 14 + .../devicetree/bindings/arm/ete.yaml | 75 ++ .../devicetree/bindings/arm/trbe.yaml | 49 + .../trace/coresight/coresight-trbe.rst | 38 + MAINTAINERS | 2 + arch/arm64/include/asm/barrier.h | 1 + arch/arm64/include/asm/el2_setup.h | 13 + arch/arm64/include/asm/kvm_arm.h | 3 + arch/arm64/include/asm/kvm_host.h | 2 + arch/arm64/include/asm/sysreg.h | 50 + arch/arm64/kernel/cpufeature.c | 1 - arch/arm64/kernel/hyp-stub.S | 3 +- arch/arm64/kvm/debug.c | 6 +- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 + arch/arm64/kvm/hyp/nvhe/switch.c | 1 + drivers/hwtracing/coresight/Kconfig | 24 +- drivers/hwtracing/coresight/Makefile | 1 + drivers/hwtracing/coresight/coresight-core.c | 29 +- .../hwtracing/coresight/coresight-etm-perf.c | 119 +- .../coresight/coresight-etm4x-core.c | 161 ++- .../coresight/coresight-etm4x-sysfs.c | 19 +- drivers/hwtracing/coresight/coresight-etm4x.h | 83 +- .../hwtracing/coresight/coresight-platform.c | 6 + drivers/hwtracing/coresight/coresight-priv.h | 3 + drivers/hwtracing/coresight/coresight-trbe.c | 1157 +++++++++++++++++ drivers/hwtracing/coresight/coresight-trbe.h | 152 +++ include/linux/coresight.h | 13 + include/uapi/linux/perf_event.h | 13 +- 28 files changed, 2016 insertions(+), 64 deletions(-) create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml create mode 100644 Documentation/devicetree/bindings/arm/trbe.yaml create mode 100644 Documentation/trace/coresight/coresight-trbe.rst create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
Currently we advertise the ID_AA6DFR0_EL1.TRACEVER for the guest, when the trace register accesses are trapped (CPTR_EL2.TTA == 1). So, the guest will get an undefined instruction, if trusts the ID registers and access one of the trace registers. Lets be nice to the guest and hide the feature to avoid unexpected behavior.
Even though this can be done at KVM sysreg emulation layer, we do this by removing the TRACEVER from the sanitised feature register field. This is fine as long as the ETM drivers can handle the individual trace units separately, even when there are differences among the CPUs.
Cc: Marc Zyngier maz@kernel.org Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Mark Rutland mark.rutland@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com -- Note: Marc has indicated that he will be picking this patch I have included in the series for ease of testing. --- arch/arm64/kernel/cpufeature.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 066030717a4c..a4698f09bf32 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -383,7 +383,6 @@ static const struct arm64_ftr_bits ftr_id_aa64dfr0[] = { * of support. */ S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_AA64DFR0_PMUVER_SHIFT, 4, 0), - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64DFR0_TRACEVER_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64DFR0_DEBUGVER_SHIFT, 4, 0x6), ARM64_FTR_END, };
Disable guest access to the Trace Filter control registers. We do not advertise the Trace filter feature to the guest (ID_AA64DFR0_EL1: TRACE_FILT is cleared) already, but the guest can still access the TRFCR_EL1 unless we trap it.
This will also make sure that the guest cannot fiddle with the filtering controls set by a nvhe host.
Cc: Marc Zyngier maz@kernel.org Cc: Will Deacon will@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- Note: Marc has indicated that this will be queued for 5.12. I have included this in the series for the sake of constructing a full tree from the posting --- arch/arm64/include/asm/kvm_arm.h | 1 + arch/arm64/kvm/debug.c | 2 ++ 2 files changed, 3 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 4e90c2debf70..94d4025acc0b 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,7 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1
/* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) #define MDCR_EL2_E2PB_SHIFT (UL(12)) diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index 7a7e425616b5..dbc890511631 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,6 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) * - Debug ROM Address (MDCR_EL2_TDRA) * - OS related registers (MDCR_EL2_TDOSA) * - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB) + * - Self-hosted Trace Filter controls (MDCR_EL2_TTRF) * * Additionally, KVM only traps guest accesses to the debug registers if * the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY @@ -112,6 +113,7 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM | MDCR_EL2_TPMS | + MDCR_EL2_TTRF | MDCR_EL2_TPMCR | MDCR_EL2_TDRA | MDCR_EL2_TDOSA);
Allocate a byte for advertising the PMU specific format type of the given AUX record. A PMU could end up providing hardware trace data in multiple format in a single session.
e.g, The format of hardware buffer produced by CoreSight ETM PMU depends on the type of the "sink" device used for collection for an event (Traditional TMC-ETR/Bs with formatting or TRBEs without any formatting).
# Boring story of why this is needed. Goto The_End_of_Story for skipping.
CoreSight ETM trace allows instruction level tracing of Arm CPUs. The ETM generates the CPU excecution trace and pumps it into CoreSight AMBA Trace Bus and is collected by a different CoreSight component (traditionally CoreSight TMC-ETR /ETB/ETF), called "sink". Important to note that there is no guarantee that every CPU has a dedicated sink. Thus multiple ETMs could pump the trace data into the same "sink" and thus they apply additional formatting of the trace data for the user to decode it properly and attribute the trace data to the corresponding ETM.
However, with the introduction of Arm Trace buffer Extensions (TRBE), we now have a dedicated per-CPU architected sink for collecting the trace. Since the TRBE is always per-CPU, it doesn't apply any formatting of the trace. The support for this driver is under review [1].
Now a system could have a per-cpu TRBE and one or more shared TMC-ETRs on the system. A user could choose a "specific" sink for a perf session (e.g, a TMC-ETR) or the driver could automatically select the nearest sink for a given ETM. It is possible that some ETMs could end up using TMC-ETR (e.g, if the TRBE is not usable on the CPU) while the others using TRBE in a single perf session. Thus we now have "formatted" trace collected from TMC-ETR and "unformatted" trace collected from TRBE. However, we don't get into a situation where a single event could end up using TMC-ETR & TRBE. i.e, any AUX buffer is guaranteed to be either RAW or FORMATTED, but not a mix of both.
As for perf decoding, we need to know the type of the data in the individual AUX buffers, so that it can set up the "OpenCSD" (library for decoding CoreSight trace) decoder instance appropriately. Thus the perf.data file must conatin the hints for the tool to decode the data correctly.
Since this is a runtime variable, and perf tool doesn't have a control on what sink gets used (in case of automatic sink selection), we need this information made available from the PMU driver for each AUX record.
# The_End_of_Story
Cc: Peter Ziljstra peterz@infradead.org Cc: alexander.shishkin@linux.intel.com Cc: mingo@redhat.com Cc: will@kernel.org Cc: mark.rutland@arm.com Cc: mike.leach@linaro.org Cc: acme@kernel.org Cc: jolsa@redhat.com Cc: Mathieu Poirier mathieu.poirer@linaro.org Reviewed by: Mike Leach mike.leach@linaro.org Acked-by: Peter Ziljstra peterz@infradead.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- include/uapi/linux/perf_event.h | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index ad15e40d7f5d..f006eeab6f0e 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -1156,10 +1156,11 @@ enum perf_callchain_context { /** * PERF_RECORD_AUX::flags bits */ -#define PERF_AUX_FLAG_TRUNCATED 0x01 /* record was truncated to fit */ -#define PERF_AUX_FLAG_OVERWRITE 0x02 /* snapshot from overwrite mode */ -#define PERF_AUX_FLAG_PARTIAL 0x04 /* record contains gaps */ -#define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */ +#define PERF_AUX_FLAG_TRUNCATED 0x01 /* record was truncated to fit */ +#define PERF_AUX_FLAG_OVERWRITE 0x02 /* snapshot from overwrite mode */ +#define PERF_AUX_FLAG_PARTIAL 0x04 /* record contains gaps */ +#define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */ +#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */
#define PERF_FLAG_FD_NO_GROUP (1UL << 0) #define PERF_FLAG_FD_OUTPUT (1UL << 1)
CoreSight PMU supports aux-buffer for the ETM tracing. The trace generated by the ETM (associated with individual CPUs, like Intel PT) is captured by a separate IP (CoreSight TMC-ETR/ETF until now).
The TMC-ETR applies formatting of the raw ETM trace data, as it can collect traces from multiple ETMs, with the TraceID to indicate the source of a given trace packet.
Arm Trace Buffer Extension is new "sink" IP, attached to individual CPUs and thus do not provide additional formatting, like TMC-ETR.
Additionally, a system could have both TRBE *and* TMC-ETR for the trace collection. e.g, TMC-ETR could be used as a single trace buffer to collect data from multiple ETMs to correlate the traces from different CPUs. It is possible to have a perf session where some events end up collecting the trace in TMC-ETR while the others in TRBE. Thus we need a way to identify the type of the trace for each AUX record.
Define the trace formats exported by the CoreSight PMU. We don't define the flags following the "ETM" as this information is available to the user when issuing the session. What is missing is the additional formatting applied by the "sink" which is decided at the runtime and the user may not have a control on.
So we define : - CORESIGHT format (indicates the Frame format) - RAW format (indicates the format of the source)
The default value is CORESIGHT format for all the records (i,e == 0). Add the RAW format for others that use raw format.
Cc: Peter Zijlstra peterz@infradead.org Cc: Mike Leach mike.leach@linaro.org Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Leo Yan leo.yan@linaro.org Cc: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: Mike Leach mike.leach@linaro.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- include/uapi/linux/perf_event.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index f006eeab6f0e..63971eaef127 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -1162,6 +1162,10 @@ enum perf_callchain_context { #define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */ #define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */
+/* CoreSight PMU AUX buffer formats */ +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT 0x0000 /* Default for backward compatibility */ +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW 0x0100 /* Raw format of the source */ + #define PERF_FLAG_FD_NO_GROUP (1UL << 0) #define PERF_FLAG_FD_OUTPUT (1UL << 1) #define PERF_FLAG_PID_CGROUP (1UL << 2) /* pid=cgroup id, per-cpu mode only */
Hi Peter,
On Tue, Mar 23, 2021 at 12:06:32PM +0000, Suzuki K Poulose wrote:
CoreSight PMU supports aux-buffer for the ETM tracing. The trace generated by the ETM (associated with individual CPUs, like Intel PT) is captured by a separate IP (CoreSight TMC-ETR/ETF until now).
The TMC-ETR applies formatting of the raw ETM trace data, as it can collect traces from multiple ETMs, with the TraceID to indicate the source of a given trace packet.
Arm Trace Buffer Extension is new "sink" IP, attached to individual CPUs and thus do not provide additional formatting, like TMC-ETR.
Additionally, a system could have both TRBE *and* TMC-ETR for the trace collection. e.g, TMC-ETR could be used as a single trace buffer to collect data from multiple ETMs to correlate the traces from different CPUs. It is possible to have a perf session where some events end up collecting the trace in TMC-ETR while the others in TRBE. Thus we need a way to identify the type of the trace for each AUX record.
Define the trace formats exported by the CoreSight PMU. We don't define the flags following the "ETM" as this information is available to the user when issuing the session. What is missing is the additional formatting applied by the "sink" which is decided at the runtime and the user may not have a control on.
So we define :
- CORESIGHT format (indicates the Frame format)
- RAW format (indicates the format of the source)
The default value is CORESIGHT format for all the records (i,e == 0). Add the RAW format for others that use raw format.
Cc: Peter Zijlstra peterz@infradead.org Cc: Mike Leach mike.leach@linaro.org Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Leo Yan leo.yan@linaro.org Cc: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: Mike Leach mike.leach@linaro.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
include/uapi/linux/perf_event.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index f006eeab6f0e..63971eaef127 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -1162,6 +1162,10 @@ enum perf_callchain_context { #define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */ #define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */ +/* CoreSight PMU AUX buffer formats */ +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT 0x0000 /* Default for backward compatibility */ +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW 0x0100 /* Raw format of the source */
Have you had time to review this patch? Anything you'd like to see modified?
Thanks, Mathieu
#define PERF_FLAG_FD_NO_GROUP (1UL << 0) #define PERF_FLAG_FD_OUTPUT (1UL << 1)
#define PERF_FLAG_PID_CGROUP (1UL << 2) /* pid=cgroup id, per-cpu mode only */
2.24.1
On Mon, Mar 29, 2021 at 10:56:25AM -0600, Mathieu Poirier wrote:
Hi Peter,
On Tue, Mar 23, 2021 at 12:06:32PM +0000, Suzuki K Poulose wrote:
CoreSight PMU supports aux-buffer for the ETM tracing. The trace generated by the ETM (associated with individual CPUs, like Intel PT) is captured by a separate IP (CoreSight TMC-ETR/ETF until now).
The TMC-ETR applies formatting of the raw ETM trace data, as it can collect traces from multiple ETMs, with the TraceID to indicate the source of a given trace packet.
Arm Trace Buffer Extension is new "sink" IP, attached to individual CPUs and thus do not provide additional formatting, like TMC-ETR.
Additionally, a system could have both TRBE *and* TMC-ETR for the trace collection. e.g, TMC-ETR could be used as a single trace buffer to collect data from multiple ETMs to correlate the traces from different CPUs. It is possible to have a perf session where some events end up collecting the trace in TMC-ETR while the others in TRBE. Thus we need a way to identify the type of the trace for each AUX record.
Define the trace formats exported by the CoreSight PMU. We don't define the flags following the "ETM" as this information is available to the user when issuing the session. What is missing is the additional formatting applied by the "sink" which is decided at the runtime and the user may not have a control on.
So we define :
- CORESIGHT format (indicates the Frame format)
- RAW format (indicates the format of the source)
The default value is CORESIGHT format for all the records (i,e == 0). Add the RAW format for others that use raw format.
Cc: Peter Zijlstra peterz@infradead.org Cc: Mike Leach mike.leach@linaro.org Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Leo Yan leo.yan@linaro.org Cc: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: Mike Leach mike.leach@linaro.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
include/uapi/linux/perf_event.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index f006eeab6f0e..63971eaef127 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -1162,6 +1162,10 @@ enum perf_callchain_context { #define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */ #define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */ +/* CoreSight PMU AUX buffer formats */ +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT 0x0000 /* Default for backward compatibility */ +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW 0x0100 /* Raw format of the source */
Have you had time to review this patch? Anything you'd like to see modified?
Ok I suppose..
Acked-by: Peter Zijlstra (Intel) peterz@infradead.org
tsb csync synchronizes the trace operation of instructions. The instruction is a nop when FEAT_TRF is not implemented.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- arch/arm64/include/asm/barrier.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h index c3009b0e5239..5a8367a2b868 100644 --- a/arch/arm64/include/asm/barrier.h +++ b/arch/arm64/include/asm/barrier.h @@ -23,6 +23,7 @@ #define dsb(opt) asm volatile("dsb " #opt : : : "memory")
#define psb_csync() asm volatile("hint #17" : : : "memory") +#define tsb_csync() asm volatile("hint #18" : : : "memory") #define csdb() asm volatile("hint #20" : : : "memory")
#define spec_bar() asm volatile(ALTERNATIVE("dsb nsh\nisb\n", \
Hi Suzuki?
On Tue, Mar 23, 2021 at 12:06:33PM +0000, Suzuki K Poulose wrote:
tsb csync synchronizes the trace operation of instructions. The instruction is a nop when FEAT_TRF is not implemented.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
How do you plan to merge these patches? If they go via the coresight tree:
Acked-by: Catalin Marinas catalin.marinas@arm.com
On 23/03/2021 18:21, Catalin Marinas wrote:
Hi Suzuki?
On Tue, Mar 23, 2021 at 12:06:33PM +0000, Suzuki K Poulose wrote:
tsb csync synchronizes the trace operation of instructions. The instruction is a nop when FEAT_TRF is not implemented.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
How do you plan to merge these patches? If they go via the coresight tree:
Ideally all of this should go via the CoreSight tree to have the dependencies solved at one place. But there are some issues :
If this makes to 5.13 queue for CoreSight,
1) CoreSight next is based on rc2 at the moment and we have fixes gone into rc3 and later, which this series will depend on. (We could move the next tree forward to a later rc to solve this).
2) There could be conflicts with the kvmarm tree for the KVM host changes (That has dependency on the TRBE definitions patch).
If it doesn't make to 5.13 queue, it would be good to have this patch, the TRBE defintions and the KVM host patches queued for 5.13 (not sure if this is acceptable) and we could rebase the CoreSight changes on 5.13 and push it to next release.
I am open for other suggestions.
Marc, Mathieu,
Thoughts ?
Acked-by: Catalin Marinas catalin.marinas@arm.com
Thanks Suzuki
On Wed, 24 Mar 2021 09:39:13 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 23/03/2021 18:21, Catalin Marinas wrote:
Hi Suzuki?
On Tue, Mar 23, 2021 at 12:06:33PM +0000, Suzuki K Poulose wrote:
tsb csync synchronizes the trace operation of instructions. The instruction is a nop when FEAT_TRF is not implemented.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
How do you plan to merge these patches? If they go via the coresight tree:
Ideally all of this should go via the CoreSight tree to have the dependencies solved at one place. But there are some issues :
If this makes to 5.13 queue for CoreSight,
- CoreSight next is based on rc2 at the moment and we have fixes gone
into rc3 and later, which this series will depend on. (We could move the next tree forward to a later rc to solve this).
- There could be conflicts with the kvmarm tree for the KVM host
changes (That has dependency on the TRBE definitions patch).
If it doesn't make to 5.13 queue, it would be good to have this patch, the TRBE defintions and the KVM host patches queued for 5.13 (not sure if this is acceptable) and we could rebase the CoreSight changes on 5.13 and push it to next release.
I am open for other suggestions.
Marc, Mathieu,
Thoughts ?
I was planning to take the first two patches in 5.12 as fixes (they are queued already, and would hopefully land in -rc5). If that doesn't fit with the plan, please let me know ASAP.
Thanks,
M.
On 24/03/2021 13:49, Marc Zyngier wrote:
On Wed, 24 Mar 2021 09:39:13 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 23/03/2021 18:21, Catalin Marinas wrote:
Hi Suzuki?
On Tue, Mar 23, 2021 at 12:06:33PM +0000, Suzuki K Poulose wrote:
tsb csync synchronizes the trace operation of instructions. The instruction is a nop when FEAT_TRF is not implemented.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
How do you plan to merge these patches? If they go via the coresight tree:
Ideally all of this should go via the CoreSight tree to have the dependencies solved at one place. But there are some issues :
If this makes to 5.13 queue for CoreSight,
- CoreSight next is based on rc2 at the moment and we have fixes gone
into rc3 and later, which this series will depend on. (We could move the next tree forward to a later rc to solve this).
- There could be conflicts with the kvmarm tree for the KVM host
changes (That has dependency on the TRBE definitions patch).
If it doesn't make to 5.13 queue, it would be good to have this patch, the TRBE defintions and the KVM host patches queued for 5.13 (not sure if this is acceptable) and we could rebase the CoreSight changes on 5.13 and push it to next release.
I am open for other suggestions.
Marc, Mathieu,
Thoughts ?
I was planning to take the first two patches in 5.12 as fixes (they are queued already, and would hopefully land in -rc5). If that doesn't fit with the plan, please let me know ASAP.
Marc,
I think it would be better to hold on pushing those patches until we have a clarity on how things will go.
Sorry for the confusion.
Kind regards Suzuki
Thanks,
M.
On Wed, 24 Mar 2021 15:51:14 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 24/03/2021 13:49, Marc Zyngier wrote:
On Wed, 24 Mar 2021 09:39:13 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 23/03/2021 18:21, Catalin Marinas wrote:
Hi Suzuki?
On Tue, Mar 23, 2021 at 12:06:33PM +0000, Suzuki K Poulose wrote:
tsb csync synchronizes the trace operation of instructions. The instruction is a nop when FEAT_TRF is not implemented.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
How do you plan to merge these patches? If they go via the coresight tree:
Ideally all of this should go via the CoreSight tree to have the dependencies solved at one place. But there are some issues :
If this makes to 5.13 queue for CoreSight,
- CoreSight next is based on rc2 at the moment and we have fixes gone
into rc3 and later, which this series will depend on. (We could move the next tree forward to a later rc to solve this).
- There could be conflicts with the kvmarm tree for the KVM host
changes (That has dependency on the TRBE definitions patch).
If it doesn't make to 5.13 queue, it would be good to have this patch, the TRBE defintions and the KVM host patches queued for 5.13 (not sure if this is acceptable) and we could rebase the CoreSight changes on 5.13 and push it to next release.
I am open for other suggestions.
Marc, Mathieu,
Thoughts ?
I was planning to take the first two patches in 5.12 as fixes (they are queued already, and would hopefully land in -rc5). If that doesn't fit with the plan, please let me know ASAP.
Marc,
I think it would be better to hold on pushing those patches until we have a clarity on how things will go.
OK. I thought there was a need for these patches to prevent guest access to the v8.4 self hosted tracing feature that went in 5.12 though[1]... Did I get it wrong?
Thanks,
M.
[1] https://lore.kernel.org/r/cbe4ef17-38f9-c555-d838-796be752d4a3@arm.com
On 24/03/2021 16:16, Marc Zyngier wrote:
On Wed, 24 Mar 2021 15:51:14 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 24/03/2021 13:49, Marc Zyngier wrote:
On Wed, 24 Mar 2021 09:39:13 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 23/03/2021 18:21, Catalin Marinas wrote:
Hi Suzuki?
On Tue, Mar 23, 2021 at 12:06:33PM +0000, Suzuki K Poulose wrote:
tsb csync synchronizes the trace operation of instructions. The instruction is a nop when FEAT_TRF is not implemented.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
How do you plan to merge these patches? If they go via the coresight tree:
Ideally all of this should go via the CoreSight tree to have the dependencies solved at one place. But there are some issues :
If this makes to 5.13 queue for CoreSight,
- CoreSight next is based on rc2 at the moment and we have fixes gone
into rc3 and later, which this series will depend on. (We could move the next tree forward to a later rc to solve this).
- There could be conflicts with the kvmarm tree for the KVM host
changes (That has dependency on the TRBE definitions patch).
If it doesn't make to 5.13 queue, it would be good to have this patch, the TRBE defintions and the KVM host patches queued for 5.13 (not sure if this is acceptable) and we could rebase the CoreSight changes on 5.13 and push it to next release.
I am open for other suggestions.
Marc, Mathieu,
Thoughts ?
I was planning to take the first two patches in 5.12 as fixes (they are queued already, and would hopefully land in -rc5). If that doesn't fit with the plan, please let me know ASAP.
Marc,
I think it would be better to hold on pushing those patches until we have a clarity on how things will go.
OK. I thought there was a need for these patches to prevent guest access to the v8.4 self hosted tracing feature that went in 5.12 though[1]... Did I get it wrong?
Yes, that is correct. The guest could access the Trace Filter Control register and fiddle with the host settings, without this patch. e.g, it could disable tracing at EL0/EL1, without the host being aware on nVHE host.
Cheers Suzuki
On Wed, 24 Mar 2021 16:25:12 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 24/03/2021 16:16, Marc Zyngier wrote:
On Wed, 24 Mar 2021 15:51:14 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 24/03/2021 13:49, Marc Zyngier wrote:
On Wed, 24 Mar 2021 09:39:13 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 23/03/2021 18:21, Catalin Marinas wrote:
Hi Suzuki?
On Tue, Mar 23, 2021 at 12:06:33PM +0000, Suzuki K Poulose wrote: > tsb csync synchronizes the trace operation of instructions. > The instruction is a nop when FEAT_TRF is not implemented. > > Cc: Mathieu Poirier mathieu.poirier@linaro.org > Cc: Mike Leach mike.leach@linaro.org > Cc: Catalin Marinas catalin.marinas@arm.com > Cc: Will Deacon will.deacon@arm.com > Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
How do you plan to merge these patches? If they go via the coresight tree:
Ideally all of this should go via the CoreSight tree to have the dependencies solved at one place. But there are some issues :
If this makes to 5.13 queue for CoreSight,
- CoreSight next is based on rc2 at the moment and we have fixes gone
into rc3 and later, which this series will depend on. (We could move the next tree forward to a later rc to solve this).
- There could be conflicts with the kvmarm tree for the KVM host
changes (That has dependency on the TRBE definitions patch).
If it doesn't make to 5.13 queue, it would be good to have this patch, the TRBE defintions and the KVM host patches queued for 5.13 (not sure if this is acceptable) and we could rebase the CoreSight changes on 5.13 and push it to next release.
I am open for other suggestions.
Marc, Mathieu,
Thoughts ?
I was planning to take the first two patches in 5.12 as fixes (they are queued already, and would hopefully land in -rc5). If that doesn't fit with the plan, please let me know ASAP.
Marc,
I think it would be better to hold on pushing those patches until we have a clarity on how things will go.
OK. I thought there was a need for these patches to prevent guest access to the v8.4 self hosted tracing feature that went in 5.12 though[1]... Did I get it wrong?
Yes, that is correct. The guest could access the Trace Filter Control register and fiddle with the host settings, without this patch. e.g, it could disable tracing at EL0/EL1, without the host being aware on nVHE host.
OK, so we definitely do need these patches, don't we? Both? Just one? Please have a look at kvmarm/fixes and tell me what I must keep.
Thanks,
M.
On 24/03/2021 16:30, Marc Zyngier wrote:
On Wed, 24 Mar 2021 16:25:12 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 24/03/2021 16:16, Marc Zyngier wrote:
On Wed, 24 Mar 2021 15:51:14 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 24/03/2021 13:49, Marc Zyngier wrote:
On Wed, 24 Mar 2021 09:39:13 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 23/03/2021 18:21, Catalin Marinas wrote: > Hi Suzuki? > > On Tue, Mar 23, 2021 at 12:06:33PM +0000, Suzuki K Poulose wrote: >> tsb csync synchronizes the trace operation of instructions. >> The instruction is a nop when FEAT_TRF is not implemented. >> >> Cc: Mathieu Poirier mathieu.poirier@linaro.org >> Cc: Mike Leach mike.leach@linaro.org >> Cc: Catalin Marinas catalin.marinas@arm.com >> Cc: Will Deacon will.deacon@arm.com >> Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com > > How do you plan to merge these patches? If they go via the coresight > tree: >
Ideally all of this should go via the CoreSight tree to have the dependencies solved at one place. But there are some issues :
If this makes to 5.13 queue for CoreSight,
- CoreSight next is based on rc2 at the moment and we have fixes gone
into rc3 and later, which this series will depend on. (We could move the next tree forward to a later rc to solve this).
- There could be conflicts with the kvmarm tree for the KVM host
changes (That has dependency on the TRBE definitions patch).
If it doesn't make to 5.13 queue, it would be good to have this patch, the TRBE defintions and the KVM host patches queued for 5.13 (not sure if this is acceptable) and we could rebase the CoreSight changes on 5.13 and push it to next release.
I am open for other suggestions.
Marc, Mathieu,
Thoughts ?
I was planning to take the first two patches in 5.12 as fixes (they are queued already, and would hopefully land in -rc5). If that doesn't fit with the plan, please let me know ASAP.
Marc,
I think it would be better to hold on pushing those patches until we have a clarity on how things will go.
OK. I thought there was a need for these patches to prevent guest access to the v8.4 self hosted tracing feature that went in 5.12 though[1]... Did I get it wrong?
Yes, that is correct. The guest could access the Trace Filter Control register and fiddle with the host settings, without this patch. e.g, it could disable tracing at EL0/EL1, without the host being aware on nVHE host.
OK, so we definitely do need these patches, don't we? Both? Just one? Please have a look at kvmarm/fixes and tell me what I must keep.
Both of them are fixes.
commit "KVM: arm64: Disable guest access to trace filter controls" - This fixes guest fiddling with the trace filter control as described above.
commit "KVM: arm64: Hide system instruction access to Trace registers" - Fixes the Hypervisor to advertise what it doesn't support. i.e stop advertising trace system instruction access to a guest. Otherwise a guest which trusts the ID registers (ID_AA64DFR0_EL1.TRACEVER == 1) can crash while trying to access the trace register as we trap the accesses (CPTR_EL2.TTA == 1). On Linux, the ETM drivers need a DT explicitly advertising the support. So, this is not immediately impacted. And this fix goes a long way back in the history, when the CPTR_EL2.TTA was added.
Now, the reason for asking you to hold on is the way this could create conflicts in merging the rest of the series.
Suzuki
On Wed, Mar 24, 2021 at 05:06:58PM +0000, Suzuki K Poulose wrote:
On 24/03/2021 16:30, Marc Zyngier wrote:
On Wed, 24 Mar 2021 16:25:12 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 24/03/2021 16:16, Marc Zyngier wrote:
On Wed, 24 Mar 2021 15:51:14 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 24/03/2021 13:49, Marc Zyngier wrote:
On Wed, 24 Mar 2021 09:39:13 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote: > > On 23/03/2021 18:21, Catalin Marinas wrote: > > Hi Suzuki? > > > > On Tue, Mar 23, 2021 at 12:06:33PM +0000, Suzuki K Poulose wrote: > > > tsb csync synchronizes the trace operation of instructions. > > > The instruction is a nop when FEAT_TRF is not implemented. > > > > > > Cc: Mathieu Poirier mathieu.poirier@linaro.org > > > Cc: Mike Leach mike.leach@linaro.org > > > Cc: Catalin Marinas catalin.marinas@arm.com > > > Cc: Will Deacon will.deacon@arm.com > > > Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com > > > > How do you plan to merge these patches? If they go via the coresight > > tree: > > > > Ideally all of this should go via the CoreSight tree to have the > dependencies solved at one place. But there are some issues : > > If this makes to 5.13 queue for CoreSight, > > 1) CoreSight next is based on rc2 at the moment and we have fixes gone > into rc3 and later, which this series will depend on. (We could move > the next tree forward to a later rc to solve this). > > 2) There could be conflicts with the kvmarm tree for the KVM host > changes (That has dependency on the TRBE definitions patch). > > If it doesn't make to 5.13 queue, it would be good to have this patch, > the TRBE defintions and the KVM host patches queued for 5.13 (not sure > if this is acceptable) and we could rebase the CoreSight changes on 5.13 > and push it to next release. > > I am open for other suggestions. > > Marc, Mathieu, > > Thoughts ?
I was planning to take the first two patches in 5.12 as fixes (they are queued already, and would hopefully land in -rc5). If that doesn't fit with the plan, please let me know ASAP.
Marc,
I think it would be better to hold on pushing those patches until we have a clarity on how things will go.
OK. I thought there was a need for these patches to prevent guest access to the v8.4 self hosted tracing feature that went in 5.12 though[1]... Did I get it wrong?
Yes, that is correct. The guest could access the Trace Filter Control register and fiddle with the host settings, without this patch. e.g, it could disable tracing at EL0/EL1, without the host being aware on nVHE host.
OK, so we definitely do need these patches, don't we? Both? Just one? Please have a look at kvmarm/fixes and tell me what I must keep.
Both of them are fixes.
commit "KVM: arm64: Disable guest access to trace filter controls"
- This fixes guest fiddling with the trace filter control as described
above.
commit "KVM: arm64: Hide system instruction access to Trace registers"
- Fixes the Hypervisor to advertise what it doesn't support. i.e stop advertising trace system instruction access to a guest. Otherwise a guest which trusts the ID registers (ID_AA64DFR0_EL1.TRACEVER == 1) can crash while trying to access the trace register as we trap the accesses (CPTR_EL2.TTA == 1). On Linux, the ETM drivers need a DT explicitly advertising the support. So, this is not immediately impacted. And this fix goes a long way back in the history, when the CPTR_EL2.TTA was added.
Now, the reason for asking you to hold on is the way this could create conflicts in merging the rest of the series.
The way we normally work around this is to either rebase your series on top of -rc5 when the fixes go in or, if you want an earlier -rc base, Marc can put them on a stable branch somewhere that you can use.
In the worst case you can merge the patches twice but that's rarely needed.
On Wed, 24 Mar 2021 17:19:36 +0000, Catalin Marinas catalin.marinas@arm.com wrote:
On Wed, Mar 24, 2021 at 05:06:58PM +0000, Suzuki K Poulose wrote:
On 24/03/2021 16:30, Marc Zyngier wrote:
OK, so we definitely do need these patches, don't we? Both? Just one? Please have a look at kvmarm/fixes and tell me what I must keep.
Both of them are fixes.
commit "KVM: arm64: Disable guest access to trace filter controls"
- This fixes guest fiddling with the trace filter control as described
above.
commit "KVM: arm64: Hide system instruction access to Trace registers"
- Fixes the Hypervisor to advertise what it doesn't support. i.e stop advertising trace system instruction access to a guest. Otherwise a guest which trusts the ID registers (ID_AA64DFR0_EL1.TRACEVER == 1) can crash while trying to access the trace register as we trap the accesses (CPTR_EL2.TTA == 1). On Linux, the ETM drivers need a DT explicitly advertising the support. So, this is not immediately impacted. And this fix goes a long way back in the history, when the CPTR_EL2.TTA was added.
Now, the reason for asking you to hold on is the way this could create conflicts in merging the rest of the series.
The way we normally work around this is to either rebase your series on top of -rc5 when the fixes go in or, if you want an earlier -rc base, Marc can put them on a stable branch somewhere that you can use.
Here's what I've done:
- the two patches are now on a branch[1] based off -rc3 which I officially declare stable. Feel free to rebase your series on top.
- the KVM fixes branch now embeds this branch (yes, I've rebased it -- we'll hopefully survive the outrage).
Thanks,
M.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git/log/?h=tra...
On Wed, Mar 24, 2021 at 05:40:39PM +0000, Marc Zyngier wrote:
On Wed, 24 Mar 2021 17:19:36 +0000, Catalin Marinas catalin.marinas@arm.com wrote:
On Wed, Mar 24, 2021 at 05:06:58PM +0000, Suzuki K Poulose wrote:
On 24/03/2021 16:30, Marc Zyngier wrote:
OK, so we definitely do need these patches, don't we? Both? Just one? Please have a look at kvmarm/fixes and tell me what I must keep.
Both of them are fixes.
commit "KVM: arm64: Disable guest access to trace filter controls"
- This fixes guest fiddling with the trace filter control as described
above.
commit "KVM: arm64: Hide system instruction access to Trace registers"
- Fixes the Hypervisor to advertise what it doesn't support. i.e stop advertising trace system instruction access to a guest. Otherwise a guest which trusts the ID registers (ID_AA64DFR0_EL1.TRACEVER == 1) can crash while trying to access the trace register as we trap the accesses (CPTR_EL2.TTA == 1). On Linux, the ETM drivers need a DT explicitly advertising the support. So, this is not immediately impacted. And this fix goes a long way back in the history, when the CPTR_EL2.TTA was added.
Now, the reason for asking you to hold on is the way this could create conflicts in merging the rest of the series.
The way we normally work around this is to either rebase your series on top of -rc5 when the fixes go in or, if you want an earlier -rc base, Marc can put them on a stable branch somewhere that you can use.
Here's what I've done:
the two patches are now on a branch[1] based off -rc3 which I officially declare stable. Feel free to rebase your series on top.
the KVM fixes branch now embeds this branch (yes, I've rebased it -- we'll hopefully survive the outrage).
We don't have a choice to rebase. I will rebased CS next on [1] and apply this set on top of it. Hopefully the KVM fixes will have made it to GKH's char-misc tree by the time I send him the patches for the next merge window. Otherwise we'll have to merge patches twice as Catalin mentioned. We can deal with that if/when we get there.
Thanks, Mathieu
Thanks,
M.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git/log/?h=tra...
-- Without deviation from the norm, progress is not possible.
From: Anshuman Khandual anshuman.khandual@arm.com
This adds TRBE related registers and corresponding feature macros.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Suzuki K Poulose suzuki.poulose@arm.com Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com Reviewed-by: Mike Leach mike.leach@linaro.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- arch/arm64/include/asm/sysreg.h | 50 +++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index d4a5fca984c3..a27ba8be46a2 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -333,6 +333,55 @@
/*** End of Statistical Profiling Extension ***/
+/* + * TRBE Registers + */ +#define SYS_TRBLIMITR_EL1 sys_reg(3, 0, 9, 11, 0) +#define SYS_TRBPTR_EL1 sys_reg(3, 0, 9, 11, 1) +#define SYS_TRBBASER_EL1 sys_reg(3, 0, 9, 11, 2) +#define SYS_TRBSR_EL1 sys_reg(3, 0, 9, 11, 3) +#define SYS_TRBMAR_EL1 sys_reg(3, 0, 9, 11, 4) +#define SYS_TRBTRG_EL1 sys_reg(3, 0, 9, 11, 6) +#define SYS_TRBIDR_EL1 sys_reg(3, 0, 9, 11, 7) + +#define TRBLIMITR_LIMIT_MASK GENMASK_ULL(51, 0) +#define TRBLIMITR_LIMIT_SHIFT 12 +#define TRBLIMITR_NVM BIT(5) +#define TRBLIMITR_TRIG_MODE_MASK GENMASK(1, 0) +#define TRBLIMITR_TRIG_MODE_SHIFT 3 +#define TRBLIMITR_FILL_MODE_MASK GENMASK(1, 0) +#define TRBLIMITR_FILL_MODE_SHIFT 1 +#define TRBLIMITR_ENABLE BIT(0) +#define TRBPTR_PTR_MASK GENMASK_ULL(63, 0) +#define TRBPTR_PTR_SHIFT 0 +#define TRBBASER_BASE_MASK GENMASK_ULL(51, 0) +#define TRBBASER_BASE_SHIFT 12 +#define TRBSR_EC_MASK GENMASK(5, 0) +#define TRBSR_EC_SHIFT 26 +#define TRBSR_IRQ BIT(22) +#define TRBSR_TRG BIT(21) +#define TRBSR_WRAP BIT(20) +#define TRBSR_ABORT BIT(18) +#define TRBSR_STOP BIT(17) +#define TRBSR_MSS_MASK GENMASK(15, 0) +#define TRBSR_MSS_SHIFT 0 +#define TRBSR_BSC_MASK GENMASK(5, 0) +#define TRBSR_BSC_SHIFT 0 +#define TRBSR_FSC_MASK GENMASK(5, 0) +#define TRBSR_FSC_SHIFT 0 +#define TRBMAR_SHARE_MASK GENMASK(1, 0) +#define TRBMAR_SHARE_SHIFT 8 +#define TRBMAR_OUTER_MASK GENMASK(3, 0) +#define TRBMAR_OUTER_SHIFT 4 +#define TRBMAR_INNER_MASK GENMASK(3, 0) +#define TRBMAR_INNER_SHIFT 0 +#define TRBTRG_TRG_MASK GENMASK(31, 0) +#define TRBTRG_TRG_SHIFT 0 +#define TRBIDR_FLAG BIT(5) +#define TRBIDR_PROG BIT(4) +#define TRBIDR_ALIGN_MASK GENMASK(3, 0) +#define TRBIDR_ALIGN_SHIFT 0 + #define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1) #define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2)
@@ -840,6 +889,7 @@ #define ID_AA64MMFR2_CNP_SHIFT 0
/* id_aa64dfr0 */ +#define ID_AA64DFR0_TRBE_SHIFT 44 #define ID_AA64DFR0_TRACE_FILT_SHIFT 40 #define ID_AA64DFR0_DOUBLELOCK_SHIFT 36 #define ID_AA64DFR0_PMSVER_SHIFT 32
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index d77d358f9395..bda918948471 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -65,6 +65,19 @@ // use EL1&0 translation.
.Lskip_spe_@: + /* Trace buffer */ + ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4 + cbz x0, .Lskip_trace_@ // Skip if TraceBuffer is not present + + mrs_s x0, SYS_TRBIDR_EL1 + and x0, x0, TRBIDR_PROG + cbnz x0, .Lskip_trace_@ // If TRBE is available at EL2 + + mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) + orr x2, x2, x0 // allow the EL1&0 translation + // to own it. + +.Lskip_trace_@: msr mdcr_el2, x2 // Configure debug traps .endm
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 94d4025acc0b..692c9049befa 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,8 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1
/* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_E2TB_MASK (UL(0x3)) +#define MDCR_EL2_E2TB_SHIFT (UL(24)) #define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch { struct kvm_guest_debug_arch regs; /* Statistical profiling extension */ u64 pmscr_el1; + /* Self-hosted trace */ + u64 trfcr_el1; } host_debug_state;
/* VGIC state */ diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0
- // Use EL2 translations for SPE and disable access from EL1 + // Use EL2 translations for SPE & TRBE and disable access from EL1 mrs x0, mdcr_el2 bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) + bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) msr mdcr_el2, x0
// Transfer the MM state from EL1 to EL2 diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) * - Debug ROM Address (MDCR_EL2_TDRA) * - OS related registers (MDCR_EL2_TDOSA) * - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB) - * - Self-hosted Trace Filter controls (MDCR_EL2_TTRF) + * - Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB) * * Additionally, KVM only traps guest accesses to the debug registers if * the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY @@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug);
/* - * This also clears MDCR_EL2_E2PB_MASK to disable guest access - * to the profiling buffer. + * This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK + * to disable guest access to the profiling and trace buffers */ vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM | diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c index f401724f12ef..9499e18dd28f 100644 --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c @@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1) write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); }
+static void __debug_save_trace(u64 *trfcr_el1) +{ + + *trfcr_el1 = 0; + + /* Check if we have TRBE */ + if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1), + ID_AA64DFR0_TRBE_SHIFT)) + return; + + /* Check we can access the TRBE */ + if ((read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_PROG)) + return; + + /* Check if the TRBE is enabled */ + if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_ENABLE)) + return; + /* + * Prohibit trace generation while we are in guest. + * Since access to TRFCR_EL1 is trapped, the guest can't + * modify the filtering set by the host. + */ + *trfcr_el1 = read_sysreg_s(SYS_TRFCR_EL1); + write_sysreg_s(0, SYS_TRFCR_EL1); + isb(); + /* Drain the trace buffer to memory */ + tsb_csync(); + dsb(nsh); +} + +static void __debug_restore_trace(u64 trfcr_el1) +{ + if (!trfcr_el1) + return; + + /* Restore trace filter controls */ + write_sysreg_s(trfcr_el1, SYS_TRFCR_EL1); +} + void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu) { /* Disable and flush SPE data generation */ __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1); + /* Disable and flush Self-Hosted Trace generation */ + __debug_save_trace(&vcpu->arch.host_debug_state.trfcr_el1); }
void __debug_switch_to_guest(struct kvm_vcpu *vcpu) @@ -72,6 +113,7 @@ void __debug_switch_to_guest(struct kvm_vcpu *vcpu) void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu) { __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1); + __debug_restore_trace(vcpu->arch.host_debug_state.trfcr_el1); }
void __debug_switch_to_host(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 68ab6b4d5141..736805232521 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -95,6 +95,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
mdcr_el2 &= MDCR_EL2_HPMN_MASK; mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT; + mdcr_el2 |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT;
write_sysreg(mdcr_el2, mdcr_el2); if (is_protected_kvm_enabled())
On Tue, Mar 23, 2021 at 12:06:35PM +0000, Suzuki K Poulose wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
Marc - do you want me to pick up this one?
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index d77d358f9395..bda918948471 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -65,6 +65,19 @@ // use EL1&0 translation. .Lskip_spe_@:
- /* Trace buffer */
- ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4
- cbz x0, .Lskip_trace_@ // Skip if TraceBuffer is not present
- mrs_s x0, SYS_TRBIDR_EL1
- and x0, x0, TRBIDR_PROG
- cbnz x0, .Lskip_trace_@ // If TRBE is available at EL2
- mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
- orr x2, x2, x0 // allow the EL1&0 translation
// to own it.
+.Lskip_trace_@: msr mdcr_el2, x2 // Configure debug traps .endm diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 94d4025acc0b..692c9049befa 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,8 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1 /* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_E2TB_MASK (UL(0x3)) +#define MDCR_EL2_E2TB_SHIFT (UL(24)) #define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch { struct kvm_guest_debug_arch regs; /* Statistical profiling extension */ u64 pmscr_el1;
/* Self-hosted trace */
} host_debug_state;u64 trfcr_el1;
/* VGIC state */ diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0
- // Use EL2 translations for SPE and disable access from EL1
- // Use EL2 translations for SPE & TRBE and disable access from EL1 mrs x0, mdcr_el2 bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
- bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) msr mdcr_el2, x0
// Transfer the MM state from EL1 to EL2 diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
- Debug ROM Address (MDCR_EL2_TDRA)
- OS related registers (MDCR_EL2_TDOSA)
- Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
- Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
- Additionally, KVM only traps guest accesses to the debug registers if
- the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug); /*
* This also clears MDCR_EL2_E2PB_MASK to disable guest access
* to the profiling buffer.
* This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
*/ vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |* to disable guest access to the profiling and trace buffers
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c index f401724f12ef..9499e18dd28f 100644 --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c @@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1) write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); } +static void __debug_save_trace(u64 *trfcr_el1) +{
- *trfcr_el1 = 0;
- /* Check if we have TRBE */
- if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
ID_AA64DFR0_TRBE_SHIFT))
return;
- /* Check we can access the TRBE */
- if ((read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_PROG))
return;
- /* Check if the TRBE is enabled */
- if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_ENABLE))
return;
- /*
* Prohibit trace generation while we are in guest.
* Since access to TRFCR_EL1 is trapped, the guest can't
* modify the filtering set by the host.
*/
- *trfcr_el1 = read_sysreg_s(SYS_TRFCR_EL1);
- write_sysreg_s(0, SYS_TRFCR_EL1);
- isb();
- /* Drain the trace buffer to memory */
- tsb_csync();
- dsb(nsh);
+}
+static void __debug_restore_trace(u64 trfcr_el1) +{
- if (!trfcr_el1)
return;
- /* Restore trace filter controls */
- write_sysreg_s(trfcr_el1, SYS_TRFCR_EL1);
+}
void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu) { /* Disable and flush SPE data generation */ __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
- /* Disable and flush Self-Hosted Trace generation */
- __debug_save_trace(&vcpu->arch.host_debug_state.trfcr_el1);
} void __debug_switch_to_guest(struct kvm_vcpu *vcpu) @@ -72,6 +113,7 @@ void __debug_switch_to_guest(struct kvm_vcpu *vcpu) void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu) { __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
- __debug_restore_trace(vcpu->arch.host_debug_state.trfcr_el1);
} void __debug_switch_to_host(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 68ab6b4d5141..736805232521 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -95,6 +95,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) mdcr_el2 &= MDCR_EL2_HPMN_MASK; mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
- mdcr_el2 |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT;
write_sysreg(mdcr_el2, mdcr_el2); if (is_protected_kvm_enabled()) -- 2.24.1
Hi Mathieu,
On Fri, 26 Mar 2021 16:55:50 +0000, Mathieu Poirier mathieu.poirier@linaro.org wrote:
On Tue, Mar 23, 2021 at 12:06:35PM +0000, Suzuki K Poulose wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
Marc - do you want me to pick up this one?
I just went through the KVM patch, and I have a couple of question that Suzuki can hopefully address quickly enough. As for merging it via your tree, I'm worried that it will conflict with other patches that are in flight.
We can hopefully set up a stable branch between the two trees.
Thanks,
M.
On 26/03/2021 16:55, Mathieu Poirier wrote:
On Tue, Mar 23, 2021 at 12:06:35PM +0000, Suzuki K Poulose wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
Marc - do you want me to pick up this one?
I think the kvmarm tree is the best route for this patch, given the amount of changes the tree is going through, in the areas this patch touches. Or else there would be conflicts with merging. And this patch depends on the patches from this series that were queued.
Here is the depency tree :
a) kvm-arm fixes for debug (Patch 1, 2) & SPE save-restore fix (queued in v5.12-rc3)
b) TRBE defintions and Trace synchronization barrier (Patches 5 & 6)
c) kvm-arm TRBE host support (Patch 7)
d) TRBE driver support (and the ETE changes)
(c) code merge depends on -> (a) + (b) (d) build (no conflicts) depends on -> (b)
Now (d) has an indirect dependency on (c) for operational correctness at runtime. So, if :
kvmarm tree picks up : b + c coresight tree picksup : b + d
and if we could ensure the merge order of the trees are in kvmarm greg-kh (device-misc tree) (coresight goes via this tree)
we should be fine.
Additionally, we could rip out the Kconfig changes from the TRBE patch and add it only at the rc1, once we verify both the trees are in to make sure the runtime operation dependency is not triggered.
Thoughts ?
Suzuki
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index d77d358f9395..bda918948471 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -65,6 +65,19 @@ // use EL1&0 translation. .Lskip_spe_@:
- /* Trace buffer */
- ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4
- cbz x0, .Lskip_trace_@ // Skip if TraceBuffer is not present
- mrs_s x0, SYS_TRBIDR_EL1
- and x0, x0, TRBIDR_PROG
- cbnz x0, .Lskip_trace_@ // If TRBE is available at EL2
- mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
- orr x2, x2, x0 // allow the EL1&0 translation
// to own it.
+.Lskip_trace_@: msr mdcr_el2, x2 // Configure debug traps .endm diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 94d4025acc0b..692c9049befa 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,8 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1 /* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_E2TB_MASK (UL(0x3)) +#define MDCR_EL2_E2TB_SHIFT (UL(24)) #define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch { struct kvm_guest_debug_arch regs; /* Statistical profiling extension */ u64 pmscr_el1;
/* Self-hosted trace */
} host_debug_state;u64 trfcr_el1;
/* VGIC state */ diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0
- // Use EL2 translations for SPE and disable access from EL1
- // Use EL2 translations for SPE & TRBE and disable access from EL1 mrs x0, mdcr_el2 bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
- bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) msr mdcr_el2, x0
// Transfer the MM state from EL1 to EL2 diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
- Debug ROM Address (MDCR_EL2_TDRA)
- OS related registers (MDCR_EL2_TDOSA)
- Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
- Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
- Additionally, KVM only traps guest accesses to the debug registers if
- the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug); /*
* This also clears MDCR_EL2_E2PB_MASK to disable guest access
* to the profiling buffer.
* This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
*/ vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |* to disable guest access to the profiling and trace buffers
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c index f401724f12ef..9499e18dd28f 100644 --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c @@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1) write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); } +static void __debug_save_trace(u64 *trfcr_el1) +{
- *trfcr_el1 = 0;
- /* Check if we have TRBE */
- if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
ID_AA64DFR0_TRBE_SHIFT))
return;
- /* Check we can access the TRBE */
- if ((read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_PROG))
return;
- /* Check if the TRBE is enabled */
- if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_ENABLE))
return;
- /*
* Prohibit trace generation while we are in guest.
* Since access to TRFCR_EL1 is trapped, the guest can't
* modify the filtering set by the host.
*/
- *trfcr_el1 = read_sysreg_s(SYS_TRFCR_EL1);
- write_sysreg_s(0, SYS_TRFCR_EL1);
- isb();
- /* Drain the trace buffer to memory */
- tsb_csync();
- dsb(nsh);
+}
+static void __debug_restore_trace(u64 trfcr_el1) +{
- if (!trfcr_el1)
return;
- /* Restore trace filter controls */
- write_sysreg_s(trfcr_el1, SYS_TRFCR_EL1);
+}
- void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu) { /* Disable and flush SPE data generation */ __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
- /* Disable and flush Self-Hosted Trace generation */
- __debug_save_trace(&vcpu->arch.host_debug_state.trfcr_el1); }
void __debug_switch_to_guest(struct kvm_vcpu *vcpu) @@ -72,6 +113,7 @@ void __debug_switch_to_guest(struct kvm_vcpu *vcpu) void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu) { __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
- __debug_restore_trace(vcpu->arch.host_debug_state.trfcr_el1); }
void __debug_switch_to_host(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 68ab6b4d5141..736805232521 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -95,6 +95,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) mdcr_el2 &= MDCR_EL2_HPMN_MASK; mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
- mdcr_el2 |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT;
write_sysreg(mdcr_el2, mdcr_el2); if (is_protected_kvm_enabled()) -- 2.24.1
On Tue, Mar 30, 2021 at 11:38:18AM +0100, Suzuki K Poulose wrote:
On 26/03/2021 16:55, Mathieu Poirier wrote:
On Tue, Mar 23, 2021 at 12:06:35PM +0000, Suzuki K Poulose wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
Marc - do you want me to pick up this one?
I think the kvmarm tree is the best route for this patch, given the amount of changes the tree is going through, in the areas this patch touches. Or else there would be conflicts with merging. And this patch depends on the patches from this series that were queued.
Here is the depency tree :
a) kvm-arm fixes for debug (Patch 1, 2) & SPE save-restore fix (queued in v5.12-rc3)
b) TRBE defintions and Trace synchronization barrier (Patches 5 & 6)
c) kvm-arm TRBE host support (Patch 7)
d) TRBE driver support (and the ETE changes)
(c) code merge depends on -> (a) + (b) (d) build (no conflicts) depends on -> (b)
Now (d) has an indirect dependency on (c) for operational correctness at runtime. So, if :
kvmarm tree picks up : b + c coresight tree picksup : b + d
and if we could ensure the merge order of the trees are in kvmarm greg-kh (device-misc tree) (coresight goes via this tree)
Greg's char-misc tree is based on the rc releases rather than next. As such it is a while before other branches like kvmarm get merged, causing all sort of compilation breakage.
we should be fine.
Additionally, we could rip out the Kconfig changes from the TRBE patch and add it only at the rc1, once we verify both the trees are in to make sure the runtime operation dependency is not triggered.
We could also do that but Greg might frown at the tactic, and rightly so. The usual way to work with complex merge dependencies is to proceed in steps, which would mean that all KVM related patches go in the v5.13 merge window. When that is done we add the ETE/TRBE for the v5.14 merge window. I agree that we waste an entire cycle but it guarantees to avoid breaking builds and follows the conventional way to do things.
Thoughts ?
Suzuki
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index d77d358f9395..bda918948471 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -65,6 +65,19 @@ // use EL1&0 translation. .Lskip_spe_@:
- /* Trace buffer */
- ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4
- cbz x0, .Lskip_trace_@ // Skip if TraceBuffer is not present
- mrs_s x0, SYS_TRBIDR_EL1
- and x0, x0, TRBIDR_PROG
- cbnz x0, .Lskip_trace_@ // If TRBE is available at EL2
- mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
- orr x2, x2, x0 // allow the EL1&0 translation
// to own it.
+.Lskip_trace_@: msr mdcr_el2, x2 // Configure debug traps .endm diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 94d4025acc0b..692c9049befa 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,8 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1 /* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_E2TB_MASK (UL(0x3)) +#define MDCR_EL2_E2TB_SHIFT (UL(24)) #define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch { struct kvm_guest_debug_arch regs; /* Statistical profiling extension */ u64 pmscr_el1;
/* Self-hosted trace */
} host_debug_state; /* VGIC state */u64 trfcr_el1;
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0
- // Use EL2 translations for SPE and disable access from EL1
- // Use EL2 translations for SPE & TRBE and disable access from EL1 mrs x0, mdcr_el2 bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
- bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) msr mdcr_el2, x0 // Transfer the MM state from EL1 to EL2
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
- Debug ROM Address (MDCR_EL2_TDRA)
- OS related registers (MDCR_EL2_TDOSA)
- Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
- Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
- Additionally, KVM only traps guest accesses to the debug registers if
- the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug); /*
* This also clears MDCR_EL2_E2PB_MASK to disable guest access
* to the profiling buffer.
* This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
*/ vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |* to disable guest access to the profiling and trace buffers
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c index f401724f12ef..9499e18dd28f 100644 --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c @@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1) write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); } +static void __debug_save_trace(u64 *trfcr_el1) +{
- *trfcr_el1 = 0;
- /* Check if we have TRBE */
- if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
ID_AA64DFR0_TRBE_SHIFT))
return;
- /* Check we can access the TRBE */
- if ((read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_PROG))
return;
- /* Check if the TRBE is enabled */
- if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_ENABLE))
return;
- /*
* Prohibit trace generation while we are in guest.
* Since access to TRFCR_EL1 is trapped, the guest can't
* modify the filtering set by the host.
*/
- *trfcr_el1 = read_sysreg_s(SYS_TRFCR_EL1);
- write_sysreg_s(0, SYS_TRFCR_EL1);
- isb();
- /* Drain the trace buffer to memory */
- tsb_csync();
- dsb(nsh);
+}
+static void __debug_restore_trace(u64 trfcr_el1) +{
- if (!trfcr_el1)
return;
- /* Restore trace filter controls */
- write_sysreg_s(trfcr_el1, SYS_TRFCR_EL1);
+}
- void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu) { /* Disable and flush SPE data generation */ __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
- /* Disable and flush Self-Hosted Trace generation */
- __debug_save_trace(&vcpu->arch.host_debug_state.trfcr_el1); } void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
@@ -72,6 +113,7 @@ void __debug_switch_to_guest(struct kvm_vcpu *vcpu) void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu) { __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
- __debug_restore_trace(vcpu->arch.host_debug_state.trfcr_el1); } void __debug_switch_to_host(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 68ab6b4d5141..736805232521 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -95,6 +95,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) mdcr_el2 &= MDCR_EL2_HPMN_MASK; mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
- mdcr_el2 |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT; write_sysreg(mdcr_el2, mdcr_el2); if (is_protected_kvm_enabled())
-- 2.24.1
On Tue, 30 Mar 2021 16:23:14 +0100, Mathieu Poirier mathieu.poirier@linaro.org wrote:
On Tue, Mar 30, 2021 at 11:38:18AM +0100, Suzuki K Poulose wrote:
On 26/03/2021 16:55, Mathieu Poirier wrote:
On Tue, Mar 23, 2021 at 12:06:35PM +0000, Suzuki K Poulose wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
Marc - do you want me to pick up this one?
I think the kvmarm tree is the best route for this patch, given the amount of changes the tree is going through, in the areas this patch touches. Or else there would be conflicts with merging. And this patch depends on the patches from this series that were queued.
Here is the depency tree :
a) kvm-arm fixes for debug (Patch 1, 2) & SPE save-restore fix (queued in v5.12-rc3)
b) TRBE defintions and Trace synchronization barrier (Patches 5 & 6)
c) kvm-arm TRBE host support (Patch 7)
d) TRBE driver support (and the ETE changes)
(c) code merge depends on -> (a) + (b) (d) build (no conflicts) depends on -> (b)
Now (d) has an indirect dependency on (c) for operational correctness at runtime. So, if :
kvmarm tree picks up : b + c coresight tree picksup : b + d
and if we could ensure the merge order of the trees are in kvmarm greg-kh (device-misc tree) (coresight goes via this tree)
Greg's char-misc tree is based on the rc releases rather than next. As such it is a while before other branches like kvmarm get merged, causing all sort of compilation breakage.
we should be fine.
Additionally, we could rip out the Kconfig changes from the TRBE patch and add it only at the rc1, once we verify both the trees are in to make sure the runtime operation dependency is not triggered.
We could also do that but Greg might frown at the tactic, and rightly so.
We do that all the times. Otherwise, it is hardly possible to build an infrastructure that spans across multiple subsystems *and* involves userspace. I really wouldn't worry about that.
M.
On Tue, Mar 30, 2021 at 09:23:14AM -0600, Mathieu Poirier wrote:
On Tue, Mar 30, 2021 at 11:38:18AM +0100, Suzuki K Poulose wrote:
On 26/03/2021 16:55, Mathieu Poirier wrote:
On Tue, Mar 23, 2021 at 12:06:35PM +0000, Suzuki K Poulose wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
Marc - do you want me to pick up this one?
I think the kvmarm tree is the best route for this patch, given the amount of changes the tree is going through, in the areas this patch touches. Or else there would be conflicts with merging. And this patch depends on the patches from this series that were queued.
Here is the depency tree :
a) kvm-arm fixes for debug (Patch 1, 2) & SPE save-restore fix (queued in v5.12-rc3)
b) TRBE defintions and Trace synchronization barrier (Patches 5 & 6)
c) kvm-arm TRBE host support (Patch 7)
d) TRBE driver support (and the ETE changes)
(c) code merge depends on -> (a) + (b) (d) build (no conflicts) depends on -> (b)
Now (d) has an indirect dependency on (c) for operational correctness at runtime. So, if :
kvmarm tree picks up : b + c coresight tree picksup : b + d
and if we could ensure the merge order of the trees are in kvmarm greg-kh (device-misc tree) (coresight goes via this tree)
Greg's char-misc tree is based on the rc releases rather than next. As such it is a while before other branches like kvmarm get merged, causing all sort of compilation breakage.
My tree can not be based on -next, and neither can any other maintainer's tree, as next is composed of maintainer trees :)
we should be fine.
Additionally, we could rip out the Kconfig changes from the TRBE patch and add it only at the rc1, once we verify both the trees are in to make sure the runtime operation dependency is not triggered.
We could also do that but Greg might frown at the tactic, and rightly so. The usual way to work with complex merge dependencies is to proceed in steps, which would mean that all KVM related patches go in the v5.13 merge window. When that is done we add the ETE/TRBE for the v5.14 merge window. I agree that we waste an entire cycle but it guarantees to avoid breaking builds and follows the conventional way to do things.
Or someone creates a single branch with a signed tag and it gets pulled into multiple maintainer's trees and never rebased. We've done that lots of time, nothing new there. Or everything goes through one tree, or you wait a release cycle.
You have 3 choices, pick one :)
thanks,
greg k-h
On Tue, 30 Mar 2021 at 09:35, Greg KH gregkh@linuxfoundation.org wrote:
On Tue, Mar 30, 2021 at 09:23:14AM -0600, Mathieu Poirier wrote:
On Tue, Mar 30, 2021 at 11:38:18AM +0100, Suzuki K Poulose wrote:
On 26/03/2021 16:55, Mathieu Poirier wrote:
On Tue, Mar 23, 2021 at 12:06:35PM +0000, Suzuki K Poulose wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
Marc - do you want me to pick up this one?
I think the kvmarm tree is the best route for this patch, given the amount of changes the tree is going through, in the areas this patch touches. Or else there would be conflicts with merging. And this patch depends on the patches from this series that were queued.
Here is the depency tree :
a) kvm-arm fixes for debug (Patch 1, 2) & SPE save-restore fix (queued in v5.12-rc3)
b) TRBE defintions and Trace synchronization barrier (Patches 5 & 6)
c) kvm-arm TRBE host support (Patch 7)
d) TRBE driver support (and the ETE changes)
(c) code merge depends on -> (a) + (b) (d) build (no conflicts) depends on -> (b)
Now (d) has an indirect dependency on (c) for operational correctness at runtime. So, if :
kvmarm tree picks up : b + c coresight tree picksup : b + d
and if we could ensure the merge order of the trees are in kvmarm greg-kh (device-misc tree) (coresight goes via this tree)
Greg's char-misc tree is based on the rc releases rather than next. As such it is a while before other branches like kvmarm get merged, causing all sort of compilation breakage.
My tree can not be based on -next, and neither can any other maintainer's tree, as next is composed of maintainer trees :)
Exactly
we should be fine.
Additionally, we could rip out the Kconfig changes from the TRBE patch and add it only at the rc1, once we verify both the trees are in to make sure the runtime operation dependency is not triggered.
We could also do that but Greg might frown at the tactic, and rightly so. The usual way to work with complex merge dependencies is to proceed in steps, which would mean that all KVM related patches go in the v5.13 merge window. When that is done we add the ETE/TRBE for the v5.14 merge window. I agree that we waste an entire cycle but it guarantees to avoid breaking builds and follows the conventional way to do things.
Or someone creates a single branch with a signed tag and it gets pulled into multiple maintainer's trees and never rebased. We've done that lots of time, nothing new there. Or everything goes through one tree, or you wait a release cycle.
You have 3 choices, pick one :)
I'm perfectly happy with getting this entire set merged via Marc's kvmarm tree, as long as you are fine with it.
thanks,
greg k-h
On Tue, Mar 30, 2021 at 10:33:51AM -0600, Mathieu Poirier wrote:
On Tue, 30 Mar 2021 at 09:35, Greg KH gregkh@linuxfoundation.org wrote:
On Tue, Mar 30, 2021 at 09:23:14AM -0600, Mathieu Poirier wrote:
On Tue, Mar 30, 2021 at 11:38:18AM +0100, Suzuki K Poulose wrote:
On 26/03/2021 16:55, Mathieu Poirier wrote:
On Tue, Mar 23, 2021 at 12:06:35PM +0000, Suzuki K Poulose wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
Marc - do you want me to pick up this one?
I think the kvmarm tree is the best route for this patch, given the amount of changes the tree is going through, in the areas this patch touches. Or else there would be conflicts with merging. And this patch depends on the patches from this series that were queued.
Here is the depency tree :
a) kvm-arm fixes for debug (Patch 1, 2) & SPE save-restore fix (queued in v5.12-rc3)
b) TRBE defintions and Trace synchronization barrier (Patches 5 & 6)
c) kvm-arm TRBE host support (Patch 7)
d) TRBE driver support (and the ETE changes)
(c) code merge depends on -> (a) + (b) (d) build (no conflicts) depends on -> (b)
Now (d) has an indirect dependency on (c) for operational correctness at runtime. So, if :
kvmarm tree picks up : b + c coresight tree picksup : b + d
and if we could ensure the merge order of the trees are in kvmarm greg-kh (device-misc tree) (coresight goes via this tree)
Greg's char-misc tree is based on the rc releases rather than next. As such it is a while before other branches like kvmarm get merged, causing all sort of compilation breakage.
My tree can not be based on -next, and neither can any other maintainer's tree, as next is composed of maintainer trees :)
Exactly
we should be fine.
Additionally, we could rip out the Kconfig changes from the TRBE patch and add it only at the rc1, once we verify both the trees are in to make sure the runtime operation dependency is not triggered.
We could also do that but Greg might frown at the tactic, and rightly so. The usual way to work with complex merge dependencies is to proceed in steps, which would mean that all KVM related patches go in the v5.13 merge window. When that is done we add the ETE/TRBE for the v5.14 merge window. I agree that we waste an entire cycle but it guarantees to avoid breaking builds and follows the conventional way to do things.
Or someone creates a single branch with a signed tag and it gets pulled into multiple maintainer's trees and never rebased. We've done that lots of time, nothing new there. Or everything goes through one tree, or you wait a release cycle.
You have 3 choices, pick one :)
I'm perfectly happy with getting this entire set merged via Marc's kvmarm tree, as long as you are fine with it.
No objection from me at all for this to go that way.
thanks,
greg k-h
On Tue, 30 Mar 2021 at 10:47, Greg KH gregkh@linuxfoundation.org wrote:
On Tue, Mar 30, 2021 at 10:33:51AM -0600, Mathieu Poirier wrote:
On Tue, 30 Mar 2021 at 09:35, Greg KH gregkh@linuxfoundation.org wrote:
On Tue, Mar 30, 2021 at 09:23:14AM -0600, Mathieu Poirier wrote:
On Tue, Mar 30, 2021 at 11:38:18AM +0100, Suzuki K Poulose wrote:
On 26/03/2021 16:55, Mathieu Poirier wrote:
On Tue, Mar 23, 2021 at 12:06:35PM +0000, Suzuki K Poulose wrote: > For a nvhe host, the EL2 must allow the EL1&0 translation > regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must > be saved/restored over a trip to the guest. Also, before > entering the guest, we must flush any trace data if the > TRBE was enabled. And we must prohibit the generation > of trace while we are in EL1 by clearing the TRFCR_EL1. > > For vhe, the EL2 must prevent the EL1 access to the Trace > Buffer. > > Cc: Will Deacon will@kernel.org > Cc: Catalin Marinas catalin.marinas@arm.com > Cc: Marc Zyngier maz@kernel.org > Cc: Mark Rutland mark.rutland@arm.com > Cc: Anshuman Khandual anshuman.khandual@arm.com > Acked-by: Mathieu Poirier mathieu.poirier@linaro.org > Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com > --- > arch/arm64/include/asm/el2_setup.h | 13 +++++++++ > arch/arm64/include/asm/kvm_arm.h | 2 ++ > arch/arm64/include/asm/kvm_host.h | 2 ++ > arch/arm64/kernel/hyp-stub.S | 3 ++- > arch/arm64/kvm/debug.c | 6 ++--- > arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ > arch/arm64/kvm/hyp/nvhe/switch.c | 1 + > 7 files changed, 65 insertions(+), 4 deletions(-) >
Marc - do you want me to pick up this one?
I think the kvmarm tree is the best route for this patch, given the amount of changes the tree is going through, in the areas this patch touches. Or else there would be conflicts with merging. And this patch depends on the patches from this series that were queued.
Here is the depency tree :
a) kvm-arm fixes for debug (Patch 1, 2) & SPE save-restore fix (queued in v5.12-rc3)
b) TRBE defintions and Trace synchronization barrier (Patches 5 & 6)
c) kvm-arm TRBE host support (Patch 7)
d) TRBE driver support (and the ETE changes)
(c) code merge depends on -> (a) + (b) (d) build (no conflicts) depends on -> (b)
Now (d) has an indirect dependency on (c) for operational correctness at runtime. So, if :
kvmarm tree picks up : b + c coresight tree picksup : b + d
and if we could ensure the merge order of the trees are in kvmarm greg-kh (device-misc tree) (coresight goes via this tree)
Greg's char-misc tree is based on the rc releases rather than next. As such it is a while before other branches like kvmarm get merged, causing all sort of compilation breakage.
My tree can not be based on -next, and neither can any other maintainer's tree, as next is composed of maintainer trees :)
Exactly
we should be fine.
Additionally, we could rip out the Kconfig changes from the TRBE patch and add it only at the rc1, once we verify both the trees are in to make sure the runtime operation dependency is not triggered.
We could also do that but Greg might frown at the tactic, and rightly so. The usual way to work with complex merge dependencies is to proceed in steps, which would mean that all KVM related patches go in the v5.13 merge window. When that is done we add the ETE/TRBE for the v5.14 merge window. I agree that we waste an entire cycle but it guarantees to avoid breaking builds and follows the conventional way to do things.
Or someone creates a single branch with a signed tag and it gets pulled into multiple maintainer's trees and never rebased. We've done that lots of time, nothing new there. Or everything goes through one tree, or you wait a release cycle.
You have 3 choices, pick one :)
I'm perfectly happy with getting this entire set merged via Marc's kvmarm tree, as long as you are fine with it.
No objection from me at all for this to go that way.
Swell - Marc, I'll send you a pull request.
thanks,
greg k-h
Hi Suzuki,
[+ Alex]
On Tue, 23 Mar 2021 12:06:35 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index d77d358f9395..bda918948471 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -65,6 +65,19 @@ // use EL1&0 translation. .Lskip_spe_@:
- /* Trace buffer */
- ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4
- cbz x0, .Lskip_trace_@ // Skip if TraceBuffer is not present
- mrs_s x0, SYS_TRBIDR_EL1
- and x0, x0, TRBIDR_PROG
- cbnz x0, .Lskip_trace_@ // If TRBE is available at EL2
- mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
- orr x2, x2, x0 // allow the EL1&0 translation
// to own it.
+.Lskip_trace_@: msr mdcr_el2, x2 // Configure debug traps .endm diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 94d4025acc0b..692c9049befa 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,8 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1 /* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_E2TB_MASK (UL(0x3)) +#define MDCR_EL2_E2TB_SHIFT (UL(24))
Where are these bits defined? DDI0487G_a has them as RES0.
#define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch { struct kvm_guest_debug_arch regs; /* Statistical profiling extension */ u64 pmscr_el1;
/* Self-hosted trace */
} host_debug_state;u64 trfcr_el1;
/* VGIC state */ diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0
- // Use EL2 translations for SPE and disable access from EL1
- // Use EL2 translations for SPE & TRBE and disable access from EL1 mrs x0, mdcr_el2 bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
- bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) msr mdcr_el2, x0
// Transfer the MM state from EL1 to EL2 diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
- Debug ROM Address (MDCR_EL2_TDRA)
- OS related registers (MDCR_EL2_TDOSA)
- Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
- Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
For the record, this is likely to conflict with [1], although that patch still has some issues.
- Additionally, KVM only traps guest accesses to the debug registers if
- the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug); /*
* This also clears MDCR_EL2_E2PB_MASK to disable guest access
* to the profiling buffer.
* This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
*/ vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |* to disable guest access to the profiling and trace buffers
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c index f401724f12ef..9499e18dd28f 100644 --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c @@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1) write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); } +static void __debug_save_trace(u64 *trfcr_el1) +{
Spurious blank line?
- *trfcr_el1 = 0;
- /* Check if we have TRBE */
- if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
ID_AA64DFR0_TRBE_SHIFT))
return;
Do we have a way to track this that doesn't involve reading an ID register? This is on the hot path, and is going to really suck badly with NV (which traps all ID regs for obvious reasons). I would have hoped that one way or another, we'd have a static key for this.
- /* Check we can access the TRBE */
- if ((read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_PROG))
return;
- /* Check if the TRBE is enabled */
- if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_ENABLE))
return;
- /*
* Prohibit trace generation while we are in guest.
* Since access to TRFCR_EL1 is trapped, the guest can't
* modify the filtering set by the host.
If TRFCR_EL1 is trapped, where is the trap handling? This series doesn't touch sys_regs.c, so I assume you rely on the "inject an UNDEF for anything unknown" default behaviour?
If that's the case, I'd rather you add an explicit handler.
*/
- *trfcr_el1 = read_sysreg_s(SYS_TRFCR_EL1);
- write_sysreg_s(0, SYS_TRFCR_EL1);
- isb();
- /* Drain the trace buffer to memory */
- tsb_csync();
- dsb(nsh);
+}
+static void __debug_restore_trace(u64 trfcr_el1) +{
- if (!trfcr_el1)
return;
- /* Restore trace filter controls */
- write_sysreg_s(trfcr_el1, SYS_TRFCR_EL1);
+}
void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu) { /* Disable and flush SPE data generation */ __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
- /* Disable and flush Self-Hosted Trace generation */
- __debug_save_trace(&vcpu->arch.host_debug_state.trfcr_el1);
} void __debug_switch_to_guest(struct kvm_vcpu *vcpu) @@ -72,6 +113,7 @@ void __debug_switch_to_guest(struct kvm_vcpu *vcpu) void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu) { __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
- __debug_restore_trace(vcpu->arch.host_debug_state.trfcr_el1);
} void __debug_switch_to_host(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 68ab6b4d5141..736805232521 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -95,6 +95,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) mdcr_el2 &= MDCR_EL2_HPMN_MASK; mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
- mdcr_el2 |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT;
write_sysreg(mdcr_el2, mdcr_el2); if (is_protected_kvm_enabled()) -- 2.24.1
Thanks,
M.
[1] https://lore.kernel.org/r/20210323180057.263356-1-alexandru.elisei@arm.com
Hi Marc
On 30/03/2021 11:12, Marc Zyngier wrote:
Hi Suzuki,
[+ Alex]
On Tue, 23 Mar 2021 12:06:35 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index d77d358f9395..bda918948471 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -65,6 +65,19 @@ // use EL1&0 translation. .Lskip_spe_@:
- /* Trace buffer */
- ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4
- cbz x0, .Lskip_trace_@ // Skip if TraceBuffer is not present
- mrs_s x0, SYS_TRBIDR_EL1
- and x0, x0, TRBIDR_PROG
- cbnz x0, .Lskip_trace_@ // If TRBE is available at EL2
- mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
- orr x2, x2, x0 // allow the EL1&0 translation
// to own it.
+.Lskip_trace_@: msr mdcr_el2, x2 // Configure debug traps .endm diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 94d4025acc0b..692c9049befa 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,8 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1 /* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_E2TB_MASK (UL(0x3)) +#define MDCR_EL2_E2TB_SHIFT (UL(24))
Where are these bits defined? DDI0487G_a has them as RES0.
They are part of the Future architecture technology and a register definition XML is available here :
https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/MD...
#define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch { struct kvm_guest_debug_arch regs; /* Statistical profiling extension */ u64 pmscr_el1;
/* Self-hosted trace */
} host_debug_state;u64 trfcr_el1;
/* VGIC state */ diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0
- // Use EL2 translations for SPE and disable access from EL1
- // Use EL2 translations for SPE & TRBE and disable access from EL1 mrs x0, mdcr_el2 bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
- bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) msr mdcr_el2, x0
// Transfer the MM state from EL1 to EL2 diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
- Debug ROM Address (MDCR_EL2_TDRA)
- OS related registers (MDCR_EL2_TDOSA)
- Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
- Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
For the record, this is likely to conflict with [1], although that patch still has some issues.
Thanks for the heads up. I think that patch will also conflict with my fixes that is queued in kvmarm/fixes.
- Additionally, KVM only traps guest accesses to the debug registers if
- the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug); /*
* This also clears MDCR_EL2_E2PB_MASK to disable guest access
* to the profiling buffer.
* This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
*/ vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |* to disable guest access to the profiling and trace buffers
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c index f401724f12ef..9499e18dd28f 100644 --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c @@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1) write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); } +static void __debug_save_trace(u64 *trfcr_el1) +{
Spurious blank line?
Sure, will fix it
- *trfcr_el1 = 0;
- /* Check if we have TRBE */
- if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
ID_AA64DFR0_TRBE_SHIFT))
return;
Do we have a way to track this that doesn't involve reading an ID register? This is on the hot path, and is going to really suck badly with NV (which traps all ID regs for obvious reasons). I would have hoped that one way or another, we'd have a static key for this.
TRBE, like SPE can be optionally enabled on a subset of the CPUs. We could have a per-CPU static key in the worst case. I guess this would apply to SPE as well.
May be we could do this check at kvm_arch_vcpu_load()/put() ?
- /* Check we can access the TRBE */
- if ((read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_PROG))
return;
- /* Check if the TRBE is enabled */
- if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_ENABLE))
return;
- /*
* Prohibit trace generation while we are in guest.
* Since access to TRFCR_EL1 is trapped, the guest can't
* modify the filtering set by the host.
If TRFCR_EL1 is trapped, where is the trap handling? This series doesn't touch sys_regs.c, so I assume you rely on the "inject an UNDEF for anything unknown" default behaviour?
Yes.
If that's the case, I'd rather you add an explicit handler.
I could add one.
Cheers Suzuki
*/
- *trfcr_el1 = read_sysreg_s(SYS_TRFCR_EL1);
- write_sysreg_s(0, SYS_TRFCR_EL1);
- isb();
- /* Drain the trace buffer to memory */
- tsb_csync();
- dsb(nsh);
+}
+static void __debug_restore_trace(u64 trfcr_el1) +{
- if (!trfcr_el1)
return;
- /* Restore trace filter controls */
- write_sysreg_s(trfcr_el1, SYS_TRFCR_EL1);
+}
- void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu) { /* Disable and flush SPE data generation */ __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
- /* Disable and flush Self-Hosted Trace generation */
- __debug_save_trace(&vcpu->arch.host_debug_state.trfcr_el1); }
void __debug_switch_to_guest(struct kvm_vcpu *vcpu) @@ -72,6 +113,7 @@ void __debug_switch_to_guest(struct kvm_vcpu *vcpu) void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu) { __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
- __debug_restore_trace(vcpu->arch.host_debug_state.trfcr_el1); }
void __debug_switch_to_host(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 68ab6b4d5141..736805232521 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -95,6 +95,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) mdcr_el2 &= MDCR_EL2_HPMN_MASK; mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
- mdcr_el2 |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT;
write_sysreg(mdcr_el2, mdcr_el2); if (is_protected_kvm_enabled()) -- 2.24.1
Thanks,
M.
[1] https://lore.kernel.org/r/20210323180057.263356-1-alexandru.elisei@arm.com
On Tue, 30 Mar 2021 12:12:49 +0100, Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi Marc
On 30/03/2021 11:12, Marc Zyngier wrote:
Hi Suzuki,
[+ Alex]
On Tue, 23 Mar 2021 12:06:35 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index d77d358f9395..bda918948471 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -65,6 +65,19 @@ // use EL1&0 translation. .Lskip_spe_@:
- /* Trace buffer */
- ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4
- cbz x0, .Lskip_trace_@ // Skip if TraceBuffer is not present
- mrs_s x0, SYS_TRBIDR_EL1
- and x0, x0, TRBIDR_PROG
- cbnz x0, .Lskip_trace_@ // If TRBE is available at EL2
- mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
- orr x2, x2, x0 // allow the EL1&0 translation
// to own it.
+.Lskip_trace_@: msr mdcr_el2, x2 // Configure debug traps .endm diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 94d4025acc0b..692c9049befa 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,8 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1 /* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_E2TB_MASK (UL(0x3)) +#define MDCR_EL2_E2TB_SHIFT (UL(24))
Where are these bits defined? DDI0487G_a has them as RES0.
They are part of the Future architecture technology and a register definition XML is available here :
https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/MD...
It be worth adding a pointer to that documentation until this is part of a released ARM ARM.
#define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch { struct kvm_guest_debug_arch regs; /* Statistical profiling extension */ u64 pmscr_el1;
/* Self-hosted trace */
} host_debug_state; /* VGIC state */u64 trfcr_el1;
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0
- // Use EL2 translations for SPE and disable access from EL1
- // Use EL2 translations for SPE & TRBE and disable access from EL1 mrs x0, mdcr_el2 bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
- bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) msr mdcr_el2, x0 // Transfer the MM state from EL1 to EL2
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
- Debug ROM Address (MDCR_EL2_TDRA)
- OS related registers (MDCR_EL2_TDOSA)
- Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
- Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
For the record, this is likely to conflict with [1], although that patch still has some issues.
Thanks for the heads up. I think that patch will also conflict with my fixes that is queued in kvmarm/fixes.
Most probably. This is a popular landing spot, these days...
- Additionally, KVM only traps guest accesses to the debug registers if
- the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug); /*
* This also clears MDCR_EL2_E2PB_MASK to disable guest access
* to the profiling buffer.
* This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
*/ vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |* to disable guest access to the profiling and trace buffers
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c index f401724f12ef..9499e18dd28f 100644 --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c @@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1) write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); } +static void __debug_save_trace(u64 *trfcr_el1) +{
Spurious blank line?
Sure, will fix it
- *trfcr_el1 = 0;
- /* Check if we have TRBE */
- if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
ID_AA64DFR0_TRBE_SHIFT))
return;
Do we have a way to track this that doesn't involve reading an ID register? This is on the hot path, and is going to really suck badly with NV (which traps all ID regs for obvious reasons). I would have hoped that one way or another, we'd have a static key for this.
TRBE, like SPE can be optionally enabled on a subset of the CPUs. We could have a per-CPU static key in the worst case. I guess this would apply to SPE as well.
Ah, so you want to support asymmetric tracing... fair enough. But I don't think you need a per-CPU static key (and I'm not sure how that'd work either). You could have a static key indicating if *any* CPU implements tracing, in which case the check only happens when at least one CPU is capable of tracing.
You would only need a new capability.
May be we could do this check at kvm_arch_vcpu_load()/put() ?
That would extend the tracing blackout period enormously, wouldn't it? I'm not sure that's the best thing to do...
- /* Check we can access the TRBE */
- if ((read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_PROG))
return;
- /* Check if the TRBE is enabled */
- if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_ENABLE))
return;
- /*
* Prohibit trace generation while we are in guest.
* Since access to TRFCR_EL1 is trapped, the guest can't
* modify the filtering set by the host.
If TRFCR_EL1 is trapped, where is the trap handling? This series doesn't touch sys_regs.c, so I assume you rely on the "inject an UNDEF for anything unknown" default behaviour?
Yes.
If that's the case, I'd rather you add an explicit handler.
I could add one.
Thanks,
M.
On 30/03/2021 13:15, Marc Zyngier wrote:
On Tue, 30 Mar 2021 12:12:49 +0100, Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi Marc
On 30/03/2021 11:12, Marc Zyngier wrote:
Hi Suzuki,
[+ Alex]
On Tue, 23 Mar 2021 12:06:35 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1.
For vhe, the EL2 must prevent the EL1 access to the Trace Buffer.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
arch/arm64/include/asm/el2_setup.h | 13 +++++++++ arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kernel/hyp-stub.S | 3 ++- arch/arm64/kvm/debug.c | 6 ++--- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/switch.c | 1 + 7 files changed, 65 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index d77d358f9395..bda918948471 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -65,6 +65,19 @@ // use EL1&0 translation. .Lskip_spe_@:
- /* Trace buffer */
- ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4
- cbz x0, .Lskip_trace_@ // Skip if TraceBuffer is not present
- mrs_s x0, SYS_TRBIDR_EL1
- and x0, x0, TRBIDR_PROG
- cbnz x0, .Lskip_trace_@ // If TRBE is available at EL2
- mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
- orr x2, x2, x0 // allow the EL1&0 translation
// to own it.
+.Lskip_trace_@: msr mdcr_el2, x2 // Configure debug traps .endm diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 94d4025acc0b..692c9049befa 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,8 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1 /* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_E2TB_MASK (UL(0x3)) +#define MDCR_EL2_E2TB_SHIFT (UL(24))
Where are these bits defined? DDI0487G_a has them as RES0.
They are part of the Future architecture technology and a register definition XML is available here :
https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/MD...
It be worth adding a pointer to that documentation until this is part of a released ARM ARM.
#define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch { struct kvm_guest_debug_arch regs; /* Statistical profiling extension */ u64 pmscr_el1;
/* Self-hosted trace */
} host_debug_state; /* VGIC state */u64 trfcr_el1;
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0
- // Use EL2 translations for SPE and disable access from EL1
- // Use EL2 translations for SPE & TRBE and disable access from EL1 mrs x0, mdcr_el2 bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
- bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) msr mdcr_el2, x0 // Transfer the MM state from EL1 to EL2
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) * - Debug ROM Address (MDCR_EL2_TDRA) * - OS related registers (MDCR_EL2_TDOSA) * - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
- Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
For the record, this is likely to conflict with [1], although that patch still has some issues.
Thanks for the heads up. I think that patch will also conflict with my fixes that is queued in kvmarm/fixes.
Most probably. This is a popular landing spot, these days...
* * Additionally, KVM only traps guest accesses to the debug registers if * the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug); /*
* This also clears MDCR_EL2_E2PB_MASK to disable guest access
* to the profiling buffer.
* This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
*/ vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |* to disable guest access to the profiling and trace buffers
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c index f401724f12ef..9499e18dd28f 100644 --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c @@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1) write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); } +static void __debug_save_trace(u64 *trfcr_el1) +{
Spurious blank line?
Sure, will fix it
- *trfcr_el1 = 0;
- /* Check if we have TRBE */
- if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
ID_AA64DFR0_TRBE_SHIFT))
return;
Do we have a way to track this that doesn't involve reading an ID register? This is on the hot path, and is going to really suck badly with NV (which traps all ID regs for obvious reasons). I would have hoped that one way or another, we'd have a static key for this.
TRBE, like SPE can be optionally enabled on a subset of the CPUs. We could have a per-CPU static key in the worst case. I guess this would apply to SPE as well.
Ah, so you want to support asymmetric tracing... fair enough. But I don't think you need a per-CPU static key (and I'm not sure how that'd work either). You could have a static key indicating if *any* CPU implements tracing, in which case the check only happens when at least one CPU is capable of tracing.
You would only need a new capability.
May be we could do this check at kvm_arch_vcpu_load()/put() ?
That would extend the tracing blackout period enormously, wouldn't it? I'm not sure that's the best thing to do...
Sorry for not making this clear. We could check if the SPE/TRBE is available on this CPU (including the PMB/TRB_IDR bits and a set a flag in the VCPU on every kvm_arch_vcpu_load() and cleared on put. The actual switching code could check this flag and check if the unit is enabled and then do the actual save/restore as we do below. (We may be able to even check if unit is enabled there, need to double check this.)
Cheers Suzuki
On Tue, 30 Mar 2021 14:34:23 +0100, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 30/03/2021 13:15, Marc Zyngier wrote:
On Tue, 30 Mar 2021 12:12:49 +0100, Suzuki K Poulose suzuki.poulose@arm.com wrote:
[...]
May be we could do this check at kvm_arch_vcpu_load()/put() ?
That would extend the tracing blackout period enormously, wouldn't it? I'm not sure that's the best thing to do...
Sorry for not making this clear. We could check if the SPE/TRBE is available on this CPU (including the PMB/TRB_IDR bits and a set a flag in the VCPU on every kvm_arch_vcpu_load() and cleared on put. The actual switching code could check this flag and check if the unit is enabled and then do the actual save/restore as we do below. (We may be able to even check if unit is enabled there, need to double check this.)
Ah, gotcha. Yes, this seems like a reasonable thing to do. We have the per-vcpu debug flags already, and you could piggy-back on that.
Thanks,
M.
Hello,
On 3/30/21 12:12 PM, Suzuki K Poulose wrote:
Hi Marc
On 30/03/2021 11:12, Marc Zyngier wrote:
Hi Suzuki,
[+ Alex]
On Tue, 23 Mar 2021 12:06:35 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
[..]
#define MDCR_EL2_TTRF       (1 << 19)  #define MDCR_EL2_TPMS       (1 << 14)  #define MDCR_EL2_E2PB_MASK   (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch {          struct kvm_guest_debug_arch regs;          /* Statistical profiling extension */          u64 pmscr_el1; +       /* Self-hosted trace */ +       u64 trfcr_el1;      } host_debug_state;       /* VGIC state */ diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)      mrs_s   x0, SYS_VBAR_EL12      msr   vbar_el1, x0  -   // Use EL2 translations for SPE and disable access from EL1 +   // Use EL2 translations for SPE & TRBE and disable access from EL1      mrs   x0, mdcr_el2      bic   x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) +   bic   x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)      msr   mdcr_el2, x0       // Transfer the MM state from EL1 to EL2 diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)   * - Debug ROM Address (MDCR_EL2_TDRA)   * - OS related registers (MDCR_EL2_TDOSA)   * - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- *Â - Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
- *Â - Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
For the record, this is likely to conflict with [1], although that patch still has some issues.
Thanks for the heads up. I think that patch will also conflict with my fixes that is queued in kvmarm/fixes.
How should I proceed with the patch [1]? For the next iteration, should I rebase it on top of kvmarm/fixes or on top of kvmarm/fixes + this series?
[1] https://lore.kernel.org/r/20210323180057.263356-1-alexandru.elisei@arm.com
Thanks, Alex
On Wed, 31 Mar 2021 16:28:46 +0100, Alexandru Elisei alexandru.elisei@arm.com wrote:
Hello,
On 3/30/21 12:12 PM, Suzuki K Poulose wrote:
Hi Marc
On 30/03/2021 11:12, Marc Zyngier wrote:
Hi Suzuki,
[+ Alex]
On Tue, 23 Mar 2021 12:06:35 +0000, Suzuki K Poulose suzuki.poulose@arm.com wrote:
[..]
#define MDCR_EL2_TTRF       (1 << 19)  #define MDCR_EL2_TPMS       (1 << 14)  #define MDCR_EL2_E2PB_MASK   (UL(0x3)) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3d10e6527f7d..80d0a1a82a4c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -315,6 +315,8 @@ struct kvm_vcpu_arch {          struct kvm_guest_debug_arch regs;          /* Statistical profiling extension */          u64 pmscr_el1; +       /* Self-hosted trace */ +       u64 trfcr_el1;      } host_debug_state;       /* VGIC state */ diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..05d25e645b46 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)      mrs_s   x0, SYS_VBAR_EL12      msr   vbar_el1, x0  -   // Use EL2 translations for SPE and disable access from EL1 +   // Use EL2 translations for SPE & TRBE and disable access from EL1      mrs   x0, mdcr_el2      bic   x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) +   bic   x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)      msr   mdcr_el2, x0       // Transfer the MM state from EL1 to EL2 diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbc890511631..7b16f42d39f4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)   * - Debug ROM Address (MDCR_EL2_TDRA)   * - OS related registers (MDCR_EL2_TDOSA)   * - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- *Â - Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
- *Â - Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
For the record, this is likely to conflict with [1], although that patch still has some issues.
Thanks for the heads up. I think that patch will also conflict with my fixes that is queued in kvmarm/fixes.
How should I proceed with the patch [1]? For the next iteration, should I rebase it on top of kvmarm/fixes or on top of kvmarm/fixes + this series?
On top of kvmarm/fixes, please. I'll merge that branch into 5.13 anyway, as it is taking some time to land upstream.
I'll take care of the conflicts, and shout if I need help!
Thanks,
M.
If the CPU implements Arm v8.4 Trace filter controls (FEAT_TRF), move the ETM to trace prohibited region using TRFCR, while disabling.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: Mike Leach mike.leach@linaro.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- .../coresight/coresight-etm4x-core.c | 21 +++++++++++++++++-- drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++ 2 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 15016f757828..00297906669c 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -31,6 +31,7 @@ #include <linux/pm_runtime.h> #include <linux/property.h>
+#include <asm/barrier.h> #include <asm/sections.h> #include <asm/sysreg.h> #include <asm/local.h> @@ -654,6 +655,7 @@ static int etm4_enable(struct coresight_device *csdev, static void etm4_disable_hw(void *info) { u32 control; + u64 trfcr; struct etmv4_drvdata *drvdata = info; struct etmv4_config *config = &drvdata->config; struct coresight_device *csdev = drvdata->csdev; @@ -676,6 +678,16 @@ static void etm4_disable_hw(void *info) /* EN, bit[0] Trace unit enable bit */ control &= ~0x1;
+ /* + * If the CPU supports v8.4 Trace filter Control, + * set the ETM to trace prohibited region. + */ + if (drvdata->trfc) { + trfcr = read_sysreg_s(SYS_TRFCR_EL1); + write_sysreg_s(trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE), + SYS_TRFCR_EL1); + isb(); + } /* * Make sure everything completes before disabling, as recommended * by section 7.3.77 ("TRCVICTLR, ViewInst Main Control Register, @@ -683,12 +695,16 @@ static void etm4_disable_hw(void *info) */ dsb(sy); isb(); + /* Trace synchronization barrier, is a nop if not supported */ + tsb_csync(); etm4x_relaxed_write32(csa, control, TRCPRGCTLR);
/* wait for TRCSTATR.PMSTABLE to go to '1' */ if (coresight_timeout(csa, TRCSTATR, TRCSTATR_PMSTABLE_BIT, 1)) dev_err(etm_dev, "timeout while waiting for PM stable Trace Status\n"); + if (drvdata->trfc) + write_sysreg_s(trfcr, SYS_TRFCR_EL1);
/* read the status of the single shot comparators */ for (i = 0; i < drvdata->nr_ss_cmp; i++) { @@ -873,7 +889,7 @@ static bool etm4_init_csdev_access(struct etmv4_drvdata *drvdata, return false; }
-static void cpu_enable_tracing(void) +static void cpu_enable_tracing(struct etmv4_drvdata *drvdata) { u64 dfr0 = read_sysreg(id_aa64dfr0_el1); u64 trfcr; @@ -881,6 +897,7 @@ static void cpu_enable_tracing(void) if (!cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_TRACE_FILT_SHIFT)) return;
+ drvdata->trfc = true; /* * If the CPU supports v8.4 SelfHosted Tracing, enable * tracing at the kernel EL and EL0, forcing to use the @@ -1082,7 +1099,7 @@ static void etm4_init_arch_data(void *info) /* NUMCNTR, bits[30:28] number of counters available for tracing */ drvdata->nr_cntr = BMVAL(etmidr5, 28, 30); etm4_cs_lock(drvdata, csa); - cpu_enable_tracing(); + cpu_enable_tracing(drvdata); }
static inline u32 etm4_get_victlr_access_type(struct etmv4_config *config) diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index 0af60571aa23..f6478ef642bf 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -862,6 +862,7 @@ struct etmv4_save_state { * @nooverflow: Indicate if overflow prevention is supported. * @atbtrig: If the implementation can support ATB triggers * @lpoverride: If the implementation can support low-power state over. + * @trfc: If the implementation supports Arm v8.4 trace filter controls. * @config: structure holding configuration parameters. * @save_state: State to be preserved across power loss * @state_needs_restore: True when there is context to restore after PM exit @@ -912,6 +913,7 @@ struct etmv4_drvdata { bool nooverflow; bool atbtrig; bool lpoverride; + bool trfc; struct etmv4_config config; struct etmv4_save_state *save_state; bool state_needs_restore;
When a sink is not specified by the user, the etm perf driver finds a suitable sink automatically, based on the first ETM where this event could be scheduled. Then we allocate the sink buffer based on the selected sink. This is fine for a CPU bound event as the "sink" is always guaranteed to be reachable from the ETM (as this is the only ETM where the event is going to be scheduled). However, if we have a thread bound event, the event could be scheduled on any of the ETMs on the system. In this case, currently we automatically select a sink and exclude any ETMs that cannot reach the selected sink. This is problematic especially for 1x1 configurations. We end up in tracing the event only on the "first" ETM, as the default sink is local to the first ETM and unreachable from the rest. However, we could allow the other ETMs to trace if they all have a sink that is compatible with the "selected" sink and can use the sink buffer. This can be easily done by verifying that they are all driven by the same driver and matches the same subtype. Please note that at anytime there can be only one ETM tracing the event.
Adding support for different types of sinks for a single event is complex and is not something that we expect on a sane configuration.
Cc: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Reviewed-by: Mike Leach mike.leach@linaro.org Tested-by: Linu Cherian lcherian@marvell.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- .../hwtracing/coresight/coresight-etm-perf.c | 60 ++++++++++++++++--- 1 file changed, 52 insertions(+), 8 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 0f603b4094f2..aa0974bd265b 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -232,6 +232,25 @@ static void etm_free_aux(void *data) schedule_work(&event_data->work); }
+/* + * Check if two given sinks are compatible with each other, + * so that they can use the same sink buffers, when an event + * moves around. + */ +static bool sinks_compatible(struct coresight_device *a, + struct coresight_device *b) +{ + if (!a || !b) + return false; + /* + * If the sinks are of the same subtype and driven + * by the same driver, we can use the same buffer + * on these sinks. + */ + return (a->subtype.sink_subtype == b->subtype.sink_subtype) && + (sink_ops(a) == sink_ops(b)); +} + static void *etm_setup_aux(struct perf_event *event, void **pages, int nr_pages, bool overwrite) { @@ -239,6 +258,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, int cpu = event->cpu; cpumask_t *mask; struct coresight_device *sink = NULL; + struct coresight_device *user_sink = NULL, *last_sink = NULL; struct etm_event_data *event_data = NULL;
event_data = alloc_event_data(cpu); @@ -249,7 +269,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, /* First get the selected sink from user space. */ if (event->attr.config2) { id = (u32)event->attr.config2; - sink = coresight_get_sink_by_id(id); + sink = user_sink = coresight_get_sink_by_id(id); }
mask = &event_data->mask; @@ -277,14 +297,33 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, }
/* - * No sink provided - look for a default sink for one of the - * devices. At present we only support topology where all CPUs - * use the same sink [N:1], so only need to find one sink. The - * coresight_build_path later will remove any CPU that does not - * attach to the sink, or if we have not found a sink. + * No sink provided - look for a default sink for all the ETMs, + * where this event can be scheduled. + * We allocate the sink specific buffers only once for this + * event. If the ETMs have different default sink devices, we + * can only use a single "type" of sink as the event can carry + * only one sink specific buffer. Thus we have to make sure + * that the sinks are of the same type and driven by the same + * driver, as the one we allocate the buffer for. As such + * we choose the first sink and check if the remaining ETMs + * have a compatible default sink. We don't trace on a CPU + * if the sink is not compatible. */ - if (!sink) + if (!user_sink) { + /* Find the default sink for this ETM */ sink = coresight_find_default_sink(csdev); + if (!sink) { + cpumask_clear_cpu(cpu, mask); + continue; + } + + /* Check if this sink compatible with the last sink */ + if (last_sink && !sinks_compatible(last_sink, sink)) { + cpumask_clear_cpu(cpu, mask); + continue; + } + last_sink = sink; + }
/* * Building a path doesn't enable it, it simply builds a @@ -312,7 +351,12 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, if (!sink_ops(sink)->alloc_buffer || !sink_ops(sink)->free_buffer) goto err;
- /* Allocate the sink buffer for this session */ + /* + * Allocate the sink buffer for this session. All the sinks + * where this event can be scheduled are ensured to be of the + * same type. Thus the same sink configuration is used by the + * sinks. + */ event_data->snk_config = sink_ops(sink)->alloc_buffer(sink, event, pages, nr_pages, overwrite);
If a graph node is not found for a given node, of_get_next_endpoint() will emit the following error message :
OF: graph: no port node found in /<node_name>
If the given component doesn't have any explicit connections (e.g, ETE) we could simply ignore the graph parsing. As for any legacy component where this is mandatory, the device will not be usable as before this patch. Updating the DT bindings to Yaml and enabling the schema checks can detect such issues with the DT.
Cc: Mike Leach mike.leach@linaro.org Cc: Leo Yan leo.yan@linaro.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-platform.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-platform.c b/drivers/hwtracing/coresight/coresight-platform.c index 3629b7885aca..c594f45319fc 100644 --- a/drivers/hwtracing/coresight/coresight-platform.c +++ b/drivers/hwtracing/coresight/coresight-platform.c @@ -90,6 +90,12 @@ static void of_coresight_get_ports_legacy(const struct device_node *node, struct of_endpoint endpoint; int in = 0, out = 0;
+ /* + * Avoid warnings in of_graph_get_next_endpoint() + * if the device doesn't have any graph connections + */ + if (!of_graph_is_present(node)) + return; do { ep = of_graph_get_next_endpoint(node, ep); if (!ep)
ETE may not implement the OS lock and instead could rely on the PE OS Lock for the trace unit access. This is indicated by the TRCOLSR.OSM == 0b100. Add support for handling the PE OS lock
Cc: Mike Leach mike.leach@linaro.org Reviewed-by: mike.leach mike.leach@linaro.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- .../coresight/coresight-etm4x-core.c | 50 +++++++++++++++---- drivers/hwtracing/coresight/coresight-etm4x.h | 15 ++++++ 2 files changed, 56 insertions(+), 9 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 00297906669c..35802caca32a 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -115,30 +115,59 @@ void etm4x_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit) } }
-static void etm4_os_unlock_csa(struct etmv4_drvdata *drvdata, struct csdev_access *csa) +static void etm_detect_os_lock(struct etmv4_drvdata *drvdata, + struct csdev_access *csa) { - /* Writing 0 to TRCOSLAR unlocks the trace registers */ - etm4x_relaxed_write32(csa, 0x0, TRCOSLAR); - drvdata->os_unlock = true; + u32 oslsr = etm4x_relaxed_read32(csa, TRCOSLSR); + + drvdata->os_lock_model = ETM_OSLSR_OSLM(oslsr); +} + +static void etm_write_os_lock(struct etmv4_drvdata *drvdata, + struct csdev_access *csa, u32 val) +{ + val = !!val; + + switch (drvdata->os_lock_model) { + case ETM_OSLOCK_PRESENT: + etm4x_relaxed_write32(csa, val, TRCOSLAR); + break; + case ETM_OSLOCK_PE: + write_sysreg_s(val, SYS_OSLAR_EL1); + break; + default: + pr_warn_once("CPU%d: Unsupported Trace OSLock model: %x\n", + smp_processor_id(), drvdata->os_lock_model); + fallthrough; + case ETM_OSLOCK_NI: + return; + } isb(); }
+static inline void etm4_os_unlock_csa(struct etmv4_drvdata *drvdata, + struct csdev_access *csa) +{ + WARN_ON(drvdata->cpu != smp_processor_id()); + + /* Writing 0 to OS Lock unlocks the trace unit registers */ + etm_write_os_lock(drvdata, csa, 0x0); + drvdata->os_unlock = true; +} + static void etm4_os_unlock(struct etmv4_drvdata *drvdata) { if (!WARN_ON(!drvdata->csdev)) etm4_os_unlock_csa(drvdata, &drvdata->csdev->access); - }
static void etm4_os_lock(struct etmv4_drvdata *drvdata) { if (WARN_ON(!drvdata->csdev)) return; - - /* Writing 0x1 to TRCOSLAR locks the trace registers */ - etm4x_relaxed_write32(&drvdata->csdev->access, 0x1, TRCOSLAR); + /* Writing 0x1 to OS Lock locks the trace registers */ + etm_write_os_lock(drvdata, &drvdata->csdev->access, 0x1); drvdata->os_unlock = false; - isb(); }
static void etm4_cs_lock(struct etmv4_drvdata *drvdata, @@ -937,6 +966,9 @@ static void etm4_init_arch_data(void *info) if (!etm4_init_csdev_access(drvdata, csa)) return;
+ /* Detect the support for OS Lock before we actually use it */ + etm_detect_os_lock(drvdata, csa); + /* Make sure all registers are accessible */ etm4_os_unlock_csa(drvdata, csa); etm4_cs_unlock(drvdata, csa); diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index f6478ef642bf..5b961c5b78d1 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -505,6 +505,20 @@ ETM_MODE_EXCL_KERN | \ ETM_MODE_EXCL_USER)
+/* + * TRCOSLSR.OSLM advertises the OS Lock model. + * OSLM[2:0] = TRCOSLSR[4:3,0] + * + * 0b000 - Trace OS Lock is not implemented. + * 0b010 - Trace OS Lock is implemented. + * 0b100 - Trace OS Lock is not implemented, unit is controlled by PE OS Lock. + */ +#define ETM_OSLOCK_NI 0b000 +#define ETM_OSLOCK_PRESENT 0b010 +#define ETM_OSLOCK_PE 0b100 + +#define ETM_OSLSR_OSLM(oslsr) ((((oslsr) & GENMASK(4, 3)) >> 2) | (oslsr & 0x1)) + /* * TRCDEVARCH Bit field definitions * Bits[31:21] - ARCHITECT = Always Arm Ltd. @@ -898,6 +912,7 @@ struct etmv4_drvdata { u8 s_ex_level; u8 ns_ex_level; u8 q_support; + u8 os_lock_model; bool sticky_enable; bool boot_enable; bool os_unlock;
Add ETE as one of the supported device types we support with ETM4x driver. The devices are named following the existing convention as ete<N>.
ETE mandates that the trace resource status register is programmed before the tracing is turned on. For the moment simply write to it indicating TraceActive.
ETE shares most of the registers with ETMv4 except for some and also adds some new registers. Re-arrange the ETMv4x list to share the common definitions and add the ETE sysreg support.
Reviewed-by: Mike Leach mike.leach@linaro.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- Changes since v4: - Make ete sysreg read/write static (kernel test robot) - Squashed ETE sysreg support patch to this one. - Write 0 to TRCSTATR --- drivers/hwtracing/coresight/Kconfig | 10 +-- .../coresight/coresight-etm4x-core.c | 90 ++++++++++++++++--- .../coresight/coresight-etm4x-sysfs.c | 19 +++- drivers/hwtracing/coresight/coresight-etm4x.h | 66 ++++++++++++-- 4 files changed, 155 insertions(+), 30 deletions(-)
diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig index 7b44ba22cbe1..f154ae7e705d 100644 --- a/drivers/hwtracing/coresight/Kconfig +++ b/drivers/hwtracing/coresight/Kconfig @@ -97,15 +97,15 @@ config CORESIGHT_SOURCE_ETM3X module will be called coresight-etm3x.
config CORESIGHT_SOURCE_ETM4X - tristate "CoreSight Embedded Trace Macrocell 4.x driver" + tristate "CoreSight ETMv4.x / ETE driver" depends on ARM64 select CORESIGHT_LINKS_AND_SINKS select PID_IN_CONTEXTIDR help - This driver provides support for the ETM4.x tracer module, tracing the - instructions that a processor is executing. This is primarily useful - for instruction level tracing. Depending on the implemented version - data tracing may also be available. + This driver provides support for the CoreSight Embedded Trace Macrocell + version 4.x and the Embedded Trace Extensions (ETE). Both are CPU tracer + modules, tracing the instructions that a processor is executing. This is + primarily useful for instruction level tracing.
To compile this driver as a module, choose M here: the module will be called coresight-etm4x. diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 35802caca32a..efb84ced83dd 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -115,6 +115,38 @@ void etm4x_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit) } }
+static u64 ete_sysreg_read(u32 offset, bool _relaxed, bool _64bit) +{ + u64 res = 0; + + switch (offset) { + ETE_READ_CASES(res) + default : + pr_warn_ratelimited("ete: trying to read unsupported register @%x\n", + offset); + } + + if (!_relaxed) + __iormb(res); /* Imitate the !relaxed I/O helpers */ + + return res; +} + +static void ete_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit) +{ + if (!_relaxed) + __iowmb(); /* Imitate the !relaxed I/O helpers */ + if (!_64bit) + val &= GENMASK(31, 0); + + switch (offset) { + ETE_WRITE_CASES(val) + default : + pr_warn_ratelimited("ete: trying to write to unsupported register @%x\n", + offset); + } +} + static void etm_detect_os_lock(struct etmv4_drvdata *drvdata, struct csdev_access *csa) { @@ -401,6 +433,13 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata) etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR); }
+ /* + * ETE mandates that the TRCRSR is written to before + * enabling it. + */ + if (etm4x_is_ete(drvdata)) + etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR); + /* Enable the trace unit */ etm4x_relaxed_write32(csa, 1, TRCPRGCTLR);
@@ -862,13 +901,24 @@ static bool etm4_init_sysreg_access(struct etmv4_drvdata *drvdata, * ETMs implementing sysreg access must implement TRCDEVARCH. */ devarch = read_etm4x_sysreg_const_offset(TRCDEVARCH); - if ((devarch & ETM_DEVARCH_ID_MASK) != ETM_DEVARCH_ETMv4x_ARCH) + switch (devarch & ETM_DEVARCH_ID_MASK) { + case ETM_DEVARCH_ETMv4x_ARCH: + *csa = (struct csdev_access) { + .io_mem = false, + .read = etm4x_sysreg_read, + .write = etm4x_sysreg_write, + }; + break; + case ETM_DEVARCH_ETE_ARCH: + *csa = (struct csdev_access) { + .io_mem = false, + .read = ete_sysreg_read, + .write = ete_sysreg_write, + }; + break; + default: return false; - *csa = (struct csdev_access) { - .io_mem = false, - .read = etm4x_sysreg_read, - .write = etm4x_sysreg_write, - }; + }
drvdata->arch = etm_devarch_to_arch(devarch); return true; @@ -1809,6 +1859,8 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid) struct etmv4_drvdata *drvdata; struct coresight_desc desc = { 0 }; struct etm4_init_arg init_arg = { 0 }; + u8 major, minor; + char *type_name;
drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL); if (!drvdata) @@ -1835,10 +1887,6 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid) if (drvdata->cpu < 0) return drvdata->cpu;
- desc.name = devm_kasprintf(dev, GFP_KERNEL, "etm%d", drvdata->cpu); - if (!desc.name) - return -ENOMEM; - init_arg.drvdata = drvdata; init_arg.csa = &desc.access; init_arg.pid = etm_pid; @@ -1855,6 +1903,22 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid) fwnode_property_present(dev_fwnode(dev), "qcom,skip-power-up")) drvdata->skip_power_up = true;
+ major = ETM_ARCH_MAJOR_VERSION(drvdata->arch); + minor = ETM_ARCH_MINOR_VERSION(drvdata->arch); + + if (etm4x_is_ete(drvdata)) { + type_name = "ete"; + /* ETE v1 has major version == 0b101. Adjust this for logging.*/ + major -= 4; + } else { + type_name = "etm"; + } + + desc.name = devm_kasprintf(dev, GFP_KERNEL, + "%s%d", type_name, drvdata->cpu); + if (!desc.name) + return -ENOMEM; + etm4_init_trace_id(drvdata); etm4_set_default(&drvdata->config);
@@ -1882,9 +1946,8 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid)
etmdrvdata[drvdata->cpu] = drvdata;
- dev_info(&drvdata->csdev->dev, "CPU%d: ETM v%d.%d initialized\n", - drvdata->cpu, ETM_ARCH_MAJOR_VERSION(drvdata->arch), - ETM_ARCH_MINOR_VERSION(drvdata->arch)); + dev_info(&drvdata->csdev->dev, "CPU%d: %s v%d.%d initialized\n", + drvdata->cpu, type_name, major, minor);
if (boot_enable) { coresight_enable(drvdata->csdev); @@ -2027,6 +2090,7 @@ static struct amba_driver etm4x_amba_driver = {
static const struct of_device_id etm4_sysreg_match[] = { { .compatible = "arm,coresight-etm4x-sysreg" }, + { .compatible = "arm,embedded-trace-extension" }, {} };
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c index 0995a10790f4..007bad9e7ad8 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c @@ -2374,12 +2374,20 @@ static inline bool etm4x_register_implemented(struct etmv4_drvdata *drvdata, u32 offset) { switch (offset) { - ETM4x_SYSREG_LIST_CASES + ETM_COMMON_SYSREG_LIST_CASES /* - * Registers accessible via system instructions are always - * implemented. + * Common registers to ETE & ETM4x accessible via system + * instructions are always implemented. */ return true; + + ETM4x_ONLY_SYSREG_LIST_CASES + /* + * We only support etm4x and ete. So if the device is not + * ETE, it must be ETMv4x. + */ + return !etm4x_is_ete(drvdata); + ETM4x_MMAP_LIST_CASES /* * Registers accessible only via memory-mapped registers @@ -2389,8 +2397,13 @@ etm4x_register_implemented(struct etmv4_drvdata *drvdata, u32 offset) * coresight_register() and the csdev is not initialized * until that is done. So rely on the drvdata->base to * detect if we have a memory mapped access. + * Also ETE doesn't implement memory mapped access, thus + * it is sufficient to check that we are using mmio. */ return !!drvdata->base; + + ETE_ONLY_SYSREG_LIST_CASES + return etm4x_is_ete(drvdata); }
return false; diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index 5b961c5b78d1..e5b79bdb9851 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -29,6 +29,7 @@ #define TRCAUXCTLR 0x018 #define TRCEVENTCTL0R 0x020 #define TRCEVENTCTL1R 0x024 +#define TRCRSR 0x028 #define TRCSTALLCTLR 0x02C #define TRCTSCTLR 0x030 #define TRCSYNCPR 0x034 @@ -49,6 +50,7 @@ #define TRCSEQRSTEVR 0x118 #define TRCSEQSTR 0x11C #define TRCEXTINSELR 0x120 +#define TRCEXTINSELRn(n) (0x120 + (n * 4)) /* n = 0-3 */ #define TRCCNTRLDVRn(n) (0x140 + (n * 4)) /* n = 0-3 */ #define TRCCNTCTLRn(n) (0x150 + (n * 4)) /* n = 0-3 */ #define TRCCNTVRn(n) (0x160 + (n * 4)) /* n = 0-3 */ @@ -126,6 +128,8 @@ #define TRCCIDR2 0xFF8 #define TRCCIDR3 0xFFC
+#define TRCRSR_TA BIT(12) + /* * System instructions to access ETM registers. * See ETMv4.4 spec ARM IHI0064F section 4.3.6 System instructions @@ -160,10 +164,22 @@ #define CASE_NOP(__unused, x) \ case (x): /* fall through */
+#define ETE_ONLY_SYSREG_LIST(op, val) \ + CASE_##op((val), TRCRSR) \ + CASE_##op((val), TRCEXTINSELRn(1)) \ + CASE_##op((val), TRCEXTINSELRn(2)) \ + CASE_##op((val), TRCEXTINSELRn(3)) + /* List of registers accessible via System instructions */ -#define ETM_SYSREG_LIST(op, val) \ - CASE_##op((val), TRCPRGCTLR) \ +#define ETM4x_ONLY_SYSREG_LIST(op, val) \ CASE_##op((val), TRCPROCSELR) \ + CASE_##op((val), TRCVDCTLR) \ + CASE_##op((val), TRCVDSACCTLR) \ + CASE_##op((val), TRCVDARCCTLR) \ + CASE_##op((val), TRCOSLAR) + +#define ETM_COMMON_SYSREG_LIST(op, val) \ + CASE_##op((val), TRCPRGCTLR) \ CASE_##op((val), TRCSTATR) \ CASE_##op((val), TRCCONFIGR) \ CASE_##op((val), TRCAUXCTLR) \ @@ -180,9 +196,6 @@ CASE_##op((val), TRCVIIECTLR) \ CASE_##op((val), TRCVISSCTLR) \ CASE_##op((val), TRCVIPCSSCTLR) \ - CASE_##op((val), TRCVDCTLR) \ - CASE_##op((val), TRCVDSACCTLR) \ - CASE_##op((val), TRCVDARCCTLR) \ CASE_##op((val), TRCSEQEVRn(0)) \ CASE_##op((val), TRCSEQEVRn(1)) \ CASE_##op((val), TRCSEQEVRn(2)) \ @@ -277,7 +290,6 @@ CASE_##op((val), TRCSSPCICRn(5)) \ CASE_##op((val), TRCSSPCICRn(6)) \ CASE_##op((val), TRCSSPCICRn(7)) \ - CASE_##op((val), TRCOSLAR) \ CASE_##op((val), TRCOSLSR) \ CASE_##op((val), TRCACVRn(0)) \ CASE_##op((val), TRCACVRn(1)) \ @@ -369,12 +381,38 @@ CASE_##op((val), TRCPIDR2) \ CASE_##op((val), TRCPIDR3)
-#define ETM4x_READ_SYSREG_CASES(res) ETM_SYSREG_LIST(READ, (res)) -#define ETM4x_WRITE_SYSREG_CASES(val) ETM_SYSREG_LIST(WRITE, (val)) +#define ETM4x_READ_SYSREG_CASES(res) \ + ETM_COMMON_SYSREG_LIST(READ, (res)) \ + ETM4x_ONLY_SYSREG_LIST(READ, (res)) + +#define ETM4x_WRITE_SYSREG_CASES(val) \ + ETM_COMMON_SYSREG_LIST(WRITE, (val)) \ + ETM4x_ONLY_SYSREG_LIST(WRITE, (val)) + +#define ETM_COMMON_SYSREG_LIST_CASES \ + ETM_COMMON_SYSREG_LIST(NOP, __unused) + +#define ETM4x_ONLY_SYSREG_LIST_CASES \ + ETM4x_ONLY_SYSREG_LIST(NOP, __unused) + +#define ETM4x_SYSREG_LIST_CASES \ + ETM_COMMON_SYSREG_LIST_CASES \ + ETM4x_ONLY_SYSREG_LIST(NOP, __unused)
-#define ETM4x_SYSREG_LIST_CASES ETM_SYSREG_LIST(NOP, __unused) #define ETM4x_MMAP_LIST_CASES ETM_MMAP_LIST(NOP, __unused)
+/* ETE only supports system register access */ +#define ETE_READ_CASES(res) \ + ETM_COMMON_SYSREG_LIST(READ, (res)) \ + ETE_ONLY_SYSREG_LIST(READ, (res)) + +#define ETE_WRITE_CASES(val) \ + ETM_COMMON_SYSREG_LIST(WRITE, (val)) \ + ETE_ONLY_SYSREG_LIST(WRITE, (val)) + +#define ETE_ONLY_SYSREG_LIST_CASES \ + ETE_ONLY_SYSREG_LIST(NOP, __unused) + #define read_etm4x_sysreg_offset(offset, _64bit) \ ({ \ u64 __val; \ @@ -555,11 +593,14 @@ ((ETM_DEVARCH_MAKE_ARCHID_ARCH_VER(major)) | ETM_DEVARCH_ARCHID_ARCH_PART(0xA13))
#define ETM_DEVARCH_ARCHID_ETMv4x ETM_DEVARCH_MAKE_ARCHID(0x4) +#define ETM_DEVARCH_ARCHID_ETE ETM_DEVARCH_MAKE_ARCHID(0x5)
#define ETM_DEVARCH_ID_MASK \ (ETM_DEVARCH_ARCHITECT_MASK | ETM_DEVARCH_ARCHID_MASK | ETM_DEVARCH_PRESENT) #define ETM_DEVARCH_ETMv4x_ARCH \ (ETM_DEVARCH_ARCHITECT_ARM | ETM_DEVARCH_ARCHID_ETMv4x | ETM_DEVARCH_PRESENT) +#define ETM_DEVARCH_ETE_ARCH \ + (ETM_DEVARCH_ARCHITECT_ARM | ETM_DEVARCH_ARCHID_ETE | ETM_DEVARCH_PRESENT)
#define TRCSTATR_IDLE_BIT 0 #define TRCSTATR_PMSTABLE_BIT 1 @@ -649,6 +690,8 @@ #define ETM_ARCH_MINOR_VERSION(arch) ((arch) & 0xfU)
#define ETM_ARCH_V4 ETM_ARCH_VERSION(4, 0) +#define ETM_ARCH_ETE ETM_ARCH_VERSION(5, 0) + /* Interpretation of resource numbers change at ETM v4.3 architecture */ #define ETM_ARCH_V4_3 ETM_ARCH_VERSION(4, 3)
@@ -957,4 +1000,9 @@ void etm4_config_trace_mode(struct etmv4_config *config);
u64 etm4x_sysreg_read(u32 offset, bool _relaxed, bool _64bit); void etm4x_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit); + +static inline bool etm4x_is_ete(struct etmv4_drvdata *drvdata) +{ + return drvdata->arch >= ETM_ARCH_ETE; +} #endif
Document the device tree bindings for Embedded Trace Extensions. ETE can be connected to legacy coresight components and thus could optionally contain a connection graph as described by the CoreSight bindings.
Cc: devicetree@vger.kernel.org Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Rob Herring robh@kernel.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- Changes since v4: - Fix the out-ports definition (Rob Herring) --- .../devicetree/bindings/arm/ete.yaml | 75 +++++++++++++++++++ MAINTAINERS | 1 + 2 files changed, 76 insertions(+) create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
diff --git a/Documentation/devicetree/bindings/arm/ete.yaml b/Documentation/devicetree/bindings/arm/ete.yaml new file mode 100644 index 000000000000..7f9b2d1e1147 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/ete.yaml @@ -0,0 +1,75 @@ +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause +# Copyright 2021, Arm Ltd +%YAML 1.2 +--- +$id: "http://devicetree.org/schemas/arm/ete.yaml#" +$schema: "http://devicetree.org/meta-schemas/core.yaml#" + +title: ARM Embedded Trace Extensions + +maintainers: + - Suzuki K Poulose suzuki.poulose@arm.com + - Mathieu Poirier mathieu.poirier@linaro.org + +description: | + Arm Embedded Trace Extension(ETE) is a per CPU trace component that + allows tracing the CPU execution. It overlaps with the CoreSight ETMv4 + architecture and has extended support for future architecture changes. + The trace generated by the ETE could be stored via legacy CoreSight + components (e.g, TMC-ETR) or other means (e.g, using a per CPU buffer + Arm Trace Buffer Extension (TRBE)). Since the ETE can be connected to + legacy CoreSight components, a node must be listed per instance, along + with any optional connection graph as per the coresight bindings. + See bindings/arm/coresight.txt. + +properties: + $nodename: + pattern: "^ete([0-9a-f]+)$" + compatible: + items: + - const: arm,embedded-trace-extension + + cpu: + description: | + Handle to the cpu this ETE is bound to. + $ref: /schemas/types.yaml#/definitions/phandle + + out-ports: + description: | + Output connections from the ETE to legacy CoreSight trace bus. + $ref: /schemas/graph.yaml#/properties/ports + properties: + port: + description: Output connection from the ETE to legacy CoreSight Trace bus. + $ref: /schemas/graph.yaml#/properties/port + +required: + - compatible + - cpu + +additionalProperties: false + +examples: + +# An ETE node without legacy CoreSight connections + - | + ete0 { + compatible = "arm,embedded-trace-extension"; + cpu = <&cpu_0>; + }; +# An ETE node with legacy CoreSight connections + - | + ete1 { + compatible = "arm,embedded-trace-extension"; + cpu = <&cpu_1>; + + out-ports { /* legacy coresight connection */ + port { + ete1_out_port: endpoint { + remote-endpoint = <&funnel_in_port0>; + }; + }; + }; + }; + +... diff --git a/MAINTAINERS b/MAINTAINERS index 9e876927c60d..3454ed1011c8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1761,6 +1761,7 @@ F: Documentation/ABI/testing/sysfs-bus-coresight-devices-* F: Documentation/devicetree/bindings/arm/coresight-cpu-debug.txt F: Documentation/devicetree/bindings/arm/coresight-cti.yaml F: Documentation/devicetree/bindings/arm/coresight.txt +F: Documentation/devicetree/bindings/arm/ete.yaml F: Documentation/trace/coresight/* F: drivers/hwtracing/coresight/* F: include/dt-bindings/arm/coresight-cti-dt.h
On Tue, 23 Mar 2021 12:06:41 +0000, Suzuki K Poulose wrote:
Document the device tree bindings for Embedded Trace Extensions. ETE can be connected to legacy coresight components and thus could optionally contain a connection graph as described by the CoreSight bindings.
Cc: devicetree@vger.kernel.org Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Rob Herring robh@kernel.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
Changes since v4:
- Fix the out-ports definition (Rob Herring)
.../devicetree/bindings/arm/ete.yaml | 75 +++++++++++++++++++ MAINTAINERS | 1 + 2 files changed, 76 insertions(+) create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
Reviewed-by: Rob Herring robh@kernel.org
The context associated with an ETM for a given perf event includes : - handle -> the perf output handle for the AUX buffer. - the path for the trace components - the buffer config for the sink.
The path and the buffer config are part of the "aux_priv" data (etm_event_data) setup by the setup_aux() callback, and made available via perf_get_aux(handle).
Now with a sink supporting IRQ, the sink could "end" an output handle when the buffer reaches the programmed limit and would try to restart a handle. This could fail if there is not enough space left the AUX buffer (e.g, the userspace has not consumed the data). This leaves the "handle" disconnected from the "event" and also the "perf_get_aux()" cleared. This all happens within the sink driver, without the etm_perf driver being aware. Now when the event is actually stopped, etm_event_stop() will need to access the "event_data". But since the handle is not valid anymore, we loose the information to stop the "trace" path. So, we need a reliable way to access the etm_event_data even when the handle may not be active.
This patch replaces the per_cpu handle array with a per_cpu context for the ETM, which tracks the "handle" as well as the "etm_event_data". The context notes the etm_event_data at etm_event_start() and clears it at etm_event_stop(). This makes sure that we don't access a stale "etm_event_data" as we are guaranteed that it is not freed by free_aux() as long as the event is active and tracing, also provides us with access to the critical information needed to wind up a session even in the absence of an active output_handle.
This is not an issue for the legacy sinks as none of them supports an IRQ and is centrally handled by the etm-perf.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Anshuman Khandual anshuman.khandual@arm.com Cc: Leo Yan leo.yan@linaro.org Cc: Mike Leach mike.leach@linaro.org Reviewed-by: Mike Leach mike.leach@linaro.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- .../hwtracing/coresight/coresight-etm-perf.c | 59 +++++++++++++++++-- 1 file changed, 54 insertions(+), 5 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index aa0974bd265b..f123c26b9f54 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -24,7 +24,26 @@ static struct pmu etm_pmu; static bool etm_perf_up;
-static DEFINE_PER_CPU(struct perf_output_handle, ctx_handle); +/* + * An ETM context for a running event includes the perf aux handle + * and aux_data. For ETM, the aux_data (etm_event_data), consists of + * the trace path and the sink configuration. The event data is accessible + * via perf_get_aux(handle). However, a sink could "end" a perf output + * handle via the IRQ handler. And if the "sink" encounters a failure + * to "begin" another session (e.g due to lack of space in the buffer), + * the handle will be cleared. Thus, the event_data may not be accessible + * from the handle when we get to the etm_event_stop(), which is required + * for stopping the trace path. The event_data is guaranteed to stay alive + * until "free_aux()", which cannot happen as long as the event is active on + * the ETM. Thus the event_data for the session must be part of the ETM context + * to make sure we can disable the trace path. + */ +struct etm_ctxt { + struct perf_output_handle handle; + struct etm_event_data *event_data; +}; + +static DEFINE_PER_CPU(struct etm_ctxt, etm_ctxt); static DEFINE_PER_CPU(struct coresight_device *, csdev_src);
/* @@ -376,13 +395,18 @@ static void etm_event_start(struct perf_event *event, int flags) { int cpu = smp_processor_id(); struct etm_event_data *event_data; - struct perf_output_handle *handle = this_cpu_ptr(&ctx_handle); + struct etm_ctxt *ctxt = this_cpu_ptr(&etm_ctxt); + struct perf_output_handle *handle = &ctxt->handle; struct coresight_device *sink, *csdev = per_cpu(csdev_src, cpu); struct list_head *path;
if (!csdev) goto fail;
+ /* Have we messed up our tracking ? */ + if (WARN_ON(ctxt->event_data)) + goto fail; + /* * Deal with the ring buffer API and get a handle on the * session's information. @@ -418,6 +442,8 @@ static void etm_event_start(struct perf_event *event, int flags) if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF)) goto fail_disable_path;
+ /* Save the event_data for this ETM */ + ctxt->event_data = event_data; out: return;
@@ -436,13 +462,30 @@ static void etm_event_stop(struct perf_event *event, int mode) int cpu = smp_processor_id(); unsigned long size; struct coresight_device *sink, *csdev = per_cpu(csdev_src, cpu); - struct perf_output_handle *handle = this_cpu_ptr(&ctx_handle); - struct etm_event_data *event_data = perf_get_aux(handle); + struct etm_ctxt *ctxt = this_cpu_ptr(&etm_ctxt); + struct perf_output_handle *handle = &ctxt->handle; + struct etm_event_data *event_data; struct list_head *path;
+ /* + * If we still have access to the event_data via handle, + * confirm that we haven't messed up the tracking. + */ + if (handle->event && + WARN_ON(perf_get_aux(handle) != ctxt->event_data)) + return; + + event_data = ctxt->event_data; + /* Clear the event_data as this ETM is stopping the trace. */ + ctxt->event_data = NULL; + if (event->hw.state == PERF_HES_STOPPED) return;
+ /* We must have a valid event_data for a running event */ + if (WARN_ON(!event_data)) + return; + if (!csdev) return;
@@ -460,7 +503,13 @@ static void etm_event_stop(struct perf_event *event, int mode) /* tell the core */ event->hw.state = PERF_HES_STOPPED;
- if (mode & PERF_EF_UPDATE) { + /* + * If the handle is not bound to an event anymore + * (e.g, the sink driver was unable to restart the + * handle due to lack of buffer space), we don't + * have to do anything here. + */ + if (handle->event && (mode & PERF_EF_UPDATE)) { if (WARN_ON_ONCE(handle->event != event)) return;
From: Anshuman Khandual anshuman.khandual@arm.com
Add support for dedicated sinks that are bound to individual CPUs. (e.g, TRBE). To allow quicker access to the sink for a given CPU bound source, keep a percpu array of the sink devices. Also, add support for building a path to the CPU local sink from the ETM.
This adds a new percpu sink type CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM. This new sink type is exclusively available and can only work with percpu source type device CORESIGHT_DEV_SUBTYPE_SOURCE_PROC.
This defines a percpu structure that accommodates a single coresight_device which can be used to store an initialized instance from a sink driver. As these sinks are exclusively linked and dependent on corresponding percpu sources devices, they should also be the default sink device during a perf session.
Outwards device connections are scanned while establishing paths between a source and a sink device. But such connections are not present for certain percpu source and sink devices which are exclusively linked and dependent. Build the path directly and skip connection scanning for such devices.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Suzuki K Poulose suzuki.poulose@arm.com Tested-by: Suzuki K Poulose suzuki.poulose@arm.com Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Reviewed-by: Mike Leach mike.leach@linaro.org Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com [Moved the set/get percpu sink APIs from TRBE patch to here Fixed build break on arm32 ] Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- Changes since v4: - Fix build on arm32 kernel --- drivers/hwtracing/coresight/coresight-core.c | 29 ++++++++++++++++++-- drivers/hwtracing/coresight/coresight-priv.h | 3 ++ include/linux/coresight.h | 13 +++++++++ 3 files changed, 43 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index 0062c8935653..55c645616bf6 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -23,6 +23,7 @@ #include "coresight-priv.h"
static DEFINE_MUTEX(coresight_mutex); +DEFINE_PER_CPU(struct coresight_device *, csdev_sink);
/** * struct coresight_node - elements of a path, from source to sink @@ -70,6 +71,18 @@ void coresight_remove_cti_ops(void) } EXPORT_SYMBOL_GPL(coresight_remove_cti_ops);
+void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev) +{ + per_cpu(csdev_sink, cpu) = csdev; +} +EXPORT_SYMBOL_GPL(coresight_set_percpu_sink); + +struct coresight_device *coresight_get_percpu_sink(int cpu) +{ + return per_cpu(csdev_sink, cpu); +} +EXPORT_SYMBOL_GPL(coresight_get_percpu_sink); + static int coresight_id_match(struct device *dev, void *data) { int trace_id, i_trace_id; @@ -784,6 +797,14 @@ static int _coresight_build_path(struct coresight_device *csdev, if (csdev == sink) goto out;
+ if (coresight_is_percpu_source(csdev) && coresight_is_percpu_sink(sink) && + sink == per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev))) { + if (_coresight_build_path(sink, sink, path) == 0) { + found = true; + goto out; + } + } + /* Not a sink - recursively explore each port found on this element */ for (i = 0; i < csdev->pdata->nr_outport; i++) { struct coresight_device *child_dev; @@ -999,8 +1020,12 @@ coresight_find_default_sink(struct coresight_device *csdev) int depth = 0;
/* look for a default sink if we have not found for this device */ - if (!csdev->def_sink) - csdev->def_sink = coresight_find_sink(csdev, &depth); + if (!csdev->def_sink) { + if (coresight_is_percpu_source(csdev)) + csdev->def_sink = per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev)); + if (!csdev->def_sink) + csdev->def_sink = coresight_find_sink(csdev, &depth); + } return csdev->def_sink; }
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index f5f654ea2994..ff1dd2092ac5 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -232,4 +232,7 @@ coresight_find_csdev_by_fwnode(struct fwnode_handle *r_fwnode); void coresight_set_assoc_ectdev_mutex(struct coresight_device *csdev, struct coresight_device *ect_csdev);
+void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev); +struct coresight_device *coresight_get_percpu_sink(int cpu); + #endif diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 976ec2697610..85008a65e21f 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -50,6 +50,7 @@ enum coresight_dev_subtype_sink { CORESIGHT_DEV_SUBTYPE_SINK_PORT, CORESIGHT_DEV_SUBTYPE_SINK_BUFFER, CORESIGHT_DEV_SUBTYPE_SINK_SYSMEM, + CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM, };
enum coresight_dev_subtype_link { @@ -455,6 +456,18 @@ static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 o } #endif /* CONFIG_64BIT */
+static inline bool coresight_is_percpu_source(struct coresight_device *csdev) +{ + return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SOURCE) && + (csdev->subtype.source_subtype == CORESIGHT_DEV_SUBTYPE_SOURCE_PROC); +} + +static inline bool coresight_is_percpu_sink(struct coresight_device *csdev) +{ + return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SINK) && + (csdev->subtype.sink_subtype == CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM); +} + extern struct coresight_device * coresight_register(struct coresight_desc *desc); extern void coresight_unregister(struct coresight_device *csdev);
From: Anshuman Khandual anshuman.khandual@arm.com
Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is accessible via the system registers. The TRBE supports different addressing modes including CPU virtual address and buffer modes including the circular buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1), an write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the access to the trace buffer could be prohibited by a higher exception level (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU private interrupt (PPI) on address translation errors and when the buffer is full. Overall implementation here is inspired from the Arm SPE driver.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com [ Mark the buffer truncated on WRAP event, error code cleanup ] Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- Changes since v4: - Mark the buffer as truncated on WRAP event (Mike Leach) - Tidy up limit_pointer computation from TRBLIMITR_EL1 - Fix error code for vmap failure. (Mathieu) --- drivers/hwtracing/coresight/Kconfig | 14 + drivers/hwtracing/coresight/Makefile | 1 + drivers/hwtracing/coresight/coresight-trbe.c | 1157 ++++++++++++++++++ drivers/hwtracing/coresight/coresight-trbe.h | 152 +++ 4 files changed, 1324 insertions(+) create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig index f154ae7e705d..84530fd80998 100644 --- a/drivers/hwtracing/coresight/Kconfig +++ b/drivers/hwtracing/coresight/Kconfig @@ -173,4 +173,18 @@ config CORESIGHT_CTI_INTEGRATION_REGS CTI trigger connections between this and other devices.These registers are not used in normal operation and can leave devices in an inconsistent state. + +config CORESIGHT_TRBE + tristate "Trace Buffer Extension (TRBE) driver" + depends on ARM64 && CORESIGHT_SOURCE_ETM4X + help + This driver provides support for percpu Trace Buffer Extension (TRBE). + TRBE always needs to be used along with it's corresponding percpu ETE + component. ETE generates trace data which is then captured with TRBE. + Unlike traditional sink devices, TRBE is a CPU feature accessible via + system registers. But it's explicit dependency with trace unit (ETE) + requires it to be plugged in as a coresight sink device. + + To compile this driver as a module, choose M here: the module will be + called coresight-trbe. endif diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile index f20e357758d1..d60816509755 100644 --- a/drivers/hwtracing/coresight/Makefile +++ b/drivers/hwtracing/coresight/Makefile @@ -21,5 +21,6 @@ obj-$(CONFIG_CORESIGHT_STM) += coresight-stm.o obj-$(CONFIG_CORESIGHT_CPU_DEBUG) += coresight-cpu-debug.o obj-$(CONFIG_CORESIGHT_CATU) += coresight-catu.o obj-$(CONFIG_CORESIGHT_CTI) += coresight-cti.o +obj-$(CONFIG_CORESIGHT_TRBE) += coresight-trbe.o coresight-cti-y := coresight-cti-core.o coresight-cti-platform.o \ coresight-cti-sysfs.o diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c new file mode 100644 index 000000000000..edd70c37fffb --- /dev/null +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -0,0 +1,1157 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * This driver enables Trace Buffer Extension (TRBE) as a per-cpu coresight + * sink device could then pair with an appropriate per-cpu coresight source + * device (ETE) thus generating required trace data. Trace can be enabled + * via the perf framework. + * + * The AUX buffer handling is inspired from Arm SPE PMU driver. + * + * Copyright (C) 2020 ARM Ltd. + * + * Author: Anshuman Khandual anshuman.khandual@arm.com + */ +#define DRVNAME "arm_trbe" + +#define pr_fmt(fmt) DRVNAME ": " fmt + +#include <asm/barrier.h> +#include "coresight-trbe.h" + +#define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT)) + +/* + * A padding packet that will help the user space tools + * in skipping relevant sections in the captured trace + * data which could not be decoded. TRBE doesn't support + * formatting the trace data, unlike the legacy CoreSight + * sinks and thus we use ETE trace packets to pad the + * sections of the buffer. + */ +#define ETE_IGNORE_PACKET 0x70 + +/* + * Minimum amount of meaningful trace will contain: + * A-Sync, Trace Info, Trace On, Address, Atom. + * This is about 44bytes of ETE trace. To be on + * the safer side, we assume 64bytes is the minimum + * space required for a meaningful session, before + * we hit a "WRAP" event. + */ +#define TRBE_TRACE_MIN_BUF_SIZE 64 + +enum trbe_fault_action { + TRBE_FAULT_ACT_WRAP, + TRBE_FAULT_ACT_SPURIOUS, + TRBE_FAULT_ACT_FATAL, +}; + +struct trbe_buf { + /* + * Even though trbe_base represents vmap() + * mapped allocated buffer's start address, + * it's being as unsigned long for various + * arithmetic and comparision operations & + * also to be consistent with trbe_write & + * trbe_limit sibling pointers. + */ + unsigned long trbe_base; + unsigned long trbe_limit; + unsigned long trbe_write; + int nr_pages; + void **pages; + bool snapshot; + struct trbe_cpudata *cpudata; +}; + +struct trbe_cpudata { + bool trbe_flag; + u64 trbe_align; + int cpu; + enum cs_mode mode; + struct trbe_buf *buf; + struct trbe_drvdata *drvdata; +}; + +struct trbe_drvdata { + struct trbe_cpudata __percpu *cpudata; + struct perf_output_handle __percpu **handle; + struct hlist_node hotplug_node; + int irq; + cpumask_t supported_cpus; + enum cpuhp_state trbe_online; + struct platform_device *pdev; +}; + +static int trbe_alloc_node(struct perf_event *event) +{ + if (event->cpu == -1) + return NUMA_NO_NODE; + return cpu_to_node(event->cpu); +} + +static void trbe_drain_buffer(void) +{ + tsb_csync(); + dsb(nsh); +} + +static void trbe_drain_and_disable_local(void) +{ + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1); + + trbe_drain_buffer(); + + /* + * Disable the TRBE without clearing LIMITPTR which + * might be required for fetching the buffer limits. + */ + trblimitr &= ~TRBLIMITR_ENABLE; + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1); + isb(); +} + +static void trbe_reset_local(void) +{ + trbe_drain_and_disable_local(); + write_sysreg_s(0, SYS_TRBLIMITR_EL1); + write_sysreg_s(0, SYS_TRBPTR_EL1); + write_sysreg_s(0, SYS_TRBBASER_EL1); + write_sysreg_s(0, SYS_TRBSR_EL1); +} + +static void trbe_stop_and_truncate_event(struct perf_output_handle *handle) +{ + struct trbe_buf *buf = etm_perf_sink_config(handle); + + /* + * We cannot proceed with the buffer collection and we + * do not have any data for the current session. The + * etm_perf driver expects to close out the aux_buffer + * at event_stop(). So disable the TRBE here and leave + * the update_buffer() to return a 0 size. + */ + trbe_drain_and_disable_local(); + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); + *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL; +} + +/* + * TRBE Buffer Management + * + * The TRBE buffer spans from the base pointer till the limit pointer. When enabled, + * it starts writing trace data from the write pointer onward till the limit pointer. + * When the write pointer reaches the address just before the limit pointer, it gets + * wrapped around again to the base pointer. This is called a TRBE wrap event, which + * generates a maintenance interrupt when operated in WRAP or FILL mode. This driver + * uses FILL mode, where the TRBE stops the trace collection at wrap event. The IRQ + * handler updates the AUX buffer and re-enables the TRBE with updated WRITE and + * LIMIT pointers. + * + * Wrap around with an IRQ + * ------ < ------ < ------- < ----- < ----- + * | | + * ------ > ------ > ------- > ----- > ----- + * + * +---------------+-----------------------+ + * | | | + * +---------------+-----------------------+ + * Base Pointer Write Pointer Limit Pointer + * + * The base and limit pointers always needs to be PAGE_SIZE aligned. But the write + * pointer can be aligned to the implementation defined TRBE trace buffer alignment + * as captured in trbe_cpudata->trbe_align. + * + * + * head tail wakeup + * +---------------------------------------+----- ~ ~ ------ + * |$$$$$$$|################|$$$$$$$$$$$$$$| | + * +---------------------------------------+----- ~ ~ ------ + * Base Pointer Write Pointer Limit Pointer + * + * The perf_output_handle indices (head, tail, wakeup) are monotonically increasing + * values which tracks all the driver writes and user reads from the perf auxiliary + * buffer. Generally [head..tail] is the area where the driver can write into unless + * the wakeup is behind the tail. Enabled TRBE buffer span needs to be adjusted and + * configured depending on the perf_output_handle indices, so that the driver does + * not override into areas in the perf auxiliary buffer which is being or yet to be + * consumed from the user space. The enabled TRBE buffer area is a moving subset of + * the allocated perf auxiliary buffer. + */ +static void trbe_pad_buf(struct perf_output_handle *handle, int len) +{ + struct trbe_buf *buf = etm_perf_sink_config(handle); + u64 head = PERF_IDX2OFF(handle->head, buf); + + memset((void *)buf->trbe_base + head, ETE_IGNORE_PACKET, len); + if (!buf->snapshot) + perf_aux_output_skip(handle, len); +} + +static unsigned long trbe_snapshot_offset(struct perf_output_handle *handle) +{ + struct trbe_buf *buf = etm_perf_sink_config(handle); + + /* + * The ETE trace has alignment synchronization packets allowing + * the decoder to reset in case of an overflow or corruption. + * So we can use the entire buffer for the snapshot mode. + */ + return buf->nr_pages * PAGE_SIZE; +} + +/* + * TRBE Limit Calculation + * + * The following markers are used to illustrate various TRBE buffer situations. + * + * $$$$ - Data area, unconsumed captured trace data, not to be overridden + * #### - Free area, enabled, trace will be written + * %%%% - Free area, disabled, trace will not be written + * ==== - Free area, padded with ETE_IGNORE_PACKET, trace will be skipped + */ +static unsigned long __trbe_normal_offset(struct perf_output_handle *handle) +{ + struct trbe_buf *buf = etm_perf_sink_config(handle); + struct trbe_cpudata *cpudata = buf->cpudata; + const u64 bufsize = buf->nr_pages * PAGE_SIZE; + u64 limit = bufsize; + u64 head, tail, wakeup; + + head = PERF_IDX2OFF(handle->head, buf); + + /* + * head + * ------->| + * | + * head TRBE align tail + * +----|-------|---------------|-------+ + * |$$$$|=======|###############|$$$$$$$| + * +----|-------|---------------|-------+ + * trbe_base trbe_base + nr_pages + * + * Perf aux buffer output head position can be misaligned depending on + * various factors including user space reads. In case misaligned, head + * needs to be aligned before TRBE can be configured. Pad the alignment + * gap with ETE_IGNORE_PACKET bytes that will be ignored by user tools + * and skip this section thus advancing the head. + */ + if (!IS_ALIGNED(head, cpudata->trbe_align)) { + unsigned long delta = roundup(head, cpudata->trbe_align) - head; + + delta = min(delta, handle->size); + trbe_pad_buf(handle, delta); + head = PERF_IDX2OFF(handle->head, buf); + } + + /* + * head = tail (size = 0) + * +----|-------------------------------+ + * |$$$$|$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ | + * +----|-------------------------------+ + * trbe_base trbe_base + nr_pages + * + * Perf aux buffer does not have any space for the driver to write into. + * Just communicate trace truncation event to the user space by marking + * it with PERF_AUX_FLAG_TRUNCATED. + */ + if (!handle->size) { + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); + return 0; + } + + /* Compute the tail and wakeup indices now that we've aligned head */ + tail = PERF_IDX2OFF(handle->head + handle->size, buf); + wakeup = PERF_IDX2OFF(handle->wakeup, buf); + + /* + * Lets calculate the buffer area which TRBE could write into. There + * are three possible scenarios here. Limit needs to be aligned with + * PAGE_SIZE per the TRBE requirement. Always avoid clobbering the + * unconsumed data. + * + * 1) head < tail + * + * head tail + * +----|-----------------------|-------+ + * |$$$$|#######################|$$$$$$$| + * +----|-----------------------|-------+ + * trbe_base limit trbe_base + nr_pages + * + * TRBE could write into [head..tail] area. Unless the tail is right at + * the end of the buffer, neither an wrap around nor an IRQ is expected + * while being enabled. + * + * 2) head == tail + * + * head = tail (size > 0) + * +----|-------------------------------+ + * |%%%%|###############################| + * +----|-------------------------------+ + * trbe_base limit = trbe_base + nr_pages + * + * TRBE should just write into [head..base + nr_pages] area even though + * the entire buffer is empty. Reason being, when the trace reaches the + * end of the buffer, it will just wrap around with an IRQ giving an + * opportunity to reconfigure the buffer. + * + * 3) tail < head + * + * tail head + * +----|-----------------------|-------+ + * |%%%%|$$$$$$$$$$$$$$$$$$$$$$$|#######| + * +----|-----------------------|-------+ + * trbe_base limit = trbe_base + nr_pages + * + * TRBE should just write into [head..base + nr_pages] area even though + * the [trbe_base..tail] is also empty. Reason being, when the trace + * reaches the end of the buffer, it will just wrap around with an IRQ + * giving an opportunity to reconfigure the buffer. + */ + if (head < tail) + limit = round_down(tail, PAGE_SIZE); + + /* + * Wakeup may be arbitrarily far into the future. If it's not in the + * current generation, either we'll wrap before hitting it, or it's + * in the past and has been handled already. + * + * If there's a wakeup before we wrap, arrange to be woken up by the + * page boundary following it. Keep the tail boundary if that's lower. + * + * head wakeup tail + * +----|---------------|-------|-------+ + * |$$$$|###############|%%%%%%%|$$$$$$$| + * +----|---------------|-------|-------+ + * trbe_base limit trbe_base + nr_pages + */ + if (handle->wakeup < (handle->head + handle->size) && head <= wakeup) + limit = min(limit, round_up(wakeup, PAGE_SIZE)); + + /* + * There are two situation when this can happen i.e limit is before + * the head and hence TRBE cannot be configured. + * + * 1) head < tail (aligned down with PAGE_SIZE) and also they are both + * within the same PAGE size range. + * + * PAGE_SIZE + * |----------------------| + * + * limit head tail + * +------------|------|--------|-------+ + * |$$$$$$$$$$$$$$$$$$$|========|$$$$$$$| + * +------------|------|--------|-------+ + * trbe_base trbe_base + nr_pages + * + * 2) head < wakeup (aligned up with PAGE_SIZE) < tail and also both + * head and wakeup are within same PAGE size range. + * + * PAGE_SIZE + * |----------------------| + * + * limit head wakeup tail + * +----|------|-------|--------|-------+ + * |$$$$$$$$$$$|=======|========|$$$$$$$| + * +----|------|-------|--------|-------+ + * trbe_base trbe_base + nr_pages + */ + if (limit > head) + return limit; + + trbe_pad_buf(handle, handle->size); + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); + return 0; +} + +static unsigned long trbe_normal_offset(struct perf_output_handle *handle) +{ + struct trbe_buf *buf = perf_get_aux(handle); + u64 limit = __trbe_normal_offset(handle); + u64 head = PERF_IDX2OFF(handle->head, buf); + + /* + * If the head is too close to the limit and we don't + * have space for a meaningful run, we rather pad it + * and start fresh. + */ + if (limit && (limit - head < TRBE_TRACE_MIN_BUF_SIZE)) { + trbe_pad_buf(handle, limit - head); + limit = __trbe_normal_offset(handle); + } + return limit; +} + +static unsigned long compute_trbe_buffer_limit(struct perf_output_handle *handle) +{ + struct trbe_buf *buf = etm_perf_sink_config(handle); + unsigned long offset; + + if (buf->snapshot) + offset = trbe_snapshot_offset(handle); + else + offset = trbe_normal_offset(handle); + return buf->trbe_base + offset; +} + +static void clr_trbe_status(void) +{ + u64 trbsr = read_sysreg_s(SYS_TRBSR_EL1); + + WARN_ON(is_trbe_enabled()); + trbsr &= ~TRBSR_IRQ; + trbsr &= ~TRBSR_TRG; + trbsr &= ~TRBSR_WRAP; + trbsr &= ~(TRBSR_EC_MASK << TRBSR_EC_SHIFT); + trbsr &= ~(TRBSR_BSC_MASK << TRBSR_BSC_SHIFT); + trbsr &= ~TRBSR_STOP; + write_sysreg_s(trbsr, SYS_TRBSR_EL1); +} + +static void set_trbe_limit_pointer_enabled(unsigned long addr) +{ + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1); + + WARN_ON(!IS_ALIGNED(addr, (1UL << TRBLIMITR_LIMIT_SHIFT))); + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE)); + + trblimitr &= ~TRBLIMITR_NVM; + trblimitr &= ~(TRBLIMITR_FILL_MODE_MASK << TRBLIMITR_FILL_MODE_SHIFT); + trblimitr &= ~(TRBLIMITR_TRIG_MODE_MASK << TRBLIMITR_TRIG_MODE_SHIFT); + trblimitr &= ~(TRBLIMITR_LIMIT_MASK << TRBLIMITR_LIMIT_SHIFT); + + /* + * Fill trace buffer mode is used here while configuring the + * TRBE for trace capture. In this particular mode, the trace + * collection is stopped and a maintenance interrupt is raised + * when the current write pointer wraps. This pause in trace + * collection gives the software an opportunity to capture the + * trace data in the interrupt handler, before reconfiguring + * the TRBE. + */ + trblimitr |= (TRBE_FILL_MODE_FILL & TRBLIMITR_FILL_MODE_MASK) << TRBLIMITR_FILL_MODE_SHIFT; + + /* + * Trigger mode is not used here while configuring the TRBE for + * the trace capture. Hence just keep this in the ignore mode. + */ + trblimitr |= (TRBE_TRIG_MODE_IGNORE & TRBLIMITR_TRIG_MODE_MASK) << + TRBLIMITR_TRIG_MODE_SHIFT; + trblimitr |= (addr & PAGE_MASK); + + trblimitr |= TRBLIMITR_ENABLE; + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1); + + /* Synchronize the TRBE enable event */ + isb(); +} + +static void trbe_enable_hw(struct trbe_buf *buf) +{ + WARN_ON(buf->trbe_write < buf->trbe_base); + WARN_ON(buf->trbe_write >= buf->trbe_limit); + set_trbe_disabled(); + isb(); + clr_trbe_status(); + set_trbe_base_pointer(buf->trbe_base); + set_trbe_write_pointer(buf->trbe_write); + + /* + * Synchronize all the register updates + * till now before enabling the TRBE. + */ + isb(); + set_trbe_limit_pointer_enabled(buf->trbe_limit); +} + +static enum trbe_fault_action trbe_get_fault_act(u64 trbsr) +{ + int ec = get_trbe_ec(trbsr); + int bsc = get_trbe_bsc(trbsr); + + WARN_ON(is_trbe_running(trbsr)); + if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr)) + return TRBE_FAULT_ACT_FATAL; + + if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT)) + return TRBE_FAULT_ACT_FATAL; + + if (is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) { + if (get_trbe_write_pointer() == get_trbe_base_pointer()) + return TRBE_FAULT_ACT_WRAP; + } + return TRBE_FAULT_ACT_SPURIOUS; +} + +static void *arm_trbe_alloc_buffer(struct coresight_device *csdev, + struct perf_event *event, void **pages, + int nr_pages, bool snapshot) +{ + struct trbe_buf *buf; + struct page **pglist; + int i; + + /* + * TRBE LIMIT and TRBE WRITE pointers must be page aligned. But with + * just a single page, there would not be any room left while writing + * into a partially filled TRBE buffer after the page size alignment. + * Hence restrict the minimum buffer size as two pages. + */ + if (nr_pages < 2) + return NULL; + + buf = kzalloc_node(sizeof(*buf), GFP_KERNEL, trbe_alloc_node(event)); + if (!buf) + return ERR_PTR(-ENOMEM); + + pglist = kcalloc(nr_pages, sizeof(*pglist), GFP_KERNEL); + if (!pglist) { + kfree(buf); + return ERR_PTR(-ENOMEM); + } + + for (i = 0; i < nr_pages; i++) + pglist[i] = virt_to_page(pages[i]); + + buf->trbe_base = (unsigned long)vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL); + if (!buf->trbe_base) { + kfree(pglist); + kfree(buf); + return ERR_PTR(-ENOMEM); + } + buf->trbe_limit = buf->trbe_base + nr_pages * PAGE_SIZE; + buf->trbe_write = buf->trbe_base; + buf->snapshot = snapshot; + buf->nr_pages = nr_pages; + buf->pages = pages; + kfree(pglist); + return buf; +} + +static void arm_trbe_free_buffer(void *config) +{ + struct trbe_buf *buf = config; + + vunmap((void *)buf->trbe_base); + kfree(buf); +} + +static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev, + struct perf_output_handle *handle, + void *config) +{ + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev); + struct trbe_buf *buf = config; + enum trbe_fault_action act; + unsigned long size, offset; + unsigned long write, base, status; + unsigned long flags; + + WARN_ON(buf->cpudata != cpudata); + WARN_ON(cpudata->cpu != smp_processor_id()); + WARN_ON(cpudata->drvdata != drvdata); + if (cpudata->mode != CS_MODE_PERF) + return 0; + + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW); + + /* + * We are about to disable the TRBE. And this could in turn + * fill up the buffer triggering, an IRQ. This could be consumed + * by the PE asynchronously, causing a race here against + * the IRQ handler in closing out the handle. So, let us + * make sure the IRQ can't trigger while we are collecting + * the buffer. We also make sure that a WRAP event is handled + * accordingly. + */ + local_irq_save(flags); + + /* + * If the TRBE was disabled due to lack of space in the AUX buffer or a + * spurious fault, the driver leaves it disabled, truncating the buffer. + * Since the etm_perf driver expects to close out the AUX buffer, the + * driver skips it. Thus, just pass in 0 size here to indicate that the + * buffer was truncated. + */ + if (!is_trbe_enabled()) { + size = 0; + goto done; + } + /* + * perf handle structure needs to be shared with the TRBE IRQ handler for + * capturing trace data and restarting the handle. There is a probability + * of an undefined reference based crash when etm event is being stopped + * while a TRBE IRQ also getting processed. This happens due the release + * of perf handle via perf_aux_output_end() in etm_event_stop(). Stopping + * the TRBE here will ensure that no IRQ could be generated when the perf + * handle gets freed in etm_event_stop(). + */ + trbe_drain_and_disable_local(); + write = get_trbe_write_pointer(); + base = get_trbe_base_pointer(); + + /* Check if there is a pending interrupt and handle it here */ + status = read_sysreg_s(SYS_TRBSR_EL1); + if (is_trbe_irq(status)) { + + /* + * Now that we are handling the IRQ here, clear the IRQ + * from the status, to let the irq handler know that it + * is taken care of. + */ + clr_trbe_irq(); + isb(); + + act = trbe_get_fault_act(status); + /* + * If this was not due to a WRAP event, we have some + * errors and as such buffer is empty. + */ + if (act != TRBE_FAULT_ACT_WRAP) { + size = 0; + goto done; + } + + /* + * Otherwise, the buffer is full and the write pointer + * has reached base. Adjust this back to the Limit pointer + * for correct size. Also, mark the buffer truncated. + */ + write = get_trbe_limit_pointer(); + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); + } + + offset = write - base; + if (WARN_ON_ONCE(offset < PERF_IDX2OFF(handle->head, buf))) + size = 0; + else + size = offset - PERF_IDX2OFF(handle->head, buf); + +done: + local_irq_restore(flags); + + if (buf->snapshot) + handle->head += size; + return size; +} + +static int arm_trbe_enable(struct coresight_device *csdev, u32 mode, void *data) +{ + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev); + struct perf_output_handle *handle = data; + struct trbe_buf *buf = etm_perf_sink_config(handle); + + WARN_ON(cpudata->cpu != smp_processor_id()); + WARN_ON(cpudata->drvdata != drvdata); + if (mode != CS_MODE_PERF) + return -EINVAL; + + *this_cpu_ptr(drvdata->handle) = handle; + cpudata->buf = buf; + cpudata->mode = mode; + buf->cpudata = cpudata; + buf->trbe_limit = compute_trbe_buffer_limit(handle); + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf); + if (buf->trbe_limit == buf->trbe_base) { + trbe_stop_and_truncate_event(handle); + return 0; + } + trbe_enable_hw(buf); + return 0; +} + +static int arm_trbe_disable(struct coresight_device *csdev) +{ + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev); + struct trbe_buf *buf = cpudata->buf; + + WARN_ON(buf->cpudata != cpudata); + WARN_ON(cpudata->cpu != smp_processor_id()); + WARN_ON(cpudata->drvdata != drvdata); + if (cpudata->mode != CS_MODE_PERF) + return -EINVAL; + + trbe_drain_and_disable_local(); + buf->cpudata = NULL; + cpudata->buf = NULL; + cpudata->mode = CS_MODE_DISABLED; + return 0; +} + +static void trbe_handle_spurious(struct perf_output_handle *handle) +{ + struct trbe_buf *buf = etm_perf_sink_config(handle); + + buf->trbe_limit = compute_trbe_buffer_limit(handle); + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf); + if (buf->trbe_limit == buf->trbe_base) { + trbe_drain_and_disable_local(); + return; + } + trbe_enable_hw(buf); +} + +static void trbe_handle_overflow(struct perf_output_handle *handle) +{ + struct perf_event *event = handle->event; + struct trbe_buf *buf = etm_perf_sink_config(handle); + unsigned long offset, size; + struct etm_event_data *event_data; + + offset = get_trbe_limit_pointer() - get_trbe_base_pointer(); + size = offset - PERF_IDX2OFF(handle->head, buf); + if (buf->snapshot) + handle->head += size; + + /* + * Mark the buffer as truncated, as we have stopped the trace + * collection upon the WRAP event, without stopping the source. + */ + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW | + PERF_AUX_FLAG_TRUNCATED); + perf_aux_output_end(handle, size); + event_data = perf_aux_output_begin(handle, event); + if (!event_data) { + /* + * We are unable to restart the trace collection, + * thus leave the TRBE disabled. The etm-perf driver + * is able to detect this with a disconnected handle + * (handle->event = NULL). + */ + trbe_drain_and_disable_local(); + *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL; + return; + } + buf->trbe_limit = compute_trbe_buffer_limit(handle); + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf); + if (buf->trbe_limit == buf->trbe_base) { + trbe_stop_and_truncate_event(handle); + return; + } + *this_cpu_ptr(buf->cpudata->drvdata->handle) = handle; + trbe_enable_hw(buf); +} + +static bool is_perf_trbe(struct perf_output_handle *handle) +{ + struct trbe_buf *buf = etm_perf_sink_config(handle); + struct trbe_cpudata *cpudata = buf->cpudata; + struct trbe_drvdata *drvdata = cpudata->drvdata; + int cpu = smp_processor_id(); + + WARN_ON(buf->trbe_base != get_trbe_base_pointer()); + WARN_ON(buf->trbe_limit != get_trbe_limit_pointer()); + + if (cpudata->mode != CS_MODE_PERF) + return false; + + if (cpudata->cpu != cpu) + return false; + + if (!cpumask_test_cpu(cpu, &drvdata->supported_cpus)) + return false; + + return true; +} + +static irqreturn_t arm_trbe_irq_handler(int irq, void *dev) +{ + struct perf_output_handle **handle_ptr = dev; + struct perf_output_handle *handle = *handle_ptr; + enum trbe_fault_action act; + u64 status; + + /* + * Ensure the trace is visible to the CPUs and + * any external aborts have been resolved. + */ + trbe_drain_and_disable_local(); + + status = read_sysreg_s(SYS_TRBSR_EL1); + /* + * If the pending IRQ was handled by update_buffer callback + * we have nothing to do here. + */ + if (!is_trbe_irq(status)) + return IRQ_NONE; + + clr_trbe_irq(); + isb(); + + if (WARN_ON_ONCE(!handle) || !perf_get_aux(handle)) + return IRQ_NONE; + + if (!is_perf_trbe(handle)) + return IRQ_NONE; + + /* + * Ensure perf callbacks have completed, which may disable + * the trace buffer in response to a TRUNCATION flag. + */ + irq_work_run(); + + act = trbe_get_fault_act(status); + switch (act) { + case TRBE_FAULT_ACT_WRAP: + trbe_handle_overflow(handle); + break; + case TRBE_FAULT_ACT_SPURIOUS: + trbe_handle_spurious(handle); + break; + case TRBE_FAULT_ACT_FATAL: + trbe_stop_and_truncate_event(handle); + break; + } + return IRQ_HANDLED; +} + +static const struct coresight_ops_sink arm_trbe_sink_ops = { + .enable = arm_trbe_enable, + .disable = arm_trbe_disable, + .alloc_buffer = arm_trbe_alloc_buffer, + .free_buffer = arm_trbe_free_buffer, + .update_buffer = arm_trbe_update_buffer, +}; + +static const struct coresight_ops arm_trbe_cs_ops = { + .sink_ops = &arm_trbe_sink_ops, +}; + +static ssize_t align_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct trbe_cpudata *cpudata = dev_get_drvdata(dev); + + return sprintf(buf, "%llx\n", cpudata->trbe_align); +} +static DEVICE_ATTR_RO(align); + +static ssize_t flag_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct trbe_cpudata *cpudata = dev_get_drvdata(dev); + + return sprintf(buf, "%d\n", cpudata->trbe_flag); +} +static DEVICE_ATTR_RO(flag); + +static struct attribute *arm_trbe_attrs[] = { + &dev_attr_align.attr, + &dev_attr_flag.attr, + NULL, +}; + +static const struct attribute_group arm_trbe_group = { + .attrs = arm_trbe_attrs, +}; + +static const struct attribute_group *arm_trbe_groups[] = { + &arm_trbe_group, + NULL, +}; + +static void arm_trbe_enable_cpu(void *info) +{ + struct trbe_drvdata *drvdata = info; + + trbe_reset_local(); + enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE); +} + +static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu) +{ + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu); + struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu); + struct coresight_desc desc = { 0 }; + struct device *dev; + + if (WARN_ON(trbe_csdev)) + return; + + dev = &cpudata->drvdata->pdev->dev; + desc.name = devm_kasprintf(dev, GFP_KERNEL, "trbe%d", cpu); + if (IS_ERR(desc.name)) + goto cpu_clear; + + desc.type = CORESIGHT_DEV_TYPE_SINK; + desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM; + desc.ops = &arm_trbe_cs_ops; + desc.pdata = dev_get_platdata(dev); + desc.groups = arm_trbe_groups; + desc.dev = dev; + trbe_csdev = coresight_register(&desc); + if (IS_ERR(trbe_csdev)) + goto cpu_clear; + + dev_set_drvdata(&trbe_csdev->dev, cpudata); + coresight_set_percpu_sink(cpu, trbe_csdev); + return; +cpu_clear: + cpumask_clear_cpu(cpu, &drvdata->supported_cpus); +} + +static void arm_trbe_probe_cpu(void *info) +{ + struct trbe_drvdata *drvdata = info; + int cpu = smp_processor_id(); + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu); + u64 trbidr; + + if (WARN_ON(!cpudata)) + goto cpu_clear; + + if (!is_trbe_available()) { + pr_err("TRBE is not implemented on cpu %d\n", cpu); + goto cpu_clear; + } + + trbidr = read_sysreg_s(SYS_TRBIDR_EL1); + if (!is_trbe_programmable(trbidr)) { + pr_err("TRBE is owned in higher exception level on cpu %d\n", cpu); + goto cpu_clear; + } + + cpudata->trbe_align = 1ULL << get_trbe_address_align(trbidr); + if (cpudata->trbe_align > SZ_2K) { + pr_err("Unsupported alignment on cpu %d\n", cpu); + goto cpu_clear; + } + cpudata->trbe_flag = get_trbe_flag_update(trbidr); + cpudata->cpu = cpu; + cpudata->drvdata = drvdata; + return; +cpu_clear: + cpumask_clear_cpu(cpu, &drvdata->supported_cpus); +} + +static void arm_trbe_remove_coresight_cpu(void *info) +{ + int cpu = smp_processor_id(); + struct trbe_drvdata *drvdata = info; + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu); + struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu); + + disable_percpu_irq(drvdata->irq); + trbe_reset_local(); + if (trbe_csdev) { + coresight_unregister(trbe_csdev); + cpudata->drvdata = NULL; + coresight_set_percpu_sink(cpu, NULL); + } +} + +static int arm_trbe_probe_coresight(struct trbe_drvdata *drvdata) +{ + int cpu; + + drvdata->cpudata = alloc_percpu(typeof(*drvdata->cpudata)); + if (!drvdata->cpudata) + return -ENOMEM; + + for_each_cpu(cpu, &drvdata->supported_cpus) { + smp_call_function_single(cpu, arm_trbe_probe_cpu, drvdata, 1); + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) + arm_trbe_register_coresight_cpu(drvdata, cpu); + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) + smp_call_function_single(cpu, arm_trbe_enable_cpu, drvdata, 1); + } + return 0; +} + +static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata) +{ + int cpu; + + for_each_cpu(cpu, &drvdata->supported_cpus) + smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1); + free_percpu(drvdata->cpudata); + return 0; +} + +static int arm_trbe_cpu_startup(unsigned int cpu, struct hlist_node *node) +{ + struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node); + + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) { + + /* + * If this CPU was not probed for TRBE, + * initialize it now. + */ + if (!coresight_get_percpu_sink(cpu)) { + arm_trbe_probe_cpu(drvdata); + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) + arm_trbe_register_coresight_cpu(drvdata, cpu); + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) + arm_trbe_enable_cpu(drvdata); + } else { + arm_trbe_enable_cpu(drvdata); + } + } + return 0; +} + +static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node) +{ + struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node); + + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) { + disable_percpu_irq(drvdata->irq); + trbe_reset_local(); + } + return 0; +} + +static int arm_trbe_probe_cpuhp(struct trbe_drvdata *drvdata) +{ + enum cpuhp_state trbe_online; + int ret; + + trbe_online = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME, + arm_trbe_cpu_startup, arm_trbe_cpu_teardown); + if (trbe_online < 0) + return trbe_online; + + ret = cpuhp_state_add_instance(trbe_online, &drvdata->hotplug_node); + if (ret) { + cpuhp_remove_multi_state(trbe_online); + return ret; + } + drvdata->trbe_online = trbe_online; + return 0; +} + +static void arm_trbe_remove_cpuhp(struct trbe_drvdata *drvdata) +{ + cpuhp_remove_multi_state(drvdata->trbe_online); +} + +static int arm_trbe_probe_irq(struct platform_device *pdev, + struct trbe_drvdata *drvdata) +{ + int ret; + + drvdata->irq = platform_get_irq(pdev, 0); + if (drvdata->irq < 0) { + pr_err("IRQ not found for the platform device\n"); + return drvdata->irq; + } + + if (!irq_is_percpu(drvdata->irq)) { + pr_err("IRQ is not a PPI\n"); + return -EINVAL; + } + + if (irq_get_percpu_devid_partition(drvdata->irq, &drvdata->supported_cpus)) + return -EINVAL; + + drvdata->handle = alloc_percpu(typeof(*drvdata->handle)); + if (!drvdata->handle) + return -ENOMEM; + + ret = request_percpu_irq(drvdata->irq, arm_trbe_irq_handler, DRVNAME, drvdata->handle); + if (ret) { + free_percpu(drvdata->handle); + return ret; + } + return 0; +} + +static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata) +{ + free_percpu_irq(drvdata->irq, drvdata->handle); + free_percpu(drvdata->handle); +} + +static int arm_trbe_device_probe(struct platform_device *pdev) +{ + struct coresight_platform_data *pdata; + struct trbe_drvdata *drvdata; + struct device *dev = &pdev->dev; + int ret; + + drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL); + if (!drvdata) + return -ENOMEM; + + pdata = coresight_get_platform_data(dev); + if (IS_ERR(pdata)) + return PTR_ERR(pdata); + + dev_set_drvdata(dev, drvdata); + dev->platform_data = pdata; + drvdata->pdev = pdev; + ret = arm_trbe_probe_irq(pdev, drvdata); + if (ret) + return ret; + + ret = arm_trbe_probe_coresight(drvdata); + if (ret) + goto probe_failed; + + ret = arm_trbe_probe_cpuhp(drvdata); + if (ret) + goto cpuhp_failed; + + return 0; +cpuhp_failed: + arm_trbe_remove_coresight(drvdata); +probe_failed: + arm_trbe_remove_irq(drvdata); + return ret; +} + +static int arm_trbe_device_remove(struct platform_device *pdev) +{ + struct trbe_drvdata *drvdata = platform_get_drvdata(pdev); + + arm_trbe_remove_cpuhp(drvdata); + arm_trbe_remove_coresight(drvdata); + arm_trbe_remove_irq(drvdata); + return 0; +} + +static const struct of_device_id arm_trbe_of_match[] = { + { .compatible = "arm,trace-buffer-extension"}, + {}, +}; +MODULE_DEVICE_TABLE(of, arm_trbe_of_match); + +static struct platform_driver arm_trbe_driver = { + .driver = { + .name = DRVNAME, + .of_match_table = of_match_ptr(arm_trbe_of_match), + .suppress_bind_attrs = true, + }, + .probe = arm_trbe_device_probe, + .remove = arm_trbe_device_remove, +}; + +static int __init arm_trbe_init(void) +{ + int ret; + + if (arm64_kernel_unmapped_at_el0()) { + pr_err("TRBE wouldn't work if kernel gets unmapped at EL0\n"); + return -EOPNOTSUPP; + } + + ret = platform_driver_register(&arm_trbe_driver); + if (!ret) + return 0; + + pr_err("Error registering %s platform driver\n", DRVNAME); + return ret; +} + +static void __exit arm_trbe_exit(void) +{ + platform_driver_unregister(&arm_trbe_driver); +} +module_init(arm_trbe_init); +module_exit(arm_trbe_exit); + +MODULE_AUTHOR("Anshuman Khandual anshuman.khandual@arm.com"); +MODULE_DESCRIPTION("Arm Trace Buffer Extension (TRBE) driver"); +MODULE_LICENSE("GPL v2"); diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h new file mode 100644 index 000000000000..abf3e36082f0 --- /dev/null +++ b/drivers/hwtracing/coresight/coresight-trbe.h @@ -0,0 +1,152 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * This contains all required hardware related helper functions for + * Trace Buffer Extension (TRBE) driver in the coresight framework. + * + * Copyright (C) 2020 ARM Ltd. + * + * Author: Anshuman Khandual anshuman.khandual@arm.com + */ +#include <linux/coresight.h> +#include <linux/device.h> +#include <linux/irq.h> +#include <linux/kernel.h> +#include <linux/of.h> +#include <linux/platform_device.h> +#include <linux/smp.h> + +#include "coresight-etm-perf.h" + +static inline bool is_trbe_available(void) +{ + u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1); + unsigned int trbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_TRBE_SHIFT); + + return trbe >= 0b0001; +} + +static inline bool is_trbe_enabled(void) +{ + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1); + + return trblimitr & TRBLIMITR_ENABLE; +} + +#define TRBE_EC_OTHERS 0 +#define TRBE_EC_STAGE1_ABORT 36 +#define TRBE_EC_STAGE2_ABORT 37 + +static inline int get_trbe_ec(u64 trbsr) +{ + return (trbsr >> TRBSR_EC_SHIFT) & TRBSR_EC_MASK; +} + +#define TRBE_BSC_NOT_STOPPED 0 +#define TRBE_BSC_FILLED 1 +#define TRBE_BSC_TRIGGERED 2 + +static inline int get_trbe_bsc(u64 trbsr) +{ + return (trbsr >> TRBSR_BSC_SHIFT) & TRBSR_BSC_MASK; +} + +static inline void clr_trbe_irq(void) +{ + u64 trbsr = read_sysreg_s(SYS_TRBSR_EL1); + + trbsr &= ~TRBSR_IRQ; + write_sysreg_s(trbsr, SYS_TRBSR_EL1); +} + +static inline bool is_trbe_irq(u64 trbsr) +{ + return trbsr & TRBSR_IRQ; +} + +static inline bool is_trbe_trg(u64 trbsr) +{ + return trbsr & TRBSR_TRG; +} + +static inline bool is_trbe_wrap(u64 trbsr) +{ + return trbsr & TRBSR_WRAP; +} + +static inline bool is_trbe_abort(u64 trbsr) +{ + return trbsr & TRBSR_ABORT; +} + +static inline bool is_trbe_running(u64 trbsr) +{ + return !(trbsr & TRBSR_STOP); +} + +#define TRBE_TRIG_MODE_STOP 0 +#define TRBE_TRIG_MODE_IRQ 1 +#define TRBE_TRIG_MODE_IGNORE 3 + +#define TRBE_FILL_MODE_FILL 0 +#define TRBE_FILL_MODE_WRAP 1 +#define TRBE_FILL_MODE_CIRCULAR_BUFFER 3 + +static inline void set_trbe_disabled(void) +{ + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1); + + trblimitr &= ~TRBLIMITR_ENABLE; + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1); +} + +static inline bool get_trbe_flag_update(u64 trbidr) +{ + return trbidr & TRBIDR_FLAG; +} + +static inline bool is_trbe_programmable(u64 trbidr) +{ + return !(trbidr & TRBIDR_PROG); +} + +static inline int get_trbe_address_align(u64 trbidr) +{ + return (trbidr >> TRBIDR_ALIGN_SHIFT) & TRBIDR_ALIGN_MASK; +} + +static inline unsigned long get_trbe_write_pointer(void) +{ + return read_sysreg_s(SYS_TRBPTR_EL1); +} + +static inline void set_trbe_write_pointer(unsigned long addr) +{ + WARN_ON(is_trbe_enabled()); + write_sysreg_s(addr, SYS_TRBPTR_EL1); +} + +static inline unsigned long get_trbe_limit_pointer(void) +{ + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1); + unsigned long addr = trblimitr & (TRBLIMITR_LIMIT_MASK << TRBLIMITR_LIMIT_SHIFT); + + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE)); + return addr; +} + +static inline unsigned long get_trbe_base_pointer(void) +{ + u64 trbbaser = read_sysreg_s(SYS_TRBBASER_EL1); + unsigned long addr = trbbaser & (TRBBASER_BASE_MASK << TRBBASER_BASE_SHIFT); + + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE)); + return addr; +} + +static inline void set_trbe_base_pointer(unsigned long addr) +{ + WARN_ON(is_trbe_enabled()); + WARN_ON(!IS_ALIGNED(addr, (1UL << TRBBASER_BASE_SHIFT))); + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE)); + write_sysreg_s(addr, SYS_TRBBASER_EL1); +}
From: Anshuman Khandual anshuman.khandual@arm.com
Add sysfs ABI documentation for the TRBE devices.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Jonathan Corbet corbet@lwn.net Cc: linux-doc@vger.kernel.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com [ Split from the TRBE driver patch ] Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- .../ABI/testing/sysfs-bus-coresight-devices-trbe | 14 ++++++++++++++ 1 file changed, 14 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe new file mode 100644 index 000000000000..ad3bbc6fa751 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe @@ -0,0 +1,14 @@ +What: /sys/bus/coresight/devices/trbe<cpu>/align +Date: March 2021 +KernelVersion: 5.13 +Contact: Anshuman Khandual anshuman.khandual@arm.com +Description: (Read) Shows the TRBE write pointer alignment. This value + is fetched from the TRBIDR register. + +What: /sys/bus/coresight/devices/trbe<cpu>/flag +Date: March 2021 +KernelVersion: 5.13 +Contact: Anshuman Khandual anshuman.khandual@arm.com +Description: (Read) Shows if TRBE updates in the memory are with access + and dirty flag updates as well. This value is fetched from + the TRBIDR register.
From: Anshuman Khandual anshuman.khandual@arm.com
Add documentation for the TRBE under trace/coresight.
Cc: Jonathan Corbet corbet@lwn.net Cc: linux-doc@vger.kernel.org Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com [ Split from the TRBE driver patch ] Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- .../trace/coresight/coresight-trbe.rst | 38 +++++++++++++++++++ 1 file changed, 38 insertions(+) create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
diff --git a/Documentation/trace/coresight/coresight-trbe.rst b/Documentation/trace/coresight/coresight-trbe.rst new file mode 100644 index 000000000000..b9928ef148da --- /dev/null +++ b/Documentation/trace/coresight/coresight-trbe.rst @@ -0,0 +1,38 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================== +Trace Buffer Extension (TRBE). +============================== + + :Author: Anshuman Khandual anshuman.khandual@arm.com + :Date: November 2020 + +Hardware Description +-------------------- + +Trace Buffer Extension (TRBE) is a percpu hardware which captures in system +memory, CPU traces generated from a corresponding percpu tracing unit. This +gets plugged in as a coresight sink device because the corresponding trace +generators (ETE), are plugged in as source device. + +The TRBE is not compliant to CoreSight architecture specifications, but is +driven via the CoreSight driver framework to support the ETE (which is +CoreSight compliant) integration. + +Sysfs files and directories +--------------------------- + +The TRBE devices appear on the existing coresight bus alongside the other +coresight devices:: + + >$ ls /sys/bus/coresight/devices + trbe0 trbe1 trbe2 trbe3 + +The ``trbe<N>`` named TRBEs are associated with a CPU.:: + + >$ ls /sys/bus/coresight/devices/trbe0/ + align flag + +*Key file items are:-* + * ``align``: TRBE write pointer alignment + * ``flag``: TRBE updates memory with access and dirty flags
Document the device tree bindings for Trace Buffer Extension (TRBE).
Cc: Anshuman Khandual anshuman.khandual@arm.com Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Rob Herring robh@kernel.org Cc: devicetree@vger.kernel.org Reviewed-by: Rob Herring robh@kernel.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- .../devicetree/bindings/arm/trbe.yaml | 49 +++++++++++++++++++ MAINTAINERS | 1 + 2 files changed, 50 insertions(+) create mode 100644 Documentation/devicetree/bindings/arm/trbe.yaml
diff --git a/Documentation/devicetree/bindings/arm/trbe.yaml b/Documentation/devicetree/bindings/arm/trbe.yaml new file mode 100644 index 000000000000..4402d7bfd1fc --- /dev/null +++ b/Documentation/devicetree/bindings/arm/trbe.yaml @@ -0,0 +1,49 @@ +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause +# Copyright 2021, Arm Ltd +%YAML 1.2 +--- +$id: "http://devicetree.org/schemas/arm/trbe.yaml#" +$schema: "http://devicetree.org/meta-schemas/core.yaml#" + +title: ARM Trace Buffer Extensions + +maintainers: + - Anshuman Khandual anshuman.khandual@arm.com + +description: | + Arm Trace Buffer Extension (TRBE) is a per CPU component + for storing trace generated on the CPU to memory. It is + accessed via CPU system registers. The software can verify + if it is permitted to use the component by checking the + TRBIDR register. + +properties: + $nodename: + const: "trbe" + compatible: + items: + - const: arm,trace-buffer-extension + + interrupts: + description: | + Exactly 1 PPI must be listed. For heterogeneous systems where + TRBE is only supported on a subset of the CPUs, please consult + the arm,gic-v3 binding for details on describing a PPI partition. + maxItems: 1 + +required: + - compatible + - interrupts + +additionalProperties: false + +examples: + + - | + #include <dt-bindings/interrupt-controller/arm-gic.h> + + trbe { + compatible = "arm,trace-buffer-extension"; + interrupts = <GIC_PPI 15 IRQ_TYPE_LEVEL_HIGH>; + }; +... diff --git a/MAINTAINERS b/MAINTAINERS index 3454ed1011c8..fbe863456ed1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1762,6 +1762,7 @@ F: Documentation/devicetree/bindings/arm/coresight-cpu-debug.txt F: Documentation/devicetree/bindings/arm/coresight-cti.yaml F: Documentation/devicetree/bindings/arm/coresight.txt F: Documentation/devicetree/bindings/arm/ete.yaml +F: Documentation/devicetree/bindings/arm/trbe.yaml F: Documentation/trace/coresight/* F: drivers/hwtracing/coresight/* F: include/dt-bindings/arm/coresight-cti-dt.h
On Tue, 23 Mar 2021 12:06:28 +0000, Suzuki K Poulose wrote:
This series enables future IP trace features Embedded Trace Extension (ETE) and Trace Buffer Extension (TRBE). This series applies on v5.12-rc4 + some patches queued. A standalone tree is also available here [0]. The queued patches (almost there) are included in this posting for the sake of constructing a tree from the posting.
ETE is the PE (CPU) trace unit for CPUs, implementing future architecture extensions. ETE overlaps with the ETMv4 architecture, with additions to support the newer architecture features and some restrictions on the supported features w.r.t ETMv4. The ETE support is added by extending the ETMv4 driver to recognise the ETE and handle the features as exposed by the TRCIDRx registers. ETE only supports system instructions access from the host CPU. The ETE could be integrated with a TRBE (see below), or with the legacy CoreSight trace bus (e.g, ETRs). Thus the ETE follows same firmware description as the ETMs and requires a node per instance.
[...]
Applied to fixes, thanks!
[02/19] kvm: arm64: Disable guest access to trace filter controls commit: eed31b2332ed02ffb368eba7654350746185e6e7
Cheers,
M.
On Tue, 23 Mar 2021 12:06:28 +0000, Suzuki K Poulose wrote:
This series enables future IP trace features Embedded Trace Extension (ETE) and Trace Buffer Extension (TRBE). This series applies on v5.12-rc4 + some patches queued. A standalone tree is also available here [0]. The queued patches (almost there) are included in this posting for the sake of constructing a tree from the posting.
ETE is the PE (CPU) trace unit for CPUs, implementing future architecture extensions. ETE overlaps with the ETMv4 architecture, with additions to support the newer architecture features and some restrictions on the supported features w.r.t ETMv4. The ETE support is added by extending the ETMv4 driver to recognise the ETE and handle the features as exposed by the TRCIDRx registers. ETE only supports system instructions access from the host CPU. The ETE could be integrated with a TRBE (see below), or with the legacy CoreSight trace bus (e.g, ETRs). Thus the ETE follows same firmware description as the ETMs and requires a node per instance.
[...]
Applied to fixes, thanks!
[01/19] kvm: arm64: Hide system instruction access to Trace registers commit: 4af0afe252a2701732c317585f7c3ef6596b8f3d
Cheers,
M.
On 23/03/2021 16:34, Marc Zyngier wrote:
On Tue, 23 Mar 2021 12:06:28 +0000, Suzuki K Poulose wrote:
This series enables future IP trace features Embedded Trace Extension (ETE) and Trace Buffer Extension (TRBE). This series applies on v5.12-rc4 + some patches queued. A standalone tree is also available here [0]. The queued patches (almost there) are included in this posting for the sake of constructing a tree from the posting.
ETE is the PE (CPU) trace unit for CPUs, implementing future architecture extensions. ETE overlaps with the ETMv4 architecture, with additions to support the newer architecture features and some restrictions on the supported features w.r.t ETMv4. The ETE support is added by extending the ETMv4 driver to recognise the ETE and handle the features as exposed by the TRCIDRx registers. ETE only supports system instructions access from the host CPU. The ETE could be integrated with a TRBE (see below), or with the legacy CoreSight trace bus (e.g, ETRs). Thus the ETE follows same firmware description as the ETMs and requires a node per instance.
[...]
Applied to fixes, thanks!
[01/19] kvm: arm64: Hide system instruction access to Trace registers commit: 4af0afe252a2701732c317585f7c3ef6596b8f3d
Thanks Marc !