The arm64 Guarded Control Stack (GCS) feature provides support for hardware-protected stacks of return addresses, intended to provide hardening against return-oriented programming (ROP) attacks and to make it easier to gather call stacks for applications such as profiling.
When GCS is active a secondary stack called the Guarded Control Stack is maintained, protected with a memory attribute which means that it can only be written with specific GCS operations. The current GCS pointer cannot be written directly by userspace. When a BL is executed the value stored in LR is also pushed onto the GCS, and when a RET is executed the top of the GCS is popped and compared to LR, with a fault being raised if the values do not match. GCS operations may only be performed on GCS pages; attempting them on any other page generates a data abort.
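To illustrate (a sketch of the architectural behaviour only, not code from this series; the labels are made up):

	foo:
		bl	bar		// writes the return address to LR
					// and also pushes it onto the GCS
		...

	bar:
		...
		ret			// pops the top of the GCS and
					// compares it against LR, faulting
					// on a mismatch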
The combination of hardware enforcement and the lack of extra instructions in the function entry and exit paths should result in lower overhead and something that is more difficult to attack than a purely software implementation like clang's shadow stacks.
This series implements support for managing GCS for KVM guests.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
Changes in v16:
- Rebase onto v6.17-rc3.
- Also expose the feature to nested guests.
- Implement emulation of EXLOCK when returning from nested guests.
- Rename enter_exception_gcs() to compute_exlock().
- Move all ID_AA64PFR1_EL1 handling to the final kernel patch.
- Drop unneeded forwarding of GCS exceptions.
- Commit and cover message updates.
- Link to v15: https://lore.kernel.org/r/20250820-arm64-gcs-v15-0-5e334da18b84@kernel.org

Changes in v15:
- Rebase onto v6.17-rc1.
- Link to v14: https://lore.kernel.org/r/20241005-arm64-gcs-v14-0-59060cd6092b@kernel.org

Changes in v14:
- Rebase onto arm64/for-next/gcs which includes all the non-KVM support.
- Manage the fine grained traps for GCS instructions.
- Manage PSTATE.EXLOCK when delivering exceptions to KVM guests.
- Link to v13: https://lore.kernel.org/r/20241001-arm64-gcs-v13-0-222b78d87eee@kernel.org

Changes in v13:
- Rebase onto v6.12-rc1.
- Allocate VM_HIGH_ARCH_6 since protection keys used all the existing bits.
- Implement mm_release() and free transparently allocated GCSs there.
- Use bit 32 of AT_HWCAP for GCS due to AT_HWCAP2 being filled.
- Since we now only set GCSCRE0_EL1 on change ensure that it is initialised with GCSPR_EL0 accessible to EL0.
- Fix OOM handling on thread copy.
- Link to v12: https://lore.kernel.org/r/20240829-arm64-gcs-v12-0-42fec947436a@kernel.org

Changes in v12:
- Clarify and simplify the signal handling code so we work with the register state.
- When checking for write aborts to shadow stack pages ensure the fault is a data abort.
- Depend on !UPROBES.
- Comment cleanups.
- Link to v11: https://lore.kernel.org/r/20240822-arm64-gcs-v11-0-41b81947ecb5@kernel.org

Changes in v11:
- Remove the dependency on the addition of clone3() support for shadow stacks, rebasing onto v6.11-rc3.
- Make ID_AA64PFR1_EL1.GCS writeable in KVM.
- Hide GCS registers when GCS is not enabled for KVM guests.
- Require HCRX_EL2.GCSEn if booting at EL1.
- Require that GCSCR_EL1 and GCSCRE0_EL1 be initialised regardless of if we boot at EL2 or EL1.
- Remove some stray use of bit 63 in signal cap tokens.
- Warn if we see a GCS with VM_SHARED.
- Remove redundant check for VM_WRITE in fault handling.
- Cleanups and clarifications in the ABI document.
- Clean up and improve documentation of some sync placement.
- Only set the EL0 GCS mode if it's actually changed.
- Various minor fixes and tweaks.
- Link to v10: https://lore.kernel.org/r/20240801-arm64-gcs-v10-0-699e2bd2190b@kernel.org

Changes in v10:
- Fix issues with THP.
- Tighten up requirements for initialising GCSCR*.
- Only generate GCS signal frames for threads using GCS.
- Only context switch EL1 GCS registers if S1PIE is enabled.
- Move context switch of GCSCRE0_EL1 to EL0 context switch.
- Make GCS registers unconditionally visible to userspace.
- Use FHU infrastructure.
- Don't change writability of ID_AA64PFR1_EL1 for KVM.
- Remove unused arguments from alloc_gcs().
- Typo fixes.
- Link to v9: https://lore.kernel.org/r/20240625-arm64-gcs-v9-0-0f634469b8f0@kernel.org

Changes in v9:
- Rebase onto v6.10-rc3.
- Restructure and clarify memory management fault handling.
- Fix up basic-gcs for the latest clone3() changes.
- Convert to newly merged KVM ID register based feature configuration.
- Fixes for NV traps.
- Link to v8: https://lore.kernel.org/r/20240203-arm64-gcs-v8-0-c9fec77673ef@kernel.org

Changes in v8:
- Invalidate signal cap token on stack when consuming.
- Typo and other trivial fixes.
- Don't try to use process_vm_write() on GCS, it intentionally does not work.
- Fix leak of thread GCSs.
- Rebase onto latest clone3() series.
- Link to v7: https://lore.kernel.org/r/20231122-arm64-gcs-v7-0-201c483bd775@kernel.org

Changes in v7:
- Rebase onto v6.7-rc2 via the clone3() patch series.
- Change the token used to cap the stack during signal handling to be compatible with GCSPOPM.
- Fix flags for new page types.
- Fold in support for clone3().
- Replace copy_to_user_gcs() with put_user_gcs().
- Link to v6: https://lore.kernel.org/r/20231009-arm64-gcs-v6-0-78e55deaa4dd@kernel.org

Changes in v6:
- Rebase onto v6.6-rc3.
- Add some more gcsb_dsync() barriers following spec clarifications.
- Due to ongoing discussion around clone()/clone3() I've not updated anything there, the behaviour is the same as on previous versions.
- Link to v5: https://lore.kernel.org/r/20230822-arm64-gcs-v5-0-9ef181dd6324@kernel.org

Changes in v5:
- Don't map any permissions for user GCSs, we always use EL0 accessors or use a separate mapping of the page.
- Reduce the standard size of the GCS to RLIMIT_STACK/2.
- Enforce a PAGE_SIZE alignment requirement on map_shadow_stack().
- Clarifications and fixes to documentation.
- More tests.
- Link to v4: https://lore.kernel.org/r/20230807-arm64-gcs-v4-0-68cfa37f9069@kernel.org

Changes in v4:
- Implement flags for map_shadow_stack() allowing the cap and end of stack marker to be enabled independently or not at all.
- Relax size and alignment requirements for map_shadow_stack().
- Add more blurb explaining the advantages of hardware enforcement.
- Link to v3: https://lore.kernel.org/r/20230731-arm64-gcs-v3-0-cddf9f980d98@kernel.org

Changes in v3:
- Rebase onto v6.5-rc4.
- Add a GCS barrier on context switch.
- Add a GCS stress test.
- Link to v2: https://lore.kernel.org/r/20230724-arm64-gcs-v2-0-dc2c1d44c2eb@kernel.org

Changes in v2:
- Rebase onto v6.5-rc3.
- Rework prctl() interface to allow each bit to be locked independently.
- map_shadow_stack() now places the cap token based on the size requested by the caller not the actual space allocated.
- Mode changes other than enable via ptrace are now supported.
- Expand test coverage.
- Various smaller fixes and adjustments.
- Link to v1: https://lore.kernel.org/r/20230716-arm64-gcs-v1-0-bf567f93bba6@kernel.org
---
Mark Brown (6):
      arm64/gcs: Ensure FGTs for EL1 GCS instructions are disabled
      KVM: arm64: Manage GCS access and registers for guests
      KVM: arm64: Set PSTATE.EXLOCK when entering an exception
      KVM: arm64: Validate GCS exception lock when emulating ERET
      KVM: arm64: Allow GCS to be enabled for guests
      KVM: selftests: arm64: Add GCS registers to get-reg-list

 arch/arm64/include/asm/el2_setup.h               |  5 +++
 arch/arm64/include/asm/kvm_emulate.h             |  3 ++
 arch/arm64/include/asm/kvm_host.h                | 14 +++++++++
 arch/arm64/include/asm/vncr_mapping.h            |  2 ++
 arch/arm64/include/uapi/asm/ptrace.h             |  1 +
 arch/arm64/kvm/emulate-nested.c                  | 40 +++++++++++++++++++++++-
 arch/arm64/kvm/hyp/exception.c                   | 37 ++++++++++++++++++++++
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h       | 31 ++++++++++++++++++
 arch/arm64/kvm/hyp/vhe/sysreg-sr.c               | 10 ++++++
 arch/arm64/kvm/nested.c                          |  7 +++--
 arch/arm64/kvm/sys_regs.c                        | 32 +++++++++++++++++--
 tools/testing/selftests/kvm/arm64/get-reg-list.c | 12 +++++++
 12 files changed, 188 insertions(+), 6 deletions(-)
---
base-commit: 1b237f190eb3d36f52dffe07a40b5eb210280e00
change-id: 20230303-arm64-gcs-e311ab0d8729

Best regards,
--
Mark Brown <broonie@kernel.org>
The initial EL2 setup for GCS did not include disabling EL1 usage of GCS instructions, so also disable these traps. Since this is the first disabling of instruction traps in this code, use x2 to store the value to be written.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/el2_setup.h | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index 46033027510c..d174f405c44a 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -353,6 +353,11 @@
 	orr	x0, x0, #HFGRTR_EL2_nGCS_EL1_MASK
 	orr	x0, x0, #HFGRTR_EL2_nGCS_EL0_MASK
 
+	/* Disable traps of GCS instructions at EL1 */
+	orr	x2, x2, #HFGITR_EL2_nGCSEPP_MASK
+	orr	x2, x2, #HFGITR_EL2_nGCSSTR_EL1_MASK
+	orr	x2, x2, #HFGITR_EL2_nGCSPUSHM_EL1_MASK
+
 .Lskip_gce_fgt_@:
 
 .Lset_fgt_@:
GCS introduces a number of system registers; on systems with GCS we need to context switch them and expose them to VMMs so that guests can use GCS.

In order to allow guests to use GCS we also need to configure HCRX_EL2.GCSEn; if this is not set, GCS instructions will be no-ops and CHKFEAT will report GCS as disabled.
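For example, a guest can observe the resulting state via CHKFEAT (a sketch, not part of this series; CHKFEAT is allocated in the hint space so it executes as a NOP where FEAT_CHK is not implemented):

	mov	x16, #1			// bit 0 queries GCS
	hint	#0x28			// CHKFEAT X16
	cbz	x16, 1f			// bit 0 now clear: GCS is enabled
	// GCS reported disabled, e.g. because HCRX_EL2.GCSEn is clear
1: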
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_emulate.h       |  3 +++
 arch/arm64/include/asm/kvm_host.h          | 14 ++++++++++++++
 arch/arm64/include/asm/vncr_mapping.h      |  2 ++
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 31 ++++++++++++++++++++++++++++++
 arch/arm64/kvm/hyp/vhe/sysreg-sr.c         | 10 ++++++++++
 arch/arm64/kvm/sys_regs.c                  | 30 ++++++++++++++++++++++++++++++
 6 files changed, 90 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index fa8a08a1ccd5..9ab1e7616854 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -672,6 +672,9 @@ static inline void vcpu_set_hcrx(struct kvm_vcpu *vcpu)
 
 		if (kvm_has_sctlr2(kvm))
 			vcpu->arch.hcrx_el2 |= HCRX_EL2_SCTLR2En;
+
+		if (kvm_has_gcs(kvm))
+			vcpu->arch.hcrx_el2 |= HCRX_EL2_GCSEn;
 	}
 }
 #endif /* __ARM64_KVM_EMULATE_H__ */
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2f2394cce24e..22a3fa9d97aa 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -480,6 +480,10 @@ enum vcpu_sysreg {
 
 	POR_EL0,	/* Permission Overlay Register 0 (EL0) */
 
+	/* Guarded Control Stack registers */
+	GCSCRE0_EL1,	/* Guarded Control Stack Control (EL0) */
+	GCSPR_EL0,	/* Guarded Control Stack Pointer (EL0) */
+
 	/* FP/SIMD/SVE */
 	SVCR,
 	FPMR,
@@ -502,6 +506,8 @@ enum vcpu_sysreg {
 	PIRE0_EL2,	/* Permission Indirection Register 0 (EL2) */
 	PIR_EL2,	/* Permission Indirection Register 1 (EL2) */
 	POR_EL2,	/* Permission Overlay Register 2 (EL2) */
+	GCSCR_EL2,	/* Guarded Control Stack Control Register (EL2) */
+	GCSPR_EL2,	/* Guarded Control Stack Pointer Register (EL2) */
 	SPSR_EL2,	/* EL2 saved program status register */
 	ELR_EL2,	/* EL2 exception link register */
 	AFSR0_EL2,	/* Auxiliary Fault Status Register 0 (EL2) */
@@ -571,6 +577,10 @@ enum vcpu_sysreg {
 	VNCR(VDISR_EL2),
 	VNCR(VSESR_EL2),
 
+	/* Guarded Control Stack registers */
+	VNCR(GCSPR_EL1),	/* Guarded Control Stack Pointer (EL1) */
+	VNCR(GCSCR_EL1),	/* Guarded Control Stack Control (EL1) */
+
 	VNCR(HFGRTR_EL2),
 	VNCR(HFGWTR_EL2),
 	VNCR(HFGITR_EL2),
@@ -1697,6 +1707,10 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
 #define kvm_has_sctlr2(k)				\
 	(kvm_has_feat((k), ID_AA64MMFR3_EL1, SCTLRX, IMP))
 
+#define kvm_has_gcs(k)					\
+	(system_supports_gcs() &&			\
+	 kvm_has_feat((k), ID_AA64PFR1_EL1, GCS, IMP))
+
 static inline bool kvm_arch_has_irq_bypass(void)
 {
 	return true;
diff --git a/arch/arm64/include/asm/vncr_mapping.h b/arch/arm64/include/asm/vncr_mapping.h
index f6ec500ad3fa..e9fac6218d44 100644
--- a/arch/arm64/include/asm/vncr_mapping.h
+++ b/arch/arm64/include/asm/vncr_mapping.h
@@ -95,6 +95,8 @@
 #define VNCR_PMSIRR_EL1		0x840
 #define VNCR_PMSLATFR_EL1	0x848
 #define VNCR_TRFCR_EL1		0x880
+#define VNCR_GCSPR_EL1		0x8C0
+#define VNCR_GCSCR_EL1		0x8D0
 #define VNCR_MPAM1_EL1		0x900
 #define VNCR_MPAMHCR_EL2	0x930
 #define VNCR_MPAMVPMV_EL2	0x938
diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
index a17cbe7582de..053d7b3c5104 100644
--- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
+++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
@@ -17,6 +17,7 @@
 #include <asm/kvm_mmu.h>
 
 static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt);
+static inline bool ctxt_has_gcs(struct kvm_cpu_context *ctxt);
 
 static inline struct kvm_vcpu *ctxt_to_vcpu(struct kvm_cpu_context *ctxt)
 {
@@ -67,6 +68,11 @@ static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
 {
 	ctxt_sys_reg(ctxt, TPIDR_EL0)	= read_sysreg(tpidr_el0);
 	ctxt_sys_reg(ctxt, TPIDRRO_EL0)	= read_sysreg(tpidrro_el0);
+
+	if (ctxt_has_gcs(ctxt)) {
+		ctxt_sys_reg(ctxt, GCSPR_EL0) = read_sysreg_s(SYS_GCSPR_EL0);
+		ctxt_sys_reg(ctxt, GCSCRE0_EL1) = read_sysreg_s(SYS_GCSCRE0_EL1);
+	}
 }
 
 static inline bool ctxt_has_mte(struct kvm_cpu_context *ctxt)
@@ -131,6 +137,17 @@ static inline bool ctxt_has_sctlr2(struct kvm_cpu_context *ctxt)
 	return kvm_has_sctlr2(kern_hyp_va(vcpu->kvm));
 }
 
+static inline bool ctxt_has_gcs(struct kvm_cpu_context *ctxt)
+{
+	struct kvm_vcpu *vcpu;
+
+	if (!cpus_have_final_cap(ARM64_HAS_GCS))
+		return false;
+
+	vcpu = ctxt_to_vcpu(ctxt);
+	return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64PFR1_EL1, GCS, IMP);
+}
+
 static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
 {
 	ctxt_sys_reg(ctxt, SCTLR_EL1)	= read_sysreg_el1(SYS_SCTLR);
@@ -144,6 +161,10 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
 	if (ctxt_has_s1pie(ctxt)) {
 		ctxt_sys_reg(ctxt, PIR_EL1)	= read_sysreg_el1(SYS_PIR);
 		ctxt_sys_reg(ctxt, PIRE0_EL1)	= read_sysreg_el1(SYS_PIRE0);
+		if (ctxt_has_gcs(ctxt)) {
+			ctxt_sys_reg(ctxt, GCSPR_EL1) = read_sysreg_el1(SYS_GCSPR);
+			ctxt_sys_reg(ctxt, GCSCR_EL1) = read_sysreg_el1(SYS_GCSCR);
+		}
 	}
 
 	if (ctxt_has_s1poe(ctxt))
@@ -206,6 +227,11 @@ static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
 {
 	write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL0),	tpidr_el0);
 	write_sysreg(ctxt_sys_reg(ctxt, TPIDRRO_EL0),	tpidrro_el0);
+	if (ctxt_has_gcs(ctxt)) {
+		write_sysreg_s(ctxt_sys_reg(ctxt, GCSPR_EL0), SYS_GCSPR_EL0);
+		write_sysreg_s(ctxt_sys_reg(ctxt, GCSCRE0_EL1),
+			       SYS_GCSCRE0_EL1);
+	}
 }
 
 static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt,
@@ -239,6 +265,11 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt,
 	if (ctxt_has_s1pie(ctxt)) {
 		write_sysreg_el1(ctxt_sys_reg(ctxt, PIR_EL1),	SYS_PIR);
 		write_sysreg_el1(ctxt_sys_reg(ctxt, PIRE0_EL1),	SYS_PIRE0);
+
+		if (ctxt_has_gcs(ctxt)) {
+			write_sysreg_el1(ctxt_sys_reg(ctxt, GCSPR_EL1), SYS_GCSPR);
+			write_sysreg_el1(ctxt_sys_reg(ctxt, GCSCR_EL1), SYS_GCSCR);
+		}
 	}
 
 	if (ctxt_has_s1poe(ctxt))
diff --git a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
index f28c6cf4fe1b..e63b473d66d1 100644
--- a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
@@ -61,6 +61,11 @@ static void __sysreg_save_vel2_state(struct kvm_vcpu *vcpu)
 
 	if (ctxt_has_s1poe(&vcpu->arch.ctxt))
 		__vcpu_assign_sys_reg(vcpu, POR_EL2, read_sysreg_el1(SYS_POR));
+
+	if (ctxt_has_gcs(&vcpu->arch.ctxt)) {
+		__vcpu_assign_sys_reg(vcpu, GCSCR_EL2, read_sysreg_el1(SYS_GCSCR));
+		__vcpu_assign_sys_reg(vcpu, GCSPR_EL2, read_sysreg_el1(SYS_GCSPR));
+	}
 }
 
 /*
@@ -133,6 +138,11 @@ static void __sysreg_restore_vel2_state(struct kvm_vcpu *vcpu)
 
 		if (ctxt_has_s1poe(&vcpu->arch.ctxt))
 			write_sysreg_el1(__vcpu_sys_reg(vcpu, POR_EL2), SYS_POR);
+
+		if (ctxt_has_gcs(&vcpu->arch.ctxt)) {
+			write_sysreg_el1(__vcpu_sys_reg(vcpu, GCSCR_EL2), SYS_GCSCR);
+			write_sysreg_el1(__vcpu_sys_reg(vcpu, GCSPR_EL2), SYS_GCSPR);
+		}
 	}
 
 	write_sysreg_el1(__vcpu_sys_reg(vcpu, ESR_EL2),	SYS_ESR);
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 82ffb3b3b3cf..03fb2dce0b80 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -138,6 +138,8 @@ static bool get_el2_to_el1_mapping(unsigned int reg,
 		MAPPED_EL2_SYSREG(PIR_EL2,	PIR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(PIRE0_EL2,	PIRE0_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(POR_EL2,	POR_EL1,	NULL	);
+		MAPPED_EL2_SYSREG(GCSCR_EL2,	GCSCR_EL1,	NULL	);
+		MAPPED_EL2_SYSREG(GCSPR_EL2,	GCSPR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(AMAIR_EL2,	AMAIR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(ELR_EL2,	ELR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(SPSR_EL2,	SPSR_EL1,	NULL	);
@@ -2654,6 +2656,21 @@ static unsigned int s1pie_el2_visibility(const struct kvm_vcpu *vcpu,
 	return __el2_visibility(vcpu, rd, s1pie_visibility);
 }
 
+static unsigned int gcs_visibility(const struct kvm_vcpu *vcpu,
+				   const struct sys_reg_desc *r)
+{
+	if (kvm_has_gcs(vcpu->kvm))
+		return 0;
+
+	return REG_HIDDEN;
+}
+
+static unsigned int gcs_el2_visibility(const struct kvm_vcpu *vcpu,
+				       const struct sys_reg_desc *rd)
+{
+	return __el2_visibility(vcpu, rd, gcs_visibility);
+}
+
 static bool access_mdcr(struct kvm_vcpu *vcpu,
 			struct sys_reg_params *p,
 			const struct sys_reg_desc *r)
@@ -3048,6 +3065,13 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	PTRAUTH_KEY(APDB),
 	PTRAUTH_KEY(APGA),
 
+	{ SYS_DESC(SYS_GCSCR_EL1), NULL, reset_val, GCSCR_EL1, 0,
+	  .visibility = gcs_visibility },
+	{ SYS_DESC(SYS_GCSPR_EL1), NULL, reset_unknown, GCSPR_EL1,
+	  .visibility = gcs_visibility },
+	{ SYS_DESC(SYS_GCSCRE0_EL1), NULL, reset_val, GCSCRE0_EL1, 0,
+	  .visibility = gcs_visibility },
+
 	{ SYS_DESC(SYS_SPSR_EL1), access_spsr},
 	{ SYS_DESC(SYS_ELR_EL1), access_elr},
 
@@ -3162,6 +3186,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 		   CTR_EL0_DminLine_MASK |
 		   CTR_EL0_L1Ip_MASK |
 		   CTR_EL0_IminLine_MASK),
+	{ SYS_DESC(SYS_GCSPR_EL0), NULL, reset_unknown, GCSPR_EL0,
+	  .visibility = gcs_visibility },
 	{ SYS_DESC(SYS_SVCR), undef_access, reset_val, SVCR, 0,
 	  .visibility = sme_visibility },
 	{ SYS_DESC(SYS_FPMR), undef_access, reset_val, FPMR, 0,
 	  .visibility = fp8_visibility },
@@ -3401,6 +3427,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	EL2_REG_FILTERED(VNCR_EL2, bad_vncr_trap, reset_val, 0,
 			 vncr_el2_visibility),
 
+	EL2_REG_FILTERED(GCSCR_EL2, access_rw, reset_val, 0,
+			 gcs_el2_visibility),
+	EL2_REG_FILTERED(GCSPR_EL2, access_rw, reset_val, 0,
+			 gcs_el2_visibility),
 	{ SYS_DESC(SYS_DACR32_EL2), undef_access, reset_unknown, DACR32_EL2 },
 	EL2_REG_VNCR_FILT(HDFGRTR2_EL2, fgt2_visibility),
 	EL2_REG_VNCR_FILT(HDFGWTR2_EL2, fgt2_visibility),
On Fri, 12 Sep 2025 10:25:28 +0100, Mark Brown <broonie@kernel.org> wrote:

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 82ffb3b3b3cf..03fb2dce0b80 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -138,6 +138,8 @@ static bool get_el2_to_el1_mapping(unsigned int reg,
 		MAPPED_EL2_SYSREG(PIR_EL2,	PIR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(PIRE0_EL2,	PIRE0_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(POR_EL2,	POR_EL1,	NULL	);
+		MAPPED_EL2_SYSREG(GCSCR_EL2,	GCSCR_EL1,	NULL	);
+		MAPPED_EL2_SYSREG(GCSPR_EL2,	GCSPR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(AMAIR_EL2,	AMAIR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(ELR_EL2,	ELR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(SPSR_EL2,	SPSR_EL1,	NULL	);
Just like the previous version, you're missing the accessors that would make this table useful. Meaning that the vcpu_read_sys_reg() and vcpu_write_sys_reg() accessors will fail for all 4 GCS registers.
M.
On Fri, Sep 12, 2025 at 12:59:23PM +0100, Marc Zyngier wrote:
On Fri, 12 Sep 2025 10:25:28 +0100, Mark Brown <broonie@kernel.org> wrote:

 		MAPPED_EL2_SYSREG(PIR_EL2,	PIR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(PIRE0_EL2,	PIRE0_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(POR_EL2,	POR_EL1,	NULL	);
+		MAPPED_EL2_SYSREG(GCSCR_EL2,	GCSCR_EL1,	NULL	);
+		MAPPED_EL2_SYSREG(GCSPR_EL2,	GCSPR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(AMAIR_EL2,	AMAIR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(ELR_EL2,	ELR_EL1,	NULL	);
 		MAPPED_EL2_SYSREG(SPSR_EL2,	SPSR_EL1,	NULL	);
Just like the previous version, you're missing the accessors that would make this table useful. Meaning that the vcpu_read_sys_reg() and vcpu_write_sys_reg() accessors will fail for all 4 GCS registers.
Just to confirm, this is __vcpu_{read,write}_sysreg()?
Sorry, I missed your comment about this on the prior version due to UI confusion with my mail client. My mistake.
On Fri, Sep 12, 2025 at 05:33:39PM +0100, Mark Brown wrote:
On Fri, Sep 12, 2025 at 12:59:23PM +0100, Marc Zyngier wrote:
Just like the previous version, you're missing the accessors that would make this table useful. Meaning that the vcpu_read_sys_reg() and vcpu_write_sys_reg() accessors will fail for all 4 GCS registers.
Just to confirm, this is __vcpu_{read,write}_sysreg()?
Sorry, that should have been __vcpu_{read,write}_sys_reg_{to,from}_cpu() *sigh*
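Presumably the fix is then to add cases for the GCS registers to the switches in those helpers, along these lines (a sketch of what I have in mind only; the use of the _EL12 accessors here is an assumption on my part, not something from the posted series):

	case GCSCR_EL1:	*val = read_sysreg_s(SYS_GCSCR_EL12);	break;
	case GCSPR_EL1:	*val = read_sysreg_s(SYS_GCSPR_EL12);	break;

with the corresponding write_sysreg_s() cases on the write side.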
As per DDI 0487 R_WTXBY we need to manage PSTATE.EXLOCK when entering an exception: when the exception is taken from a lower EL the bit is cleared, while when it is taken from the same EL it is set to the effective value of GCSCR_ELx.EXLOCKEN. Implement this behaviour in enter_exception64().
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/uapi/asm/ptrace.h |  1 +
 arch/arm64/kvm/hyp/exception.c       | 37 ++++++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)
diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
index 0f39ba4f3efd..f2fb029fb61a 100644
--- a/arch/arm64/include/uapi/asm/ptrace.h
+++ b/arch/arm64/include/uapi/asm/ptrace.h
@@ -56,6 +56,7 @@
 #define PSR_C_BIT	0x20000000
 #define PSR_Z_BIT	0x40000000
 #define PSR_N_BIT	0x80000000
+#define PSR_EXLOCK_BIT	0x400000000
 
 #define PSR_BTYPE_SHIFT		10
 
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index 95d186e0bf54..888cea250dc2 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -73,6 +73,38 @@ static void __vcpu_write_spsr_und(struct kvm_vcpu *vcpu, u64 val)
 	vcpu->arch.ctxt.spsr_und = val;
 }
 
+static unsigned long compute_exlock(struct kvm_vcpu *vcpu,
+				    unsigned long mode,
+				    unsigned long target_mode)
+{
+	u64 gcscr;
+
+	if (!kvm_has_gcs(kern_hyp_va(vcpu->kvm)))
+		return 0;
+
+	/* GCS can't be enabled for 32 bit */
+	if (mode & PSR_MODE32_BIT)
+		return 0;
+
+	/* When taking an exception to a higher EL EXLOCK is cleared. */
+	if ((mode | PSR_MODE_THREAD_BIT) != target_mode)
+		return 0;
+
+	/*
+	 * When taking an exception to the same EL EXLOCK is set to
+	 * the effective value of GCSCR_ELx.EXLOCKEN.
+	 */
+	if (vcpu_is_el2(vcpu))
+		gcscr = __vcpu_read_sys_reg(vcpu, GCSCR_EL2);
+	else
+		gcscr = __vcpu_read_sys_reg(vcpu, GCSCR_EL1);
+
+	if (gcscr & GCSCR_ELx_EXLOCKEN)
+		return PSR_EXLOCK_BIT;
+
+	return 0;
+}
+
 /*
  * This performs the exception entry at a given EL (@target_mode), stashing PC
  * and PSTATE into ELR and SPSR respectively, and compute the new PC/PSTATE.
@@ -162,6 +194,11 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
 	// PSTATE.BTYPE is set to zero upon any exception to AArch64
 	// See ARM DDI 0487E.a, pages D1-2293 to D1-2294.
 
+	// PSTATE.EXLOCK is set to 0 upon any exception to a higher
+	// EL, or to GCSCR_ELx.EXLOCKEN for an exception to the same
+	// exception level. See ARM DDI 0487 R_WTXBY.
+	new |= compute_exlock(vcpu, mode, target_mode);
+
 	new |= PSR_D_BIT;
 	new |= PSR_A_BIT;
 	new |= PSR_I_BIT;
As per DDI0487 R_TYTWB, GCS adds an additional case where an illegal exception return can be generated. If all of:

- PSTATE.EXLOCK is 0.
- The EL is not being changed by the ERET.
- GCSCR_ELx.EXLOCKEN is 1.

are true then the return is illegal. Emulate this behaviour when emulating ERET for nested guests.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kvm/emulate-nested.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
index 90cb4b7ae0ff..9b02b85eda64 100644
--- a/arch/arm64/kvm/emulate-nested.c
+++ b/arch/arm64/kvm/emulate-nested.c
@@ -2632,6 +2632,41 @@ bool forward_debug_exception(struct kvm_vcpu *vcpu)
 	return forward_mdcr_traps(vcpu, MDCR_EL2_TDE);
 }
 
+/*
+ * A subset of the pseudocode ELFromSPSR(), validity checks are
+ * assumed to have been done in code that is not GCS specific.
+ */
+static inline int exlock_el_from_spsr(u64 spsr)
+{
+	return FIELD_GET(GENMASK(3, 2), spsr);
+}
+
+/* See IllegalExceptionReturn() pseudocode */
+static bool kvm_check_illegal_exlock_return(struct kvm_vcpu *vcpu, u64 spsr)
+{
+	u64 cur_el, target_el;
+	u64 gcscr;
+
+	if (!kvm_has_gcs(vcpu->kvm))
+		return false;
+
+	if (spsr & PSR_EXLOCK_BIT)
+		return false;
+
+	cur_el = exlock_el_from_spsr(vcpu->arch.ctxt.regs.pstate);
+	target_el = exlock_el_from_spsr(spsr);
+
+	if (cur_el != target_el)
+		return false;
+
+	if (vcpu_is_el2(vcpu))
+		gcscr = __vcpu_sys_reg(vcpu, GCSCR_EL2);
+	else
+		gcscr = __vcpu_sys_reg(vcpu, GCSCR_EL1);
+
+	return gcscr & GCSCR_ELx_EXLOCKEN;
+}
+
 static u64 kvm_check_illegal_exception_return(struct kvm_vcpu *vcpu, u64 spsr)
 {
 	u64 mode = spsr & PSR_MODE_MASK;
@@ -2642,12 +2677,15 @@ static u64 kvm_check_illegal_exception_return(struct kvm_vcpu *vcpu, u64 spsr)
 	 * - trying to return to an illegal M value
 	 * - trying to return to a 32bit EL
 	 * - trying to return to EL1 with HCR_EL2.TGE set
+	 * - GCSCR_ELx.EXLOCKEN is 1 and PSTATE.EXLOCK is 0 when attempting
+	 *   to return to the same EL
 	 */
 	if (mode == PSR_MODE_EL3t   || mode == PSR_MODE_EL3h ||
 	    mode == 0b00001         || (mode & BIT(1))       ||
 	    (spsr & PSR_MODE32_BIT) ||
 	    (vcpu_el2_tge_is_set(vcpu) && (mode == PSR_MODE_EL1t ||
-					   mode == PSR_MODE_EL1h))) {
+					   mode == PSR_MODE_EL1h)) ||
+	    kvm_check_illegal_exlock_return(vcpu, spsr)) {
 		/*
 		 * The guest is playing with our nerves. Preserve EL, SP,
 		 * masks, flags from the existing PSTATE, and set IL.
On Fri, 12 Sep 2025 10:25:30 +0100, Mark Brown <broonie@kernel.org> wrote:
As per DDI0487 R_TYTWB, GCS adds an additional case where an illegal exception return can be generated. If all of:

- PSTATE.EXLOCK is 0.
- The EL is not being changed by the ERET.
- GCSCR_ELx.EXLOCKEN is 1.

are true then the return is illegal. Emulate this behaviour when emulating ERET for nested guests.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kvm/emulate-nested.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
index 90cb4b7ae0ff..9b02b85eda64 100644
--- a/arch/arm64/kvm/emulate-nested.c
+++ b/arch/arm64/kvm/emulate-nested.c
@@ -2632,6 +2632,41 @@ bool forward_debug_exception(struct kvm_vcpu *vcpu)
 	return forward_mdcr_traps(vcpu, MDCR_EL2_TDE);
 }
 
+/*
+ * A subset of the pseudocode ELFromSPSR(), validity checks are
+ * assumed to have been done in code that is not GCS specific.
+ */
+static inline int exlock_el_from_spsr(u64 spsr)
+{
+	return FIELD_GET(GENMASK(3, 2), spsr);
+}
+
+/* See IllegalExceptionReturn() pseudocode */
+static bool kvm_check_illegal_exlock_return(struct kvm_vcpu *vcpu, u64 spsr)
+{
+	u64 cur_el, target_el;
+	u64 gcscr;
+
+	if (!kvm_has_gcs(vcpu->kvm))
+		return false;
+
+	if (spsr & PSR_EXLOCK_BIT)
+		return false;
+
+	cur_el = exlock_el_from_spsr(vcpu->arch.ctxt.regs.pstate);
+	target_el = exlock_el_from_spsr(spsr);
+
+	if (cur_el != target_el)
+		return false;
+
+	if (vcpu_is_el2(vcpu))
+		gcscr = __vcpu_sys_reg(vcpu, GCSCR_EL2);
+	else
+		gcscr = __vcpu_sys_reg(vcpu, GCSCR_EL1);
At the point where we check for an illegal exception return, the state is live on the CPU. How does this work? Also, we only handle ERET traps for EL2, not EL1.
+	return gcscr & GCSCR_ELx_EXLOCKEN;
+}
+
 static u64 kvm_check_illegal_exception_return(struct kvm_vcpu *vcpu, u64 spsr)
 {
 	u64 mode = spsr & PSR_MODE_MASK;
@@ -2642,12 +2677,15 @@ static u64 kvm_check_illegal_exception_return(struct kvm_vcpu *vcpu, u64 spsr)
 	 * - trying to return to an illegal M value
 	 * - trying to return to a 32bit EL
 	 * - trying to return to EL1 with HCR_EL2.TGE set
+	 * - GCSCR_ELx.EXLOCKEN is 1 and PSTATE.EXLOCK is 0 when attempting
+	 *   to return to the same EL
 	 */
 	if (mode == PSR_MODE_EL3t   || mode == PSR_MODE_EL3h ||
 	    mode == 0b00001         || (mode & BIT(1))       ||
 	    (spsr & PSR_MODE32_BIT) ||
 	    (vcpu_el2_tge_is_set(vcpu) && (mode == PSR_MODE_EL1t ||
-					   mode == PSR_MODE_EL1h))) {
+					   mode == PSR_MODE_EL1h)) ||
+	    kvm_check_illegal_exlock_return(vcpu, spsr)) {
This code is simply never reached. Hint: kvm_hyp_handle_eret().
M.
Now that the required functionality for GCS is in place, expose ID_AA64PFR1_EL1.GCS, allowing guests to be given the feature.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kvm/nested.c   | 7 ++++---
 arch/arm64/kvm/sys_regs.c | 2 --
 2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 153b3e11b115..d2d55e18c610 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1459,9 +1459,10 @@ u64 limit_nv_id_reg(struct kvm *kvm, u32 reg, u64 val)
 
 	case SYS_ID_AA64PFR1_EL1:
 		/* Only support BTI, SSBS, CSV2_frac */
-		val &= (ID_AA64PFR1_EL1_BT |
-			ID_AA64PFR1_EL1_SSBS |
-			ID_AA64PFR1_EL1_CSV2_frac);
+		val &= (ID_AA64PFR1_EL1_BT |
+			ID_AA64PFR1_EL1_SSBS |
+			ID_AA64PFR1_EL1_CSV2_frac |
+			ID_AA64PFR1_EL1_GCS);
 		break;
 
 	case SYS_ID_AA64MMFR0_EL1:
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 03fb2dce0b80..60e234422064 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1616,7 +1616,6 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
 		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SME);
 		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_RNDR_trap);
 		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_NMI);
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_GCS);
 		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_THE);
 		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTEX);
 		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_PFAR);
@@ -2953,7 +2952,6 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 		   ~(ID_AA64PFR1_EL1_PFAR |
 		     ID_AA64PFR1_EL1_MTEX |
 		     ID_AA64PFR1_EL1_THE |
-		     ID_AA64PFR1_EL1_GCS |
 		     ID_AA64PFR1_EL1_MTE_frac |
 		     ID_AA64PFR1_EL1_NMI |
 		     ID_AA64PFR1_EL1_RNDR_trap |
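With this a VMM can opt a guest in to (or out of) GCS by writing the GCS field of the now-writable ID_AA64PFR1_EL1. A minimal userspace sketch, not part of the series, assuming the usual KVM_{GET,SET}_ONE_REG flow and the architectural GCS field position at bits [47:44]; error handling elided:

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	/* ID_AA64PFR1_EL1: op0=3, op1=0, CRn=0, CRm=4, op2=1 */
	#define PFR1 ARM64_SYS_REG(3, 0, 0, 4, 1)

	static int set_guest_gcs(int vcpu_fd, int enable)
	{
		uint64_t val;
		struct kvm_one_reg reg = {
			.id   = PFR1,
			.addr = (uint64_t)&val,
		};

		if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg) < 0)
			return -1;

		if (enable)
			val |= 1ULL << 44;		/* GCS = IMP (field assumed at [47:44]) */
		else
			val &= ~(0xfULL << 44);		/* GCS = NI */

		return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
	}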
GCS adds new registers GCSCR_EL1, GCSCRE0_EL1, GCSPR_EL1 and GCSPR_EL0. Add them to the registers validated by get-reg-list.
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/kvm/arm64/get-reg-list.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
diff --git a/tools/testing/selftests/kvm/arm64/get-reg-list.c b/tools/testing/selftests/kvm/arm64/get-reg-list.c
index 011fad95dd02..9bf33064377b 100644
--- a/tools/testing/selftests/kvm/arm64/get-reg-list.c
+++ b/tools/testing/selftests/kvm/arm64/get-reg-list.c
@@ -42,6 +42,12 @@ struct feature_id_reg {
 static struct feature_id_reg feat_id_regs[] = {
 	REG_FEAT(TCR2_EL1, ID_AA64MMFR3_EL1, TCRX, IMP),
 	REG_FEAT(TCR2_EL2, ID_AA64MMFR3_EL1, TCRX, IMP),
+	REG_FEAT(GCSPR_EL0, ID_AA64PFR1_EL1, GCS, IMP),
+	REG_FEAT(GCSPR_EL1, ID_AA64PFR1_EL1, GCS, IMP),
+	REG_FEAT(GCSPR_EL2, ID_AA64PFR1_EL1, GCS, IMP),
+	REG_FEAT(GCSCRE0_EL1, ID_AA64PFR1_EL1, GCS, IMP),
+	REG_FEAT(GCSCR_EL1, ID_AA64PFR1_EL1, GCS, IMP),
+	REG_FEAT(GCSCR_EL2, ID_AA64PFR1_EL1, GCS, IMP),
 	REG_FEAT(PIRE0_EL1, ID_AA64MMFR3_EL1, S1PIE, IMP),
 	REG_FEAT(PIRE0_EL2, ID_AA64MMFR3_EL1, S1PIE, IMP),
 	REG_FEAT(PIR_EL1, ID_AA64MMFR3_EL1, S1PIE, IMP),
@@ -486,6 +492,9 @@ static __u64 base_regs[] = {
 	ARM64_SYS_REG(3, 0, 2, 0, 1),	/* TTBR1_EL1 */
 	ARM64_SYS_REG(3, 0, 2, 0, 2),	/* TCR_EL1 */
 	ARM64_SYS_REG(3, 0, 2, 0, 3),	/* TCR2_EL1 */
+	ARM64_SYS_REG(3, 0, 2, 5, 0),	/* GCSCR_EL1 */
+	ARM64_SYS_REG(3, 0, 2, 5, 1),	/* GCSPR_EL1 */
+	ARM64_SYS_REG(3, 0, 2, 5, 2),	/* GCSCRE0_EL1 */
 	ARM64_SYS_REG(3, 0, 5, 1, 0),	/* AFSR0_EL1 */
 	ARM64_SYS_REG(3, 0, 5, 1, 1),	/* AFSR1_EL1 */
 	ARM64_SYS_REG(3, 0, 5, 2, 0),	/* ESR_EL1 */
@@ -502,6 +511,7 @@ static __u64 base_regs[] = {
 	ARM64_SYS_REG(3, 0, 13, 0, 4),	/* TPIDR_EL1 */
 	ARM64_SYS_REG(3, 0, 14, 1, 0),	/* CNTKCTL_EL1 */
 	ARM64_SYS_REG(3, 2, 0, 0, 0),	/* CSSELR_EL1 */
+	ARM64_SYS_REG(3, 3, 2, 5, 1),	/* GCSPR_EL0 */
 	ARM64_SYS_REG(3, 3, 10, 2, 4),	/* POR_EL0 */
 	ARM64_SYS_REG(3, 3, 13, 0, 2),	/* TPIDR_EL0 */
 	ARM64_SYS_REG(3, 3, 13, 0, 3),	/* TPIDRRO_EL0 */
@@ -740,6 +750,8 @@ static __u64 el2_regs[] = {
 	SYS_REG(PIRE0_EL2),
 	SYS_REG(PIR_EL2),
 	SYS_REG(POR_EL2),
+	SYS_REG(GCSPR_EL2),
+	SYS_REG(GCSCR_EL2),
 	SYS_REG(AMAIR_EL2),
 	SYS_REG(VBAR_EL2),
 	SYS_REG(CONTEXTIDR_EL2),
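As a usage note (not part of the patch), the same encodings also let a VMM read these registers directly once GCS is enabled for the guest; for example, for GCSPR_EL0 (a sketch, error handling elided):

	uint64_t gcspr;
	struct kvm_one_reg reg = {
		.id   = ARM64_SYS_REG(3, 3, 2, 5, 1),	/* GCSPR_EL0 */
		.addr = (uint64_t)&gcspr,
	};

	ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);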