This series implements support for SME use in non-protected KVM guests. Much of this is very similar to SVE, the main additional challenge that SME presents is that it introduces two new controls which change the registers seen by guests:
- PSTATE.ZA enables the ZA matrix register and, if SME2 is supported, the ZT0 LUT register.
- PSTATE.SM enables streaming mode, a new floating point mode which uses the SVE register set with a separately configured vector length. In streaming mode implementation of the FFR register is optional.
It is also permitted to build systems which support SME without SVE; in this case no SVE registers or instructions are available when not in streaming mode. Further, there is no requirement that there be any overlap in the set of vector lengths supported by SVE and SME in a system; this is expected to be a common situation in practical systems.
Since there is a new vector length to configure we introduce a new feature parallel to the existing SVE one, with a new pseudo register for the streaming mode vector length. Due to the overlap with SVE caused by streaming mode, rather than finalising SME as a separate feature we use the existing SVE finalisation to also finalise SME; a new define KVM_ARM_VCPU_VEC is provided to help make user code clearer. Finalising SVE and SME separately would introduce complication with register access since finalising SVE makes the SVE registers writeable by userspace and doing multiple finalisations results in an error being reported. Dealing with a state where the SVE registers are writeable due to one of SVE or SME being finalised but may have their VL changed by the other being finalised seems like needless complexity with minimal practical utility; it seems clearer to just express directly in the ABI that only one finalisation can be done.
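To make the single finalisation rule concrete, here is a minimal userspace model of it. This is only a sketch and not the KVM code: struct vcpu_model, model_set_vl() and model_finalize_vec() are hypothetical names invented for illustration, and the error values are assumptions chosen to match the EPERM behaviour described for the existing SVE ABI.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of per-vCPU vector state; not a kernel structure. */
enum model_vec_type { MODEL_VEC_SVE, MODEL_VEC_SME, MODEL_VEC_MAX };

struct vcpu_model {
	bool has[MODEL_VEC_MAX];		/* feature requested at init time */
	unsigned int max_vl[MODEL_VEC_MAX];	/* vector length in bytes */
	bool vec_finalized;			/* one flag covers SVE and SME */
};

/* Vector lengths may only be changed before finalisation. */
static int model_set_vl(struct vcpu_model *v, enum model_vec_type t,
			unsigned int vl)
{
	assert(t < MODEL_VEC_MAX);
	if (!v->has[t])
		return -ENOENT;
	if (v->vec_finalized)
		return -EPERM;
	if (vl < 16 || vl % 16)
		return -EINVAL;
	v->max_vl[t] = vl;
	return 0;
}

/* A single finalisation fixes both vector lengths at once. */
static int model_finalize_vec(struct vcpu_model *v)
{
	if (!v->has[MODEL_VEC_SVE] && !v->has[MODEL_VEC_SME])
		return -EINVAL;
	if (v->vec_finalized)
		return -EPERM;
	v->vec_finalized = true;
	return 0;
}
```

With one shared flag there is no intermediate state in which the registers are writeable while one of the vector lengths can still change, which is the situation the paragraph above wants to avoid.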
We represent the streaming mode registers to userspace by always using the existing SVE registers to access the floating point state, using the larger of the SME and (if enabled for the guest) SVE vector lengths.
There are a large number of subfeatures for SME, most of which only offer additional instructions but some of which (SME2 and FA64) add architectural state. The expectation is that these will be configured via the ID registers but since the mechanism for doing this is still unclear the current code enables SME2 and FA64 for the guest if the host supports them regardless of what the ID registers say.
Since we do not yet have support for SVE in protected guests and SME is very reliant on SVE this series does not implement support for SME in protected guests. This will be added separately once SVE support is merged into mainline (or along with merging that), there is code for protected guests using SVE in the Android tree.
The new KVM_ARM_VCPU_VEC feature and ZA and ZT0 registers have not been added to the get-reg-list selftest, the idea of supporting additional features there without restructuring the program to generate all possible feature combinations has been rejected. I will post a separate series which does that restructuring.
I am seeing some test failures currently which I've not got to the bottom of, at this point I'm reasonably sure these are preexisting issues in the kernel which are more apparent in a guest.
To: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oliver.upton@linux.dev>
To: James Morse <james.morse@arm.com>
To: Suzuki K Poulose <suzuki.poulose@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
To: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
To: Paolo Bonzini <pbonzini@redhat.com>
To: Jonathan Corbet <corbet@lwn.net>
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
To: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Changes in v2:
- Rebase onto v6.7-rc3.
- Configure subfeatures based on host system only.
- Complete nVHE support.
- There was some snafu with sending v1 out, it didn't make it to the lists but in case it hit people's inboxes I'm sending as v2.
---
Mark Brown (22):
      KVM: arm64: Document why we trap SVE access from the host
      arm64/fpsimd: Make SVE<->FPSIMD rewriting available to KVM
      KVM: arm64: Move SVE state access macros after feature test macros
      KVM: arm64: Store vector lengths in an array
      KVM: arm64: Document the KVM ABI for SME
      KVM: arm64: Make FFR restore optional in __sve_restore_state()
      KVM: arm64: Define guest flags for SME
      KVM: arm64: Rename SVE finalization constants to be more general
      KVM: arm64: Basic SME system register descriptions
      KVM: arm64: Add support for TPIDR2_EL0
      KVM: arm64: Make SMPRI_EL1 RES0 for SME guests
      KVM: arm64: Make SVCR a normal system register
      KVM: arm64: Context switch SME state for guest
      KVM: arm64: Manage and handle SME traps
      KVM: arm64: Implement SME vector length configuration
      KVM: arm64: Rename sve_state_reg_region
      KVM: arm64: Support userspace access to streaming mode SVE registers
      KVM: arm64: Expose ZA to userspace
      KVM: arm64: Provide userspace access to ZT0
      KVM: arm64: Support SME version configuration via ID registers
      KVM: arm64: Provide userspace ABI for enabling SME
      KVM: arm64: selftests: Add SME system registers to get-reg-list
 Documentation/virt/kvm/api.rst                     | 104 +++++---
 arch/arm64/include/asm/fpsimd.h                    |   5 +
 arch/arm64/include/asm/kvm_emulate.h               |  13 +-
 arch/arm64/include/asm/kvm_host.h                  |  99 +++++---
 arch/arm64/include/asm/kvm_hyp.h                   |   3 +-
 arch/arm64/include/uapi/asm/kvm.h                  |  33 +++
 arch/arm64/kernel/fpsimd.c                         |  51 +++-
 arch/arm64/kvm/arm.c                               |  16 +-
 arch/arm64/kvm/fpsimd.c                            | 266 ++++++++++++++++++---
 arch/arm64/kvm/guest.c                             | 230 +++++++++++++++---
 arch/arm64/kvm/handle_exit.c                       |  11 +
 arch/arm64/kvm/hyp/fpsimd.S                        |  11 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h            |  86 ++++++-
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h         |  16 ++
 arch/arm64/kvm/hyp/nvhe/hyp-main.c                 |  60 ++++-
 arch/arm64/kvm/hyp/nvhe/switch.c                   |  13 +-
 arch/arm64/kvm/hyp/vhe/switch.c                    |   3 +
 arch/arm64/kvm/reset.c                             | 150 +++++++++---
 arch/arm64/kvm/sys_regs.c                          |  67 +++++-
 include/uapi/linux/kvm.h                           |   1 +
 tools/testing/selftests/kvm/aarch64/get-reg-list.c |  32 ++-
 21 files changed, 1063 insertions(+), 207 deletions(-)
---
base-commit: 4ae6e89253b387476c2ba0202c3a80f2e1284e91
change-id: 20230301-kvm-arm64-sme-06a1246d3636
Best regards,
When we exit from a SVE guest we leave the SVE configuration in EL2 as it was for the guest, only switching back to the host configuration on next use by the host. This is perhaps a little surprising, add comments explaining what is going on both in the trap handler and when we configure the traps.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_emulate.h | 1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 78a550537b67..14d6ff2e2a39 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -598,6 +598,7 @@ static __always_inline u64 kvm_get_reset_cptr_el2(struct kvm_vcpu *vcpu)
 	} else {
 		val = CPTR_NVHE_EL2_RES1;

+		/* Leave traps enabled, we will restore EL2 lazily */
 		if (vcpu_has_sve(vcpu) &&
 		    (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED))
 			val |= CPTR_EL2_TZ;
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 2385fd03ed87..84deed83e580 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -420,6 +420,7 @@ void handle_trap(struct kvm_cpu_context *host_ctxt)
 		handle_host_smc(host_ctxt);
 		break;
 	case ESR_ELx_EC_SVE:
+		/* Handle lazy restore of the host VL */
 		if (has_hvhe())
 			sysreg_clear_set(cpacr_el1, 0, (CPACR_EL1_ZEN_EL1EN |
 							CPACR_EL1_ZEN_EL0EN));
We have routines to rewrite between FPSIMD and SVE data formats, make these visible to the KVM host code so that we can use them to present the FP register state in SVE format for guests which have SME without SVE.
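As an illustration of what the rewriting amounts to, the sketch below models the two directions in plain userspace C. It is not the kernel code: the model_ names and the packed buffer layout (Zn at byte offset n * vq * 16) are assumptions made for the example, and the byte-wise copy assumes a little-endian host, where the kernel's 128-bit endianness conversion is a no-op.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NUM_ZREGS	32	/* Z0..Z31 */
#define MODEL_VQ_BYTES	16	/* one quadword, 128 bits */

/* Hypothetical stand-in for the V register file (32 x 128 bits). */
struct model_fpsimd_state {
	uint8_t vregs[MODEL_NUM_ZREGS][MODEL_VQ_BYTES];
};

/* Byte offset of Zn in a packed buffer of Z registers of vq quadwords. */
static size_t model_zreg_offset(unsigned int vq, unsigned int n)
{
	return (size_t)n * vq * MODEL_VQ_BYTES;
}

/* SVE -> FPSIMD: Vn is the low 128 bits of Zn, higher bits are dropped. */
static void model_sve_to_fpsimd(struct model_fpsimd_state *fst,
				const uint8_t *sst, unsigned int vq)
{
	assert(vq >= 1);
	for (unsigned int i = 0; i < MODEL_NUM_ZREGS; i++)
		memcpy(fst->vregs[i], sst + model_zreg_offset(vq, i),
		       MODEL_VQ_BYTES);
}

/*
 * FPSIMD -> SVE: Vn becomes the low 128 bits of Zn. The upper bits of
 * Zn are left untouched, so a caller wanting zero padding must clear
 * the buffer first.
 */
static void model_fpsimd_to_sve(uint8_t *sst,
				const struct model_fpsimd_state *fst,
				unsigned int vq)
{
	assert(vq >= 1);
	for (unsigned int i = 0; i < MODEL_NUM_ZREGS; i++)
		memcpy(sst + model_zreg_offset(vq, i), fst->vregs[i],
		       MODEL_VQ_BYTES);
}
```

A guest with SME but no SVE only ever has meaningful data in the low 128 bits outside streaming mode, which is why these two helpers are enough to present its FP state in SVE format.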
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h |  5 +++++
 arch/arm64/kernel/fpsimd.c      | 23 +++++++++++++++--------
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 50e5f25d3024..7092e7f944ae 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -157,6 +157,11 @@ extern void cpu_enable_fa64(const struct arm64_cpu_capabilities *__unused);

 extern u64 read_smcr_features(void);

+void __fpsimd_to_sve(void *sst, struct user_fpsimd_state const *fst,
+		     unsigned int vq);
+void __sve_to_fpsimd(struct user_fpsimd_state *fst, void const *sst,
+		     unsigned int vq);
+
 /*
  * Helpers to translate bit indices in sve_vq_map to VQ values (and
  * vice versa).  This allows find_next_bit() to be used to find the
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 1559c706d32d..e6a4dd68f62a 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -647,8 +647,8 @@ static __uint128_t arm64_cpu_to_le128(__uint128_t x)

 #define arm64_le128_to_cpu(x) arm64_cpu_to_le128(x)

-static void __fpsimd_to_sve(void *sst, struct user_fpsimd_state const *fst,
-			    unsigned int vq)
+void __fpsimd_to_sve(void *sst, struct user_fpsimd_state const *fst,
+		     unsigned int vq)
 {
 	unsigned int i;
 	__uint128_t *p;
@@ -659,6 +659,18 @@ static void __fpsimd_to_sve(void *sst, struct user_fpsimd_state const *fst,
 	}
 }

+void __sve_to_fpsimd(struct user_fpsimd_state *fst, const void *sst,
+		     unsigned int vq)
+{
+	unsigned int i;
+	__uint128_t const *p;
+
+	for (i = 0; i < SVE_NUM_ZREGS; ++i) {
+		p = (__uint128_t const *)ZREG(sst, vq, i);
+		fst->vregs[i] = arm64_le128_to_cpu(*p);
+	}
+}
+
 /*
  * Transfer the FPSIMD state in task->thread.uw.fpsimd_state to
  * task->thread.sve_state.
@@ -700,18 +712,13 @@ static void sve_to_fpsimd(struct task_struct *task)
 	unsigned int vq, vl;
 	void const *sst = task->thread.sve_state;
 	struct user_fpsimd_state *fst = &task->thread.uw.fpsimd_state;
-	unsigned int i;
-	__uint128_t const *p;

 	if (!system_supports_sve() && !system_supports_sme())
 		return;

 	vl = thread_get_cur_vl(&task->thread);
 	vq = sve_vq_from_vl(vl);
-	for (i = 0; i < SVE_NUM_ZREGS; ++i) {
-		p = (__uint128_t const *)ZREG(sst, vq, i);
-		fst->vregs[i] = arm64_le128_to_cpu(*p);
-	}
+	__sve_to_fpsimd(fst, sst, vq);
 }

 #ifdef CONFIG_ARM64_SVE
In preparation for SME support move the macros used to access SVE state after the feature test macros, we will need to test for SME subfeatures to determine the size of the SME state.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h | 40 +++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 824f29f04916..9180713a2f9b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -772,26 +772,6 @@ struct kvm_vcpu_arch {
 #define IN_WFI			__vcpu_single_flag(sflags, BIT(7))


-/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
-#define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
-			     sve_ffr_offset((vcpu)->arch.sve_max_vl))
-
-#define vcpu_sve_max_vq(vcpu)	sve_vq_from_vl((vcpu)->arch.sve_max_vl)
-
-#define vcpu_sve_state_size(vcpu) ({					\
-	size_t __size_ret;						\
-	unsigned int __vcpu_vq;						\
-									\
-	if (WARN_ON(!sve_vl_valid((vcpu)->arch.sve_max_vl))) {		\
-		__size_ret = 0;						\
-	} else {							\
-		__vcpu_vq = vcpu_sve_max_vq(vcpu);			\
-		__size_ret = SVE_SIG_REGS_SIZE(__vcpu_vq);		\
-	}								\
-									\
-	__size_ret;							\
-})
-
 #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
 				 KVM_GUESTDBG_USE_SW_BP | \
 				 KVM_GUESTDBG_USE_HW | \
@@ -820,6 +800,26 @@ struct kvm_vcpu_arch {

 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)

+/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
+#define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
+			     sve_ffr_offset((vcpu)->arch.sve_max_vl))
+
+#define vcpu_sve_max_vq(vcpu)	sve_vq_from_vl((vcpu)->arch.sve_max_vl)
+
+#define vcpu_sve_state_size(vcpu) ({					\
+	size_t __size_ret;						\
+	unsigned int __vcpu_vq;						\
+									\
+	if (WARN_ON(!sve_vl_valid((vcpu)->arch.sve_max_vl))) {		\
+		__size_ret = 0;						\
+	} else {							\
+		__vcpu_vq = vcpu_sve_max_vq(vcpu);			\
+		__size_ret = SVE_SIG_REGS_SIZE(__vcpu_vq);		\
+	}								\
+									\
+	__size_ret;							\
+})
+
 /*
  * Only use __vcpu_sys_reg/ctxt_sys_reg if you know you want the
  * memory backed version of a register, and not the one most recently
SME introduces a second vector length, enumerated and configured in the same manner as for SVE. As was done for the host kernel, refactor to store the vector lengths in an array in order to facilitate sharing code between the two.
We do not fully handle vcpu_sve_pffr() since we have not yet introduced support for streaming mode, this will be updated as part of implementing streaming mode.
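The shape of the refactoring can be sketched as follows. This is a cut-down userspace illustration rather than the kernel definitions: struct vcpu_vl_model and the model_ helpers are hypothetical, while the enum mirrors the vector type enumeration the patch indexes with (ARM64_VEC_SVE, ARM64_VEC_MAX), with ARM64_VEC_SME assumed to be its second entry as in the host kernel.

```c
#include <assert.h>

/* Mirrors the vector type enumeration; reproduced so the sketch stands alone. */
enum vec_type { ARM64_VEC_SVE = 0, ARM64_VEC_SME, ARM64_VEC_MAX };

#define VQ_BYTES 16	/* bytes per 128-bit quadword */

/* Hypothetical cut-down vCPU holding only the vector lengths. */
struct vcpu_vl_model {
	unsigned int max_vl[ARM64_VEC_MAX];	/* bytes, 0 if unset */
};

/* One helper serves both SVE and SME once the lengths share an array. */
static unsigned int model_vec_max_vq(const struct vcpu_vl_model *v,
				     enum vec_type type)
{
	assert(type < ARM64_VEC_MAX);
	return v->max_vl[type] / VQ_BYTES;
}

/* Copy every vector length, in the way flush_hyp_vcpu() does in this patch. */
static void model_copy_vls(struct vcpu_vl_model *dst,
			   const struct vcpu_vl_model *src)
{
	for (int i = 0; i < ARM64_VEC_MAX; i++)
		dst->max_vl[i] = src->max_vl[i];
}
```

Indexing by type means later patches can add SME handling by passing ARM64_VEC_SME rather than duplicating each SVE helper.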
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h  | 12 +++++++-----
 arch/arm64/kvm/fpsimd.c            |  2 +-
 arch/arm64/kvm/guest.c             |  6 +++---
 arch/arm64/kvm/hyp/nvhe/hyp-main.c |  5 ++++-
 arch/arm64/kvm/reset.c             | 16 ++++++++--------
 5 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 9180713a2f9b..3b557ffb8e7b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -74,7 +74,7 @@ static inline enum kvm_mode kvm_get_mode(void) { return KVM_MODE_NONE; };

 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);

-extern unsigned int __ro_after_init kvm_sve_max_vl;
+extern unsigned int __ro_after_init kvm_vec_max_vl[ARM64_VEC_MAX];
 int __init kvm_arm_init_sve(void);

 u32 __attribute_const__ kvm_target_cpu(void);
@@ -515,7 +515,7 @@ struct kvm_vcpu_arch {
 	 */
 	void *sve_state;
 	enum fp_type fp_type;
-	unsigned int sve_max_vl;
+	unsigned int max_vl[ARM64_VEC_MAX];
 	u64 svcr;

 	/* Stage 2 paging state used by the hardware on next switch */
@@ -802,15 +802,17 @@ struct kvm_vcpu_arch {

 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
 #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
-			     sve_ffr_offset((vcpu)->arch.sve_max_vl))
+			     sve_ffr_offset((vcpu)->arch.max_vl[ARM64_VEC_SVE]))

-#define vcpu_sve_max_vq(vcpu)	sve_vq_from_vl((vcpu)->arch.sve_max_vl)
+#define vcpu_vec_max_vq(type, vcpu) sve_vq_from_vl((vcpu)->arch.max_vl[type])
+
+#define vcpu_sve_max_vq(vcpu)	vcpu_vec_max_vq(ARM64_VEC_SVE, vcpu)

 #define vcpu_sve_state_size(vcpu) ({					\
 	size_t __size_ret;						\
 	unsigned int __vcpu_vq;						\
 									\
-	if (WARN_ON(!sve_vl_valid((vcpu)->arch.sve_max_vl))) {		\
+	if (WARN_ON(!sve_vl_valid((vcpu)->arch.max_vl[ARM64_VEC_SVE]))) { \
 		__size_ret = 0;						\
 	} else {							\
 		__vcpu_vq = vcpu_sve_max_vq(vcpu);			\
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 8c1d0d4853df..a402a072786a 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -150,7 +150,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 		 */
 		fp_state.st = &vcpu->arch.ctxt.fp_regs;
 		fp_state.sve_state = vcpu->arch.sve_state;
-		fp_state.sve_vl = vcpu->arch.sve_max_vl;
+		fp_state.sve_vl = vcpu->arch.max_vl[ARM64_VEC_SVE];
 		fp_state.sme_state = NULL;
 		fp_state.svcr = &vcpu->arch.svcr;
 		fp_state.fp_type = &vcpu->arch.fp_type;
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index aaf1d4939739..3ae08f7c0b80 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -317,7 +317,7 @@ static int get_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	if (!vcpu_has_sve(vcpu))
 		return -ENOENT;

-	if (WARN_ON(!sve_vl_valid(vcpu->arch.sve_max_vl)))
+	if (WARN_ON(!sve_vl_valid(vcpu->arch.max_vl[ARM64_VEC_SVE])))
 		return -EINVAL;

 	memset(vqs, 0, sizeof(vqs));
@@ -355,7 +355,7 @@ static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		if (vq_present(vqs, vq))
 			max_vq = vq;

-	if (max_vq > sve_vq_from_vl(kvm_sve_max_vl))
+	if (max_vq > sve_vq_from_vl(kvm_vec_max_vl[ARM64_VEC_SVE]))
 		return -EINVAL;

 	/*
@@ -374,7 +374,7 @@ static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return -EINVAL;

 	/* vcpu->arch.sve_state will be alloc'd by kvm_vcpu_finalize_sve() */
-	vcpu->arch.sve_max_vl = sve_vl_from_vq(max_vq);
+	vcpu->arch.max_vl[ARM64_VEC_SVE] = sve_vl_from_vq(max_vq);

 	return 0;
 }
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 84deed83e580..56808df6a078 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -26,11 +26,14 @@ void __kvm_hyp_host_forward_smc(struct kvm_cpu_context *host_ctxt);
 static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
 {
 	struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+	int i;

 	hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt;

+	for (i = 0; i < ARRAY_SIZE(hyp_vcpu->vcpu.arch.max_vl); i++)
+		hyp_vcpu->vcpu.arch.max_vl[i] = host_vcpu->arch.max_vl[i];
+
 	hyp_vcpu->vcpu.arch.sve_state = kern_hyp_va(host_vcpu->arch.sve_state);
-	hyp_vcpu->vcpu.arch.sve_max_vl = host_vcpu->arch.sve_max_vl;

 	hyp_vcpu->vcpu.arch.hw_mmu = host_vcpu->arch.hw_mmu;

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 5bb4de162cab..81b949dd809d 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -45,12 +45,12 @@ static u32 __ro_after_init kvm_ipa_limit;
 #define VCPU_RESET_PSTATE_SVC	(PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
 				 PSR_AA32_I_BIT | PSR_AA32_F_BIT)

-unsigned int __ro_after_init kvm_sve_max_vl;
+unsigned int __ro_after_init kvm_vec_max_vl[ARM64_VEC_MAX];

 int __init kvm_arm_init_sve(void)
 {
 	if (system_supports_sve()) {
-		kvm_sve_max_vl = sve_max_virtualisable_vl();
+		kvm_vec_max_vl[ARM64_VEC_SVE] = sve_max_virtualisable_vl();

 		/*
 		 * The get_sve_reg()/set_sve_reg() ioctl interface will need
@@ -58,16 +58,16 @@ int __init kvm_arm_init_sve(void)
 		 * order to support vector lengths greater than
 		 * VL_ARCH_MAX:
 		 */
-		if (WARN_ON(kvm_sve_max_vl > VL_ARCH_MAX))
-			kvm_sve_max_vl = VL_ARCH_MAX;
+		if (WARN_ON(kvm_vec_max_vl[ARM64_VEC_SVE] > VL_ARCH_MAX))
+			kvm_vec_max_vl[ARM64_VEC_SVE] = VL_ARCH_MAX;

 		/*
 		 * Don't even try to make use of vector lengths that
 		 * aren't available on all CPUs, for now:
 		 */
-		if (kvm_sve_max_vl < sve_max_vl())
+		if (kvm_vec_max_vl[ARM64_VEC_SVE] < sve_max_vl())
 			pr_warn("KVM: SVE vector length for guests limited to %u bytes\n",
-				kvm_sve_max_vl);
+				kvm_vec_max_vl[ARM64_VEC_SVE]);
 	}

 	return 0;
@@ -75,7 +75,7 @@ int __init kvm_arm_init_sve(void)

 static void kvm_vcpu_enable_sve(struct kvm_vcpu *vcpu)
 {
-	vcpu->arch.sve_max_vl = kvm_sve_max_vl;
+	vcpu->arch.max_vl[ARM64_VEC_SVE] = kvm_vec_max_vl[ARM64_VEC_SVE];

 	/*
 	 * Userspace can still customize the vector lengths by writing
@@ -96,7 +96,7 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu)
 	size_t reg_sz;
 	int ret;

-	vl = vcpu->arch.sve_max_vl;
+	vl = vcpu->arch.max_vl[ARM64_VEC_SVE];

 	/*
 	 * Responsibility for these properties is shared between
SME, the Scalable Matrix Extension, is an arm64 extension which adds support for matrix operations, with core concepts patterned after SVE.
SVE introduced some complication in the ABI since it adds new vector floating point registers with runtime configurable size, the size being controlled by a parameter called the vector length (VL). To provide control of this to VMMs we offer two phase configuration of SVE: SVE must first be enabled for the vCPU with KVM_ARM_VCPU_INIT(KVM_ARM_VCPU_SVE), after which the vector length may be configured, but the configurably sized floating point registers are inaccessible until the configuration is finalized with a call to KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).
SME introduces an additional independent configurable vector length which as well as controlling the size of the new ZA register also provides an alternative view of the configurably sized SVE registers (known as streaming mode) with the guest able to switch between the two modes as it pleases. There is also a fixed sized register ZT0 introduced in SME2. As well as streaming mode the guest may enable and disable ZA and (where SME2 is available) ZT0 dynamically independently of streaming mode. These modes are controlled via the system register SVCR.
We handle the configuration of the vector length for SME in a similar manner to SVE, requiring initialization and finalization of the feature with a pseudo register controlling the available SME vector lengths as for SVE. Further, if the guest has both SVE and SME then finalizing one prevents further configuration of the vector length for the other.
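Since the SME pseudo register is handled as for SVE, the set of vector lengths can be pictured as the same kind of bitmap. The sketch below follows the encoding documented for the existing KVM_REG_ARM64_SVE_VLS (one bit per vq, starting at vq 1, packed into 64-bit words); the constants are redefined locally and the vls_ helper names are hypothetical, so treat it as an illustration and not as the uapi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as in the existing SVE uapi; redefined so the sketch stands alone. */
#define VQ_MIN		1
#define VQ_MAX		512
#define VLS_WORDS	((VQ_MAX - VQ_MIN) / 64 + 1)

/* Is the vector length of vq quadwords (vq * 16 bytes) in the set? */
static bool vls_has_vq(const uint64_t vls[VLS_WORDS], unsigned int vq)
{
	if (vq < VQ_MIN || vq > VQ_MAX)
		return false;
	return (vls[(vq - VQ_MIN) / 64] >> ((vq - VQ_MIN) % 64)) & 1;
}

/* Add the vector length of vq quadwords to the set. */
static void vls_add_vq(uint64_t vls[VLS_WORDS], unsigned int vq)
{
	assert(vq >= VQ_MIN && vq <= VQ_MAX);
	vls[(vq - VQ_MIN) / 64] |= UINT64_C(1) << ((vq - VQ_MIN) % 64);
}

/* Largest vq in the set, or 0 if the set is empty. */
static unsigned int vls_max_vq(const uint64_t vls[VLS_WORDS])
{
	for (unsigned int vq = VQ_MAX; vq >= VQ_MIN; vq--)
		if (vls_has_vq(vls, vq))
			return vq;
	return 0;
}
```

A VMM would read the pseudo register after KVM_ARM_VCPU_INIT, clear the bits for any lengths it does not want to offer, and write the result back before finalizing.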
Where both SVE and SME are configured for the guest we always present the SVE registers to userspace as having the larger of the configured maximum SVE and SME vector lengths, discarding extra data at load time and zero padding on read as required if the active vector length is lower. Note that this means that enabling or disabling streaming mode while the guest is stopped will not zero Zn or Pn as it will when the guest is running, but it does allow SVCR, Zn and Pn to be read and written in any order.
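The padding and truncation rule can be sketched for a single Z register as below. This is a userspace model under stated assumptions, not the KVM accessors: the model_ function names are hypothetical and vector lengths are given in bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Size at which the Z registers are presented to userspace. */
static unsigned int model_user_vl(unsigned int sve_vl, unsigned int sme_vl)
{
	return sve_vl > sme_vl ? sve_vl : sme_vl;
}

/* Read one Z register: copy the active bytes and zero pad up to user_vl. */
static void model_read_zreg(uint8_t *dst, unsigned int user_vl,
			    const uint8_t *reg, unsigned int active_vl)
{
	assert(active_vl <= user_vl);
	memcpy(dst, reg, active_vl);
	memset(dst + active_vl, 0, user_vl - active_vl);
}

/* Write one Z register: keep the low active_vl bytes, discard the rest. */
static void model_write_zreg(uint8_t *reg, unsigned int active_vl,
			     const uint8_t *src, unsigned int user_vl)
{
	assert(active_vl <= user_vl);
	memcpy(reg, src, active_vl);
}
```

Because the userspace view has a fixed size that does not depend on SVCR, a VMM can save and restore SVCR, Zn and Pn in any order.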
Userspace access to ZA and (if configured) ZT0 is always available, they will be zeroed when the guest runs if disabled in SVCR and the value read will be zero if the guest stops with them disabled. This mirrors the behaviour of the architecture, enabling access causes ZA and ZT0 to be zeroed, while allowing access to SVCR, ZA and ZT0 to be performed in any order.
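A small model of the read side may help here. It is a sketch only: the model_ names are hypothetical, the SVCR bit positions are the architectural ones (SM is bit 0, ZA is bit 1), and the sizes are the architectural ones too (ZA is a square of SVL by SVL bytes, ZT0 is 512 bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_SVCR_SM	(UINT64_C(1) << 0)	/* streaming mode enabled */
#define MODEL_SVCR_ZA	(UINT64_C(1) << 1)	/* ZA (and ZT0) enabled */
#define MODEL_ZT0_BYTES	64			/* ZT0 is 512 bits */

/* ZA is a square array of SVL x SVL bytes. */
static size_t model_za_size(unsigned int sme_vl)
{
	return (size_t)sme_vl * sme_vl;
}

/* Value userspace observes for ZA after the guest has stopped. */
static void model_read_za(uint8_t *dst, const uint8_t *za,
			  unsigned int sme_vl, uint64_t svcr)
{
	assert(sme_vl >= 16 && sme_vl % 16 == 0);
	if (svcr & MODEL_SVCR_ZA)
		memcpy(dst, za, model_za_size(sme_vl));
	else
		memset(dst, 0, model_za_size(sme_vl));
}
```

Reading zeros while ZA is disabled matches what the guest itself would see on next enabling ZA, so nothing observable is lost by not preserving stale contents.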
If SME is enabled for a guest without SVE then the FPSIMD Vn registers must be accessed via the low 128 bits of the SVE Zn registers as is the case when SVE is enabled. This is not ideal but allows access to SVCR and the registers in any order without duplication or ambiguity about which values should take effect. This may be an issue for VMMs that are unaware of SME on systems that implement it without SVE if they let SME be enabled, the lack of access to Vn may surprise them, but it seems like an unusual implementation choice.
For SME unaware VMMs on systems with both SVE and SME support the SVE registers may be larger than expected; this should be less disruptive than on a system without SVE as they will simply ignore the high bits of the registers.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 Documentation/virt/kvm/api.rst | 104 +++++++++++++++++++++++++++++------------
 1 file changed, 73 insertions(+), 31 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 7025b3751027..b64541fa3e2a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -374,7 +374,7 @@ Errors:
              instructions from device memory (arm64)
   ENOSYS     data abort outside memslots with no syndrome info and
              KVM_CAP_ARM_NISV_TO_USER not enabled (arm64)
-  EPERM      SVE feature set but not finalized (arm64)
+  EPERM      SVE or SME feature set but not finalized (arm64)
   =======    ==============================================================

 This ioctl is used to run a guest virtual cpu.  While there are no
@@ -2585,12 +2585,13 @@ Specifically:
   0x6020 0000 0010 00d5 FPCR        32  fp_regs.fpcr
 ======================= ========= ===== =======================================

-.. [1] These encodings are not accepted for SVE-enabled vcpus.  See
+.. [1] These encodings are not accepted for SVE or SME enabled vcpus.  See
        KVM_ARM_VCPU_INIT.

        The equivalent register content can be accessed via bits [127:0] of
        the corresponding SVE Zn registers instead for vcpus that have SVE
-       enabled (see below).
+       or SME enabled (see below).  Note carefully that this is true even
+       when only SME is supported.

 arm64 CCSIDR registers are demultiplexed by CSSELR value::

@@ -2621,24 +2622,31 @@ arm64 SVE registers have the following bit patterns::
   0x6050 0000 0015 060 <slice:5>          FFR bits[256*slice + 255 : 256*slice]
   0x6060 0000 0015 ffff                   KVM_REG_ARM64_SVE_VLS pseudo-register

-Access to register IDs where 2048 * slice >= 128 * max_vq will fail with
-ENOENT.  max_vq is the vcpu's maximum supported vector length in 128-bit
-quadwords: see [2]_ below.
+arm64 SME registers have the following bit patterns::
+
+  0x6080 0000 0017 00 <n:5> <slice:5>     ZA.H[n] bits[2048*slice + 2047 : 2048*slice]
+  0x60XX 0000 0017 0100                   ZT0
+  0x6060 0000 0017 fffe                   KVM_REG_ARM64_SME_VLS pseudo-register
+
+Access to Z, P or ZA register IDs where 2048 * slice >= 128 * max_vq
+will fail with ENOENT.  max_vq is the vcpu's maximum supported vector
+length in 128-bit quadwords: see [2]_ below.

 These registers are only accessible on vcpus for which SVE is enabled.
 See KVM_ARM_VCPU_INIT for details.

-In addition, except for KVM_REG_ARM64_SVE_VLS, these registers are not
-accessible until the vcpu's SVE configuration has been finalized
-using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).  See KVM_ARM_VCPU_INIT
-and KVM_ARM_VCPU_FINALIZE for more information about this procedure.
+In addition, except for KVM_REG_ARM64_SVE_VLS and
+KVM_REG_ARM64_SME_VLS, these registers are not accessible until the
+vcpu's SVE configuration has been finalized using
+KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC).  See KVM_ARM_VCPU_INIT and
+KVM_ARM_VCPU_FINALIZE for more information about this procedure.

-KVM_REG_ARM64_SVE_VLS is a pseudo-register that allows the set of vector
-lengths supported by the vcpu to be discovered and configured by
-userspace.  When transferred to or from user memory via KVM_GET_ONE_REG
-or KVM_SET_ONE_REG, the value of this register is of type
-__u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the set of vector lengths as
-follows::
+KVM_REG_ARM64_SVE_VLS and KVM_REG_ARM64_SME_VLS are pseudo-registers
+that allow the set of vector lengths supported by the vcpu to be
+discovered and configured by userspace.  When transferred to or from
+user memory via KVM_GET_ONE_REG or KVM_SET_ONE_REG, the value of each
+register is of type __u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the
+set of vector lengths as follows::

   __u64 vector_lengths[KVM_ARM64_SVE_VLS_WORDS];

@@ -2652,17 +2660,20 @@ follows::
 .. [2] The maximum value vq for which the above condition is true is
        max_vq.  This is the maximum vector length available to the guest on
        this vcpu, and determines which register slices are visible through
-       this ioctl interface.
+       this ioctl interface.  If both SVE and SME are supported for the
+       guest this will be the larger of the two vector lengths regardless
+       of streaming mode being active.

 (See Documentation/arch/arm64/sve.rst for an explanation of the "vq"
 nomenclature.)

-KVM_REG_ARM64_SVE_VLS is only accessible after KVM_ARM_VCPU_INIT.
-KVM_ARM_VCPU_INIT initialises it to the best set of vector lengths that
-the host supports.
+KVM_REG_ARM64_SVE_VLS and KVM_REG_ARM64_SME_VLS are only accessible
+after KVM_ARM_VCPU_INIT.  KVM_ARM_VCPU_INIT initialises them to the
+best set of vector lengths that the host supports.

-Userspace may subsequently modify it if desired until the vcpu's SVE
-configuration is finalized using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).
+Userspace may subsequently modify these registers if desired until the
+vcpu's SVE and SME configuration is finalized using
+KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC).

 Apart from simply removing all vector lengths from the host set that
 exceed some value, support for arbitrarily chosen sets of vector lengths
@@ -2670,8 +2681,8 @@ is hardware-dependent and may not be available.  Attempting to configure
 an invalid set of vector lengths via KVM_SET_ONE_REG will fail with
 EINVAL.

-After the vcpu's SVE configuration is finalized, further attempts to
-write this register will fail with EPERM.
+After the vcpu's SVE or SME configuration is finalized, further
+attempts to write these registers will fail with EPERM.

 arm64 bitmap feature firmware pseudo-registers have the following bit pattern::

@@ -3454,6 +3465,7 @@ The initial values are defined as:
   - General Purpose registers, including PC and SP: set to 0
   - FPSIMD/NEON registers: set to 0
   - SVE registers: set to 0
+  - SME registers: set to 0
   - System registers: Reset to their architecturally defined
     values as for a warm reset to EL1 (resp. SVC)

@@ -3496,7 +3508,7 @@ Possible features:

 	- KVM_ARM_VCPU_SVE: Enables SVE for the CPU (arm64 only).
 	  Depends on KVM_CAP_ARM_SVE.
-	  Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):
+	  Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):

 	   * After KVM_ARM_VCPU_INIT:

@@ -3504,7 +3516,7 @@ Possible features:
 	        initial value of this pseudo-register indicates the best set of
 	        vector lengths possible for a vcpu on this host.

-	   * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):
+	   * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):

 	      - KVM_RUN and KVM_GET_REG_LIST are not available;

@@ -3517,11 +3529,40 @@ Possible features:
 	        KVM_SET_ONE_REG, to modify the set of vector lengths available
 	        for the vcpu.

-	   * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):
+	   * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):

 	      - the KVM_REG_ARM64_SVE_VLS pseudo-register is immutable, and can
 	        no longer be written using KVM_SET_ONE_REG.

+	- KVM_ARM_VCPU_SME: Enables SME for the CPU (arm64 only).
+	  Depends on KVM_CAP_ARM_SME.
+	  Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):
+
+	   * After KVM_ARM_VCPU_INIT:
+
+	      - KVM_REG_ARM64_SME_VLS may be read using KVM_GET_ONE_REG: the
+	        initial value of this pseudo-register indicates the best set of
+	        vector lengths possible for a vcpu on this host.
+
+	   * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):
+
+	      - KVM_RUN and KVM_GET_REG_LIST are not available;
+
+	      - KVM_GET_ONE_REG and KVM_SET_ONE_REG cannot be used to access
+	        the scalable architectural SVE registers
+	        KVM_REG_ARM64_SVE_ZREG(), KVM_REG_ARM64_SVE_PREG() or
+	        KVM_REG_ARM64_SVE_FFR, the matrix register
+	        KVM_REG_ARM64_SME_ZA() or the LUT register KVM_REG_ARM64_ZT();
+
+	      - KVM_REG_ARM64_SME_VLS may optionally be written using
+	        KVM_SET_ONE_REG, to modify the set of vector lengths available
+	        for the vcpu.
+
+	   * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):
+
+	      - the KVM_REG_ARM64_SME_VLS pseudo-register is immutable, and can
+	        no longer be written using KVM_SET_ONE_REG.
+
 4.83 KVM_ARM_PREFERRED_TARGET
 -----------------------------

@@ -5055,11 +5096,12 @@ Errors:

 Recognised values for feature:

-  =====      ===========================================
-  arm64      KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE)
-  =====      ===========================================
+  =====      ==============================================================
+  arm64      KVM_ARM_VCPU_VEC (requires KVM_CAP_ARM_SVE or KVM_CAP_ARM_SME)
+  arm64      KVM_ARM_VCPU_SVE (alias for KVM_ARM_VCPU_VEC)
+  =====      ==============================================================

-Finalizes the configuration of the specified vcpu feature.
+Finalizes the configuration of the specified vcpu features.

 The vcpu must already have been initialised, enabling the affected feature, by
 means of a successful KVM_ARM_VCPU_INIT call with the appropriate flag set in
Since FFR is an optional feature of SME's streaming SVE mode, in order to support loading guest register state when SME is implemented we need to give callers of __sve_restore_state() access to the flag that the sve_load macro takes to indicate whether FFR should be loaded. Doing so is simply a matter of removing the hard coding in the asm.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_hyp.h        | 2 +-
 arch/arm64/kvm/hyp/fpsimd.S             | 1 -
 arch/arm64/kvm/hyp/include/hyp/switch.h | 2 +-
 3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 145ce73fc16c..7ac38ed90062 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -111,7 +111,7 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);

 void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
 void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
-void __sve_restore_state(void *sve_pffr, u32 *fpsr);
+void __sve_restore_state(void *sve_pffr, u32 *fpsr, bool restore_ffr);

 u64 __guest_enter(struct kvm_vcpu *vcpu);

diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
index 61e6f3ba7b7d..8940954b5420 100644
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ b/arch/arm64/kvm/hyp/fpsimd.S
@@ -21,7 +21,6 @@ SYM_FUNC_START(__fpsimd_restore_state)
 SYM_FUNC_END(__fpsimd_restore_state)

 SYM_FUNC_START(__sve_restore_state)
-	mov	x2, #1
 	sve_load 0, x1, x2, 3
 	ret
 SYM_FUNC_END(__sve_restore_state)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index f99d8af0b9af..9601212bd3ce 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -286,7 +286,7 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 {
 	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
 	__sve_restore_state(vcpu_sve_pffr(vcpu),
-			    &vcpu->arch.ctxt.fp_regs.fpsr);
+			    &vcpu->arch.ctxt.fp_regs.fpsr, true);
 	write_sysreg_el1(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR);
 }
Introduce flags for the individually selectable features in SME which add architectural state:
 - Base SME which adds the system registers and ZA matrix.
 - SME 2 which adds ZT0.
 - FA64 which enables access to the full floating point feature set in
   streaming mode, including adding FFR in streaming mode.
along with helper functions for checking them.
These will be used by further patches actually implementing support for SME in guests.
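As a rough illustration of how the flags and helpers combine, here is a minimal userspace sketch. It is not the kernel code: the struct names and the `model_` helpers are hypothetical stand-ins, and only the bit positions of the three flags are taken from the diff below. The point it shows is that each helper requires both host support and the per-vCPU flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions follow the cflags added by this patch; names are illustrative. */
#define MODEL_GUEST_HAS_SME	(1u << 4)
#define MODEL_GUEST_HAS_SME2	(1u << 5)
#define MODEL_GUEST_HAS_FA64	(1u << 6)

/* Hypothetical model of what the host CPUs support. */
struct model_host_caps {
	bool sme;
	bool sme2;
	bool fa64;
};

/* Hypothetical model of the per-vCPU configuration flags. */
struct model_vcpu {
	unsigned int cflags;
};

/* A feature is visible only if the host has it AND it is enabled for the guest. */
static bool model_vcpu_has_sme(const struct model_host_caps *h,
			       const struct model_vcpu *v)
{
	assert(h && v);
	return h->sme && (v->cflags & MODEL_GUEST_HAS_SME);
}

static bool model_vcpu_has_sme2(const struct model_host_caps *h,
				const struct model_vcpu *v)
{
	assert(h && v);
	return h->sme2 && (v->cflags & MODEL_GUEST_HAS_SME2);
}

static bool model_vcpu_has_fa64(const struct model_host_caps *h,
				const struct model_vcpu *v)
{
	assert(h && v);
	return h->fa64 && (v->cflags & MODEL_GUEST_HAS_FA64);
}
```

A flag set on a vCPU whose host lacks the feature therefore reads back as "not present", which is the same shape as the existing vcpu_has_sve() check.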
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h          | 13 +++++++++++++
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 10 ++++++++++
 2 files changed, 23 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3b557ffb8e7b..461068c99b61 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -710,6 +710,12 @@ struct kvm_vcpu_arch { #define GUEST_HAS_PTRAUTH __vcpu_single_flag(cflags, BIT(2)) /* KVM_ARM_VCPU_INIT completed */ #define VCPU_INITIALIZED __vcpu_single_flag(cflags, BIT(3)) +/* SME1 exposed to guest */ +#define GUEST_HAS_SME __vcpu_single_flag(cflags, BIT(4)) +/* SME2 exposed to guest */ +#define GUEST_HAS_SME2 __vcpu_single_flag(cflags, BIT(5)) +/* FA64 exposed to guest */ +#define GUEST_HAS_FA64 __vcpu_single_flag(cflags, BIT(6))
/* Exception pending */ #define PENDING_EXCEPTION __vcpu_single_flag(iflags, BIT(0)) @@ -780,6 +786,13 @@ struct kvm_vcpu_arch { #define vcpu_has_sve(vcpu) (system_supports_sve() && \ vcpu_get_flag(vcpu, GUEST_HAS_SVE))
+#define vcpu_has_sme(vcpu) (system_supports_sme() && \ + vcpu_get_flag(vcpu, GUEST_HAS_SME)) +#define vcpu_has_sme2(vcpu) (system_supports_sme2() && \ + vcpu_get_flag(vcpu, GUEST_HAS_SME2)) +#define vcpu_has_fa64(vcpu) (system_supports_fa64() && \ + vcpu_get_flag(vcpu, GUEST_HAS_FA64)) + #ifdef CONFIG_ARM64_PTR_AUTH #define vcpu_has_ptrauth(vcpu) \ ((cpus_have_final_cap(ARM64_HAS_ADDRESS_AUTH) || \ diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h index bb6b571ec627..fb84834cd2a0 100644 --- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h +++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h @@ -21,6 +21,16 @@ static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt) ctxt_sys_reg(ctxt, MDSCR_EL1) = read_sysreg(mdscr_el1); }
+static inline bool ctxt_has_sme(struct kvm_cpu_context *ctxt) +{ + struct kvm_vcpu *vcpu = ctxt->__hyp_running_vcpu; + + if (!vcpu) + vcpu = container_of(ctxt, struct kvm_vcpu, arch.ctxt); + + return vcpu_has_sme(vcpu); +} + static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt) { ctxt_sys_reg(ctxt, TPIDR_EL0) = read_sysreg(tpidr_el0);
Due to the overlap between SVE and SME vector length configuration created by streaming mode SVE we will finalize both at once. Rename the existing finalization to use _VEC (vector) for the naming to avoid confusion.
Since this includes the userspace API we create an alias KVM_ARM_VCPU_VEC for the existing KVM_ARM_VCPU_SVE feature. Existing code which does not enable SME will be unaffected, and any SME-only code will not need to use SVE constants.
No functional change.
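The resulting finalization rules can be sketched as a small state machine. This is a userspace model under stated assumptions, not the kernel implementation: `struct model_vec_vcpu` and `model_vcpu_finalize()` are hypothetical, and the feature number is only a placeholder for illustration. The error codes mirror the ones visible in the reset.c hunk below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_VCPU_SVE	4			/* placeholder feature number */
#define MODEL_VCPU_VEC	MODEL_VCPU_SVE		/* alias, as in the patch */

/* Hypothetical model of the vCPU state relevant to finalization. */
struct model_vec_vcpu {
	bool has_sve;
	bool has_sme;
	bool vec_finalized;
};

/*
 * One finalization covers both SVE and SME: it fails with -EINVAL if
 * neither is enabled (or the feature is unknown) and with -EPERM if
 * it has already been done.
 */
static int model_vcpu_finalize(struct model_vec_vcpu *v, int feature)
{
	assert(v);

	if (feature != MODEL_VCPU_VEC)
		return -EINVAL;

	if (!v->has_sve && !v->has_sme)
		return -EINVAL;

	if (v->vec_finalized)
		return -EPERM;

	v->vec_finalized = true;
	return 0;
}
```

Because the two names are aliases, a second call using the other name is rejected in the same way as a repeated call with the same name.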
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h |  8 +++++---
 arch/arm64/include/uapi/asm/kvm.h |  6 ++++++
 arch/arm64/kvm/guest.c            | 10 +++++-----
 arch/arm64/kvm/reset.c            | 16 ++++++++--------
 4 files changed, 24 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 461068c99b61..920f8a1ff901 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -704,8 +704,8 @@ struct kvm_vcpu_arch {
/* SVE exposed to guest */ #define GUEST_HAS_SVE __vcpu_single_flag(cflags, BIT(0)) -/* SVE config completed */ -#define VCPU_SVE_FINALIZED __vcpu_single_flag(cflags, BIT(1)) +/* SVE/SME config completed */ +#define VCPU_VEC_FINALIZED __vcpu_single_flag(cflags, BIT(1)) /* PTRAUTH exposed to guest */ #define GUEST_HAS_PTRAUTH __vcpu_single_flag(cflags, BIT(2)) /* KVM_ARM_VCPU_INIT completed */ @@ -793,6 +793,8 @@ struct kvm_vcpu_arch { #define vcpu_has_fa64(vcpu) (system_supports_fa64() && \ vcpu_get_flag(vcpu, GUEST_HAS_FA64))
+#define vcpu_has_vec(vcpu) (vcpu_has_sve(vcpu) || vcpu_has_sme(vcpu)) + #ifdef CONFIG_ARM64_PTR_AUTH #define vcpu_has_ptrauth(vcpu) \ ((cpus_have_final_cap(ARM64_HAS_ADDRESS_AUTH) || \ @@ -1179,7 +1181,7 @@ static inline bool kvm_vm_is_protected(struct kvm *kvm) int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature); bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
-#define kvm_arm_vcpu_sve_finalized(vcpu) vcpu_get_flag(vcpu, VCPU_SVE_FINALIZED) +#define kvm_arm_vcpu_vec_finalized(vcpu) vcpu_get_flag(vcpu, VCPU_VEC_FINALIZED)
#define kvm_has_mte(kvm) \ (system_supports_mte() && \ diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index 89d2fc872d9f..3048890fac68 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -111,6 +111,12 @@ struct kvm_regs { #define KVM_ARM_VCPU_PTRAUTH_GENERIC 6 /* VCPU uses generic authentication */ #define KVM_ARM_VCPU_HAS_EL2 7 /* Support nested virtualization */
+/* + * An alias for _SVE since we finalize VL configuration for both SVE and SME + * simultaneously. + */ +#define KVM_ARM_VCPU_VEC KVM_ARM_VCPU_SVE + struct kvm_vcpu_init { __u32 target; __u32 features[7]; diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 3ae08f7c0b80..6e116fd8a917 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -341,7 +341,7 @@ static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) if (!vcpu_has_sve(vcpu)) return -ENOENT;
- if (kvm_arm_vcpu_sve_finalized(vcpu)) + if (kvm_arm_vcpu_vec_finalized(vcpu)) return -EPERM; /* too late! */
if (WARN_ON(vcpu->arch.sve_state)) @@ -496,7 +496,7 @@ static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) if (ret) return ret;
- if (!kvm_arm_vcpu_sve_finalized(vcpu)) + if (!kvm_arm_vcpu_vec_finalized(vcpu)) return -EPERM;
if (copy_to_user(uptr, vcpu->arch.sve_state + region.koffset, @@ -522,7 +522,7 @@ static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) if (ret) return ret;
- if (!kvm_arm_vcpu_sve_finalized(vcpu)) + if (!kvm_arm_vcpu_vec_finalized(vcpu)) return -EPERM;
if (copy_from_user(vcpu->arch.sve_state + region.koffset, uptr, @@ -656,7 +656,7 @@ static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu) return 0;
/* Policed by KVM_GET_REG_LIST: */ - WARN_ON(!kvm_arm_vcpu_sve_finalized(vcpu)); + WARN_ON(!kvm_arm_vcpu_vec_finalized(vcpu));
return slices * (SVE_NUM_PREGS + SVE_NUM_ZREGS + 1 /* FFR */) + 1; /* KVM_REG_ARM64_SVE_VLS */ @@ -674,7 +674,7 @@ static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, return 0;
/* Policed by KVM_GET_REG_LIST: */ - WARN_ON(!kvm_arm_vcpu_sve_finalized(vcpu)); + WARN_ON(!kvm_arm_vcpu_vec_finalized(vcpu));
/* * Enumerate this first, so that userspace can save/restore in diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index 81b949dd809d..ab7cd657a73c 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -89,7 +89,7 @@ static void kvm_vcpu_enable_sve(struct kvm_vcpu *vcpu) * Finalize vcpu's maximum SVE vector length, allocating * vcpu->arch.sve_state as necessary. */ -static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu) +static int kvm_vcpu_finalize_vec(struct kvm_vcpu *vcpu) { void *buf; unsigned int vl; @@ -119,21 +119,21 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu) } vcpu->arch.sve_state = buf; - vcpu_set_flag(vcpu, VCPU_SVE_FINALIZED); + vcpu_set_flag(vcpu, VCPU_VEC_FINALIZED); return 0; }
int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature) { switch (feature) { - case KVM_ARM_VCPU_SVE: - if (!vcpu_has_sve(vcpu)) + case KVM_ARM_VCPU_VEC: + if (!vcpu_has_vec(vcpu)) return -EINVAL;
- if (kvm_arm_vcpu_sve_finalized(vcpu)) + if (kvm_arm_vcpu_vec_finalized(vcpu)) return -EPERM;
- return kvm_vcpu_finalize_sve(vcpu); + return kvm_vcpu_finalize_vec(vcpu); }
return -EINVAL; @@ -141,7 +141,7 @@ int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature)
bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu) { - if (vcpu_has_sve(vcpu) && !kvm_arm_vcpu_sve_finalized(vcpu)) + if (vcpu_has_vec(vcpu) && !kvm_arm_vcpu_vec_finalized(vcpu)) return false;
return true; @@ -207,7 +207,7 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu) if (loaded) kvm_arch_vcpu_put(vcpu);
- if (!kvm_arm_vcpu_sve_finalized(vcpu)) { + if (!kvm_arm_vcpu_vec_finalized(vcpu)) { if (vcpu_has_feature(vcpu, KVM_ARM_VCPU_SVE)) kvm_vcpu_enable_sve(vcpu); } else {
Set up the basic system register descriptions for the more straightforward SME registers. All the registers are available from SME 1.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/sys_regs.c         | 23 +++++++++++++++++++----
 2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 920f8a1ff901..4be5dda9734d 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -333,6 +333,7 @@ enum vcpu_sysreg { ACTLR_EL1, /* Auxiliary Control Register */ CPACR_EL1, /* Coprocessor Access Control */ ZCR_EL1, /* SVE Control */ + SMCR_EL1, /* SME Control */ TTBR0_EL1, /* Translation Table Base Register 0 */ TTBR1_EL1, /* Translation Table Base Register 1 */ TCR_EL1, /* Translation Control Register */ diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 4735e1b37fb3..e6339ca1d8dc 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1404,7 +1404,8 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu, if (!kvm_has_mte(vcpu->kvm)) val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE);
- val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SME); + if (!vcpu_has_sme(vcpu)) + val &= ~ID_AA64PFR1_EL1_SME_MASK; break; case SYS_ID_AA64ISAR1_EL1: if (!vcpu_has_ptrauth(vcpu)) @@ -1470,6 +1471,9 @@ static unsigned int id_visibility(const struct kvm_vcpu *vcpu, if (!vcpu_has_sve(vcpu)) return REG_RAZ; break; + case SYS_ID_AA64SMFR0_EL1: + if (!vcpu_has_sme(vcpu)) + return REG_RAZ; }
return 0; @@ -1521,6 +1525,16 @@ static unsigned int sve_visibility(const struct kvm_vcpu *vcpu, return REG_HIDDEN; }
+/* Visibility overrides for SME-specific control registers */ +static unsigned int sme_visibility(const struct kvm_vcpu *vcpu, + const struct sys_reg_desc *rd) +{ + if (vcpu_has_sme(vcpu)) + return 0; + + return REG_HIDDEN; +} + static u64 read_sanitised_id_aa64pfr0_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd) { @@ -2142,7 +2156,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { ID_UNALLOCATED(4,2), ID_UNALLOCATED(4,3), ID_WRITABLE(ID_AA64ZFR0_EL1, ~ID_AA64ZFR0_EL1_RES0), - ID_HIDDEN(ID_AA64SMFR0_EL1), + ID_WRITABLE(ID_AA64SMFR0_EL1, ~ID_AA64SMFR0_EL1_RES0), ID_UNALLOCATED(4,6), ID_UNALLOCATED(4,7),
@@ -2211,7 +2225,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = sve_visibility }, { SYS_DESC(SYS_TRFCR_EL1), undef_access }, { SYS_DESC(SYS_SMPRI_EL1), undef_access }, - { SYS_DESC(SYS_SMCR_EL1), undef_access }, + { SYS_DESC(SYS_SMCR_EL1), NULL, reset_val, SMCR_EL1, 0, .visibility = sme_visibility }, { SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 }, { SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 }, { SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 }, @@ -2306,7 +2320,8 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_CLIDR_EL1), access_clidr, reset_clidr, CLIDR_EL1, .set_user = set_clidr }, { SYS_DESC(SYS_CCSIDR2_EL1), undef_access }, - { SYS_DESC(SYS_SMIDR_EL1), undef_access }, + { SYS_DESC(SYS_SMIDR_EL1), .access = access_id_reg, + .get_user = get_id_reg, .visibility = sme_visibility }, { SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 }, { SYS_DESC(SYS_CTR_EL0), access_ctr }, { SYS_DESC(SYS_SVCR), undef_access },
SME adds a new user register, TPIDR2_EL0. Implement support for managing it for guests: we provide system register access to it, disable traps and context switch it along with the other TPIDRs for guests with SME.
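The trap selection can be sketched as follows. This is an illustrative model only: the function name is hypothetical and the two bit positions are assumptions that should be checked against the architecture's description of HFGRTR_EL2/HFGWTR_EL2. The fields have negative polarity, so clearing a bit enables the trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of the fine grained trap controls. */
#define MODEL_HFGxTR_nSMPRI_EL1		(UINT64_C(1) << 54)
#define MODEL_HFGxTR_nTPIDR2_EL0	(UINT64_C(1) << 55)

/*
 * Return the set of bits to clear (that is, accesses to trap) in the
 * read and write fine grained trap registers while the guest runs.
 */
static uint64_t model_hfgxtr_clear_mask(bool host_has_sme, bool guest_has_sme)
{
	uint64_t clr = 0;

	/* A guest cannot be given SME on a host that lacks it. */
	assert(host_has_sme || !guest_has_sme);

	if (!host_has_sme)
		return 0;

	/* SMPRI_EL1 is always trapped, even for SME guests. */
	clr |= MODEL_HFGxTR_nSMPRI_EL1;

	/* TPIDR2_EL0 is only trapped when the guest has no SME. */
	if (!guest_has_sme)
		clr |= MODEL_HFGxTR_nTPIDR2_EL0;

	return clr;
}
```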
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h          | 3 +++
 arch/arm64/kvm/hyp/include/hyp/switch.h    | 5 ++++-
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 6 ++++++
 arch/arm64/kvm/sys_regs.c                  | 2 +-
 4 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 4be5dda9734d..36bf9d7e92e1 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -347,6 +347,7 @@ enum vcpu_sysreg { CONTEXTIDR_EL1, /* Context ID Register */ TPIDR_EL0, /* Thread ID, User R/W */ TPIDRRO_EL0, /* Thread ID, User R/O */ + TPIDR2_EL0, /* Thread ID 2, User R/W */ TPIDR_EL1, /* Thread ID, Privileged */ AMAIR_EL1, /* Aux Memory Attribute Indirection Register */ CNTKCTL_EL1, /* Timer Control Register (EL1) */ @@ -885,6 +886,7 @@ static inline bool __vcpu_read_sys_reg_from_cpu(int reg, u64 *val) case CONTEXTIDR_EL1: *val = read_sysreg_s(SYS_CONTEXTIDR_EL12);break; case TPIDR_EL0: *val = read_sysreg_s(SYS_TPIDR_EL0); break; case TPIDRRO_EL0: *val = read_sysreg_s(SYS_TPIDRRO_EL0); break; + case TPIDR2_EL0: *val = read_sysreg_s(SYS_TPIDR2_EL0); break; case TPIDR_EL1: *val = read_sysreg_s(SYS_TPIDR_EL1); break; case AMAIR_EL1: *val = read_sysreg_s(SYS_AMAIR_EL12); break; case CNTKCTL_EL1: *val = read_sysreg_s(SYS_CNTKCTL_EL12); break; @@ -928,6 +930,7 @@ static inline bool __vcpu_write_sys_reg_to_cpu(u64 val, int reg) case VBAR_EL1: write_sysreg_s(val, SYS_VBAR_EL12); break; case CONTEXTIDR_EL1: write_sysreg_s(val, SYS_CONTEXTIDR_EL12);break; case TPIDR_EL0: write_sysreg_s(val, SYS_TPIDR_EL0); break; + case TPIDR2_EL0: write_sysreg_s(val, SYS_TPIDR2_EL0); break; case TPIDRRO_EL0: write_sysreg_s(val, SYS_TPIDRRO_EL0); break; case TPIDR_EL1: write_sysreg_s(val, SYS_TPIDR_EL1); break; case AMAIR_EL1: write_sysreg_s(val, SYS_AMAIR_EL12); break; diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h index 9601212bd3ce..72982e752972 100644 --- a/arch/arm64/kvm/hyp/include/hyp/switch.h +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h @@ -93,7 +93,10 @@ static inline void __activate_traps_hfgxtr(struct kvm_vcpu *vcpu) ctxt_sys_reg(hctxt, HFGWTR_EL2) = read_sysreg_s(SYS_HFGWTR_EL2);
if (cpus_have_final_cap(ARM64_SME)) { - tmp = HFGxTR_EL2_nSMPRI_EL1_MASK | HFGxTR_EL2_nTPIDR2_EL0_MASK; + tmp = HFGxTR_EL2_nSMPRI_EL1_MASK; + + if (!vcpu_has_sme(vcpu)) + tmp |= HFGxTR_EL2_nTPIDR2_EL0_MASK;
r_clr |= tmp; w_clr |= tmp; diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h index fb84834cd2a0..5436b33d50b7 100644 --- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h +++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h @@ -35,6 +35,9 @@ static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt) { ctxt_sys_reg(ctxt, TPIDR_EL0) = read_sysreg(tpidr_el0); ctxt_sys_reg(ctxt, TPIDRRO_EL0) = read_sysreg(tpidrro_el0); + + if (ctxt_has_sme(ctxt)) + ctxt_sys_reg(ctxt, TPIDR2_EL0) = read_sysreg_s(SYS_TPIDR2_EL0); }
static inline bool ctxt_has_mte(struct kvm_cpu_context *ctxt) @@ -105,6 +108,9 @@ static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt) { write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL0), tpidr_el0); write_sysreg(ctxt_sys_reg(ctxt, TPIDRRO_EL0), tpidrro_el0); + + if (ctxt_has_sme(ctxt)) + write_sysreg_s(ctxt_sys_reg(ctxt, TPIDR2_EL0), SYS_TPIDR2_EL0); }
static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index e6339ca1d8dc..a33ad12dc3ab 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2369,7 +2369,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 }, { SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 }, - { SYS_DESC(SYS_TPIDR2_EL0), undef_access }, + { SYS_DESC(SYS_TPIDR2_EL0), NULL, reset_unknown, TPIDR2_EL0, 0, .visibility = sme_visibility },
{ SYS_DESC(SYS_SCXTNUM_EL0), undef_access },
SME priorities are entirely implementation defined and Linux currently has no support for SME priorities so we do not expose them to SME capable guests, reporting SMIDR_EL1.SMPS as 0. This means that on a host which supports SME priorities we need to trap writes to SMPRI_EL1 and make the whole register RES0. For a guest that does not support SME at all the register should instead be undefined.
If the host does not support SME priorities we could disable the trap but since there is no reason to access SMPRI_EL1 on a system that does not support priorities it seems more trouble than it is worth to detect and handle this eventuality, especially given the lack of SME priority support in the host and potential user confusion that may result if we report that the feature is detected but do not provide any interface for it.
With the current specification and host implementation we could disable read traps for all guests since we initialise SMPRI_EL1.Priority to 0 but for robustness when we do start implementing host support for priorities or if more fields are added just leave the traps enabled.
Once we have physical implementations that implement SME priorities and an understanding of how their use impacts the system we can consider exposing the feature to guests in some form but this will require substantial study.
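The RES0 behaviour described above amounts to "reads return zero, writes are discarded". A minimal sketch, using a hypothetical `struct model_sys_reg_params` in place of the kernel's trap parameters:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the parameters of a trapped register access. */
struct model_sys_reg_params {
	bool is_write;
	uint64_t regval;
};

/*
 * Emulate a register whose fields are all RES0: a write is accepted
 * and ignored, a read yields 0. Returning true means the access was
 * handled and the guest instruction can be skipped.
 */
static bool model_access_res0(struct model_sys_reg_params *p)
{
	assert(p);

	if (p->is_write)
		return true;	/* value supplied by the guest is dropped */

	p->regval = 0;
	return true;
}
```

This differs from an undefined register, where the access would instead result in an exception being injected into the guest.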
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kvm/hyp/include/hyp/switch.h |  5 +++++
 arch/arm64/kvm/sys_regs.c               | 14 +++++++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h index 72982e752972..0cf4770b9d70 100644 --- a/arch/arm64/kvm/hyp/include/hyp/switch.h +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h @@ -93,6 +93,11 @@ static inline void __activate_traps_hfgxtr(struct kvm_vcpu *vcpu) ctxt_sys_reg(hctxt, HFGWTR_EL2) = read_sysreg_s(SYS_HFGWTR_EL2);
if (cpus_have_final_cap(ARM64_SME)) { + /* + * We hide priorities from guests so need to trap + * access to SMPRI_EL1 in order to map it to RES0 + * even if the guest has SME. + */ tmp = HFGxTR_EL2_nSMPRI_EL1_MASK;
if (!vcpu_has_sme(vcpu)) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index a33ad12dc3ab..b618bcab526e 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -351,6 +351,18 @@ static bool access_gic_sre(struct kvm_vcpu *vcpu, return true; }
+static bool access_res0(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + if (p->is_write) + return ignore_write(vcpu, p); + + p->regval = 0; + + return true; +} + static bool trap_raz_wi(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) @@ -2224,7 +2236,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = sve_visibility }, { SYS_DESC(SYS_TRFCR_EL1), undef_access }, - { SYS_DESC(SYS_SMPRI_EL1), undef_access }, + { SYS_DESC(SYS_SMPRI_EL1), .access = access_res0, .visibility = sme_visibility }, { SYS_DESC(SYS_SMCR_EL1), NULL, reset_val, SMCR_EL1, 0, .visibility = sme_visibility }, { SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 }, { SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
As a placeholder while SME guests were not supported we provide a u64 in struct kvm_vcpu_arch for the host kernel's floating point save code to use when managing KVM guests. In order to support SME guests we need to replace this with a proper KVM system register. Do so, and update the system register definition to make it accessible to the guest if it has SME.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h | 2 +-
 arch/arm64/kvm/fpsimd.c           | 8 +++++---
 arch/arm64/kvm/sys_regs.c         | 2 +-
 3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 36bf9d7e92e1..690c439b5e2a 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -356,6 +356,7 @@ enum vcpu_sysreg { MDCCINT_EL1, /* Monitor Debug Comms Channel Interrupt Enable Reg */ OSLSR_EL1, /* OS Lock Status Register */ DISR_EL1, /* Deferred Interrupt Status Register */ + SVCR, /* Scalable Vector Control Register */
/* Performance Monitors Registers */ PMCR_EL0, /* Control Register */ @@ -518,7 +519,6 @@ struct kvm_vcpu_arch { void *sve_state; enum fp_type fp_type; unsigned int max_vl[ARM64_VEC_MAX]; - u64 svcr;
/* Stage 2 paging state used by the hardware on next switch */ struct kvm_s2_mmu *hw_mmu; diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c index a402a072786a..1be18d719fce 100644 --- a/arch/arm64/kvm/fpsimd.c +++ b/arch/arm64/kvm/fpsimd.c @@ -145,14 +145,16 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) {
/* - * Currently we do not support SME guests so SVCR is - * always 0 and we just need a variable to point to. + * We peer into the registers since SVCR is saved as + * part of the floating point state, determining which + * registers exist and their size, so is saved by + * fpsimd_save(). */ fp_state.st = &vcpu->arch.ctxt.fp_regs; fp_state.sve_state = vcpu->arch.sve_state; fp_state.sve_vl = vcpu->arch.max_vl[ARM64_VEC_SVE]; fp_state.sme_state = NULL; - fp_state.svcr = &vcpu->arch.svcr; + fp_state.svcr = &(vcpu->arch.ctxt.sys_regs[SVCR]); fp_state.fp_type = &vcpu->arch.fp_type;
if (vcpu_has_sve(vcpu)) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index b618bcab526e..f908aa3fb606 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2336,7 +2336,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { .get_user = get_id_reg, .visibility = sme_visibility }, { SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 }, { SYS_DESC(SYS_CTR_EL0), access_ctr }, - { SYS_DESC(SYS_SVCR), undef_access }, + { SYS_DESC(SYS_SVCR), NULL, reset_val, SVCR, 0, .visibility = sme_visibility },
{ PMU_SYS_REG(PMCR_EL0), .access = access_pmcr, .reset = reset_pmcr, .reg = PMCR_EL0, .get_user = get_pmcr, .set_user = set_pmcr },
If the guest has access to SME then we need to restore state for it and configure SMCR_EL2 to grant the guest access to the vector lengths that it has access to, along with FA64 and ZT0 if they were enabled.
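The SMCR_EL2 value for a guest can be sketched as below. This is a standalone model, not the hypervisor code: the function name is hypothetical, and the field positions (LEN in bits [3:0], EZT0 at bit 30, FA64 at bit 31) are stated here as assumptions to be checked against the register description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed SMCR_ELx field layout. */
#define MODEL_SMCR_LEN_MASK	UINT64_C(0xf)
#define MODEL_SMCR_EZT0		(UINT64_C(1) << 30)
#define MODEL_SMCR_FA64		(UINT64_C(1) << 31)

/*
 * Compute the SMCR_EL2 value to use while the guest runs: keep any
 * unrelated bits from the current value, then program the guest's
 * maximum vector length and the optional features it was given.
 * max_vq is the vector length in units of 128 bits.
 */
static uint64_t model_guest_smcr_el2(uint64_t old_smcr, unsigned int max_vq,
				     bool guest_has_fa64, bool guest_has_sme2)
{
	uint64_t new_smcr;

	assert(max_vq >= 1 && max_vq <= 16);

	new_smcr = old_smcr & ~(MODEL_SMCR_LEN_MASK | MODEL_SMCR_FA64 |
				MODEL_SMCR_EZT0);
	new_smcr |= (uint64_t)(max_vq - 1) & MODEL_SMCR_LEN_MASK;

	if (guest_has_fa64)
		new_smcr |= MODEL_SMCR_FA64;
	if (guest_has_sme2)
		new_smcr |= MODEL_SMCR_EZT0;

	return new_smcr;
}
```

A caller would compare the result with the current register value and skip the write when they match, since system register writes are comparatively expensive.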
SME has three sets of registers: ZA, ZT (only present if SME2 is supported for the guest) and streaming SVE. ZA and streaming SVE use a separately configured streaming mode vector length. The first two are fairly straightforward. They are enabled by PSTATE.ZA, which is saved and restored via SVCR.ZA, and we reuse the assembly the host already uses to load them from a single contiguous buffer. If PSTATE.ZA is not set they can only be accessed by setting PSTATE.ZA, which will initialise them to all bits 0.

Streaming mode SVE presents some complications. It is a version of the SVE registers which may optionally omit the FFR register and which uses the streaming mode vector length. In the same way that ZA and ZT are controlled by PSTATE.ZA, it is controlled by PSTATE.SM, which we save and restore with SVCR.SM. Any change in the value of PSTATE.SM initialises all bits of the floating point registers to 0, so we do not need to be concerned about information leakage due to changes in vector length between streaming and non-streaming modes.
The overall floating point restore is modified to start by assuming that we will restore either FPSIMD or full SVE register state depending on if the guest has access to vanilla SVE. If the guest has SME we then check the SME configuration we are restoring and override as appropriate. Since SVE instructions are VL agnostic the hardware will load using the appropriate vector length without us needing to pass further flags.
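The decision described above can be sketched as a pure function. The types and names here are hypothetical, introduced only for illustration; the SVCR bit positions (SM at bit 0, ZA at bit 1) are assumptions to be checked against the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SVCR_SM	(UINT64_C(1) << 0)
#define MODEL_SVCR_ZA	(UINT64_C(1) << 1)

enum model_vec_type { MODEL_VEC_SVE, MODEL_VEC_SME };

/* Hypothetical summary of the guest configuration and saved SVCR. */
struct model_guest {
	bool has_sve, has_sme, has_sme2, has_fa64;
	uint64_t svcr;
};

/* What the restore path should load for this guest. */
struct model_restore_plan {
	bool sve_regs;			/* SVE format rather than FPSIMD */
	bool ffr;			/* load FFR along with Z and P */
	bool za;			/* load the ZA matrix */
	bool zt;			/* load ZT0 after ZA */
	enum model_vec_type vec_type;	/* which VL lays out the buffer */
};

static struct model_restore_plan model_plan_restore(const struct model_guest *g)
{
	struct model_restore_plan plan = { .vec_type = MODEL_VEC_SVE };

	assert(g);

	/* Start from the non-SME view: full SVE or plain FPSIMD. */
	plan.sve_regs = g->has_sve;
	plan.ffr = g->has_sve;

	if (!g->has_sme)
		return plan;

	/* Streaming mode overrides: SVE layout at the SME VL, FFR only with FA64. */
	if (g->svcr & MODEL_SVCR_SM) {
		plan.sve_regs = true;
		plan.ffr = g->has_fa64;
		plan.vec_type = MODEL_VEC_SME;
	}

	/* ZA (and ZT0 with SME2) only hold state while PSTATE.ZA is set. */
	if (g->svcr & MODEL_SVCR_ZA) {
		plan.za = true;
		plan.zt = g->has_sme2;
	}

	return plan;
}
```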
We also modify the floating point save to save SMCR and restore the host configuration in a similar way to ZCR. The code is slightly more complex since, as well as the vector length, for SME we also have enables for FA64 and ZT0 in the register.
Since the layout of the SVE register state depends on the vector length we need to pass the active vector length when getting the pointer to FFR.
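To see why the vector length matters, here is a small sketch of the offset calculation. It assumes the buffer holds the 32 Z registers, then the 16 P registers, then FFR, with each P register one eighth the size of a Z register; `model_sve_ffr_offset()` is a hypothetical name used for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Offset of FFR from the start of the register buffer for a vector
 * length of vl bytes, assuming Z0-Z31 are followed by P0-P15 and
 * then FFR.
 */
static size_t model_sve_ffr_offset(unsigned int vl)
{
	size_t zregs, pregs;

	/* Vector lengths are multiples of 128 bits (16 bytes). */
	assert(vl >= 16 && vl % 16 == 0);

	zregs = 32 * (size_t)vl;	/* 32 Z registers of vl bytes */
	pregs = 16 * ((size_t)vl / 8);	/* 16 P registers of vl/8 bytes */

	return zregs + pregs;
}
```

With a 16 byte SVE vector length and a 64 byte streaming vector length the two offsets are 544 and 2176 bytes, so using the wrong one would point the load at the wrong part of the buffer.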
For reasons discussed at length in aaa2f14e6f3f ("KVM: arm64: Clarify host SME state management") we always disable both PSTATE.SM and PSTATE.ZA prior to running the guest, meaning that we do not need to worry about saving host SME state.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h       |  7 ++-
 arch/arm64/include/asm/kvm_hyp.h        |  1 +
 arch/arm64/kvm/fpsimd.c                 | 83 ++++++++++++++++++++++-----------
 arch/arm64/kvm/hyp/fpsimd.S             | 10 ++++
 arch/arm64/kvm/hyp/include/hyp/switch.h | 76 ++++++++++++++++++++++++++++--
 5 files changed, 143 insertions(+), 34 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 690c439b5e2a..022b9585e6f6 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -517,6 +517,7 @@ struct kvm_vcpu_arch { * records which view it saved in fp_type. */ void *sve_state; + void *sme_state; enum fp_type fp_type; unsigned int max_vl[ARM64_VEC_MAX];
@@ -818,13 +819,15 @@ struct kvm_vcpu_arch { #define vcpu_gp_regs(v) (&(v)->arch.ctxt.regs)
/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */ -#define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) + \ - sve_ffr_offset((vcpu)->arch.max_vl[ARM64_VEC_SVE])) +#define vcpu_sve_pffr(vcpu, vec) (kern_hyp_va((vcpu)->arch.sve_state) + \ + sve_ffr_offset((vcpu)->arch.max_vl[vec]))
#define vcpu_vec_max_vq(type, vcpu) sve_vq_from_vl((vcpu)->arch.max_vl[type])
#define vcpu_sve_max_vq(vcpu) vcpu_vec_max_vq(ARM64_VEC_SVE, vcpu)
+#define vcpu_sme_max_vq(vcpu) vcpu_vec_max_vq(ARM64_VEC_SME, vcpu) + #define vcpu_sve_state_size(vcpu) ({ \ size_t __size_ret; \ unsigned int __vcpu_vq; \ diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index 7ac38ed90062..e555af02ece3 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -112,6 +112,7 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu); void __fpsimd_save_state(struct user_fpsimd_state *fp_regs); void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs); void __sve_restore_state(void *sve_pffr, u32 *fpsr, bool restore_ffr); +void __sme_restore_state(void const *state, bool zt);
u64 __guest_enter(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c index 1be18d719fce..d9a56a4027a6 100644 --- a/arch/arm64/kvm/fpsimd.c +++ b/arch/arm64/kvm/fpsimd.c @@ -153,10 +153,15 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) fp_state.st = &vcpu->arch.ctxt.fp_regs; fp_state.sve_state = vcpu->arch.sve_state; fp_state.sve_vl = vcpu->arch.max_vl[ARM64_VEC_SVE]; - fp_state.sme_state = NULL; + fp_state.sme_vl = vcpu->arch.max_vl[ARM64_VEC_SME]; + fp_state.sme_state = vcpu->arch.sme_state; fp_state.svcr = &(vcpu->arch.ctxt.sys_regs[SVCR]); fp_state.fp_type = &vcpu->arch.fp_type;
+ /* + * If we are in streaming mode PSTATE.SM will override + * FP_STATE_FPSIMD during save for SME only guests. + */ if (vcpu_has_sve(vcpu)) fp_state.to_save = FP_STATE_SVE; else @@ -177,26 +182,11 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) { unsigned long flags; + u64 cpacr_set, cpacr_clear; + u64 smcr_cur, smcr_new;
local_irq_save(flags);
- /* - * If we have VHE then the Hyp code will reset CPACR_EL1 to - * the default value and we need to reenable SME. - */ - if (has_vhe() && system_supports_sme()) { - /* Also restore EL0 state seen on entry */ - if (vcpu_get_flag(vcpu, HOST_SME_ENABLED)) - sysreg_clear_set(CPACR_EL1, 0, - CPACR_EL1_SMEN_EL0EN | - CPACR_EL1_SMEN_EL1EN); - else - sysreg_clear_set(CPACR_EL1, - CPACR_EL1_SMEN_EL0EN, - CPACR_EL1_SMEN_EL1EN); - isb(); - } - if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) { if (vcpu_has_sve(vcpu)) { __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR); @@ -207,19 +197,56 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) SYS_ZCR_EL1); }
+ if (vcpu_has_sme(vcpu)) { + smcr_cur = read_sysreg_el1(SYS_SMCR); + __vcpu_sys_reg(vcpu, SMCR_EL1) = smcr_cur; + + /* Restore the full VL and feature set */ + if (!has_vhe()) { + smcr_new = vcpu_sme_max_vq(vcpu) - 1; + + if (system_supports_fa64()) + smcr_new |= SMCR_ELx_FA64; + if (system_supports_sme2()) + smcr_new |= SMCR_ELx_EZT0_MASK; + + if (smcr_cur != smcr_new) + write_sysreg_s(smcr_new, SYS_SMCR_EL1); + } + } + fpsimd_save_and_flush_cpu_state(); - } else if (has_vhe() && system_supports_sve()) { + } else if (has_vhe()) { /* - * The FPSIMD/SVE state in the CPU has not been touched, and we - * have SVE (and VHE): CPACR_EL1 (alias CPTR_EL2) has been - * reset by kvm_reset_cptr_el2() in the Hyp code, disabling SVE - * for EL0. To avoid spurious traps, restore the trap state - * seen by kvm_arch_vcpu_load_fp(): + * The FP state in the CPU has not been touched, and + * we have SVE or SME (and VHE): CPACR_EL1 (alias + * CPTR_EL2) has been reset by kvm_reset_cptr_el2() in + * the Hyp code, disabling SVE/SME for EL0. To avoid + * spurious traps, restore the trap state seen by + * kvm_arch_vcpu_load_fp(): */ - if (vcpu_get_flag(vcpu, HOST_SVE_ENABLED)) - sysreg_clear_set(CPACR_EL1, 0, CPACR_EL1_ZEN_EL0EN); - else - sysreg_clear_set(CPACR_EL1, CPACR_EL1_ZEN_EL0EN, 0); + + cpacr_clear = 0; + cpacr_set = 0; + + if (system_supports_sve()) { + if (vcpu_get_flag(vcpu, HOST_SVE_ENABLED)) { + cpacr_set |= CPACR_EL1_ZEN_EL0EN; + } else { + cpacr_clear |= CPACR_EL1_ZEN_EL0EN; + } + } + + if (system_supports_sme()) { + if (vcpu_get_flag(vcpu, HOST_SME_ENABLED)) { + cpacr_set |= CPACR_EL1_SMEN_EL0EN; + } else { + cpacr_clear |= CPACR_EL1_SMEN_EL0EN; + } + } + + if (cpacr_clear || cpacr_set) + sysreg_clear_set(CPACR_EL1, cpacr_clear, cpacr_set); }
local_irq_restore(flags); diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S index 8940954b5420..b266f97b77ae 100644 --- a/arch/arm64/kvm/hyp/fpsimd.S +++ b/arch/arm64/kvm/hyp/fpsimd.S @@ -24,3 +24,13 @@ SYM_FUNC_START(__sve_restore_state) sve_load 0, x1, x2, 3 ret SYM_FUNC_END(__sve_restore_state) + +SYM_FUNC_START(__sme_restore_state) + _sme_rdsvl 2, 1 // x2 = VL/8 + sme_load_za 0, x2, 12 // Leaves x0 pointing to the end of ZA + + cbz x1, 1f + _ldr_zt 0 +1: + ret +SYM_FUNC_END(__sme_restore_state) diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h index 0cf4770b9d70..10a055e8ed73 100644 --- a/arch/arm64/kvm/hyp/include/hyp/switch.h +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h @@ -293,11 +293,45 @@ static bool kvm_hyp_handle_mops(struct kvm_vcpu *vcpu, u64 *exit_code) static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu) { sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2); - __sve_restore_state(vcpu_sve_pffr(vcpu), - &vcpu->arch.ctxt.fp_regs.fpsr, true); write_sysreg_el1(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR); }
+static inline void __hyp_sme_restore_guest(struct kvm_vcpu *vcpu, + bool *restore_sve_regs, + bool *restore_ffr, + enum vec_type *vec_type) +{ + u64 svcr, old_smcr, new_smcr; + + /* Configure VL and features for the guest */ + old_smcr = read_sysreg_s(SYS_SMCR_EL2); + new_smcr = old_smcr & ~(SMCR_ELx_LEN_MASK | SMCR_ELx_FA64_MASK | + SMCR_ELx_EZT0_MASK); + new_smcr |= (vcpu_sme_max_vq(vcpu) - 1) & SMCR_ELx_LEN_MASK; + if (vcpu_has_fa64(vcpu)) + new_smcr |= SMCR_ELx_FA64_MASK; + if (vcpu_has_sme2(vcpu)) + new_smcr |= SMCR_ELx_EZT0_MASK; + if (old_smcr != new_smcr) + write_sysreg_s(new_smcr, SYS_SMCR_EL2); + + write_sysreg_el1(__vcpu_sys_reg(vcpu, SMCR_EL1), SYS_SMCR); + + /* Restore SVCR before data to ensure PSTATE.{SM,ZA} are configured */ + svcr = __vcpu_sys_reg(vcpu, SVCR); + write_sysreg_s(svcr, SYS_SVCR); + + if (svcr & SVCR_SM_MASK) { + *restore_sve_regs = true; + *restore_ffr = vcpu_has_fa64(vcpu); + *vec_type = ARM64_VEC_SME; + } + + if (svcr & SVCR_ZA_MASK) + __sme_restore_state(kern_hyp_va((vcpu)->arch.sme_state), + vcpu_has_sme2(vcpu)); +} + /* * We trap the first access to the FP/SIMD to save the host context and * restore the guest context lazily. @@ -306,15 +340,20 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu) */ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code) { - bool sve_guest; - u8 esr_ec; + bool sve_guest, sme_guest; + u8 esr_ec, esr_iss; u64 reg; + bool restore_ffr; + bool restore_sve_regs; + enum vec_type vec_type = ARM64_VEC_SVE;
if (!system_supports_fpsimd()) return false;
sve_guest = vcpu_has_sve(vcpu); + sme_guest = vcpu_has_sme(vcpu); esr_ec = kvm_vcpu_trap_get_class(vcpu); + esr_iss = ESR_ELx_ISS(kvm_vcpu_get_esr(vcpu));
/* Only handle traps the vCPU can support here: */ switch (esr_ec) { @@ -324,6 +363,14 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code) if (!sve_guest) return false; break; + case ESR_ELx_EC_SME: + if (!sme_guest) + return false; + + if (!vcpu_has_sme2(vcpu) && + (esr_iss == ESR_ELx_SME_ISS_ZT_DISABLED)) + return false; + break; default: return false; } @@ -335,12 +382,16 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code) reg = CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN; if (sve_guest) reg |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN; + if (sme_guest) + reg |= CPACR_EL1_SMEN_EL0EN | CPACR_EL1_SMEN_EL1EN;
sysreg_clear_set(cpacr_el1, 0, reg); } else { reg = CPTR_EL2_TFP; if (sve_guest) reg |= CPTR_EL2_TZ; + if (sme_guest) + reg |= CPTR_EL2_TSM;
sysreg_clear_set(cptr_el2, reg, 0); } @@ -351,8 +402,25 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code) __fpsimd_save_state(vcpu->arch.host_fpsimd_state);
/* Restore the guest state */ + + /* + * These may be overridden if the guest has SME, the host can + * have SME without SVE and in streaming mode SME may lack FFR. + */ + restore_sve_regs = sve_guest; + restore_ffr = sve_guest; + if (sve_guest) __hyp_sve_restore_guest(vcpu); + /* SVCR is cleared by kvm_arch_vcpu_load_fp() for !SME cases */ + if (sme_guest) + __hyp_sme_restore_guest(vcpu, &restore_sve_regs, &restore_ffr, + &vec_type); + + if (restore_sve_regs) + __sve_restore_state(vcpu_sve_pffr(vcpu, vec_type), + &vcpu->arch.ctxt.fp_regs.fpsr, + restore_ffr); else __fpsimd_restore_state(&vcpu->arch.ctxt.fp_regs);
Now that we have support for managing SME state for KVM guests, add handling for SME exceptions generated by guests. As with SVE, these are routed to the generic floating point exception handlers for both VHE and nVHE, since the floating point state is handled as a uniform block.
Since we do not presently support SME for protected VMs, handle SME exceptions from protected guests as UNDEF.
For nVHE and hVHE modes we currently do a lazy restore of the host EL2 setup for SVE; do the same for SME. Since the host is likely to use SVE and SME in quick succession (e.g. when saving the guest state), restore the configuration for both at once in order to minimise the number of EL2 entries.
Signed-off-by: Mark Brown broonie@kernel.org
---
 arch/arm64/include/asm/kvm_emulate.h | 12 ++++----
 arch/arm64/kvm/handle_exit.c         | 11 +++++++
 arch/arm64/kvm/hyp/nvhe/hyp-main.c   | 56 ++++++++++++++++++++++++++++++------
 arch/arm64/kvm/hyp/nvhe/switch.c     | 13 ++++-----
 arch/arm64/kvm/hyp/vhe/switch.c      |  3 ++
 5 files changed, 73 insertions(+), 22 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 14d6ff2e2a39..756c2c28c592 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -584,16 +584,15 @@ static __always_inline u64 kvm_get_reset_cptr_el2(struct kvm_vcpu *vcpu)
if (has_vhe()) { val = (CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN | - CPACR_EL1_ZEN_EL1EN); - if (cpus_have_final_cap(ARM64_SME)) - val |= CPACR_EL1_SMEN_EL1EN; + CPACR_EL1_ZEN_EL1EN | CPACR_EL1_SMEN_EL1EN); } else if (has_hvhe()) { val = (CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN);
if (!vcpu_has_sve(vcpu) || (vcpu->arch.fp_state != FP_STATE_GUEST_OWNED)) val |= CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN; - if (cpus_have_final_cap(ARM64_SME)) + if (!vcpu_has_sme(vcpu) || + (vcpu->arch.fp_state != FP_STATE_GUEST_OWNED)) val |= CPACR_EL1_SMEN_EL1EN | CPACR_EL1_SMEN_EL0EN; } else { val = CPTR_NVHE_EL2_RES1; @@ -602,8 +601,9 @@ static __always_inline u64 kvm_get_reset_cptr_el2(struct kvm_vcpu *vcpu) if (vcpu_has_sve(vcpu) && (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED)) val |= CPTR_EL2_TZ; - if (cpus_have_final_cap(ARM64_SME)) - val &= ~CPTR_EL2_TSM; + if (vcpu_has_sme(vcpu) && + (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED)) + val |= CPTR_EL2_TSM; }
return val; diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index 617ae6dea5d5..e5d8d8767872 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -206,6 +206,16 @@ static int handle_sve(struct kvm_vcpu *vcpu) return 1; }
+/* + * Guest access to SME registers should be routed to this handler only + * when the system doesn't support SME. + */ +static int handle_sme(struct kvm_vcpu *vcpu) +{ + kvm_inject_undefined(vcpu); + return 1; +} + /* * Guest usage of a ptrauth instruction (which the guest EL1 did not turn into * a NOP). If we get here, it is that we didn't fixup ptrauth on exit, and all @@ -268,6 +278,7 @@ static exit_handle_fn arm_exit_handlers[] = { [ESR_ELx_EC_SVC64] = handle_svc, [ESR_ELx_EC_SYS64] = kvm_handle_sys_reg, [ESR_ELx_EC_SVE] = handle_sve, + [ESR_ELx_EC_SME] = handle_sme, [ESR_ELx_EC_ERET] = kvm_handle_eret, [ESR_ELx_EC_IABT_LOW] = kvm_handle_guest_abort, [ESR_ELx_EC_DABT_LOW] = kvm_handle_guest_abort, diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index 56808df6a078..b2da4800b673 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -411,6 +411,52 @@ static void handle_host_smc(struct kvm_cpu_context *host_ctxt) kvm_skip_host_instr(); }
+static void handle_host_vec(void) +{ + u64 old_smcr, new_smcr; + u64 mask = 0; + + /* + * Handle lazy restore of the EL2 configuration for host SVE + * and SME usage. It is likely that when a host supports both + * SVE and SME it will use both in quick succession (eg, + * saving guest state) so we restore both when either traps. + */ + if (has_hvhe()) { + if (cpus_have_final_cap(ARM64_SVE)) + mask |= CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN; + if (cpus_have_final_cap(ARM64_SME)) + mask |= CPACR_EL1_SMEN_EL1EN | CPACR_EL1_SMEN_EL0EN; + + sysreg_clear_set(cpacr_el1, 0, mask); + } else { + if (cpus_have_final_cap(ARM64_SVE)) + mask |= CPTR_EL2_TZ; + if (cpus_have_final_cap(ARM64_SME)) + mask |= CPTR_EL2_TSM; + + sysreg_clear_set(cptr_el2, mask, 0); + } + + isb(); + + if (cpus_have_final_cap(ARM64_SVE)) + sve_cond_update_zcr_vq(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2); + + if (cpus_have_final_cap(ARM64_SME)) { + old_smcr = read_sysreg_s(SYS_SMCR_EL2); + new_smcr = SMCR_ELx_LEN_MASK; + + if (cpus_have_final_cap(ARM64_SME_FA64)) + new_smcr |= SMCR_ELx_FA64_MASK; + if (cpus_have_final_cap(ARM64_SME2)) + new_smcr |= SMCR_ELx_EZT0_MASK; + + if (old_smcr != new_smcr) + write_sysreg_s(new_smcr, SYS_SMCR_EL2); + } +} + void handle_trap(struct kvm_cpu_context *host_ctxt) { u64 esr = read_sysreg_el2(SYS_ESR); @@ -423,14 +469,8 @@ void handle_trap(struct kvm_cpu_context *host_ctxt) handle_host_smc(host_ctxt); break; case ESR_ELx_EC_SVE: - /* Handle lazy restore of the host VL */ - if (has_hvhe()) - sysreg_clear_set(cpacr_el1, 0, (CPACR_EL1_ZEN_EL1EN | - CPACR_EL1_ZEN_EL0EN)); - else - sysreg_clear_set(cptr_el2, CPTR_EL2_TZ, 0); - isb(); - sve_cond_update_zcr_vq(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2); + case ESR_ELx_EC_SME: + handle_host_vec(); break; case ESR_ELx_EC_IABT_LOW: case ESR_ELx_EC_DABT_LOW: diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index c50f8459e4fc..b022728edb2f 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ 
b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -46,19 +46,14 @@ static void __activate_traps(struct kvm_vcpu *vcpu) val = vcpu->arch.cptr_el2; val |= CPTR_EL2_TAM; /* Same bit irrespective of E2H */ val |= has_hvhe() ? CPACR_EL1_TTA : CPTR_EL2_TTA; - if (cpus_have_final_cap(ARM64_SME)) { - if (has_hvhe()) - val &= ~(CPACR_EL1_SMEN_EL1EN | CPACR_EL1_SMEN_EL0EN); - else - val |= CPTR_EL2_TSM; - }
if (!guest_owns_fp_regs(vcpu)) { if (has_hvhe()) val &= ~(CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN | - CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN); + CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN | + CPACR_EL1_SMEN_EL0EN | CPACR_EL1_SMEN_EL1EN); else - val |= CPTR_EL2_TFP | CPTR_EL2_TZ; + val |= CPTR_EL2_TFP | CPTR_EL2_TZ | CPTR_EL2_TSM;
__activate_traps_fpsimd32(vcpu); } @@ -186,6 +181,7 @@ static const exit_handler_fn hyp_exit_handlers[] = { [0 ... ESR_ELx_EC_MAX] = NULL, [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32, [ESR_ELx_EC_SYS64] = kvm_hyp_handle_sysreg, + [ESR_ELx_EC_SME] = kvm_hyp_handle_fpsimd, [ESR_ELx_EC_SVE] = kvm_hyp_handle_fpsimd, [ESR_ELx_EC_FP_ASIMD] = kvm_hyp_handle_fpsimd, [ESR_ELx_EC_IABT_LOW] = kvm_hyp_handle_iabt_low, @@ -198,6 +194,7 @@ static const exit_handler_fn hyp_exit_handlers[] = { static const exit_handler_fn pvm_exit_handlers[] = { [0 ... ESR_ELx_EC_MAX] = NULL, [ESR_ELx_EC_SYS64] = kvm_handle_pvm_sys64, + [ESR_ELx_EC_SME] = kvm_handle_pvm_restricted, [ESR_ELx_EC_SVE] = kvm_handle_pvm_restricted, [ESR_ELx_EC_FP_ASIMD] = kvm_hyp_handle_fpsimd, [ESR_ELx_EC_IABT_LOW] = kvm_hyp_handle_iabt_low, diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 1581df6aec87..0b1a9733f3e0 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -78,6 +78,8 @@ static void __activate_traps(struct kvm_vcpu *vcpu) if (guest_owns_fp_regs(vcpu)) { if (vcpu_has_sve(vcpu)) val |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN; + if (vcpu_has_sme(vcpu)) + val |= CPACR_EL1_SMEN_EL0EN | CPACR_EL1_SMEN_EL1EN; } else { val &= ~(CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN); __activate_traps_fpsimd32(vcpu); @@ -177,6 +179,7 @@ static const exit_handler_fn hyp_exit_handlers[] = { [0 ... ESR_ELx_EC_MAX] = NULL, [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32, [ESR_ELx_EC_SYS64] = kvm_hyp_handle_sysreg, + [ESR_ELx_EC_SME] = kvm_hyp_handle_fpsimd, [ESR_ELx_EC_SVE] = kvm_hyp_handle_fpsimd, [ESR_ELx_EC_FP_ASIMD] = kvm_hyp_handle_fpsimd, [ESR_ELx_EC_IABT_LOW] = kvm_hyp_handle_iabt_low,
Configuration of SME vector lengths is done using a virtual register, as for SVE; hook up the implementation for that virtual register. Since we do not yet have support for any of the new SME registers, stub register access functions are provided.
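As an illustration of the vector length pseudo-register layout, the following is a minimal userspace-style sketch of how a set of vector quanta (VQs, units of 128 bits) maps onto the 512-bit bitmap. The helper names (vls_set, vls_present, vls_max_vq) and the local VQ_MIN/VQ_MAX constants are assumptions made for this example, chosen to mirror the vq_word()/vq_mask() macros in guest.c; it is not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match __SVE_VQ_MIN/__SVE_VQ_MAX; local names for illustration. */
#define VQ_MIN		1
#define VQ_MAX		512
#define VLS_WORDS	((VQ_MAX - VQ_MIN) / 64 + 1)	/* 8 x 64 bits = 512 bits */

/* Index of the 64-bit word holding the bit for this VQ. */
static unsigned int vq_word(unsigned int vq)
{
	return (vq - VQ_MIN) / 64;
}

/* Bit within that word which represents this VQ. */
static uint64_t vq_mask(unsigned int vq)
{
	return (uint64_t)1 << ((vq - VQ_MIN) % 64);
}

/* Hypothetical helper: mark a VQ as present in the set. */
static void vls_set(uint64_t vqs[VLS_WORDS], unsigned int vq)
{
	assert(vq >= VQ_MIN && vq <= VQ_MAX);
	vqs[vq_word(vq)] |= vq_mask(vq);
}

/* Hypothetical helper: test whether a VQ is present in the set. */
static int vls_present(const uint64_t vqs[VLS_WORDS], unsigned int vq)
{
	assert(vq >= VQ_MIN && vq <= VQ_MAX);
	return !!(vqs[vq_word(vq)] & vq_mask(vq));
}

/* Hypothetical helper: largest VQ in the set, or 0 if the set is empty. */
static unsigned int vls_max_vq(const uint64_t vqs[VLS_WORDS])
{
	unsigned int vq, max_vq = 0;

	for (vq = VQ_MIN; vq <= VQ_MAX; vq++)
		if (vls_present(vqs, vq))
			max_vq = vq;

	return max_vq;
}
```

A VMM would fill such a bitmap and pass it via KVM_SET_ONE_REG before finalisation; as the set_vec_vls() comment notes, the requested set has to match the host's available lengths exactly up to the requested maximum.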
Signed-off-by: Mark Brown broonie@kernel.org
---
 arch/arm64/include/uapi/asm/kvm.h |  9 ++++
 arch/arm64/kvm/guest.c            | 94 +++++++++++++++++++++++++++++++--------
 2 files changed, 84 insertions(+), 19 deletions(-)
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index 3048890fac68..02642bb96496 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -353,6 +353,15 @@ struct kvm_arm_counter_offset { #define KVM_ARM64_SVE_VLS_WORDS \ ((KVM_ARM64_SVE_VQ_MAX - KVM_ARM64_SVE_VQ_MIN) / 64 + 1)
+/* SME registers */ +#define KVM_REG_ARM64_SME (0x17 << KVM_REG_ARM_COPROC_SHIFT) + +/* Vector lengths pseudo-register: */ +#define KVM_REG_ARM64_SME_VLS (KVM_REG_ARM64 | KVM_REG_ARM64_SME | \ + KVM_REG_SIZE_U512 | 0xffff) +#define KVM_ARM64_SME_VLS_WORDS \ + ((KVM_ARM64_SVE_VQ_MAX - KVM_ARM64_SVE_VQ_MIN) / 64 + 1) + /* Bitmap feature firmware registers */ #define KVM_REG_ARM_FW_FEAT_BMAP (0x0016 << KVM_REG_ARM_COPROC_SHIFT) #define KVM_REG_ARM_FW_FEAT_BMAP_REG(r) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 6e116fd8a917..30446ae357f0 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -309,22 +309,20 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) #define vq_mask(vq) ((u64)1 << ((vq) - SVE_VQ_MIN) % 64) #define vq_present(vqs, vq) (!!((vqs)[vq_word(vq)] & vq_mask(vq)))
-static int get_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) +static int get_vec_vls(enum vec_type vec_type, struct kvm_vcpu *vcpu, + const struct kvm_one_reg *reg) { unsigned int max_vq, vq; u64 vqs[KVM_ARM64_SVE_VLS_WORDS];
- if (!vcpu_has_sve(vcpu)) - return -ENOENT; - - if (WARN_ON(!sve_vl_valid(vcpu->arch.max_vl[ARM64_VEC_SVE]))) + if (WARN_ON(!sve_vl_valid(vcpu->arch.max_vl[vec_type]))) return -EINVAL;
memset(vqs, 0, sizeof(vqs));
- max_vq = vcpu_sve_max_vq(vcpu); + max_vq = vcpu_vec_max_vq(vec_type, vcpu); for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq) - if (sve_vq_available(vq)) + if (vq_available(vec_type, vq)) vqs[vq_word(vq)] |= vq_mask(vq);
if (copy_to_user((void __user *)reg->addr, vqs, sizeof(vqs))) @@ -333,18 +331,13 @@ static int get_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) return 0; }
-static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) +static int set_vec_vls(enum vec_type vec_type, struct kvm_vcpu *vcpu, + const struct kvm_one_reg *reg) { unsigned int max_vq, vq; u64 vqs[KVM_ARM64_SVE_VLS_WORDS];
- if (!vcpu_has_sve(vcpu)) - return -ENOENT; - - if (kvm_arm_vcpu_vec_finalized(vcpu)) - return -EPERM; /* too late! */ - - if (WARN_ON(vcpu->arch.sve_state)) + if (WARN_ON(!sve_vl_valid(vcpu->arch.max_vl[vec_type]))) return -EINVAL;
if (copy_from_user(vqs, (const void __user *)reg->addr, sizeof(vqs))) @@ -355,18 +348,18 @@ static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) if (vq_present(vqs, vq)) max_vq = vq;
- if (max_vq > sve_vq_from_vl(kvm_vec_max_vl[ARM64_VEC_SVE])) + if (max_vq > sve_vq_from_vl(kvm_vec_max_vl[vec_type])) return -EINVAL;
/* * Vector lengths supported by the host can't currently be * hidden from the guest individually: instead we can only set a - * maximum via ZCR_EL2.LEN. So, make sure the available vector + * maximum via xCR_EL2.LEN. So, make sure the available vector * lengths match the set requested exactly up to the requested * maximum: */ for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq) - if (vq_present(vqs, vq) != sve_vq_available(vq)) + if (vq_present(vqs, vq) != vq_available(vec_type, vq)) return -EINVAL;
/* Can't run with no vector lengths at all: */ @@ -374,11 +367,33 @@ static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) return -EINVAL;
/* vcpu->arch.sve_state will be alloc'd by kvm_vcpu_finalize_sve() */ - vcpu->arch.max_vl[ARM64_VEC_SVE] = sve_vl_from_vq(max_vq); + vcpu->arch.max_vl[vec_type] = sve_vl_from_vq(max_vq);
return 0; }
+static int get_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) +{ + if (!vcpu_has_sve(vcpu)) + return -ENOENT; + + return get_vec_vls(ARM64_VEC_SVE, vcpu, reg); +} + +static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) +{ + if (!vcpu_has_sve(vcpu)) + return -ENOENT; + + if (kvm_arm_vcpu_vec_finalized(vcpu)) + return -EPERM; /* too late! */ + + if (WARN_ON(vcpu->arch.sve_state)) + return -EINVAL; + + return set_vec_vls(ARM64_VEC_SVE, vcpu, reg); +} + #define SVE_REG_SLICE_SHIFT 0 #define SVE_REG_SLICE_BITS 5 #define SVE_REG_ID_SHIFT (SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS) @@ -532,6 +547,45 @@ static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) return 0; }
+static int get_sme_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) +{ + if (!vcpu_has_sme(vcpu)) + return -ENOENT; + + return get_vec_vls(ARM64_VEC_SME, vcpu, reg); +} + +static int set_sme_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) +{ + if (!vcpu_has_sme(vcpu)) + return -ENOENT; + + if (kvm_arm_vcpu_vec_finalized(vcpu)) + return -EPERM; /* too late! */ + + if (WARN_ON(vcpu->arch.sme_state)) + return -EINVAL; + + return set_vec_vls(ARM64_VEC_SME, vcpu, reg); +} + +static int get_sme_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) +{ + /* Handle the KVM_REG_ARM64_SME_VLS pseudo-reg as a special case: */ + if (reg->id == KVM_REG_ARM64_SME_VLS) + return get_sme_vls(vcpu, reg); + + return -EINVAL; +} + +static int set_sme_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) +{ + /* Handle the KVM_REG_ARM64_SME_VLS pseudo-reg as a special case: */ + if (reg->id == KVM_REG_ARM64_SME_VLS) + return set_sme_vls(vcpu, reg); + + return -EINVAL; +} int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) { return -EINVAL; @@ -771,6 +825,7 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) case KVM_REG_ARM_FW_FEAT_BMAP: return kvm_arm_get_fw_reg(vcpu, reg); case KVM_REG_ARM64_SVE: return get_sve_reg(vcpu, reg); + case KVM_REG_ARM64_SME: return get_sme_reg(vcpu, reg); }
if (is_timer_reg(reg->id)) @@ -791,6 +846,7 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) case KVM_REG_ARM_FW_FEAT_BMAP: return kvm_arm_set_fw_reg(vcpu, reg); case KVM_REG_ARM64_SVE: return set_sve_reg(vcpu, reg); + case KVM_REG_ARM64_SME: return set_sme_reg(vcpu, reg); }
if (is_timer_reg(reg->id))
As for SVE, we will need to pull parts of dynamically sized registers out of a block of memory for SME, so we will use a similar code pattern for this. Rename the current struct sve_state_reg_region to vec_state_reg_region in preparation for this.
Signed-off-by: Mark Brown broonie@kernel.org
---
 arch/arm64/kvm/guest.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 30446ae357f0..1d161fa7c2be 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -418,9 +418,9 @@ static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) */ #define vcpu_sve_slices(vcpu) 1
-/* Bounds of a single SVE register slice within vcpu->arch.sve_state */ -struct sve_state_reg_region { - unsigned int koffset; /* offset into sve_state in kernel memory */ +/* Bounds of a single register slice within vcpu->arch.s[mv]e_state */ +struct vec_state_reg_region { + unsigned int koffset; /* offset into s[mv]e_state in kernel memory */ unsigned int klen; /* length in kernel memory */ unsigned int upad; /* extra trailing padding in user memory */ }; @@ -429,7 +429,7 @@ struct sve_state_reg_region { * Validate SVE register ID and get sanitised bounds for user/kernel SVE * register copy */ -static int sve_reg_to_region(struct sve_state_reg_region *region, +static int sve_reg_to_region(struct vec_state_reg_region *region, struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) { @@ -499,7 +499,7 @@ static int sve_reg_to_region(struct sve_state_reg_region *region, static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) { int ret; - struct sve_state_reg_region region; + struct vec_state_reg_region region; char __user *uptr = (char __user *)reg->addr;
/* Handle the KVM_REG_ARM64_SVE_VLS pseudo-reg as a special case: */ @@ -525,7 +525,7 @@ static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) { int ret; - struct sve_state_reg_region region; + struct vec_state_reg_region region; const char __user *uptr = (const char __user *)reg->addr;
/* Handle the KVM_REG_ARM64_SVE_VLS pseudo-reg as a special case: */
SME defines a new mode called streaming mode with an associated vector length which may be configured independently of the standard SVE vector length. SVE and SME also have no interdependency; it is valid to implement SME without SVE. When in streaming mode the SVE registers are accessible with the streaming mode vector length rather than the usual SVE vector length. In order to handle this we extend the existing SVE register interface to expose the vector length dependent registers with the larger of the SVE or SME vector length, and require access to the V registers via the SVE register set if the system supports only SME.
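A minimal sketch of the vector length selection described above, assuming a simplified stand-in for the per-vCPU configuration. struct vec_config, user_vq() and user_zreg_bytes() are hypothetical names for this example only; the patch itself implements the selection in vcpu_max_vq().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-vCPU vector configuration. */
struct vec_config {
	bool has_sve;
	bool has_sme;
	unsigned int sve_max_vq;	/* only meaningful if has_sve */
	unsigned int sme_max_vq;	/* only meaningful if has_sme */
};

/* VQ used for the Z/P registers as seen by userspace: the larger of the two. */
static unsigned int user_vq(const struct vec_config *cfg)
{
	unsigned int sve = cfg->has_sve ? cfg->sve_max_vq : 0;
	unsigned int sme = cfg->has_sme ? cfg->sme_max_vq : 0;

	/* The SVE register view only exists if at least one feature is enabled. */
	assert(cfg->has_sve || cfg->has_sme);

	return sve > sme ? sve : sme;
}

/* Size in bytes of one Z register in the userspace view (one VQ is 128 bits). */
static unsigned int user_zreg_bytes(const struct vec_config *cfg)
{
	return user_vq(cfg) * 16;
}
```

With SVE at VQ 2 and SME at VQ 4, userspace would see 64-byte Z registers regardless of whether the guest is currently in streaming mode.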
The main complication this creates is that the format that is sensible for saving and restoring the registers in the hypervisor is no longer the same as that which is exposed to userspace. This is especially true when the host does not have SVE; in that case we must use the FPSIMD register save and restore code in non-streaming mode. Handle this by extending our set of register access owners to include userspace access, converting to the format used by the hypervisor if needed when preparing to run the guest and to userspace format when userspace accesses the registers. This avoids any conversions in normal hypervisor scheduling.
The userspace format is the existing SVE format with the largest vector length used. The hypervisor format varies depending on the current value of SVCR.SM and on whether SVE is supported:
- If SVCR.SM=1 then SVE format with the SME vector length is used.
- If SVCR.SM=0 and SVE is not supported then the FPSIMD format is used.
- If SVCR.SM=0 and SVE is supported the SVE format and vector length are used.
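The three cases above can be sketched as a small decision function. This is an illustration only: enum hyp_fp_format and hyp_fp_format() are hypothetical names for this example and do not appear in the patch, which makes the same decision implicitly in the save/restore and rewrite paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three hypervisor storage formats. */
enum hyp_fp_format {
	HYP_FMT_FPSIMD,		/* V registers only */
	HYP_FMT_SVE_AT_SVE_VL,	/* SVE layout at the SVE vector length */
	HYP_FMT_SVE_AT_SME_VL,	/* SVE layout at the streaming vector length */
};

static enum hyp_fp_format hyp_fp_format(bool svcr_sm, bool has_sve, bool has_sme)
{
	/* Streaming mode can only be active if the guest has SME. */
	assert(!svcr_sm || has_sme);

	if (svcr_sm)
		return HYP_FMT_SVE_AT_SME_VL;

	return has_sve ? HYP_FMT_SVE_AT_SVE_VL : HYP_FMT_FPSIMD;
}
```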
When converting to a larger vector length we pad the high bits with 0.
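A minimal sketch of the zero padding for the Z registers, assuming a flat buffer holding 32 Z registers back to back with no header. rewrite_zregs() and the local constants are hypothetical names for this example; the patch does the equivalent work in vcpu_rewrite_sve(), which also handles the P registers and FFR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VQ_BYTES	16	/* one vector quantum is 128 bits */
#define NUM_ZREGS	32

/*
 * Copy the Z registers from a buffer laid out for vq_in into a buffer
 * laid out for vq_out. When growing, the high bytes of each register
 * are zeroed; when shrinking, the high bytes are discarded.
 */
static void rewrite_zregs(uint8_t *dst, unsigned int vq_out,
			  const uint8_t *src, unsigned int vq_in)
{
	size_t in_len = (size_t)vq_in * VQ_BYTES;
	size_t out_len = (size_t)vq_out * VQ_BYTES;
	size_t copy_len = in_len < out_len ? in_len : out_len;
	unsigned int i;

	assert(vq_in > 0 && vq_out > 0);

	for (i = 0; i < NUM_ZREGS; i++) {
		memset(dst + i * out_len, 0, out_len);
		memcpy(dst + i * out_len, src + i * in_len, copy_len);
	}
}
```

The source and destination must not overlap, which is one reason the patch builds the new layout in a temporary buffer before copying it back.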
Since we already track the owner of the floating point register state, introduce a new state FP_STATE_USER_OWNED in which the state is stored in the format we present to userspace. The guest starts out in this state. When we prepare to run the guest we check for this state and, if we are in it, do any rewrites needed to store the state in the format the hypervisor expects. In order to minimise overhead we only convert back to userspace format if userspace accesses the SVE registers.
For simplicity we always represent FFR for SVE storage; if the system lacks both SVE and streaming mode FFR then the value will be visible but ignored.
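The lazy conversion scheme can be sketched as a small state machine. This is a simplified illustration under stated assumptions: struct vcpu_sketch, prepare_to_run() and user_reg_access() are hypothetical names, and the conversion counter stands in for the actual register rewrites done by vcpu_fp_user_to_guest() and vcpu_fp_guest_to_user() in the patch.

```c
#include <assert.h>

/* Simplified stand-in for the fp_state owner tracking. */
enum fp_owner { FP_FREE, FP_HOST_OWNED, FP_GUEST_OWNED, FP_USER_OWNED };

struct vcpu_sketch {
	enum fp_owner owner;
	unsigned int conversions;	/* number of format rewrites performed */
};

/* The guest starts out with its state in userspace format. */
static void vcpu_sketch_init(struct vcpu_sketch *v)
{
	v->owner = FP_USER_OWNED;
	v->conversions = 0;
}

/* Called when preparing to run: convert only if userspace format is current. */
static void prepare_to_run(struct vcpu_sketch *v)
{
	assert(v);
	if (v->owner != FP_USER_OWNED)
		return;

	v->conversions++;	/* rewrite into the hypervisor format */
	v->owner = FP_FREE;
}

/* Called on userspace register access: convert only if not already converted. */
static void user_reg_access(struct vcpu_sketch *v)
{
	assert(v);
	if (v->owner == FP_USER_OWNED)
		return;

	v->conversions++;	/* rewrite into the userspace format */
	v->owner = FP_USER_OWNED;
}
```

Repeated runs with no intervening userspace register access perform no further conversions, which is the overhead property the commit message describes.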
Signed-off-by: Mark Brown broonie@kernel.org
---
 arch/arm64/include/asm/kvm_host.h |   6 +-
 arch/arm64/kvm/arm.c              |   9 +-
 arch/arm64/kvm/fpsimd.c           | 173 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/guest.c            |  16 ++--
 4 files changed, 195 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 022b9585e6f6..a5ed0433edc6 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -540,6 +540,7 @@ struct kvm_vcpu_arch { FP_STATE_FREE, FP_STATE_HOST_OWNED, FP_STATE_GUEST_OWNED, + FP_STATE_USER_OWNED, } fp_state;
/* Configuration flags, set once and for all before the vcpu can run */ @@ -828,6 +829,9 @@ struct kvm_vcpu_arch {
#define vcpu_sme_max_vq(vcpu) vcpu_vec_max_vq(ARM64_VEC_SME, vcpu)
+int vcpu_max_vq(struct kvm_vcpu *vcpu); +void vcpu_fp_guest_to_user(struct kvm_vcpu *vcpu); + #define vcpu_sve_state_size(vcpu) ({ \ size_t __size_ret; \ unsigned int __vcpu_vq; \ @@ -835,7 +839,7 @@ struct kvm_vcpu_arch { if (WARN_ON(!sve_vl_valid((vcpu)->arch.max_vl[ARM64_VEC_SVE]))) { \ __size_ret = 0; \ } else { \ - __vcpu_vq = vcpu_sve_max_vq(vcpu); \ + __vcpu_vq = vcpu_max_vq(vcpu); \ __size_ret = SVE_SIG_REGS_SIZE(__vcpu_vq); \ } \ \ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index e5f75f1f1085..aa7e2031263c 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -376,9 +376,14 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
/* * Default value for the FP state, will be overloaded at load - * time if we support FP (pretty likely) + * time if we support FP (pretty likely). If we support both + * SVE and SME we may have to rewrite between the two VLs, + * default to formatting the registers for userspace access. */ - vcpu->arch.fp_state = FP_STATE_FREE; + if (system_supports_sve() && system_supports_sme()) + vcpu->arch.fp_state = FP_STATE_USER_OWNED; + else + vcpu->arch.fp_state = FP_STATE_FREE;
/* Set up the timer */ kvm_timer_vcpu_init(vcpu); diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c index d9a56a4027a6..a40072e149cf 100644 --- a/arch/arm64/kvm/fpsimd.c +++ b/arch/arm64/kvm/fpsimd.c @@ -14,6 +14,24 @@ #include <asm/kvm_mmu.h> #include <asm/sysreg.h>
+/* We present Z and P to userspace with the maximum of the SVE or SME VL */ +int vcpu_max_vq(struct kvm_vcpu *vcpu) +{ + int sve, sme; + + if (vcpu_has_sve(vcpu)) + sve = vcpu_sve_max_vq(vcpu); + else + sve = 0; + + if (vcpu_has_sme(vcpu)) + sme = vcpu_sme_max_vq(vcpu); + else + sme = 0; + + return max(sve, sme); +} + void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu) { struct task_struct *p = vcpu->arch.parent_task; @@ -65,6 +83,159 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu) return 0; }
+static bool vcpu_fp_user_format_needed(struct kvm_vcpu *vcpu) +{ + /* Only systems with SME need rewrites */ + if (!system_supports_sme()) + return false; + + /* + * If we have both SVE and SME and the two VLs are the same + * then no rewrite is needed. + */ + if (vcpu_has_sve(vcpu) && + (vcpu_sve_max_vq(vcpu) == vcpu_sme_max_vq(vcpu))) + return false; + + return true; +} + +static bool vcpu_sm_active(struct kvm_vcpu *vcpu) +{ + return __vcpu_sys_reg(vcpu, SVCR) & SVCR_SM; +} + +static int vcpu_active_vq(struct kvm_vcpu *vcpu) +{ + if (vcpu_sm_active(vcpu)) + return vcpu_sme_max_vq(vcpu); + else + return vcpu_sve_max_vq(vcpu); +} + +static void *buf_zreg(void *buf, int vq, int reg) +{ + return buf + __SVE_ZREG_OFFSET(vq, reg) - __SVE_ZREGS_OFFSET; +} + +static void *buf_preg(void *buf, int vq, int reg) +{ + return buf + __SVE_PREG_OFFSET(vq, reg) - __SVE_ZREGS_OFFSET; +} + +static void vcpu_rewrite_sve(struct kvm_vcpu *vcpu, int vq_in, int vq_out) +{ + void *new_buf; + int copy_size, i; + + if (WARN_ON_ONCE(vq_in == vq_out)) + return; + + new_buf = kzalloc(vcpu_sve_state_size(vcpu), GFP_KERNEL); + if (!new_buf) + return; + + /* Z registers */ + if (vq_in < vq_out) + copy_size = vq_in * __SVE_VQ_BYTES; + else + copy_size = vq_out * __SVE_VQ_BYTES; + + for (i = 0; i < SVE_NUM_ZREGS; i++) + memcpy(buf_zreg(new_buf, vq_out, i), + buf_zreg(vcpu->arch.sve_state, vq_in, i), + copy_size); + + /* P and FFR, FFR is stored as an additional P */ + copy_size /= 8; + for (i = 0; i <= SVE_NUM_PREGS; i++) + memcpy(buf_preg(new_buf, vq_out, i), + buf_preg(vcpu->arch.sve_state, vq_in, i), + copy_size); + + /* + * Ideally we would unmap the existing SVE buffer and remap + * the new one.
+ */ + memcpy(vcpu->arch.sve_state, new_buf, vcpu_sve_state_size(vcpu)); + kfree(new_buf); +} + +/* + * If both SVE and SME are supported we present userspace with the SVE + * Z, P and FFR registers configured with the larger of the SVE and + * SME vector length, and if we have SME then even without SVE we + * present the V registers via Z. + */ +static void vcpu_fp_user_to_guest(struct kvm_vcpu *vcpu) +{ + if (likely(vcpu->arch.fp_state != FP_STATE_USER_OWNED)) + return; + + if (!vcpu_fp_user_format_needed(vcpu)) { + vcpu->arch.fp_state = FP_STATE_FREE; + return; + } + + if (vcpu_has_sve(vcpu)) { + /* + * The register state is stored in SVE format, rewrite + * from the larger VL to the one the guest is + * currently using. + */ + if (vcpu_active_vq(vcpu) != vcpu_max_vq(vcpu)) + vcpu_rewrite_sve(vcpu, vcpu_max_vq(vcpu), + vcpu_active_vq(vcpu)); + } else { + /* + * A FPSIMD only system will store non-streaming guest + * state in FPSIMD format when running the guest but + * present to userspace via the SVE regset. + */ + if (!vcpu_sm_active(vcpu)) + __sve_to_fpsimd(&vcpu->arch.ctxt.fp_regs, + vcpu->arch.sve_state, + vcpu_sme_max_vq(vcpu)); + } + + vcpu->arch.fp_state = FP_STATE_FREE; +} + +void vcpu_fp_guest_to_user(struct kvm_vcpu *vcpu) +{ + if (vcpu->arch.fp_state == FP_STATE_USER_OWNED) + return; + + if (!vcpu_fp_user_format_needed(vcpu)) + return; + + if (vcpu_has_sve(vcpu)) { + /* + * The register state is stored in SVE format, rewrite + * to the largest VL. + */ + if (vcpu_active_vq(vcpu) != vcpu_max_vq(vcpu)) + vcpu_rewrite_sve(vcpu, vcpu_active_vq(vcpu), + vcpu_max_vq(vcpu)); + } else { + /* + * A FPSIMD only system will store non-streaming guest + * state in FPSIMD format when running the guest but + * present to userspace via the SVE regset, rewrite + * with zero padding. 
+ */ + if (!vcpu_sm_active(vcpu)) { + memset(vcpu->arch.sve_state, 0, + vcpu_sve_state_size(vcpu)); + __fpsimd_to_sve(vcpu->arch.sve_state, + &vcpu->arch.ctxt.fp_regs, + vcpu_sme_max_vq(vcpu)); + } + } + + vcpu->arch.fp_state = FP_STATE_USER_OWNED; +} + /* * Prepare vcpu for saving the host's FPSIMD state and loading the guest's. * The actual loading is done by the FPSIMD access trap taken to hyp. @@ -81,6 +252,8 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
fpsimd_kvm_prepare();
+ vcpu_fp_user_to_guest(vcpu); + /* * We will check TIF_FOREIGN_FPSTATE just before entering the * guest in kvm_arch_vcpu_ctxflush_fp() and override this to diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 1d161fa7c2be..5f2845625c55 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -110,9 +110,9 @@ static int core_reg_size_from_offset(const struct kvm_vcpu *vcpu, u64 off) /* * The KVM_REG_ARM64_SVE regs must be used instead of * KVM_REG_ARM_CORE for accessing the FPSIMD V-registers on - * SVE-enabled vcpus: + * SVE or SME enabled vcpus: */ - if (vcpu_has_sve(vcpu) && core_reg_offset_is_vreg(off)) + if (vcpu_has_vec(vcpu) && core_reg_offset_is_vreg(off)) return -EINVAL;
return size; @@ -462,20 +462,20 @@ static int sve_reg_to_region(struct vec_state_reg_region *region, reg_num = (reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
if (reg->id >= zreg_id_min && reg->id <= zreg_id_max) { - if (!vcpu_has_sve(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0) + if (!vcpu_has_vec(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0) return -ENOENT;
- vq = vcpu_sve_max_vq(vcpu); + vq = vcpu_max_vq(vcpu);
reqoffset = SVE_SIG_ZREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET; reqlen = KVM_SVE_ZREG_SIZE; maxlen = SVE_SIG_ZREG_SIZE(vq); } else if (reg->id >= preg_id_min && reg->id <= preg_id_max) { - if (!vcpu_has_sve(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0) + if (!vcpu_has_vec(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0) return -ENOENT;
- vq = vcpu_sve_max_vq(vcpu); + vq = vcpu_max_vq(vcpu);
reqoffset = SVE_SIG_PREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET; @@ -514,6 +514,8 @@ static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) if (!kvm_arm_vcpu_vec_finalized(vcpu)) return -EPERM;
+ vcpu_fp_guest_to_user(vcpu); + if (copy_to_user(uptr, vcpu->arch.sve_state + region.koffset, region.klen) || clear_user(uptr + region.klen, region.upad)) @@ -540,6 +542,8 @@ static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) if (!kvm_arm_vcpu_vec_finalized(vcpu)) return -EPERM;
+ vcpu_fp_guest_to_user(vcpu); + if (copy_from_user(vcpu->arch.sve_state + region.koffset, uptr, region.klen)) return -EFAULT;
The SME ZA matrix is a single SVL*SVL register which is available when PSTATE.ZA is set. We follow the pattern established by the architecture itself and expose this to userspace as a series of horizontal SVE vectors with the streaming mode vector length, using the format already established for the SVE vectors themselves.
For the purposes of exporting to userspace we ignore the value of PSTATE.ZA. If PSTATE.ZA is clear when the guest is run then the guest will need to set it to access ZA, which would cause the value to be cleared. If userspace reads ZA when PSTATE.ZA is clear then it will read whatever stale data was last saved. This removes ordering requirements from userspace, minimising the need for special cases.
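As an illustration of the ZA layout described above, the following sketch computes the size of ZA and the byte offset of each horizontal vector for a given streaming VQ, assuming ZA is stored row-major with no header. svl_bytes(), za_size() and za_row_offset() are hypothetical names for this example; the patch itself uses the ZA_SIG_REGS_SIZE() and ZA_SIG_ZAV_OFFSET() macros.

```c
#include <assert.h>
#include <stddef.h>

#define VQ_BYTES	16	/* one vector quantum is 128 bits */

/* Streaming vector length in bytes for a given VQ. */
static size_t svl_bytes(unsigned int vq)
{
	return (size_t)vq * VQ_BYTES;
}

/* ZA is a square of SVL x SVL bytes. */
static size_t za_size(unsigned int vq)
{
	return svl_bytes(vq) * svl_bytes(vq);
}

/* Byte offset of horizontal vector n within row-major ZA storage. */
static size_t za_row_offset(unsigned int vq, unsigned int n)
{
	/* There are SVL-in-bytes horizontal vectors, each SVL bytes long. */
	assert(n < svl_bytes(vq));

	return (size_t)n * svl_bytes(vq);
}
```

For example, with a streaming VQ of 4 (a 512-bit vector length) ZA occupies 4096 bytes and is exposed as 64 horizontal vectors of 64 bytes each.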
Signed-off-by: Mark Brown broonie@kernel.org
---
 arch/arm64/include/asm/kvm_host.h | 14 ++++++
 arch/arm64/include/uapi/asm/kvm.h | 15 +++++++
 arch/arm64/kvm/guest.c            | 95 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 122 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index a5ed0433edc6..a1aa9471084d 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -846,6 +846,20 @@ void vcpu_fp_guest_to_user(struct kvm_vcpu *vcpu); __size_ret; \ })
+#define vcpu_sme_state_size(vcpu) ({ \ + size_t __size_ret; \ + unsigned int __vcpu_vq; \ + \ + if (WARN_ON(!sve_vl_valid((vcpu)->arch.max_vl[ARM64_VEC_SME]))) { \ + __size_ret = 0; \ + } else { \ + __vcpu_vq = vcpu_sme_max_vq(vcpu); \ + __size_ret = ZA_SIG_REGS_SIZE(__vcpu_vq); \ + } \ + \ + __size_ret; \ +}) + /* * Only use __vcpu_sys_reg/ctxt_sys_reg if you know you want the * memory backed version of a register, and not the one most recently diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index 02642bb96496..00fb2ea4c057 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -356,6 +356,21 @@ struct kvm_arm_counter_offset { /* SME registers */ #define KVM_REG_ARM64_SME (0x17 << KVM_REG_ARM_COPROC_SHIFT)
+#define KVM_ARM64_SME_VQ_MIN __SVE_VQ_MIN +#define KVM_ARM64_SME_VQ_MAX __SVE_VQ_MAX + +/* ZA and ZTn occupy blocks at the following offsets within this range: */ +#define KVM_REG_ARM64_SME_ZA_BASE 0 +#define KVM_REG_ARM64_SME_ZT_BASE 0x600 + +#define KVM_ARM64_SME_MAX_ZAHREG (__SVE_VQ_BYTES * KVM_ARM64_SME_VQ_MAX) + +#define KVM_REG_ARM64_SME_ZAHREG(n, i) \ + (KVM_REG_ARM64 | KVM_REG_ARM64_SME | KVM_REG_ARM64_SME_ZA_BASE | \ + KVM_REG_SIZE_U2048 | \ + (((n) & (KVM_ARM64_SME_MAX_ZAHREG - 1)) << 5) | \ + ((i) & (KVM_ARM64_SVE_MAX_SLICES - 1))) + /* Vector lengths pseudo-register: */ #define KVM_REG_ARM64_SME_VLS (KVM_REG_ARM64 | KVM_REG_ARM64_SME | \ KVM_REG_SIZE_U512 | 0xffff) diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 5f2845625c55..cb38af891387 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -573,22 +573,113 @@ static int set_sme_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) return set_vec_vls(ARM64_VEC_SME, vcpu, reg); }
+/* + * Validate SVE register ID and get sanitised bounds for user/kernel SVE + * register copy + */ +static int sme_reg_to_region(struct vec_state_reg_region *region, + struct kvm_vcpu *vcpu, + const struct kvm_one_reg *reg) +{ + /* reg ID ranges for ZA.H[n] registers */ + unsigned int vq = vcpu_sme_max_vq(vcpu) - 1; + const u64 za_h_max = vq * __SVE_VQ_BYTES; + const u64 zah_id_min = KVM_REG_ARM64_SME_ZAHREG(0, 0); + const u64 zah_id_max = KVM_REG_ARM64_SME_ZAHREG(za_h_max - 1, + SVE_NUM_SLICES - 1); + + unsigned int reg_num; + + unsigned int reqoffset, reqlen; /* User-requested offset and length */ + unsigned int maxlen; /* Maximum permitted length */ + + size_t sme_state_size; + + reg_num = (reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT; + + if (reg->id >= zah_id_min && reg->id <= zah_id_max) { + /* ZA is exposed as SVE vectors ZA.H[n] */ + if (!vcpu_has_sme(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0) + return -ENOENT; + + reqoffset = ZA_SIG_ZAV_OFFSET(vq, reg_num) - + ZA_SIG_REGS_OFFSET; + reqlen = KVM_SVE_ZREG_SIZE; + maxlen = SVE_SIG_ZREG_SIZE(vq); + } else { + return -EINVAL; + } + + sme_state_size = vcpu_sme_state_size(vcpu); + if (WARN_ON(!sme_state_size)) + return -EINVAL; + + region->koffset = array_index_nospec(reqoffset, sme_state_size); + region->klen = min(maxlen, reqlen); + region->upad = reqlen - region->klen; + + return 0; +} + +/* + * ZA is exposed as an array of horizontal vectors with the same + * format as SVE, mirroring the architecture's LDR ZA[Wv, offs], [Xn] + * instruction. + */ + static int get_sme_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) { + int ret; + struct vec_state_reg_region region; + char __user *uptr = (char __user *)reg->addr; + /* Handle the KVM_REG_ARM64_SME_VLS pseudo-reg as a special case: */ if (reg->id == KVM_REG_ARM64_SME_VLS) return get_sme_vls(vcpu, reg);
- return -EINVAL; + /* Try to interpret reg ID as an architectural SVE register... */ + ret = sme_reg_to_region(&region, vcpu, reg); + if (ret) + return ret; + + if (!kvm_arm_vcpu_vec_finalized(vcpu)) + return -EPERM; + + if (copy_to_user(uptr, vcpu->arch.sme_state + region.koffset, + region.klen) || + clear_user(uptr + region.klen, region.upad)) + return -EFAULT; + + return 0; }
static int set_sme_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) { + int ret; + struct vec_state_reg_region region; + char __user *uptr = (char __user *)reg->addr; + /* Handle the KVM_REG_ARM64_SME_VLS pseudo-reg as a special case: */ if (reg->id == KVM_REG_ARM64_SME_VLS) return set_sme_vls(vcpu, reg);
- return -EINVAL; + /* Try to interpret reg ID as an architectural SVE register... */ + ret = sme_reg_to_region(&region, vcpu, reg); + if (ret) + return ret; + + if (!kvm_arm_vcpu_vec_finalized(vcpu)) + return -EPERM; + + if (copy_from_user(vcpu->arch.sme_state + region.koffset, uptr, + region.klen)) + return -EFAULT; + + return 0; } int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) {
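As an illustration of the ID encoding added to the uapi header above, a userspace VMM could compute the register ID for each horizontal ZA vector along the following lines. This is a minimal standalone sketch, not part of the patch: the constants are copied from this patch and from the existing KVM uapi headers so that it builds without them, and the helper names sme_zahreg_id() and sme_za_rows() are hypothetical.

```c
#include <stdint.h>

/* Values mirrored from the KVM uapi headers and from this patch. */
#define KVM_REG_ARM64			0x6000000000000000ULL
#define KVM_REG_SIZE_U2048		0x0080000000000000ULL
#define KVM_REG_ARM_COPROC_SHIFT	16
#define KVM_REG_ARM64_SME		(0x17ULL << KVM_REG_ARM_COPROC_SHIFT)
#define KVM_REG_ARM64_SME_ZA_BASE	0ULL
#define SVE_VQ_BYTES			16	/* bytes per quadword */
#define SVE_VQ_MAX			512
#define KVM_ARM64_SVE_MAX_SLICES	32
#define KVM_ARM64_SME_MAX_ZAHREG	(SVE_VQ_BYTES * SVE_VQ_MAX)

/* Hypothetical helper: register ID for slice i of horizontal ZA vector n. */
static uint64_t sme_zahreg_id(unsigned int n, unsigned int i)
{
	return KVM_REG_ARM64 | KVM_REG_ARM64_SME | KVM_REG_ARM64_SME_ZA_BASE |
	       KVM_REG_SIZE_U2048 |
	       ((uint64_t)(n & (KVM_ARM64_SME_MAX_ZAHREG - 1)) << 5) |
	       (i & (KVM_ARM64_SVE_MAX_SLICES - 1));
}

/* Hypothetical helper: ZA has one horizontal vector per byte of SVL. */
static unsigned int sme_za_rows(unsigned int sme_vq)
{
	return sme_vq * SVE_VQ_BYTES;
}
```

A VMM would iterate n from 0 to sme_za_rows(vq) - 1 with slice 0 (the only slice this patch accepts) and pass each ID to KVM_GET_ONE_REG or KVM_SET_ONE_REG once the vector configuration has been finalised.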
ZT0 is a single register with a refreshingly fixed size of 512 bits which, like ZA, is accessible only when PSTATE.ZA is set. Add support for it to the userspace API. As with ZA we allow the register to be read or written regardless of the state of PSTATE.ZA in order to simplify userspace usage. The value will be reset to 0 whenever PSTATE.ZA changes from 0 to 1. Userspace can read stale values but these are not observable by the guest without manipulation of PSTATE.ZA by userspace.
While there is currently only one ZT register the naming as ZT0 and the instruction encoding clearly leave room for future extensions adding more ZT registers. This encoding can readily support such an extension if one is introduced.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h |  2 ++
 arch/arm64/include/uapi/asm/kvm.h |  2 ++
 arch/arm64/kvm/guest.c            | 13 +++++++++++--
 3 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index a1aa9471084d..6a5002ab8042 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -855,6 +855,8 @@ void vcpu_fp_guest_to_user(struct kvm_vcpu *vcpu); } else { \ __vcpu_vq = vcpu_sme_max_vq(vcpu); \ __size_ret = ZA_SIG_REGS_SIZE(__vcpu_vq); \ + if (system_supports_sme2()) \ + __size_ret += ZT_SIG_REG_SIZE; \ } \ \ __size_ret; \ diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index 00fb2ea4c057..58640aeb88e4 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -371,6 +371,8 @@ struct kvm_arm_counter_offset { (((n) & (KVM_ARM64_SME_MAX_ZAHREG - 1)) << 5) | \ ((i) & (KVM_ARM64_SVE_MAX_SLICES - 1)))
+#define KVM_REG_ARM64_SME_ZTREG_SIZE (512 / 8) + /* Vector lengths pseudo-register: */ #define KVM_REG_ARM64_SME_VLS (KVM_REG_ARM64 | KVM_REG_ARM64_SME | \ KVM_REG_SIZE_U512 | 0xffff) diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index cb38af891387..fba5ff377b8b 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -587,7 +587,6 @@ static int sme_reg_to_region(struct vec_state_reg_region *region, const u64 zah_id_min = KVM_REG_ARM64_SME_ZAHREG(0, 0); const u64 zah_id_max = KVM_REG_ARM64_SME_ZAHREG(za_h_max - 1, SVE_NUM_SLICES - 1); - unsigned int reg_num;
unsigned int reqoffset, reqlen; /* User-requested offset and length */ @@ -598,14 +597,24 @@ static int sme_reg_to_region(struct vec_state_reg_region *region, reg_num = (reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
if (reg->id >= zah_id_min && reg->id <= zah_id_max) { - /* ZA is exposed as SVE vectors ZA.H[n] */ if (!vcpu_has_sme(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0) return -ENOENT;
+ /* ZA is exposed as SVE vectors ZA.H[n] */ reqoffset = ZA_SIG_ZAV_OFFSET(vq, reg_num) - ZA_SIG_REGS_OFFSET; reqlen = KVM_SVE_ZREG_SIZE; maxlen = SVE_SIG_ZREG_SIZE(vq); + } else if (reg->id == KVM_REG_ARM64_SME_ZT_BASE) { + /* ZT0 is exposed as a single 512 bit register */ + if (!vcpu_has_sme2(vcpu) || + (reg->id & SVE_REG_SLICE_MASK) > 0 || + reg_num > 0) + return -ENOENT; + + /* ZT0 is stored after ZA */ + reqlen = KVM_REG_ARM64_SME_ZTREG_SIZE; + maxlen = KVM_REG_ARM64_SME_ZTREG_SIZE; } else { return -EINVAL; }
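The resulting backing store layout can be sketched as below. This is an illustration under stated assumptions rather than the kernel's code: it assumes ZA is stored as consecutive horizontal vectors of SVL bytes each, as ZA_SIG_ZAV_OFFSET() implies, and that ZT0 follows immediately after ZA, as the comment in the hunk above says. All function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define VQ_BYTES	16		/* one quadword, the unit VLs are counted in */
#define ZT0_BYTES	(512 / 8)	/* ZT0 is a fixed 512 bit register */

/* Streaming vector length in bytes for a given number of quadwords. */
static size_t svl_bytes(unsigned int vq)
{
	return (size_t)vq * VQ_BYTES;
}

/* ZA is SVL x SVL bytes: one SVL-byte horizontal vector per row. */
static size_t za_size(unsigned int vq)
{
	return svl_bytes(vq) * svl_bytes(vq);
}

/* Offset of horizontal vector ZA.H[n] from the start of the buffer. */
static size_t za_row_offset(unsigned int vq, unsigned int n)
{
	return svl_bytes(vq) * n;
}

/* Assumed placement: ZT0 directly after the last row of ZA. */
static size_t zt0_offset(unsigned int vq)
{
	return za_size(vq);
}

/* Total allocation, with ZT0 only counted when SME2 is present. */
static size_t sme_state_size(unsigned int vq, bool has_sme2)
{
	return za_size(vq) + (has_sme2 ? ZT0_BYTES : 0);
}
```

For a 512 bit streaming vector length (vq = 4) this gives 64 rows of 64 bytes, so 4096 bytes of ZA with ZT0 at offset 4096.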
As well as a substantial set of features which provide additional instructions there are also two current extensions which add new architectural state, SME2 (which adds ZT) and FA64 (which makes FFR valid in streaming mode SVE). Allow all of these to be configured through writes to the ID registers.
At present the guest support for SME2 and FA64 does not use the values configured here pending clarity on the approach to be taken generally with regards to parsing supported features from ID registers.
We always allocate state for the new architectural state which might be enabled if the host supports it. In the case of FFR this simplifies the already fiddly allocation and is needed when SVE is also supported. In the case of ZT the register is 64 bytes, which is not completely trivial (though small relative to the other SME state), but guests with SME1 only are not expected to be common enough to justify the complication of handling this. If this proves to be a problem we can improve things incrementally.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kvm/sys_regs.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index f908aa3fb606..1ea658615467 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1412,13 +1412,6 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu, val = read_sanitised_ftr_reg(id);
switch (id) { - case SYS_ID_AA64PFR1_EL1: - if (!kvm_has_mte(vcpu->kvm)) - val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE); - - if (!vcpu_has_sme(vcpu)) - val &= ~ID_AA64PFR1_EL1_SME_MASK; - break; case SYS_ID_AA64ISAR1_EL1: if (!vcpu_has_ptrauth(vcpu)) val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA) | @@ -1582,6 +1575,20 @@ static u64 read_sanitised_id_aa64pfr0_el1(struct kvm_vcpu *vcpu, return val; }
+static u64 read_sanitised_id_aa64pfr1_el1(struct kvm_vcpu *vcpu, + const struct sys_reg_desc *rd) +{ + u64 val = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1); + + if (!vcpu_has_sme(vcpu)) + val &= ~ID_AA64PFR1_EL1_SME_MASK; + + if (!kvm_has_mte(vcpu->kvm)) + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE); + + return val; +} + #define ID_REG_LIMIT_FIELD_ENUM(val, reg, field, limit) \ ({ \ u64 __f_val = FIELD_GET(reg##_##field##_MASK, val); \ @@ -2164,7 +2171,14 @@ static const struct sys_reg_desc sys_reg_descs[] = { ID_AA64PFR0_EL1_GIC | ID_AA64PFR0_EL1_AdvSIMD | ID_AA64PFR0_EL1_FP), }, - ID_SANITISED(ID_AA64PFR1_EL1), + { SYS_DESC(SYS_ID_AA64PFR1_EL1), + .access = access_id_reg, + .get_user = get_id_reg, + .set_user = set_id_reg, + .reset = read_sanitised_id_aa64pfr1_el1, + .val = ~(ID_AA64PFR1_EL1_MPAM_frac | + ID_AA64PFR1_EL1_RAS_frac | + ID_AA64PFR1_EL1_MTE), }, ID_UNALLOCATED(4,2), ID_UNALLOCATED(4,3), ID_WRITABLE(ID_AA64ZFR0_EL1, ~ID_AA64ZFR0_EL1_RES0),
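The effect of the new read_sanitised_id_aa64pfr1_el1() can be illustrated with a standalone sketch. The field positions are assumptions stated here rather than taken from the patch text: SME at bits [27:24] (matching the shift of 24 used by the get-reg-list selftest later in the series) and MTE at bits [11:8]. The name sanitise_pfr1() is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ID_AA64PFR1_EL1 field positions, 4 bits each. */
#define PFR1_MTE_MASK	(0xfULL << 8)
#define PFR1_SME_MASK	(0xfULL << 24)

/*
 * Start from the host's sanitised value and hide any feature
 * the guest has not been given.
 */
static uint64_t sanitise_pfr1(uint64_t host_val, bool vcpu_has_sme,
			      bool vm_has_mte)
{
	uint64_t val = host_val;

	if (!vcpu_has_sme)
		val &= ~PFR1_SME_MASK;

	if (!vm_has_mte)
		val &= ~PFR1_MTE_MASK;

	return val;
}
```

With a host value advertising both features, a vCPU created without KVM_ARM_VCPU_SME reads the SME field as zero while the MTE field is left alone.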
Now that everything else is in place allow userspace to enable SME with a vCPU flag KVM_ARM_VCPU_SME similar to that for SVE, finalized with the same KVM_ARM_VCPU_VEC/SVE flag used for SVE.
As with SVE, when the SME feature is enabled we default the vector length to the highest vector length that can be exposed to guests without them being able to observe inconsistencies between the sets of vector lengths supported on different processors, discovering this during host SME enumeration.
When the vectors are finalised we allocate state for both SVE and SME.
Currently SME2 and FA64 are enabled for the guest if supported by the host. Once it is clear how we intend to handle parsing features from ID registers these features should be handled based on the configured ID registers.
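From the VMM side, the flow described above might look roughly like the following sketch. To keep it buildable without the patched headers it only constructs the feature bitmap that would be passed in struct kvm_vcpu_init: the struct here is a local stand-in, the bit numbers are copied from the uapi header as modified by this patch, and request_vec_features() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Feature bit numbers as they appear in the uapi header with this patch. */
#define KVM_ARM_VCPU_SVE	4
#define KVM_ARM_VCPU_SME	8
/* The series describes KVM_ARM_VCPU_VEC as an alias for _SVE. */
#define KVM_ARM_VCPU_VEC	KVM_ARM_VCPU_SVE

/* Local stand-in with the same shape as struct kvm_vcpu_init. */
struct vcpu_init_sketch {
	uint32_t target;
	uint32_t features[7];
};

/* Hypothetical helper: request SVE and/or SME for a new vCPU. */
static void request_vec_features(struct vcpu_init_sketch *init,
				 bool sve, bool sme)
{
	memset(init->features, 0, sizeof(init->features));

	if (sve)
		init->features[0] |= 1U << KVM_ARM_VCPU_SVE;
	if (sme)
		init->features[0] |= 1U << KVM_ARM_VCPU_SME;
}
```

After KVM_ARM_VCPU_INIT with these features a VMM would optionally write KVM_REG_ARM64_SVE_VLS and KVM_REG_ARM64_SME_VLS, then issue a single KVM_ARM_VCPU_FINALIZE with KVM_ARM_VCPU_VEC, which freezes both vector length configurations at once.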
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h |   3 +-
 arch/arm64/include/uapi/asm/kvm.h |   1 +
 arch/arm64/kernel/fpsimd.c        |  28 +++++++++
 arch/arm64/kvm/arm.c              |   7 +++
 arch/arm64/kvm/reset.c            | 120 +++++++++++++++++++++++++++++++++-----
 include/uapi/linux/kvm.h          |   1 +
 6 files changed, 145 insertions(+), 15 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 6a5002ab8042..c054aee8160a 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -38,7 +38,7 @@
#define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
-#define KVM_VCPU_MAX_FEATURES 7 +#define KVM_VCPU_MAX_FEATURES 8 #define KVM_VCPU_VALID_FEATURES (BIT(KVM_VCPU_MAX_FEATURES) - 1)
#define KVM_REQ_SLEEP \ @@ -76,6 +76,7 @@ DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
extern unsigned int __ro_after_init kvm_vec_max_vl[ARM64_VEC_MAX]; int __init kvm_arm_init_sve(void); +int __init kvm_arm_init_sme(void);
u32 __attribute_const__ kvm_target_cpu(void); void kvm_reset_vcpu(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index 58640aeb88e4..0f5e678c3ef3 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -110,6 +110,7 @@ struct kvm_regs { #define KVM_ARM_VCPU_PTRAUTH_ADDRESS 5 /* VCPU uses address authentication */ #define KVM_ARM_VCPU_PTRAUTH_GENERIC 6 /* VCPU uses generic authentication */ #define KVM_ARM_VCPU_HAS_EL2 7 /* Support nested virtualization */ +#define KVM_ARM_VCPU_SME 8 /* enable SME for this CPU */
/* * An alias for _SVE since we finalize VL configuration for both SVE and SME diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index e6a4dd68f62a..05e806a580f0 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -1307,6 +1307,8 @@ void __init sme_setup(void) { struct vl_info *info = &vl_info[ARM64_VEC_SME]; int min_bit, max_bit; + DECLARE_BITMAP(tmp_map, SVE_VQ_MAX); + unsigned long b;
if (!cpus_have_cap(ARM64_SME)) return; @@ -1342,6 +1344,32 @@ void __init sme_setup(void) info->max_vl); pr_info("SME: default vector length %u bytes per vector\n", get_sme_default_vl()); + + /* + * KVM can't cope with any mismatch in supported VLs between + * CPUs, detect any inconsistencies. Unlike SVE it is + * architecturally possible to end up with no VLs. + */ + bitmap_andnot(tmp_map, info->vq_partial_map, info->vq_map, + SVE_VQ_MAX); + + b = find_last_bit(tmp_map, SVE_VQ_MAX); + if (b >= SVE_VQ_MAX) + /* No non-virtualisable VLs found */ + info->max_virtualisable_vl = SVE_VQ_MAX; + else if (WARN_ON(b == SVE_VQ_MAX - 1)) + /* No virtualisable VLs? Architecturally possible... */ + info->max_virtualisable_vl = 0; + else /* b + 1 < SVE_VQ_MAX */ + info->max_virtualisable_vl = sve_vl_from_vq(__bit_to_vq(b + 1)); + + if (info->max_virtualisable_vl > info->max_vl) + info->max_virtualisable_vl = info->max_vl; + + /* KVM decides whether to support mismatched systems. Just warn here: */ + if (sve_max_virtualisable_vl() < sve_max_vl()) + pr_warn("%s: unvirtualisable vector lengths present\n", + info->name); }
#endif /* CONFIG_ARM64_SME */ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index aa7e2031263c..baa4a8074aaf 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -305,6 +305,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_ARM_SVE: r = system_supports_sve(); break; + case KVM_CAP_ARM_SME: + r = system_supports_sme(); + break; case KVM_CAP_ARM_PTRAUTH_ADDRESS: case KVM_CAP_ARM_PTRAUTH_GENERIC: r = system_has_full_ptr_auth(); @@ -2559,6 +2562,10 @@ static __init int kvm_arm_init(void) if (err) return err;
+ err = kvm_arm_init_sme(); + if (err) + return err; + err = kvm_arm_vmid_alloc_init(); if (err) { kvm_err("Failed to initialize VMID allocator.\n"); diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index ab7cd657a73c..56fc91cd8567 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -85,42 +85,125 @@ static void kvm_vcpu_enable_sve(struct kvm_vcpu *vcpu) vcpu_set_flag(vcpu, GUEST_HAS_SVE); }
+int __init kvm_arm_init_sme(void) +{ + if (system_supports_sme()) { + kvm_vec_max_vl[ARM64_VEC_SME] = sme_max_virtualisable_vl(); + + /* + * The get_sve_reg()/set_sve_reg() ioctl interface will need + * to be extended with multiple register slice support in + * order to support vector lengths greater than + * VL_ARCH_MAX: + */ + if (WARN_ON(kvm_vec_max_vl[ARM64_VEC_SME] > VL_ARCH_MAX)) + kvm_vec_max_vl[ARM64_VEC_SME] = VL_ARCH_MAX; + + /* + * Don't even try to make use of vector lengths that + * aren't available on all CPUs, for now: + */ + if (kvm_vec_max_vl[ARM64_VEC_SME] < sme_max_vl()) + pr_warn("KVM: SME vector length for guests limited to %u bytes\n", + kvm_vec_max_vl[ARM64_VEC_SME]); + } + + return 0; +} + +static int kvm_vcpu_enable_sme(struct kvm_vcpu *vcpu) +{ + vcpu->arch.max_vl[ARM64_VEC_SME] = kvm_vec_max_vl[ARM64_VEC_SME]; + + /* + * Userspace can still customize the vector lengths by writing + * KVM_REG_ARM64_SME_VLS. Allocation is deferred until + * kvm_arm_vcpu_finalize(), which freezes the configuration. + */ + vcpu_set_flag(vcpu, GUEST_HAS_SME); + + return 0; +} + + /* - * Finalize vcpu's maximum SVE vector length, allocating - * vcpu->arch.sve_state as necessary. + * Finalize vcpu's maximum SVE/SME vector lengths, allocating + * vcpu->arch.sve_state and vcpu->arch.sme_state as necessary. */ static int kvm_vcpu_finalize_vec(struct kvm_vcpu *vcpu) { - void *buf; unsigned int vl; - size_t reg_sz; + void *sve_buf, *sme_buf; + size_t sve_sz, sme_sz; int ret;
- vl = vcpu->arch.max_vl[ARM64_VEC_SVE]; + /* Both SVE and SME need the SVE state */
- /* + /* * Responsibility for these properties is shared between * kvm_arm_init_sve(), kvm_vcpu_enable_sve() and * set_sve_vls(). Double-check here just to be sure: */ + vl = vcpu->arch.max_vl[ARM64_VEC_SVE]; if (WARN_ON(!sve_vl_valid(vl) || vl > sve_max_virtualisable_vl() || vl > VL_ARCH_MAX)) return -EIO;
- reg_sz = vcpu_sve_state_size(vcpu); - buf = kzalloc(reg_sz, GFP_KERNEL_ACCOUNT); - if (!buf) + sve_sz = vcpu_sve_state_size(vcpu); + sve_buf = kzalloc(sve_sz, GFP_KERNEL_ACCOUNT); + if (!sve_buf) return -ENOMEM;
- ret = kvm_share_hyp(buf, buf + reg_sz); + ret = kvm_share_hyp(sve_buf, sve_buf + sve_sz); if (ret) { - kfree(buf); + kfree(sve_buf); return ret; } + + if (vcpu_has_sme(vcpu)) { + vl = vcpu->arch.max_vl[ARM64_VEC_SME]; + if (WARN_ON(!sve_vl_valid(vl) || + vl > sme_max_virtualisable_vl() || + vl > VL_ARCH_MAX)) { + ret = -EIO; + goto free_sve; + } + + sme_sz = vcpu_sme_state_size(vcpu); + sme_buf = kzalloc(sme_sz, GFP_KERNEL_ACCOUNT); + if (!sme_buf) { + ret = -ENOMEM; + goto free_sve; + } + + ret = kvm_share_hyp(sme_buf, sme_buf + sme_sz); + if (ret) { + kfree(sme_buf); + goto free_sve; + } + + vcpu->arch.sme_state = sme_buf; + + /* + * These are expected to be parsed from ID registers + * once the general approach to doing that is worked + * out. + */ + if (system_supports_sme2()) + vcpu_set_flag(vcpu, GUEST_HAS_SME2); + if (system_supports_fa64()) + vcpu_set_flag(vcpu, GUEST_HAS_FA64); + } - vcpu->arch.sve_state = buf; + vcpu->arch.sve_state = sve_buf; + vcpu_set_flag(vcpu, VCPU_VEC_FINALIZED); return 0; + +free_sve: + kvm_unshare_hyp(sve_buf, sve_buf + sve_sz); + kfree(sve_buf); + return ret; }
int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature) @@ -150,19 +233,25 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu) void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu) { void *sve_state = vcpu->arch.sve_state; + void *sme_state = vcpu->arch.sme_state;
kvm_vcpu_unshare_task_fp(vcpu); kvm_unshare_hyp(vcpu, vcpu + 1); if (sve_state) kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu)); kfree(sve_state); + if (sme_state) + kvm_unshare_hyp(sme_state, sme_state + vcpu_sme_state_size(vcpu)); + kfree(sme_state); kfree(vcpu->arch.ccsidr); }
-static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu) +static void kvm_vcpu_reset_vec(struct kvm_vcpu *vcpu) { if (vcpu_has_sve(vcpu)) memset(vcpu->arch.sve_state, 0, vcpu_sve_state_size(vcpu)); + if (vcpu_has_sme(vcpu)) + memset(vcpu->arch.sme_state, 0, vcpu_sme_state_size(vcpu)); }
static void kvm_vcpu_enable_ptrauth(struct kvm_vcpu *vcpu) @@ -210,8 +299,11 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu) if (!kvm_arm_vcpu_vec_finalized(vcpu)) { if (vcpu_has_feature(vcpu, KVM_ARM_VCPU_SVE)) kvm_vcpu_enable_sve(vcpu); + + if (vcpu_has_feature(vcpu, KVM_ARM_VCPU_SME)) + kvm_vcpu_enable_sme(vcpu); } else { - kvm_vcpu_reset_sve(vcpu); + kvm_vcpu_reset_vec(vcpu); }
if (vcpu_has_feature(vcpu, KVM_ARM_VCPU_PTRAUTH_ADDRESS) || diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 211b86de35ac..42484c0a9051 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1201,6 +1201,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228 #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229 #define KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES 230 +#define KVM_CAP_ARM_SME 231
#ifdef KVM_CAP_IRQ_ROUTING
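The vector length selection added to sme_setup() can be restated in simplified form. The sketch below is not the kernel implementation: it uses plain 32 bit masks in which bit (vq - 1) represents a vector length of vq quadwords, whereas the kernel uses bitmaps with the opposite bit ordering, and the function names are hypothetical. It returns 0 when no vector length is usable, which as the comment in the patch notes is architecturally possible for SME.

```c
#include <stdint.h>

/* Largest vq set in mask, or 0 if the mask is empty. */
static unsigned int highest_vq(uint32_t mask)
{
	unsigned int vq = 0;

	for (unsigned int bit = 0; bit < 32; bit++)
		if (mask & (1U << bit))
			vq = bit + 1;

	return vq;
}

/*
 * all_cpus: vector lengths supported by every CPU.
 * any_cpu:  vector lengths supported by at least one CPU.
 */
static unsigned int max_virtualisable_vq(uint32_t all_cpus, uint32_t any_cpu)
{
	uint32_t mismatched = any_cpu & ~all_cpus;
	unsigned int max_vq = highest_vq(all_cpus);

	for (unsigned int bit = 0; bit < 32; bit++) {
		if (mismatched & (1U << bit)) {
			/* Smallest mismatched vq is bit + 1: stay below it. */
			unsigned int limit = bit;

			return limit < max_vq ? limit : max_vq;
		}
	}

	/* Every CPU agrees, so the full range is safe to expose. */
	return max_vq;
}
```

A guest limited to the returned value cannot select a vector length that exists on only some CPUs, so migrating it between CPUs never changes the set of lengths it can observe.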
SME adds a number of new system registers. Update get-reg-list to check for them based on the visibility of SME.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/kvm/aarch64/get-reg-list.c | 32 +++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/aarch64/get-reg-list.c b/tools/testing/selftests/kvm/aarch64/get-reg-list.c index 709d7d721760..a1c2853388b6 100644 --- a/tools/testing/selftests/kvm/aarch64/get-reg-list.c +++ b/tools/testing/selftests/kvm/aarch64/get-reg-list.c @@ -23,6 +23,18 @@ struct feature_id_reg { };
static struct feature_id_reg feat_id_regs[] = { + { + ARM64_SYS_REG(3, 0, 1, 2, 4), /* SMPRI_EL1 */ + ARM64_SYS_REG(3, 0, 0, 4, 1), /* ID_AA64PFR1_EL1 */ + 24, + 1 + }, + { + ARM64_SYS_REG(3, 0, 1, 2, 6), /* SMCR_EL1 */ + ARM64_SYS_REG(3, 0, 0, 4, 1), /* ID_AA64PFR1_EL1 */ + 24, + 1 + }, { ARM64_SYS_REG(3, 0, 2, 0, 3), /* TCR2_EL1 */ ARM64_SYS_REG(3, 0, 0, 7, 3), /* ID_AA64MMFR3_EL1 */ @@ -40,7 +52,25 @@ static struct feature_id_reg feat_id_regs[] = { ARM64_SYS_REG(3, 0, 0, 7, 3), /* ID_AA64MMFR3_EL1 */ 4, 1 - } + }, + { + ARM64_SYS_REG(3, 1, 0, 0, 6), /* SMIDR_EL1 */ + ARM64_SYS_REG(3, 0, 0, 4, 1), /* ID_AA64PFR1_EL1 */ + 24, + 1 + }, + { + ARM64_SYS_REG(3, 3, 4, 2, 2), /* SVCR */ + ARM64_SYS_REG(3, 0, 0, 4, 1), /* ID_AA64PFR1_EL1 */ + 24, + 1 + }, + { + ARM64_SYS_REG(3, 3, 13, 0, 5), /* TPIDR2_EL0 */ + ARM64_SYS_REG(3, 0, 0, 4, 1), /* ID_AA64PFR1_EL1 */ + 24, + 1 + }, };
bool filter_reg(__u64 reg)
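Each table entry above ties a system register to an ID register field that must be at least a minimum value for the register to be expected in the list. A standalone sketch of that check and of the register ID encoding follows. The constants mirror the arm64 KVM uapi header, while sys_reg_id() and feature_present() are hypothetical names, and the 4 bit field width is an assumption that matches the usual ID register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirrored from the arm64 KVM uapi header. */
#define KVM_REG_ARM64		0x6000000000000000ULL
#define KVM_REG_SIZE_U64	0x0030000000000000ULL
#define KVM_REG_ARM64_SYSREG	(0x0013ULL << 16)

/* Same field packing as the ARM64_SYS_REG() macro used in the table. */
static uint64_t sys_reg_id(unsigned int op0, unsigned int op1,
			   unsigned int crn, unsigned int crm,
			   unsigned int op2)
{
	return KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM64_SYSREG |
	       ((uint64_t)op0 << 14) | ((uint64_t)op1 << 11) |
	       ((uint64_t)crn << 7) | ((uint64_t)crm << 3) | op2;
}

/* True if the field at 'shift' in the ID register value is >= min. */
static bool feature_present(uint64_t id_reg_val, unsigned int shift,
			    unsigned int min)
{
	return ((id_reg_val >> shift) & 0xf) >= min;
}
```

For the SME entries the ID register is ID_AA64PFR1_EL1, the shift is 24 and the minimum is 1, so registers such as SVCR are only expected when the guest's SME field is non-zero.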