From: Yicong Yang yangyicong@hisilicon.com
Armv8.7 introduces single-copy atomic 64-byte load and store instructions and their variants, named under FEAT_{LS64, LS64_V}. Add support for Armv8.7 FEAT_{LS64, LS64_V}:
- Add identification and enabling in the cpufeature list
- Expose support for these features to userspace through HWCAP3 and cpuinfo
- Add a related hwcap test
- Handle the trap of unsupported memory (normal/uncacheable) access in a VM
A real use case for this feature is a userspace driver using these instructions to implement direct WQE (workqueue entry), a mechanism to write WQEs directly into the hardware.
This patchset also complements Marc's v2 patchset [1], which handles LS64* instructions being trapped when they are not advertised to a VM.
[1] https://lore.kernel.org/linux-arm-kernel/20250310122505.2857610-1-maz@kernel...
Tested with updated hwcap test:
On host:
root@localhost:/tmp# dmesg | grep "All CPU(s) started"
[ 0.504846] CPU: All CPU(s) started at EL2
root@localhost:/tmp# ./hwcap
[...]
# LS64 present
ok 217 cpuinfo_match_LS64
ok 218 sigill_LS64
ok 219 # SKIP sigbus_LS64
# LS64_V present
ok 220 cpuinfo_match_LS64_V
ok 221 sigill_LS64_V
ok 222 # SKIP sigbus_LS64_V
# 115 skipped test(s) detected. Consider enabling relevant config options to improve coverage.
# Totals: pass:107 fail:0 xfail:0 xpass:0 skip:115 error:0

On guest:
root@localhost:/# dmesg | grep "All CPU(s) started"
[ 0.205580] CPU: All CPU(s) started at EL1
root@localhost:/mnt# ./hwcap
[...]
# LS64 present
ok 217 cpuinfo_match_LS64
ok 218 sigill_LS64
ok 219 # SKIP sigbus_LS64
# LS64_V present
ok 220 cpuinfo_match_LS64_V
ok 221 sigill_LS64_V
ok 222 # SKIP sigbus_LS64_V
# 115 skipped test(s) detected. Consider enabling relevant config options to improve coverage.
# Totals: pass:107 fail:0 xfail:0 xpass:0 skip:115 error:0
Change since v1:
- Drop the support for LS64_ACCDATA
- Handle the DABT of unsupported memory type after checking the memory attributes
Link: https://lore.kernel.org/linux-arm-kernel/20241202135504.14252-1-yangyicong@h...
Yicong Yang (6):
  arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1
  arm64: Add support for FEAT_{LS64, LS64_V}
  KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
  kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V}
  arm64: Add ESR.DFSC definition of unsupported exclusive or atomic access
  KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory

 Documentation/arch/arm64/booting.rst      | 12 +++
 Documentation/arch/arm64/elf_hwcaps.rst   |  6 ++
 arch/arm64/include/asm/el2_setup.h        | 12 ++-
 arch/arm64/include/asm/esr.h              |  8 ++
 arch/arm64/include/asm/hwcap.h            |  2 +
 arch/arm64/include/asm/kvm_emulate.h      |  7 ++
 arch/arm64/include/uapi/asm/hwcap.h       |  2 +
 arch/arm64/kernel/cpufeature.c            | 51 +++++++++++++
 arch/arm64/kernel/cpuinfo.c               |  2 +
 arch/arm64/kvm/inject_fault.c             | 35 +++++++++
 arch/arm64/kvm/mmu.c                      | 37 +++++++++-
 arch/arm64/tools/cpucaps                  |  2 +
 tools/testing/selftests/arm64/abi/hwcap.c | 90 +++++++++++++++++++++++
 13 files changed, 264 insertions(+), 2 deletions(-)
From: Yicong Yang yangyicong@hisilicon.com
Instructions introduced by FEAT_{LS64, LS64_V} are controlled by HCRX_EL2.{EnALS, EnASR}. Configure both of these to allow usage at EL0/1.
This doesn't mean these instructions are always available at EL0/1 when the CPU provides them. The hypervisor still has control at runtime.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com --- arch/arm64/include/asm/el2_setup.h | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index ebceaae3c749..0259941602c4 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -57,9 +57,19 @@ /* Enable GCS if supported */ mrs_s x1, SYS_ID_AA64PFR1_EL1 ubfx x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4 - cbz x1, .Lset_hcrx_@ + cbz x1, .Lskip_gcs_hcrx_@ orr x0, x0, #HCRX_EL2_GCSEn
+.Lskip_gcs_hcrx_@: + /* Enable LS64, LS64_V if supported */ + mrs_s x1, SYS_ID_AA64ISAR1_EL1 + ubfx x1, x1, #ID_AA64ISAR1_EL1_LS64_SHIFT, #4 + cbz x1, .Lset_hcrx_@ + orr x0, x0, #HCRX_EL2_EnALS + cmp x1, #ID_AA64ISAR1_EL1_LS64_LS64_V + b.lt .Lset_hcrx_@ + orr x0, x0, #HCRX_EL2_EnASR + .Lset_hcrx_@: msr_s SYS_HCRX_EL2, x0 .Lskip_hcrx_@:
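For readers who find the branch structure of the assembly above hard to follow, here is a rough C rendering of the same decision. It is only a sketch: hcrx_enable_ls64() is a hypothetical helper that does not exist in the kernel, the LS64 field position (bits [63:60]) is taken from the Arm ARM, and the EnALS/EnASR bit positions are the ones documented in the booting.rst hunk of this series.

```c
#include <stdint.h>

/* ID_AA64ISAR1_EL1.LS64 occupies bits [63:60] (per the Arm ARM). */
#define ID_AA64ISAR1_EL1_LS64_SHIFT	60
#define ID_AA64ISAR1_EL1_LS64_LS64_V	0x2	/* 0b0010: LS64 plus ST64BV */

/* Bit positions as documented in the booting.rst hunk of this series. */
#define HCRX_EL2_EnALS	(UINT64_C(1) << 1)
#define HCRX_EL2_EnASR	(UINT64_C(1) << 2)

/*
 * Hypothetical helper, for illustration only: return hcrx with the
 * LS64 enable bits OR-ed in according to the ID register value.
 */
static inline uint64_t hcrx_enable_ls64(uint64_t hcrx, uint64_t isar1)
{
	/* ubfx x1, x1, #ID_AA64ISAR1_EL1_LS64_SHIFT, #4 */
	uint64_t ls64 = (isar1 >> ID_AA64ISAR1_EL1_LS64_SHIFT) & 0xf;

	if (ls64 == 0)				/* cbz x1, .Lset_hcrx_@ */
		return hcrx;

	hcrx |= HCRX_EL2_EnALS;			/* LD64B/ST64B allowed */

	if (ls64 < ID_AA64ISAR1_EL1_LS64_LS64_V)	/* b.lt .Lset_hcrx_@ */
		return hcrx;

	return hcrx | HCRX_EL2_EnASR;		/* ST64BV allowed too */
}
```

Any field value at or above LS64_V sets both bits, which matches the cmp/b.lt pair in the patch.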
On 31/03/2025 10:43, Yicong Yang wrote:
From: Yicong Yang yangyicong@hisilicon.com
Instructions introduced by FEAT_{LS64, LS64_V} is controlled by HCRX_EL2.{EnALS, EnASR}. Configure all of these to allow usage at EL0/1.
This doesn't mean these instructions are always available in EL0/1 if provided. The hypervisor still have the control at runtime.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com
arch/arm64/include/asm/el2_setup.h | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index ebceaae3c749..0259941602c4 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -57,9 +57,19 @@ /* Enable GCS if supported */ mrs_s x1, SYS_ID_AA64PFR1_EL1 ubfx x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4
- cbz x1, .Lset_hcrx_@
+ cbz x1, .Lskip_gcs_hcrx_@
 orr x0, x0, #HCRX_EL2_GCSEn
+.Lskip_gcs_hcrx_@:
minor nit: For consistency, could we rename this "set_ls64", similar to "set_hcrx" ?
+ /* Enable LS64, LS64_V if supported */
+ mrs_s x1, SYS_ID_AA64ISAR1_EL1
+ ubfx x1, x1, #ID_AA64ISAR1_EL1_LS64_SHIFT, #4
+ cbz x1, .Lset_hcrx_@
+ orr x0, x0, #HCRX_EL2_EnALS
+ cmp x1, #ID_AA64ISAR1_EL1_LS64_LS64_V
+ b.lt .Lset_hcrx_@
+ orr x0, x0, #HCRX_EL2_EnASR
+
 .Lset_hcrx_@:
 msr_s SYS_HCRX_EL2, x0
 .Lskip_hcrx_@:
Suzuki
On 2025/4/3 17:04, Suzuki K Poulose wrote:
On 31/03/2025 10:43, Yicong Yang wrote:
From: Yicong Yang yangyicong@hisilicon.com
Instructions introduced by FEAT_{LS64, LS64_V} is controlled by HCRX_EL2.{EnALS, EnASR}. Configure all of these to allow usage at EL0/1.
This doesn't mean these instructions are always available in EL0/1 if provided. The hypervisor still have the control at runtime.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com
arch/arm64/include/asm/el2_setup.h | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index ebceaae3c749..0259941602c4 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -57,9 +57,19 @@ /* Enable GCS if supported */ mrs_s x1, SYS_ID_AA64PFR1_EL1 ubfx x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4 - cbz x1, .Lset_hcrx_@ + cbz x1, .Lskip_gcs_hcrx_@ orr x0, x0, #HCRX_EL2_GCSEn +.Lskip_gcs_hcrx_@:
minor nit: For consistency, could we rename this "set_ls64", similar to "set_hcrx" ?
IIUC, set_xxx really touches the registers, while skip_xxx should just check and prepare the feature bits. So using .Lskip_gcs_hcrx_@ here should be more appropriate and consistent with other places in el2_setup.h, like __init_el2_debug/__init_el2_fgt, which also use .Lskip_xxx for skipping an unsupported feature?
Thanks.
+ /* Enable LS64, LS64_V if supported */ + mrs_s x1, SYS_ID_AA64ISAR1_EL1 + ubfx x1, x1, #ID_AA64ISAR1_EL1_LS64_SHIFT, #4 + cbz x1, .Lset_hcrx_@ + orr x0, x0, #HCRX_EL2_EnALS + cmp x1, #ID_AA64ISAR1_EL1_LS64_LS64_V + b.lt .Lset_hcrx_@ + orr x0, x0, #HCRX_EL2_EnASR
.Lset_hcrx_@: msr_s SYS_HCRX_EL2, x0 .Lskip_hcrx_@:
Suzuki
From: Yicong Yang yangyicong@hisilicon.com
Armv8.7 introduces single-copy atomic 64-byte load and store instructions and their variants, named under FEAT_{LS64, LS64_V}. These features are identified by ID_AA64ISAR1_EL1.LS64, and the use of such instructions in userspace (EL0) can be trapped. In order to support the use of the corresponding instructions in userspace:
- Make ID_AA64ISAR1_EL1.LS64 visible to userspace
- Add identification and enabling in the cpufeature list
- Expose support for these features to userspace through HWCAP3 and cpuinfo
Signed-off-by: Yicong Yang yangyicong@hisilicon.com --- Documentation/arch/arm64/booting.rst | 12 ++++++ Documentation/arch/arm64/elf_hwcaps.rst | 6 +++ arch/arm64/include/asm/hwcap.h | 2 + arch/arm64/include/uapi/asm/hwcap.h | 2 + arch/arm64/kernel/cpufeature.c | 51 +++++++++++++++++++++++++ arch/arm64/kernel/cpuinfo.c | 2 + arch/arm64/tools/cpucaps | 2 + 7 files changed, 77 insertions(+)
diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst index dee7b6de864f..cd26e6326146 100644 --- a/Documentation/arch/arm64/booting.rst +++ b/Documentation/arch/arm64/booting.rst @@ -483,6 +483,18 @@ Before jumping into the kernel, the following conditions must be met:
- MDCR_EL3.TPM (bit 6) must be initialized to 0b0
+ For CPUs support for 64-byte loads and stores without status (FEAT_LS64): + + - If the kernel is entered at EL1 and EL2 is present: + + - HCRX_EL2.EnALS (bit 1) must be initialised to 0b1. + + For CPUs support for 64-byte loads and stores with status (FEAT_LS64_V): + + - If the kernel is entered at EL1 and EL2 is present: + + - HCRX_EL2.EnASR (bit 2) must be initialised to 0b1. + The requirements described above for CPU mode, caches, MMUs, architected timers, coherency and system registers apply to all CPUs. All CPUs must enter the kernel in the same exception level. Where the values documented diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst index 69d7afe56853..9e6db258ff48 100644 --- a/Documentation/arch/arm64/elf_hwcaps.rst +++ b/Documentation/arch/arm64/elf_hwcaps.rst @@ -435,6 +435,12 @@ HWCAP2_SME_SF8DP4 HWCAP2_POE Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
+HWCAP3_LS64 + Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0001. + +HWCAP3_LS64_V + Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0010. + 4. Unused AT_HWCAP bits -----------------------
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h index 1c3f9617d54f..f45ab66d3466 100644 --- a/arch/arm64/include/asm/hwcap.h +++ b/arch/arm64/include/asm/hwcap.h @@ -176,6 +176,8 @@ #define KERNEL_HWCAP_POE __khwcap2_feature(POE)
#define __khwcap3_feature(x) (const_ilog2(HWCAP3_ ## x) + 128) +#define KERNEL_HWCAP_LS64 __khwcap3_feature(LS64) +#define KERNEL_HWCAP_LS64_V __khwcap3_feature(LS64_V)
/* * This yields a mask that user programs can use to figure out what diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h index 705a7afa8e58..88579dad778d 100644 --- a/arch/arm64/include/uapi/asm/hwcap.h +++ b/arch/arm64/include/uapi/asm/hwcap.h @@ -143,5 +143,7 @@ /* * HWCAP3 flags - for AT_HWCAP3 */ +#define HWCAP3_LS64 (1UL << 0) +#define HWCAP3_LS64_V (1UL << 1)
#endif /* _UAPI__ASM_HWCAP_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 9c4d6d552b25..61d9d9959269 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -231,6 +231,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar0[] = { };
static const struct arm64_ftr_bits ftr_id_aa64isar1[] = { + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_LS64_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_XS_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_I8MM_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_DGH_SHIFT, 4, 0), @@ -2278,6 +2279,38 @@ static void cpu_enable_e0pd(struct arm64_cpu_capabilities const *cap) } #endif /* CONFIG_ARM64_E0PD */
+static bool has_ls64(const struct arm64_cpu_capabilities *entry, int __unused) +{ + u64 ls64; + + ls64 = cpuid_feature_extract_field(__read_sysreg_by_encoding(entry->sys_reg), + entry->field_pos, entry->sign); + + if (ls64 == ID_AA64ISAR1_EL1_LS64_NI || + ls64 > ID_AA64ISAR1_EL1_LS64_LS64_ACCDATA) + return false; + + if (entry->capability == ARM64_HAS_LS64 && + ls64 >= ID_AA64ISAR1_EL1_LS64_LS64) + return true; + + if (entry->capability == ARM64_HAS_LS64_V && + ls64 >= ID_AA64ISAR1_EL1_LS64_LS64_V) + return true; + + return false; +} + +static void cpu_enable_ls64(struct arm64_cpu_capabilities const *cap) +{ + sysreg_clear_set(sctlr_el1, SCTLR_EL1_EnALS, SCTLR_EL1_EnALS); +} + +static void cpu_enable_ls64_v(struct arm64_cpu_capabilities const *cap) +{ + sysreg_clear_set(sctlr_el1, SCTLR_EL1_EnASR, SCTLR_EL1_EnASR); +} + #ifdef CONFIG_ARM64_PSEUDO_NMI static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry, int scope) @@ -3041,6 +3074,22 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_pmuv3, }, #endif + { + .desc = "LS64", + .capability = ARM64_HAS_LS64, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .matches = has_ls64, + .cpu_enable = cpu_enable_ls64, + ARM64_CPUID_FIELDS(ID_AA64ISAR1_EL1, LS64, LS64) + }, + { + .desc = "LS64_V", + .capability = ARM64_HAS_LS64_V, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .matches = has_ls64, + .cpu_enable = cpu_enable_ls64_v, + ARM64_CPUID_FIELDS(ID_AA64ISAR1_EL1, LS64, LS64_V) + }, {}, };
@@ -3153,6 +3202,8 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { HWCAP_CAP(ID_AA64ISAR1_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_EBF16), HWCAP_CAP(ID_AA64ISAR1_EL1, DGH, IMP, CAP_HWCAP, KERNEL_HWCAP_DGH), HWCAP_CAP(ID_AA64ISAR1_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_I8MM), + HWCAP_CAP(ID_AA64ISAR1_EL1, LS64, LS64, CAP_HWCAP, KERNEL_HWCAP_LS64), + HWCAP_CAP(ID_AA64ISAR1_EL1, LS64, LS64_V, CAP_HWCAP, KERNEL_HWCAP_LS64_V), HWCAP_CAP(ID_AA64ISAR2_EL1, LUT, IMP, CAP_HWCAP, KERNEL_HWCAP_LUT), HWCAP_CAP(ID_AA64ISAR3_EL1, FAMINMAX, IMP, CAP_HWCAP, KERNEL_HWCAP_FAMINMAX), HWCAP_CAP(ID_AA64MMFR2_EL1, AT, IMP, CAP_HWCAP, KERNEL_HWCAP_USCAT), diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index 285d7d538342..a56c313e6474 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -81,6 +81,8 @@ static const char *const hwcap_str[] = { [KERNEL_HWCAP_PACA] = "paca", [KERNEL_HWCAP_PACG] = "pacg", [KERNEL_HWCAP_GCS] = "gcs", + [KERNEL_HWCAP_LS64] = "ls64", + [KERNEL_HWCAP_LS64_V] = "ls64_v", [KERNEL_HWCAP_DCPODP] = "dcpodp", [KERNEL_HWCAP_SVE2] = "sve2", [KERNEL_HWCAP_SVEAES] = "sveaes", diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index 772c1b008e43..4a6a508073d3 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -42,6 +42,8 @@ HAS_HCX HAS_LDAPR HAS_LPA2 HAS_LSE_ATOMICS +HAS_LS64 +HAS_LS64_V HAS_MOPS HAS_NESTED_VIRT HAS_PAN
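As an illustration of how userspace might consume the new hwcaps, the sketch below reads AT_HWCAP3 and decodes the two bits. The HWCAP3_LS64/HWCAP3_LS64_V values are the ones proposed in this v2 and are an assumption here (the values that end up in the merged uapi header may differ); struct ls64_caps and both helpers are hypothetical names, and the AT_HWCAP3 fallback of 29 is the value from the generic Linux auxvec.h for toolchains whose headers predate it.

```c
#include <stdbool.h>
#include <sys/auxv.h>

#ifndef AT_HWCAP3
#define AT_HWCAP3	29	/* from the generic Linux auxvec.h */
#endif

/* Values as proposed in this series; assumed, not necessarily final. */
#ifndef HWCAP3_LS64
#define HWCAP3_LS64	(1UL << 0)
#endif
#ifndef HWCAP3_LS64_V
#define HWCAP3_LS64_V	(1UL << 1)
#endif

/* Hypothetical container for the decoded capabilities. */
struct ls64_caps {
	bool ls64;	/* LD64B/ST64B usable at EL0 */
	bool ls64_v;	/* ST64BV usable at EL0 */
};

/* Decode a raw AT_HWCAP3 word. */
static inline struct ls64_caps ls64_caps_decode(unsigned long hwcap3)
{
	struct ls64_caps caps = {
		.ls64   = (hwcap3 & HWCAP3_LS64) != 0,
		.ls64_v = (hwcap3 & HWCAP3_LS64_V) != 0,
	};

	return caps;
}

/* getauxval() returns 0 when the kernel does not provide AT_HWCAP3. */
static inline struct ls64_caps ls64_caps_probe(void)
{
	return ls64_caps_decode(getauxval(AT_HWCAP3));
}
```

A driver would call ls64_caps_probe() once at start-up and fall back to ordinary stores when .ls64 is false, rather than relying on catching SIGILL.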
From: Yicong Yang yangyicong@hisilicon.com
Using FEAT_{LS64, LS64_V} instructions in a guest is also controlled by HCRX_EL2.{EnALS, EnASR}. Enable them if the guest has the related feature.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com --- arch/arm64/include/asm/kvm_emulate.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index d7cf66573aca..9165fcf719ab 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -684,6 +684,12 @@ static inline void vcpu_set_hcrx(struct kvm_vcpu *vcpu)
if (kvm_has_fpmr(kvm)) vcpu->arch.hcrx_el2 |= HCRX_EL2_EnFPM; + + if (kvm_has_feat(kvm, ID_AA64ISAR1_EL1, LS64, LS64)) + vcpu->arch.hcrx_el2 |= HCRX_EL2_EnALS; + + if (kvm_has_feat(kvm, ID_AA64ISAR1_EL1, LS64, LS64_V)) + vcpu->arch.hcrx_el2 |= HCRX_EL2_EnASR; } } #endif /* __ARM64_KVM_EMULATE_H__ */
From: Yicong Yang yangyicong@hisilicon.com
Add tests for FEAT_{LS64, LS64_V}. Issue the related instructions if the feature is present; no SIGILL should be received. When such instructions operate on Device memory or non-cacheable memory, we may receive a SIGBUS during the test (w/o FEAT_LS64WB). Just ignore it, since we only test whether the instruction itself can be issued as expected on platforms declaring support for these features.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com --- tools/testing/selftests/arm64/abi/hwcap.c | 90 +++++++++++++++++++++++ 1 file changed, 90 insertions(+)
diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c index 35f521e5f41c..e1724f038cc1 100644 --- a/tools/testing/selftests/arm64/abi/hwcap.c +++ b/tools/testing/selftests/arm64/abi/hwcap.c @@ -11,6 +11,8 @@ #include <stdlib.h> #include <string.h> #include <unistd.h> +#include <linux/auxvec.h> +#include <linux/compiler.h> #include <sys/auxv.h> #include <sys/prctl.h> #include <asm/hwcap.h> @@ -578,6 +580,78 @@ static void lrcpc3_sigill(void) : "=r" (data0), "=r" (data1) : "r" (src) :); }
+static void ignore_signal(int sig, siginfo_t *info, void *context) +{ + ucontext_t *uc = context; + + uc->uc_mcontext.pc += 4; +} + +static void ls64_sigill(void) +{ + struct sigaction ign, old; + char src[64] __aligned(64) = { 1 }; + + /* + * LS64, LS64_V require target memory to be Device/Non-cacheable (if + * FEAT_LS64WB not supported) and the completer supports these + * instructions, otherwise we'll receive a SIGBUS. Since we are only + * testing the ABI here, so just ignore the SIGBUS and see if we can + * execute the instructions without receiving a SIGILL. Restore the + * handler of SIGBUS after this test. + */ + ign.sa_sigaction = ignore_signal; + ign.sa_flags = SA_SIGINFO | SA_RESTART; + sigemptyset(&ign.sa_mask); + sigaction(SIGBUS, &ign, &old); + + register void *xn asm ("x8") = src; + register u64 xt_1 asm ("x0"); + register u64 __maybe_unused xt_2 asm ("x1"); + register u64 __maybe_unused xt_3 asm ("x2"); + register u64 __maybe_unused xt_4 asm ("x3"); + register u64 __maybe_unused xt_5 asm ("x4"); + register u64 __maybe_unused xt_6 asm ("x5"); + register u64 __maybe_unused xt_7 asm ("x6"); + register u64 __maybe_unused xt_8 asm ("x7"); + + /* LD64B x0, [x8] */ + asm volatile(".inst 0xf83fd100" : "=r" (xt_1) : "r" (xn)); + + /* ST64B x0, [x8] */ + asm volatile(".inst 0xf83f9100" : : "r" (xt_1), "r" (xn)); + + sigaction(SIGBUS, &old, NULL); +} + +static void ls64_v_sigill(void) +{ + struct sigaction ign, old; + char dst[64] __aligned(64); + + /* See comment in ls64_sigill() */ + ign.sa_sigaction = ignore_signal; + ign.sa_flags = SA_SIGINFO | SA_RESTART; + sigemptyset(&ign.sa_mask); + sigaction(SIGBUS, &ign, &old); + + register void *xn asm ("x8") = dst; + register u64 xt_1 asm ("x0") = 1; + register u64 __maybe_unused xt_2 asm ("x1") = 2; + register u64 __maybe_unused xt_3 asm ("x2") = 3; + register u64 __maybe_unused xt_4 asm ("x3") = 4; + register u64 __maybe_unused xt_5 asm ("x4") = 5; + register u64 __maybe_unused xt_6 asm ("x5") = 6; + register u64 __maybe_unused xt_7 asm ("x6") = 7; + register u64 __maybe_unused xt_8 asm ("x7") = 8; + register u64 st asm ("x9"); + + /* ST64BV x9, x0, [x8] */ + asm volatile(".inst 0xf829b100" : "=r" (st) : "r" (xt_1), "r" (xn)); + + sigaction(SIGBUS, &old, NULL); +} + static const struct hwcap_data { const char *name; unsigned long at_hwcap; @@ -1098,6 +1172,22 @@ static const struct hwcap_data { .sigill_fn = hbc_sigill, .sigill_reliable = true, }, + { + .name = "LS64", + .at_hwcap = AT_HWCAP3, + .hwcap_bit = HWCAP3_LS64, + .cpuinfo = "ls64", + .sigill_fn = ls64_sigill, + .sigill_reliable = true, + }, + { + .name = "LS64_V", + .at_hwcap = AT_HWCAP3, + .hwcap_bit = HWCAP3_LS64_V, + .cpuinfo = "ls64_v", + .sigill_fn = ls64_v_sigill, + .sigill_reliable = true, + }, };
typedef void (*sighandler_fn)(int, siginfo_t *, void *);
From: Yicong Yang yangyicong@hisilicon.com
0x35 indicates an IMPLEMENTATION DEFINED fault for Unsupported Exclusive or Atomic access. Add the ESR_ELx_FSC definition and a corresponding wrapper.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com --- arch/arm64/include/asm/esr.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h index d1b1a33f9a8b..2f357442f646 100644 --- a/arch/arm64/include/asm/esr.h +++ b/arch/arm64/include/asm/esr.h @@ -121,6 +121,7 @@ #define ESR_ELx_FSC_SEA_TTW(n) (0x14 + (n)) #define ESR_ELx_FSC_SECC (0x18) #define ESR_ELx_FSC_SECC_TTW(n) (0x1c + (n)) +#define ESR_ELx_FSC_EXCL_ATOMIC (0x35)
/* Status codes for individual page table levels */ #define ESR_ELx_FSC_ACCESS_L(n) (ESR_ELx_FSC_ACCESS + (n)) @@ -464,6 +465,13 @@ static inline bool esr_fsc_is_access_flag_fault(unsigned long esr) (esr == ESR_ELx_FSC_ACCESS_L(0)); }
+static inline bool esr_fsc_is_excl_atomic_fault(unsigned long esr) +{ + esr = esr & ESR_ELx_FSC; + + return esr == ESR_ELx_FSC_EXCL_ATOMIC; +} + /* Indicate whether ESR.EC==0x1A is for an ERETAx instruction */ static inline bool esr_iss_is_eretax(unsigned long esr) {
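To make the new check concrete, here is a standalone sketch of the same test applied to raw syndrome values. It mirrors the helper added in the hunk above but uses plain stdint types so it can be compiled outside the kernel; the 0x3f mask for ESR_ELx_FSC (bits [5:0]) is taken from the existing kernel definition, and the sample syndrome values in the comment are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESR_ELx_FSC		UINT64_C(0x3f)	/* fault status code, bits [5:0] */
#define ESR_ELx_FSC_EXCL_ATOMIC	UINT64_C(0x35)

/* Userspace-compilable copy of the helper from the patch. */
static inline bool esr_fsc_is_excl_atomic_fault(uint64_t esr)
{
	return (esr & ESR_ELx_FSC) == ESR_ELx_FSC_EXCL_ATOMIC;
}

/*
 * Illustrative syndromes (EC = 0x24, data abort from a lower EL, IL = 1):
 *   0x92000035 has FSC 0x35 -> unsupported Exclusive or atomic access
 *   0x92000007 has FSC 0x07 -> translation fault, level 3
 */
```

Only the low six bits take part in the comparison, so the EC and IL fields of the syndrome do not affect the result.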
On Mon, Mar 31, 2025 at 05:43:19PM +0800, Yicong Yang wrote:
From: Yicong Yang yangyicong@hisilicon.com
0x35 indicates IMPLEMENTATION DEFINED fault for Unsupported Exclusive or Atomic access. Add ESR_ELx_FSC definition and corresponding wrapper.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com
Just squash this into the next patch. Adding a helper w/o any user usually isn't a good idea.
Thanks, Oliver
On 2025/4/2 0:15, Oliver Upton wrote:
On Mon, Mar 31, 2025 at 05:43:19PM +0800, Yicong Yang wrote:
From: Yicong Yang yangyicong@hisilicon.com
0x35 indicates IMPLEMENTATION DEFINED fault for Unsupported Exclusive or Atomic access. Add ESR_ELx_FSC definition and corresponding wrapper.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com
Just squash this into the next patch. Adding a helper w/o any user usually isn't a good idea.
sure, will squash this into the next patch.
Thanks.
From: Yicong Yang yangyicong@hisilicon.com
If FEAT_LS64WB is not supported, FEAT_LS64* instructions can only access Device/Uncacheable memory; otherwise a data abort for unsupported Exclusive or atomic access (0x35) is generated per the spec. The exception level the abort is routed to is implementation defined, and an implementation may route it to EL2 on a VHE VM. Per DDI0487K.a Section C3.2.12.2 Single-copy atomic 64-byte load/store:
The check is performed against the resulting memory type after all enabled stages of translation. In this case the fault is reported at the final enabled stage of translation.
If the implementation reports the DABT at the final enabled stage (stage-2), inject a DABT into the guest to handle it.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com --- arch/arm64/include/asm/kvm_emulate.h | 1 + arch/arm64/kvm/inject_fault.c | 35 ++++++++++++++++++++++++++ arch/arm64/kvm/mmu.c | 37 +++++++++++++++++++++++++++- 3 files changed, 72 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 9165fcf719ab..96701b19982e 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -47,6 +47,7 @@ void kvm_skip_instr32(struct kvm_vcpu *vcpu); void kvm_inject_undefined(struct kvm_vcpu *vcpu); void kvm_inject_vabt(struct kvm_vcpu *vcpu); void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr); +void kvm_inject_dabt_excl_atomic(struct kvm_vcpu *vcpu, unsigned long addr); void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr); void kvm_inject_size_fault(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c index a640e839848e..1c3643126b92 100644 --- a/arch/arm64/kvm/inject_fault.c +++ b/arch/arm64/kvm/inject_fault.c @@ -171,6 +171,41 @@ void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr) inject_abt64(vcpu, false, addr); }
+/** + * kvm_inject_dabt_excl_atomic - inject a data abort for unsupported exclusive + * or atomic access + * @vcpu: The VCPU to receive the data abort + * @addr: The address to report in the DFAR + * + * It is assumed that this code is called from the VCPU thread and that the + * VCPU therefore is not currently executing guest code. + */ +void kvm_inject_dabt_excl_atomic(struct kvm_vcpu *vcpu, unsigned long addr) +{ + unsigned long cpsr = *vcpu_cpsr(vcpu); + u64 esr = 0; + + pend_sync_exception(vcpu); + + if (kvm_vcpu_trap_il_is32bit(vcpu)) + esr |= ESR_ELx_IL; + + if ((cpsr & PSR_MODE_MASK) == PSR_MODE_EL0t) + esr |= ESR_ELx_EC_DABT_LOW << ESR_ELx_EC_SHIFT; + else + esr |= ESR_ELx_EC_DABT_CUR << ESR_ELx_EC_SHIFT; + + esr |= ESR_ELx_FSC_EXCL_ATOMIC; + + if (match_target_el(vcpu, unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC))) { + vcpu_write_sys_reg(vcpu, addr, FAR_EL1); + vcpu_write_sys_reg(vcpu, esr, ESR_EL1); + } else { + vcpu_write_sys_reg(vcpu, addr, FAR_EL2); + vcpu_write_sys_reg(vcpu, esr, ESR_EL2); + } +} + /** * kvm_inject_pabt - inject a prefetch abort into the guest * @vcpu: The VCPU to receive the prefetch abort diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 2feb6c6b63af..7ae4e48fd040 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1658,6 +1658,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (exec_fault && device) return -ENOEXEC;
+ if (esr_fsc_is_excl_atomic_fault(kvm_vcpu_get_esr(vcpu))) { + /* + * Target address is normal memory on the Host. We come here + * because: + * 1) Guest map it as device memory and perform LS64 operations + * 2) VMM report it as device memory mistakenly + * Warn the VMM and inject the DABT back to the guest. + */ + if (!device) + kvm_err("memory attributes maybe incorrect for hva 0x%lx\n", hva); + + /* + * Otherwise it's a piece of device memory on the Host. + * Inject the DABT back to the guest since the mapping + * is wrong. + */ + kvm_inject_dabt_excl_atomic(vcpu, kvm_vcpu_get_hfar(vcpu)); + } + /* * Potentially reduce shadow S2 permissions to match the guest's own * S2. For exec faults, we'd only reach this point if the guest @@ -1836,7 +1855,8 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) /* Check the stage-2 fault is trans. fault or write fault */ if (!esr_fsc_is_translation_fault(esr) && !esr_fsc_is_permission_fault(esr) && - !esr_fsc_is_access_flag_fault(esr)) { + !esr_fsc_is_access_flag_fault(esr) && + !esr_fsc_is_excl_atomic_fault(esr)) { kvm_err("Unsupported FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n", kvm_vcpu_trap_get_class(vcpu), (unsigned long)kvm_vcpu_trap_get_fault(vcpu), @@ -1919,6 +1939,21 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) goto out_unlock; }
+ /* + * If instructions of FEAT_{LS64, LS64_V} operated on + * unsupported memory regions, a DABT for unsupported + * Exclusive or atomic access is generated. It's + * implementation defined whether the exception will + * be taken to, a stage-1 DABT or the final enabled + * stage of translation (stage-2 in this case as we + * hit here). Inject a DABT to the guest to handle it + * if it's implemented as a stage-2 DABT. + */ + if (esr_fsc_is_excl_atomic_fault(esr)) { + kvm_inject_dabt_excl_atomic(vcpu, kvm_vcpu_get_hfar(vcpu)); + return 1; + } + /* * The IPA is reported as [MAX:12], so we need to * complement it with the bottom 12 bits from the
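The syndrome that kvm_inject_dabt_excl_atomic() builds in the patch as posted can be worked through in isolation. The sketch below reproduces only the ESR composition step; excl_atomic_dabt_esr() is a hypothetical name, the constants are copied from the existing arm64 definitions (EC shift 26, IL at bit 25, DABT_LOW 0x24, DABT_CUR 0x25), and it says nothing about whether injecting is the right policy, which is what the review below discusses.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESR_ELx_EC_SHIFT	26
#define ESR_ELx_EC_DABT_LOW	UINT64_C(0x24)	/* data abort from a lower EL */
#define ESR_ELx_EC_DABT_CUR	UINT64_C(0x25)	/* data abort from the current EL */
#define ESR_ELx_IL		(UINT64_C(1) << 25)
#define ESR_ELx_FSC_EXCL_ATOMIC	UINT64_C(0x35)

#define PSR_MODE_MASK		UINT64_C(0xf)
#define PSR_MODE_EL0t		UINT64_C(0x0)

/*
 * Hypothetical helper mirroring the ESR construction in the patch:
 * the EC depends on whether the guest was running at EL0, IL is set
 * for a 32-bit-wide trapped instruction, and the FSC is always 0x35.
 */
static inline uint64_t excl_atomic_dabt_esr(uint64_t guest_cpsr, bool il_32bit)
{
	uint64_t esr = 0;

	if (il_32bit)
		esr |= ESR_ELx_IL;

	if ((guest_cpsr & PSR_MODE_MASK) == PSR_MODE_EL0t)
		esr |= ESR_ELx_EC_DABT_LOW << ESR_ELx_EC_SHIFT;
	else
		esr |= ESR_ELx_EC_DABT_CUR << ESR_ELx_EC_SHIFT;

	return esr | ESR_ELx_FSC_EXCL_ATOMIC;
}
```

For a guest faulting at EL0 with IL set this gives 0x92000035, and for a guest faulting at EL1h (mode 0x5) it gives 0x96000035.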
Hi Yicong,
On Mon, Mar 31, 2025 at 05:43:20PM +0800, Yicong Yang wrote:
From: Yicong Yang yangyicong@hisilicon.com
If FEAT_LS64WB not supported, FEAT_LS64* instructions only support to access Device/Uncacheable memory, otherwise a data abort for unsupported Exclusive or atomic access (0x35) is generated per spec. It's implementation defined whether the target exception level is routed and is possible to implemented as route to EL2 on a VHE VM. Per DDI0487K.a Section C3.2.12.2 Single-copy atomic 64-byte load/store:
The check is performed against the resulting memory type after all enabled stages of translation. In this case the fault is reported at the final enabled stage of translation.
Just use section citations, this quote isn't very useful since it doesn't adequately describe the different IMPLEMENTATION DEFINED behavior.
If it's implemented as generate the DABT to the final enabled stage (stage-2), inject a DABT to the guest to handle it.
This should be ordered _before_ the patch that exposes FEAT_LS64* to the VM.
@@ -1658,6 +1658,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (exec_fault && device) return -ENOEXEC;
+ if (esr_fsc_is_excl_atomic_fault(kvm_vcpu_get_esr(vcpu))) {
/*
* Target address is normal memory on the Host. We come here
* because:
* 1) Guest map it as device memory and perform LS64 operations
* 2) VMM report it as device memory mistakenly
* Warn the VMM and inject the DABT back to the guest.
*/
if (!device)
kvm_err("memory attributes maybe incorrect for hva 0x%lx\n", hva);
No, kvm_err() doesn't warn the VMM at all. KVM should never log anything for a situation that can be caused by the guest, e.g. option #1 in your comment.
Keep in mind that data aborts with DFSC == 0x35 can happen for a lot more than LS64 instructions, e.g. an atomic on a Device-* mapping.
@@ -1919,6 +1939,21 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) goto out_unlock; }
/*
* If instructions of FEAT_{LS64, LS64_V} operated on
* unsupported memory regions, a DABT for unsupported
* Exclusive or atomic access is generated. It's
* implementation defined whether the exception will
* be taken to, a stage-1 DABT or the final enabled
* stage of translation (stage-2 in this case as we
* hit here). Inject a DABT to the guest to handle it
* if it's implemented as a stage-2 DABT.
*/
if (esr_fsc_is_excl_atomic_fault(esr)) {
kvm_inject_dabt_excl_atomic(vcpu, kvm_vcpu_get_hfar(vcpu));
return 1;
}
A precondition of taking such a data abort is having a valid mapping at stage-2. If KVM can't resolve the HVA of the fault then there couldn't have been a stage-2 mapping.
Thanks, Oliver
On 2025/4/2 0:13, Oliver Upton wrote:
Hi Yicong,
On Mon, Mar 31, 2025 at 05:43:20PM +0800, Yicong Yang wrote:
From: Yicong Yang yangyicong@hisilicon.com
If FEAT_LS64WB not supported, FEAT_LS64* instructions only support to access Device/Uncacheable memory, otherwise a data abort for unsupported Exclusive or atomic access (0x35) is generated per spec. It's implementation defined whether the target exception level is routed and is possible to implemented as route to EL2 on a VHE VM. Per DDI0487K.a Section C3.2.12.2 Single-copy atomic 64-byte load/store:
The check is performed against the resulting memory type after all enabled stages of translation. In this case the fault is reported at the final enabled stage of translation.
Just use section citations, this quote isn't very useful since it doesn't adequately describe the different IMPLEMENTATION DEFINED behavior.
will drop the quote here.
If it's implemented as generate the DABT to the final enabled stage (stage-2), inject a DABT to the guest to handle it.
This should be ordered _before_ the patch that exposes FEAT_LS64* to the VM.
will make this patch ahead.
@@ -1658,6 +1658,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (exec_fault && device) return -ENOEXEC;
+ if (esr_fsc_is_excl_atomic_fault(kvm_vcpu_get_esr(vcpu))) {
/*
* Target address is normal memory on the Host. We come here
* because:
* 1) Guest map it as device memory and perform LS64 operations
* 2) VMM report it as device memory mistakenly
* Warn the VMM and inject the DABT back to the guest.
*/
if (!device)
kvm_err("memory attributes maybe incorrect for hva 0x%lx\n", hva);
No, kvm_err() doesn't warn the VMM at all. KVM should never log anything for a situation that can be caused by the guest, e.g. option #1 in your comment.
ok will drop the log here and only inject a DABT back.
Keep in mind that data aborts with DFSC == 0x35 can happen for a lot more than LS64 instructions, e.g. an atomic on a Device-* mapping.
Got it. 0x35 can be caused by LS64* or by another IMPLEMENTATION DEFINED fault, and there is no further hint to distinguish between the two. I hope injecting a DABT back is also the right behaviour for the latter case.
@@ -1919,6 +1939,21 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) goto out_unlock; }
/*
* If instructions of FEAT_{LS64, LS64_V} operated on
* unsupported memory regions, a DABT for unsupported
* Exclusive or atomic access is generated. It's
* implementation defined whether the exception will
* be taken to, a stage-1 DABT or the final enabled
* stage of translation (stage-2 in this case as we
* hit here). Inject a DABT to the guest to handle it
* if it's implemented as a stage-2 DABT.
*/
if (esr_fsc_is_excl_atomic_fault(esr)) {
kvm_inject_dabt_excl_atomic(vcpu, kvm_vcpu_get_hfar(vcpu));
return 1;
}
A precondition of taking such a data abort is having a valid mapping at stage-2. If KVM can't resolve the HVA of the fault then there couldn't have been a stage-2 mapping.
This handles the emulated MMIO case. I thought there was no valid stage-2 mapping for emulated MMIO, so the check is placed just before entering io_mem_abort(). Should it move into io_mem_abort(), or should we simply not handle the emulated case?
For the case where there's a valid stage-2 mapping, the LS64 DABT is handled in user_mem_abort().
Thanks.
On Mon, Apr 07, 2025 at 11:33:01AM +0800, Yicong Yang wrote:
On 2025/4/2 0:13, Oliver Upton wrote:
On Mon, Mar 31, 2025 at 05:43:20PM +0800, Yicong Yang wrote:
@@ -1658,6 +1658,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
Keep in mind that data aborts with DFSC == 0x35 can happen for a lot more than LS64 instructions, e.g. an atomic on a Device-* mapping.
Got it. DFSC 0x35 can be caused by LS64* or by an IMPLEMENTATION DEFINED fault, and there is no further hint to distinguish between the two. I hope injecting a DABT back is also the right behaviour for the latter case.
There isn't exactly a 'right' behavior here. The abort could either be due to a bug in the guest (doing an access on something it knows it can't) or the VMM creating / describing the IPA memory map incorrectly.
Since KVM can't really work out who's to blame in this situation we should probably exit to userspace + provide a way to reinject the abort.
@@ -1919,6 +1939,21 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
		goto out_unlock;
	}
/*
* If instructions of FEAT_{LS64, LS64_V} operated on
* unsupported memory regions, a DABT for unsupported
* Exclusive or atomic access is generated. It's
* implementation defined whether the exception will
* be taken to, a stage-1 DABT or the final enabled
* stage of translation (stage-2 in this case as we
* hit here). Inject a DABT to the guest to handle it
* if it's implemented as a stage-2 DABT.
*/
if (esr_fsc_is_excl_atomic_fault(esr)) {
kvm_inject_dabt_excl_atomic(vcpu, kvm_vcpu_get_hfar(vcpu));
return 1;
}
A precondition of taking such a data abort is having a valid mapping at stage-2. If KVM can't resolve the HVA of the fault then there couldn't have been a stage-2 mapping.
This handles the emulated MMIO case. I thought there was no valid stage-2 mapping for emulated MMIO, so the check is placed just before entering io_mem_abort(). Should it move into io_mem_abort(), or should we simply not handle the emulated case?
Right -- there's no valid stage-2 translation for _most_ MMIO. If KVM cannot find an HVA for the fault (look at the condition that gets us here) then we know there isn't a stage-2 mapping. How would we know what to map?
In that case I would expect to take a Translation fault with an instruction syndrome that can be used to construct an exit to the VMM. Marc had some patches on list to do exactly that [*].
However, after reading this again there's a rather ugly catch. The KVM ABI has it that writes to a RO memslot generate an MMIO exit, so it *is* possible to get here w/ a stage-2 mapping. Unfortunately there's no instruction syndrome with DFSC = 0x35, so there is no way to decode the access.
This is starting to sound similar to an nISV MMIO abort...
[*]: https://lore.kernel.org/kvmarm/20240815125959.2097734-1-maz@kernel.org/
Thanks, Oliver
On 2025/4/7 13:35, Oliver Upton wrote:
On Mon, Apr 07, 2025 at 11:33:01AM +0800, Yicong Yang wrote:
On 2025/4/2 0:13, Oliver Upton wrote:
On Mon, Mar 31, 2025 at 05:43:20PM +0800, Yicong Yang wrote:
@@ -1658,6 +1658,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
Keep in mind that data aborts with DFSC == 0x35 can happen for a lot more than LS64 instructions, e.g. an atomic on a Device-* mapping.
Got it. DFSC 0x35 can be caused by LS64* or by an IMPLEMENTATION DEFINED fault, and there is no further hint to distinguish between the two. I hope injecting a DABT back is also the right behaviour for the latter case.
There isn't exactly a 'right' behavior here. The abort could either be due to a bug in the guest (doing an access on something it knows it can't) or the VMM creating / describing the IPA memory map incorrectly.
Since KVM can't really work out who's to blame in this situation we should probably exit to userspace + provide a way to reinject the abort.
As mentioned below, maybe the proper way is to handle it like KVM_EXIT_ARM_NISV: exit to userspace with KVM_EXIT_ARM_LDST64B. If the fault is due to a wrong mapping in the guest OS, userspace reinjects the DABT into the guest; otherwise userspace needs to check whether it is describing the memory region correctly and handle the fault in its own way.
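To make the proposed split of responsibility concrete, here is a minimal sketch of the decision a VMM might take after such an exit. It is illustrative only: KVM_EXIT_ARM_LDST64B is just a name proposed in this thread, and every type, field, and function below (vmm_region, ls64_decide, example_map, ...) is hypothetical and uses no real KVM interface. It assumes the VMM tracks, per IPA region, whether it advertised the region to the guest as LS64-capable and whether the host backing is really device memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-region bookkeeping kept by the VMM. */
struct vmm_region {
	uint64_t ipa_base;
	uint64_t size;
	bool advertised_ls64;   /* VMM described the region as LS64-capable */
	bool backing_is_device; /* host backing really is device memory */
};

enum ls64_action {
	LS64_REINJECT_DABT,  /* guest's fault: hand the abort back to it */
	LS64_VMM_MAP_ERROR,  /* VMM misdescribed the region: fix the map */
};

static const struct vmm_region *
find_region(const struct vmm_region *map, size_t n, uint64_t ipa)
{
	for (size_t i = 0; i < n; i++)
		if (ipa >= map[i].ipa_base && ipa - map[i].ipa_base < map[i].size)
			return &map[i];
	return NULL;
}

/*
 * Decide what to do with an "unsupported Exclusive or atomic access"
 * abort at the given IPA. Only the VMM can tell whether it promised
 * the guest something the backing memory cannot deliver.
 */
static enum ls64_action
ls64_decide(const struct vmm_region *map, size_t n, uint64_t ipa)
{
	const struct vmm_region *r = find_region(map, n, ipa);

	if (r && r->advertised_ls64 && !r->backing_is_device)
		return LS64_VMM_MAP_ERROR;
	return LS64_REINJECT_DABT;
}

/* Example map: guest RAM, a misdescribed region, a real device. */
static const struct vmm_region example_map[] = {
	{ 0x40000000, 0x10000000, false, false },
	{ 0x10000000, 0x1000,     true,  false },
	{ 0x10001000, 0x1000,     true,  true  },
};
```

With this map, an abort inside guest RAM is reinjected, while an abort at 0x10000040 points at the VMM's own description of the region.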
@@ -1919,6 +1939,21 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
		goto out_unlock;
	}
/*
* If instructions of FEAT_{LS64, LS64_V} operated on
* unsupported memory regions, a DABT for unsupported
* Exclusive or atomic access is generated. It's
* implementation defined whether the exception will
* be taken to, a stage-1 DABT or the final enabled
* stage of translation (stage-2 in this case as we
* hit here). Inject a DABT to the guest to handle it
* if it's implemented as a stage-2 DABT.
*/
if (esr_fsc_is_excl_atomic_fault(esr)) {
kvm_inject_dabt_excl_atomic(vcpu, kvm_vcpu_get_hfar(vcpu));
return 1;
}
A precondition of taking such a data abort is having a valid mapping at stage-2. If KVM can't resolve the HVA of the fault then there couldn't have been a stage-2 mapping.
This handles the emulated MMIO case. I thought there was no valid stage-2 mapping for emulated MMIO, so the check is placed just before entering io_mem_abort(). Should it move into io_mem_abort(), or should we simply not handle the emulated case?
Right -- there's no valid stage-2 translation for _most_ MMIO. If KVM cannot find an HVA for the fault (look at the condition that gets us here) then we know there isn't a stage-2 mapping. How would we know what to map?
In that case I would expect to take a Translation fault with an instruction syndrome that can be used to construct an exit to the VMM.
You're right, I misunderstood this. If there's no stage-2 mapping we'll take a Translation fault here rather than an unsupported Exclusive or atomic access fault, so there's no need to check for and handle it here; such a fault cannot occur in this case.
Marc had some patches on list to do exactly that [*].
However, after reading this again there's a rather ugly catch. The KVM ABI has it that writes to a RO memslot generate an MMIO exit, so it *is* possible to get here w/ a stage-2 mapping. Unfortunately there's no instruction syndrome with DFSC = 0x35, so there is no way to decode the access.
This is starting to sound similar to an nISV MMIO abort...
Yes, similar to nISV MMIO, with no ISV here either. Per the spec, a DABT caused by LS64* has ISV == 1 if it's a Translation fault, Access flag fault, or Permission fault, but not for an unsupported Exclusive or atomic access fault.
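For readers following the syndrome discussion, the following is a small standalone sketch of how the relevant ESR_ELx fields can be picked apart. The bit positions (EC in [31:26], ISV at bit 24, DFSC in [5:0]) and the 0x35 encoding follow the Arm ARM; the helper names are made up for illustration and are not the kernel's. The classification only covers the level-encoded DFSC values 0x04 to 0x0f and lumps everything else together.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESR_EC_SHIFT      26
#define ESR_EC_MASK       0x3fu
#define ESR_EC_DABT_LOW   0x24u       /* Data Abort from a lower EL */
#define ESR_ISV           (1u << 24)  /* instruction syndrome valid */
#define ESR_DFSC_MASK     0x3fu
#define DFSC_EXCL_ATOMIC  0x35u       /* unsupported Exclusive or atomic access */

enum dabt_kind {
	DABT_NONE,         /* not a data abort from a lower EL */
	DABT_TRANSLATION,
	DABT_ACCESS_FLAG,
	DABT_PERMISSION,
	DABT_EXCL_ATOMIC,
	DABT_OTHER,
};

/* Illustrative helper: build a data abort syndrome for experiments. */
static uint64_t make_dabt_esr(uint32_t dfsc, bool isv)
{
	return ((uint64_t)ESR_EC_DABT_LOW << ESR_EC_SHIFT) |
	       (isv ? ESR_ISV : 0) | (dfsc & ESR_DFSC_MASK);
}

static enum dabt_kind classify_dabt(uint64_t esr)
{
	uint32_t dfsc = esr & ESR_DFSC_MASK;

	if (((esr >> ESR_EC_SHIFT) & ESR_EC_MASK) != ESR_EC_DABT_LOW)
		return DABT_NONE;
	if (dfsc == DFSC_EXCL_ATOMIC)
		return DABT_EXCL_ATOMIC;

	/* For these classes the low two bits carry the lookup level. */
	switch (dfsc >> 2) {
	case 1: return DABT_TRANSLATION;  /* 0x04..0x07 */
	case 2: return DABT_ACCESS_FLAG;  /* 0x08..0x0b */
	case 3: return DABT_PERMISSION;   /* 0x0c..0x0f */
	default: return DABT_OTHER;
	}
}

/* Without ISV the hypervisor has no register/size info to emulate with. */
static bool has_instruction_syndrome(uint64_t esr)
{
	return (esr & ESR_ISV) != 0;
}
```

A syndrome built with make_dabt_esr(0x35, false) classifies as DABT_EXCL_ATOMIC and carries no instruction syndrome, which is the situation described above: the fault class is known, but the access itself cannot be decoded.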
Thanks.