RISC-V defines three extensions for pointer masking[1]: - Smmpm: configured in M-mode, affects M-mode - Smnpm: configured in M-mode, affects the next lower mode (S or U-mode) - Ssnpm: configured in S-mode, affects the next lower mode (VS, VU, or U-mode)
This series adds support for configuring Smnpm or Ssnpm (depending on which privilege mode the kernel is running in) to allow pointer masking in userspace (VU or U-mode), extending the PR_SET_TAGGED_ADDR_CTRL API from arm64. Unlike arm64 TBI, userspace pointer masking is not enabled by default on RISC-V. Additionally, the tag width (referred to as PMLEN) is variable, so userspace needs to ask the kernel for a specific tag width, which is interpreted as a lower bound on the number of tag bits.
This series also adds support for a tagged address ABI similar to arm64 and x86. Since accesses from the kernel to user memory use the kernel's pointer masking configuration, not the user's, the kernel must untag user pointers in software before dereferencing them. And since the tag width is variable, as with LAM on x86, it must be kept the same across all threads in a process so untagged_addr_remote() can work.
[1]: https://github.com/riscv/riscv-j-extension/raw/d70011dde6c2/zjpm-spec.pdf --- This series depends on the per-thread envcfg series in riscv/for-next.
This series can be tested in QEMU by applying a patch set[2].
KASAN_SW_TAGS using pointer masking is an independent patch series[3].
[2]: https://lore.kernel.org/qemu-devel/20240511101053.1875596-1-me@deliversmonke... [3]: https://lore.kernel.org/linux-riscv/20240814085618.968833-1-samuel.holland@s...
Changes in v5: - Update pointer masking spec version to 1.0 and state to ratified - Document how PR_[SG]ET_TAGGED_ADDR_CTRL are used on RISC-V - Document that the RISC-V tagged address ABI is the same as AArch64 - Rename "pm" selftests directory to "abi" to be more generic - Fix -Wparentheses warnings - Fix order of operations when writing via the tagged pointer - Update pointer masking spec version to 1.0 in hwprobe documentation
Changes in v4: - Switch IS_ENABLED back to #ifdef to fix riscv32 build - Combine __untagged_addr() and __untagged_addr_remote()
Changes in v3: - Note in the commit message that the ISA extension spec is frozen - Rebase on riscv/for-next (ISA extension list conflicts) - Remove RISCV_ISA_EXT_SxPM, which was not used anywhere - Use shifts instead of large numbers in ENVCFG_PMM* macro definitions - Rename CONFIG_RISCV_ISA_POINTER_MASKING to CONFIG_RISCV_ISA_SUPM, since it only controls the userspace part of pointer masking - Use IS_ENABLED instead of #ifdef when possible - Use an enum for the supported PMLEN values - Simplify the logic in set_tagged_addr_ctrl() - Use IS_ENABLED instead of #ifdef when possible - Implement mm_untag_mask() - Remove pmlen from struct thread_info (now only in mm_context_t)
Changes in v2: - Drop patch 4 ("riscv: Define is_compat_thread()"), as an equivalent patch was already applied - Move patch 5 ("riscv: Split per-CPU and per-thread envcfg bits") to a different series[3] - Update pointer masking specification version reference - Provide macros for the extension affecting the kernel and userspace - Use the correct name for the hstatus.HUPMM field - Rebase on riscv/linux.git for-next - Add and use the envcfg_update_bits() helper function - Inline flush_tagged_addr_state() - Implement untagged_addr_remote() - Restrict PMLEN changes once a process is multithreaded - Rename "tags" directory to "pm" to avoid .gitignore rules - Add .gitignore file to ignore the compiled selftest binary - Write to a pipe to force dereferencing the user pointer - Handle SIGSEGV in the child process to reduce dmesg noise - Export Supm via hwprobe - Export Smnpm and Ssnpm to KVM guests
Samuel Holland (10): dt-bindings: riscv: Add pointer masking ISA extensions riscv: Add ISA extension parsing for pointer masking riscv: Add CSR definitions for pointer masking riscv: Add support for userspace pointer masking riscv: Add support for the tagged address ABI riscv: Allow ptrace control of the tagged address ABI riscv: selftests: Add a pointer masking test riscv: hwprobe: Export the Supm ISA extension RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test
Documentation/arch/riscv/hwprobe.rst | 3 + Documentation/arch/riscv/uabi.rst | 16 + .../devicetree/bindings/riscv/extensions.yaml | 18 + arch/riscv/Kconfig | 11 + arch/riscv/include/asm/csr.h | 16 + arch/riscv/include/asm/hwcap.h | 5 + arch/riscv/include/asm/mmu.h | 7 + arch/riscv/include/asm/mmu_context.h | 13 + arch/riscv/include/asm/processor.h | 8 + arch/riscv/include/asm/switch_to.h | 11 + arch/riscv/include/asm/uaccess.h | 43 ++- arch/riscv/include/uapi/asm/hwprobe.h | 1 + arch/riscv/include/uapi/asm/kvm.h | 2 + arch/riscv/kernel/cpufeature.c | 3 + arch/riscv/kernel/process.c | 154 ++++++++ arch/riscv/kernel/ptrace.c | 42 +++ arch/riscv/kernel/sys_hwprobe.c | 3 + arch/riscv/kvm/vcpu_onereg.c | 4 + include/uapi/linux/elf.h | 1 + include/uapi/linux/prctl.h | 5 +- .../selftests/kvm/riscv/get-reg-list.c | 8 + tools/testing/selftests/riscv/Makefile | 2 +- tools/testing/selftests/riscv/abi/.gitignore | 1 + tools/testing/selftests/riscv/abi/Makefile | 10 + .../selftests/riscv/abi/pointer_masking.c | 332 ++++++++++++++++++ 25 files changed, 712 insertions(+), 7 deletions(-) create mode 100644 tools/testing/selftests/riscv/abi/.gitignore create mode 100644 tools/testing/selftests/riscv/abi/Makefile create mode 100644 tools/testing/selftests/riscv/abi/pointer_masking.c
The RISC-V Pointer Masking specification defines three extensions: Smmpm, Smnpm, and Ssnpm. Document the behavior of these extensions as following the ratified version 1.0 of the specification.
Acked-by: Conor Dooley conor.dooley@microchip.com Reviewed-by: Charlie Jenkins charlie@rivosinc.com Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
Changes in v5: - Update pointer masking spec version to 1.0 and state to ratified
Changes in v3: - Note in the commit message that the ISA extension spec is frozen
Changes in v2: - Update pointer masking specification version reference
.../devicetree/bindings/riscv/extensions.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml index 2cf2026cff57..28bf1daa1d27 100644 --- a/Documentation/devicetree/bindings/riscv/extensions.yaml +++ b/Documentation/devicetree/bindings/riscv/extensions.yaml @@ -128,6 +128,18 @@ properties: changes to interrupts as frozen at commit ccbddab ("Merge pull request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
+ - const: smmpm + description: | + The standard Smmpm extension for M-mode pointer masking as + ratified at commit d70011dde6c2 ("Update to ratified state") + of riscv-j-extension. + + - const: smnpm + description: | + The standard Smnpm extension for next-mode pointer masking as + ratified at commit d70011dde6c2 ("Update to ratified state") + of riscv-j-extension. + - const: smstateen description: | The standard Smstateen extension for controlling access to CSRs @@ -147,6 +159,12 @@ properties: and mode-based filtering as ratified at commit 01d1df0 ("Add ability to manually trigger workflow. (#2)") of riscv-count-overflow.
+ - const: ssnpm + description: | + The standard Ssnpm extension for next-mode pointer masking as + ratified at commit d70011dde6c2 ("Update to ratified state") + of riscv-j-extension. + - const: sstc description: | The standard Sstc supervisor-level extension for time compare as
The RISC-V Pointer Masking specification defines three extensions: Smmpm, Smnpm, and Ssnpm. Add support for parsing each of them. The specific extension which provides pointer masking support to userspace (Supm) depends on the kernel's privilege mode, so provide a macro to abstract this selection.
Smmpm implies the existence of the mseccfg CSR. As it is the only user of this CSR so far, there is no need for an Xlinuxmseccfg extension.
Reviewed-by: Charlie Jenkins charlie@rivosinc.com Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
(no changes since v3)
Changes in v3: - Rebase on riscv/for-next (ISA extension list conflicts) - Remove RISCV_ISA_EXT_SxPM, which was not used anywhere
Changes in v2: - Provide macros for the extension affecting the kernel and userspace
arch/riscv/include/asm/hwcap.h | 5 +++++ arch/riscv/kernel/cpufeature.c | 3 +++ 2 files changed, 8 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index 46d9de54179e..8608883da453 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -93,6 +93,9 @@ #define RISCV_ISA_EXT_ZCMOP 84 #define RISCV_ISA_EXT_ZAWRS 85 #define RISCV_ISA_EXT_SVVPTC 86 +#define RISCV_ISA_EXT_SMMPM 87 +#define RISCV_ISA_EXT_SMNPM 88 +#define RISCV_ISA_EXT_SSNPM 89
#define RISCV_ISA_EXT_XLINUXENVCFG 127
@@ -101,8 +104,10 @@
#ifdef CONFIG_RISCV_M_MODE #define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SMAIA +#define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SMNPM #else #define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SSAIA +#define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SSNPM #endif
#endif /* _ASM_RISCV_HWCAP_H */ diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index b3a057c36996..94596bca464e 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -377,9 +377,12 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_BUNDLE(zvksg, riscv_zvksg_bundled_exts), __RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT), __RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA), + __RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM), + __RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts), __RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN), __RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA), __RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF), + __RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts), __RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC), __RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL), __RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT),
Pointer masking is controlled via a two-bit PMM field, which appears in various CSRs depending on which extensions are implemented. Smmpm adds the field to mseccfg; Smnpm adds the field to menvcfg; Ssnpm adds the field to senvcfg. If the H extension is implemented, Ssnpm also defines henvcfg.PMM and hstatus.HUPMM.
Reviewed-by: Charlie Jenkins charlie@rivosinc.com Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
(no changes since v3)
Changes in v3: - Use shifts instead of large numbers in ENVCFG_PMM* macro definitions
Changes in v2: - Use the correct name for the hstatus.HUPMM field
arch/riscv/include/asm/csr.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 25966995da04..fe5d4eb9adea 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -119,6 +119,10 @@
/* HSTATUS flags */ #ifdef CONFIG_64BIT +#define HSTATUS_HUPMM _AC(0x3000000000000, UL) +#define HSTATUS_HUPMM_PMLEN_0 _AC(0x0000000000000, UL) +#define HSTATUS_HUPMM_PMLEN_7 _AC(0x2000000000000, UL) +#define HSTATUS_HUPMM_PMLEN_16 _AC(0x3000000000000, UL) #define HSTATUS_VSXL _AC(0x300000000, UL) #define HSTATUS_VSXL_SHIFT 32 #endif @@ -195,6 +199,10 @@ /* xENVCFG flags */ #define ENVCFG_STCE (_AC(1, ULL) << 63) #define ENVCFG_PBMTE (_AC(1, ULL) << 62) +#define ENVCFG_PMM (_AC(0x3, ULL) << 32) +#define ENVCFG_PMM_PMLEN_0 (_AC(0x0, ULL) << 32) +#define ENVCFG_PMM_PMLEN_7 (_AC(0x2, ULL) << 32) +#define ENVCFG_PMM_PMLEN_16 (_AC(0x3, ULL) << 32) #define ENVCFG_CBZE (_AC(1, UL) << 7) #define ENVCFG_CBCFE (_AC(1, UL) << 6) #define ENVCFG_CBIE_SHIFT 4 @@ -216,6 +224,12 @@ #define SMSTATEEN0_SSTATEEN0_SHIFT 63 #define SMSTATEEN0_SSTATEEN0 (_ULL(1) << SMSTATEEN0_SSTATEEN0_SHIFT)
+/* mseccfg bits */ +#define MSECCFG_PMM ENVCFG_PMM +#define MSECCFG_PMM_PMLEN_0 ENVCFG_PMM_PMLEN_0 +#define MSECCFG_PMM_PMLEN_7 ENVCFG_PMM_PMLEN_7 +#define MSECCFG_PMM_PMLEN_16 ENVCFG_PMM_PMLEN_16 + /* symbolic CSR names: */ #define CSR_CYCLE 0xc00 #define CSR_TIME 0xc01 @@ -382,6 +396,8 @@ #define CSR_MIP 0x344 #define CSR_PMPCFG0 0x3a0 #define CSR_PMPADDR0 0x3b0 +#define CSR_MSECCFG 0x747 +#define CSR_MSECCFGH 0x757 #define CSR_MVENDORID 0xf11 #define CSR_MARCHID 0xf12 #define CSR_MIMPID 0xf13
RISC-V supports pointer masking with a variable number of tag bits (which is called "PMLEN" in the specification) and which is configured at the next higher privilege level.
Wire up the PR_SET_TAGGED_ADDR_CTRL and PR_GET_TAGGED_ADDR_CTRL prctls so userspace can request a lower bound on the number of tag bits and determine the actual number of tag bits. As with arm64's PR_TAGGED_ADDR_ENABLE, the pointer masking configuration is thread-scoped, inherited on clone() and fork() and cleared on execve().
Reviewed-by: Charlie Jenkins charlie@rivosinc.com Tested-by: Charlie Jenkins charlie@rivosinc.com Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
Changes in v5: - Document how PR_[SG]ET_TAGGED_ADDR_CTRL are used on RISC-V
Changes in v4: - Switch IS_ENABLED back to #ifdef to fix riscv32 build
Changes in v3: - Rename CONFIG_RISCV_ISA_POINTER_MASKING to CONFIG_RISCV_ISA_SUPM, since it only controls the userspace part of pointer masking - Use IS_ENABLED instead of #ifdef when possible - Use an enum for the supported PMLEN values - Simplify the logic in set_tagged_addr_ctrl()
Changes in v2: - Rebase on riscv/linux.git for-next - Add and use the envcfg_update_bits() helper function - Inline flush_tagged_addr_state()
Documentation/arch/riscv/uabi.rst | 12 ++++ arch/riscv/Kconfig | 11 ++++ arch/riscv/include/asm/processor.h | 8 +++ arch/riscv/include/asm/switch_to.h | 11 ++++ arch/riscv/kernel/process.c | 91 ++++++++++++++++++++++++++++++ include/uapi/linux/prctl.h | 5 +- 6 files changed, 137 insertions(+), 1 deletion(-)
diff --git a/Documentation/arch/riscv/uabi.rst b/Documentation/arch/riscv/uabi.rst index 2b420bab0527..ddb8359a46ed 100644 --- a/Documentation/arch/riscv/uabi.rst +++ b/Documentation/arch/riscv/uabi.rst @@ -68,3 +68,15 @@ Misaligned accesses Misaligned scalar accesses are supported in userspace, but they may perform poorly. Misaligned vector accesses are only supported if the Zicclsm extension is supported. + +Pointer masking +--------------- + +Support for pointer masking in userspace (the Supm extension) is provided via +the ``PR_SET_TAGGED_ADDR_CTRL`` and ``PR_GET_TAGGED_ADDR_CTRL`` ``prctl()`` +operations. Pointer masking is disabled by default. To enable it, userspace +must call ``PR_SET_TAGGED_ADDR_CTRL`` with the ``PR_PMLEN`` field set to the +number of mask/tag bits needed by the application. ``PR_PMLEN`` is interpreted +as a lower bound; if the kernel is unable to satisfy the request, the +``PR_SET_TAGGED_ADDR_CTRL`` operation will fail. The actual number of tag bits +is returned in ``PR_PMLEN`` by the ``PR_GET_TAGGED_ADDR_CTRL`` operation. diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 22dc5ea4196c..0ef449465378 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -531,6 +531,17 @@ config RISCV_ISA_C
If you don't know what to do here, say Y.
+config RISCV_ISA_SUPM + bool "Supm extension for userspace pointer masking" + depends on 64BIT + default y + help + Add support for pointer masking in userspace (Supm) when the + underlying hardware extension (Smnpm or Ssnpm) is detected at boot. + + If this option is disabled, userspace will be unable to use + the prctl(PR_{SET,GET}_TAGGED_ADDR_CTRL) API. + config RISCV_ISA_SVNAPOT bool "Svnapot extension support for supervisor mode NAPOT pages" depends on 64BIT && MMU diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h index c1a492508835..5f56eb9d114a 100644 --- a/arch/riscv/include/asm/processor.h +++ b/arch/riscv/include/asm/processor.h @@ -178,6 +178,14 @@ extern int set_unalign_ctl(struct task_struct *tsk, unsigned int val); #define RISCV_SET_ICACHE_FLUSH_CTX(arg1, arg2) riscv_set_icache_flush_ctx(arg1, arg2) extern int riscv_set_icache_flush_ctx(unsigned long ctx, unsigned long per_thread);
+#ifdef CONFIG_RISCV_ISA_SUPM +/* PR_{SET,GET}_TAGGED_ADDR_CTRL prctl */ +long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg); +long get_tagged_addr_ctrl(struct task_struct *task); +#define SET_TAGGED_ADDR_CTRL(arg) set_tagged_addr_ctrl(current, arg) +#define GET_TAGGED_ADDR_CTRL() get_tagged_addr_ctrl(current) +#endif + #endif /* __ASSEMBLY__ */
#endif /* _ASM_RISCV_PROCESSOR_H */ diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h index 9685cd85e57c..94e33216b2d9 100644 --- a/arch/riscv/include/asm/switch_to.h +++ b/arch/riscv/include/asm/switch_to.h @@ -70,6 +70,17 @@ static __always_inline bool has_fpu(void) { return false; } #define __switch_to_fpu(__prev, __next) do { } while (0) #endif
+static inline void envcfg_update_bits(struct task_struct *task, + unsigned long mask, unsigned long val) +{ + unsigned long envcfg; + + envcfg = (task->thread.envcfg & ~mask) | val; + task->thread.envcfg = envcfg; + if (task == current) + csr_write(CSR_ENVCFG, envcfg); +} + static inline void __switch_to_envcfg(struct task_struct *next) { asm volatile (ALTERNATIVE("nop", "csrw " __stringify(CSR_ENVCFG) ", %0", diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index e3142d8a6e28..200d2ed64dfe 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -7,6 +7,7 @@ * Copyright (C) 2017 SiFive */
+#include <linux/bitfield.h> #include <linux/cpu.h> #include <linux/kernel.h> #include <linux/sched.h> @@ -180,6 +181,10 @@ void flush_thread(void) memset(¤t->thread.vstate, 0, sizeof(struct __riscv_v_ext_state)); clear_tsk_thread_flag(current, TIF_RISCV_V_DEFER_RESTORE); #endif +#ifdef CONFIG_RISCV_ISA_SUPM + if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM)) + envcfg_update_bits(current, ENVCFG_PMM, ENVCFG_PMM_PMLEN_0); +#endif }
void arch_release_task_struct(struct task_struct *tsk) @@ -242,3 +247,89 @@ void __init arch_task_cache_init(void) { riscv_v_setup_ctx_cache(); } + +#ifdef CONFIG_RISCV_ISA_SUPM +enum { + PMLEN_0 = 0, + PMLEN_7 = 7, + PMLEN_16 = 16, +}; + +static bool have_user_pmlen_7; +static bool have_user_pmlen_16; + +long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg) +{ + unsigned long valid_mask = PR_PMLEN_MASK; + struct thread_info *ti = task_thread_info(task); + unsigned long pmm; + u8 pmlen; + + if (is_compat_thread(ti)) + return -EINVAL; + + if (arg & ~valid_mask) + return -EINVAL; + + /* + * Prefer the smallest PMLEN that satisfies the user's request, + * in case choosing a larger PMLEN has a performance impact. + */ + pmlen = FIELD_GET(PR_PMLEN_MASK, arg); + if (pmlen == PMLEN_0) + pmm = ENVCFG_PMM_PMLEN_0; + else if (pmlen <= PMLEN_7 && have_user_pmlen_7) + pmm = ENVCFG_PMM_PMLEN_7; + else if (pmlen <= PMLEN_16 && have_user_pmlen_16) + pmm = ENVCFG_PMM_PMLEN_16; + else + return -EINVAL; + + envcfg_update_bits(task, ENVCFG_PMM, pmm); + + return 0; +} + +long get_tagged_addr_ctrl(struct task_struct *task) +{ + struct thread_info *ti = task_thread_info(task); + long ret = 0; + + if (is_compat_thread(ti)) + return -EINVAL; + + switch (task->thread.envcfg & ENVCFG_PMM) { + case ENVCFG_PMM_PMLEN_7: + ret = FIELD_PREP(PR_PMLEN_MASK, PMLEN_7); + break; + case ENVCFG_PMM_PMLEN_16: + ret = FIELD_PREP(PR_PMLEN_MASK, PMLEN_16); + break; + } + + return ret; +} + +static bool try_to_set_pmm(unsigned long value) +{ + csr_set(CSR_ENVCFG, value); + return (csr_read_clear(CSR_ENVCFG, ENVCFG_PMM) & ENVCFG_PMM) == value; +} + +static int __init tagged_addr_init(void) +{ + if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM)) + return 0; + + /* + * envcfg.PMM is a WARL field. Detect which values are supported. + * Assume the supported PMLEN values are the same on all harts. + */ + csr_clear(CSR_ENVCFG, ENVCFG_PMM); + have_user_pmlen_7 = try_to_set_pmm(ENVCFG_PMM_PMLEN_7); + have_user_pmlen_16 = try_to_set_pmm(ENVCFG_PMM_PMLEN_16); + + return 0; +} +core_initcall(tagged_addr_init); +#endif /* CONFIG_RISCV_ISA_SUPM */ diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 35791791a879..cefd656ebf43 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -230,7 +230,7 @@ struct prctl_mm_map { # define PR_PAC_APDBKEY (1UL << 3) # define PR_PAC_APGAKEY (1UL << 4)
-/* Tagged user address controls for arm64 */ +/* Tagged user address controls for arm64 and RISC-V */ #define PR_SET_TAGGED_ADDR_CTRL 55 #define PR_GET_TAGGED_ADDR_CTRL 56 # define PR_TAGGED_ADDR_ENABLE (1UL << 0) @@ -244,6 +244,9 @@ struct prctl_mm_map { # define PR_MTE_TAG_MASK (0xffffUL << PR_MTE_TAG_SHIFT) /* Unused; kept only for source compatibility */ # define PR_MTE_TCF_SHIFT 1 +/* RISC-V pointer masking tag length */ +# define PR_PMLEN_SHIFT 24 +# define PR_PMLEN_MASK (0x7fUL << PR_PMLEN_SHIFT)
/* Control reclaim behavior when allocating memory */ #define PR_SET_IO_FLUSHER 57
When pointer masking is enabled for userspace, the kernel can accept tagged pointers as arguments to some system calls. Allow this by untagging the pointers in access_ok() and the uaccess routines. The uaccess routines must peform untagging in software because U-mode and S-mode have entirely separate pointer masking configurations. In fact, hardware may not even implement pointer masking for S-mode.
Since the number of tag bits is variable, untagged_addr_remote() needs to know what PMLEN to use for the remote mm. Therefore, the pointer masking mode must be the same for all threads sharing an mm. Enforce this with a lock flag in the mm context, as x86 does for LAM. The flag gets reset in init_new_context() during fork(), as the new mm is no longer multithreaded.
Reviewed-by: Charlie Jenkins charlie@rivosinc.com Tested-by: Charlie Jenkins charlie@rivosinc.com Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
Changes in v5: - Document that the RISC-V tagged address ABI is the same as AArch64
Changes in v4: - Combine __untagged_addr() and __untagged_addr_remote()
Changes in v3: - Use IS_ENABLED instead of #ifdef when possible - Implement mm_untag_mask() - Remove pmlen from struct thread_info (now only in mm_context_t)
Changes in v2: - Implement untagged_addr_remote() - Restrict PMLEN changes once a process is multithreaded
Documentation/arch/riscv/uabi.rst | 4 ++ arch/riscv/include/asm/mmu.h | 7 +++ arch/riscv/include/asm/mmu_context.h | 13 +++++ arch/riscv/include/asm/uaccess.h | 43 ++++++++++++++-- arch/riscv/kernel/process.c | 73 ++++++++++++++++++++++++++-- 5 files changed, 130 insertions(+), 10 deletions(-)
diff --git a/Documentation/arch/riscv/uabi.rst b/Documentation/arch/riscv/uabi.rst index ddb8359a46ed..243e40062e34 100644 --- a/Documentation/arch/riscv/uabi.rst +++ b/Documentation/arch/riscv/uabi.rst @@ -80,3 +80,7 @@ number of mask/tag bits needed by the application. ``PR_PMLEN`` is interpreted as a lower bound; if the kernel is unable to satisfy the request, the ``PR_SET_TAGGED_ADDR_CTRL`` operation will fail. The actual number of tag bits is returned in ``PR_PMLEN`` by the ``PR_GET_TAGGED_ADDR_CTRL`` operation. + +Additionally, when pointer masking is enabled (``PR_PMLEN`` is greater than 0), +a tagged address ABI is supported, with the same interface and behavior as +documented for AArch64 (Documentation/arch/arm64/tagged-address-abi.rst). diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h index c9e03e9da3dc..1cc90465d75b 100644 --- a/arch/riscv/include/asm/mmu.h +++ b/arch/riscv/include/asm/mmu.h @@ -25,9 +25,16 @@ typedef struct { #ifdef CONFIG_BINFMT_ELF_FDPIC unsigned long exec_fdpic_loadmap; unsigned long interp_fdpic_loadmap; +#endif + unsigned long flags; +#ifdef CONFIG_RISCV_ISA_SUPM + u8 pmlen; #endif } mm_context_t;
+/* Lock the pointer masking mode because this mm is multithreaded */ +#define MM_CONTEXT_LOCK_PMLEN 0 + #define cntx2asid(cntx) ((cntx) & SATP_ASID_MASK) #define cntx2version(cntx) ((cntx) & ~SATP_ASID_MASK)
diff --git a/arch/riscv/include/asm/mmu_context.h b/arch/riscv/include/asm/mmu_context.h index 7030837adc1a..8c4bc49a3a0f 100644 --- a/arch/riscv/include/asm/mmu_context.h +++ b/arch/riscv/include/asm/mmu_context.h @@ -20,6 +20,9 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next, static inline void activate_mm(struct mm_struct *prev, struct mm_struct *next) { +#ifdef CONFIG_RISCV_ISA_SUPM + next->context.pmlen = 0; +#endif switch_mm(prev, next, NULL); }
@@ -30,11 +33,21 @@ static inline int init_new_context(struct task_struct *tsk, #ifdef CONFIG_MMU atomic_long_set(&mm->context.id, 0); #endif + if (IS_ENABLED(CONFIG_RISCV_ISA_SUPM)) + clear_bit(MM_CONTEXT_LOCK_PMLEN, &mm->context.flags); return 0; }
DECLARE_STATIC_KEY_FALSE(use_asid_allocator);
+#ifdef CONFIG_RISCV_ISA_SUPM +#define mm_untag_mask mm_untag_mask +static inline unsigned long mm_untag_mask(struct mm_struct *mm) +{ + return -1UL >> mm->context.pmlen; +} +#endif + #include <asm-generic/mmu_context.h>
#endif /* _ASM_RISCV_MMU_CONTEXT_H */ diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h index 72ec1d9bd3f3..fee56b0c8058 100644 --- a/arch/riscv/include/asm/uaccess.h +++ b/arch/riscv/include/asm/uaccess.h @@ -9,8 +9,41 @@ #define _ASM_RISCV_UACCESS_H
#include <asm/asm-extable.h> +#include <asm/cpufeature.h> #include <asm/pgtable.h> /* for TASK_SIZE */
+#ifdef CONFIG_RISCV_ISA_SUPM +static inline unsigned long __untagged_addr_remote(struct mm_struct *mm, unsigned long addr) +{ + if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM)) { + u8 pmlen = mm->context.pmlen; + + /* Virtual addresses are sign-extended; physical addresses are zero-extended. */ + if (IS_ENABLED(CONFIG_MMU)) + return (long)(addr << pmlen) >> pmlen; + else + return (addr << pmlen) >> pmlen; + } + + return addr; +} + +#define untagged_addr(addr) ({ \ + unsigned long __addr = (__force unsigned long)(addr); \ + (__force __typeof__(addr))__untagged_addr_remote(current->mm, __addr); \ +}) + +#define untagged_addr_remote(mm, addr) ({ \ + unsigned long __addr = (__force unsigned long)(addr); \ + mmap_assert_locked(mm); \ + (__force __typeof__(addr))__untagged_addr_remote(mm, __addr); \ +}) + +#define access_ok(addr, size) likely(__access_ok(untagged_addr(addr), size)) +#else +#define untagged_addr(addr) (addr) +#endif + /* * User space memory access functions */ @@ -130,7 +163,7 @@ do { \ */ #define __get_user(x, ptr) \ ({ \ - const __typeof__(*(ptr)) __user *__gu_ptr = (ptr); \ + const __typeof__(*(ptr)) __user *__gu_ptr = untagged_addr(ptr); \ long __gu_err = 0; \ \ __chk_user_ptr(__gu_ptr); \ @@ -246,7 +279,7 @@ do { \ */ #define __put_user(x, ptr) \ ({ \ - __typeof__(*(ptr)) __user *__gu_ptr = (ptr); \ + __typeof__(*(ptr)) __user *__gu_ptr = untagged_addr(ptr); \ __typeof__(*__gu_ptr) __val = (x); \ long __pu_err = 0; \ \ @@ -293,13 +326,13 @@ unsigned long __must_check __asm_copy_from_user(void *to, static inline unsigned long raw_copy_from_user(void *to, const void __user *from, unsigned long n) { - return __asm_copy_from_user(to, from, n); + return __asm_copy_from_user(to, untagged_addr(from), n); }
static inline unsigned long raw_copy_to_user(void __user *to, const void *from, unsigned long n) { - return __asm_copy_to_user(to, from, n); + return __asm_copy_to_user(untagged_addr(to), from, n); }
extern long strncpy_from_user(char *dest, const char __user *src, long count); @@ -314,7 +347,7 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) { might_fault(); return access_ok(to, n) ? - __clear_user(to, n) : n; + __clear_user(untagged_addr(to), n) : n; }
#define __get_kernel_nofault(dst, src, type, err_label) \ diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index 200d2ed64dfe..58b6482c2bf6 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -213,6 +213,10 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) unsigned long tls = args->tls; struct pt_regs *childregs = task_pt_regs(p);
+ /* Ensure all threads in this mm have the same pointer masking mode. */ + if (IS_ENABLED(CONFIG_RISCV_ISA_SUPM) && p->mm && (clone_flags & CLONE_VM)) + set_bit(MM_CONTEXT_LOCK_PMLEN, &p->mm->context.flags); + memset(&p->thread.s, 0, sizeof(p->thread.s));
/* p->thread holds context to be restored by __switch_to() */ @@ -258,10 +262,16 @@ enum { static bool have_user_pmlen_7; static bool have_user_pmlen_16;
+/* + * Control the relaxed ABI allowing tagged user addresses into the kernel. + */ +static unsigned int tagged_addr_disabled; + long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg) { - unsigned long valid_mask = PR_PMLEN_MASK; + unsigned long valid_mask = PR_PMLEN_MASK | PR_TAGGED_ADDR_ENABLE; struct thread_info *ti = task_thread_info(task); + struct mm_struct *mm = task->mm; unsigned long pmm; u8 pmlen;
@@ -276,16 +286,41 @@ long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg) * in case choosing a larger PMLEN has a performance impact. */ pmlen = FIELD_GET(PR_PMLEN_MASK, arg); - if (pmlen == PMLEN_0) + if (pmlen == PMLEN_0) { pmm = ENVCFG_PMM_PMLEN_0; - else if (pmlen <= PMLEN_7 && have_user_pmlen_7) + } else if (pmlen <= PMLEN_7 && have_user_pmlen_7) { + pmlen = PMLEN_7; pmm = ENVCFG_PMM_PMLEN_7; - else if (pmlen <= PMLEN_16 && have_user_pmlen_16) + } else if (pmlen <= PMLEN_16 && have_user_pmlen_16) { + pmlen = PMLEN_16; pmm = ENVCFG_PMM_PMLEN_16; - else + } else { return -EINVAL; + } + + /* + * Do not allow the enabling of the tagged address ABI if globally + * disabled via sysctl abi.tagged_addr_disabled, if pointer masking + * is disabled for userspace. + */ + if (arg & PR_TAGGED_ADDR_ENABLE && (tagged_addr_disabled || !pmlen)) + return -EINVAL; + + if (!(arg & PR_TAGGED_ADDR_ENABLE)) + pmlen = PMLEN_0; + + if (mmap_write_lock_killable(mm)) + return -EINTR; + + if (test_bit(MM_CONTEXT_LOCK_PMLEN, &mm->context.flags) && mm->context.pmlen != pmlen) { + mmap_write_unlock(mm); + return -EBUSY; + }
envcfg_update_bits(task, ENVCFG_PMM, pmm); + mm->context.pmlen = pmlen; + + mmap_write_unlock(mm);
return 0; } @@ -298,6 +333,10 @@ long get_tagged_addr_ctrl(struct task_struct *task) if (is_compat_thread(ti)) return -EINVAL;
+ /* + * The mm context's pmlen is set only when the tagged address ABI is + * enabled, so the effective PMLEN must be extracted from envcfg.PMM. + */ switch (task->thread.envcfg & ENVCFG_PMM) { case ENVCFG_PMM_PMLEN_7: ret = FIELD_PREP(PR_PMLEN_MASK, PMLEN_7); @@ -307,6 +346,9 @@ long get_tagged_addr_ctrl(struct task_struct *task) break; }
+ if (task->mm->context.pmlen) + ret |= PR_TAGGED_ADDR_ENABLE; + return ret; }
@@ -316,6 +358,24 @@ static bool try_to_set_pmm(unsigned long value) return (csr_read_clear(CSR_ENVCFG, ENVCFG_PMM) & ENVCFG_PMM) == value; }
+/* + * Global sysctl to disable the tagged user addresses support. This control + * only prevents the tagged address ABI enabling via prctl() and does not + * disable it for tasks that already opted in to the relaxed ABI. + */ + +static struct ctl_table tagged_addr_sysctl_table[] = { + { + .procname = "tagged_addr_disabled", + .mode = 0644, + .data = &tagged_addr_disabled, + .maxlen = sizeof(int), + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ZERO, + .extra2 = SYSCTL_ONE, + }, +}; + static int __init tagged_addr_init(void) { if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM)) @@ -329,6 +389,9 @@ static int __init tagged_addr_init(void) have_user_pmlen_7 = try_to_set_pmm(ENVCFG_PMM_PMLEN_7); have_user_pmlen_16 = try_to_set_pmm(ENVCFG_PMM_PMLEN_16);
+ if (!register_sysctl("abi", tagged_addr_sysctl_table)) + return -EINVAL; + return 0; } core_initcall(tagged_addr_init);
This allows a tracer to control the ABI of the tracee, as on arm64.
Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
(no changes since v1)
arch/riscv/kernel/ptrace.c | 42 ++++++++++++++++++++++++++++++++++++++ include/uapi/linux/elf.h | 1 + 2 files changed, 43 insertions(+)
diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c index 92731ff8c79a..ea67e9fb7a58 100644 --- a/arch/riscv/kernel/ptrace.c +++ b/arch/riscv/kernel/ptrace.c @@ -28,6 +28,9 @@ enum riscv_regset { #ifdef CONFIG_RISCV_ISA_V REGSET_V, #endif +#ifdef CONFIG_RISCV_ISA_SUPM + REGSET_TAGGED_ADDR_CTRL, +#endif };
static int riscv_gpr_get(struct task_struct *target, @@ -152,6 +155,35 @@ static int riscv_vr_set(struct task_struct *target, } #endif
+#ifdef CONFIG_RISCV_ISA_SUPM +static int tagged_addr_ctrl_get(struct task_struct *target, + const struct user_regset *regset, + struct membuf to) +{ + long ctrl = get_tagged_addr_ctrl(target); + + if (IS_ERR_VALUE(ctrl)) + return ctrl; + + return membuf_write(&to, &ctrl, sizeof(ctrl)); +} + +static int tagged_addr_ctrl_set(struct task_struct *target, + const struct user_regset *regset, + unsigned int pos, unsigned int count, + const void *kbuf, const void __user *ubuf) +{ + int ret; + long ctrl; + + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &ctrl, 0, -1); + if (ret) + return ret; + + return set_tagged_addr_ctrl(target, ctrl); +} +#endif + static const struct user_regset riscv_user_regset[] = { [REGSET_X] = { .core_note_type = NT_PRSTATUS, @@ -182,6 +214,16 @@ static const struct user_regset riscv_user_regset[] = { .set = riscv_vr_set, }, #endif +#ifdef CONFIG_RISCV_ISA_SUPM + [REGSET_TAGGED_ADDR_CTRL] = { + .core_note_type = NT_RISCV_TAGGED_ADDR_CTRL, + .n = 1, + .size = sizeof(long), + .align = sizeof(long), + .regset_get = tagged_addr_ctrl_get, + .set = tagged_addr_ctrl_set, + }, +#endif };
static const struct user_regset_view riscv_user_native_view = { diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index b9935988da5c..a920cf8934dc 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -450,6 +450,7 @@ typedef struct elf64_shdr { #define NT_MIPS_MSA 0x802 /* MIPS SIMD registers */ #define NT_RISCV_CSR 0x900 /* RISC-V Control and Status Registers */ #define NT_RISCV_VECTOR 0x901 /* RISC-V vector registers */ +#define NT_RISCV_TAGGED_ADDR_CTRL 0x902 /* RISC-V tagged address control (prctl()) */ #define NT_LOONGARCH_CPUCFG 0xa00 /* LoongArch CPU config registers */ #define NT_LOONGARCH_CSR 0xa01 /* LoongArch control and status registers */ #define NT_LOONGARCH_LSX 0xa02 /* LoongArch Loongson SIMD Extension registers */
This test covers the behavior of the PR_SET_TAGGED_ADDR_CTRL and PR_GET_TAGGED_ADDR_CTRL prctl() operations, their effects on the userspace ABI, and their effects on the system call ABI.
Reviewed-by: Charlie Jenkins charlie@rivosinc.com Tested-by: Charlie Jenkins charlie@rivosinc.com Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
Changes in v5: - Rename "pm" selftests directory to "abi" to be more generic - Fix -Wparentheses warnings - Fix order of operations when writing via the tagged pointer
Changes in v2: - Rename "tags" directory to "pm" to avoid .gitignore rules - Add .gitignore file to ignore the compiled selftest binary - Write to a pipe to force dereferencing the user pointer - Handle SIGSEGV in the child process to reduce dmesg noise
tools/testing/selftests/riscv/Makefile | 2 +- tools/testing/selftests/riscv/abi/.gitignore | 1 + tools/testing/selftests/riscv/abi/Makefile | 10 + .../selftests/riscv/abi/pointer_masking.c | 332 ++++++++++++++++++ 4 files changed, 344 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/riscv/abi/.gitignore create mode 100644 tools/testing/selftests/riscv/abi/Makefile create mode 100644 tools/testing/selftests/riscv/abi/pointer_masking.c
diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile index 7ce03d832b64..099b8c1f46f8 100644 --- a/tools/testing/selftests/riscv/Makefile +++ b/tools/testing/selftests/riscv/Makefile @@ -5,7 +5,7 @@ ARCH ?= $(shell uname -m 2>/dev/null || echo not)
ifneq (,$(filter $(ARCH),riscv)) -RISCV_SUBTARGETS ?= hwprobe vector mm sigreturn +RISCV_SUBTARGETS ?= abi hwprobe mm sigreturn vector else RISCV_SUBTARGETS := endif diff --git a/tools/testing/selftests/riscv/abi/.gitignore b/tools/testing/selftests/riscv/abi/.gitignore new file mode 100644 index 000000000000..b38358f91c4d --- /dev/null +++ b/tools/testing/selftests/riscv/abi/.gitignore @@ -0,0 +1 @@ +pointer_masking diff --git a/tools/testing/selftests/riscv/abi/Makefile b/tools/testing/selftests/riscv/abi/Makefile new file mode 100644 index 000000000000..ed82ff9c664e --- /dev/null +++ b/tools/testing/selftests/riscv/abi/Makefile @@ -0,0 +1,10 @@ +# SPDX-License-Identifier: GPL-2.0 + +CFLAGS += -I$(top_srcdir)/tools/include + +TEST_GEN_PROGS := pointer_masking + +include ../../lib.mk + +$(OUTPUT)/pointer_masking: pointer_masking.c + $(CC) -static -o$@ $(CFLAGS) $(LDFLAGS) $^ diff --git a/tools/testing/selftests/riscv/abi/pointer_masking.c b/tools/testing/selftests/riscv/abi/pointer_masking.c new file mode 100644 index 000000000000..dee41b7ee3e3 --- /dev/null +++ b/tools/testing/selftests/riscv/abi/pointer_masking.c @@ -0,0 +1,332 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include <errno.h> +#include <fcntl.h> +#include <setjmp.h> +#include <signal.h> +#include <stdbool.h> +#include <sys/prctl.h> +#include <sys/wait.h> +#include <unistd.h> + +#include "../../kselftest.h" + +#ifndef PR_PMLEN_SHIFT +#define PR_PMLEN_SHIFT 24 +#endif +#ifndef PR_PMLEN_MASK +#define PR_PMLEN_MASK (0x7fUL << PR_PMLEN_SHIFT) +#endif + +static int dev_zero; + +static int pipefd[2]; + +static sigjmp_buf jmpbuf; + +static void sigsegv_handler(int sig) +{ + siglongjmp(jmpbuf, 1); +} + +static int min_pmlen; +static int max_pmlen; + +static inline bool valid_pmlen(int pmlen) +{ + return pmlen == 0 || pmlen == 7 || pmlen == 16; +} + +static void test_pmlen(void) +{ + ksft_print_msg("Testing available PMLEN values\n"); + + for (int request = 0; request <= 16; request++) { + int pmlen, ret; + + ret = prctl(PR_SET_TAGGED_ADDR_CTRL, request << PR_PMLEN_SHIFT, 0, 0, 0); + if (ret) + goto pr_set_error; + + ret = prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0); + ksft_test_result(ret >= 0, "PMLEN=%d PR_GET_TAGGED_ADDR_CTRL\n", request); + if (ret < 0) + goto pr_get_error; + + pmlen = (ret & PR_PMLEN_MASK) >> PR_PMLEN_SHIFT; + ksft_test_result(pmlen >= request, "PMLEN=%d constraint\n", request); + ksft_test_result(valid_pmlen(pmlen), "PMLEN=%d validity\n", request); + + if (min_pmlen == 0) + min_pmlen = pmlen; + if (max_pmlen < pmlen) + max_pmlen = pmlen; + + continue; + +pr_set_error: + ksft_test_result_skip("PMLEN=%d PR_GET_TAGGED_ADDR_CTRL\n", request); +pr_get_error: + ksft_test_result_skip("PMLEN=%d constraint\n", request); + ksft_test_result_skip("PMLEN=%d validity\n", request); + } + + if (max_pmlen == 0) + ksft_exit_fail_msg("Failed to enable pointer masking\n"); +} + +static int set_tagged_addr_ctrl(int pmlen, bool tagged_addr_abi) +{ + int arg, ret; + + arg = pmlen << PR_PMLEN_SHIFT | tagged_addr_abi; + ret = prctl(PR_SET_TAGGED_ADDR_CTRL, arg, 0, 0, 0); + if (!ret) { + ret = prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0); + if (ret == arg) + return 0; + } + + return ret < 0 ? -errno : -ENODATA; +} + +static void test_dereference_pmlen(int pmlen) +{ + static volatile int i; + volatile int *p; + int ret; + + ret = set_tagged_addr_ctrl(pmlen, false); + if (ret) + return ksft_test_result_error("PMLEN=%d setup (%d)\n", pmlen, ret); + + i = pmlen; + + if (pmlen) { + p = (volatile int *)((uintptr_t)&i | 1UL << (__riscv_xlen - pmlen)); + + /* These dereferences should succeed. */ + if (sigsetjmp(jmpbuf, 1)) + return ksft_test_result_fail("PMLEN=%d valid tag\n", pmlen); + if (*p != pmlen) + return ksft_test_result_fail("PMLEN=%d bad value\n", pmlen); + ++*p; + } + + p = (volatile int *)((uintptr_t)&i | 1UL << (__riscv_xlen - pmlen - 1)); + + /* These dereferences should raise SIGSEGV. */ + if (sigsetjmp(jmpbuf, 1)) + return ksft_test_result_pass("PMLEN=%d dereference\n", pmlen); + ++*p; + ksft_test_result_fail("PMLEN=%d invalid tag\n", pmlen); +} + +static void test_dereference(void) +{ + ksft_print_msg("Testing userspace pointer dereference\n"); + + signal(SIGSEGV, sigsegv_handler); + + test_dereference_pmlen(0); + test_dereference_pmlen(min_pmlen); + test_dereference_pmlen(max_pmlen); + + signal(SIGSEGV, SIG_DFL); +} + +static void execve_child_sigsegv_handler(int sig) +{ + exit(42); +} + +static int execve_child(void) +{ + static volatile int i; + volatile int *p = (volatile int *)((uintptr_t)&i | 1UL << (__riscv_xlen - 7)); + + signal(SIGSEGV, execve_child_sigsegv_handler); + + /* This dereference should raise SIGSEGV. */ + return *p; +} + +static void test_fork_exec(void) +{ + int ret, status; + + ksft_print_msg("Testing fork/exec behavior\n"); + + ret = set_tagged_addr_ctrl(min_pmlen, false); + if (ret) + return ksft_test_result_error("setup (%d)\n", ret); + + if (fork()) { + wait(&status); + ksft_test_result(WIFEXITED(status) && WEXITSTATUS(status) == 42, + "dereference after fork\n"); + } else { + static volatile int i = 42; + volatile int *p; + + p = (volatile int *)((uintptr_t)&i | 1UL << (__riscv_xlen - min_pmlen)); + + /* This dereference should succeed. */ + exit(*p); + } + + if (fork()) { + wait(&status); + ksft_test_result(WIFEXITED(status) && WEXITSTATUS(status) == 42, + "dereference after fork+exec\n"); + } else { + /* Will call execve_child(). */ + execve("/proc/self/exe", (char *const []) { "", NULL }, NULL); + } +} + +static void test_tagged_addr_abi_sysctl(void) +{ + char value; + int fd; + + ksft_print_msg("Testing tagged address ABI sysctl\n"); + + fd = open("/proc/sys/abi/tagged_addr_disabled", O_WRONLY); + if (fd < 0) { + ksft_test_result_skip("failed to open sysctl file\n"); + ksft_test_result_skip("failed to open sysctl file\n"); + return; + } + + value = '1'; + pwrite(fd, &value, 1, 0); + ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == -EINVAL, + "sysctl disabled\n"); + + value = '0'; + pwrite(fd, &value, 1, 0); + ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == 0, + "sysctl enabled\n"); + + set_tagged_addr_ctrl(0, false); + + close(fd); +} + +static void test_tagged_addr_abi_pmlen(int pmlen) +{ + int i, *p, ret; + + i = ~pmlen; + + if (pmlen) { + p = (int *)((uintptr_t)&i | 1UL << (__riscv_xlen - pmlen)); + + ret = set_tagged_addr_ctrl(pmlen, false); + if (ret) + return ksft_test_result_error("PMLEN=%d ABI disabled setup (%d)\n", + pmlen, ret); + + ret = write(pipefd[1], p, sizeof(*p)); + if (ret >= 0 || errno != EFAULT) + return ksft_test_result_fail("PMLEN=%d ABI disabled write\n", pmlen); + + ret = read(dev_zero, p, sizeof(*p)); + if (ret >= 0 || errno != EFAULT) + return ksft_test_result_fail("PMLEN=%d ABI disabled read\n", pmlen); + + if (i != ~pmlen) + return ksft_test_result_fail("PMLEN=%d ABI disabled value\n", pmlen); + + ret = set_tagged_addr_ctrl(pmlen, true); + if (ret) + return ksft_test_result_error("PMLEN=%d ABI enabled setup (%d)\n", + pmlen, ret); + + ret = write(pipefd[1], p, sizeof(*p)); + if (ret != sizeof(*p)) + return ksft_test_result_fail("PMLEN=%d ABI enabled write\n", pmlen); + + ret = read(dev_zero, p, sizeof(*p)); + if (ret != sizeof(*p)) + return ksft_test_result_fail("PMLEN=%d ABI enabled read\n", pmlen); + + if (i) + return ksft_test_result_fail("PMLEN=%d ABI enabled value\n", pmlen); + + i = ~pmlen; + } else { + /* The tagged address ABI cannot be enabled when PMLEN == 0. */ + ret = set_tagged_addr_ctrl(pmlen, true); + if (ret != -EINVAL) + return ksft_test_result_error("PMLEN=%d ABI setup (%d)\n", + pmlen, ret); + } + + p = (int *)((uintptr_t)&i | 1UL << (__riscv_xlen - pmlen - 1)); + + ret = write(pipefd[1], p, sizeof(*p)); + if (ret >= 0 || errno != EFAULT) + return ksft_test_result_fail("PMLEN=%d invalid tag write (%d)\n", pmlen, errno); + + ret = read(dev_zero, p, sizeof(*p)); + if (ret >= 0 || errno != EFAULT) + return ksft_test_result_fail("PMLEN=%d invalid tag read\n", pmlen); + + if (i != ~pmlen) + return ksft_test_result_fail("PMLEN=%d invalid tag value\n", pmlen); + + ksft_test_result_pass("PMLEN=%d tagged address ABI\n", pmlen); +} + +static void test_tagged_addr_abi(void) +{ + ksft_print_msg("Testing tagged address ABI\n"); + + test_tagged_addr_abi_pmlen(0); + test_tagged_addr_abi_pmlen(min_pmlen); + test_tagged_addr_abi_pmlen(max_pmlen); +} + +static struct test_info { + unsigned int nr_tests; + void (*test_fn)(void); +} tests[] = { + { .nr_tests = 17 * 3, test_pmlen }, + { .nr_tests = 3, test_dereference }, + { .nr_tests = 2, test_fork_exec }, + { .nr_tests = 2, test_tagged_addr_abi_sysctl }, + { .nr_tests = 3, test_tagged_addr_abi }, +}; + +int main(int argc, char **argv) +{ + unsigned int plan = 0; + int ret; + + /* Check if this is the child process after execve(). */ + if (!argv[0][0]) + return execve_child(); + + dev_zero = open("/dev/zero", O_RDWR); + if (dev_zero < 0) + return 1; + + /* Write to a pipe so the kernel must dereference the buffer pointer. */ + ret = pipe(pipefd); + if (ret) + return 1; + + ksft_print_header(); + + for (int i = 0; i < ARRAY_SIZE(tests); i++) + plan += tests[i].nr_tests; + + ksft_set_plan(plan); + + for (int i = 0; i < ARRAY_SIZE(tests); i++) + tests[i].test_fn(); + + ksft_finished(); +}
Supm is a virtual ISA extension defined in the RISC-V Pointer Masking specification, which indicates that pointer masking is available in U-mode. It can be provided by either Smnpm or Ssnpm, depending on which mode the kernel runs in. Userspace should not care about this distinction, so export Supm instead of either underlying extension.
Hide the extension if the kernel was compiled without support for the pointer masking prctl() interface.
Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
Changes in v5: - Update pointer masking spec version to 1.0 in hwprobe documentation
Changes in v2: - New patch for v2
Documentation/arch/riscv/hwprobe.rst | 3 +++ arch/riscv/include/uapi/asm/hwprobe.h | 1 + arch/riscv/kernel/sys_hwprobe.c | 3 +++ 3 files changed, 7 insertions(+)
diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst index 85b709257918..b9aec2e5bbd4 100644 --- a/Documentation/arch/riscv/hwprobe.rst +++ b/Documentation/arch/riscv/hwprobe.rst @@ -239,6 +239,9 @@ The following keys are defined: ratified in commit 98918c844281 ("Merge pull request #1217 from riscv/zawrs") of riscv-isa-manual.
+ * :c:macro:`RISCV_HWPROBE_EXT_SUPM`: The Supm extension is supported as + defined in version 1.0 of the RISC-V Pointer Masking extensions. + * :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: Deprecated. Returns similar values to :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`, but the key was mistakenly classified as a bitmask rather than a value. diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h index 1e153cda57db..868ff41b93d6 100644 --- a/arch/riscv/include/uapi/asm/hwprobe.h +++ b/arch/riscv/include/uapi/asm/hwprobe.h @@ -72,6 +72,7 @@ struct riscv_hwprobe { #define RISCV_HWPROBE_EXT_ZCF (1ULL << 46) #define RISCV_HWPROBE_EXT_ZCMOP (1ULL << 47) #define RISCV_HWPROBE_EXT_ZAWRS (1ULL << 48) +#define RISCV_HWPROBE_EXT_SUPM (1ULL << 49) #define RISCV_HWPROBE_KEY_CPUPERF_0 5 #define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0) #define RISCV_HWPROBE_MISALIGNED_EMULATED (1 << 0) diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c index cea0ca2bf2a2..0ac78e9f7c94 100644 --- a/arch/riscv/kernel/sys_hwprobe.c +++ b/arch/riscv/kernel/sys_hwprobe.c @@ -150,6 +150,9 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, EXT_KEY(ZFH); EXT_KEY(ZFHMIN); } + + if (IS_ENABLED(CONFIG_RISCV_ISA_SUPM)) + EXT_KEY(SUPM); #undef EXT_KEY }
The interface for controlling pointer masking in VS-mode is henvcfg.PMM, which is part of the Ssnpm extension, even though pointer masking in HS-mode is provided by the Smnpm extension. As a result, emulating Smnpm in the guest requires (only) Ssnpm on the host.
The guest configures Smnpm through the SBI Firmware Features extension, which KVM does not yet implement, so currently the ISA extension has no visible effect on the guest, and thus it cannot be disabled. Ssnpm is configured using the senvcfg CSR within the guest, so that extension cannot be hidden from the guest without intercepting writes to the CSR.
Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
Changes in v5: - Do not allow Smnpm to be disabled, as suggested by Anup
Changes in v2: - New patch for v2
arch/riscv/include/uapi/asm/kvm.h | 2 ++ arch/riscv/kvm/vcpu_onereg.c | 4 ++++ 2 files changed, 6 insertions(+)
diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h index e97db3296456..4f24201376b1 100644 --- a/arch/riscv/include/uapi/asm/kvm.h +++ b/arch/riscv/include/uapi/asm/kvm.h @@ -175,6 +175,8 @@ enum KVM_RISCV_ISA_EXT_ID { KVM_RISCV_ISA_EXT_ZCF, KVM_RISCV_ISA_EXT_ZCMOP, KVM_RISCV_ISA_EXT_ZAWRS, + KVM_RISCV_ISA_EXT_SMNPM, + KVM_RISCV_ISA_EXT_SSNPM, KVM_RISCV_ISA_EXT_MAX, };
diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c index b319c4c13c54..5b68490ad9b7 100644 --- a/arch/riscv/kvm/vcpu_onereg.c +++ b/arch/riscv/kvm/vcpu_onereg.c @@ -34,9 +34,11 @@ static const unsigned long kvm_isa_ext_arr[] = { [KVM_RISCV_ISA_EXT_M] = RISCV_ISA_EXT_m, [KVM_RISCV_ISA_EXT_V] = RISCV_ISA_EXT_v, /* Multi letter extensions (alphabetically sorted) */ + [KVM_RISCV_ISA_EXT_SMNPM] = RISCV_ISA_EXT_SSNPM, KVM_ISA_EXT_ARR(SMSTATEEN), KVM_ISA_EXT_ARR(SSAIA), KVM_ISA_EXT_ARR(SSCOFPMF), + KVM_ISA_EXT_ARR(SSNPM), KVM_ISA_EXT_ARR(SSTC), KVM_ISA_EXT_ARR(SVINVAL), KVM_ISA_EXT_ARR(SVNAPOT), @@ -127,8 +129,10 @@ static bool kvm_riscv_vcpu_isa_disable_allowed(unsigned long ext) case KVM_RISCV_ISA_EXT_C: case KVM_RISCV_ISA_EXT_I: case KVM_RISCV_ISA_EXT_M: + case KVM_RISCV_ISA_EXT_SMNPM: /* There is not architectural config bit to disable sscofpmf completely */ case KVM_RISCV_ISA_EXT_SSCOFPMF: + case KVM_RISCV_ISA_EXT_SSNPM: case KVM_RISCV_ISA_EXT_SSTC: case KVM_RISCV_ISA_EXT_SVINVAL: case KVM_RISCV_ISA_EXT_SVNAPOT:
On Thu, Oct 17, 2024 at 1:58 AM Samuel Holland samuel.holland@sifive.com wrote:
The interface for controlling pointer masking in VS-mode is henvcfg.PMM, which is part of the Ssnpm extension, even though pointer masking in HS-mode is provided by the Smnpm extension. As a result, emulating Smnpm in the guest requires (only) Ssnpm on the host.
The guest configures Smnpm through the SBI Firmware Features extension, which KVM does not yet implement, so currently the ISA extension has no visible effect on the guest, and thus it cannot be disabled. Ssnpm is configured using the senvcfg CSR within the guest, so that extension cannot be hidden from the guest without intercepting writes to the CSR.
Signed-off-by: Samuel Holland samuel.holland@sifive.com
LGTM.
Reviewed-by: Anup Patel anup@brainfault.org
Regards, Anup
Changes in v5:
- Do not allow Smnpm to be disabled, as suggested by Anup
Changes in v2:
- New patch for v2
arch/riscv/include/uapi/asm/kvm.h | 2 ++ arch/riscv/kvm/vcpu_onereg.c | 4 ++++ 2 files changed, 6 insertions(+)
diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h index e97db3296456..4f24201376b1 100644 --- a/arch/riscv/include/uapi/asm/kvm.h +++ b/arch/riscv/include/uapi/asm/kvm.h @@ -175,6 +175,8 @@ enum KVM_RISCV_ISA_EXT_ID { KVM_RISCV_ISA_EXT_ZCF, KVM_RISCV_ISA_EXT_ZCMOP, KVM_RISCV_ISA_EXT_ZAWRS,
KVM_RISCV_ISA_EXT_SMNPM,
KVM_RISCV_ISA_EXT_SSNPM, KVM_RISCV_ISA_EXT_MAX,
};
diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c index b319c4c13c54..5b68490ad9b7 100644 --- a/arch/riscv/kvm/vcpu_onereg.c +++ b/arch/riscv/kvm/vcpu_onereg.c @@ -34,9 +34,11 @@ static const unsigned long kvm_isa_ext_arr[] = { [KVM_RISCV_ISA_EXT_M] = RISCV_ISA_EXT_m, [KVM_RISCV_ISA_EXT_V] = RISCV_ISA_EXT_v, /* Multi letter extensions (alphabetically sorted) */
[KVM_RISCV_ISA_EXT_SMNPM] = RISCV_ISA_EXT_SSNPM, KVM_ISA_EXT_ARR(SMSTATEEN), KVM_ISA_EXT_ARR(SSAIA), KVM_ISA_EXT_ARR(SSCOFPMF),
KVM_ISA_EXT_ARR(SSNPM), KVM_ISA_EXT_ARR(SSTC), KVM_ISA_EXT_ARR(SVINVAL), KVM_ISA_EXT_ARR(SVNAPOT),
@@ -127,8 +129,10 @@ static bool kvm_riscv_vcpu_isa_disable_allowed(unsigned long ext) case KVM_RISCV_ISA_EXT_C: case KVM_RISCV_ISA_EXT_I: case KVM_RISCV_ISA_EXT_M:
case KVM_RISCV_ISA_EXT_SMNPM: /* There is not architectural config bit to disable sscofpmf completely */ case KVM_RISCV_ISA_EXT_SSCOFPMF:
case KVM_RISCV_ISA_EXT_SSNPM: case KVM_RISCV_ISA_EXT_SSTC: case KVM_RISCV_ISA_EXT_SVINVAL: case KVM_RISCV_ISA_EXT_SVNAPOT:
-- 2.45.1
Add testing for the pointer masking extensions exposed to KVM guests.
Reviewed-by: Anup Patel anup@brainfault.org Signed-off-by: Samuel Holland samuel.holland@sifive.com ---
(no changes since v2)
Changes in v2: - New patch for v2
tools/testing/selftests/kvm/riscv/get-reg-list.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/tools/testing/selftests/kvm/riscv/get-reg-list.c b/tools/testing/selftests/kvm/riscv/get-reg-list.c index 8e34f7fa44e9..54ab484d0000 100644 --- a/tools/testing/selftests/kvm/riscv/get-reg-list.c +++ b/tools/testing/selftests/kvm/riscv/get-reg-list.c @@ -41,9 +41,11 @@ bool filter_reg(__u64 reg) case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_I: case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_M: case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_V: + case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SMNPM: case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SMSTATEEN: case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SSAIA: case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SSCOFPMF: + case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SSNPM: case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SSTC: case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SVINVAL: case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SVNAPOT: @@ -414,9 +416,11 @@ static const char *isa_ext_single_id_to_str(__u64 reg_off) KVM_ISA_EXT_ARR(I), KVM_ISA_EXT_ARR(M), KVM_ISA_EXT_ARR(V), + KVM_ISA_EXT_ARR(SMNPM), KVM_ISA_EXT_ARR(SMSTATEEN), KVM_ISA_EXT_ARR(SSAIA), KVM_ISA_EXT_ARR(SSCOFPMF), + KVM_ISA_EXT_ARR(SSNPM), KVM_ISA_EXT_ARR(SSTC), KVM_ISA_EXT_ARR(SVINVAL), KVM_ISA_EXT_ARR(SVNAPOT), @@ -946,8 +950,10 @@ KVM_ISA_EXT_SUBLIST_CONFIG(aia, AIA); KVM_ISA_EXT_SUBLIST_CONFIG(fp_f, FP_F); KVM_ISA_EXT_SUBLIST_CONFIG(fp_d, FP_D); KVM_ISA_EXT_SIMPLE_CONFIG(h, H); +KVM_ISA_EXT_SIMPLE_CONFIG(smnpm, SMNPM); KVM_ISA_EXT_SUBLIST_CONFIG(smstateen, SMSTATEEN); KVM_ISA_EXT_SIMPLE_CONFIG(sscofpmf, SSCOFPMF); +KVM_ISA_EXT_SIMPLE_CONFIG(ssnpm, SSNPM); KVM_ISA_EXT_SIMPLE_CONFIG(sstc, SSTC); KVM_ISA_EXT_SIMPLE_CONFIG(svinval, SVINVAL); KVM_ISA_EXT_SIMPLE_CONFIG(svnapot, SVNAPOT); @@ -1009,8 +1015,10 @@ struct vcpu_reg_list *vcpu_configs[] = { &config_fp_f, &config_fp_d, &config_h, + &config_smnpm, &config_smstateen, &config_sscofpmf, + &config_ssnpm, &config_sstc, &config_svinval, &config_svnapot,
Hello:
This series was applied to riscv/linux.git (for-next) by Palmer Dabbelt palmer@rivosinc.com:
On Wed, 16 Oct 2024 13:27:41 -0700 you wrote:
RISC-V defines three extensions for pointer masking[1]:
- Smmpm: configured in M-mode, affects M-mode
- Smnpm: configured in M-mode, affects the next lower mode (S or U-mode)
- Ssnpm: configured in S-mode, affects the next lower mode (VS, VU, or U-mode)
This series adds support for configuring Smnpm or Ssnpm (depending on which privilege mode the kernel is running in) to allow pointer masking in userspace (VU or U-mode), extending the PR_SET_TAGGED_ADDR_CTRL API from arm64. Unlike arm64 TBI, userspace pointer masking is not enabled by default on RISC-V. Additionally, the tag width (referred to as PMLEN) is variable, so userspace needs to ask the kernel for a specific tag width, which is interpreted as a lower bound on the number of tag bits.
[...]
Here is the summary with links: - [v5,01/10] dt-bindings: riscv: Add pointer masking ISA extensions https://git.kernel.org/riscv/c/8946ad26c0e2 - [v5,02/10] riscv: Add ISA extension parsing for pointer masking https://git.kernel.org/riscv/c/12749546293e - [v5,03/10] riscv: Add CSR definitions for pointer masking https://git.kernel.org/riscv/c/1edd6226877b - [v5,04/10] riscv: Add support for userspace pointer masking https://git.kernel.org/riscv/c/871aba681a0d - [v5,05/10] riscv: Add support for the tagged address ABI https://git.kernel.org/riscv/c/c4d16116bd11 - [v5,06/10] riscv: Allow ptrace control of the tagged address ABI https://git.kernel.org/riscv/c/cebce87fb04e - [v5,07/10] riscv: selftests: Add a pointer masking test https://git.kernel.org/riscv/c/5e9f1ee1c523 - [v5,08/10] riscv: hwprobe: Export the Supm ISA extension https://git.kernel.org/riscv/c/d250050aae4f - [v5,09/10] RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests https://git.kernel.org/riscv/c/e27f468bcf14 - [v5,10/10] KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test https://git.kernel.org/riscv/c/814779461d84
You are awesome, thank you!
linux-kselftest-mirror@lists.linaro.org